GLOBAL RESEARCH SYNDICATE

DeepMind Found New Approach To Create Faster RL Models

by globalresearchsyndicate
September 1, 2020
in Data Analysis

Recently, researchers from DeepMind and McGill University proposed a new approach to speed up the solution of complex reinforcement learning problems. They introduced a divide-and-conquer approach to reinforcement learning (RL) that is combined with deep learning to scale up the capabilities of agents.

For several years now, reinforcement learning has provided a conceptual framework for addressing several fundamental problems. It has been used in applications such as modelling robots, simulating artificial limbs, developing self-driving cars, and playing games like poker and Go.

The recent combination of reinforcement learning with deep learning has also produced several impressive achievements and is considered a promising approach to important sequential decision-making problems that are currently intractable. One obstacle, however, is the amount of data an RL agent needs to learn to perform a task.



Behind the Approach

In this work, the researchers argued that the range of problems RL agents can tackle could be significantly extended if the agents are endowed with appropriate mechanisms to leverage prior knowledge. The framework is based on the premise that an RL problem can usually be decomposed into a multitude of “tasks.”

The researchers generalised two fundamental operations in RL, policy evaluation and policy improvement, from single to multiple operands, i.e. tasks and policies, respectively. According to them, this generalisation allows the solution of one task to speed up the solution of other tasks.

Generalised policy evaluation (GPE) is the computation of the value function of a policy on a set of tasks. Generalised policy improvement (GPI) is the computation of a policy that performs at least as well as every policy in a given set. Together, GPE and GPI are referred to as “generalised policy updates.”
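To make the idea of evaluating one policy on several tasks concrete, here is a minimal sketch for a small tabular problem. It is an illustration under stated assumptions, not the researchers' implementation: the function name `evaluate_policy_on_tasks`, the two-state chain, and the reward columns are all hypothetical, and the exact linear solve only applies when the transition matrix is known and small.

```python
import numpy as np

def evaluate_policy_on_tasks(P_pi, R, gamma):
    """Exact value of one fixed policy on several tasks (tabular case).

    P_pi  : (S, S) state-to-state transition matrix under the policy.
    R     : (S, T) expected one-step reward in each state, one column per task.
    gamma : discount factor in [0, 1).
    Returns V with shape (S, T); V[s, k] is the value of state s on task k.
    """
    n_states = P_pi.shape[0]
    # Bellman equation V = R + gamma * P_pi @ V, solved for all tasks at once.
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, R)

# Hypothetical 2-state chain: state 0 moves to state 1, state 1 is absorbing.
P_pi = np.array([[0.0, 1.0],
                 [0.0, 1.0]])
# Task 0 rewards being in state 0, task 1 rewards being in state 1.
R = np.array([[1.0, 0.0],
              [0.0, 1.0]])
V = evaluate_policy_on_tasks(P_pi, R, gamma=0.5)
```

Because each task is just another column of `R`, the same solve evaluates the policy on every task in one pass, which is the sense in which evaluation is extended from one task to a set of tasks.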

Generalised policy updates make it possible to reuse the solutions of tasks in two distinct ways:

  • When a task’s reward function can be approximated as a linear combination of the reward functions of other tasks, the reinforcement learning problem can be reduced to a simpler linear regression, which can be solved with only a fraction of the data.
  • When the linearity constraint is not satisfied, the agent can still leverage the solutions of previous tasks by using them to interact with and learn about the environment. This can also considerably reduce the amount of data needed to solve the problem.
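The first case above can be sketched as an ordinary least-squares fit. This is a minimal illustration assuming that, for a batch of sampled transitions, the rewards of the previously solved tasks and of the new task are both observed; the names `fit_preferences`, `base_rewards`, and `new_rewards`, and the synthetic data, are hypothetical and not taken from the paper's code.

```python
import numpy as np

def fit_preferences(base_rewards, new_rewards):
    """Least-squares weights w such that base_rewards @ w approximates new_rewards.

    base_rewards : (N, d) rewards of d previously solved tasks on N transitions.
    new_rewards  : (N,) rewards of the new task on the same transitions.
    """
    w, *_ = np.linalg.lstsq(base_rewards, new_rewards, rcond=None)
    return w

# Synthetic, noise-free data purely for illustration.
rng = np.random.default_rng(0)
base_rewards = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
new_rewards = base_rewards @ w_true

w_hat = fit_preferences(base_rewards, new_rewards)
```

In this noise-free setting the fit recovers `w_true` up to numerical precision. With real, noisy rewards the weights would only be an approximation, and how good that approximation needs to be is outside the scope of this sketch.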

The researchers combined these two strategies to produce a divide-and-conquer approach to RL that can help scale agents to problems that are currently intractable due to issues such as lack of data.

They stated, “If the reward function of a task can be well approximated as a linear combination of the reward functions of tasks previously solved, we can reduce a reinforcement-learning problem to a simpler linear regression.” 

The researchers further added, “When this is not the case, the agent can still exploit the task solutions by using them to interact with and learn about the environment. Both strategies considerably reduce the amount of data needed to solve a reinforcement-learning problem.”

The Outcome

In the paper, the researchers showed how GPE and GPI can be implemented efficiently and discussed how their combination leads to a generalised policy whose behaviour is modulated by a vector of preferences.

The vector of preferences can itself be obtained as the solution of a linear regression problem. This reduces a reinforcement learning task to a much simpler problem that can be solved using only a fraction of the data.
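One way to picture a generalised policy modulated by a preference vector is the following sketch. It assumes the action values of a few known policies on each base task are already available in a table, and that a task's values are the preference-weighted sum of the base-task values; the array layout and the name `generalised_policy` are hypothetical choices made for illustration, not the authors' code.

```python
import numpy as np

def generalised_policy(Q_base, w):
    """Pick one action per state from known policies, given preferences w.

    Q_base : (n_policies, n_states, n_actions, n_tasks) action values of each
             known policy on each base task (assumed to be given).
    w      : (n_tasks,) preference vector weighting the base tasks.
    Returns an integer array of shape (n_states,) with the chosen actions.
    """
    # Evaluation step: value of every known policy on the task defined by w.
    q_w = Q_base @ w                      # (n_policies, n_states, n_actions)
    # Improvement step: best known value per state-action, then greedy action.
    best_over_policies = q_w.max(axis=0)  # (n_states, n_actions)
    return best_over_policies.argmax(axis=1)

# Hypothetical toy table: 2 policies, 1 state, 2 actions, 2 base tasks.
Q_base = np.zeros((2, 1, 2, 2))
Q_base[0, 0, 0, 0] = 1.0  # policy 0 does well on task 0 via action 0
Q_base[1, 0, 1, 1] = 1.0  # policy 1 does well on task 1 via action 1

action_task0 = generalised_policy(Q_base, np.array([1.0, 0.0]))
action_task1 = generalised_policy(Q_base, np.array([0.0, 1.0]))
```

Changing only `w` switches the selected action from 0 to 1 in this toy table, with no further learning of the value table itself.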

Wrapping Up

The researchers proposed a divide-and-conquer approach in which they generalised two fundamental operations in RL, policy improvement and policy evaluation, so that they can be used to speed up the solution of a reinforcement learning problem. The strategy is also claimed to improve sample efficiency when the mapping from states to preferences is simpler to learn than the corresponding policy.

The source code used to generate all of the data in this research is available on GitHub.

Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
