Reinforcement Learning Flashcards

Question 1

Q

What is reinforcement learning?

Answer

A

A type of machine learning in which an agent learns by trial and error through interaction with an environment

Question 2

Q

How does a reinforcement learning model operate?

Answer

A

It uses rewards and punishments as signals for positive and negative behavior. The agent can start from a blank slate and, under the right conditions, achieve superhuman performance

Question 3

Q

What is the goal of reinforcement learning?

Answer

A

Find a suitable action model (a policy) that maximizes the agent's total cumulative reward

Question 4

Q

What are use-cases for reinforcement learning models?

Answer

A

Robotics, business strategy planning, traffic light control, web system configuration, and aircraft and robot motion control

Question 5

Q

What are the key components in a reinforcement learning model?

Answer

A

Environment, state, reward, policy, and value

Question 6

Q

What is an environment in a reinforcement learning model?

Answer

A

The world in which the agent operates and learns. It’s everything outside the agent that the agent interacts with

Question 7

Q

What is the “state” in a reinforcement learning model?

Answer

A

The current situation or condition of the environment. It represents all the information available to the agent at a given time

Question 8

Q

What is a “reward” in a reinforcement learning model?

Answer

A

A feedback signal that indicates how well the agent is performing. It’s usually a scalar value that the agent tries to maximize over time

Question 9

Q

What is a “policy” in a reinforcement learning model?

Answer

A

The strategy or set of rules that the agent follows to decide which action to take in a given state. It’s essentially the agent’s behavior

Question 10

Q

What is the “value” in a reinforcement learning model?

Answer

A

An estimate of the expected cumulative reward an agent can obtain from a given state. It helps the agent evaluate the long-term desirability of states and actions
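
For reference, the value described on this card is often written as follows. This is a sketch in common textbook notation, none of which appears on the original card: the policy symbol \pi, the reward r, and the discount factor \gamma (between 0 and 1, weighting later rewards less) are all assumptions introduced here.

```latex
% State-value function: expected discounted cumulative reward when starting
% in state s and following policy \pi from then on.
V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[ \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1} \;\middle|\; s_t = s \right]
```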

Question 11

Q

How does a reinforcement learning model operate, step by step?

Answer

A

Environment and State: The environment provides the context in which the agent operates. At each time step, the environment is in a particular state. The agent observes this state (either fully or partially) and uses this information to make decisions.
Policy: Based on the observed state, the agent uses its policy to choose an action. The policy can be deterministic (always choosing the same action in a given state) or stochastic (choosing actions with certain probabilities).
Action and New State: The chosen action is applied to the environment, which then transitions to a new state. This new state is a result of the agent’s action and any inherent dynamics of the environment.
Reward: After taking an action, the agent receives a reward from the environment. This reward is a signal indicating how good or bad the action was in that particular state.
Value: The value function estimates the expected cumulative reward from a given state, assuming the agent follows its current policy. It helps the agent think long-term, beyond just the immediate reward.
Learning Process: The agent’s goal is to learn an optimal policy that maximizes the expected cumulative reward over time. It does this by updating its estimates of state values and/or action values based on the rewards it receives, and then improving its policy based on these updated estimates.
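
The steps above can be sketched as a short program. This is a minimal illustration under stated assumptions, not a definitive implementation: the five-cell corridor environment, the function names, and the hyperparameters (alpha, gamma, epsilon) are all hypothetical, and tabular Q-learning is used as one common way of updating action values.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells. The agent starts in
# cell 0 and receives a reward of +1 on reaching cell 4 (the terminal state).
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Action and new state: apply the action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def choose_action(q, state, epsilon, rng):
    """Policy: epsilon-greedy, i.e. random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    # Break ties randomly so the untrained agent still explores.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learning process: update action-value estimates from observed rewards."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Value: immediate reward plus discounted estimate of what follows.
            future = 0.0 if done else gamma * max(q[next_state].values())
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Improved policy: pick the highest-valued action in each non-terminal state."""
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)}
```

Under these assumptions, `greedy_policy(train())` maps every non-terminal cell to `+1` (move right), because the learned value of moving toward the reward ends up higher than the value of moving away from it.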

(11 cards)