Learning
Acquiring new knowledge or skills and improving one’s performance
A robot can learn about .. (2)
(3) benefits of learning
3 Forms of Learning
1. Supervised learning
With external supervisor/teacher
Input-output pairs are presented, and the function mapping each input to its output is learned
3 Forms of Learning
2. Unsupervised learning
All information must be extracted from the inputs alone
-> can be useful for preprocessing inputs, e.g. dividing them into meaningful clusters
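A minimal sketch of dividing inputs into clusters without any teacher signal. It uses a simple 1-D k-means loop, which is one possible clustering method and is not named in these notes; the sensor readings and starting centers are made-up values for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the nearest center, then move each center to its group's mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign the value to the closest current center (no labels involved).
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Recompute each center; keep the old one if its cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical distance readings that fall into a "near" and a "far" group.
readings = [0.1, 0.2, 0.15, 2.0, 2.1, 1.9]
centers, clusters = kmeans_1d(readings, centers=[0.0, 1.0])
```

The resulting cluster index could then serve as a compact, discrete input for a later learning stage.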
3 Forms of Learning
3. Reinforcement learning
With an evaluation signal.
What is feedback in supervised learning?
The target action or output for a specific input.
Example of supervised learning + name
Neural network learning
-> the weights of the connections between nodes are learned (connectionist learning)
How did ALVINN learn to drive?
1. ALVINN steers how it thinks it should
2. A human shows how it should steer
3. ALVINN computes the error
4. ALVINN uses this error to update its weights
-> REPEAT
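The predict, compare, update, repeat loop described above can be sketched with a single linear unit trained by the delta rule. This is a deliberate simplification: ALVINN itself was a neural network driven by camera images, and the function name, learning rate, and demonstration data below are assumptions for illustration only.

```python
import random

def train_steering(samples, epochs=200, lr=0.1, seed=0):
    """Fit steering = w * x + b to (input, teacher_steering) pairs with the delta rule."""
    rng = random.Random(seed)
    w, b = rng.uniform(-0.1, 0.1), 0.0
    for _ in range(epochs):                 # REPEAT
        for x, target in samples:
            prediction = w * x + b          # the learner steers how it thinks it should
            error = target - prediction     # compare with what the teacher showed
            w += lr * error * x             # use the error to update the weights
            b += lr * error
    return w, b

# Hypothetical demonstrations: lane offset -> human steering command (-0.5 * offset).
demos = [(-1.0, 0.5), (-0.5, 0.25), (0.0, 0.0), (0.5, -0.25), (1.0, -0.5)]
w, b = train_steering(demos)
```

After training, `w` should be close to -0.5 and `b` close to 0, i.e. the unit reproduces the teacher's steering on these samples.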
Disadvantage of supervised learning (3)
4 Characteristics of Reinforcement Learning
5 Key features of Reinforcement Learning
4 Characteristics of a complete agent
4 Elements of Reinforcement Learning (inwards to outwards)
Actuator space
Set of all possible actions
When the robot knows/has learned which action to perform in each state, it has learned a …
Reactive controller
Exploration (RL)
In order to learn the optimal action, the robot has to try everything (trial and error)
Exploitation (RL)
Alongside exploration, the robot should perform well by exploiting what it has learned.
Once the mapping between inputs and actions is learned, the robot can just exploit the learned knowledge and stop exploring, right?
No, not always. There might be (1) sensor errors (uncertainty) and (2) a changing environment.
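One common way to keep exploring while still exploiting is epsilon-greedy action selection, sketched below. The technique is not named in these notes; the function name and the example Q-values are assumptions.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action (explore), else the best-known one (exploit)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                      # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

# Hypothetical Q-values for three actions in the current state.
q_values = [0.1, 0.9, 0.3]
greedy_action = epsilon_greedy(q_values, epsilon=0.0)  # never explores
```

Keeping `epsilon` above zero lets the robot notice when sensor errors or a changed environment have made its learned values outdated.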
Exploitation/exploration dilemma (trade-off between .. (2))
What is learned in RL? Consider robot’s actuator and sensor space!
The robot learns a value function (possibly stored in a table) that holds a Q-value for every possible state-action pair.
Q-value
Grows if good things happen and shrinks if bad things happen.
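A minimal sketch of how a Q-value in a table can grow or shrink, assuming the standard tabular Q-learning update (the notes do not say which update rule is meant). The state names, actions, rewards, `alpha`, and `gamma` are illustrative assumptions.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Move Q(state, action) toward reward + gamma * best Q-value of the next state."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

q = defaultdict(float)                 # table of state-action pairs, all starting at 0
actions = ["left", "right"]
good = q_update(q, "s0", "right", +1.0, "s1", actions)  # reward: the Q-value grows
bad = q_update(q, "s0", "left", -1.0, "s1", actions)    # punishment: the Q-value shrinks
```

Starting from an all-zero table, the rewarded pair ends at 0.5 and the punished pair at -0.5.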
When is the RL table learning method efficient (2)?
2. States and actions are discrete
What if table learning method alone is inefficient?
Combine RL with function approximators such as neural networks
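A rough sketch of replacing the table with a function approximator. For brevity it uses a linear approximator over hand-made features as a stand-in for the neural network mentioned above, with a semi-gradient Q-learning update; the feature vectors and parameter values are assumptions.

```python
def q_value(weights, features):
    """Approximate Q(state, action) as a weighted sum of features instead of a table entry."""
    return sum(w * f for w, f in zip(weights, features))

def approx_update(weights, features, reward, next_features_per_action, alpha=0.1, gamma=0.9):
    """Shift the weights so the estimate moves toward reward + gamma * best next estimate."""
    best_next = max(q_value(weights, f) for f in next_features_per_action)
    td_error = reward + gamma * best_next - q_value(weights, features)
    return [w + alpha * td_error * f for w, f in zip(weights, features)]

# Hypothetical feature vectors for one state-action pair and for the next state's actions.
weights = [0.0, 0.0]
features = [1.0, 0.5]
next_features = [[0.0, 1.0], [1.0, 0.0]]
new_weights = approx_update(weights, features, reward=1.0, next_features_per_action=next_features)
```

Because the weights are shared across states, one update also changes the estimates for similar, unvisited state-action pairs, which is what makes this usable when a table would be too large.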