Thorndike
Was interested in animal intelligence and whether animals learned through trial and error or some other process. He put cats in a puzzle box.
Supported reinforcement because correct responses were “stamped in” and incorrect responses were “stamped out”
Contiguity
When learning occurs because two events or stimuli are experienced close together in time and space. This leads to an association between them.
Reinforcement
Increases the likelihood of a behaviour by adding a positive outcome, or removing an unpleasant one
Punishment
Reduces the likelihood of a behaviour by adding an undesirable outcome, or removing a pleasant one
Law of Effect
If a reward follows a response, the response will become more frequent over time.
If a response is followed by no reward, or by an unpleasant outcome, the response will decrease.
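A rough way to picture the Law of Effect is a response "strength" that rises after reward and falls without it. The sketch below is only an illustration: the names update_strength and run_trials, the 0 to 1 scale, and the step size of 0.1 are assumptions made for the example, not part of Thorndike's account.

```python
def update_strength(strength, rewarded, step=0.1):
    """Return the new response strength (between 0 and 1) after one trial."""
    if rewarded:
        # Reward "stamps in" the response: move strength toward 1.
        return strength + step * (1.0 - strength)
    # No reward (or an unpleasant outcome) "stamps out" the response: move toward 0.
    return strength - step * strength


def run_trials(outcomes, start=0.5):
    """Apply update_strength across a sequence of True/False outcomes."""
    strength = start
    history = [strength]
    for rewarded in outcomes:
        strength = update_strength(strength, rewarded)
        history.append(strength)
    return history


# Hypothetical runs: ten rewarded trials versus ten unrewarded trials.
rewarded_history = run_trials([True] * 10)
unrewarded_history = run_trials([False] * 10)
print(round(rewarded_history[-1], 3), round(unrewarded_history[-1], 3))
```

The rewarded run ends above the starting strength and the unrewarded run ends below it, which is all the sketch is meant to show.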
Law of Readiness
An organism’s motivation to perform a behaviour affects how easily it learns that behaviour
Law of Multiple Responses
When in a new situation, an organism may try multiple responses until it finds one that leads to the best possible solution
Law of Set or Attitude
Previous experiences can affect how new stimuli are perceived
Law of Prepotency of Elements
The capacity to selectively focus on significant details in a situation or an environment while ignoring the irrelevant parts
Law of Response by Analogy
When new learning situations are approached using past (similar) experiences
Associative Shifting
Gradually shifting the response from one stimulus to another through a series of intermediate steps
Theory of Connectionism
Learning happens by forming associations between stimuli and responses
B.F. Skinner
Separated classical from instrumental conditioning
Classical Conditioning
Responses (respondents) are produced involuntarily (reflexes)
Pavlov
Operant Conditioning
Other behaviours are operants because they are voluntary actions that operate on the environment to produce a consequence.
The consequence then influences whether the behaviour repeats.
He designed an experimental chamber where a machine measured behaviour. Responses were recorded and generated a response rate, measuring the strength of learned behaviour over time.
Skinner
What are the three primary assumptions of behaviourism?
Reinforcer
Any stimulus or event that increases the likelihood of a behaviour repeating
Positive = reward
Negative = relieving discomfort
Reinforcement Schedules
The rules that determine when, and how often, a behaviour is reinforced
Continuous
Rewarded each time the desired behaviour is performed.
Helps associate the behaviour with a positive outcome. Reinforces the behaviour and increases the likelihood of it being repeated.
Fixed Ratio
Rewarded after every ‘x’ responses.
This schedule encourages a high production rate to receive the reward.
Produces a step pattern: a burst of responses, then a short break after each reward. Second-highest response rate.
Variable Ratio
Rewarded after an average of ‘x’ responses. This produces the highest response rate.
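The ratio schedules can be sketched as small counters that decide whether a given response is rewarded. This is a minimal sketch, not a standard implementation: fixed_ratio and variable_ratio are hypothetical names, and drawing the required count uniformly from 1 to 2x - 1 (which averages x) is an assumption made for the example. Continuous reinforcement is the special case fixed_ratio(1).

```python
import random


def fixed_ratio(x):
    """Reward every x-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == x:
            count = 0
            return True
        return False

    return respond


def variable_ratio(x, rng):
    """Reward after a varying number of responses that averages x."""
    # Assumption: the required count is drawn uniformly from 1..(2x - 1), mean x.
    remaining = rng.randint(1, 2 * x - 1)

    def respond():
        nonlocal remaining
        remaining -= 1
        if remaining == 0:
            remaining = rng.randint(1, 2 * x - 1)
            return True
        return False

    return respond


rng = random.Random(0)  # seeded so the run is repeatable
fr5 = fixed_ratio(5)
vr5 = variable_ratio(5, rng)

fr_rewards = [fr5() for _ in range(20)]
vr_rewards = [vr5() for _ in range(1000)]
fr_rewarded_positions = [i + 1 for i, r in enumerate(fr_rewards) if r]
print("FR-5 rewarded responses:", fr_rewarded_positions)
print("VR-5 rewards in 1000 responses:", sum(vr_rewards))
```

On the fixed schedule the rewarded responses are predictable (the 5th, 10th, 15th and 20th). On the variable schedule only the long-run average is known, so no single response can be ruled out as the rewarded one.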
Fixed Interval
Rewarded for the first response after a fixed time period.
Responding slows right after each reward and speeds up as the end of the interval approaches.
A scalloped pattern. Lowest response rate.
Variable Interval
Rewarded for the first response after a time period that averages ‘x’. Second-lowest response rate.
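The interval schedules can be sketched the same way, with a clock instead of a response counter. Again this is only an illustration: fixed_interval and variable_interval are hypothetical names, time is in arbitrary units, and drawing the wait uniformly from 0 to twice the mean is an assumption made for the example.

```python
import random


def fixed_interval(period):
    """Reward the first response made once `period` time units have passed."""
    last_reward = 0.0

    def respond(now):
        nonlocal last_reward
        if now - last_reward >= period:
            last_reward = now
            return True
        return False

    return respond


def variable_interval(mean_period, rng):
    """Reward the first response after a wait that varies around mean_period."""
    # Assumption: waits are drawn uniformly from 0..(2 * mean_period).
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_period)

    def respond(now):
        nonlocal last_reward, wait
        if now - last_reward >= wait:
            last_reward = now
            wait = rng.uniform(0, 2 * mean_period)
            return True
        return False

    return respond


# One response per time unit for 30 units on a fixed interval of 10.
slow = fixed_interval(10)
slow_rewarded_times = [t for t in range(1, 31) if slow(t)]

# Ten responses per time unit over the same 30 units.
fast = fixed_interval(10)
fast_reward_count = sum(fast(t / 10) for t in range(1, 301))

vi10 = variable_interval(10, random.Random(0))  # seeded for repeatability
vi_reward_count = sum(vi10(t) for t in range(1, 301))

print(slow_rewarded_times, fast_reward_count, vi_reward_count)
```

Responding ten times as often earns the same three rewards on the fixed interval, which is one way to see why interval schedules produce lower response rates than ratio schedules.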
Random Ratio
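Each response has the same chance of being rewarded, so the number of responses needed for a reward is unpredictable. The randomness adds an element of excitement and uncertainty.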
The unpredictability keeps the player engaged and motivated to continue playing. These can also include rewarding activities.
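A minimal sketch of a random ratio schedule, assuming the usual reading in which every response has the same fixed probability of reward regardless of what came before. The name random_ratio and the probability of 0.2 are assumptions chosen for the example.

```python
import random


def random_ratio(probability, rng):
    """Reward each response independently with the same fixed probability."""

    def respond():
        return rng.random() < probability

    return respond


rng = random.Random(1)  # seeded so the run is repeatable
rr = random_ratio(0.2, rng)  # on average one reward per five responses
outcomes = [rr() for _ in range(1000)]

# Track the longest run of unrewarded responses in this simulated session.
longest_dry_run = 0
current = 0
for rewarded in outcomes:
    current = 0 if rewarded else current + 1
    longest_dry_run = max(longest_dry_run, current)

print(sum(outcomes), longest_dry_run)
```

Unlike the variable ratio sketch, there is no upper limit here on how many responses can go unrewarded in a row, and a reward can also arrive on the very next response. That is the source of the unpredictability described above.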