‘Selection’ as a Causal Process
Variations in …
* Pressure
* Strength
* Technique
* Hand
Variations are selected.
* i.e., become more or less probable based on how well they work.
* Behaviour is changed by experiences in the environment.
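The idea that variations "become more or less probable based on how well they work" can be illustrated with a toy simulation. This is only a sketch: the two variant names, their success rates, and the multiplicative update rule are hypothetical choices made for illustration, not a model taken from these notes.

```python
import random

def select_variations(success_rates, trials=1000, step=0.05, seed=0):
    """Toy model: variants that work more often become more probable."""
    rng = random.Random(seed)
    variants = list(success_rates)
    weights = {v: 1.0 for v in variants}  # every variant starts equally likely
    for _ in range(trials):
        # Emit one variant in proportion to its current weight.
        current = [weights[v] for v in variants]
        choice = rng.choices(variants, weights=current)[0]
        # The environment "selects": success strengthens, failure weakens.
        if rng.random() < success_rates[choice]:
            weights[choice] *= 1 + step
        else:
            weights[choice] *= 1 - step
    total = sum(weights.values())
    return {v: weights[v] / total for v in variants}

# Hypothetical variants of a lever press and how often each one "works".
probs = select_variations({"light press": 0.2, "firm press": 0.8})
print(probs)
```

Both variants start out equally probable; after repeated contact with the consequences, the variant that works more often accounts for almost all of the behaviour.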
Shaping (Behavioural Selection)
Differential reinforcement of successive approximations of a target behaviour.
E.g. Training a rat’s lever press
1. Reinforce approaches to lever
2. Reinforce sniffing the lever
3. Reinforce touching with paw
4. Reinforce a full depression of the lever
Note: Extinction of earlier steps can aid shaping because of the increased variability extinction produces.
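The logic of reinforcing successive approximations can be sketched as a small simulation. Everything here is an assumption made for illustration: behaviour is reduced to a single number, responses vary normally around a current mean, a reinforced response pulls the mean towards itself, and the criterion is raised a little after each reinforcer. The function name `train` and all parameter values are hypothetical.

```python
import random

def train(shaped, target=10.0, trials=1000, seed=1):
    """Compare shaping with reinforcing only the final target behaviour."""
    rng = random.Random(seed)
    mean = 0.0                                # typical form of current behaviour
    criterion = 0.0 if shaped else target     # what currently earns a reinforcer
    reinforcers = 0
    for _ in range(trials):
        response = rng.gauss(mean, 1.0)       # behaviour varies around its mean
        if response >= criterion:
            reinforcers += 1
            # Reinforcement shifts behaviour towards the reinforced variant.
            mean += 0.1 * (response - mean)
            # Demand a slightly closer approximation next time.
            criterion = min(target, criterion + 0.05)
        # Responses below the criterion go unreinforced and change nothing here.
    return mean, reinforcers

shaped_mean, shaped_count = train(shaped=True)
unshaped_mean, unshaped_count = train(shaped=False)
print(shaped_mean, shaped_count)
print(unshaped_mean, unshaped_count)
```

In this sketch the unshaped learner never earns a reinforcer, because the target lies far outside its initial range of variation, while the shaped learner is moved to the target one small step at a time.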
Why are clickers important in dog training?
Marking the Behaviour
“Clickers” are used so that an immediate conditionally reinforcing consequence is provided.
* Aids learning by providing immediate reinforcement and preventing accidental reinforcement of other behaviours.
Tips for Shaping
Explain the kitten experiment
Shaping of Motor Coordination
Superstitious Behaviour
Behaviour that occurs even though it does not produce a consequence.
* By-product of accidental reinforcement.
* Possibly maintained by
- Intermittently occurring advantageous reinforcement.
- Negatively reinforcing escape behaviours.
Human laboratory examples:
* Catania & Cutts (1963)
* Wagner & Morris (1987)
* Bruner & Revusky (1961)
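Accidental reinforcement can be sketched with a toy simulation in which a reinforcer arrives on a clock, regardless of what the organism is doing. The behaviour labels, the fixed-time interval, and the multiplicative strengthening rule are all hypothetical; this is an illustration, not a reconstruction of any of the studies listed above.

```python
import random

def superstition(behaviours, ticks=3000, food_every=15, seed=2):
    """Response-independent reinforcement: no behaviour produces the food."""
    rng = random.Random(seed)
    strength = {b: 1.0 for b in behaviours}
    for tick in range(1, ticks + 1):
        current = [strength[b] for b in behaviours]
        emitted = rng.choices(behaviours, weights=current)[0]
        if tick % food_every == 0:
            # Food arrives on a schedule, but whatever happened to be
            # occurring at that moment is strengthened anyway.
            strength[emitted] *= 1.5
    total = sum(strength.values())
    return {b: strength[b] / total for b in behaviours}

shares = superstition(["behaviour A", "behaviour B", "behaviour C"])
print(shares)
```

Because a strengthened behaviour is more likely to be occurring when the next reinforcer arrives, the shares drift apart and one behaviour tends to take over, even though none of them has any effect on food delivery.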
Epstein, Kirshnit, Lanza & Rubin (1984) - ‘Insight’ or History of Reinforcement?
The target behaviour emerged only when the birds had been trained to:
* 1) Push the box towards a green spot placed at various locations.
- Pushing was extinguished in the absence of the green spot.
* 2) Climb onto the box and peck the banana.
Note: The birds were not trained to push the box towards the banana.
Creativity (i.e., Behavioural Variability)
Some researchers have argued that reinforcement produces response stereotypy.
Original Task (Schwartz, 1982):
* Get the red square to the bottom right corner.
* Only allowed to press each key 4 times.
Problems with original task:
* The task is constrained by 4 responses on each key.
- i.e., Out of 256 possible sequences, only 70 are reinforced.
* A fifth press on either key resulted in a time-out from reinforcement (negative punishment).
- May have punished many instances of varied responding.
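The size of the constraint can be checked by brute-force enumeration. The sketch below assumes two keys (labelled L and R here) and sequences of eight presses, which is what 256 possible sequences implies (2^8 = 256); a sequence counts as reinforced only if each key is pressed exactly four times.

```python
from itertools import product

# Assumption: two keys, eight presses per sequence (2**8 = 256 sequences).
sequences = list(product("LR", repeat=8))

# Reinforced sequences press each key exactly four times, in any order.
reinforced = [s for s in sequences if s.count("L") == 4 and s.count("R") == 4]

print(len(sequences), len(reinforced))  # 256 70
```

Under these assumptions, 70 of the 256 sequences meet the requirement, so most ways of varying the sequence go unreinforced.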
True or False
Variability is a dimension of behaviour that can be reinforced.
True
True or False
Reinforcement inhibits creativity
False
Some studies claim to reinforce creativity but haven’t actually made the reinforcement contingent on the creative/novel behaviour.
* i.e., They just reward performing the task and claim they have rewarded creativity.
* Often used as “evidence” that reinforcement inhibits creativity.
- E.g. Kruglanski et al. (1971)
- E.g. Amabile (1982)
Verbal Behaviour
The study of how language functions and is reinforced
Speaker: SD - Salt on table, out of reach
Speaker: R - Says “please pass the salt”
Listener: SD - Hearing “please pass the salt”
Listener: R - Passes the salt.
Speaker: SR = SD - Receives salt
Speaker: R - Says “thank you”
Listener: SR - Hearing “thank you”
What behaviour modification principle is machine learning based on?
Shaping