Week 5 - Operant Conditioning (Basic Principles) Flashcards

(41 cards)

1
Q

What is Classical Conditioning?

A

Learning via association between stimuli, producing involuntary, reflexive responses

Example: Dog salivating when it hears a bell associated with food.

2
Q

What is Operant Conditioning?

A

Learning via consequences of voluntary behaviour, where behaviours are strengthened by reinforcement and weakened by punishment

Example: A rat pressing a lever to receive food

3
Q

What is the key difference between Classical and Operant Conditioning?

A

Classical = reflexes, associations, involuntary; Operant = decisions, consequences, voluntary.

4
Q

What happens to a behaviour followed by a reward according to operant conditioning?

A

It is repeated/increases in frequency.

5
Q

What happens to a behaviour followed by a punishment according to operant conditioning?

A

It is avoided/decreases in frequency.

6
Q

What is the Law of Effect?

A

Responses followed by satisfaction are more likely to recur, while those without satisfaction are weakened.

7
Q

Who expanded Thorndike’s ideas in operant conditioning?

A

B.F. Skinner.
- saw operant conditioning as a “theory of everything” for behaviour (though later shown to be incomplete)

8
Q

What is the Skinner Box?

A

An automated chamber for observing and measuring behaviour with levers, lights, speakers, and food dispensers.
- allowed continuous observation and measurement of behaviour

9
Q

What is shaping in operant conditioning?

A

Reinforcing behaviours progressively closer to the target behaviour.
- Rats don't naturally press levers, so shaping is how such novel behaviour is trained.

10
Q

What is an example of shaping?

A

Rewarding a rat for moving towards the lever, then only for touching it, and finally for pressing it.

11
Q

What is the significance of Skinner’s experiment with pigeons spinning?

A

Demonstrated that complex behaviours can be broken into simple reinforced steps.
* Pigeon Spin: By reinforcing partial turns, Skinner taught pigeons to spin in circles.
* Pigeon Ping Pong: Two pigeons pecked a ball back and forth, shaped through reinforcement.

12
Q

What do operant principles explain in real-world learning?

A
  • Animals forage where food is plentiful and avoid barren areas.
  • Learning is adaptive: instincts cover predictable environments, but learning is needed for variable environments.
13
Q

What is superstitious behaviour?
Give an example.

A

Behaviour developed from random reinforcement, leading to illusory cause-effect beliefs.
* Skinner (1948): Randomly dispensed food to pigeons every 15 seconds.
* Pigeons developed arbitrary rituals (wing flapping, spinning, pecking), mistakenly believing these behaviours caused food.

14
Q

What is an example of superstitious behaviour in humans?

A

Athletes’ pre-game rituals or pressing pedestrian crossing buttons repeatedly.

15
Q

What are the techniques for teaching behaviour?

A
  • Shaping (Scan and Capture): Reinforce approximations of target behaviour.
  • Baiting: Use food or objects to lure behaviour (e.g., tapping, treats).
  • Mimicry: Reward imitative actions (e.g., parrots copying human sounds).
  • Sculpting: Physically guide an animal into the desired posture/behaviour, then reinforce.
  • Instruction/Language (humans): Verbal cues establish mental connections between actions and outcomes (e.g., “Eat veggies → get dessert”).
16
Q

What is chaining in operant conditioning?

A

Training behaviours in sequence, either forward or backward.

17
Q

What is the most effective method of chaining?

A

Backward chaining (most effective):
- Teach the last step first so learner experiences the end goal quickly.
- Example: Tying shoelaces – the child first pulls the final loop through, then earlier steps are added progressively.

18
Q

What is a reinforcer?

A

An outcome that increases the likelihood of behaviour.

19
Q

What is a punisher?

A

An outcome that decreases the likelihood of behaviour.

20
Q

What is the difference between positive and negative in operant conditioning?

A
  • Positive: Adding something
  • Negative: Removing something

Do NOT confuse positive/negative with good/bad:
Positive = addition; Negative = subtraction

21
Q

What is the Operant Conditioning Contingency Square?

A
  • Positive Reinforcement: Adding a pleasant stimulus increases behaviour (e.g., dog gets a treat for sitting)
  • Negative Reinforcement: Removing an unpleasant stimulus increases behaviour (e.g., car buzzer stops when seatbelt is fastened)
  • Positive Punishment: Adding an unpleasant stimulus decreases behaviour (e.g., barking dog gets a spray collar)
  • Negative Punishment: Removing a pleasant stimulus decreases behaviour (e.g., child loses iPad privileges)

22
Q

What is bridging in operant conditioning?

A
  • Combines classical and operant conditioning.
  • Pairing: The bridge signal is first paired with the actual reward (like a treat) until the learner associates the signal with the good outcome.
  • Marking the behaviour: When the learner performs the correct behaviour, the trainer immediately uses the bridge signal. This tells the learner exactly what they did right.
  • Bridging the delay: Since it’s not always possible to deliver the primary reward at the precise moment, the bridge provides feedback that the reward is on its way.

23
Q

What is Continuous Reinforcement?

A

Reinforcing every response; good for learning new behaviours.
- Schedules of Reinforcement

24
Q

What is Partial Reinforcement?

A

Reinforcing only some responses, creating resistance to extinction.
- Schedules of Reinforcement

25
Q

What is a Fixed Ratio schedule?

A

Reinforced after every nth response (e.g., café loyalty card).
- Types of Partial Schedules

26
Q

What is a Variable Ratio schedule?

A

Reinforced on average every nth response (e.g., gambling machines).
- Types of Partial Schedules

27
Q

Which schedule is most resistant to extinction?

A

Variable Ratio.
- teaches persistence

28
Q

What is a Fixed Interval schedule?

A

Reinforced after a fixed time, provided a response occurs (e.g., weekly allowance).
- Types of Partial Schedules

29
Q

What is a Variable Interval schedule?

A

Reinforced after a time interval that varies around an average (e.g., checking social media).
- Types of Partial Schedules

30
Q

What are the key effects of reinforcement schedules?

A
  • Ratio schedules produce faster learning than interval schedules because reinforcement is tied directly to the number of responses, creating a stronger link between behaviour and reward.
  • Variable Ratio is the most resistant to extinction (the basis of gambling addiction).
  • Fixed schedules show post-reinforcement pauses.

31
Q

Why is reinforcement more effective than punishment?

A

Reinforcement teaches what to do, while punishment only suppresses behaviour, so reinforcement shapes long-term behaviour more reliably.

32
Q

What are some problems with punishment?

A
  • Effects are not as permanent
  • Reduces trust
  • Increases aggression
  • Risk of fear and learned helplessness

33
Q

What are effective punishment guidelines?

A
  • No escape possible
  • Intense (within limits)
  • Continuous
  • Immediate
  • Short in duration
  • No mixed signals
  • Reinforce alternative behaviours
  • Watch for negative side effects

34
Q

What are the reward variables that affect learning?

A
  • Drive: Higher motivation (e.g., hunger) → stronger learning.
  • Magnitude (size): Larger rewards = faster learning, but subject to diminishing returns; big rewards also → faster extinction if removed.
  • Delay: Immediate reinforcement is far more effective than delayed; this explains the difficulty of resisting temptations (immediate reward vs delayed punishment, e.g., partying before an exam).

35
Q

What is the Three-Term Contingency?

A
  • Antecedent (A) / Discriminative Stimulus (Sᴰ): The environmental cue or signal that makes a specific behaviour more likely to be reinforced or punished (e.g., a green traffic light).
  • Behaviour (B) / Operant Response (R): The voluntary action that follows the antecedent (e.g., pressing the accelerator).
  • Consequence (C) / Outcome (Sᴿ or Sᴾ): What immediately follows the behaviour and determines whether it occurs more or less frequently in future (e.g., the car moves forward, which is reinforcing).
- Skinner’s claim: this three-term relationship explains all operant behaviour.

36
Q

What is Stimulus Control?

A

When behaviour is governed by specific stimuli.

37
Q

What is Stimulus Generalisation?

A

Response generalises to similar stimuli (e.g., a rat that presses the lever under a green light also presses under a yellow light).

38
Q

What is Stimulus Discrimination?

A

Learner distinguishes between stimuli (e.g., presses only under the green light, not the red).
- achieved through differential reinforcement

39
Q

What is the key takeaway about operant conditioning?

A

It explains voluntary behaviours through reinforcement and punishment.

40
Q

What is Thorndike's Puzzle Box experiment?

A

Cats placed in puzzle boxes could escape by pulling a string, stepping on a platform, or turning a latch.
- Over time, cats escaped faster due to trial-and-error learning.

41
Q

What is forward chaining?

A

Teach step 1, then steps 1+2, and so on.