Animal Training - Reinforcement Schedules


Continuous Reinforcement

  • The desired behaviour is reinforced every single time it occurs.
  • Best used during the initial stages of learning, in order to create a strong association between the behaviour and the reinforcer.
  • Once the response is firmly established, reinforcement is usually switched to a partial reinforcement schedule.

Partial Reinforcement

  • The response is reinforced only part of the time
  • Learned behaviours are acquired more slowly under partial reinforcement
  • Response is more resistant to extinction
  • There are 4 schedules of partial reinforcement
    •  Fixed-ratio schedules
    • Variable-ratio schedules
    • Fixed-interval schedules
    • Variable-interval schedules

Partial Reinforcement: Fixed-ratio Schedules

  • Response is reinforced only after a specified number of responses
  • Produces a high, steady rate of responding with only a brief pause after the delivery of the reinforcer
  • Fixed ratio means that if a behaviour is performed X number of times, there will be reinforcement on the Xth performance
  • For a fixed ratio of 1:3, every third behaviour is rewarded
  • Tends to lead to poor performance with some animals, because the reward is predictable
    • They know the first two performances will not be rewarded
    • The third one will be rewarded no matter what
  • Some assembly line production systems work on this schedule, e.g., the worker gets paid for every 10 widgets produced
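
The fixed-ratio rule above can be sketched in a few lines of Python. This is only an illustration: the function name fixed_ratio_rewards and its parameters are made up for this example, and it assumes every response counts as a correct performance.

```python
def fixed_ratio_rewards(num_responses, ratio):
    """Return a list of booleans, True where a response is reinforced.

    Hypothetical helper: assumes every response is a correct performance,
    and that a fixed ratio of 1:ratio rewards every ratio-th response.
    """
    return [i % ratio == 0 for i in range(1, num_responses + 1)]


# Fixed ratio of 1:3 over nine responses: only every third one is rewarded.
print(fixed_ratio_rewards(9, 3))
# [False, False, True, False, False, True, False, False, True]
```

The pattern is completely predictable, which is why some animals put less effort into the responses they know will not be rewarded.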

Partial Reinforcement: Variable-ratio Schedules

  • Occur when a response is reinforced after an unpredictable number of responses
  • Creates a high, steady rate of responding
  • Gambling and lottery games are examples of rewards based on a variable-ratio schedule
  • Reinforcers are distributed based on the average number of correct behaviours
  • Variable ratio of 1:3 means that on average, one out of every three behaviours will be rewarded
    • Reward must average 1 in 3
    • e.g., the first, third or fourth behaviour may be rewarded
  • Often referred to as a variable schedule of reinforcement, or VSR
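
A minimal Python sketch of a variable-ratio schedule is given here, under stated assumptions. The function name variable_ratio_rewards is hypothetical, and the choice to draw the gap between rewards uniformly from 1 to (2 x mean_ratio - 1) is just one simple way to make the rewards average 1 in mean_ratio; a trainer would not normally follow any fixed formula.

```python
import random


def variable_ratio_rewards(num_responses, mean_ratio, seed=None):
    """Mark which responses are reinforced under a variable-ratio schedule.

    Assumption for illustration: the number of responses needed for the
    next reward is drawn uniformly from 1..(2 * mean_ratio - 1), so the
    gaps average mean_ratio responses per reward.
    """
    rng = random.Random(seed)
    rewards = []
    until_next = rng.randint(1, 2 * mean_ratio - 1)
    for _ in range(num_responses):
        until_next -= 1
        if until_next == 0:
            rewards.append(True)
            # Pick a new, unpredictable gap for the next reward.
            until_next = rng.randint(1, 2 * mean_ratio - 1)
        else:
            rewards.append(False)
    return rewards


# Variable ratio of 1:3 over a long run of responses.
rewards = variable_ratio_rewards(3000, 3, seed=1)
print(sum(rewards) / len(rewards))  # close to 1/3 over a long run
```

Any single reward is unpredictable, but over many responses the proportion rewarded settles near one in three.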

Partial Reinforcement: Fixed-interval Schedules

  • First response is rewarded only after a specified amount of time has elapsed
  • Produces a high rate of responding near the end of the interval
  • Slower responding immediately after the delivery of the reinforcer
  • Reward will occur after a fixed amount of time
    • e.g., every five minutes
    • Pay cheques work on this schedule
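
As a rough sketch, the fixed-interval rule can be written in Python as follows. The function name fixed_interval_rewards is invented for this example, times are in arbitrary units such as minutes, and the clock is assumed to start at time 0 and restart each time a reward is given.

```python
def fixed_interval_rewards(response_times, interval):
    """Return the times of the responses that are reinforced.

    response_times: sorted times at which the behaviour occurs.
    A response is rewarded only if at least `interval` has elapsed since
    the last reward (or since time 0 for the first reward).
    """
    rewarded = []
    last_reward = 0
    for t in response_times:
        if t - last_reward >= interval:
            rewarded.append(t)
            last_reward = t  # the interval restarts after each reward
    return rewarded


# Five-minute interval: responses before the interval is up earn nothing.
print(fixed_interval_rewards([1, 2, 4, 5, 6, 9, 10, 12], 5))
# [5, 10]
```

Responses made early in the interval are never rewarded, which fits the slower responding seen just after a reinforcer is delivered.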

Partial Reinforcement: Variable-interval Schedules

  • A response is rewarded after an unpredictable amount of time has passed
  • Produces a slow, steady rate of response
  • Variable interval schedule means that reinforcers will be distributed after a varying amount of time
    • Sometimes it will be five minutes, sometimes three, sometimes seven, sometimes on
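
A variable-interval schedule can be sketched in Python in a similar way. This is an illustration only: the function name variable_interval_rewards and the min_wait and max_wait parameters are hypothetical, and drawing each waiting time uniformly between the two limits is an assumption made for simplicity.

```python
import random


def variable_interval_rewards(response_times, min_wait, max_wait, seed=None):
    """Return the times of the responses that are reinforced.

    Assumption for illustration: after each reward, the wait before the
    next reward becomes available is drawn uniformly from
    [min_wait, max_wait]. The first response after that wait is rewarded.
    """
    rng = random.Random(seed)
    rewarded = []
    last_reward = 0
    wait = rng.uniform(min_wait, max_wait)
    for t in response_times:
        if t - last_reward >= wait:
            rewarded.append(t)
            last_reward = t
            wait = rng.uniform(min_wait, max_wait)  # new unpredictable wait
    return rewarded


# One response per minute for an hour, with waits of 3 to 7 minutes.
print(variable_interval_rewards(range(1, 61), 3, 7, seed=0))
```

Because the subject cannot tell when the next reward will become available, steady responding is the only way to catch it promptly.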

Choosing a Schedule

  • Deciding when to reinforce a behaviour can depend on a number of factors
  • When you are trying to teach a new behaviour, a continuous schedule is often a good choice
  • Once the behaviour is learned, a partial schedule is often preferred
  • Reinforcing a behaviour every time it occurs can be difficult and requires a great deal of attention and resources
  • Partial schedules lead to behaviours that are more resistant to extinction and also reduce the risk that the subject will become satiated
  • If the reinforcer being used is no longer desired or rewarding, the subject may stop performing

Extinction

  • If reinforcement fails to occur after a behaviour that has been reinforced in the past, the behaviour might extinguish
  • Variable ratio schedule of reinforcement makes the behaviour less vulnerable to extinction
  • If you’re not expecting to gain a reward every time you carry out a behaviour, you are unlikely to stop the first few times your action fails to generate the desired reward
  • When a behaviour that has been strongly reinforced in the past no longer gains reinforcement, you might see an extinction burst
    • This is when an animal performs the behaviour over and over in a burst of activity
  • A dog that barks all night may have learned to do this because the owners get up when it barks to let it out. If the owners stop getting up, the dog may bark louder as it tries harder to get the desired result. This is an example of an extinction burst, and the barking must be ignored so that it is not reinforced again.

Premack Principle

  • A more commonly occurring action can be used to reinforce one that doesn’t occur as often. It was developed in 1965 by David Premack.
  • Could be applied to recalling a dog: the dog is only given a treat, or only allowed to play fetch, once it has returned to you.
