Have a test tomorrow 😅 cramming all night
Really well explained with good examples 👍🏻✨
This is a really good channel. I hope you keep making videos!
Thank you for this.. really helpful
gotta watch this video for my virtual report lmao. thank you!
Glad I could help!
I have a question: if in continuous reinforcement he gets a dollar every time he does the behavior, why is continuous reinforcement more likely to lead to extinction of the behavior? Wouldn't it encourage him to behave that way permanently, more than intermittent reinforcement would? I have a test tomorrow and I'm really confused about this.
I suspect it has to do with evolution. The more difficult it is to obtain a reward, the more dopamine floods our brain when we get it. Certain resources are harder to obtain (a coconut on a high tree, for example). We aren't going to work for the harder, riskier-to-obtain coconut if it doesn't feel excellent to get it. There's a biological advantage to taking risks to get rewards, though (the coconut has water and nutrients), so the people whose brain chemistry rewards them for getting the coconut survive long enough to pass on their genes, including that reward system in the brain.
With that neurology, predictable and consistent is interpreted as easy to obtain and thus low value. This money mustn't be worth much if they're just "giving it away". The best things are harder to get.
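Another way to see why extinction is faster after continuous reinforcement: under continuous reinforcement the learner has never once gone unrewarded, so the very first dry response during extinction is an unmistakable signal that the rules changed, whereas under a variable-ratio schedule long unrewarded streaks are routine and extinction just looks like a bad run. A minimal sketch of that idea (the function name is my own, and it models a VR schedule as an independent payoff probability per response):

```python
import random

def longest_dry_run(p_reward, n_responses, seed=0):
    """Longest streak of unrewarded responses when each response
    pays off independently with probability p_reward."""
    rng = random.Random(seed)
    longest = current = 0
    for _ in range(n_responses):
        if rng.random() < p_reward:
            current = 0            # rewarded: streak resets
        else:
            current += 1           # unrewarded: streak grows
            longest = max(longest, current)
    return longest

# Continuous reinforcement: every response pays, so the learner has
# never seen even one dry response. The first unrewarded response
# during extinction is therefore an obvious change.
print(longest_dry_run(1.0, 500))   # 0

# Variable ratio (about 1 payoff per 5 responses): long dry streaks
# are routine during training, so extinction is hard to detect.
print(longest_dry_run(0.2, 500))
```

The point is discriminability, not reward size: the VR-trained learner keeps responding through extinction because nothing yet distinguishes it from an ordinary dry run.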
Great video ty
Could you please tell me why variable-ratio reinforcement is so addictive? I mean, what is the psychology behind it? Why do we get addicted to it when we don't know when we are getting rewarded? I'm searching so much but I cannot find the answer.
I remember in class it was explained to me that it's like someone in a casino. Since they don't know when they'll win on the machine, they keep playing in the hope that the next turn will provide reinforcement. For example, they may play 2 times and be reinforced. They play another 2 times and receive no reinforcement, but receive it after 5. They play 3 times and, just as they're about to give up, win on the 4th. Then they win again on 2.
An example our professor actually used in class was giving pop quizzes on a variable ratio. Because we did not know when a pop quiz would occur (how many class sessions would go by before one was given), attending class was reinforced. We never knew whether the next class period would be the day.
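The payoff pattern described above (a win after 2 plays, then after 5, then 4, then 2) is exactly what a variable-ratio schedule generates: the number of responses each reward requires varies unpredictably around a mean. A rough sketch, assuming a VR schedule whose required counts are drawn uniformly around the mean (the function names are mine, not standard terminology):

```python
import random

def variable_ratio(mean_ratio, n_rewards, seed=42):
    """Responses required for each successive reward under a
    variable-ratio schedule: unpredictable, averaging mean_ratio."""
    rng = random.Random(seed)
    # randint bounds are inclusive, so 1..(2*mean - 1) averages to mean
    return [rng.randint(1, 2 * mean_ratio - 1) for _ in range(n_rewards)]

def fixed_ratio(ratio, n_rewards):
    """Fixed-ratio contrast: every reward costs exactly the same."""
    return [ratio] * n_rewards

print(variable_ratio(4, 6))   # varies run to run, e.g. a mix of 1..7
print(fixed_ratio(4, 6))      # [4, 4, 4, 4, 4, 4] -- fully predictable
```

The psychological hook lives in that first list: since any response might be the winning one, there is never a point where quitting is clearly correct.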
Good video, except that reinforcement schedules ALWAYS map to affect and to corresponding, observable neurological events, at least for a radical behaviorist, that is.
Why Behavior Analysis needs to be a bit more ‘radical’.
A problem with behavior analysis, which reflects a methodological behaviorism, is that it cannot capture affect, feelings, or emotion, to the resulting detriment of its predictive power. However, a radical behaviorism, which aims to explain as well as predict behavior, can. First, some context. In their 1994 book 'Learning and Complex Behavior', the radical behaviorists John Donahoe and David Palmer proposed a 'Unified Reinforcement Principle' that merged the data languages of operant and classical conditioning under a 'discrepancy' model of incentive or reward, and dismissed any process-level distinctions between the two types of conditioning. An entire issue (linked below) of the JEAB was dedicated to its review, and although the commentary on their arguments was generally favorable, their work has had little impact on behavior analysis.
To D&P, reinforcement is mediated by the activity of midbrain dopamine systems that are activated or suppressed as an individual, from moment to moment, anticipates or experiences positive or negative discrepancies from what is expected. What the authors did not explain in their work, and what is canon in the field of affective neuroscience, is that dopamine can phasically increase or decrease depending on the nature of the discrepancy and the reinforcer, is felt as positive or negative affect, and alters the incentive value or decision utility of momentary behavior, which may or may not cohere with the rational ends of behavior.
Here is a simple example using two concurrent contingencies of reward. A game of solitaire represents a VR schedule of reinforcement, but the 'reinforcement' consists of the continuously variable positive discrepancies that occur as one plays. These discrepancies are mapped to phasic dopamine release, which scales with the importance of the end goal of winning the game (say, a monetary reward). Thus the player would be mildly interested if the reward were low, and highly attentive and aroused if it were high. Now introduce a concurrent contingency of a piece-rate FR (fixed-ratio) schedule of reward, representative of common working environments. Because the task is completely predictable, discrepancy is low and dopamine systems are depressed. In other words, you are bored. Even though the piece-work routine yields a larger monetary payoff and is rationally preferable, the solitaire game is 'affectively' preferable, creating a dilemma that initiates a third response (also likely operant) of covert neuromuscular tension, or stress. These concurrent schedules are rife during our working days, but the affective component cannot be captured by an observation of overt behavior; it must instead be reliably inferred.
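The 'discrepancy' idea here is close to what reinforcement-learning models call a reward prediction error: a signal proportional to received minus expected reward. A toy Rescorla-Wagner-style sketch (my own illustration, not code from D&P's book) of why a fully predictable fixed-ratio payoff stops producing discrepancies while a variable-ratio payoff never does:

```python
import random

def prediction_errors(rewards, alpha=0.1):
    """Reward prediction errors (delta = reward - expectation), with the
    expectation nudged toward each outcome (Rescorla-Wagner style)."""
    v, deltas = 0.0, []
    for r in rewards:
        delta = r - v          # the "discrepancy": dopamine-like signal
        deltas.append(delta)
        v += alpha * delta     # expectation drifts toward what happens
    return deltas

# Fixed-ratio piece work: the same payoff every time. Prediction errors
# shrink toward zero -- the outcome becomes fully expected ("boring").
fixed = prediction_errors([1.0] * 200)

# Variable-ratio play: payoff only sometimes. Errors never settle, so
# every trial still generates a discrepancy.
rng = random.Random(1)
variable = prediction_errors(
    [1.0 if rng.random() < 0.2 else 0.0 for _ in range(200)])

print(abs(fixed[-1]))                        # effectively zero
print(max(abs(d) for d in variable[-20:]))   # still clearly nonzero
```

Under this reading, the 'bored at piece work, drawn to solitaire' dilemma is just the two tails of the same delta signal.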
Like a cold being mapped to a virus, the fact that affect can be mapped to abstract properties of response contingencies is non-controversial, and its lack of acceptance by behavior analysis is detrimental to its ability to predict, control, and understand behavior.
-------------------------------------------------
My arguments and examples are expanded in the linked paper below on incentive motivation. Aimed for a lay audience, it was written in consultation with and with the endorsement of the distinguished affective neuroscientist Dr. Kent Berridge, whose research on incentive motivation conforms with the perspective of D&P. (my ‘solitaire’ example is further developed on pp.36-45)
www.scribd.com/document/495438436/A-Mouse-s-Tale-a-practical-explanation-and-handbook-of-motivation-from-the-perspective-of-a-humble-creature
March 1997 issue of JEAB: The S-R Issue, Its Status in Behavior Analysis and in Donahoe and Palmer's 'Learning and Complex Behavior'
www.ncbi.nlm.nih.gov/pmc/articles/PMC1284592/pdf/9132463.pdf
Berridge and Incentive Motivation sites.lsa.umich.edu/berridge-lab/wp-content/uploads/sites/743/2020/09/Berridge-2001-Reward-learning-chapter.pdf
Berridge Lab, University of Michigan sites.lsa.umich.edu/berridge-lab/
Thanks
😊