Chapter 3 Learning

Uploaded by

anammehmood28
Learning

6th July 2020


Lecture 8
Learning: a relatively permanent change in behavior brought about by experience.
Types of Learning
• Classical conditioning: learning to link two stimuli in a way that helps us anticipate an event to which we have a reaction
• Operant conditioning: changing behavior choices in response to consequences
• Cognitive learning: acquiring new behaviors and information through observation and information, rather than by direct experience
Associative Learning: Classical Conditioning
How it works: after repeated exposure to two stimuli occurring in sequence, we associate those stimuli with each other.
Result: our natural response to one stimulus now can be triggered by the new, predictive stimulus.
Stimulus 1: See lightning
Stimulus 2: Hear thunder
Here, our response to thunder becomes associated with lightning.
After Repetition
Stimulus: See lightning
Response: Cover ears to avoid sound
Associative Learning: Operant Conditioning
• Child associates his “response” (behavior) with consequences.
• Child learns to repeat behaviors (saying “please”) which were followed by desirable results (cookie).
• Child learns to avoid behaviors (yelling “gimme!”) which were followed by undesirable results (scolding or loss of dessert).
Cognitive Learning
Cognitive learning refers to acquiring new behaviors and information mentally, rather than by direct experience.
Cognitive learning occurs:
1. by observing events and the behavior of others.
2. by using language to acquire information about events experienced by others.
Behaviorism
• The term behaviorism was used by John B. Watson (1878-1958), a proponent of classical conditioning, as well as by B.F. Skinner (1904-1990), a leader in research about operant conditioning.
• Both scientists believed the mental life was much less important than behavior as a foundation for psychological science.
• Both foresaw applications in controlling human behavior:
Skinner conceived of utopian communities.
Watson went into advertising.
Ivan Pavlov’s Discovery
While studying salivation in dogs, Ivan Pavlov found that salivation from eating food was eventually triggered by what should have been neutral stimuli such as:
• just seeing the food.
• seeing the dish.
• seeing the person who brought the food.
 just hearing that
Before Conditioning
Neutral stimulus: a stimulus which does not trigger a response
Neutral stimulus (NS): no response
Before Conditioning
Unconditioned stimulus and response: a stimulus which triggers a response naturally, before/without any conditioning
Unconditioned stimulus (US): yummy dog food
Unconditioned response (UR): dog salivates
During Conditioning
The bell/tone (N.S.) is repeatedly presented with the food (U.S.).
Neutral stimulus (NS) + Unconditioned stimulus (US)
Unconditioned response (UR): dog salivates
After Conditioning
The dog begins to salivate upon hearing the tone (neutral stimulus becomes conditioned stimulus).
Conditioned (formerly neutral) stimulus
Conditioned response: dog salivates
Did you follow the changes?
The UR and the CR are the same response, triggered by different events. The difference is whether conditioning was necessary for the response to happen.
The NS and the CS are the same stimulus. The difference is whether the stimulus triggers the conditioned response.
Higher-Order Conditioning
If the dog becomes conditioned to salivate at the sound of a bell, can the dog be conditioned to salivate when a light flashes… by associating it with the BELL instead of with food?
• Yes! The conditioned response can be transferred from the US to a CS, then from there to another CS.
• This is higher-order conditioning: turning a NS into a CS by associating it with another CS.
A man who was conditioned to associate joy with coffee could then learn to associate joy with a restaurant if he was served coffee there every time he walked into the restaurant.
Acquisition
Acquisition refers to the initial stage of learning/conditioning.
What gets “acquired”?
• The association between a neutral stimulus (NS) and an unconditioned stimulus (US).
How can we tell that acquisition has occurred?
• The response that was the UR now gets triggered by a CS, making it a CR (drooling now gets triggered by a bell).
Timing
For the association to be acquired, the neutral stimulus (NS) needs to repeatedly appear before the unconditioned stimulus (US)… about a half-second before, in most cases.
The bell must come right before the
Acquisition and Extinction
• The strength of a CR grows with conditioning.
• Extinction refers to the diminishing of a conditioned response. If the US (food) stops appearing with the CS (bell), the CR decreases.
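The growth and decline described above can be illustrated with a toy simulation. This is only a sketch under assumptions that are not part of the lecture: CR strength is a number between 0 and 1, and each trial moves it a fixed fraction (the hypothetical `rate` parameter) toward 1 when the US accompanies the CS, or toward 0 when the CS appears alone. The function names are made up for illustration, and the sketch does not attempt to capture spontaneous recovery.

```python
def update(strength, us_present, rate=0.3):
    """Move CR strength a fraction `rate` toward 1 (US present) or 0 (US absent).

    Assumed toy rule for illustration only; not a model from the lecture.
    """
    target = 1.0 if us_present else 0.0
    return strength + rate * (target - strength)


def run_trials(strength, us_present, n, rate=0.3):
    """Apply `update` for n trials and return the strength after each trial."""
    history = []
    for _ in range(n):
        strength = update(strength, us_present, rate)
        history.append(strength)
    return history


# Acquisition: bell (CS) repeatedly paired with food (US).
acquisition = run_trials(0.0, us_present=True, n=10)
# Extinction: bell presented alone, starting from the acquired strength.
extinction = run_trials(acquisition[-1], us_present=False, n=10)

print([round(s, 2) for s in acquisition])
print([round(s, 2) for s in extinction])
```

Under these assumptions the printed strengths rise on every paired trial and fall on every bell-alone trial, mirroring the two bullets above.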
Spontaneous Recovery [Return of the CR]
After a CR (salivation) has been conditioned and then extinguished:
• following a rest period, presenting the tone alone might lead to a spontaneous recovery (a return of the conditioned response despite a lack of further conditioning).
• if the CS (tone) is again presented repeatedly without the US, the CR becomes extinct again.
Generalization and Discrimination
Please notice the narrow, psychological definition.
Generalization
Ivan Pavlov conditioned dogs to drool when rubbed; they then also drooled when scratched.
Generalization refers to the tendency to have conditioned responses triggered by related stimuli.
MORE stuff makes you drool.
Discrimination
Ivan Pavlov conditioned dogs to drool at bells of a certain pitch; slightly different pitches did not trigger drooling.
Discrimination refers to the learned ability to only respond to a specific stimuli, preventing generalization.
LESS stuff makes you drool.
Ivan Pavlov’s Legacy
John B. Watson and Classical Conditioning: Playing with Fear
• In 1920, 9-month-old Little Albert was not afraid of rats.
• John B. Watson and Rosalie Rayner then clanged a steel bar every time a rat was presented to Albert.
• Albert acquired a fear of rats, and generalized this fear to other soft and furry things.
• Watson prided himself on his ability to shape people’s emotions. He later went into advertising.
Little Albert Experiment: Before Conditioning
NS: rat. Response: no fear
UCS: steel bar hit with hammer. Natural reflex: fear
Little Albert Experiment: During Conditioning
NS: rat, presented with UCS: steel bar hit with hammer. Natural reflex: fear
Little Albert Experiment: After Conditioning
NS: rat. Conditioned reflex: fear
Operant Conditioning
Operant conditioning involves adjusting to the consequences of our behaviors, so we can easily learn to do more of what works, and less of what doesn’t work.
Examples
We may smile more at work after this repeatedly gets us bigger tips.
We learn how to ride a bike using the strategies that don’t make us crash.
How it works:
An act of chosen behavior (a “response”) is followed by a reward or punitive feedback from the environment.
Results:
Reinforced behavior is more likely to be tried again.
Punished behavior is less likely to be chosen in the future.

Response: Consequence: Behavior


balancing a receiving strengthene
Operant and Classical Conditioning are Different Forms of Associative Learning
Classical conditioning:
• involves respondent behavior, reflexive, automatic reactions such as fear or craving
• these reactions to unconditioned stimuli (US) become associated with neutral (then conditioned) stimuli
Operant conditioning:
• involves operant behavior, chosen behaviors which “operate” on the environment
• these behaviors become associated with consequences which punish (decrease) or reinforce (increase) the operant behavior
There is a contrast in the process of conditioning.
Classical: The experimental (neutral) stimulus repeatedly precedes the respondent behavior, and eventually triggers that behavior.
Operant: The experimental (consequence) stimulus repeatedly follows the operant behavior, and eventually punishes or reinforces that behavior.
B.F. Skinner: Behavioral Control
B. F. Skinner saw potential for exploring and using Edward Thorndike’s principles much more broadly. He wondered:
• how can we more carefully measure the effect of consequences on chosen behavior?
• what else can creatures be taught to do by controlling consequences?
• what happens when we change the timing of reinforcement?
B.F. Skinner trained pigeons to play ping pong, and guide a video game missile.
Reinforcement
This meerkat has just completed a task out in the cold. For the meerkat, this warm light is desirable.
• Reinforcement refers to any feedback from the environment that makes a behavior more likely to recur.
• Positive (adding) reinforcement: adding something desirable (e.g., warmth)
• Negative (taking away) reinforcement: ending something unpleasant (e.g., the
A cycle of mutual reinforcement
Children who have a temper tantrum when they are frustrated may get positively reinforced for this behavior when parents occasionally respond by giving in to a child’s demands.
Result: stronger, more frequent tantrums
Parents who occasionally give in to tantrums may get negatively reinforced when the child responds by ending the tantrum.
Result: parents’ giving-in behavior is strengthened (giving in sooner and more often)
Discrimination
• Discrimination refers to the ability to become more and more specific in what situations trigger a response.
• Shaping can increase discrimination, if reinforcement only comes for certain discriminative stimuli.
• For example, dogs, rats, and even spiders can be trained to search for very specific smells, from drugs to explosives.
• Pigeons, seals, and manatees have been trained to respond to specific shapes, colors, and categories.
Bomb-finding rat
Manatee that selects shapes
How often should we
reinforce?
 Do we need to give a reward every single time? Or
is that even best?
 B.F. Skinner experimented with the effects of
giving reinforcements in different patterns or
“schedules” to determine what worked best to
establish and maintain a target behavior.
 In continuous reinforcement (giving a reward
after the target every single time), the subject
acquires the desired behavior quickly.
 In partial/intermittent reinforcement (giving
rewards part of the time), the target behavior takes
longer to be acquired/established but persists
longer without reward.
Different Schedules of Partial/Intermittent Reinforcement
We may schedule our reinforcements based on an interval of time that has gone by.
• Fixed interval schedule: reward every hour
• Variable interval schedule: reward after a changing/random amount of time passes
We may plan for a certain ratio of rewards per number of instances of the desired behavior.
• Fixed ratio schedule: reward every five targeted behaviors
• Variable ratio schedule: reward after a randomly chosen instance of the target behavior
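The four schedules above can be written as small decision rules that answer one question: "does this response earn a reward?" The sketch below is illustrative only. The function names, the time units (minutes), and the choice to model the variable schedules with uniform random draws are all assumptions, not something taken from Skinner's procedures.

```python
import random


def fixed_ratio(n):
    """Reward every n-th response."""
    count = 0

    def on_response(now):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return on_response


def variable_ratio(mean_n, rng):
    """Reward each response with probability 1/mean_n (assumed randomization)."""
    def on_response(now):
        return rng.random() < 1.0 / mean_n

    return on_response


def fixed_interval(period):
    """Reward the first response made at least `period` after the last reward."""
    last = 0.0

    def on_response(now):
        nonlocal last
        if now - last >= period:
            last = now
            return True
        return False

    return on_response


def variable_interval(mean_period, rng):
    """Like fixed_interval, but the required wait is redrawn after each reward."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_period)

    def on_response(now):
        nonlocal last, wait
        if now - last >= wait:
            last = now
            wait = rng.uniform(0, 2 * mean_period)
            return True
        return False

    return on_response


def count_rewards(schedule, response_times):
    """Count how many of the responses are rewarded under a schedule."""
    return sum(schedule(t) for t in response_times)


# One response every 10 minutes for 10 hours (60 responses in total).
times = range(10, 601, 10)
print(count_rewards(fixed_ratio(5), times))      # every fifth response: 12 rewards
print(count_rewards(fixed_interval(60), times))  # once per hour: 10 rewards
```

Note the design difference: ratio schedules only count responses and ignore the clock, while interval schedules only check elapsed time, so responding faster earns more under a ratio schedule but not under an interval schedule.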
Results of the different schedules of reinforcement
Which reinforcements produce more “responding” (more target behavior)?
• Fixed interval: slow, unsustained responding
If I’m only paid for my Saturday work, I’m not going to work as hard on the other days.
• Variable interval: slow, consistent responding
If I never know which day my lucky lottery number
[Graph: fixed interval shows rapid responding near time for reinforcement; variable interval shows steady responding]
Effectiveness of the ratio schedules of Reinforcement
• Fixed ratio: high rate of responding
Buy two drinks, get one free? I’ll buy a lot of them!
• Variable ratio: high, consistent responding, even if reinforcement stops (resists extinction)
If the slot machine sometimes pays, I’ll pull the lever as many times as possible because it may pay this time!
[Graph: responding and reinforcers under fixed ratio and variable ratio schedules]
Operant Effect: Punishment
Punishments have the opposite effects of reinforcement. These consequences make the target behavior less likely to occur in the future.
+ Positive Punishment
You ADD something unpleasant/aversive (ex: spank the child)
- Negative Punishment
You TAKE AWAY something pleasant/desired (ex: no TV time, no attention); MINUS is the “negative” here
Positive does not mean “good” or “desirable” and negative does not mean “bad” or “undesirable.”
When is punishment effective?
• Punishment works best in natural settings when we encounter punishing consequences from actions such as reaching into a fire; in that case, operant conditioning helps us to avoid dangers.
• Punishment is effective when we try to artificially create punishing consequences for others’ choices; these work best when consequences happen as they do in nature.
Applying operant conditioning to parenting
Problems with Physical
Punishment
 Punished behaviors may restart when
the punishment is over; learning is
not lasting.
 Instead of learning behaviors, the
child may learn to discriminate
among situations, and avoid those in
which punishment might occur.
 Instead of behaviors, the child might
learn an attitude of fear or hatred,
which can interfere with learning.
This can generalize to a fear/hatred
of all adults or many settings.
 Physical punishment models
aggression and control as a method
of dealing with problems.
Don’t think about the beach
Don’t think about the waves, the sand, the towels and sunscreen, the sailboats and surfboards.
Don’t think about the beach.
Are you obeying the instruction? Would you obey this instruction more if you were punished for thinking about the beach?
Problem:
Punishing focuses on what NOT to do, which does
not guide people to a desired behavior.
Even if undesirable behaviors do stop, another
problem behavior may emerge that serves the
same purpose, especially if no replacement
behaviors are taught and reinforced.

Lesson:
In order to teach desired
behavior, reinforce
what’s right more often
than punishing what’s
wrong.
More effective forms of operant
conditioning
The Power of Rephrasing
 Positive punishment: “You’re
playing video games instead
of practicing the piano, so I
am justified in YELLING at
you.”
 Negative punishment: “You’re
avoiding practicing, so I’m
turning off your game.”
 Negative reinforcement: “I will
stop staring at you and bugging
you as soon as I see that you
are practicing.”
 Positive reinforcement: “After
you practice, we’ll play a
game!”
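Summary: Types of Consequences
Reinforcement (outcome: strengthens target behavior, e.g., you do chores)
• Adding stimuli: Positive (+) Reinforcement (You get candy)
• Subtracting stimuli: Negative (–) Reinforcement (I stop yelling)
Punishment (outcome: reduces target behavior, e.g., cursing)
• Adding stimuli: Positive (+) Punishment (You get spanked)
• Subtracting stimuli: Negative (–) Punishment (No cell phone)
Positive reinforcement and negative punishment use desirable stimuli; negative reinforcement and positive punishment use unpleasant stimuli.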
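The four consequence types reduce to two questions: is a stimulus added or removed, and is that stimulus desirable or unpleasant? A minimal lookup sketch follows. The function name `classify_consequence` and the string labels are hypothetical and chosen only for illustration.

```python
def classify_consequence(change, stimulus):
    """Name the consequence type and its effect on the target behavior.

    change:   "add" or "remove" (is a stimulus added or taken away?)
    stimulus: "desirable" or "unpleasant"
    """
    table = {
        ("add", "desirable"): ("positive reinforcement", "strengthens"),
        ("remove", "unpleasant"): ("negative reinforcement", "strengthens"),
        ("add", "unpleasant"): ("positive punishment", "reduces"),
        ("remove", "desirable"): ("negative punishment", "reduces"),
    }
    return table[(change, stimulus)]


# Examples taken from the summary: candy, yelling stops, spanking, no cell phone.
print(classify_consequence("add", "desirable"))      # ('positive reinforcement', 'strengthens')
print(classify_consequence("remove", "unpleasant"))  # ('negative reinforcement', 'strengthens')
print(classify_consequence("add", "unpleasant"))     # ('positive punishment', 'reduces')
print(classify_consequence("remove", "desirable"))   # ('negative punishment', 'reduces')
```

The lookup makes the naming rule explicit: "positive" and "negative" track only whether something is added or removed, while the effect on behavior depends on the combination.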
More Operant Conditioning Applications
Parenting
1. Rewarding small improvements toward desired behaviors works better than expecting complete success, and also works better than punishing problem behaviors.
2. Giving in to temper tantrums stops them in the short run but increases them in the long run.
Self-Improvement
Reward yourself for steps you take toward your goals. As you establish good habits, then make your rewards more infrequent (intermittent).
Role of Biology in
Conditioning
Classical Conditioning
 John Garcia and others found it was
easier to learn associations that make
sense for survival.
 Food aversions can be acquired even if
the UR (nausea) does NOT immediately
follow the NS. When acquiring food
aversions during pregnancy or illness, the
body associates nausea with whatever
food was eaten.
 Males in one study were more likely to
see a pictured woman as attractive if the
picture had a red border.
 Quail can have a sexual response linked to
a fake quail more readily and strongly
than to a red light.
Cognitive Processes
In classical conditioning
• When the dog salivates at the bell, it may be due to cognition (learning to predict, even expect, the food).
• Conditioned responses can alter attitudes, even when we know the change is caused by conditioning.
• However, knowing that our reactions are caused by conditioning gives us the option of mentally breaking the association, e.g. deciding that nausea associated with a food aversion was actually caused by an illness.
• Higher-order conditioning involves some cognition; the name of a food may trigger salivation.
In operant conditioning
• In fixed-interval reinforcement, animals do more target behaviors/responses around the time that the reward is more likely, as if expecting the reward.
• Expectation as a cognitive skill is even more evident in the ability of humans to respond to delayed reinforcers such as a paycheck.
• Higher-order conditioning can be enabled with cognition; e.g., seeing something such as money as a reward because of its indirect value.
• Humans can set behavioral goals for self and others,
Learning, Rewards, and Motivation
• Intrinsic motivation refers to the desire to perform a behavior well for its own sake. The reward is internalized as a feeling of satisfaction.
• Extrinsic motivation refers to doing a behavior to receive rewards from others.
• Intrinsic motivation can sometimes be reduced by external rewards, and can be prevented by using continuous reinforcement.
• One principle for maintaining behavior is to
What might happen if we begin to reward a behavior someone was already doing and enjoying?
Learning by
Observation
 Can we learn new behaviors and skills without
conditioning and reward?
 Yes, and one of the ways we do so is by observational
learning: watching what happens when other people do
a behavior and learning from their experience.
 Skills required: mirroring, being able to picture
ourselves doing the same action, and cognition,
noticing consequences and associations.

Observational Learning Processes
Modeling
The behavior of others serves as a model, an example of how to respond to a situation; we may try this model regardless of reinforcement.
Vicarious Conditioning
• Vicarious: experienced indirectly, through others
• Vicarious reinforcement and punishment means our choices are affected as we see others get consequences for their behaviors.
Albert Bandura’s Bobo Doll Experiment
(1961)
 Kids saw adults punching an inflated doll while
narrating their aggressive behaviors such as “kick
him.”
 These kids were then put in a toy-deprived
situation… and acted out the same behaviors they
had seen.
Mirroring in the
Brain
 When we watch others doing or feeling
something, neurons fire in patterns that would
fire if we were doing the action or having the
feeling ourselves.
 These neurons are referred to as mirror neurons,
and they fire only to reflect the actions or feelings
of others.
From Mirroring to
Imitation
 Humans are prone to spontaneous imitation of
both behaviors and emotions (“emotional
contagion”).
 This includes even overimitating, that is, copying
adult behaviors that have no function and no
reward.
 Children with autism are less likely to cognitively
“mirror,”
and less likely to follow someone else’s gaze as
a neurotypical toddler (left) is doing below.
Mirroring Plus Vicarious Reinforcement
• Mirroring enables observational learning; we cognitively practice a behavior just by watching it.
• If you combine this with vicarious reinforcement, we are even more likely to get imitation.
 Monkey A saw Monkey B getting a banana after
pressing four symbols. Monkey A then pressed the
same four symbols (even though the symbols were in
different locations).
Prosocial Effects of Observational
Learning
 Prosocial behavior
refers to actions
which benefit
others, contribute
value to groups,
and follow moral
codes and social
norms.
 Parents try to teach
this behavior
through lectures,
but it may be taught
best through
modeling…
especially if kids can
see the benefits of
the behavior to
oneself or others.
Antisocial Effects of Observational
Learning
 What happens when we learn
from models who demonstrate
antisocial behavior, actions
that are harmful to individuals
and society?
 Children who witness violence in
their homes, but are not
physically harmed themselves,
may hate violence but still may
become violent more often than
the average child.
 Perhaps this is a result of “the
Bobo doll effect”? Under stress,
we do what has been modeled
for us.
Media Models of Violence
Do we learn antisocial behavior such as violence from indirect observations of others in the media?
Research shows that viewing media violence leads to increased aggression (fights) and reduced prosocial behavior (such as helping an injured person).
This violence-viewing effect might be explained by imitation, and also by desensitization toward pain in
The END
