Mod 27

It’s one thing to classically condition a dog to salivate to the sound of a tone, or a child to fear moving cars. But to teach an elephant to
walk on its hind legs or a child to say please, we turn to operant
conditioning.

Classical conditioning and operant conditioning are both forms of associative learning, yet their differences are straightforward:

 Classical conditioning forms associations between stimuli (a CS and the US it signals). It also involves respondent behavior—
automatic responses to a stimulus (such as salivating in
response to meat powder and later in response to a tone).

 In operant conditioning, organisms associate their own actions with consequences. Actions followed by reinforcers increase;
those followed by punishments often decrease. Behavior
that operates on the environment to produce rewarding or
punishing stimuli is called operant behavior.

Skinner’s Experiments

 27-2 Who was Skinner, and how is operant behavior reinforced and shaped?

B. F. Skinner (1904–1990) was a college English major and aspiring writer who, seeking a new direction, enrolled as a graduate student in psychology.
He went on to become modern behaviorism’s most influential and
controversial figure. Skinner’s work elaborated on what psychologist Edward
L. Thorndike (1874–1949) called the law of effect: Rewarded behavior tends
to recur (Figure 27.1), and punished behavior is less likely to recur. Using
Thorndike’s law of effect as a starting point, Skinner developed a behavioral
technology that revealed principles of behavior control. By shaping pigeons’
natural walking and pecking behaviors, for example, Skinner was able to
teach them such unpigeon-like behaviors as walking in a figure 8, playing
Ping-Pong, and keeping a missile on course by pecking at a screen target.
Figure 27.1

Cat in a puzzle box

Thorndike used a fish reward to entice cats to find their way out of a puzzle
box through a series of maneuvers. The cats’ performance tended to improve
with successive trials, illustrating Thorndike’s law of effect. (Data
from Thorndike, 1898.)

For his pioneering studies, Skinner designed an operant chamber, popularly known as a Skinner box (Figure 27.2). The box has a bar (a lever)
that an animal presses—or a key (a disc) the animal pecks—to release a
reward of food or water. It also has a device that records these responses.
This creates a stage on which rats and other animals act out Skinner’s
concept of reinforcement: any event that strengthens (increases the
frequency of) a preceding response. What is reinforcing depends on the
animal and the conditions. For people, it may be praise, attention, or a
paycheck. For hungry and thirsty rats, food and water work well. Skinner’s
experiments have done far more than teach us how to pull habits out of a
rat. They have explored the precise conditions that foster efficient and
enduring learning.
Figure 27.2

A Skinner box

Inside the box, the rat presses a bar for a food reward. Outside, measuring
devices (not shown here) record the animal’s accumulated responses.

Shaping Behavior

Imagine that you wanted to condition a hungry rat to press a bar. Like
Skinner, you could tease out this action with shaping, gradually guiding the
rat’s actions toward the desired behavior. First, you would watch how the
animal naturally behaves, so that you could build on its existing behaviors.
You might give the rat a bit of food each time it approaches the bar. Once the
rat is approaching regularly, you would give the food only when it moves
close to the bar, then closer still. Finally, you would require it to touch the bar
to get food. By rewarding successive approximations (as Sutherland did with
her husband), you reinforce responses that are ever closer to the final
desired behavior, and you ignore all other responses. By making rewards
contingent on desired behaviors, researchers and animal trainers gradually
shape complex behaviors.
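The successive-approximations loop described above can be sketched as a toy simulation. This is an illustrative model, not Skinner's actual procedure: the `shape` function, the `habit` variable, and the 5 percent criterion-tightening rule are all assumptions invented for the example. A response is reinforced whenever it lands within a shrinking criterion distance of the target behavior, and each reinforced response becomes the animal's new baseline.

```python
import random

def shape(trials=500, start_criterion=10.0, target=0.0, seed=1):
    """Toy shaping loop: reinforce any response within the current
    criterion distance of the target, then tighten the criterion."""
    rng = random.Random(seed)
    criterion = start_criterion
    habit = start_criterion - 1.0   # behavior starts far from the target
    for _ in range(trials):
        # each response varies randomly around the current habitual behavior
        response = habit + rng.uniform(-1.0, 1.0)
        if abs(response - target) <= criterion:
            habit = response     # reinforced responses are repeated...
            criterion *= 0.95    # ...and a closer approximation is required next
        # responses outside the criterion are simply ignored, not punished
    return habit, criterion

final_habit, final_criterion = shape()
```

Run it with different seeds and the same pattern appears: ignoring off-target responses while rewarding ever-closer ones gradually pulls behavior toward the goal, mirroring how a trainer builds on what the animal already does.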
Reinforcers vary with circumstances
What is reinforcing (a heat lamp) to one animal (a cold meerkat) may not be to another (an overheated child). What is reinforcing in one situation (a cold snap at the Taronga Zoo in Sydney, Australia) may not be in another (a sweltering summer day).
Shaping can also help us understand what nonverbal organisms can
perceive. Can a dog distinguish red and green? Can a baby hear the
difference between lower- and higher-pitched tones? If we can shape them to
respond to one stimulus and not to another, then we know they can perceive
the difference. Such experiments have even shown that some nonhuman
animals can form concepts. When experimenters reinforced pigeons for
pecking after seeing a human face, but not after seeing other images, the
pigeon’s behavior showed that it could recognize human faces (Herrnstein &
Loveland, 1964). In this experiment, the human face was a discriminative
stimulus. Like a green traffic light, discriminative stimuli signal that a
response will be reinforced (Figure 27.3). After being trained to discriminate
among classes of events or objects—flowers, people, cars, chairs—pigeons
can usually identify the category in which a new pictured object belongs
(Bhatt et al., 1988; Wasserman, 1993). They have even been trained to
discriminate between the music of Bach and Stravinsky (Porter & Neuringer,
1984).
Figure 27.3

Bird brains spot tumors

After being rewarded with food when correctly spotting breast tumors,
pigeons became as skilled as humans at discriminating cancerous from
healthy tissue (Levenson et al., 2015). Other animals have been shaped to
sniff out land mines or locate people amid rubble (La Londe et al., 2015).
Skinner noted that we continually reinforce and shape others’ everyday
behaviors, though we may not mean to do so. Isaac’s whining annoys his
dad, for example, but consider how his dad typically responds:

Isaac: Could you take me to the mall?

Dad: (Continues reading paper.)

Isaac: Dad, I need to go to the mall.

Dad: Uh, yeah, in a few minutes.

Isaac: DAAAAD! The mall!!

Dad: Show me some manners! Okay, where are my keys …

Isaac’s whining is reinforced, because he gets something desirable—a trip to the mall. Dad’s response is reinforced, because it gets rid of something
aversive—Isaac’s whining.

Or consider a teacher who sticks gold stars on a wall chart beside the names
of children scoring 100 percent on spelling tests. As everyone can then see,
some children consistently do perfect work. The others, who may have
worked harder than the academic all-stars, get no rewards. The teacher
would be better advised to apply the principles of operant conditioning—to
reinforce all spellers for gradual improvements (successive approximations
toward perfect spelling of words they find challenging).

Types of Reinforcers

Flip It Video: Reinforcement

 27-3 How do positive and negative reinforcement differ, and what are the basic types of reinforcers?

Until now, we’ve mainly been discussing positive reinforcement, which strengthens responding by presenting a typically pleasurable stimulus
immediately after a response. But, as the whining Isaac story illustrates,
there are two basic kinds of reinforcement (Table 27.1). Negative
reinforcement strengthens a response by reducing or
removing something negative. Isaac’s whining was positively reinforced,
because Isaac got something desirable—a trip to the mall. His dad’s
response (doing what Isaac wanted) was negatively reinforced, because it
ended an aversive event—Isaac’s whining. Similarly, taking aspirin may
relieve your headache, and hitting snooze will silence your irritating alarm.
These welcome results provide negative reinforcement and increase the odds
that you will repeat these behaviors. For those with drug addiction, the
negative reinforcement of ending withdrawal pangs can be a compelling
reason to resume using (Baker et al., 2004). Note that negative
reinforcement is not punishment. (Some friendly advice: Repeat those
italicized words in your mind.) Rather, negative reinforcement—psychology’s
most misunderstood concept—removes a punishing (aversive) event. Think
of negative reinforcement as something that provides relief—from that
whining teenager, bad headache, or annoying seat belt alarm.

TABLE 27.1 Ways to Increase Behavior

Operant Conditioning Term | Description | Examples
Positive reinforcement | Add a desirable stimulus. | Pet a dog that comes when you call it; pay someone for work done.
Negative reinforcement | Remove an aversive stimulus. | Take painkillers to end pain; fasten seat belt to end loud beeping.

Crying works
How is operant conditioning at work in this cartoon?
AP® EXAM TIP

Prepare to identify specific examples of positive and negative reinforcements. Pay particular attention to Table 27.1 for guidance.

Sometimes negative and positive reinforcement coincide. Imagine a worried student who, after goofing off and getting a bad test grade, studies harder
for the next test. This increased effort may be negatively reinforced by
reduced anxiety, and positively reinforced by a better grade. We reap the
rewards of escaping the aversive stimulus, which increases the chances that
we will repeat our behavior. The point to remember: Whether it works by
reducing something aversive or by providing something
desirable, reinforcement is any consequence that strengthens behavior.

Primary and Conditioned Reinforcers

Getting food when hungry or having a painful headache go away is innately satisfying. These primary reinforcers are unlearned. Conditioned
reinforcers, also called secondary reinforcers, get their power through
learned association with primary reinforcers. If a rat in a Skinner box learns
that a light reliably signals a food delivery, the rat will work to turn on the
light (see Figure 27.2). The light has become a conditioned reinforcer. Our
lives are filled with conditioned reinforcers—money, good grades, a pleasant
tone of voice—each of which has been linked with more basic rewards. If
money is a conditioned reinforcer—if people’s desire for money is derived
from their desire for food—then hunger should also make people more
money hungry, reasoned one European research team (Briers et al., 2006).
Indeed, in their experiments, people were less likely to donate to charity
when food deprived, and less likely to share money with fellow participants
when in a room with hunger-arousing aromas.

Immediate and Delayed Reinforcers

Let’s return to the imaginary shaping experiment in which you were conditioning a rat to press a bar. Before performing this “wanted” behavior,
the hungry rat will engage in a sequence of “unwanted” behaviors—
scratching, sniffing, and moving around. If you present food immediately
after any one of these behaviors, the rat will likely repeat that rewarded
behavior. But what if the rat presses the bar while you are distracted, and
you delay giving the reinforcer? If the delay lasts longer than about 30
seconds, the rat will not learn to press the bar. It will have moved on to other
incidental behaviors, such as scratching, sniffing, and moving, and one of
these behaviors will instead get reinforced.

Unlike rats, humans do respond to delayed reinforcers: the paycheck at the end of the week, the good grade at the end of the term, the trophy at the
end of the sports season. Indeed, to function effectively we must learn to
delay gratification. In one of psychology’s most famous studies, some 4-year-
olds showed this ability. In choosing a piece of candy or a marshmallow,
these impulse-controlled children preferred having a big one tomorrow to
munching on a small one right away. Learning to control our impulses in
order to achieve more valued rewards is a big step toward maturity and can
later protect us from committing an impulsive crime (Åkerlund et al.,
2016; Logue, 1998a,b). Children who delay gratification have tended to
become socially competent and high-achieving adults (Mischel, 2014).
“Oh, not bad. The light comes on, I press the bar, they write me a check.
How about you?”

To our detriment, small but immediate pleasures (the enjoyment of watching late-night TV, for example) are sometimes more alluring than big but delayed
rewards (feeling rested for a big test tomorrow). For many teens, the
immediate gratification of risky, unprotected sex in passionate moments
prevails over the delayed gratifications of safe sex or saved sex. And for
many people, the immediate rewards of gas-guzzling vehicles, air travel, and
air conditioning prevail over the bigger future consequences of global climate
change, rising seas, and extreme weather.

Reinforcement Schedules

 27-4 How do different reinforcement schedules affect behavior?

In most of our examples, the desired response has been reinforced every
time it occurs. But reinforcement schedules vary. With continuous
reinforcement, learning occurs rapidly, which makes it the best choice for
mastering a behavior. But extinction also occurs rapidly. When reinforcement
stops—when we stop delivering food after the rat presses the bar—the
behavior soon stops (is extinguished). If a normally dependable candy
machine fails to deliver a chocolate bar twice in a row, we stop putting
money into it (although a week later we may exhibit spontaneous
recovery by trying again).

Real life rarely provides continuous reinforcement. Salespeople do not make a sale with every pitch. But they persist because their efforts are
occasionally rewarded. This persistence is typical with partial
(intermittent) reinforcement schedules, in which responses are
sometimes reinforced, sometimes not. Learning is slower to appear,
but resistance to extinction is greater than with continuous reinforcement.
Imagine a pigeon that has learned to peck a key to obtain food. If you
gradually phase out the food delivery until it occurs only rarely, in no
predictable pattern, the pigeon may peck 150,000 times without a reward
(Skinner, 1953). Slot machines reward gamblers in much the same way—
occasionally and unpredictably. And like pigeons, slot players keep trying,
time and time again. With intermittent reinforcement, hope springs eternal.
Lesson for parents and babysitters: Partial reinforcement also works with
children. Occasionally giving in to children’s tantrums for the sake of peace
and quiet intermittently reinforces the tantrums. This is the very best
procedure for making a behavior persist.

Skinner (1961) and his collaborators compared four schedules of partial reinforcement. Some are rigidly fixed, some unpredictably variable.

Fixed-ratio schedules reinforce behavior after a set number of responses. Coffee shops may reward us with a free drink after every 10 purchased. Once
conditioned, rats may be reinforced on a fixed ratio of, say, one food pellet
for every 30 responses. Once conditioned, animals will pause only briefly
after a reinforcer before returning to a high rate of responding (Figure
27.4).

Figure 27.4

Intermittent reinforcement schedules

Skinner’s (1961) laboratory pigeons produced these response patterns to each of four reinforcement schedules. (Reinforcers are indicated by diagonal
marks.) For people, as for pigeons, reinforcement linked to number of
responses (a ratio schedule) produces a higher response rate than
reinforcement linked to amount of time elapsed (an interval schedule). But
the predictability of the reward also matters. An
unpredictable (variable) schedule produces more consistent responding than
does a predictable (fixed) schedule. (Data from Skinner, 1961.)

Variable-ratio schedules provide reinforcers after a seemingly unpredictable number of responses. This unpredictable reinforcement is what
slot-machine players and fly fishers experience, and it’s what makes
gambling and fly fishing so hard to extinguish even when they don’t produce
the desired results. Because reinforcers increase as the number of responses
increases, variable-ratio schedules produce high rates of responding.

Fixed-interval schedules reinforce the first response after a fixed time period. Animals on this type of schedule tend to respond more frequently as
the anticipated time for reward draws near. People check more frequently for
the mail as the delivery time approaches. Pigeons peck keys more rapidly as
the time for reinforcement draws nearer. This produces a choppy stop-start
pattern rather than a steady rate of response (see Figure 27.4).

Variable-interval schedules reinforce the first response after varying time intervals. At unpredictable times, a food pellet rewarded Skinner’s pigeons
for persistence in pecking a key. Like the longed-for message that finally
rewards persistence in checking our phone, variable-interval schedules tend
to produce slow, steady responding. This makes sense, because there is no
knowing when the waiting will be over (Table 27.2).

TABLE 27.2 Schedules of Partial Reinforcement

Schedule | Fixed | Variable
Ratio | Every so many: reinforcement after every nth behavior, such as buy 10 coffees, get 1 free, or pay workers per product unit produced | After an unpredictable number: reinforcement after a random number of behaviors, as when playing slot machines or fly fishing
Interval | Every so often: reinforcement for behavior after a fixed time, such as Tuesday discount prices | Unpredictably often: reinforcement for behavior after a random amount of time, as when studying for an unpredictable pop quiz
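The four schedules can be contrasted in a short simulation. This sketch is an illustration under simplifying assumptions (one response per time step; the rule functions and the size-20 parameters are invented for the example, not taken from Skinner's experiments): each schedule is just a rule deciding whether the current response, at the current time step, earns a reinforcer.

```python
import random

rng = random.Random(0)

def fixed_ratio(n):
    """Reinforce every nth response (buy 10 coffees, get 1 free)."""
    count = 0
    def rule(_t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return rule

def variable_ratio(n):
    """Reinforce after an unpredictable number of responses, averaging n
    (the slot-machine schedule)."""
    def rule(_t):
        return rng.random() < 1.0 / n
    return rule

def fixed_interval(period):
    """Reinforce the first response after a fixed time has elapsed."""
    next_time = period
    def rule(t):
        nonlocal next_time
        if t >= next_time:
            next_time = t + period
            return True
        return False
    return rule

def variable_interval(mean_period):
    """Reinforce the first response after a random, unpredictable delay."""
    next_time = rng.uniform(0, 2 * mean_period)
    def rule(t):
        nonlocal next_time
        if t >= next_time:
            next_time = t + rng.uniform(0, 2 * mean_period)
            return True
        return False
    return rule

# Assume one response per time step; count reinforcers over 1000 responses.
schedules = {
    "FR-20": fixed_ratio(20),
    "VR-20": variable_ratio(20),
    "FI-20": fixed_interval(20),
    "VI-20": variable_interval(20),
}
rewards = {name: sum(rule(t) for t in range(1000))
           for name, rule in schedules.items()}
# rewards maps each schedule to the number of reinforcers earned
```

Note the asymmetry the simulation makes concrete: on a ratio schedule, responding faster earns more reinforcers, while on an interval schedule extra responses before the interval elapses earn nothing. That is why ratio schedules sustain the higher response rates the text describes.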

AP® EXAM TIP

Students sometimes have difficulty with the schedules of reinforcement. The word interval in schedules of reinforcement means that an interval of time must pass before reinforcement. There is nothing the learner can do to shorten the interval. The word ratio refers to the ratio of responses to reinforcements. If the learner responds with greater frequency, there will be more reinforcements.

“The charm of fishing is that it is the pursuit of what is elusive but attainable, a perpetual series of occasions for hope.”

Scottish author John Buchan (1875–1940)

In general, response rates are higher when reinforcement is linked to the number of responses (a ratio schedule) rather than to time (an interval
schedule). But responding is more consistent when reinforcement is
unpredictable (a variable schedule) than when it is predictable (a fixed
schedule). Animal behaviors differ, yet Skinner (1956) contended that the
reinforcement principles of operant conditioning are universal. It matters
little, he said, what response, what reinforcer, or what species you use. The
effect of a given reinforcement schedule is pretty much the same: “Pigeon,
rat, monkey, which is which? It doesn’t matter…. Behavior shows
astonishingly similar properties.”

Punishment

 27-5 How does punishment differ from negative reinforcement, and how does punishment affect behavior?

Reinforcement increases a behavior; punishment does the opposite. So, while negative reinforcement increases the frequency of a preceding
behavior (by withdrawing something negative), a punisher is any
consequence that decreases the frequency of a preceding behavior (Table
27.3). Swift and sure punishers can powerfully restrain unwanted behavior.
The rat that is shocked after touching a forbidden object and the child who is
burned by touching a hot stove will learn not to repeat those behaviors.

TABLE 27.3 Ways to Decrease Behavior

Type of Punisher | Description | Examples
Positive punishment | Administer an aversive stimulus. | Spray water on a barking dog; give a traffic ticket for speeding.
Negative punishment | Withdraw a rewarding stimulus. | Take away a misbehaving teen’s driving privileges; revoke a rude person’s chat room access.

Criminal behavior, much of it impulsive, is also influenced more by swift and sure punishers than by the threat of severe sentences (Darley & Alter, 2013).
Thus, when Arizona introduced an exceptionally harsh sentence for first-time
drunk drivers, the drunk-driving rate changed very little. But when Kansas
City police started patrolling a high crime area to increase the swiftness and
sureness of punishment, that city’s crime rate dropped dramatically.

AP® EXAM TIP

You must be able to differentiate between reinforcement and punishment. Remember that any kind of reinforcement (positive, negative, primary, conditioned, immediate, delayed, continuous, or partial) encourages the behavior. Any kind of punishment discourages the behavior. Positive and
negative do not refer to values—it’s not that positive reinforcement (or
punishment) is the good kind and negative is the bad. Think of positive and
negative mathematically; a stimulus is added with positive reinforcement (or
punishment) and a stimulus is subtracted with negative reinforcement (or
punishment).

What do punishment studies imply for parenting? One analysis of over 160,000 children found that physical punishment rarely corrects unwanted
behavior (Gershoff & Grogan-Kaylor, 2016). Many psychologists note four
major drawbacks of physical punishment (Finkenauer et al., 2015; Gershoff,
2002; Marshall, 2002).

1. Punished behavior is suppressed, not forgotten. This temporary state may (negatively) reinforce parents’ punishing behavior. The child
swears, the parent swats, the parent hears no more swearing and feels
the punishment successfully stopped the behavior. No wonder
spanking is a hit with so many parents—with 60 percent of children
around the world spanked or otherwise physically punished (UNICEF,
2014).

2. Punishment teaches discrimination among situations. In operant conditioning, discrimination occurs when an organism learns that
certain responses, but not others, will be reinforced. Did the
punishment effectively end the child’s swearing? Or did the child
simply learn that while it’s not okay to swear around the house, it’s
okay elsewhere?

3. Punishment can teach fear. In operant conditioning, generalization occurs when an organism’s response to
similar stimuli is also reinforced. A punished child may associate fear
not only with the undesirable behavior but also with the person who
delivered the punishment or where it occurred. Thus, children may
learn to fear a punishing teacher and try to avoid school, or may
become more anxious (Gershoff et al., 2010). For such reasons, most
European countries and 31 U.S. states now ban hitting children in
schools and child-care institutions (EndCorporalPunishment.org). As of
2017, 51 countries outlaw hitting by parents. A large survey in Finland,
the second country to pass such a law, revealed that children born
after the law passed were, indeed, less often slapped and beaten
(Österman et al., 2014).

4. Physical punishment may increase aggression by modeling violence as a way to cope with problems. Studies find that spanked children are at
increased risk for aggression (MacKenzie et al., 2013). We know, for
example, that many aggressive delinquents and abusive parents come
from abusive families (Straus & Gelles, 1980; Straus et al., 1997).

Some researchers question this logic. Physically punished children may be more aggressive, they say, for the same reason that people who have
undergone psychotherapy are more likely to suffer depression—because they
had preexisting problems that triggered the treatments (Ferguson,
2013; Larzelere, 2000; Larzelere et al., 2004). So, does spanking cause
misbehavior, or does misbehavior trigger spanking? Correlations don’t hand
us an answer.

The debate continues. Some researchers note that frequent spankings predict future aggression—even when studies control for preexisting bad
behavior (Taylor et al., 2010). Other researchers believe that lighter
spankings pose less of a problem (Baumrind et al., 2002; Larzelere & Kuhn,
2005). That is especially so if physical punishment is used only as a backup
for milder disciplinary tactics, and if it is combined with a generous dose of
reasoning and reinforcing.

Parents of delinquent youths are often unaware of how to achieve desirable behaviors without screaming at, hitting, or threatening their children with
punishment (Patterson et al., 1982). Training programs can help transform
dire threats (“Apologize right now or I’m taking that cell phone away!”) into
positive incentives (“You’re welcome to have your phone back when you
apologize.”). Stop and think about it. Aren’t many threats of punishment just
as forceful, and perhaps more effective, when rephrased positively? Thus, “If
you don’t get your homework done, I’m not giving you money for a movie!”
could be phrased more positively as….

“A pat on the back, though only a few vertebrae removed from a kick in the
pants, is miles ahead in results.”

Attributed to publisher Bennett Cerf (1898–1971)

In classrooms, too, teachers can give feedback by saying, “No, but try this
…” and “Yes, that’s it!” Such responses reduce unwanted behavior while
reinforcing more desirable alternatives. Remember: Punishment tells you
what not to do; reinforcement tells you what to do. Thus, punishment trains a
particular sort of morality—one focused on prohibition (what not to do) rather
than positive obligations (Sheikh & Janoff-Bulman, 2013).
What punishment often teaches, said Skinner, is how to avoid it. Most
psychologists now favor an emphasis on reinforcement: Notice people doing
something right and affirm them for it.

Skinner’s Legacy

 27-6 Why did Skinner’s ideas provoke controversy?


B. F. Skinner
“I am sometimes asked, ‘Do you think of yourself as you think of the organisms you study?’ The answer is yes. So far as I know, my behavior at any given moment has been nothing more than the product of my genetic endowment, my personal history, and the current setting” (1983).
B. F. Skinner stirred a hornet’s nest with his outspoken beliefs. He repeatedly
insisted that external influences, not internal thoughts and feelings, shape
behavior. He argued that brain science isn’t needed for psychological
science, saying that “a science of behavior is independent of neurology”
(Skinner, 1938/1966, pp. 423–424). And he urged people to use operant
conditioning principles to influence others’ behavior at school, work, and
home. Knowing that behavior is shaped by its results, he argued that we
should use rewards to evoke more desirable behavior.

Skinner’s critics objected, saying that he dehumanized people by neglecting their personal freedom and by seeking to control their actions. Skinner’s
reply: External consequences already haphazardly control people’s behavior.
Why not administer those consequences toward human betterment?
Wouldn’t reinforcers be more humane than the punishments used in homes,
schools, and prisons? And if it is humbling to think that our history has
shaped us, doesn’t this very idea also give us hope that we can shape our
future?
