Chapter 6 copy

Uploaded by

The lneffable Us

Chapter 6 − Learning

October 1, 2020 9:39 AM

Learning is a process by which behaviour or knowledge changes as a result of experience


- Reading and listening to acquire new information is called cognitive learning

6.1 Classical Conditioning: Learning by Association

- Pavlov's research dogs were given meat powder, and this routine led the dogs to salivate before the meat powder
was presented to them
○ He then started to serve the meat powder while a metronome was ticking; later on, the dogs salivated when they
heard the metronome alone, so they were salivating in anticipation of food
- Classical conditioning (Pavlovian conditioning) is a form of associative learning in which an organism learns to
associate a neutral stimulus (e.g. a sound) with a biologically relevant stimulus (e.g. food); this results in a change in the
response to the previously neutral stimulus (e.g. salivation)
- Unconditioned stimulus (US) is a stimulus that elicits a reflexive response without learning (e.g. food, water, pain)
- Unconditioned response (UR) is a reflexive, unlearned reaction to a US (e.g. hunger, drooling, expressions of pain)
- So US and UR are both unlearned and something that happens naturally
○ E.g. The meat powder elicits unconditioned salivation (Pavlov's dogs)
- Conditioned stimulus (CS) is a neutral stimulus that later starts to elicit a conditioned response due to being
associated with an unconditioned stimulus for a period of time (e.g. the metronome in Pavlov's dogs)
- Conditioned response (CR) is a learned response that occurs due to CS (e.g. dogs salivate due to metronome)
- CS can only have an effect if it becomes associated with the US
- CR is learned and UR is naturally occurring
- UR and CR are not always the same response
○ Many animals "freeze" (i.e. become motionless) when they are scared, since predators can detect movement
▪ Lab rats that hear a tone and then get an electric shock to their feet show a UR of jumping, flinching, and pain,
but once they associate the tone with the shock, they just freeze when they hear the tone, even if there
is no shock (CR)
- So conditioning has an evolutionary function
○ Evolutionary function of the CR can be seen as a way for the organism to interact adaptively with the US
- According to Hebb's rule: when a weak connection b/w neurons is stimulated at the same time as a strong
connection, the weak connection becomes strengthened
○ So in conditioning, suppose a puff of air and the blinking response form a strong connection, and a sound and blinking
form a weak connection. If you repeatedly perceive the puff of air and hear the sound at the same time (the blinking
response still occurs), the sound and the blinking response will become a strong connection, and you will blink
when you hear the sound alone
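The puff-of-air example can be sketched numerically. This is only a toy illustration of Hebb's rule, not a model from the chapter: the function names, the starting weights, the learning rate, and the blink threshold are all assumed values chosen for the example.

```python
def hebbian_pairing(trials, learning_rate=0.2, w_puff=1.0, w_tone=0.05, threshold=0.5):
    """Return the tone->blink connection strength after each paired trial.

    All numbers are hypothetical; weights are kept between 0 and 1.
    """
    history = []
    for _ in range(trials):
        puff, tone = 1.0, 1.0  # puff of air and sound presented together
        # The strong puff->blink connection is enough to trigger the blink.
        blink = 1.0 if (w_puff * puff + w_tone * tone) >= threshold else 0.0
        # Hebbian step: the weak connection grows only when the sound (input)
        # and the blink (output) are active at the same time.
        w_tone += learning_rate * tone * blink * (1.0 - w_tone)
        history.append(w_tone)
    return history


def blinks_to_tone_alone(w_tone, threshold=0.5):
    """True if the sound by itself is now strong enough to trigger a blink."""
    return w_tone >= threshold


weights = hebbian_pairing(trials=10)
print(blinks_to_tone_alone(0.05))         # before pairing
print(blinks_to_tone_alone(weights[-1]))  # after ten pairings
```

Before pairing the sound alone stays under the threshold; after repeated pairings the sound-to-blink connection has grown enough to produce the blink without the puff.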

- CR can diminish over time, or it may occur with a new stimulus with which the response has never been paired
- Acquisition is the initial phase of learning in which a response is established
○ Thus, in classical conditioning, acquisition is when the neutral stimulus is repeatedly paired with the US
▪ If it's not repeatedly paired (e.g. dogs were given food only sometimes when the metronome was ticking),
then conditioning would not occur, or would be very weak
- Extinction is the reduction of a conditioned response when a US and CS no longer occur together
○ So if the dogs repeatedly didn't receive food when the metronome was ticking, then the
salivation would occur less and less until they didn't salivate at all
▪ Biologically this makes sense, since the metronome is no longer a good predictor of food
○ Rate of firing in the brain areas related to this association decreases over the course of extinction
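Acquisition and extinction curves can be sketched with a simple error-correction update. This is an assumption brought in for illustration (a simplified Rescorla-Wagner-style rule, which these notes do not cover); the learning rate and trial counts are hypothetical.

```python
def update_strength(v, us_present, rate=0.3, v_max=1.0):
    """One trial: move CS-US associative strength toward its target.

    The target is v_max when the US follows the CS, and 0 when it does not.
    """
    target = v_max if us_present else 0.0
    return v + rate * (target - v)


def run_trials(v, n, us_present):
    """Return the associative strength after each of n identical trials."""
    curve = []
    for _ in range(n):
        v = update_strength(v, us_present)
        curve.append(v)
    return curve


acquisition = run_trials(0.0, 10, us_present=True)               # metronome + food
extinction = run_trials(acquisition[-1], 10, us_present=False)   # metronome alone
print([round(v, 2) for v in acquisition])
print([round(v, 2) for v in extinction])
```

The strength climbs during pairing and falls once the metronome stops predicting food. Note that this toy rule simply erases the association, so it cannot reproduce spontaneous recovery.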
- Spontaneous recovery is the reoccurrence of a conditioned response after some time of extinction
○ So dogs that go back to the experiment room after quite some time start to
salivate to the metronome again
○ Possibly the animal is not able to retrieve the memory of extinction and goes back to the memory of the conditioned response
- Generalization is a process in which a response that originally occurred for a specific stimulus also occurs for different
but similar stimuli (e.g. dogs salivating not only to the metronome, but also to similar sounds)
○ When we perceive a stimulus, it activates our brain's representation for that item and representations of other
related items
▪ So according to Hebb's rule, the synapses of those additional representations would also fire at the same time
as the synapses involved in the conditioned response, which strengthens the connections of those
additional synapses.
- Discrimination is when an organism responds to an original conditioned stimulus but not to new stimuli that may
be similar to the original stimulus
○ If stimuli similar to the CS are presented without a US, it becomes less likely that those stimuli will lead to
stimulus generalization
○ So the dogs would hear these other tones, which would have their own memory representations in which the dogs
did not receive any food
- Conditioned emotional responses consist of emotional and physiological responses that develop to a
specific object/situation
○ Ex: Watson & Rayner conditioned an 11-month-old child, Albert, to fear white rats. Before conditioning, Albert
showed no fear. Then they startled Albert with a loud noise, made by striking a steel bar with a hammer, while
Albert was with the rat (US is the loud noise, UR is the fear elicited by the noise). After repeated pairings
of rat and noise, he started to fear the rat alone, w/o the noise. The rat became the CS and the fear it elicited became
the CR. Albert's emotional conditioning also generalized to other white furry objects (like rabbits).
▪ Emotional conditioning doesn't have to happen in an experiment; it can also happen naturally
- Conditioned emotional responses offer a possible explanation to many phobias
- If an organism learns a fear-related association, activity occurs in the amygdala (brain area related to fear)
- If an organism learns to fear a particular location (like a certain cage associated with electrical shock), then
context-related activity in the hippocampus will communicate with the amygdala to produce contextual fear
conditioning
- Neural connections related to fear conditioning remain intact after extinction
○ Other neurons suppress the activity of brain areas related to fear responses
▪ But if the CS and US are paired again, this suppression is gone and fear-conditioned responses will occur again
- People with psychopathy (antisocial personality disorder; notorious for disregarding the feelings of others) are
not really affected by emotional conditioning
○ In an experiment, people were shown a face (neutral stimulus) followed by pain (US), which elicited a
pain response (UR). After repeated pairings, the face became the CS and participants had a negative emotional
reaction to the faces (CR). But the people with psychopathy showed little physiological arousal, their emotional
brain areas were quiet when looking at the CS, and they didn't really mind looking at the faces, unlike the control participants.

- Fear of snakes is more a product of learning than of actual instinct


○ Researchers conducted an experiment where a photo of a snake (CS) was paired with a mild electric shock (US), for which the UR is
increased palm sweat (skin conductance response); this reaction is part of the fight-or-flight response when our
bodies are aroused by a threatening/uncomfortable stimulus. After pairing, the snake photo alone elicited a strong skin
conductance response (CR). When a flower photo was paired with the shock, conditioned responding was low.
When a gun photo was paired with the shock, conditioned arousal to guns was comparable to that for flowers (though
higher).
▪ Humans have evolved to detect and fear an animal that has a long history of causing injury/death
□ Not true with flowers (harmless to humans) or guns (relatively new to our species' history)
◆ This evolutionary explanation is known as preparedness
- Preparedness is the biological predisposition to rapidly learn a response to a particular class of stimuli

- Conditioned taste aversion is the acquired dislike or disgust for a food or drink because it was paired with illness
○ CS is the food, and US is whatever is in the food (like bacteria) or in the environment that makes you sick
○ Conditioned aversions only occur for the flavour of a particular food and not for any other stimuli present at that time
▪ If you listen to a particular song while eating a 2-week-old tuna sandwich, your aversion would develop to
the tuna sandwich and not to the song
○ You can develop conditioned aversions after a single exposure, and even if you feel sick a couple of hours
after eating the food
▪ E.g. the gap b/w eating the food (CS) and feeling sick because of food poisoning (UR) can be a matter of hours
□ Most conditioning only happens if CS, US, and UR occur within a short period of time
- Usually conditioned taste aversion develops to something we ingested that has an unfamiliar flavour, since
these unfamiliar flavours stick out and are much easier to remember
○ If someone eats a Swiss cheese sandwich for lunch every day and suddenly feels ill during the afternoon,
that person will be less likely to develop a conditioned taste aversion
▪ This scenario can be explained by latent inhibition
- Latent inhibition occurs when frequent experience with a stimulus before it is paired with a US makes it less
likely that conditioning will occur after a single episode of illness
- Conditioned emotional responses are also created by negative political advertisements
○ CS would be the attacked politician. US would be the negative imagery (black and white image, grainy, poor
quality). UR would be the negative emotional response to the imagery. The people who made the ads hope
that the attacked politician alone will then produce a negative emotional response (CR)
▪ This is similar to evaluative conditioning. Evaluative conditioning is when you pair a stimulus (e.g. a shape)
with a positive or negative stimulus (happy/angry face). Repeated association of a stimulus with an emotion
leads people to develop a positive or negative feeling towards the stimulus. This is what negative
political ads are trying to accomplish.
□ Ultimately, the actual effect of the ads was on voters who already agreed with the views
expressed in them. Rather than creating a negative opinion of the attacked politician, the ads motivated people
who already had a negative view of that politician to go and vote for the party that made the ad.
- Classical conditioning can explain drug−related phenomena, such as craving and tolerance
○ Cues that accompany drug use (cigarette lighter, smell of tobacco smoke) can be conditioned stimuli that
elicit cravings
○ When a person takes a drug, the body attempts to metabolize the substance. However, over time the
paraphernalia associated with the drug serve as cues (CS) that the drug (US) is about to be processed by the body (UR), and
the processes involved in metabolizing the drug begin before the drug is consumed (CR)
▪ So over time you need a larger dose of the drug to override the body's preparation
□ This is called conditioned drug tolerance
◆ This is dangerous, since if you change the environment where you normally take the drug,
there will be fewer CSs to trigger the CR (the body's metabolizing activity in preparation for the drug's
arrival), and you can actually overdose
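The overdose risk in a new environment comes down to simple arithmetic, sketched below. The subtraction model, the function names, and every number are hypothetical simplifications for illustration, not values from the chapter.

```python
def compensatory_response(cue_fraction, learned_strength):
    """Conditioned preparation (CR), assumed to scale with how many of the
    usual drug cues are present (cue_fraction runs from 0.0 to 1.0)."""
    return cue_fraction * learned_strength


def net_drug_effect(dose, preparation):
    """Effect actually felt: the dose minus the body's conditioned preparation."""
    return dose - preparation


usual_dose = 10.0  # hypothetical units
learned = 6.0      # hypothetical strength of the conditioned preparation

familiar_room = net_drug_effect(usual_dose, compensatory_response(1.0, learned))
new_room = net_drug_effect(usual_dose, compensatory_response(0.0, learned))
print(familiar_room, new_room)  # 4.0 10.0
```

The same dose produces a much larger net effect when none of the usual cues are present to trigger the body's preparation.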

6.2 Operant Conditioning: Learning through Consequences

- Contingency simply means that a consequence depends upon an action


- Reinforcement is a process in which an event/reward that follows a response, increases the likelihood of that
response occurring again
- Thorndike proposed the law of effect: idea that responses followed by satisfaction will occur again in the
same situation, whereas those that are not followed by satisfaction become less likely
○ E.g. Cats being fed for good behaviour, or a cat escaping a puzzle box ("satisfaction" is achieving the desired
goal, which is escaping the puzzle box)
- Reinforcer is a stimulus that is contingent upon a response and that increases the probability of that
response occurring again
- Operant chambers (Skinner boxes) have a lever/key; when the animal in the chamber pushes the lever, it delivers
a reinforcer such as food
○ Researchers record the animal's rate of responding over time (a measure of learning) and typically set criteria that must be met
before the reinforcer is available
- A reinforcer is a stimulus like food, whereas reinforcement is the change in frequency of a
behaviour, like lever pressing, that occurs as a result of the food reward
- Punishment is a process that decreases the future probability of a response
- Punisher is a stimulus that is contingent upon a response, and that results in a decrease of behaviour

- Reinforcement and punishment can be achieved with the removal of a stimulus


- Terms of operant conditioning:
○ Reinforcement: increases the chances of a behaviour occurring again
○ Punishment: decreases the chances of a behaviour occurring again
○ Positive: stimulus is added to a situation; positive can refer to reinforcement or punishment
○ Negative: stimulus is removed from a situation; negative can refer to reinforcement or punishment
- Positive reinforcement is the strengthening of behaviour after potential reinforcers such as praise, money,
or nourishment follow that behaviour
- Negative reinforcement involves the strengthening of behaviour because it removes/diminishes a stimulus
○ E.g. Studying to avoid nagging of parents, taking an aspirin to remove a headache
○ This type of reinforcement is further classified in 2 subcategories: avoidance learning and escape learning
- Avoidance learning is a specific type of negative reinforcement that removes the possibility that a stimulus will occur
○ E.g. Paying bill on time to avoid late fees, leaving early for work to avoid traffic, etc.
- Escape learning occurs if a response removes a stimulus that is already present
○ E.g. Covering your ears when you hear overwhelmingly loud music
- Positive punishment is a process in which a behaviour decreases in frequency because it was followed by a
particular, usually unpleasant, stimulus
○ E.g. Cat owners spraying the cat with water if the cat scratches furniture
- Negative punishment occurs when a behaviour decreases because it removes or diminishes a particular stimulus
○ E.g. Parents who "ground" their child, which removes the child's privileges
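The four terms above combine two yes/no questions, which can be written as a small lookup. The function name and its parameters are hypothetical, made up for this sketch.

```python
def classify_consequence(stimulus_added: bool, behaviour_increases: bool) -> str:
    """Name the operant process from two questions:
    was a stimulus added (positive) or removed (negative), and
    did the behaviour become more likely (reinforcement) or less (punishment)?"""
    sign = "positive" if stimulus_added else "negative"
    kind = "reinforcement" if behaviour_increases else "punishment"
    return f"{sign} {kind}"


# Examples taken from the notes:
print(classify_consequence(True, True))    # praise after a behaviour
print(classify_consequence(False, True))   # aspirin removes a headache
print(classify_consequence(True, False))   # spraying the cat with water
print(classify_consequence(False, False))  # grounding removes privileges
```

Keeping the two questions separate helps avoid the common mistake of reading "negative" as "bad" instead of "something is taken away".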

- Rats in the operant chamber don't immediately go and press the lever; they must first learn that it accomplishes something
○ So getting the rat to press the lever can be done by reinforcing behaviours that approximate (or lead up to)
lever pressing, such as standing up, facing the lever, putting paws on the lever, and pressing downward
▪ This process is known as shaping
- Shaping is the process of reinforcing successive approximations of a specific operant response
○ This is done in a step-by-step fashion until the desired response occurs
- Chaining involves linking together two or more shaped behaviours into a more complex action or sequence of actions
- Applied behaviour analysis (ABA) involves using close observation, prompting, and reinforcement to teach
behaviours, often to people who experience difficulties and challenges owing to a developmental condition such as
autism
○ An autistic child may have trouble clearing the dishes from the dining table
▪ So you use prompts (stand up, gather silverware, gather plates, etc.) and give a verbal reward for each
step completed. In this way the desired behaviour is shaped
- Primary reinforcers consist of reinforcing stimuli that satisfy basic motivational needs—needs that affect an
individual’s ability to survive (and, if possible, reproduce)
○ E.g. Food, water, shelter, sexual contact
- Secondary reinforcers consist of stimuli that acquire their reinforcing effects only after we learn that they have value
○ More abstract and do not directly influence survival−related behaviours
○ E.g. Instagram likes, money, etc...
- Nucleus accumbens becomes activated during the processing of rewards, including primary ones (eating, sex)
and "artificial" rewards like cocaine or smoking a cigarette
○ Variations in this area might account for why individuals differ so much in their drive for reinforcers
▪ People who are prone to risky behaviours like gambling and alcohol abuse are more likely to have
inherited particular copies of genes that code for dopamine and other reward-based chemicals in
the brain
▪ People who are impulsive (vulnerable to gambling and drug abuse) release more dopamine and
have trouble removing dopamine
- Secondary reinforcers also trigger the release of dopamine in reward areas of the brain
○ E.g. Monetary rewards cause dopamine release in basal ganglia and frontal lobes
- When a behaviour is rewarded for the first time, dopamine is released, which reinforces reward-producing behaviours
○ Dopamine-releasing neurons and the nucleus accumbens keep track of which behaviours are associated with rewards
▪ They alter their rate of firing when there is a need to update which actions lead to rewards
- Discriminative stimulus is a cue or event that indicates that if a response is made, it will be reinforced
○ E.g. You ask to borrow parent's car only when they are in a good mood
▪ So your parent's mood will dictate whether you perform a behaviour (asking to borrow the car)
- Discrimination occurs when an organism learns to respond to one original discriminative stimulus but not to
new stimuli that may be similar
○ E.g. A pigeon may learn that it will receive a reward if it pecks the key after a 1000 Hz tone, but it won't receive
the reward if it pecks the key after a 2000 Hz tone. As a result, the pigeon won't peck the key after a 2000 Hz tone
- Generalization takes place when an operant response occurs in response to a new stimulus that is similar to
the stimulus present during original learning
○ E.g. A pigeon that learns to peck a key after a 1000 Hz tone may attempt to peck the key whenever any tone is presented
○ E.g. If petting the neighbour's dog led to a child laughing and playing with the dog, then the child might be
more likely to pet other dogs and furry animals
- Thorndike said that reinforcement was more effective if there was very little time b/w action and consequence
○ A study showed that pigeons would peck the key less frequently if the delay before the reward was increased
- Delayed reinforcement influences human behaviours as well
○ Drugs that hit you as soon as you take them are more addictive than drugs that take a while to have an effect
▪ So you are more likely to get addicted to drugs that have a rapid effect than to drugs that have a delayed effect
- Extinction is the weakening of an operant response when reinforcement is no longer available
○ E.g. if your parents no longer let you borrow the car no matter how nicely you ask, you may persist
in your behaviour for a while, but you will eventually stop asking
○ If you expect a reward for your behaviour and nothing comes, the amount of dopamine release decreases
▪ Dopamine will increase again if there is a new behaviour−reward relationship to learn

- Behaviours change when the reinforcer loses some of its appeal


○ If rats are pre-fed with one taste and then go to the operant chamber, where 2 levers give
2 different rewarding tastes, the rats crave the taste they have already had less
▪ So the taste the rats already had is "de-valued" while the other reward taste is not affected; they have
a stronger preference for the one that is not devalued
○ Nucleus and nucleus accumbens have a role in altering behaviour and "expectations" about rewards that
are devalued; neurons would fire less when a reward that is devalued is available
- Typically a behaviour is rewarded according to some kind of a schedule
- Schedules of reinforcement are rules that determine when reinforcement is available
○ Can have a dramatic effect on learning, relearning, or unlearning of responses
- Continuous reinforcement is when every response made results in reinforcement
○ E.g. Putting enough money in a vending machine would deliver you a snack every time
- Partial (intermittent) reinforcement is when only a certain number of responses are rewarded, or a certain amount of
time must pass before reinforcement is available
○ E.g. Phoning a friend may not always get you an actual person on the other end of the call, because they might be busy
- 4 types of partial reinforcement schedules are possible, from combining two distinctions (ratio vs. interval, fixed vs. variable):
○ Ratio schedule: reinforcements are based on the amount of responding
○ Interval schedule: reinforcements are based on the amount of time b/w reinforcements, not the number
of responses a human (or animal) takes
○ Fixed schedule: schedule of reinforcement remains the same over time
○ Variable schedule: schedule of reinforcement, although linked to an average (e.g. 10 lever presses or 10
secs), varies from reinforcement to reinforcement
- Fixed-ratio schedule is when reinforcement is delivered after a specific number of responses have been completed
○ E.g. Worker in factory may get paid based on how many items they worked on (so like $1 for 5 items produced)
- Variable-ratio schedule is when the number of responses required to receive reinforcement varies around an average
○ A VR5 (variable ratio with average of 5 trials b/w reinforcements) could include trials that require seven
lever presses for a reward to occur, followed by four, then six, then three, and so on.
▪ But average number of responses to receive reinforcement would be five
○ This is how slot machines operate
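The two ratio schedules can be compared with a small counter. This is a sketch under stated assumptions: the function name is hypothetical, and the variable-ratio requirements reuse the fixed sequence 7, 4, 6, 3 from the VR5 example, whereas a real variable-ratio schedule would be unpredictable.

```python
from itertools import cycle
from statistics import mean


def ratio_schedule(n_responses, requirements):
    """Count reinforcers earned by n_responses when each reinforcer needs the
    next number of responses in `requirements` (repeated in a cycle).
    Every requirement must be a positive integer."""
    needed_seq = cycle(requirements)
    needed = next(needed_seq)
    earned = 0
    remaining = n_responses
    while remaining >= needed:
        remaining -= needed
        earned += 1
        needed = next(needed_seq)
    return earned


fr5 = [5]           # fixed ratio: every 5th response is reinforced
vr5 = [7, 4, 6, 3]  # variable ratio: requirement changes each time

print(mean(vr5))                # 5, so this sequence is a VR5
print(ratio_schedule(20, fr5))  # 4
print(ratio_schedule(20, vr5))  # 4
```

Over 20 responses both schedules pay out four times; the difference is that under the variable schedule the learner cannot tell which response will be the rewarded one.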
- Fixed-interval schedule reinforces the first response occurring after a set amount of time passes
○ E.g. If you have psych exam every 4 weeks, your reinforcement for studying is on a fixed−interval schedule
○ Responding drops off after each reinforcement is delivered, but increases as soon as reinforcement is
available again
- Variable-interval schedule is when the first response is reinforced following a variable amount of time
○ The time interval varies around an average
▪ E.g. During a meteor shower, you are rewarded for looking upward at irregular times; a meteor falls
every 5 mins on average (but there could be inactivity for 10 mins, 2 mins, etc.)
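Interval schedules depend on elapsed time instead of the number of responses, which the sketch below tries to show. The function name, the response times, and the interval values are hypothetical, and the sketch assumes each new interval starts counting from the last reinforced response.

```python
from itertools import cycle


def interval_schedule(response_times, intervals):
    """Return the times of the responses that get reinforced.

    Only the first response made after the current interval has elapsed is
    reinforced; `intervals` is repeated in a cycle (one value = fixed interval).
    """
    gaps = cycle(intervals)
    available_at = next(gaps)
    reinforced = []
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next(gaps)  # clock restarts at the reinforcer
    return reinforced


responses = [1, 2, 5, 6, 9, 12]  # times of six responses, arbitrary units

print(interval_schedule(responses, [4]))        # fixed interval of 4 -> [5, 9]
print(interval_schedule(responses, [2, 6, 4]))  # variable, average 4 -> [2, 9]
```

In both cases most of the six responses earn nothing, because responding before the interval has elapsed is never reinforced, no matter how many responses are made.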
- Partial reinforcement effect refers to a phenomenon in which organisms that are conditioned under
partial reinforcement resist extinction longer than those conditioned under continuous reinforcement
○ This is likely because individuals are accustomed to not receiving reinforcement for every response, so their
motivation to produce the response is not altered even when reinforcement is no longer available
○ E.g. Cheesy pick up lines in bars
- People tend to be more sensitive to the unpleasantness of punishment than they are to the pleasures of reward
- Spanking children can be an effective punisher when it is used to immediately stop a behaviour
○ But it is associated with major side effects such as poorer parent-child relationships, poorer mental health for
both adults and children, delinquency in children, etc.
- Punishment may suppress an unwanted behaviour immediately, but it does not teach which behaviour is appropriate
○ Punishment is more effective when combined with reinforcement of an alternative, suitable response

- It's possible for a complex behaviour to be influenced by both classical conditioning and operant conditioning
○ E.g. Consider gambling; slot machines use a variable-ratio schedule of reinforcement, a type of operant
conditioning that leads to a high response rate. The flashy lights and dinging sounds from the machine serve as
a CS that elicits the excitement associated with gambling (a CR, since a response to a CS is conditioned)
▪ Classical conditioning produces an emotional response and operant conditioning maintains the behaviour
