Learning and Reinforcement
LEARNING
Highlights
5.1 Concept and Nature of Learning
5.2 Theories of Learning
5.3 Classical Conditioning
5.4 Operant Conditioning
5.5 Cognitive Learning
5.6 Social Learning
5.7 Shaping Behaviour
5.8 Strategies of Reinforcement
5.9 Schedules of Reinforcement
Learning is any relatively permanent change in the behaviour of a person that occurs as a result of experience.
It is accompanied by acquisition of knowledge, skills and expertise which are relatively permanent.
Temporary changes may be only reflexive and fail to represent any learning. If reinforcement does not
accompany the practice, the behaviour will eventually disappear.
Many psychologists have contributed to the theory of learning that provides a basis for changing
types of behaviour that are unacceptable and maintaining those that are acceptable. When individuals
engage in various types of dysfunctional behaviour (arriving late for work, disobeying orders, poor performance),
the manager will attempt to make them learn functional or acceptable behaviour. Learning theory can also
provide guidelines for conditioning the employees. This chapter aims to explain learning theories
and strategies of reinforcement.
5.1 CONCEPT AND NATURE OF LEARNING
Meaning of Learning
E.R. Hilgard has defined learning as a relatively permanent change in behaviour that occurs as a
result of experience or reinforced practice. Ironically, it can be said that change in behaviour indicates
Essentials of Organisational Behaviour
Features of Learning
The process of learning has the following implications:
(i) Learning involves a change in the behaviour of a person. It may be good or bad from the organisation's
point of view. For example, bad habits, prejudice, stereotypes and work restrictions may be learnt
by an individual.
(ii) Change in behaviour must be relatively permanent. Temporary changes may be only reflexive.
Changes caused by fatigue or temporary adaptations are not covered in learning.
(iii) Change in behaviour should occur as a result of experience, practice, or training. The change
may not be evident until a situation arises in which the new behaviour can occur.
(iv) The practice or experience must be reinforced in order for learning to occur. If reinforcement
does not accompany the practice or experience, the behaviour will eventually disappear.
(v) Learning is reflected in behaviour. A change in an individual's thought process or attitudes, not
accompanied by behaviour, is not learning.
Principles of Learning
Learning is said to have occurred when people demonstrate a difference in behaviour or ability to
perform a task. The following principles are important for the development of any training programme.
1. Trainee must be motivated to learn. An employee must see a purpose in learning the information
presented and have a clear understanding of what is presented. If these two factors are considered, there
will be a greater chance of satisfaction. A good trainee perceives an opportunity of real satisfaction from training.
2. Information must be meaningful. The training material must relate to the purpose of the training
programme or it will stop being a motivator. Further, the material must be presented in a sequential
manner, from the simple to the more complex. Training should also provide variety to prevent boredom
and fatigue. Materials can be presented through case studies, lectures, films, computer games,
discussions, or simulations.
3. Learning must be reinforced. In organisations, both positive and negative reinforcements
should be used. If behaviour is undesirable, negative reinforcement such as denial of a pay raise,
promotion, or transfer can be effective. However, during the orientation and training period, positive
reinforcement is more effective than negative reinforcement. According to the Behaviour Modification Model
developed by B.F. Skinner, the more a desired activity or new knowledge is repeated and rewarded with
verbal praise, physical rewards, or income, the more it will be remembered and become a part of a person's
behaviour.
4. Organisation of material. The trainer must remember that well organised material will help the
trainees to remember the things taught to them. Presenting an overview of the material in a logical order
will help the employee understand everything. Further, the sequence can affect how well the person can
remember the material presented. The training section may prepare the training material to be used for
different jobs with the help of line supervisors. A complete outline of the whole course should be made
with the main topics included under each heading. The training material should be distributed among the
trainees well in advance so that they may come prepared to the lecture class, understand
the operations quickly, and remove their doubts by asking questions of the instructor.
5. Feedback on learning. People like to know how much they have learnt or how well they are doing.
The sooner employees know the results of a quiz or test, the sooner they can assess their progress. The
sooner employees receive positive feedback from the trainer, the less time they will waste in learning.
Self-graded tests and programmed learning kits provide the necessary feedback to a person on his or her
progress on a particular subject. The principle of feedback does not necessarily mean frequent testing,
but the more immediate the feedback on learning, the more motivating it is likely to be.
5.2 THEORIES OF LEARNING
There are four theories which explain how individuals learn new patterns of behaviour, as shown in
Fig. 5.1. These are:
(i) Classical Conditioning. The classical behaviourists, particularly Pavlov and Watson, attributed
learning to the connection between stimulus and response (Stimulus → Response), where S denotes
the stimulus and R the response.
Cognition implies the process of acquiring knowledge. Individuals have cognitive systems that
represent what they know about themselves and the external environment. These systems are developed
through cognitive processes like perceiving, imagining, thinking, remembering, reasoning, etc.
By understanding a person's cognitive system, it is possible to predict his behaviour. The more we understand
about an individual's cognitive system, the better we are able to predict his behaviour.
5.3 CLASSICAL CONDITIONING
Classical conditioning deals with the association of one event with another desired event, resulting
in a desired behaviour or learning. It is a type of conditioning where an individual responds to some
stimulus that would not ordinarily produce such a response. Learning through classical conditioning was
first studied by Ivan Pavlov, a famed Nobel Prize winning physiologist, at the turn of the 20th century.
FIG. 5.2. Stages in Classical Conditioning.
1. Before Conditioning
Unconditioned Stimulus (Meat) → Unconditioned Response (Salivation)
Conditioned Stimulus (Bell) → No Response
2. During Conditioning
Conditioned Stimulus (Bell) + Unconditioned Stimulus (Meat) → Unconditioned Response (Salivation)
3. After Conditioning
Conditioned Stimulus (Bell) → Conditioned Response (Salivation)
Pavlov conducted an experiment on a dog to study the relation between the dog's salivation and
the ringing of a bell. A simple surgical procedure helped him to measure accurately the amount of saliva
secreted by the dog. When Pavlov presented a piece of meat to the dog, he noticed a great deal of
salivation. He termed the food an unconditioned stimulus (food automatically caused salivation) and the
salivation an unconditioned response. When the dog saw the meat, it salivated. During the second
stage, when Pavlov merely rang a bell (a neutral stimulus), the dog did not salivate. Pavlov subsequently
introduced the sound of the bell each time the meat was given to the dog. Thus, meat and the ringing of
the bell were linked together. The dog eventually learnt to salivate in response to the ringing of the bell
even when there was no meat. Pavlov conditioned the dog to respond to a learned stimulus. Thorndike
called this the "law of exercise" which states that behaviour can be learned by repetitive association
between a stimulus and a response.
The meat was an unconditioned stimulus (US). It invariably caused the dog to react in a specific
way, i.e., a noticeable increase in salivation. This reaction is called the unconditioned response (UR). The
bell was an artificial stimulus or conditioned stimulus (CS). It was originally neutral. But when the bell
was paired with the meat (an unconditioned stimulus), it eventually produced a response. After conditioning,
the dog started salivating in response to the ringing of the bell alone. In other words, the conditioned
stimulus (CS) led to the conditioned response (CR).
Thus, under classical conditioning, learning is a conditioned response which involves building up
an association between a conditioned stimulus and an unconditioned stimulus. Using the paired stimuli,
one compelling and the other one neutral, the neutral one becomes a conditioned stimulus and, hence,
takes on the properties of the unconditioned stimulus. This happens quite often in organisational settings.
In one case, there was a cleanliness drive in a hospital to prepare for inspection by the top officials of the
health ministry. Here the nurses and other staff showed great attention to their duties. This practice
continued for a considerable period. Eventually, nurses and other staff showed their utmost attention to
duties whenever the cleanliness drive was carried out in the hospital, even though it was not linked with
the inspection by the health ministry officials.
Classical conditioning represents only a very small part of total human learning. So it has a limited
value in the study of Organisational Behaviour. In the words of S.P. Robbins, "Classical conditioning is
passive. Something happens and we react in a specific or particular way. It is elicited in response to a
specific, identifiable event and as such it explains simple and reflexive behaviours. But behaviour of
people in organisations is emitted rather than elicited, and it is voluntary rather than reflexive." The
learning of complex behaviours can be better understood by looking at operant conditioning.
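The three stages of Pavlov's experiment can be illustrated with a small toy model. This is only a sketch under stated assumptions: the text gives no numbers, so the learning rate, the response threshold, and the rule that each bell-and-meat pairing closes a fixed fraction of the remaining gap are all hypothetical choices made for illustration, as are the function names.

```python
def pairing_history(pairings, learning_rate=0.3):
    """Association strength between bell (CS) and meat (US) after each pairing.

    Assumption (not from the text): every pairing moves the strength a fixed
    fraction of the remaining distance towards the maximum of 1.0.
    """
    strength = 0.0  # before conditioning the bell is a neutral stimulus
    history = []
    for _ in range(pairings):
        strength += learning_rate * (1.0 - strength)
        history.append(strength)
    return history


def salivates_to_bell(strength, threshold=0.5):
    """Hypothetical rule: the bell alone elicits salivation (CR) once the
    association strength reaches the threshold."""
    return strength >= threshold


if __name__ == "__main__":
    print("Before conditioning:", salivates_to_bell(0.0))
    for trial, strength in enumerate(pairing_history(5), start=1):
        print(f"After pairing {trial}: strength={strength:.2f}, "
              f"responds to bell alone={salivates_to_bell(strength)}")
```

The model only mirrors the qualitative point of the passage: repeated pairing of a neutral stimulus with a compelling one gradually lets the neutral stimulus produce the response by itself.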
pleasure but have a variety of meanings for individuals such as money, promotion, and praise. Money is
considered secondary because it is used to purchase primary reinforcers (food) or as a proxy for status
(also a primary reinforcer).
Effective reinforcers must meet two conditions: First, the reward should be contingent upon the
type of performance. Second, the reward should be matched with the needs of the worker. Because
positive reinforcers differ among individuals, managers must either develop a reward system that is
appropriate for all the members of their workgroup or tailor their rewards to suit each individual.
2. Negative Reinforcement or Avoidance Learning. Negative reinforcement takes place when
individuals learn to avoid or escape from unpleasant consequences. Much lawful behaviour in our
society is based on avoidance learning. For example, people learn to drive carefully to avoid accidents.
In the workplace, avoidance learning usually occurs when peers or supervisors criticise an individual's
actions.
Negative reinforcement relies on avoidance of punishment or the threat of punishment rather than
the offering of a reward. For instance, we learn to watch for traffic when crossing streets, and we learn to
bundle up on cold days to avoid accidents and to protect ourselves from cold. However, punishment or
threat of punishment is not implied in any of these actions. In work environments, training, safety
warnings, orientation sessions and counselling help alert employees against negative consequences of
undesirable behaviour. When coupled with positive reinforcement for appropriate behaviour, the effect
can be extremely beneficial.
3. Extinction. It is an effective method of controlling undesirable behaviour. It refers to
non-reinforcement. It is based on the principle that if a response is not reinforced, it will eventually
disappear. The absence of all forms of reinforcement is used to remove or extinguish undesirable
behaviour. A disruptive employee who, for example, picks fights and who is apparently punished by the
supervisor may continue the disruptions because of the attention they bring. By ignoring or isolating the
disruptive employee, attention is withheld and possibly also the motivation for fighting.
4. Punishment. Through punishment, managers try to correct improper behaviour of subordinates
by providing negative consequences. Giving harsh criticism, docking pay, denying privileges, demoting,
and reducing an individual's freedom to do his or her job are common forms of punishment in the
workplace. Punishment is the historic method of reducing or eliminating undesirable behaviour. Sometimes,
punishment frustrates the punished and leads to antagonism towards the punishing agent. As a result,
the effectiveness of the punishing agent diminishes over time. Because of the possible dangers of
punishment, it should be administered properly. The following points may be noted in this regard:
(i) The specific undesired behaviour, not the person, should be punished. If it is directed at the
person, punishment will invite revenge.
(ii) The punishment should be enough to extinguish the undesired behaviour. Underpunishment
may not deter the behaviour; overpunishment may produce undesirable results.
(iii) Punishment should be administered privately. By administering the punishment in front of others,
the worker is doubly punished in the sense that he also loses face.
(iv) Punishment should quickly follow the undesirable behaviour. It is more effective when applied
immediately after the undesirable behaviour is produced. Further, punishment should follow every
occurrence of the undesirable behaviour.
(v) Punishment is effective in modifying behaviour if it forces the person to select a desirable
behaviour that is reinforced. If this is not done, the undesirable behaviour tends to reappear,
causing fear and anxiety in the person being punished.
(vi) Punishment must be administered carefully so that it does not become a reward for undesirable
behaviour.
Although most ethical criticisms of behaviour modification techniques focus on punishment, Skinner
and other behaviourists advocate the use of positive reinforcement rather than punishment to change
behaviour. Punishment, by definition, only tells the individual what should not be done rather than what
should be done. Thus, one mistake may be followed by a new one as the individual seeks to find, by trial
and error, behaviour that will not be punished. In addition, punishment causes resentment, which is
counterproductive in the work environment. For most organisation members who are mature and usually
willing to be productive, positive reinforcement (combined with extinction, if necessary) is more
effective and humane.
TABLE 5.1. Reinforcement Theory at a Glance
To Encourage Desirable Behaviour
(i) Positive Reinforcement. Increasing the frequency of or strengthening a desirable behaviour
by making it contingent with the occurrence of a desirable consequence.
Example: A manager nods to express approval to a subordinate who pacifies an annoyed
customer.
(ii) Negative Reinforcement or Avoidance. Increasing the frequency of or strengthening a desirable
behaviour by making it contingent with the removal of an undesirable consequence.
Example: A manager who has been regularly nagging a worker about his performance stops
nagging when the production quota is met.
To Discourage Undesirable Behaviour
(iii) Extinction. Decreasing the frequency of or weakening an undesirable behaviour by removing
desirable consequences previously contingent with its occurrence.
Example: A disruptive employee is isolated or ignored so that his behaviour does not receive
the attention of fellow employees.
(iv) Punishment. Decreasing the frequency of or weakening an undesirable behaviour by making
it contingent with the occurrence of an undesirable consequence.
Example: A manager deducts an employee's pay when he reports late for work.
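The four strategies in Table 5.1 differ along two dimensions: whether the consequence is desirable or undesirable, and whether it is applied or removed. The lookup below is a minimal sketch of that classification; the labels "applied" and "removed" and the function names are illustrative assumptions, not terms used in the table.

```python
# Keys: (kind of consequence, what the manager does with it).
# The four entries restate the definitions given in Table 5.1.
STRATEGIES = {
    ("desirable", "applied"): "positive reinforcement",
    ("undesirable", "removed"): "negative reinforcement",
    ("desirable", "removed"): "extinction",
    ("undesirable", "applied"): "punishment",
}


def strategy(consequence, action):
    """Name the strategy for a given consequence and action."""
    return STRATEGIES[(consequence, action)]


def effect_on_behaviour(name):
    """Both kinds of reinforcement strengthen behaviour; extinction and
    punishment weaken it."""
    return "strengthens" if name.endswith("reinforcement") else "weakens"


if __name__ == "__main__":
    for (consequence, action), name in STRATEGIES.items():
        print(f"{consequence} consequence {action}: {name} "
              f"({effect_on_behaviour(name)} the behaviour)")
```

For instance, the manager who stops nagging once the quota is met removes an undesirable consequence, which the lookup classifies as negative reinforcement.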
5.9 SCHEDULES OF REINFORCEMENT
Reinforcement does not always follow a particular response. For example, studying hard for exams
sometimes yields high grades and sometimes it does not. Similarly, taking a client out to dinner at an
expensive restaurant and plying the client with excellent wine may help in closing an important deal, but
not always. Also, keeping on top of one's work and getting it done on time sometimes results in praise and
recognition; at other times, it may be ignored. In many cases, the occurrence or absence of reinforcement
following a given form of behaviour seems to be quite random. In others, it is governed by definite rules.
These rules are known as schedules of reinforcement and exert powerful effects upon behaviour.
The influence of such schedules was studied systematically for several decades by B.F. Skinner
and his associates. The key questions in such research have been these: How quickly and how often do
subjects perform various responses under different schedules of reinforcement? Do such rates of
responding vary from one schedule to another? In order to answer these questions, a large number of
schedules have been examined. The simplest, of course, is one in which every response is followed by a
reward (continuous reinforcement). Besides this, however, the most basic schedules are ones in which
the occurrence of reward is governed by a single rule. Four distinct schedules of reinforcement of this
type exist, which are discussed below.
(i) Fixed Interval Schedule. This schedule demands that a fixed amount of time has to elapse before
a reinforcement is administered. In many organisations, monetary reinforcement comes at the
end of a period of time. Most workers are paid hourly, weekly or monthly for the time spent on
their jobs. This method offers the least motivation for hard work among workers because pay is
tied to the time interval rather than actual performance. The occurrence of reinforcement depends
largely on the passage of time.
(ii) Variable Interval Schedule. The availability of reinforcement is also controlled mainly by the
passage of time in a variable interval schedule. In some cases, reinforcement can be obtained
after a short period has passed. In others, a much longer interval must elapse before it again
becomes available. As a result of such uncertainty, variable-interval schedules of reinforcement
generally yield moderate and steady rates of response. Suppose the plant manager visits the
production shop at 11 a.m. each day (fixed interval): performance tends to be high just prior to
his visit and thereafter it declines. Under a variable interval schedule, the manager visits at randomly
selected time intervals and no one knows for sure when the manager will be around. As a result,
performance tends to be higher and there are fewer fluctuations than under the fixed interval
schedule.
(iii) Fixed Ratio Schedule. In a fixed ratio schedule, rewards are given after a fixed or constant
number of responses. For example, a piece rate incentive plan is a fixed ratio schedule. It tends to
produce a high rate of response which is both vigorous and steady. Workers try to produce as
many pieces as possible in order to pocket the monetary rewards. Therefore, the response level
here is significantly higher than that obtained under an interval schedule.
(iv) Variable Ratio Schedule. When the reward varies relative to the behaviour of the individual, he
is said to be reinforced on a variable-ratio schedule. Salespersons on commission represent
examples of individuals on such a reinforcement schedule. On some occasions, they may make
a sale after only two calls on potential customers. On other occasions, they might need to make
twenty or more calls to secure a sale. The reward, then, is variable in relation to the number of
calls made.
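The four schedules can be contrasted with a short simulation. This is only a sketch under stated assumptions: the function names, the uniform random draws used for the variable schedules, and the rule that an interval schedule rewards the first response made after the interval has elapsed are illustrative choices, not details given in the text.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Rewarded responses (1-based): every `ratio`-th response, as in piece rate pay."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def variable_ratio(n_responses, mean_ratio, rng):
    """Rewarded responses when the required count varies around `mean_ratio`,
    as with a salesperson's calls. Assumes a uniform draw from 1 to 2*mean_ratio - 1."""
    rewarded, count = [], 0
    required = rng.randint(1, 2 * mean_ratio - 1)
    for i in range(1, n_responses + 1):
        count += 1
        if count >= required:
            rewarded.append(i)
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
    return rewarded


def fixed_interval(response_times, interval):
    """Reward the first response made once `interval` time units have passed
    since the previous reward, as with weekly or monthly pay."""
    rewarded, last = [], 0.0
    for t in response_times:
        if t - last >= interval:
            rewarded.append(t)
            last = t
    return rewarded


def variable_interval(response_times, mean_interval, rng):
    """Like fixed_interval, but the waiting time is redrawn after each reward,
    as with a manager who visits at unpredictable times."""
    rewarded, last = [], 0.0
    wait = rng.uniform(0, 2 * mean_interval)
    for t in response_times:
        if t - last >= wait:
            rewarded.append(t)
            last = t
            wait = rng.uniform(0, 2 * mean_interval)
    return rewarded


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only so that repeated runs match
    times = list(range(1, 21))
    print("Fixed ratio:      ", fixed_ratio(20, 5))
    print("Variable ratio:   ", variable_ratio(20, 5, rng))
    print("Fixed interval:   ", fixed_interval(times, 4))
    print("Variable interval:", variable_interval(times, 4, rng))
```

Under the ratio schedules the number of rewards depends on how many responses are made, while under the interval schedules it depends mainly on elapsed time, which is the contrast the section draws between piece rate pay and time-based pay.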
REVIEW QUESTIONS
1. "Learning leads to change in human behaviour." Comment.
2. What is the law of effect? When should punishment be used by the managers?
3. "Classical conditioning is passive". Elaborate.
4. "Behaviour is a function of its consequences." Do you agree? Why?
5. Explain the concept of learning and briefly examine the various theories of learning.
6. Briefly discuss various schedules of reinforcement.
7. Discuss the nature of learning. How does classical conditioning help in learning the desired
behaviour?
8. What is operant conditioning? How is it different from classical conditioning?
9. What strategies are employed under operant conditioning to modify the behaviour of
subordinates? Discuss with the help of suitable examples.
10. What is meant by reinforcement? What types of reinforcers could be employed by the managers
to make the employees learn new behaviours?
11. "Reinforcement theory of learning is at the root of behaviour modification." Examine this
statement.
12. Explain with examples the concepts of cognitive and social learning. What is the relevance of
social learning in modern organisations?
13. How would you convince someone, who believes OB Mod is manipulative, that it is an ethical
technique for human resource management?
14. Write short notes on the following:
(a) Classical conditioning
(b) Law of effect
(c) Positive reinforcement.
CASE STUDY
The employees of Blue Diamond Company faced a cloudy future. The company, which
manufactures paper egg cartons, was encountering stiff competition from several firms
producing styrofoam containers. In addition, the economic recession was biting into profits
and employees were generally jittery about their jobs. Relations between management and
labour were strained. In order to improve internal working, the Chief Executive devised a
system of productivity incentive called the 100 Club. Under this programme, employees
were allocated points for above-average performance. Any employee who worked a full year
without having an industrial accident was awarded twenty points. 100% attendance was
worth twenty-five points, and so on. Every year, on the programme's anniversary date,
points would be added up, and a record would be maintained. Upon reaching 100 points, the
worker received a nylon jacket emblazoned with the company logo and a patch signifying
membership of the 100 Club. Each of the plant's 325 employees eventually earned a jacket.
Those who continued to accumulate points above 100 received additional gifts. For
example, with 500 points, employees could choose such items as a blender, a wall clock,
or a pine cribbage board. Even though none of these was beyond the purchasing power of
the workers, the response was impressive.
After two years, productivity at the plant was up 16.5 per cent and quality-related
errors were down 40 per cent. Workers' grievances had decreased 72 per cent and lost time
due to industrial accidents was reduced by 43.7 per cent. Beyond these improvements,
relations between labour and management had never been better. Labour leaders credited
the 100 Club with keeping the company afloat and fostering a new atmosphere of
cooperation.
QUESTIONS
1. Identify the problem in the above case.
2. What reinforcers did the company use and what were the results?
3. What kind of reinforcement schedule did the company use? Give its merits and demerits.
REFERENCES
1. Hilgard, E.R., Introduction to Psychology, New Delhi, Oxford & IBH, 1975, p. 186.
2. McGehee, W., "Are We Using What We Know About Training? Learning Theory and Training," Personnel
Psychology, Spring 1958, p. 2.
3. Warren, Howard C. (ed.), Dictionary of Psychology, New York, Houghton Mifflin, 1934, p. 151.
4. Robbins, S.P., Organisational Behaviour, New Delhi, Prentice-Hall, 1936, p. 111.
5. Luthans, Fred, Organisational Behaviour, New York, McGraw-Hill, 2013, p. 292.
6. Pavlov, Ivan P., The Work of the Digestive Glands (trans. W.H. Thompson), London, Charles Griffin, 1902.
7. Robbins, S.P., op. cit., p. 112.
8. Skinner, B.F., Contingencies of Reinforcement, East Norwalk, C.T., Appleton-Century-Crofts, 1971.
9. Ibid.
10. Luthans, Fred, op. cit., p. 296.
11. Davis, Tim R.V. and Luthans, Fred, "A Social Learning Approach to Organizational Behaviour," Academy of
Management Review, April 1980, pp. 281-90.
12. Schnake, Mel E., "Vicarious Punishment in a Work Setting," Journal of Applied Psychology, Vol. 71,
1986, pp. 343-345.
13. Robbins, S.P., op. cit., p. 133.
14. Thorndike, E.L., Animal Intelligence, New York, Macmillan Publishing Company, 1911.
15. Luthans, Fred, Organizational Behaviour, New York, McGraw-Hill, 1989, p. S12.
16. Ferster, C.B. and Skinner, B.F., Schedules of Reinforcement, New York, Appleton, 1957.
17. Honig, W.K. and Staddon, J.E.R. (eds.), Handbook of Operant Behaviour, Englewood Cliffs, N.J.,
Prentice-Hall, 1977.
18. Luthans, Fred and Kreitner, Robert, "The Management of Behavioural Contingencies," Personnel, 51, No. 4,
July-Aug. 1974, pp. 7-16.
19. Robbins, Stephen P., op. cit., p. 251.
20. Schermerhorn, Hunt and Osborn, Managing Organisational Behaviour, New York, John Wiley & Sons,
1988, p. 138.
21. Skinner, B.F., Contingencies of Reinforcement, East Norwalk, C.T., Appleton-Century-Crofts, 1971.
22. Fry, Fred L., "Operant Conditioning in Organisational Setting: Of Mice or Men," Personnel,
July-Aug. 1974, pp. 17-24.