Psychological Statistics and Research Methods

This document provides an overview of psychological assessment and research methods. It discusses the history and meaning of assessment, including how assessment has evolved from a focus on individual differences. It describes how psychological tests are used to measure traits like intelligence, aptitudes, achievement, personality, and vocational fitness. The document also outlines the objectives and content of the course, including units covering statistical concepts, research design, tools of research, and different research methods.

M.Sc. APPLIED PSYCHOLOGY


First Year

PAPER – 2: PSYCHOLOGICAL STATISTICS AND RESEARCH METHODS

SYLLABUS

Objectives
 To understand the meaning and methods of assessment
 To gain knowledge and acquire skill in various statistical methods
 To understand the various research methods
 To understand the different stages of research
 To identify the tools of research
 To develop a positive attitude towards research

UNIT – I
The History and Meaning of Assessment
Introduction – The Use of Tests – Measurement – The Meaning of Psychological Assessment – Evaluation – History of Assessment – Theory and Assessment – Measurement, Assessment and Evaluation – Tests and Assessment.
Basic Statistical Concepts in Testing and Assessment
Definition of a Psychological Test – Statistical Methods in Testing. Measures of
Central Tendency – Meaning – Three Common Measures of Central Tendency : The
Mean, Median and Mode. Measures of Variability: The Range – Quartile Deviation
and Mean Deviation – The Variance and the Standard Deviation. Normal
Probability Curve: Properties and its applications – Skewness and Kurtosis.
Finding Points within Distributions
Percentile Ranks – Calculation of Percentiles – Standard Scores and Distributions: Z-Score – Standard Normal Distribution – Percentiles and Z-Scores – McCall's T – Quartiles and Deciles – Sten and Stanine Scores.

UNIT – II

ANNAMALAI UNIVERSITY
Bivariate Analysis
Correlation – Rank Order – Product Moment – Test of Significance: ‘t’ Test –
Calculation and Interpretations – The ‘t’ ratio and its assumptions.

UNIT – III
An Overview of Experimentation
The Nature of Research – Psychological Experimentation – An Application of the Scientific Method – An Example of a Psychological Experiment.
Major Stages in Research
Defining a research problem – Sources for research problems: Study of Related Literature – Criteria for selecting a problem. Hypothesis: Meaning – Types and Formulation of Hypothesis – Sampling: Meaning – Types of Sampling – Probability and Non-Probability Sampling.

UNIT – IV
Tools of Research and Tests
Criteria for selection of tools – Factors related to construction of tools – Tools of
different Types: Observation – Interview – Questionnaire – Check List – Schedule –
Rating Scales. Attitude Scales: Thurstone’s method and Likert Scale –
Characteristics of a Research Tool – Reliability and Validity – Methods of obtaining
reliability and validity coefficients. Test Construction: Rational test construction –
Empirical test construction – Factor Analytic test construction – Steps in test
construction – Sources of Information about tests.

UNIT – V
Research Methods
Historical Research – Normative Survey – Experimental Research: The Experimental Variables – Dependent, Independent and Extraneous Variables – Experimental Control – The Nature of Experimental Control – Types of Empirical Relationships in Psychology – Planning an Experiment – Conducting an Experiment – Ethical Principles in the Conduct of Research with Human Participants – Ethical Principles for Animal Research.

TEXT BOOKS
1. Walsh, W.B. and Betz, N.E. Tests and Assessment. NJ: Prentice Hall, Inc., 1985.
2. Runyon, R.P., Haber, A., Pittenger, D.J. and Coleman, K.A. Fundamentals of Behavioral Statistics (8th Ed.). NY: McGraw-Hill, 1996.
3. McGuigan, F.J. Experimental Psychology – A Methodological Approach (4th Ed.)
NJ: Prentice Hall, Inc., Englewood Cliffs, 1983.
4. Kothari, C.R. Research Methodology – Methods and Techniques (2nd Ed.). New Delhi: Wiley Eastern Ltd., 1985.
5. Rajamanickam, M. Statistical Methods in Psychological and Educational
Research. New Delhi: Concept Publishing Company, 2001.
*****

M.Sc. APPLIED PSYCHOLOGY


First Year

PAPER – 2: PSYCHOLOGICAL STATISTICS AND RESEARCH METHODS

CONTENTS

Lesson No.    Title    Page No.

1. The History and Meaning of Assessment 1

2. The History and Meaning of Assessment (Contd.) 10

3. The Basic statistical concepts in Testing and Assessment 16

4. The Basic statistical concepts in Testing and Assessment (Contd.) 26

5. The Basic statistical concepts in Testing and Assessment (Contd.) 35

6. The Basic statistical concepts in Testing and Assessment (Contd.) 47

7. Finding Points Within Distributions 54

8. Finding Points Within Distributions (Contd.) 65

9. Bivariate Analysis 72

10. An Overview of Experimentation 84

11. Major Stages in Research 96

12. Major Stages in Research (Contd.) 105

13. Major Stages in Research (Contd.) 113

14. Tools of Research and Tests 121

15. Tools of Research and Tests (Contd.) 131

16. Tools of Research and Tests (Contd.) 142

17. Tools of Research and Tests (Contd.) 152

18. Research Methods 163

19. Research Methods (Contd.) 174

20. Research Methods (Contd.) 181



LESSON – 1

THE HISTORY AND MEANING OF ASSESSMENT


OBJECTIVES
After reading this lesson the student should
 Understand the use of psychological tests
 Explain the meaning of psychological assessment
 Narrate the history of assessment
 Describe the impact of various theories on the assessment process.
SYNOPSIS
Introduction – The use of Tests – The meaning of Psychological Assessment –
History of Assessment
INTRODUCTION
Scientists are concerned with the development of knowledge which is objective, exact and verifiable. The aim of any scientific inquiry is to collect facts about an object, a phenomenon, a system, or a problem under investigation, objectively and precisely. These facts are not observed or considered in isolation from one another. Rather, an attempt is usually made to identify the exact relationships among them, in order to interpret and formulate deductive or probable explanations of the object, system, or event under study.
In order to ascertain the extent, dimension, or magnitude of something, or to determine an attribute of something with precision, scientists depend on measurement. In general, measurement refers to the assignment of numerals to objects or events according to rules (Stevens, 1951). Measurement is a quantitative process that results in numbers with quantitative meaning. Hence, we can restrict the term "measurement" to quantitative description.
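Stevens' definition can be made concrete with a small sketch. The Python snippet below is purely illustrative (the `measure` helper, the student labels, and the scores are invented, not from the text): it assigns a numeral to each object according to a fixed rule, which is all that the definition requires.

```python
# Illustrative sketch of Stevens' (1951) definition: measurement is
# "the assignment of numerals to objects or events according to rules".

def measure(objects, rule):
    """Assign a numeral to every object according to a fixed rule."""
    return {obj: rule(obj) for obj in objects}

# Hypothetical "objects": students, each with a count of correct answers.
students = {"A": 17, "B": 23, "C": 11}

# The measurement rule here simply assigns the raw score as the numeral.
scores = measure(students, lambda s: students[s])

print(scores)  # numbers with quantitative meaning, one per object
```

A different rule (say, pass/fail coded 1/0) would be a different measurement of the same objects; the definition constrains only that numerals be assigned systematically, not which rule is used.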
Human behaviour has been assessed and measured in a variety of settings
and by individuals in many different disciplines. Everyone is to some extent an
assessor of human behaviour. But, each person performs the assessment in
his/her own way and thus frequently ends up with different results. Even the
experts differ in their approach to assessment of human behaviour and they do
arrive at meaningful, yet different conclusions. From these results, it is possible for
us to develop a number of laws or procedures for a systematic assessment process.
It has been observed that persons differ in intellectual and other psychological
characteristics. The entire assessment process probably received its biggest push
from Francis Galton, who introduced the concept of individual differences. Hence,
the individual differences were studied scientifically, subjected to measurement and
evaluated objectively.
The Use of Psychological Tests
Psychological tests and testing play a significant role in a wide variety of
situations and can affect the lives of many persons. Psychological tests have been
devised and are primarily used for the determination and analysis of individual differences: the study of individual differences in general intelligence, specific aptitudes, educational achievement, vocational fitness, and personality traits.
Psychological tests, especially those of general intelligence and of specific
aptitudes are extensively used in educational classification and selection. The
purpose of testing is to provide educational and vocational guidance; to place an
individual in a special class for superior pupils, or in identifying the mentally
retarded; or to discover causes, intellectual or otherwise, might lead to behaviour
problems in school. In clinics, Psychological tests are used for individual diagnosis
of factors associated with personal problems. In business and industry, tests are
helpful in selecting and classifying personnel for placement in jobs and for any of
these tests results are the only source of information.
THE MEANING OF PSYCHOLOGICAL ASSESSMENT
McReynolds (1975) views assessment as a process in which one person
attempts to understand another person. According to Sundberg (1977), it is a
process of developing images, making decisions and checking hypotheses about
another person’s behaviour in interaction with the environment. This definition
suggests the need to develop images of the person, images of the environment, and
person environment theories to understand and predict behaviour.
Maloney and Ward (1976) define ‘assessment’ as a process of solving problems,
in which tests are frequently used as a method of collecting important information.
To Walsh and Betz (1985), ‘Psychological assessment’ is a process of
understanding and helping people, to cope with problems. In this process, tests are
frequently used to collect meaningful information about the person, and his/her
environment. In an applied sense, the assessment process has four parts, viz.: (1) the problem, (2) information gathering, (3) understanding the information, and (4) coping with the problem.
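The four-part process can be sketched as a simple pipeline. All function names and the example problem below are hypothetical, introduced only to illustrate the flow from problem definition through information gathering and understanding to coping:

```python
# Minimal sketch of the four-part assessment process (names invented):
# (1) define the problem, (2) gather information, (3) understand it,
# (4) cope with the problem.

def assess(problem, gather, understand, cope):
    """Run the four stages in order and return the coping outcome."""
    information = gather(problem)              # stage 2: information gathering
    interpretation = understand(information)   # stage 3: understanding
    return cope(problem, interpretation)       # stage 4: coping

# Hypothetical example: a vocational-indecision problem linked to the
# content area of interests, as in the client statements quoted below.
outcome = assess(
    problem="cannot decide what to do after college",
    gather=lambda p: {"content_area": "vocational interests"},
    understand=lambda info: "problem linked to " + info["content_area"],
    cope=lambda p, interp: "working hypothesis: " + interp,
)
print(outcome)
```

The point of the sketch is only the ordering: each stage consumes the output of the previous one, which mirrors the text's claim that the link between problem and content area guides what information is gathered and how it is interpreted.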
1. The problem
The first stage of the assessment process involves defining the problem or asking the question. Initially, problem clarification is usually carried out in an interview situation.
For example, a client may say: 'I can't decide what I want to do after high school (or college)'; 'I don't know what makes me so nervous all the time'; 'Why do I find it so hard to make friends?'; 'I am afraid of not being able to find a job after graduation'. Frequently, the reported problem is suggestive of, and may be linked to, a content area – personality, ability, interests, values, achievement, or the social climate.
For example, the problem 'I can't decide what I want to do' may be linked to the content area of vocational interests; the problems 'I don't know what makes me so nervous all the time' and having no friends may be linked to the content area of personality. Concern about unemployment after graduation may be linked to the cognitive ability area.

Making a link between a reported problem and the content of the problem is of great help in the information-gathering process, which is a part of psychological assessment. The link tends to be suggestive of the kind of information needed for coping with the problem. However, care should be taken not to oversimplify the 'link' concept, because in some situations a problem may suggest more than one problem content area.
2. Information Gathering
The second part of the assessment process involves gathering meaningful information. It is necessary to note that the person and the environment are in constant transaction. Therefore, any assessment must take into account the environment in which the behaviour occurs and how the individual perceives the situation, since the context or environment shapes the behaviour of an individual.
Most of the time, a psychological test or inventory of some kind may be used to collect relevant information. Other methods (interview, observation, reports of others, case study information) are also useful in obtaining more details. But psychological tests are the primary tools used to collect information about people and environments.
The linkage between a reported problem and a content area also suggests the use of psychological tests, such as a personality inventory, to collect information about the client's perceptions of his/her family climate, family interactions, and values that may bear on his/her self-concept. With the help of this information, an image of the person and of the environment or context can be developed. In order to develop these images, it is necessary to organize, understand, and interpret the information.
3. Understanding the Information
The third part of the Psychological assessment process involves an attempt to
organize, interpret and understand the person-environment information within
some kind of theoretical perspective. In other words, it is an attempt to use a theory of personality, human development, or person-environment psychology in organizing, understanding, and predicting behaviour.
4. Coping with the Problem
The final step in the assessment process is coping with the problem. Actually, coping begins when the client initially reports a problem or asks a question, and continues through the linkage of the problem to a content area. From the information gathered, a set of working hypotheses about the person and his/her situation is developed. Supporting or refuting these hypotheses indicates progress toward answering questions and coping with problems. The testing of hypotheses may suggest other hypotheses. In addition, the psychological assessment process may suggest a series of alternative solutions, a number of answers to a certain question, or sometimes no solutions or answers at all.
Thus, psychological assessment does not promise a right answer or a correct way of coping. Psychological assessment is only an information-based process that is primarily concerned with understanding and helping people to cope with their problems and make decisions effectively.

HISTORY OF ASSESSMENT
Individual Differences and Intelligence
The study of individual differences had its origins in the late nineteenth and early twentieth centuries.
Francis Galton (1822-1911), an Englishman, was the first scientist to undertake systematic and statistical investigations of individual differences. He was particularly interested in intelligence and devoted much of his time to the study of inherited genius.
The German psychologists Wilhelm Wundt and Hermann Ebbinghaus suggested that psychological events can be interpreted in quantitative terms. Wundt established the first psychological laboratory at Leipzig University in 1879. Ebbinghaus explored the learning and retention rates of children through a completion test he developed.
James Cattell, an American, carried Galton’s methods and concepts to the
University of Pennsylvania and developed a testing laboratory. Cattell introduced
the term ‘mental test’ and made an effort to relate scores on mental tests to reaction
time.
Jean Esquirol (1772-1840) and Edouard Seguin (1813-1880) were concerned with mental deficiency and mental disease. Esquirol made explicit the distinction between mental deficiency and mental illness, and also distinguished different levels of mental deficiency. Further, Esquirol identified the development and use of language as among the most useful and valid psychological criteria for differentiating levels of mental deficiency.
Seguin is noteworthy for his pioneering work and methods in training the mentally deficient. His methods emphasized the development of greater sensory sensitivity and discrimination and of improved motor control and utilization. Seguin designed a form board consisting of several performance tasks to identify levels of intellectual development.
A French psychologist, Alfred Binet, working with Theodore Simon, produced the first widely used intelligence scale in 1905 to identify children who could not profit significantly from schooling. They developed an individual test consisting of 30 problems of increasing difficulty. It was later revised by Lewis M. Terman in 1916 and called the Stanford–Binet Intelligence Scale.
About this same time, the first paper-and-pencil group intelligence tests were developed under the leadership of Arthur Otis and Robert M. Yerkes. Two tests – (1) the Army Alpha test for literates and (2) the Army Beta test for illiterates – were administered to American soldiers during World War I to measure their mental abilities and to aid in the classification of recruits.
Developments in the area of intelligence testing were slow until 1939, when David Wechsler developed the Wechsler–Bellevue Intelligence Scale. This was followed by the Wechsler Intelligence Scale for Children in 1949; the Wechsler Adult Intelligence Scale in 1955 (a revision of the 1939 Wechsler–Bellevue scale), itself revised in 1981; and the Wechsler Preschool and Primary Scale of Intelligence in 1967. The Wechsler scales made a substantial contribution to intelligence testing because they improved the assessment process at the adult level.
In 1978, Jane Mercer developed her System of Multicultural Pluralistic Assessment to make testing fair to people from minority cultures. This assessment system combines the Wechsler Intelligence Scale for Children (Revised) with a one-hour interview with the child's parents and a complete medical examination. The system is used to measure the child's adjusted IQ, or learning potential. More work is needed to verify the effectiveness of this assessment system.
Aptitudes and Achievements
The assessment of aptitudes and achievement began somewhat later than the assessment of intelligence. Intelligence tests were useful in assessing overall intellectual level, but they yielded limited information about special abilities. Therefore, attempts were made to develop aptitude and achievement tests.
An important attempt at developing an aptitude test was made by Carl Seashore in 1918, with his Measures of Musical Talent. Other pioneers in the development of these tests were T.L. Kelley, with the Stanford Achievement Test in 1923; Edward F. Lindquist, with the Iowa Every-Pupil Test in 1936; Louis L. Thurstone, with the Primary Mental Abilities Test in 1938; and the United States Employment Service, which developed the General Aptitude Test Battery in 1947.
Personality Assessment
The first personality inventory (self-report inventory) used to assess individuals was the Personal Data Sheet, developed by Robert S. Woodworth in 1920. This inventory was a screening technique for identifying seriously disturbed men who were emotionally unfit for military service. It served as a model for the development of subsequent inventories of emotional adjustment. It was followed by two significant contributions in the area of projective techniques: the Rorschach Inkblot Test, developed by Hermann Rorschach, a Swiss psychiatrist, in 1921, and the Thematic Apperception Test, developed by H.A. Murray and C.D. Morgan in 1938. Both were projective techniques which required the individual to respond to vague and unstructured tasks or stimuli.

Another major contribution in the area of personality assessment was the development of the Minnesota Multiphasic Personality Inventory by Starke Hathaway and J. Charnley McKinley in 1943. Even today, it is one of the most widely used self-report measures of emotional disturbance in the assessment area. Since 1945, projective tests have occupied a primary position in practical applications and in research, for diagnostic purposes, personnel selection, and research on personality theory.
Interest Assessment
The assessment of interests was introduced in 1927, when E.K. Strong, Jr. developed the Strong Vocational Interest Blank (SVIB) for men. This is an empirical inventory that compared an individual's likes and dislikes with those of people employed in a variety of occupations. Strong later developed scales for women's occupations in the same way.
David P. Campbell expanded the SVIB in its 1974 edition, renaming it the Strong–Campbell Interest Inventory; it was further revised in 1981. John C. Holland's (1973) theory of personality types and model environments has been used to structure the profile and interpret the scores.
Two other important inventories used to assess interests are the Kuder Preference Record, introduced by G.F. Kuder in 1934 and most recently revised in 1979 as the Kuder Occupational Interest Survey, and John Holland's Self-Directed Search, introduced in 1972 and revised in 1977. All interest inventories attempt to explain the similarities in the likes and dislikes of individuals in various occupations.
Environmental Assessment
Any assessment of a person is incomplete without some assessment of the environment in which the person's thought and behaviour occur. Thus, an important additional feature of the assessment process, which has gained more emphasis in recent years, is the environmental perspective on behaviour.
Environmental psychologists like Kurt Lewin, Henry Murray, C.R. Pace, G.G. Stern, and Rudolf Moos have contributed much to the development of environmental assessment techniques.
The first Environmental Index constructed was the College Characteristics Index (CCI), by Pace and Stern in 1958. The basic rationale behind the development of the CCI was that environmental pressures may be inferred from the collective perceptions or interpretations of the environment.
The High School Characteristics Index (Stern, 1970) was developed from the College Characteristics Index and was designed to measure the environmental pressures of school settings (rather than colleges and universities).
The third Environmental Index, the Evening College Characteristics Index, was developed to assess the pressures experienced by students of non-residential colleges, community colleges, and two-year junior colleges. An Organizational Climate Index (OCI; Stern, 1970) was developed to measure the pressures experienced by the staff in elementary and secondary education.
The development of the CCI by Pace and Stern was one of the first major attempts to develop an objective inventory to collect information about the psychological environment. It is a major contribution to the area of environmental assessment and person-environment psychology.
In 1969, Pace developed the College and University Environment Scales, adapted from the College Characteristics Index, to assess college environments in terms of the cultural, social, and intellectual climate of their campuses.
Moos (1974, 1976) developed a series of nine Social Climate Scales applicable in a variety of different environments. Moos developed the Ward Atmosphere Scale (WAS, 1976) and the Community-Oriented Programmes Environmental Scale (COPES, 1976). The WAS assesses the social climate of hospital-based treatment programmes and the COPES assesses the social environments of community-based treatment programmes.
To assess total institutions, Moos developed the Correctional Institutions
Environmental Scale (CIES - 1976) and the Military Company Environmental
Inventory (MCEI - 1976). The CIES is used to describe the social climate of juvenile
and adult correctional programmes. The MCEI was developed to assess the social
climate of military companies. The University Residence Environment Scale and the
Classroom Environment Scale were further developed to describe educational
environments.
To assess community settings, three scales have been developed; The Group
Environment Scale; The Work Environment Scale; and the Family Environment
Scale.
Moos and his colleagues have developed the Multiphasic Environmental Assessment Procedure (MEAP) in order to assess, measure and describe sheltered care settings, such as skilled nursing facilities and a variety of senior citizen
housing facilities. According to Moos and his colleagues, such information makes it
possible to review the quality of the environment, to assess the impact of specific
features on programmes, and to explore the interaction between resident and
environmental characteristics.
The Institutional Functioning Inventory was developed in 1968 by the higher
education research group at the Educational Testing Service and subsequently
revised in 1978. The primary purpose of the scale is to help institutions understand how their faculty members, administrators, students and others perceive important aspects of the college environment.
To assess environments, Astin and Holland (1961) developed the Environmental Assessment Technique (EAT). The EAT uses eight objectively determined indices to assess the environment. The rationale for the development of the technique was based on the assumption that a major portion of environmental forces is transmitted through other people.
Gough developed the Home Index in 1949; it is an objective inventory for assessing socio-economic factors in the home and family environment.
A variety of research and measurement scales are emerging in the area of environmental assessment, suggesting that environmental variables need to be taken into account in the assessment process.
THEORIES AND ASSESSMENT
The theories of personality and other psychological entities have had and will
continue to have significant impact on the assessment process.
The theories that have influenced the assessment process can be considered under three major categories: (1) personality theories and assessment, (2) person-environment psychology and assessment, and (3) human development and assessment.

a) Personality Theories and Assessment


Personality theory in its various forms has had considerable influence on assessment. Psychoanalytic concepts, developed by Sigmund Freud in the early part of the twentieth century, have had a profound impact on the assessment process. For example, most projective techniques, such as the Rorschach Inkblot Test, the Incomplete Sentences Blank, and the Draw-a-Person Test, are based on psychoanalytic theory.
In 1913, John Watson introduced behaviouristic psychology, stressing the importance of a science of behaviour. According to Watson, the best predictor of future behaviour is past behaviour. Based on Watson's work, the modern behaviour modification movement and the behavioural assessment of clients were developed. Behavioural assessment is concerned with understanding behaviour in relation to the environment.
Kurt Lewin (1936) introduced field theory, suggesting that behaviour should be viewed in terms of both the person and the environment. The impact of this idea is even more important in assessment today. Carl Rogers' self-theory (person-centred theory) stressed the importance of perception and self-assessment: a person who is growing, and who understands and likes self and others, tends to be relatively good at self-assessment.
b) Person – Environment Psychology and Assessment
A second set of theories relevant to assessment is in the area of person-environment psychology. This theoretical base is vested in Lewin's (1936) idea that the environment is as important as the individual, and that both must be analysed to understand behaviour.
Murray (1938) introduced one of the first theoretical views, the need/press model. From an assessment perspective, this theory suggests that behaviour is a function of personality needs and perceived environmental pressures.
Barker's (1968) theory of behaviour settings proposes that environments tend to have a coercive influence on behaviour. For example, when in school, we behave as a student should; and when in the family, we behave according to our role as a child, father, or mother.

ANNAMALAI UNIVERSITY
Holland's (1973) theory of personality types and model environments indicated that congruent person-environment relations lead to predictable behaviour. He proposed that people tend to enter and remain in environments that match their personality.
These theories permit reasonable predictions about the individual within a person-environment framework and therefore have clear implications for the assessment process.
c) Human Development and Assessment
A third group of theories relevant to assessment focuses on human
development. Theories of development are suggestive of how human development
occurs over the life span. The human being develops through a period of growth, a period of stability, and a period of decline. This developmental approach has


stimulated two families of developmental theories having clear implications for assessment. One family of theories, the cognitive-developmental theories, examines how we reason, think, or make meaning of our experiences. The main theorists in this group include Piaget, Kohlberg, and Loevinger. The second family of theories, the psychosocial theories, is concerned with the 'what', or content, of human development. These theories combine feelings, thinking, and behaviour into a description of the life span. Thus, for example, they explore the questions 'Who am I?', 'Whom am I to love?', and 'What am I to believe?'. Significant theorists in this group include Erikson, Chickering, and Levinson. These developmental theories are considered here in the context of their meaning and implications for assessment.
SUMMARY
Assessment of human behaviour is one of the important aspects of psychology. Psychological tests and testing help us to make objective assessments. Psychological assessment involves defining the problem, gathering information, understanding the information, and coping with the problem. It is understood that individuals differ in their attitudes, aptitudes, intelligence, personality, and so on. Hence, many theories have been proposed for assessing human behaviour. The important theories discussed in this lesson provide a basic knowledge and understanding of psychological assessment.
KEY TERMS
Measurement – Psychological Test – Problem – Information Gathering – Intelligence Assessment – Aptitude Assessment – Personality Assessment – Measurement of Environment – Interest Assessment – Environmental Assessment
QUESTIONS
1. Enumerate the uses of psychological tests.
2. Examine the meaning of psychological assessment.
3. Analyse the influence of theories on the assessment process.



LESSON – 2

THE HISTORY AND MEANING OF ASSESSMENT (Contd…)


OBJECTIVES
After reading this lesson the student should
 Understand the judgemental and objective approaches to assessment.
 Explain the usefulness of DSM-III.
 Describe the four types of measurement with examples.
 Examine the tests and their basic assumptions.
SYNOPSIS
Theory and Assessment – Measurement, Assessment and Evaluation – Tests
and Assessment.
INTRODUCTION
The judgemental and objective perspectives represent two main approaches to
assessment. The judgemental approach, often referred to as the clinical approach,
may or may not use test scores and objective data in understanding the client's
current behaviour or in predicting future behaviour.
In the judgemental approach, all available information, whether subjective or
objective, is interpreted by the tester and used to draw meaningful conclusions.
Cronbach (1970) refers to this approach as 'impressionistic'. The judgemental
approach includes a broad range of information about the person and the
environment in the assessment process. The subjective data include interview
data, reports of others, observations, and descriptive data. Objective information
refers to test scores and ratings.
The objective approach is limited to quantifiable information about the person and
the environment; the data obtained can be quantified or coded for statistical
analysis. Meehl calls this the statistical approach and Cronbach refers to it as the
psychometric approach. The objective approach maintains that interview data,
reports of others, observations, and case study information may be included in
assessment only if they are quantifiable and obtained through reliable and valid
rating instruments.
In the objective approach the information is processed statistically (often by
computer), whereas in the judgemental approach the information is cognitively
processed by the counseling psychologist. The major limitation of the objective
approach is that it is not feasible to quantify all the characteristics of individuals
and environments. But when the information is quantifiable, the objective
approach is more useful in predicting future behaviour.
However, the psychological assessment process does not require us to choose
one approach. Psychological assessment is an information-based process and
therefore it needs all available meaningful information about the person and the

environment. In practice, a good psychological assessment will be based on


information emerging from both the judgemental and the objective approaches.
DIAGNOSTIC AND STATISTICAL MANUAL OF MENTAL DISORDERS (DSM-III)
Diagnosis is similar to, yet different from, psychological assessment. The
diagnostic process involves some statement of the problem, information gathering,
and information interpretation. Diagnosis is a classification on the basis of
observed symptoms. According to Rapaport, Gill and Schafer (1968), diagnosis is
defined as the act or art of identifying a disease from its symptoms. Wolman (1978)
defines diagnosis as the search for a disease responsible for symptoms, that is, the
source of behavioural problems.
The most basic feature of the medical model is that diagnosis attempts to
suggest the cause and nature of the condition. It is also necessary to remember
that many illnesses and syndromes have unknown causes. From another
perspective, diagnosis may be viewed as a process that verifies sets of symptoms
or characteristics and gives them a descriptive label, such as learning disability or
depression, for subsequent treatment. Cause is usually not indicated in the
diagnosis of mental disorders, except for specific organic conditions.
The classification system for mental disorders is the Diagnostic and Statistical
Manual of Mental Disorders, published by the American Psychiatric Association, in
which the cause is not indicated for most of the functional disorders. The approach
is primarily descriptive, and the definitions consist of descriptions of the clinical
features of the disorders.
The first edition of the American Psychiatric Association's Diagnostic and
Statistical Manual of Mental Disorders was published in 1952. It contains
descriptions of the various diagnostic categories. The 1980 edition, the DSM-III, is
an important basis for understanding mental disorders.
This is an official manual that describes and delineates the various
psychological disorders. It is useful for counselors and clinicians working in
community mental health clinics, hospital settings, and private practice. They
should be familiar with the diagnostic classification system of psychological

disorders in order
(1) to make generally recognizable diagnoses
(2) to communicate with other professionals and
(3) to follow the relevant literature in the field of mental health.
In spite of certain limitations, like category overlap, vague category definitions,
and inability to show the cause, the DSM-III is at least an attempt to describe
psychological disturbances, environmental factors, and other areas of functioning
that may be problematic. After the DSM-III-R, the DSM-IV was also published
by the American Psychiatric Association.

MEASUREMENT AND ASSESSMENT


Definition: Psychological measurement and psychological assessment have the
idea of process in common, but the function of this process is different.
As already mentioned, the process of psychological assessment involves the
collection of meaningful information to understand and help people cope with
problems. Psychological measurement is the process of assigning numbers
according to certain agreed-upon rules. The method of assigning numbers most
useful for psychologists was set up by Stevens (1951). Stevens suggests four types
of measurement scales: (1) the nominal scale, (2) the ordinal scale, (3) the interval
scale, and (4) the ratio scale.
By assigning numbers according to certain rules, it is possible to quantify
behaviour, personality traits, aptitudes, environmental perceptions and other
psychological variables. Once the variables are quantified, we are able to provide
them with additional meaning by relating them to other factors and by making
statistical comparisons. The four types of measurement scales are presented here.
i) The Nominal Scale
Stevens called the most elementary level of measurement the nominal scale.
Here, numbers are used to classify, name, or identify individuals or things by the
groups to which they belong. For example, numerical labels may identify occupational
groups: engineers may be given a code number of 1, chemists a code number of 2,
social workers a code number of 3, and so on. Telephone numbers are another
example of assigning numbers for identification purposes. The only mathematical
operation that may be applied to nominal scales is counting. The numerical labels
say nothing about order and cannot be used to add, subtract, multiply, or divide.
ii) The Ordinal Scale
The second level of measurement is the ordinal scale. This involves the
assignment of numbers to indicate rank or order from highest to lowest. In a horse
race, the winner of the race is ranked first, the horse finishing second is assigned
the rank of 2, and the horse finishing third is assigned the rank of 3. This ranking
tells us nothing about the differences among the horses. The ranking does not
indicate the quantitative differences between the various ranks. Just like the

nominal scale, subtraction, multiplication, and division cannot be appropriately
used with an ordinal scale. However, statistics such as the rank-difference
correlation are appropriate.
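The rank-difference statistic just mentioned can be illustrated with a short sketch. This is our own illustration, not part of the lesson: the function implements Spearman's rank-difference correlation, rho = 1 - 6Σd²/(n(n² - 1)), and the judges' ranks are hypothetical.

```python
# Spearman's rank-difference correlation: a statistic appropriate for
# ordinal (ranked) data, unlike the mean or Pearson's r.
# Illustrative sketch; the ranks below are hypothetical.

def rank_difference_correlation(ranks_x, ranks_y):
    """rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)), where d is the difference
    between each pair of ranks."""
    n = len(ranks_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Two judges rank the same five contestants (hypothetical ranks):
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
rho = rank_difference_correlation(judge_a, judge_b)
print(rho)  # 0.8
```

Only the order information enters the formula, which is why it is legitimate for ordinal data.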
iii) The Interval Scale
The third level of quantification in measurement is called the interval scale.
The scores obtained on tests and inventories measuring personality traits,
interests, abilities, values and environmental perceptions fall on the interval scale.
Interval scales permit the addition and subtraction of scores, but not the division of
one score by another. On an interval scale, the distances between successive
numbers are equal, but a true zero point is not known.

For example, the centigrade thermometer is an interval scale on which the
difference between 20 and 30 degrees is the same as the difference between 50 and
60 degrees. The intervals are equal. But we cannot say that 60 degrees is twice as
hot as 30 degrees, since there is no exact zero point. An interval scale does permit
us to add and
subtract. This helps in describing the distributions of test scores by computing an
average, the standard deviation, and correlation coefficients. These statistical
procedures cannot be used with ordinal data.
iv) The Ratio Scale
The fourth level of measurement is the ratio scale. This is the highest level of
measurement and permits the use of all mathematical operations. Ratio scales like
interval scales have equal intervals, plus the additional property of an exact zero
point. Height, weight and volume may be quantified in this way. It is possible to say
that the difference between 10 kilograms and 20 kilograms is the same as the
difference between 20 kilograms and 30 kilograms. In addition, we may say that 50
kilograms is twice as heavy as 25 kilograms.
Few of the measurements made in Psychology qualify as ratio scales. Most of
the measurements in psychology are assumed to be interval scales and permit the
use of inferential statistics. In general, the measurement procedures help us to
estimate the amount or quantity of personal and environmental variables through
the use of psychological tests and inventories.
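The interval-versus-ratio distinction above can be made concrete with a small sketch. The temperature and weight figures are illustrative and the conversion helper is ours, not from the text:

```python
# Interval scale (Celsius temperature): differences are meaningful,
# ratios are not, because the zero point is arbitrary.
diff_low = 30 - 20    # difference between 20 and 30 degrees
diff_high = 60 - 50   # difference between 50 and 60 degrees
print(diff_low == diff_high)   # True: equal intervals

# 60 degrees C is not "twice as hot" as 30 degrees C. Converting to
# Fahrenheit changes the apparent ratio, showing it carries no meaning:
def c_to_f(c):
    return c * 9 / 5 + 32

print(60 / 30)                  # 2.0 on the Celsius numbers
print(c_to_f(60) / c_to_f(30))  # about 1.63; the "ratio" is not preserved

# Ratio scale (weight in kilograms): a true zero exists, so ratios hold.
print(50 / 25)  # 2.0: 50 kg really is twice as heavy as 25 kg
```

The same logic explains why means and standard deviations are legitimate for interval data while statements like "twice as much" require a ratio scale.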
TESTS AND ASSESSMENT
a) The Interview: This is one of the traditional methods of collecting information
about people. The interview method, which is probably as old as the human race,
has some distinct advantages but also some clear disadvantages. An interview is
simply a conversation with a purpose (Bingham and Moore, 1924). The interview is
a very individualized procedure that enables us to collect personal and subjective
information about a client. The flexibility of the interview is a positive feature of the
procedure, but the information gathered is sometimes difficult to objectify or
quantify. Herein lies the weakness of the interview: the information is subjective
and individual, and the procedure varies from person to person. In other words, it
is very difficult to use subjective and unquantified information to compare people.
Tests provide us with such objective and quantified information.
b) The Psychological Test: Tests were developed out of the need for a more
objective and efficient method of collecting information about people. A test is a
method of acquiring a sample of a person's behaviour under controlled conditions.
Controlled conditions mean that all people taking the test do so in the same way,
to make sure they may be compared. This implies that tests are standardized.
Procedures of administration, scoring and interpretation are the same for each
person taking the test. Standardized tests produce a sample of a person’s
behaviour that may usually be quantified and reported in an objective and
numerical form.

By comparison, interview information is qualitative and is difficult to report in
numerical form. Both kinds of information, interview and test, qualitative and
quantitative, are useful and meaningful in psychological assessment. It must be
added that the quantitative information (test scores) is more useful in making
comparisons among people who have taken the same test under similar conditions.
In summary, the test has certain assets, one of which is objectivity. An
objective test, that is, a standardized test, can be administered, scored, and
interpreted in the same way for all people. A second attribute of the test is that it is
quantifiable. A test score may be reported in numerical form, which permits more
precision and the use of statistical procedures. Under many circumstances, tests
are more economical and more efficient than the interview.
TESTS AND SOME BASIC ASSUMPTIONS
As already mentioned, a test is a method of obtaining a sample of behaviour
under controlled conditions. For a number of reasons, tests and test scores are at
best suggestive. Below are listed some basic assumptions we must make in using
tests, and some good reasons why tests are at best suggestive.
1. It is assumed that each item in a test, and all the words in that item, have a
similar meaning for different people. Variance in meaning is a limitation, but
not a reason for discarding the use of tests.
2. A second assumption is that people are able to perceive and describe their
self-concepts and personalities accurately. Self-distortions may be very
much a part of the client's problems.
3. It is assumed that people will report their thoughts and feelings honestly.
There are many situations, like personnel selection, job placement, and
promotion, where people may think it is not in their best interest to do so.
4. It is assumed that an individual’s test behaviour (and actual behaviour) is
rather consistent over time.
5. It is assumed that the test measures what it is supposed to measure.
6. It is assumed that an individual's observed score (X) on a test is equal to his
or her true score (true ability) plus error (Xe).
All test scores have an error factor made up of things like examinee’s health,
attention span, previous experience, emotional state, fatigue, motor defects, visual

defects, socio-cultural problems, group attentiveness, test administration
conditions, lighting, noise, and so forth. With these assumptions in mind, we
widely use psychological tests for understanding individuals as well as their growth.
SUMMARY
Psychological assessment is an information based process and it does not
require only one approach or theory. The assessment involve various scales like
nominal, ordinal, interval and ratio. Most of the psychological measurements are in
interval scale and a few in ratio scale. Usually, tests are used in psychological
measurement and there are enumerous standardized tests available in the field of
psychology to assess the various aspects of the human behaviour. The standards
15

tests possess reliability validity and objectivity and hence these tests are more
economical and efficient.
KEY TERMS
Judgemental perspectives
Objective perspectives
Diagnostic and statistical manual
Measurement
Assessment
Nominal Scale Ordinal Scale
Interval Scale Ratio Scale
QUESTIONS
1. Examine the judgemental and objective approaches to assessment.
2. Enumerate the advantages and disadvantages of DSM-III.
3. Describe nominal and ordinal scale with examples.
4. With suitable examples, explain interval and ratio scales.
5. Describe the basic assumptions underlying the psychological tests.




LESSON – 3

BASIC STATISTICAL CONCEPTS IN TESTING AND ASSESSMENT


OBJECTIVES
After reading this lesson the student should
 Understand the definition of a psychological test
 Analyse the need for statistical methods in testing
 Describe scores and Frequency distributions
 Explain the different types of frequency distribution
SYNOPSIS
Definition of a psychological test – Statistical Methods in testing.
INTRODUCTION
The three main aims of psychology are to understand human behaviour, to
predict human behaviour, and to control human behaviour. To understand more
about behaviour we use psychological tests. Since behaviour is a complex process,
and individual differences exist among human beings, the task of understanding
becomes more complicated. Hence, we use various types of psychological tests to
gather information about the person and his/her environment. Psychological tests
are the major tools by which we gather information. But most psychological
entities are qualitative constructs. Therefore, to make certain inferences about
psychological constructs we use scaling techniques, whereby a qualitative
construct is converted into a quantitative entity. It is therefore necessary to
understand the nature of psychological tests and the method of interpreting the
test, to achieve the purpose of psychological measurement.
DEFINITION OF A PSYCHOLOGICAL TEST
The important assumption underlying any psychological testing is that people
and environments differ on certain identifiable and measurable dimensions or
characteristics. This concept is known as "individual differences". It is possible for
us to observe individual differences either by observing the individual or by
collecting some relevant information about the individual. For example, we can
easily observe the differences in the heights and weights of people around us.
Similarly, we have undoubtedly known people we considered very intelligent, very
good in mathematics, or excellent athletes, and others we considered less
intelligent, very poor in mathematics, or poor in athletic ability. We are observing
and getting to know the people around us, and the fact is that they differ from each
other in various ways.
Our everyday lives involve constant observations of individual differences.
These observations are often automatic and usually relatively unsystematic. But
the methods of psychological assessment are designed to provide more systematic
observations of differences among people or environments. More specifically,
psychological tests are means of observing people's behaviour under controlled

conditions. Cronbach (1970) defines a psychological test as a “systematic procedure


for observing a person’s behaviour and describing it with the aid of a numerical
scale or category system". Anastasi (1982) describes a psychological test as
"essentially an objective and standardized measure of a sample of behaviour".
Hence, a psychological test is a highly refined and systematized version of the
ordinary process of observation of ourselves and the people around us.
Even though a psychological test is similar to the everyday process of
observation, its greater refinement and systematization are very important for the
science of psychology. Particularly for assessment and evaluation, this
systematization becomes a base. One of the major attributes of a test is
quantification, i.e. the test scores are quantifiable. This allows a descriptive
precision which is not possible in the case of everyday observation. For example,
you may observe that Rama is more intelligent than Babu, but the results of an
intelligence test would provide a numerical estimate of this difference, which adds
precision and validity to your intuitive comparison of the two individuals. Hence,
the numerical scores yielded by a psychological test differentiate the test from the
ordinary process of observation.
In addition to quantification, the terms "standardized" and "objective" also
denote essential characteristics of a psychological test. The term standardized
refers to the controlled conditions under which testing occurs. Standardization
means that the procedures, the materials used, and the methods of scoring the
test are constant across testings.
Here, procedure refers to the conditions under which the test is administered,
including the procedures that the test administrator follows, the instructions
provided to the examinees, the sample questions utilized and time limits if any.
Constancy in testing materials refers to the fact that the test's content should be
comparable from one testing to the next, i.e. the nature and number of test items
should be comparable across examinees and testings.
For example, if two different forms of an attitude test are used, it should be
possible to compare the items, scores, and examinees on the basis of the scores of
the two forms. Finally, the methods of scoring the test should be clearly defined
and constant across testings. This implies that identical test responses should
receive identical scores. For example, if a wrong response to one multiple-choice
question carries a negative mark, the same rule should apply to all the other items
in the test as well as to all the examinees.
The idea of constancy in scoring procedures is related to the concept of
'objectivity' in testing. Objectivity refers to the extent to which every observer or
judge evaluating the test or performance arrives at the same score or interpretation.
For example, in the case of a paper-and-pencil test a scoring key could be provided
to ensure objectivity in assessment.

There are two more points concerning the definition of a psychological test.
One is "behaviour" and the other is "sample". The term "behaviour" here refers to
cognitive behaviours such as abilities, aptitudes, attitudes, and personality, as well
as other overt behaviours. Hence, the term behaviour is used here in the broadest
sense, which includes the full range of human responses capable of being observed
and recorded. The second term indicates that the test measures a "sample of
behaviour". Anastasi (1982) used this term to emphasize that tests can sample
only a portion or segment of the totality of our responses, capabilities, or
tendencies. For example, a test of attitude toward physics cannot include each and
every possible question that could be asked about the subject. Hence, a major
assumption in testing is that the observations we make from samples of behaviour
can be generalized to the person's behaviour over time and across situations.
Hence, a psychological test is a systematic and standardized method of
observing and quantifying the characteristics of an individual or environment, the
individual differences among people, and the ways environments differ from each
other. Since it involves quantification, the test yields a numerical score describing
some characteristic of the individual or of an environment. It is essential that the
scores yielded by a test be informative, meaningful, and interpretable for the
purpose for which the information is needed. Test scores become meaningful and
interpretable in a variety of ways, but two of these are the use of norms and
knowledge of the rationale underlying the test.
Norms are ways of interpreting people's scores in relation to the scores of some
appropriate comparison group of people. In simple terms, a norm is a standard
against which a person's score is compared. An understanding of the different
rationales by which tests are constructed leads to different types of meaning and
interpretability of test scores.
STATISTICAL METHODS IN TESTING
Statistics is a science of measurement. It is concerned with counting,
measuring, quantifying, describing, interpreting, and inferring; i.e. description and
inference are the two important uses of statistics. Statistics deals with data in the
form of numbers collected from a group of people, things, or events. It consists of a
set of data (information) about a number of individuals or objects belonging to the
same group or class. It is a set of quantitative data relating to an aggregate of
individuals, collected in a systematic way with a specific purpose.
We collect information from many sources. We measure individuals, from
which an average can be calculated. Any calculated measure is a "statistic". For
example, the percentage we calculate is a calculated measure; it is a 'statistic'. The
subject "STATISTICS" is singular, whereas the calculated "statistics" are plural.
In statistics there are two branches, viz. experimental statistics and inferential
statistics. Experimental statistics studies the whole population or universe; i.e. all
statistical measures computed on the population are known as "parameters".
Inferential statistics, on the other hand, studies a sample from a specific population.

All statistical measures computed on the sample are known as "statistics". Within
inferential statistics there are two kinds of methods, known as descriptive and
non-descriptive methods. The major descriptive methods in statistics are score and
normal distributions, measures of central tendency, variability, and relationship.
Non-descriptive methods include regression analysis, which is used for predicting
behaviour or performance, and factor analysis, which is used to investigate the
basic underlying dimensions in a set of variables.
Though there is no agreement on the definition of statistics, the applications of
statistics are very much prevalent in many fields. Almost any characterization of
statistics would include the following general functions:
1. To extract relevant, complete and accurate information from data.
2. To help us to be definite, efficient and exact in our design of experiments and
procedures.
3. To summarize our results in a meaningful and convenient form.
4. To make predictions of what will happen under the given condition.
5. To draw general conclusions and to weigh them against accepted rules.
FREQUENCY DISTRIBUTIONS
The data obtained after an experiment, or the data from a survey, are
frequently collections of numbers. Classification and description of these numbers
are required for interpretation. One such method of classifying the data is known
as the 'frequency distribution'. A frequency distribution is an arrangement of the
data that shows the frequency of occurrence of the different values of the variable,
or the frequency of occurrence of values falling within arbitrarily defined intervals
of the variable. In other words, a frequency distribution shows a tallying of the
number of times each score value (or interval of score values) occurs in a group of
scores.
Rules for Classifying scores into a frequency distribution
1. Find the highest and lowest scores in the data. Determine the range, i.e. the
gap between the highest and lowest scores.
2. Settle upon the number and size of the groupings to be used in making a
classification. Commonly used grouping intervals are 3, 5, or 10 units in length,
since these are somewhat easier to work with in later calculations. Select the
grouping unit in such a way that there are approximately 5 to 15 categories or
classes.
3. Tally the scores in their proper intervals. Usually the class intervals are
listed from the smallest score to the largest.
Example
The following are the scores of 50 students in a math test. Form the frequency
distribution.

57 61 63 59 70 70 75 80 85 58
71 65 66 73 64 69 76 83 81 85
77 57 62 82 59 68 75 74 78 83
79 69 60 81 58 65 67 77 79 82
80 75 57 85 65 60 84 62 72 76

The lowest score in the distribution is 57 and the highest score is 85.
∴ The range = (Highest – Lowest) score
= 85 – 57 = 28
If we take the size of the grouping (i.e. the class interval) as 5, then 28/5 = 5.6.
Hence, we can take 6 class intervals to group the data.
Now, we have to tally the scores under each set of class interval to obtain a
frequency distribution which is shown below:

TABLE – 3.1
Class Intervals (CI) Tallies Frequency (f)
56-60 |||| |||| 9
61-65 |||| ||| 8
66-70 |||| || 7
71-75 |||| || 7
76-80 |||| |||| 9
81-85 |||| |||| 10
N=50
Hence, in the above table each interval covers 5 scores. The scores falling
under each interval have been shown by tally marks. The total number of tallies in
each category is written as a number called the 'frequency'. The sum of all the
frequencies is N = 50. The total frequency within each class interval has been
tabulated against the proper interval, as shown in column 3. The math test scores
are now arranged in a frequency distribution.
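The tallying procedure above can be sketched in a few lines of code. This is our own illustration of the rules, with the 50 math-test scores from the example; the helper name is ours, not part of the text.

```python
# Build the frequency distribution of Table 3.1 from the raw math-test
# scores, using class intervals of width 5 starting at 56.
scores = [57, 61, 63, 59, 70, 70, 75, 80, 85, 58,
          71, 65, 66, 73, 64, 69, 76, 83, 81, 85,
          77, 57, 62, 82, 59, 68, 75, 74, 78, 83,
          79, 69, 60, 81, 58, 65, 67, 77, 79, 82,
          80, 75, 57, 85, 65, 60, 84, 62, 72, 76]

def frequency_distribution(data, lowest, width, n_classes):
    """Tally scores into class intervals: [(lower, upper, frequency), ...]."""
    table = []
    for i in range(n_classes):
        lower = lowest + i * width
        upper = lower + width - 1
        freq = sum(1 for s in data if lower <= s <= upper)
        table.append((lower, upper, freq))
    return table

table = frequency_distribution(scores, lowest=56, width=5, n_classes=6)
for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")
# 56-60: 9, 61-65: 8, 66-70: 7, 71-75: 7, 76-80: 9, 81-85: 10
```

The frequencies reproduce the third column of Table 3.1, and their sum is N = 50.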
Exact Limits and Mid-points of Class Intervals
In the case of a discrete variable, the class intervals as given represent the
exact limits. But often we deal with a continuous variable. Hence, it is necessary to
calculate the exact limits of the class intervals. These limits are usually taken as
one-half (0.5) unit above and below the values reported. For example, for the 56-60
class interval the exact limits are 55.5 – 60.5, i.e. subtract 0.5 from the lower limit
and add 0.5 to the upper limit. In the same way the exact limits of 61-65 can be
written as 60.5 – 65.5. If we write all the exact class intervals, we get a continuous
distribution.

The mid-points of the class intervals are required in the calculation of further
statistics like the mean, median, and standard deviation. When the mid-point of a
class interval is used, all the individual scores in that interval lose their identity
and take on the value of the mid-point of the class interval to which they belong.
The mid-point is calculated by adding the lower and upper limits of the class
interval and dividing by 2:

i.e. Mid-point = (Lower limit + Upper limit) / 2

For example, the mid-point of the C.I. 56-60 is (56 + 60)/2 = 58.
In the same way, the mid-point of the C.I. 61-65 is (61 + 65)/2 = 63.
Hence, in the calculation of various statistics, two important assumptions
about the class intervals are made. They are,
(i) The observations or scores are uniformly distributed over the entire range of
the interval. This assumption is generally used in the calculation of median,
quartiles and percentiles.
(ii) All the values or scores in the class interval are the same and equal to the
value corresponding to the mid-point of the interval. This assumption is generally
used in the calculation of mean, standard deviation and drawing frequency
polygons.
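The exact limits and mid-points described above can be computed mechanically. A minimal sketch follows; the two function names are ours, not part of the text.

```python
# Exact limits extend each class interval by 0.5 on both sides; the
# mid-point is the average of the stated lower and upper limits.
intervals = [(56, 60), (61, 65), (66, 70), (71, 75), (76, 80), (81, 85)]

def exact_limits(lower, upper):
    return (lower - 0.5, upper + 0.5)

def mid_point(lower, upper):
    return (lower + upper) / 2

for lo, up in intervals:
    print(lo, up, exact_limits(lo, up), mid_point(lo, up))
# e.g. the 56-60 interval has exact limits 55.5-60.5 and mid-point 58.0
```

Listing the exact limits end to end shows why the result is a continuous distribution: each interval's upper exact limit equals the next interval's lower exact limit.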
GRAPHICAL REPRESENTATION OF FREQUENCY DISTRIBUTION
The graphical representation of a frequency distribution aids in analysing the
numerical data and provides an easy understanding of the distribution. Five
graphical methods are generally used to express frequency distributions, viz. the
frequency polygon, the frequency curve, the histogram, the cumulative frequency
graph, and the cumulative percentage curve or ogive.
(i) Frequency Polygon
A polygon is defined as a many-sided figure. If a many-sided figure is drawn on
the basis of the frequencies given in a distribution, it is called a frequency polygon.
To draw a frequency polygon on a graph, the mid-points of the class intervals
are taken on the X-axis, whereas the frequencies are taken on the Y-axis. Against
each point on the base-line the corresponding frequency is located. Then all the
points are joined by a series of short lines to form the frequency polygon. Since a
polygon is a closed figure, the ends should touch the baseline. For this purpose, at
each end of the distribution assume one class interval with zero frequency.
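The plotting rule just described (mid-points on the X-axis, frequencies on the Y-axis, with a zero-frequency interval assumed at each end) can be sketched as a list of (x, y) points to be joined. This is our illustration using the Table 3.1 data:

```python
# Coordinates of the frequency polygon for Table 3.1. A class interval
# with zero frequency is assumed at each end so the figure closes on
# the baseline.
mid_points = [58, 63, 68, 73, 78, 83]
frequencies = [9, 8, 7, 7, 9, 10]
width = 5

points = ([(mid_points[0] - width, 0)] +
          list(zip(mid_points, frequencies)) +
          [(mid_points[-1] + width, 0)])
print(points)
# [(53, 0), (58, 9), (63, 8), (68, 7), (73, 7), (78, 9), (83, 10), (88, 0)]
```

Joining these points with straight line segments gives the polygon; joining them with a smooth curve instead gives the frequency curve described next.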

(Figure: frequency polygon; X-axis: mid-points of C.I., Y-axis: frequencies)

(ii) Frequency Curve
In a frequency polygon, instead of joining the points by straight lines, if they
are joined by a smooth curve it is called a frequency curve. But in a frequency
curve the two ends of the curve do not touch the origin.
(iii) Histogram
(Figure: histogram; X-axis: mid-points of C.I., Y-axis: frequencies)

A histogram is a graph in which the frequencies are represented by areas in
the form of bars. In a frequency polygon all the scores within a given interval are
represented by the mid-point of the class interval, but in a histogram the scores
are assumed to be spread uniformly over the entire interval. Within each interval of
a histogram, the frequency is shown by a rectangle, the base of which is the length
of the interval and the height of which is the number of scores within the interval.
(iv) Cumulative Frequency Graph
If the scores of a distribution are added serially with the next score, the
resulting totals are known as cumulative frequencies. For example, take the data
from Table – 3.1.
TABLE – 3.2
C.I.     f   Mid point of C.I.   Cumulative frequency   % Cum frequency
56-60    9   58                   9                      18
61-65    8   63                  17                      34
66-70    7   68                  24                      48
71-75    7   73                  31                      62
76-80    9   78                  40                      80
81-85   10   83                  50                     100
The data in the 4th column of Table – 3.2 represent the cumulative
frequencies of the distribution.
The graph is drawn by taking the mid-points of the class intervals on the
X-axis and the cumulative frequencies on the Y-axis. After the frequencies are
marked against each interval and the points are joined, we get a cumulative
frequency curve. The starting point of the cumulative frequency curve should
touch the base-line; hence we assume one additional class interval with zero
frequency before the first class interval.
It should be noted that the general trend of the cumulative frequency curve is progressively rising; the upward rise is not a straight line. When the distribution of frequencies is symmetrical, we get an S-shaped cumulative frequency curve. The zero frequency assumed by us does not affect the curve in any way. Percentiles, the median and quartile deviations are calculated using the cumulative frequencies.
(Figure: cumulative frequency curve for the data in Table – 3.2, with the mid-points of the class intervals on the X-axis and the cumulative frequencies on the Y-axis.)
(v) Cumulative percentage curve (or) Ogive
To calculate percentile ranks we need the cumulative percentage curve. The cumulative frequencies are converted into cumulative percentages: to convert a cumulative frequency into a cumulative percentage, multiply the cumulative frequency by 100 and divide it by the total number 'N'. The graph is then plotted by taking the exact limits of the C.I. on the X-axis and the cumulative percentages on the Y-axis. For example, the data in Table – 3.2 is plotted here. Ogives are useful:
a. To calculate percentiles and percentile norms.
b. To compare groups. If the ogives representing the distributions of different groups are plotted upon the same co-ordinate axes, the groups can be easily compared.
(Figure: ogive for the data in Table – 3.2, with the exact limits of the C.I. on the X-axis and the cumulative percentages on the Y-axis.)
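The conversion described above can be sketched in a few lines of Python; the frequencies are taken from Table – 3.2, and everything else is standard library.

```python
from itertools import accumulate

# Frequencies of Table 3.2, for the class intervals 56-60 up to 81-85.
freqs = [9, 8, 7, 7, 9, 10]

# Running totals give the cumulative frequency column.
cum_freqs = list(accumulate(freqs))            # [9, 17, 24, 31, 40, 50]

# Each cumulative frequency times 100, divided by N, gives the
# cumulative percentage column plotted on the ogive.
n = cum_freqs[-1]                              # N = 50
cum_pcts = [100 * cf / n for cf in cum_freqs]  # [18.0, ..., 100.0]
```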
SOME CONVENTIONS IN THE CONSTRUCTION OF GRAPHS
1. In the graphing of frequency distributions it is customary to let the
horizontal axis represent the scores and the vertical axis represent
frequencies.
2. The arrangement of the graph should proceed from left to right. The lower
numbers on the horizontal scale should be on the left and the lower
numbers on the vertical scale should be near the bottom.
3. The distance along either axis selected to serve as a unit is arbitrary and
affects the appearance of the graph. It is suggested that the ratio of height
to length be roughly 3:5. This proportion seems to have aesthetic
advantages.
4. Whenever possible the vertical scale should be so selected that a zero point
falls at the point of intersection of the axis.

5. Both the horizontal and vertical axes should be appropriately labelled.
6. Every graph should be assigned a descriptive title which states precisely
what it is about.
SUMMARY
A psychological test is a systematic and standardized method of observing and quantifying the characteristics of an individual or environment. Statistics, the science of measurement, plays a vital role in testing and assessment. There are two branches of statistics, viz. descriptive and inferential. A frequency distribution is an arrangement of data which shows the occurrence of the different values of a variable; it possesses class intervals, mid-points and exact limits, and it can also be represented graphically. The graphical representation of a frequency distribution helps us to analyse numerical data, and hence it is an important technique of presenting data.
KEY TERMS
Psychological Test Norms
Frequency distribution Class Interval (CI)
Midpoint of CI Exact limits of CI
Frequency polygon Frequency curve
Histogram Cumulative Frequency curve
Ogive
QUESTIONS
1. Examine the definition of a psychological test with its scope.
2. What is a frequency distribution? Explain with suitable examples.
3. Describe the different types of frequency distributions.
4. Discuss the four methods of representing frequency distributions
graphically.



LESSON – 4

THE BASIC STATISTICAL CONCEPTS IN TESTING AND


ASSESSMENT (Contd.)
OBJECTIVES
After reading this lesson the student should
 Understand what are the measures of central tendency
 Know how to calculate the measures of central tendency
 Describe when and where to use these measures.
SYNOPSIS
Measures of central tendency – Meaning – Three common measures of central
tendency : The Mean, Median and Mode.
MEASURES OF CENTRAL TENDENCY
After tabulating scores or data it is necessary to interpret them and to compare the performance of the group as a whole. Some standard values are required for this purpose. The commonly used central values are the mean, median and mode; these are also called “measures of central tendency”. Descriptive statistics that indicate the central location of a distribution of observations are known as measures of central tendency. A measure of central tendency is an index of central location employed in the description of frequency distributions. These statistics are used to summarize the data by describing the most typical or representative value in the data set.
i) MEAN
The ‘mean’ or ‘average’ is a measure of central location. It refers to the value obtained by adding together a set of measurements and dividing the total by the number of cases; it is also called the ‘arithmetic mean’. In general, an average is a central reference value which is close to the point of greatest concentration of the measurements, and for certain purposes a particular measurement may be viewed as a certain distance above or below the average. The mean is the most important and widely used measure of central tendency. It is the appropriate measure of central location for interval and ratio variables. The median and mode are sometimes viewed as appropriate measures for nominal and ordinal variables, although they can also be used with interval and ratio variables.
The simplest method of finding the arithmetic mean is to divide the sum of the scores by the total number of items. If x1, x2, ..., xn are N scores, then

    Arithmetic mean = (Sum of the scores) / (Number of items)
                    = (x1 + x2 + x3 + ... + xn) / N
                    = Σx / N

This is the basic formula for the mean; all the other formulae for calculating the mean are derived from it. It can easily be used when the scores are small, the number of items is few and the data is raw or ungrouped. The mean is generally denoted by 'M', so the formula may be written symbolically as

    M = Σx / N
Examples
1. Find the arithmetic mean of the numbers given below:
2, 3, 4, 5, 6, 7, 8, 9, 1, 5, 3, 4, 6, 7, 0

    M = Σx / N = 70 / 15 = 4.667

2. Find the average of the marks of 10 students given below:
12, 14, 13, 15, 16, 17, 14, 13, 11, 15

    M = 140 / 10 = 14
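Both worked examples can be checked in a couple of lines of Python, applying the basic formula M = Σx / N:

```python
# Example 1: fifteen raw scores.
scores = [2, 3, 4, 5, 6, 7, 8, 9, 1, 5, 3, 4, 6, 7, 0]
m1 = sum(scores) / len(scores)   # 70 / 15, about 4.667

# Example 2: marks of 10 students.
marks = [12, 14, 13, 15, 16, 17, 14, 13, 11, 15]
m2 = sum(marks) / len(marks)     # 140 / 10 = 14.0
```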
Mean for Grouped Data
Sometimes scores are repeated more than once; that is, frequencies are given. If the data is grouped and given in the form of a frequency distribution, the mean is calculated by the formula

    M = Σfx / N

Here, 'f' is the frequency, 'x' is the mid-point of the class interval, and 'N' is the total number of frequencies.

Example
Find out the mean for the distribution given below:

C.I.    f
1-5     13
6-10    14
11-15   16
16-20   19
21-25   15
26-30   14
31-35    9

The first step is to arrange the distribution as shown below, find the mid-point of each class interval and compute the 'fx' values.
TABLE – 4.1
C.I.    f    Mid-point of C.I. (x)    fx
1-5     13    3     39
6-10    14    8    112
11-15   16   13    208
16-20   19   18    342
21-25   15   23    345
26-30   14   28    392
31-35    9   33    297
        N = 100         Σfx = 1735

    Mean = Σfx / N = 1735 / 100
    M = 17.35
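The same computation can be sketched in Python, using the frequencies and mid-points of Table 4.1:

```python
# Grouped mean: M = sum(f * x) / N, with x the mid-point of each class interval.
freqs = [13, 14, 16, 19, 15, 14, 9]
midpoints = [3, 8, 13, 18, 23, 28, 33]

n = sum(freqs)                                         # N = 100
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))  # 1735
mean = sum_fx / n                                      # 17.35
```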
Assumed Mean Method
When the scores are large numbers the calculation becomes tedious. Hence a value is assumed as the mean; generally, the assumed mean is taken as the mid-point of the class interval which has the maximum frequency.
Once the mean is assumed, the deviations from the assumed mean are calculated and multiplied by the respective frequencies. Then the following formula is applied:

    Arithmetic mean = A + (Σfd / N) × C

Here,
A = assumed mean (the mid-point of the class interval with maximum frequency)
f = frequency
C = size of the class interval
N = total number of frequencies in the distribution
d = class deviation = (x - A) / C
where x is the mid-point of the C.I.

Example
Let us take the previous example in Table 4.1.

TABLE – 4.2
Class-Interval (C.I.)    Frequency (f)    Mid-point of C.I. (x)    Deviation (d)    fd
1-5     13    3    -3    -39
6-10    14    8    -2    -28
11-15   16   13    -1    -16
16-20   19   18*A   0      0
21-25   15   23     1     15
26-30   14   28     2     28
31-35    9   33     3     27
        Σf = 100                Σfd = -13

    Mean = A + (Σfd / N) × C
         = 18 + (-13 / 100) × 5
         = 18 - 0.65
    Mean = 17.35

Hence, we can avoid tedious calculations by employing the assumed mean method.
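A quick check of the shortcut in Python, with A = 18 and C = 5 as in Table 4.2:

```python
# Assumed-mean method: M = A + (sum(f*d) / N) * C, where d = (x - A) / C.
freqs = [13, 14, 16, 19, 15, 14, 9]
midpoints = [3, 8, 13, 18, 23, 28, 33]
A, C = 18, 5

n = sum(freqs)
devs = [(x - A) // C for x in midpoints]          # -3, -2, -1, 0, 1, 2, 3
sum_fd = sum(f * d for f, d in zip(freqs, devs))  # -13
mean = A + (sum_fd / n) * C                       # 18 - 0.65 = 17.35
```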
Characteristics
1. Every item in the distribution matters. If we change any one item in the
distribution, the mean will change.
2. The mean is affected by extreme values.
3. The sum of the deviations about the mean is zero; i.e. if we find the
difference between each score and the arithmetic mean and add all these
differences, the sum will be zero.
Advantages

1. It is the most commonly used average.
2. It is very easily computed and understood.
3. It utilizes the entire data in a group.
4. The aggregate can be calculated if the number of items and the sum are
known.
5. It affords a good comparison.
Limitations
1. It can hardly be located by inspection.
2. It can mislead if even one value is missing from the data.
3. The mean value can be greatly distorted by the extreme values.

(ii) MEDIAN
The median is the point in the distribution where half of the scores falls above
and the other half falls below.
To calculate the median, scores must be arranged either, in ascending or
descending order. Then we have to find out the mid way of the scores. The
calculation of median slightly differs if there is an even or odd number of scores.

Examples
(When the number of items in the series is odd)
1. Find out the median for the following numbers.
2, 5, 7, 3, 4, 6, 9
Arrange the numbers in the ascending order.
2, 3, 4, 5, 6, 7, 9
Since there are 7 members in the series, the fourth number is the median of
this array.
Therefore, Median = 5
(When the number of items in the series is even)
2. Find out the median for the following marks.
50, 56, 53, 52, 58, 60
Arrange the series in the ascending order.
50, 52, 53, 56, 58, 60
Here, the median is the average of the values of the two middle terms. In this array there are 6 numbers; hence the median is taken as the average of the 3rd and 4th terms,

    i.e. (53 + 56) / 2 = 54.5

Therefore, Median = 54.5
In general, the formula for the median is: Median = the value of the ((N + 1)/2)th term, provided the terms are arranged in an array. This formula is suitable only for raw data.
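Both cases, odd and even N, can be handled by a small helper function; this is a sketch, with `median` simply the name chosen here:

```python
def median(scores):
    """Median of ungrouped scores: sort, then take the middle term,
    or the average of the two middle terms when N is even."""
    s = sorted(scores)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                 # odd N: the single middle term
    return (s[mid - 1] + s[mid]) / 2  # even N: average of the middle pair
```

For the two worked examples, `median([2, 5, 7, 3, 4, 6, 9])` gives 5 and `median([50, 56, 53, 52, 58, 60])` gives 54.5.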
Median for Grouped Data
To calculate the median for grouped data, the following formula is used:

    Median = l + ((N/2 - m) / f) × c

Here,
l = the exact lower limit of the class interval in which the median lies (the median class)
N/2 = half of the total frequency
f = frequency of the median class
m = cumulative frequency up to the lower boundary of the median class
(i.e. the cumulative frequency before the median class)
c = size of the class interval.

Example
Let us take the same example as in Table 4.2.
TABLE – 4.3
C.I.    f    Cumulative frequency (C.F.)    True (or) exact class intervals
1-5 13 13 0.5 – 5.5
6-10 14 27 5.5 – 10.5
11-15 16 43 10.5 – 15.5
16-20 19 62 15.5 – 20.5
21-25 15 77 20.5 – 25.5
26-30 14 91 25.5 – 30.5
31-35 9 100 30.5 – 35.5

To calculate the median for the grouped data the following steps should be
followed:
(1) Compute the cumulative frequencies.
(2) Determine the N/2. i.e. one half of the number of scores.
(3) Find the class interval in which the middle score falls and determine the
exact lower limit of that interval.
(4) Find out the cumulative frequency before the median class.
(5) Substitute the values in the formula for median.
In Table 4.3:
N = 100
N/2 = 100/2 = 50
The 50th case lies in the 15.5 – 20.5 class. Therefore, the exact lower limit of the median class is 15.5 (i.e. l = 15.5), and f = 19, m = 43, c = 5.

    Median = l + ((N/2 - m) / f) × c
           = 15.5 + ((50 - 43) / 19) × 5
           = 15.5 + (7 × 5) / 19
           = 15.5 + 1.84
    Median = 17.34
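The five steps can be followed mechanically in Python; the frequencies and exact lower limits are those of Table 4.3:

```python
# Grouped median: l + ((N/2 - m) / f) * c, interpolated in the median class.
freqs = [13, 14, 16, 19, 15, 14, 9]
lowers = [0.5, 5.5, 10.5, 15.5, 20.5, 25.5, 30.5]  # exact lower limits
c = 5

n = sum(freqs)
target = n / 2       # 50
cum = 0              # cumulative frequency before the current class
for f, l in zip(freqs, lowers):
    if cum + f >= target:                     # the median class is reached
        median = l + ((target - cum) / f) * c
        break
    cum += f
# median = 15.5 + (7 / 19) * 5, about 17.34
```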
Characteristics
1. It is the middle position (or) exact mid-point which divides the distribution
equally into two parts.
2. It is affected by the number of cases in the distribution and not by the size
of the extreme values; i.e. if we increase or decrease the number of cases, the
median will change, but if the number of cases stays the same and only the
values of the extreme items are increased or decreased, the median will not change.
Advantages
1. It is easily calculated.
2. It is not affected by the unusual items.
3. It is not affected by the extreme variations.
4. It can be determined without knowledge of the magnitude of the extreme
items, provided the number of items (or cases) is known.
Limitations
1. It has no further applications like the mean.
2. The items have to be arranged in size before the median can be computed.
3. The median cannot be added algebraically. i.e. If the medians of the two
subgroups are known the median of the total group cannot be found out from
the given values of the medians.
(iii) MODE
Another important measure of central tendency is the mode. In a simple ungrouped series of measures, the crude mode (or empirical mode) is the single measure or score which occurs most frequently.
In other words, the score that occurs most often is known as the mode.
Examples
1. Find the mode of the series:
6, 7, 8, 8, 3, 2, 3, 8, 7, 2, 8, 8
In this series '8' occurs most often (5 times) and hence '8' is the crude mode of this series.
2. Find the mode of the series:
21, 26, 23, 28, 28, 44, 46, 46, 49, 62, 63
In this series the numbers '28' and '46' occur most often (2 times each).
Hence mode = 28, 46.
This is known as a bimodal series.
Mode for grouped data
Example
Let us take the example in Table 4.2. In a frequency distribution, the crude mode is the mid-point of the class interval which has the maximum frequency. In Table 4.2, the class interval 16-20 has the maximum frequency of '19'; hence the mid-point of that C.I., '18', is the crude mode of this distribution.
Another Method
Another formula for the mode is

    Mode = 3 Median - 2 Mean

For Table 4.2, the mean was calculated as '17.35' and the median as '17.34'.

    Hence, Mode = 3(17.34) - 2(17.35)
                = 52.02 - 34.70
    Mode = 17.32
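Both ideas can be sketched in Python: a helper for the crude mode (which may return more than one value for a bimodal series), and the empirical relation, which with the rounded values 17.34 and 17.35 gives 17.32:

```python
from collections import Counter

def crude_modes(scores):
    """All values that occur with the maximum frequency (one or more modes)."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

# Empirical relation: Mode = 3 * Median - 2 * Mean (values from Table 4.2).
empirical_mode = 3 * 17.34 - 2 * 17.35   # 52.02 - 34.70 = 17.32
```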
Characteristics
1. The mode is a statistic of limited practical value.
2. It is entirely independent of the extreme values.
3. It is an average position.
4. It has little meaning unless the number of measurements under study is fairly
large.
Advantages
1. It is the most descriptive average
2. In smaller number of cases it can be calculated by inspection and observation.
3. To calculate the mode there is no need to arrange the values in order of size.
Limitations
1. Its significance is very limited.
2. In a small number of cases no mode may be found, since no single value
may be repeated.
3. It cannot be used for further algebraic manipulation.

WHEN TO USE THE MEASURE OF CENTRAL TENDENCY
(i) Mean
1. When the scores are distributed symmetrically around a central point, i.e. the
distribution is more or less normal, and each score should contribute to
its determination.
2. When the measure of central tendency having the greatest stability is wanted.
3. When other statistics (like standard deviation, co-efficient of correlation) are to
be computed later. Many statistics are based upon the mean.
(ii) Median
1. When the exact midpoint of the distribution is wanted. i.e. 50% point.
2. When there are extreme scores which would markedly affect the mean; extreme scores do not disturb the median. (For example, in the series 4, 5, 6, 7 and 8, both the mean and median are 6. But if '8' is replaced by 50, the other scores remaining the same, the median remains 6 while the mean becomes 14.4.)
3. When it is desired that certain scores should influence the central tendency, but
all that is known about them is that they are above or below the median.
(iii) Mode
1. When a quick and approximate measure of central tendency is all that is
wanted.
2. When the measure of central tendency should be the most typical value, e.g.
when we describe the style of dress or shoes worn by the 'average woman',
the modal or most popular fashion is usually meant.
SUMMARY
Measures of central tendency help us to interpret and compare the performance of a group as a whole. The mean, median and mode are the commonly used measures of central tendency. The mean is a measure of central location; the median is the point which divides the distribution equally into two halves; and the mode is the most frequently occurring score in the distribution. The measures of central tendency have wide application in statistical analysis since they describe the average score in a distribution.
KEY TERMS
Central tendency Mean
Arithmetic Mean Average
Median Mode
QUESTIONS
1. What are the measures of central tendency?
2. What is a median? How is it calculated?
3. How do you define mode? When do you apply it?
4. Calculate the mean, median and mode for the following frequency distribution.
C.I. 10-14 15-19 20-24 25-29 30-34 35-39 40-44
f 2 4 7 8 7 7 5


LESSON – 5

BASIC STATISTICAL CONCEPTS IN TESTING AND ASSESSMENT


(Contd.)
OBJECTIVES
After reading this lesson the student should
 Understand the measures of variability.
 Explain range as a measure of variability.
 Describe the procedures followed in computing standard deviation.
 Discuss how standard deviation can be interpreted.
SYNOPSIS
Measures of variability: The range – Quartile Deviation and Mean Deviation –
The variance and the standard deviation.
INTRODUCTION
Measures of central tendency such as the mean, median and mode are used to summarize information. They are important because they describe the average score in a distribution. But the average alone does not give all the information that is needed about a group of scores. Any distribution has at least one more feature that must be summarized: distributions exhibit spread or dispersion, the tendency for observations to depart from central tendency. Variability or dispersion is thus an important concept in statistical inquiry. It reflects the “poorness” of central tendency as a description of a randomly selected case, since it depicts the tendency of observations not to be like the average. As variability is accounted for, estimates and inferences are improved.
Variability refers to the spread of the separate scores around a measure of
central tendency. If the spread is substantial, the variability is said to be
considerable; if the spread is little, it is insignificant.
A measure of variation or dispersion measures the scatter or spread of the scores around the central tendency. Measures of variation indicate the amount or degree of variation but not its direction. Measures of variability are also called measures of the second order, because in their calculation we average values derived from measures of the first order (the mean, median or mode). All measures of variability represent distances rather than points; the larger they are, the greater the variability in the scores.
The usefulness of a measure of variability can be seen from a simple example. Suppose a test of intelligence has been administered to a group of 50 boys and 50 girls, and the mean score is 105 for both groups. This seems to indicate that there is no difference in the performance of the two groups.
But the boys' scores are found to range from 95 to 115, while those of the girls range from 75 to 155. This difference in range shows that the girls are more variable than the boys; hence the boys' group is the more homogeneous with respect to IQ. If a group is made up of individuals of nearly the same ability, most of the scores will fall close to the mean and the range will be relatively short, so the variability will be small. We can expect the first group to be more teachable and its members to learn new ideas at nearly the same rate, whereas the second group can be expected to show considerable disparity in the speed of grasping new ideas. Measures of dispersion are highly useful for explaining these kinds of differences.
Four customary values are used to indicate variability. They are:
(i) The range
(ii) The semiquartile range (or) Quartile deviation (Q)
(iii) The average or mean deviation
(iv) The standard deviation

i) RANGE
Range is the simplest method of studying variability. It is defined as the
difference between the value of the smallest score and the value of the largest score
included in the distribution.
    i.e. Range = L - S
where L is the largest score and S is the smallest score.
If the averages of two distributions are about the same, a comparison of the
range indicates that the distribution with the smaller range has less dispersion and
the average of that distribution is more representative of the group.

Limitations
1. It helps us to make only a rough comparison of two or more groups for
variability.
2. It takes account of only the two extreme scores of a series and is unreliable
when 'N' is small or when there are large gaps in the frequency distribution.
3. It is highly affected by the fluctuations in sampling. Its value is never stable.
4. The range does not take into account the composition of a series or the
distribution of the items within the extremes. The range of a symmetrical and
of an asymmetrical distribution can be identical.
(ii) QUARTILE DEVIATION
The quartile deviation (or) semiquartile range is one-half the scale distance
between the first quartile (or 25th percentile) and 3rd quartile (or 75th percentile) in a
frequency distribution.
The value which divides the first half of the series (the values less than the median) into two equal parts is called the 1st quartile (Q1) or lower quartile. In other words, Q1 is the point below which 25% of the scores lie.

Fig. 5.1 Quartiles in a distribution (Q1, Q2 = Median, Q3)

The second or middle quartile is the median; it is the point below which 50% of the scores lie. The median is the 2nd quartile (Q2), or the 50th percentile (P50).
The value which divides the latter half of the series into two equal parts is called the 3rd quartile (Q3) or upper quartile; it is the point below which 75% of the scores lie. Q3 is the 75th percentile (P75).
The quartile deviation (Q) is one half of the scale distance between the 3rd and 1st quartiles:

    Q = (Q3 - Q1) / 2
Here,
    Q1 = l + ((N/4 - m) / f) × c
    Q3 = l + ((3N/4 - m) / f) × c

Note: In the median formula, N/2 is replaced by N/4 for Q1 and by 3N/4 for Q3.

‘Q’ for grouped data
Let us take Table 4.3 for the calculation of 'Q'.

    Q1 = l + ((N/4 - m) / f) × c

In the example in Table 4.3:
N = 100
N/4 = 25
The 25th case lies after the cumulative frequency 13, i.e. in the class interval 5.5 – 10.5.
Here, l = 5.5, f = 14, m = 13, c = 5
    Q1 = 5.5 + ((25 - 13) / 14) × 5
       = 5.5 + 4.29
    Q1 = 9.79

Similarly, 3N/4 = (3 × 100)/4 = 75.
The 75th case lies after the cumulative frequency 62, i.e. in the class interval 20.5 – 25.5.
Here, l = 20.5, f = 15, m = 62, c = 5.

    Q3 = 20.5 + ((75 - 62) / 15) × 5
       = 20.5 + 4.33
    Q3 = 24.83

    Q = (Q3 - Q1) / 2
      = (24.83 - 9.79) / 2 = 15.04 / 2
    Q = 7.52
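Since Q1 and Q3 are interpolated exactly like the grouped median, one small helper covers both; the data are those of Table 4.3:

```python
# Quartile deviation: Q = (Q3 - Q1) / 2, each quartile interpolated in its class.
freqs = [13, 14, 16, 19, 15, 14, 9]
lowers = [0.5, 5.5, 10.5, 15.5, 20.5, 25.5, 30.5]  # exact lower limits
c = 5
n = sum(freqs)

def quartile(target):
    """Interpolate the score below which `target` of the N cases fall."""
    cum = 0
    for f, l in zip(freqs, lowers):
        if cum + f >= target:
            return l + ((target - cum) / f) * c
        cum += f

q1 = quartile(n / 4)      # 5.5 + (12/14)*5, about 9.79
q3 = quartile(3 * n / 4)  # 20.5 + (13/15)*5, about 24.83
q = (q3 - q1) / 2         # about 7.52
```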

Merits
1. It is a more representative and trustworthy measure of variability than the
overall range.
2. It is a good index of score density at the middle of the distribution.
3. Quartiles are useful in indicating the skewness of a distribution.
Q3 – Median > Median – Q1 indicates positive skewness
Q3 – Median < Median – Q1 indicates negative skewness
Q3 – Median = Median – Q1 indicates zero skewness.

4. Like the median, 'Q' is applicable to open-end distributions.
5. It is the probable error of a normal distribution.
Limitations
1. It is not useful for further algebraic treatments.
2. It is possible for two distributions to have equal ‘Q’s but quite dissimilar
variability of the lower and upper 25% of the scores.
3. It is affected to a considerable extent by fluctuations in sampling. A change in
the value of a single item may in certain cases affect its value considerably.

iii) AVERAGE DEVIATION


The average deviation is the average distance between the mean and the scores
in the distribution. It is the arithmetic mean of all the deviations when algebraic
signs are disregarded.
For ungrouped data the average deviation is calculated using the formula

    A.D. = Σ|x| / N

Here, x is the deviation of each score from the mean (x = X - M), and N is the number of scores.

Example
Calculate the average deviation for the following data: 5, 8, 10, 12, 15.

Scores    Deviation x = X - M    Deviation without sign
5         -5                     5
8         -2                     2
10         0                     0
12        +2                     2
15        +5                     5
ΣX = 50                         Σ|x| = 14

Mean = 50/5, so M = 10, and N = 5.

    Average deviation = Σ|x| / N = 14 / 5
    A.D. = 2.8

This indicates that the scores deviate on the average 2.8 points from the mean.

A.D. for grouped data
The average deviation for grouped data is given by the formula

    A.D. = Σf|x - M| / N

Here,
x = the mid-point of a class interval
M = the mean of the distribution
N = the sum of the frequencies, i.e. the total number of cases.
Let us take the problem in Table 4.2. The mean has already been calculated as '17.35' (in the previous lesson).

C.I.    f    Mid-point x    |x - M|    f|x - M|
1-5     13    3    14.35    186.55
6-10    14    8     9.35    130.90
11-15   16   13     4.35     69.60
16-20   19   18     0.65     12.35
21-25   15   23     5.65     84.75
26-30   14   28    10.65    149.10
31-35    9   33    15.65    140.85
        Σf = N = 100        Σf|x - M| = 774.10

    A.D. = Σf|x - M| / N = 774.10 / 100
    A.D. = 7.741
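The table above can be reproduced in a few lines of Python:

```python
# Average deviation for grouped data: A.D. = sum(f * |x - M|) / N (Table 4.2).
freqs = [13, 14, 16, 19, 15, 14, 9]
midpoints = [3, 8, 13, 18, 23, 28, 33]

n = sum(freqs)                                                     # 100
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n            # 17.35
ad = sum(f * abs(x - mean) for f, x in zip(freqs, midpoints)) / n  # 7.741
```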

Merits
1. It is an easily understood measure of variability, i.e. the average of the
deviations from the mean.
2. It is based on all the observations and it cannot be calculated in the absence of
even a single score.
3. It is not affected very much by the values of the extreme items.
Limitation
The average deviation ignores the algebraic signs of the deviations and as such is
not capable of further mathematical treatment.

(iv) STANDARD DEVIATION

The standard deviation (S.D.) is the most commonly used indicator of the degree of dispersion and is the most dependable estimate of the variability in the population from which the sample was chosen. It is also used in numerous other statistical formulae.
The standard deviation is the most stable index of variability. It differs from the A.D. in several respects:
1. In computing the A.D. we disregard signs, whereas in finding the S.D. we avoid the difficulty of signs by squaring the separate deviations.
2. The squared deviations used in computing the S.D. are always taken from the mean.
The standard deviation is a kind of average of all the deviations from the mean. The fundamental formula for computing this index of variability is

    S.D. = √(Σx² / N)

where x is the deviation of each score from the mean, i.e. x = X - M, and N is the size of the sample. As a general concept the standard deviation is symbolised by S.D., but much more often by 'σ' (sigma).
The steps involved in the computation of the S.D. are:
1. Find each deviation from the mean, i.e. x = X - M.
2. Square each deviation, i.e. find x².
3. Sum all the squared deviations, i.e. find Σx².
4. Divide the sum by N, i.e. find Σx²/N.
5. Take the square root of the result of step 4, i.e. √(Σx²/N); this gives the standard deviation for ungrouped data.

Example
Calculate the S.D. for the following array:
5, 8, 10, 12, 15

Scores    Deviation x = X - M    Square of the deviation x²
5         -5                     25
8         -2                      4
10         0                      0
12        +2                      4
15        +5                     25
ΣX = 50                         Σx² = 58

Here, the mean is 50/5 = 10 and N = 5.

    S.D. = √(Σx² / N) = √(58 / 5) = √11.6
    S.D. = 3.41
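The five steps can be checked in Python (note that the deviation of 8 from the mean of 10 is -2, so the sum of the squared deviations is 58):

```python
from math import sqrt

# S.D. of ungrouped scores: sqrt(sum((X - M)^2) / N).
scores = [5, 8, 10, 12, 15]
n = len(scores)
mean = sum(scores) / n                         # 10.0
sum_sq = sum((x - mean) ** 2 for x in scores)  # 25 + 4 + 0 + 4 + 25 = 58
sd = sqrt(sum_sq / n)                          # sqrt(11.6), about 3.41
```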
S.D. for Grouped Data
To calculate the S.D. for grouped data the following formula is applied:

    σ = C √(Σfd²/N - (Σfd/N)²)

where d = (x - A)/C, x is the mid-point of a class interval and A is the assumed mean.
The steps in the calculation of the S.D. are:
1. Take the deviations of the class intervals from the assumed mean; these class deviations are denoted by 'd'.
2. Multiply the deviations by the respective frequencies and obtain the total Σfd.
3. Multiply each 'fd' by 'd' once more and obtain the sum of squares Σfd².
4. Substitute the values in the formula to obtain the S.D.

Example
Let us take the example in Table 4.2. Here, the assumed mean = 18.

Class Interval (C.I.)    Frequency (f)    Mid-point of C.I.    Deviation (d)    fd    fd²
1-5     13    3    -3    -39    117
6-10    14    8    -2    -28     56
11-15   16   13    -1    -16     16
16-20   19   18     0      0      0
21-25   15   23     1     15     15
26-30   14   28     2     28     56
31-35    9   33     3     27     81
        Σf = N = 100           Σfd = -13    Σfd² = 341

C = 5, Σfd = -13, Σfd² = 341, N = 100

    σ = C √(Σfd²/N - (Σfd/N)²)
      = 5 √(341/100 - (-13/100)²)
      = 5 √(3.41 - 0.0169)
      = 5 √3.3931
      = 5 × 1.842
    σ = 9.21
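The shortcut formula can be verified in Python with the data of Table 4.2:

```python
from math import sqrt

# Grouped S.D.: sigma = C * sqrt(sum(f*d^2)/N - (sum(f*d)/N)^2), d = (x - A)/C.
freqs = [13, 14, 16, 19, 15, 14, 9]
midpoints = [3, 8, 13, 18, 23, 28, 33]
A, C = 18, 5

n = sum(freqs)
devs = [(x - A) // C for x in midpoints]               # -3 ... 3
sum_fd = sum(f * d for f, d in zip(freqs, devs))       # -13
sum_fd2 = sum(f * d * d for f, d in zip(freqs, devs))  # 341
sigma = C * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)      # about 9.21
```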
Merits
1. The S.D. is rigidly defined and its value is always definite.
2. It is based on the observation of all the data.
3. It is amenable to algebraic treatment and possesses many mathematical
properties. Hence, it is used in many advanced calculations.
4. It is less affected by the fluctuations in sampling than most of the other
measures of variability.
5. It is possible to calculate the combined S.D. of two or more groups.
Limitations
1. It is difficult to understand and interpret.
2. It gives more weight to extreme items and less to those near the mean, because
the square of the deviations which are big in size would be proportionally
greater than the squares of those which are comparatively small.
Combined Standard Deviation
It is possible to compute the combined standard deviation of two or more groups. The combined standard deviation is denoted by σ12 and is computed as follows:

    σ12 = √((N1σ1² + N2σ2² + N1d1² + N2d2²) / (N1 + N2))

Here,
σ1 = S.D. of the 1st group
σ2 = S.D. of the 2nd group
d1 = M1 - M12 ; d2 = M2 - M12
M1 = mean of the 1st group scores
M2 = mean of the 2nd group scores
M12 = combined mean = (N1M1 + N2M2) / (N1 + N2)
N1 = number of items in the 1st group
N2 = number of items in the 2nd group

Example
The mean, S.D. and N of two groups are given here. Calculate the combined S.D.

Group    N     Mean    σ
A        50    95      6
B        60    105     7

    M12 = (N1M1 + N2M2) / (N1 + N2)
        = ((50 × 95) + (60 × 105)) / (50 + 60)
        = (4750 + 6300) / 110 = 11050 / 110
    M12 = 100.45

    d1 = M1 - M12 = 95 - 100.45 = -5.45
    d2 = M2 - M12 = 105 - 100.45 = 4.55

    σ12 = √((N1σ1² + N2σ2² + N1d1² + N2d2²) / (N1 + N2))
        = √(((50 × 6²) + (60 × 7²) + 50(-5.45)² + 60(4.55)²) / (50 + 60))
        = √((1800 + 2940 + 1485.12 + 1242.15) / 110)
        = √67.88
    σ12 = 8.24
The variance and standard deviation
A more stable index that reflects the degree of variability in a group of scores is
the variance and its derivative is the standard deviation. The standard deviation
concept was introduced by Karl Pearson in 1893. It is a widely used measure of
studying dispersion. The S.D. is also known as the root-mean-square deviation, for the reason that it is the square root of the mean of the squared deviations from the arithmetic mean. Usually the standard deviation is derived from the variance by taking the square root of the variance:

    S.D. = √Variance

i.e. the standard deviation is the square root of the variance (or the square root of the mean squared deviation) and is an index of variability. In the calculation of the S.D., deviations are always taken from the mean and never from the median or mode.
Further, the value of S.D. is always positive. Standard deviation provides a
standard unit for measuring distances of various scores from their mean.
The S.D. measures the absolute dispersion or variability of a distribution. The
greater the amount of dispersion or variability, the greater the standard deviation
and also the greater will be the magnitude of the deviations of the values from their
mean. A small standard deviation means a high degree of uniformity of the
observations as well as homogeneity of a series; a large standard deviations means
it is just opposite.
Thus, if we have two or more comparable series with identical means, it is the
distribution with the smallest standard deviation that has the most representative
mean. Hence, the standard deviation is extremely useful in judging the representativeness of the mean.
The most useful and widely accepted interpretation of a standard deviation is in terms of the percentage of cases included within the range from one standard deviation below the mean to one standard deviation above the mean. In a normal distribution, the range from −1σ to +1σ contains 68.27% of the cases. Since most samples yield distributions that depart to some extent from normality, the proportion actually covered is about two-thirds, somewhat less than 68.27%.
The sum of squares, the variance and the standard deviation are very closely related. The interrelationships can be shown algebraically as follows (where x is a deviation from the mean):

Standard deviation σ = √(Σx²/N) = √V
Variance V = Σx²/N = σ²
Sum of squares Σx² = N·V = Nσ²
Both V and σ are indicators of the amount of variability or dispersion in a distribution. V (or σ²) is said to measure variance and σ to measure variability. V and σ are also familiar indicators of the extent of individual differences, which form the basis of all psychological and educational testing.
WHEN TO USE THE MEASURES OF VARIABILITY
i) Range
1. When a knowledge of an extreme score is all that is wanted.
2. When the data are too scant or too scattered to justify the computation of a
more precise measure of variability.
ii) Quartile Deviation
1. When the median is a measure of central tendency.
2. When the distribution is incomplete at either end.

3. When there are scattered or extreme scores which would disproportionately
influence the SD.
4. When the concentration around the median (i.e. the middle 50% of the cases) is of
primary interest.
iii) Average Deviation
1. When it is desired to weigh all deviations from the mean according to their size.
2. When extreme deviations would influence standard deviation unduly.

iv) Standard Deviation


1. When a measure having the greatest stability and reliability is sought.
2. When extreme deviations should exert a proportionally greater effect upon the
variability.
3. When the co-efficient of correlation and other statistics are subsequently to be
computed.
4. When the interpretations related to the normal probability curve are desired.
SUMMARY
A measure of variability measures the scatter or spread of the scores around
central tendency. Range, Quartile deviation, Mean deviation and Standard deviation
are the different measures of variability. Variability measures the amount or degree of
variation but not its direction. The range provides an idea about how
the distribution is spread. Quartile deviation divides the distribution equally into
four parts, which helps us in the calculation of percentiles and deciles. The
standard deviation is the most stable index of variability and is highly useful in
future calculations. Standard deviation provides a standard unit for measuring
distances of various scores from their mean.
KEY TERMS
Range Quartile deviation
Average Deviation Standard Deviation
Combined Standard Deviation The Variance
Sum of Squares
QUESTIONS
1. Enumerate the measures of variability with examples.
2. Define standard deviation and explain its uses.
3. Following are the marks scored by pupils in statistics 45, 65, 85, 90, 33, 92.
Find out the standard deviation.
4. Calculate the Quartile deviation, Average deviation and standard deviation for
the following frequency distribution.

C.I.     f
5-9      5
10-14    6
15-19    15
20-24    10
25-29    8
30-34    6
35-39    3
40-44    2




LESSON – 6

BASIC STATISTICAL CONCEPTS IN TESTING AND ASSESSMENT


(Contd.)
OBJECTIVES
After reading this lesson the student should
 Understand the normal probability curve.
 Describe the properties of a normal probability curve
 Explain the applications of a normal probability curve
 Analyze the skewness and kurtosis in a distribution.

SYNOPSIS
Normal probability curve: Properties and its applications – Skewness and
Kurtosis.
SHAPES OF DATA DISTRIBUTION
Distribution of scores may assume many different shapes. When a frequency
distribution is plotted with only a small number of scores, the shape of the curve
will be very uneven or irregular. With a large number of scores, the curve would
ordinarily be expected to take on a more smoothed or regular appearance. The
shape of this smooth curve will depend upon (i) The properties of the measuring
instrument (ii) the distribution of the underlying characteristic we are attempting to
measure.

There are three types of curves (distributions) most frequently discussed in


psychological measurement. They are

(i) Normal probability curve (or) Normal curve.


(ii) Positively skewed curves and
(iii) Negatively skewed curves.

THE NORMAL PROBABILITY CURVE
It is a bell-shaped, perfectly bilaterally symmetrical curve in which the measures
are concentrated closely around the center and taper off from this central high point
to the left and right. There are relatively few measures at the low-score end of the
scale, an increasing number up to a maximum at the middle position, and a
progressive falling off towards the high-score end of the scale.

[Figure: the normal probability curve — bell-shaped, symmetrical and asymptotic; the mean, median and mode coincide at the center.]
The NPC or Normal curve is typical of many biological, anthropometrical,
psychological & physical measurements. It occurs whenever a large element of
chance enters into a measurement & when each chance event has about 50-50
probability of occurring.

Properties of NPC
1. The curve is symmetrical. The mean, median & mode exactly coincide in the
middle of the NPC.
2. The maximum ordinate of the curve occurs at the mean, that is where z = 0,
and in the unit normal curve is equal to 0.3989.
3. From the maximum point at the mean of NPC, the height of the curve declines
as we go in either direction from the mean. This falling off is slow at first then
rapid and then slow again.
4. The curve is asymptotic. Theoretically the curve never touches the base line; its
tails approach but never reach the base line. Hence the range is unlimited.
5. The points of inflection of the curve occur at points ±1 standard deviation unit
above and below the mean. Thus the curve changes from convex to concave in
relation to the horizontal axis at these points.

6. Roughly 68% of the area of the curve falls within the limits of 1 SD unit from
the mean.
7. The heights of the curve at distances of 1SD, 2SD and 3SD from the mean on both
sides are 60.7%, 13.5% and 1.1% respectively of the height at the median.
8. In a normal distribution, the mean equals the median exactly and the skewness
is zero.
9. In the unit normal curve the limits z = ±1.96 include 95% and the limits z =
±2.58 include 99% of the total area of the curve, 5% and 1% of the area
respectively falling beyond these limits.

[Figure: areas under the NPC — 34.13% of cases between the mean and ±1σ, 13.59% between ±1σ and ±2σ, 2.15% between ±2σ and ±3σ, and 0.13% beyond ±3σ; the mean, median and mode lie at z = 0.]

In essence the normal probability curve is based on the normal frequency
distribution. A normal distribution is a hypothetical distribution, with a bell-shaped
form in which a large proportion of scores is at or near the midpoint of a
distribution.
In a normal distribution, the scale along the bottom of the distribution is in z-score units, i.e. the mean of a normal distribution is always 0. The other numbers represent the standard deviation units above and below the mean. A z-score of −2 represents a score that is 2.0 standard deviations below the mean, whereas a z-score of 3.0 represents a score that is 3.0 standard deviations above the mean. If we convert a raw score into a z-score, we can determine its percentile ranking. In addition we can use this fact to determine the proportion of the distribution that lies between specific points. For normally and near-normally distributed variables, the proportion of the area between the mean and
 1.0 standard deviations is approximately 0.68 (or) 68%
 2.0 standard deviations is slightly more than 0.95 (or) 95%
 3.0 standard deviations is approximately 0.997 (or) 99.7%
The value of the normal distribution lies in the fact that many real world
distributions approach the form of theoretical distribution. This fact enables us to

use the characteristics of the theoretical model to our advantage in real-world
applications. One of the most common uses of the Z-score is in comparing two
different measures that have different means and standard deviations. Using the z-
score transformation we can make direct comparisons between these different
scales.
The distances, always in units of SD, are measured off on the base line of the
normal curve from the mean as origin. The number of cases in the normal
distribution between the mean and the ordinates erected at distances of ±1σ,
±2σ and ±3σ is shown below:

i) Between mean & +1σ : 34.13%
ii) Between mean & −1σ : 34.13%
iii) Between +1σ & +2σ : 13.59%
iv) Between −1σ & −2σ : 13.59%
v) Between +2σ & +3σ : 2.15%
vi) Between −2σ & −3σ : 2.15%
vii) Beyond +3σ : 0.13%
viii) Beyond −3σ : 0.13%
Total : 100%
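These tabled areas can be reproduced from the cumulative normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative:

```python
from statistics import NormalDist

nd = NormalDist()  # unit normal curve: mean 0, S.D. 1

# Proportion of cases between the mean and z = +1 (the 34.13% above)
mean_to_1sd = nd.cdf(1) - nd.cdf(0)
print(round(mean_to_1sd * 100, 2))        # → 34.13

# One-sided bands; doubling gives the two-sided figures
band_1_2 = nd.cdf(2) - nd.cdf(1)          # ≈ 13.59%
band_2_3 = nd.cdf(3) - nd.cdf(2)          # ≈ 2.14% (tables round this to 2.15%)
beyond_3 = 1 - nd.cdf(3)                  # ≈ 0.13%
print(round(band_1_2 * 100, 2), round(beyond_3 * 100, 2))  # → 13.59 0.13
```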
Applications of the NPC
1. The area relationship under the NPC may be employed to derive much
important information about any normal distribution of measures.
2. A given histogram or frequency polygon may be compared with an NPC of the
same area, mean & SD.
3. The percentage of cases in a normal distribution within given limits can be
determined.
4. The limits in any normal distribution which include any given % of the cases
can be found out.
5. Two distributions can be compared in terms of overlapping.
6. The relative difficulty of test questions, problems and other test items can be
determined.
7. Using an NPC, a given group can be separated into sub-groups, according to
capacity, when the trait is normally distributed.
SKEWNESS
A distribution is said to be skewed when the mean and the median fall at
different points in the distribution, and the balance, or centre of gravity, is shifted
to one side or the other.
When the dispersion or scatter of the scores in a series is greater on one side of
the point of central tendency than on the other, the distribution is skewed. In
other words, skewness is the degree of asymmetry, or departure from symmetry, of
a distribution.

[Figure: positively and negatively skewed distributions compared with the normal curve — in a positively skewed distribution the order from left to right is mode, median, mean; in a negatively skewed distribution it is mean, median, mode.]


When most of the scores pile up at the low end (or left) of the distribution and
spread out more gradually towards the high end of it, the distribution is said to be
positively skewed. In a positively skewed distribution, the mean falls to the right of
the median.
When most of the scores pile up at the high end (or right) of the distribution and
spread out more gradually toward the low end of it, the distribution is said to be
negatively skewed. In a negatively skewed distribution, the mean falls to the left of
the median.

Index of skewness

A useful index of skewness is given by the formula

SK = 3(Mean − Median) / S.D.
The formula clearly points out that
(i) The skewness is zero, when the mean = median
(ii) The skewness is positive, when the mean > median
(iii) The skewness is negative, when the mean < median.
When the percentiles are available the following formula can be used:

SK = (P90 + P10) / 2 − P50 (Median)

For a normal curve skewness is zero.
For example, if Mean = 40, Median = 44 and S.D. = 40, find the skewness.

SK = 3(Mean − Median) / S.D.
   = 3(40 − 44) / 40
   = −12 / 40
SK = −0.3
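Both skewness formulas are easy to sketch in Python; the function names below are illustrative, not from the text:

```python
def sk_index(mean, median, sd):
    """Pearson's index of skewness: SK = 3(Mean − Median) / S.D."""
    return 3 * (mean - median) / sd

def sk_percentile(p90, p50, p10):
    """Percentile skewness: SK = (P90 + P10)/2 − P50."""
    return (p90 + p10) / 2 - p50

print(sk_index(40, 44, 40))        # → -0.3  (the worked example above)
print(sk_percentile(75, 50, 25))   # → 0.0   (a symmetric distribution)
```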
KURTOSIS
The term kurtosis refers to the peakedness or flatness of a frequency
distribution as compared with the normal (or) Kurtosis is the degree of peakedness
of a distribution taken relative to a normal distribution.

[Figure: leptokurtic, mesokurtic (normal) and platykurtic curves.]

A frequency distribution more peaked than the normal is said to be leptokurtic.


A frequency distribution which is flatter than the normal is said to be platykurtic. A
normal curve is also called as mesokurtic.
For the normal curve kurtosis is 0.263. This is a standard value. If the kurtosis
is greater than 0.263, the distribution is platykurtic and if it is less than 0.263 it is
leptokurtic.
The kurtosis is measured by the following formula:

Ku = Q (Quartile deviation) / (P90 − P10)

For example, if Q = 30, P10 = 25 and P90 = 75, find the kurtosis.

Ku = 30 / (75 − 25) = 30 / 50 = 0.6

Since 0.6 is greater than 0.263, the curve is platykurtic.
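The percentile coefficient of kurtosis, and its comparison with the normal value 0.263, can be sketched as follows (the function name is illustrative):

```python
def percentile_kurtosis(q, p90, p10):
    """Percentile coefficient of kurtosis: Ku = Q / (P90 − P10).
    Normal curve: 0.263; larger → platykurtic, smaller → leptokurtic."""
    return q / (p90 - p10)

ku = percentile_kurtosis(30, 75, 25)
print(ku)  # → 0.6
print("platykurtic" if ku > 0.263 else
      "leptokurtic" if ku < 0.263 else "mesokurtic")  # → platykurtic
```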

Why Distributions Exhibit Skewness or Kurtosis

The reasons why distributions exhibit skewness and kurtosis are numerous and
often complex. The common causes of asymmetry in a distribution are:

i) Selection
Biased selection produces skewness in the distribution of scores, e.g.
inadequate sampling, homogeneity of the group, an inadequately constructed and
administered test, etc.

ii) Unsuitable or Poorly Made Tests

If a test is too easy, scores will pile up at the high end of the scale, giving
negative skewness. If a test is very difficult, scores will pile up at the low end of the
scale, giving positive skewness.

iii) Non-normal Distribution

A real lack of normality in the trait that is measured produces skewness and
kurtosis, e.g. the likelihood of death due to degenerative diseases is very high during
maturity and old age and very minimal during the early ages.

iv) Errors in the Use of Tests

Errors in administering, scoring, etc. of a test tend to produce skewness in a
distribution.
SUMMARY
In psychological measurement we use three common types of distributions, viz.
the normal curve, the positively skewed curve and the negatively skewed curve. The
normal (or normal probability) curve (NPC) is a perfect bilaterally symmetrical curve
which is typical of psychological measurements. The NPC has wide application in
testing and measurement. Skewness is an index of the asymmetry of a distribution.
Distributions may depart from normality because of biased selection, poor tests,
errors in test administration and scoring, and non-normality of the trait measured.
KEY TERMS
Normal probability curve    Skewness
Positive skewness    Negative skewness
Kurtosis    Mesokurtic
Leptokurtic    Platykurtic
QUESTIONS
1. What is a normal probability curve? Explain its properties and applications.
2. When is a distribution said to be skewed? Discuss the two types of skewness
in a distribution.
3. What is kurtosis? Explain the 3 types of kurtosis.

4. How are the indices of skewness and kurtosis calculated?
5. Draw a graph that represents each of the following types of data:
a) Positive skewness b) Negative skewness
c) Platykurtic distribution d) Leptokurtic distribution



LESSON – 7

FINDING POINTS WITHIN DISTRIBUTIONS


OBJECTIVES
After reading this lesson the student should
 Understand what does percentile rank imply.
 Know the calculation procedure of percentile.
 Explain standard scores and distribution.
 Describe standard normal distribution.
SYNOPSIS
Percentile Rank – Calculation of percentiles – Standard Scores and Distributions:
Z score – Standard Normal Distribution – Percentiles and Z scores.
PERCENTILE RANKS
A percentile rank expresses the percentage of scores that fall below a particular
score (Xi). To calculate a percentile rank we should follow simple steps.

1. Determine how many cases are below the score of interest.


2. Determine how many cases are in the group.
3. Divide the number of cases below (step 1) by the total number of cases in the
group (Step 2)
4. Multiply the result of step 3 by 100.
The formula is,

Pr = (B / N) × 100 = percentile rank of Xi

where B is the number of scores below Xi and N is the total number of scores. This
means that you form a ratio of the number of cases below the score and the total
number of scores. Because there will always be either the same or fewer cases in the
numerator (top half) of the equation than there are in the denominator, this ratio
will always be less than or equal to 1. To get rid of the decimal points we multiply
by 100.
As an example, let us consider the runner who finishes 62nd out of 63 racers in a
gym class. To obtain the percentile rank, we divide 1 (the number of people
finishing behind the person of interest) by 63 (the number of scores in the group).
This gives us 1/63 or 0.016. Then we multiply this result by 100 to obtain the
percentile rank, which is 1.6. This rank tells us the runner is below the 2nd
percentile.
The percentile rank depends absolutely on the cases used for comparison.
Percentiles are the specific scores or points within a distribution; percentiles divide
the total frequency for a set of observations into hundredths.
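The four-step procedure above reduces to a one-line ratio; the sketch below uses hypothetical finishing scores for the gym-class example:

```python
def percentile_rank(scores, xi):
    """PR = (B / N) × 100, where B is the number of scores below xi."""
    below = sum(1 for s in scores if s < xi)
    return below / len(scores) * 100

# Hypothetical scores for 63 racers; the 62nd finisher beats exactly one runner,
# so one score lies below theirs.
scores = list(range(1, 64))
print(round(percentile_rank(scores, 2), 1))  # → 1.6
```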
CALCULATION OF PERCENTILES IN A FREQUENCY DISTRIBUTION
Just as the median and quartiles are found, it is possible to compute points,
which lie below 10%, 43%, 85% or any percent of the scores. These points are

called percentiles and designated by the symbol Pp. P10 for example, is the point
below which lie 10% of the scores. The method of calculating percentiles is

Pp = l + ((pN − F) / fp) × i   (percentiles in a frequency distribution, counting from below up)

p = Percentage of the distribution wanted, e.g. 10%, 33%, etc.
l = Lower limit of the class interval upon which Pp lies.
pN = Part of N to be counted off in order to reach Pp.
F = Sum of all scores upon intervals below l.
fp = Number of scores within the interval upon which Pp falls.
i = Length of the class interval.
The details of the calculation are given below in the table.
For example, for finding P70, pN = 35 (70% of 50 = 35), and from the cumulative
frequency column 30 scores are covered by the class intervals up to 74.5, the lower
limit of the class interval next above.
Hence, P70 falls upon 75-79 and, substituting pN = 35, F = 30, fp = 8 (frequency
upon 75-79) and i = 5 (class interval), it is found that P70 is 77.6 (for detailed
calculation see Table 7.1). This result means that 70% of the 50 students scored
below 77.6 in the distribution of scores. The other percentile values are found
exactly in the same way as P70.
It should be noted that p0, which marks the lower limit of the first interval (39.5),
lies at the beginning of the distribution, while p100 marks the upper limit of the last
interval and lies at the end of the distribution. These two percentiles represent
limiting points; they indicate the boundaries of the percentile scale.
TABLE 7.1 CALCULATION OF CERTAIN PERCENTILE IN A FREQUENCY DISTRIBUTION
(Data for fifty Scores)
Scores    f     cum f    Percentile
95-99     1     50       p100 = 99.5
90-94     2     49
85-89     4     47       p90 = 87.0
80-84     5     43       p80 = 81.5
75-79     8     38       p70 = 77.6
70-74     10    30       p50 = 72.0
65-69     6     20       p30 = 65.3
60-64     4     14       p20 = 59.5
55-59     4     10
50-54     2     6        p10 = 52.0
45-49     3     4
40-44     1     1
          N = 50         p0 = 39.5

5  4
p10 = 10% of 50 = 5; 49.5 +  5  52.0
 2 
 10  10 
p20 = 20% of 50 = 10; 59.5 +  5  59.5
 4 

 15  14 
p30 = 30% of 50 = 15; 64.5 +  5  65.3
 6 
 25  20 
p50 = 50% of 50 = 25; 69.5 +  5  72.0
 10 
 35  30 
p70 = 70% of 50 = 35; 74.5 +  5  77.6
 8 
 40  38 
p80 = 80% of 50 = 40; 79.5 +  5  81.5
 5 
 45  43 
p90 = 90% of 50 = 45; 84.5 +  5  87.0
 4 

Calculation of percentile ranks in a frequency distribution


The procedure followed in computing percentile ranks begins with an
individual score and then determines the percentage of scores which lies below it. If
the percentage is 62, for example, the score has a percentile rank, or PR, of 62 on a
scale of 100. What is the PR of an individual who scores 63? Score 63 falls in the
interval 60-64, as given in Table 7.1. There are ten scores up to 59.5, the
lower limit of the interval (column cum f), and four scores spread over this interval.
Dividing 4 by 5 (the interval length) gives 0.8 score per unit of interval. The score of

63 is 3.5 score units from 59.5 lower limit of the interval within which the score of
63 lies. Multiplying 3.5 by 0.8 we get 2.8 as the score distance of 63 from 59.5 and
adding 2.8 to 10 (number of scores below 59.5) we get 12.8 as the part of N lying
below 63.
Thus (12.8 / 50) × 100 = 25.6; hence, the percentile rank of score 63 is 26. The
diagram below clarifies the calculation.

[Figure 7.1: the interval from 59.5 to 64.5 divided into five unit steps worth 0.8 score each, with 63.0 marked.]

Score 63.0 is just 0.8 + 0.8 + 0.8 + 0.4, or 2.8 scores, from 59.5. The percentile
ranks for several scores may be read directly from the table. For instance, 52 has a
PR of 10, 72 (the median) a PR of 50, and 87 a PR of 90.
In summary, the percentile and the percentile rank are similar to one another.
Percentiles give the point in a distribution below which a specified percentage
of cases fall. The percentile is in raw score units. The percentile rank gives the
proportion of cases below the percentile. When reporting percentiles and percentile
ranks, the population should be carefully specified. Remember that a percentile
rank is a measure of relative performance.
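The interpolated percentile rank of 63 worked above can be sketched with the same interval encoding as before; the names are illustrative:

```python
INTERVALS = [(39.5, 1), (44.5, 3), (49.5, 2), (54.5, 4), (59.5, 4), (64.5, 6),
             (69.5, 10), (74.5, 8), (79.5, 5), (84.5, 4), (89.5, 2), (94.5, 1)]
WIDTH, N = 5, 50

def percentile_rank_grouped(score):
    """Interpolated PR: cases below the interval plus the within-interval share."""
    cum = 0
    for lower, f in INTERVALS:
        if lower <= score < lower + WIDTH:
            below = cum + (score - lower) * f / WIDTH
            return below / N * 100
        cum += f
    raise ValueError("score lies outside the distribution")

print(round(percentile_rank_grouped(63), 1))  # → 25.6
```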
STANDARD SCORES AND DISTRIBUTIONS
Raw score scales are transformed into other scales for deriving comparable
scales for different tests and to have standard meaning. The most obvious need for
comparable scales is seen in educational and vocational guidance, particularly
when profiles of scores are used. A profile is intended to provide a somewhat
extensive picture of an individual. The comparison of trait positions for a person
depends upon having scores that are comparable values numerically.
For such a purpose, conversion of raw scores to values on some common scale
is essential. It is necessary that the scales should have appropriately equal units as
well as comparability of means, dispersions and forms of distribution.
For example, a student gets scores of 195 in an English examination, 20 in a
reading test, 39 in an information test, 139 in a general academic – aptitude test,
and 41 in a non-verbal psychological test.
Is the student therefore best in English and poorest in reading? From the raw
scores alone, we cannot answer these questions. We need a common scale before
drawing any conclusion. These scores, when converted into standard scores, will
furnish one such common scale.

Z-Score
The Z score is a transformation of data into standardised units that are easier to
interpret. A Z-score is the difference between a score and the mean, divided by the
standard deviation:

Z = (X − X̄) / S.D.

In other words, a Z score expresses the deviation of a score X from the mean X̄ in
standard deviation units.

TABLE 7.2
A Comparison of Standard Scores and Raw Scores Earned By
Two Students in Five Examinations
Examination        Mean   S.D.   Raw Scores    Deviations      Standard Scores
                                  I     II      I      II       I      II
English            155.7  26.4   195   162    +39.3   +6.3    +1.49   +0.24
Reading            33.7   8.2    20    54     -13.7   +20.3   -1.67   +2.48
Information        54.5   9.3    39    72     -15.5   +17.5   -1.67   +1.88
Academic aptitude  87.1   25.8   139   84     +51.9   -3.1    +2.01   -0.12
Non-verbal test    24.8   6.8    41    25     +16.2   +0.2    +2.38   +0.03
Total                            434   397                    +2.54   +4.51
Means                                                         +0.51   +0.90

If the means and S.D.s for the group are available, as given in the table, the
above-mentioned scores can be transformed into Z scores. For example, the English
score obtained by the 1st student is 195, the mean is 155.7 and the standard
deviation is 26.4. To calculate the Z score:

Z = (X − X̄) / σ = (195 − 155.7) / 26.4 = 39.3 / 26.4 = +1.49
Thus the deviations from the mean are expressed in standard scores, i.e. Z
scores. The mean of a set of standard scores is always 0 (the reference point) and
the standard deviation is always 1.00. Since approximately half of the scores in a
distribution fall below the mean and half above it, about half of the standard
scores will be negative and the other half positive.
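The Z transformation, and the fact that a full set of Z scores has mean 0, can be sketched as follows (the function name is illustrative):

```python
def z_score(x, mean, sd):
    """Z = (X − X̄) / S.D."""
    return (x - mean) / sd

# Student I's English mark from Table 7.2
print(round(z_score(195, 155.7, 26.4), 2))   # → 1.49

# Converting a whole set of scores leaves a distribution with mean 0
raw = [195, 20, 39, 139, 41]                 # Student I's five raw scores
m = sum(raw) / len(raw)
s = (sum((x - m) ** 2 for x in raw) / len(raw)) ** 0.5
zs = [z_score(x, m, s) for x in raw]
assert abs(sum(zs)) < 1e-9                   # mean of Z scores is 0
```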

We can now interpret the scores given above and give more satisfactory answers
to the questions raised. Student I is most superior in the non-verbal psychological
test, next in academic aptitude and third in English. In terms of standard scores,
we find that this student is equally deficient in reading ability and information,
whereas the deviations would place the student in the reverse order.

Comparing the two students in terms of standard scores, it is in the
psychological test that the advantage is greatest. Student II has a decidedly greater
advantage in reading ability in terms of standard scores.

The shift from raw to standard scores requires a linear transformation. This does
not change the shape of the distribution in any way. If the original distribution was
skewed (or normal), the standard score distribution will be skewed (or normal) in
exactly the same way.

[Fig 7.2: Distribution before and after conversion from a raw-score scale to a standardized score scale with a desired mean and standard deviation, with and without normalizing the distribution. (A) original score scale: mean = 80, S.D. = 14.0; (B) deviation score scale: mean = 0, S.D. = 14.0; (C) standard score scale: mean = 0, S.D. = 1.0.]

The nature of standard score scale


As already mentioned, a standard-score scale has a mean of zero and a
standard deviation of 1.00. An illustration of the conversion of a raw-score scale
into a standard scale is shown in Fig 7.2 in distributions A, B and C. Distribution
7-2 (A) is based upon the obtained or raw scores. The mean is 80.0 and SD is 14.0.
The distribution is somewhat skewed. The deviation score distribution is
represented in 7.2 (B). In this it is observed that the mean is zero after deducting
the mean from every raw score. The final step, arriving at the Z scale is shown in
Figure 7-2(C). Distribution C is drawn so that the mean is directly below that in
distribution B, both at zero, so that deviations of 14 units on the original scale
correspond with a deviation of 1 S.D. on the standard scale. Note that the form of
distribution has not changed: it is still skewed exactly as it was originally. This
procedure does not normalize the distribution as some other scaling procedures do.
STANDARD NORMAL DISTRIBUTION

It is a probability distribution obtained from an infinite sequence of binomial
(two-choice) chance events.

For example, if a coin is tossed, the probability that an unbiased coin falls heads
is ½ and the probability that it falls tails is also ½. These ratios are called
probability ratios. There are only two possible outcomes in a given toss, a head or a
tail, and each is equally probable. Therefore, the probability of H is ½, that of T is
½, and P(H) + P(T) = ½ + ½ = 1.00.

[Fig. 7.3: frequency of occurrence plotted against Z scores from −3 to +3, with proportions 0.1359 between 1 and 2 S.D.s, 0.0214 between 2 and 3 S.D.s, and 0.00135 beyond 3 S.D.s on each side.]

Figure 7.3 shows the theoretical distribution of heads in an infinite number
of tosses of the coin. This figure looks like a bell-shaped curve. The measures
are concentrated closely around the center and taper off from the central high point
to the left and the right. There are relatively few measures at the "low-score" end of
the scale, an increasing number up to a maximum at the middle position, and a
falling off toward the "high-score" end of the scale. If we divide the area under the
curve by a perpendicular line drawn through the central high point to the base line,
the two parts formed will be similar in shape and very nearly equal in area. This is
a normal distribution, also known as a symmetrical binomial probability
distribution.

On most occasions we refer to units on the X-axis of the normal distribution in
Z-score units. Any variable transformed into Z-score units takes on special
properties. First, Z-scores have a mean of 0 and a standard deviation of 1. This is
because the sum of the deviations around the mean is always equal to zero. The
numerator of the Z score equation (X − X̄) is just the deviation around the mean,
while the denominator is a constant. Thus the mean of Z scores can be expressed as
Z̄ = ΣZ / N = (1/s) · Σ(X − X̄) / N
Because (X-X) will always equal zero, the mean of Z – scores will always be
zero. In figure 7.2 the standardized or Z-score units are marked on the X-axis. The
numbers under the curve are proportions of cases in decimal form we would expect
the observe in each area. Multiplying these proportions by 100 yields percentages.
For example, we see that 34.13% or 0.3413 of the cases fall between the mean and
1 standard deviation above the mean. Putting these two bits of information
together, we can conclude that if a score is 1 standard deviation above the mean,
then it is about the 84th percentile rank (50+34.13 = 84.13 to be exact). A score that
is 1 standard deviation below the mean would be about the 16 th percentile rank (50
– 34.14 = 15.87). Thus we can use what we have learnt about means, standard
deviations, Z-scores and the normal curve to transform raw scores, which have very
little meaning to us, into percentile scores, which are easier to interpret. These
methods can be used whenever the distribution of scores is normal.
PERCENTILES AND Z SCORES
A table relating Z scores to proportions is available in textbooks on "Statistics
for Psychology and Education". It can be used to find the proportion of cases that
fall above or below any Z-score.

[Fig. 7.4: the symmetrical binomial distribution of heads (H) and tails (T) in ten coin tosses, with frequencies 1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1 for H10 through T10.]
TABLE 7.3
A Sample of Z Scores and Proportions from the Original Table of Z Scores
(1) Z scores    (2) Area from the Mean to Z    (3) Area from −∞ to Z    (4) Area from Z to +∞
0.00 0.0000 0.5000 0.5000
0.01 0.0040 0.5040 0.4960
0.02 0.0080 0.5080 0.4920
0.03 0.0120 0.5120 0.4880
0.04 0.0160 0.5160 0.4840
1.00 0.3413 0.8413 0.1587
3.04 0.4988 0.9988 0.0012

The Z-scores are listed in the first column. The second column gives the area
from the mean to Z (this area is a proportion and can be changed to a percentage
by multiplying by 100). The third column gives the area between negative infinity
(the lowest possible point on the distribution) and the Z-score. The fourth column
gives the area between the Z-score and positive infinity (the highest possible
point).
Let us consider an example: For a Z – score of 1.0 find the Z – value in the table.
The second column of the table shows that the area between the mean of the
distribution and this Z – score is 0.3413. In other words, 34.13 per cent of the
distribution falls in this area.

Consider the percentile rank for a Z score of 1.00. This can be obtained from
column three. It shows the area between the bottom of the distribution and the
observed Z-score.
Since 1.00 is a positive Z-score, we know that at least 0.50 of the distribution
lies below it, because half of the Z distribution lies below zero. To obtain the value
0.8413, we add 0.50 (the area below 0, which is the mean of the Z-distribution) to
0.3413 (the area between the mean and the observed Z-score).
Instead of using Z – score to find the percentile ranks, we can use the percentile
ranks to find the corresponding Z-scores. To do this we need to look under
percentiles and find the corresponding Z-scores.
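Both directions of the conversion are available in Python's standard library through `statistics.NormalDist`, which plays the role of the printed Z table:

```python
from statistics import NormalDist

nd = NormalDist()  # unit normal distribution

# Z score → percentile rank: the area from −∞ to Z, times 100
print(round(nd.cdf(1.00) * 100, 2))   # → 84.13

# Percentile rank → Z score: the inverse cumulative distribution
print(round(nd.inv_cdf(0.8413), 2))   # → 1.0
```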
An example: one of the difficulties in grading students is that performance is
usually reported in terms of raw scores, such as marks in an examination. For
instance, the professor hands over your test paper with 72 on it. An alternative
would be to give you feedback about your performance as a Z – score. To do this,
you subtract the average score (mean) from your score and divide by the standard
deviation. If your Z – score is positive, you would immediately know that your score
is above average; if it were negative, you would know your performance is below
average.
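The feedback idea above is a one-line computation. In this sketch the class mean of 65 and standard deviation of 10 are hypothetical numbers, not taken from the text:

```python
def z_score(raw, mean, sd):
    """Convert a raw test score to a Z-score."""
    return (raw - mean) / sd

# Hypothetical class results: your paper is marked 72, class mean 65, SD 10.
z = z_score(72, 65, 10)
print(z)   # 0.7, i.e. above average
```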
SUMMARY
Many standard scores are used in psychological testing, such as percentiles, Z
scores and T scores. Percentiles divide the distribution equally into 100 parts, and
a percentile rank expresses the percentage of scores that fall below a particular
score. The percentile rank is a measure of relative performance and hence is useful
for classifying the people in a group. Standard scores convert the raw scores of
different scales into scores with a standard meaning. The shift from raw scores to
standard scores requires a linear transformation, and the Z score serves this
purpose well. The Z score is useful for comparing two different groups in terms of
standard scores. We can even convert percentiles into Z scores and check whether a
performance is average, above average or below average.
KEY TERMS

Percentiles
Standard scores
Percentile rank
Z score
Standard normal distribution
QUESTIONS
1. What does percentile rank refer to?
2. Define standard scores and discuss their uses.
3. Explain the standard normal distribution.



LESSON – 8

FINDING POINTS WITHIN DISTRIBUTIONS (Contd.)


OBJECTIVES
After reading this lesson the student should
 Understand McCall's ‘T’ score
 Explain quartiles and deciles
 Know the calculation of quartiles and deciles
 Differentiate sten and stanine scores.
SYNOPSIS
McCall's T – Quartiles and Deciles – Sten – Stanine

McCall's T Score
There are a variety of other systems by which we can transform raw scores to
give them more meaning. One system was established in 1939 by W.A. McCall, who
originally intended to develop a system to derive equal units of mental quantities.
He suggested that a random sample of 12-year-olds be tested and their score
distribution obtained. Then percentile equivalents were to be assigned to each raw
score, showing the percentile rank in the group. What McCall generated was a
system essentially the same as standard (Z) scores, except that the mean in
McCall's system is 50 rather than zero and the standard deviation is 10 rather
than 1. Indeed, a Z score can be transformed to a T score by applying the linear
transformation T = 10Z + 50.
McCall did not intend to create an alternative to the Z score. He wanted to
obtain one set of scores which could then be applied in other situations without
standardizing the entire set of numbers.
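The transformation T = 10Z + 50 can be sketched directly:

```python
def t_score(z):
    """McCall's linear transformation from a Z-score to a T-score."""
    return 10 * z + 50

# The mean (Z = 0) maps to T = 50; one SD above maps to 60; 1.5 SD below maps to 35.
print(t_score(0), t_score(1.0), t_score(-1.5))
```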
Therefore, T scores are normalised standard scores converted into a distribution
with a mean of 50 and a standard deviation of 10, as shown in Figure 8.1. The σ
divisions above the mean (1σ, 2σ, 3σ, 4σ, 5σ) become 6, 7, 8, 9 and 10; and the σ
divisions below the mean (-1σ, -2σ, -3σ, -4σ, -5σ) become 4, 3, 2, 1 and 0. The σ of
the distribution remains equal to 1.00. The σ-scale begins at -5σ and ends at +5σ.
Each σ unit is then multiplied by 10, so that the mean is 50 and the other divisions
are 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100. The scale covers exactly 100
units. This is convenient, but the extremes of the scale are far beyond the range of
most groups. In actual practice, T scores range from about 15 to 85, i.e., from
-3.5σ to +3.5σ.

[Fig. 8.1 Illustration of σ-scaling and T-scaling in a normal distribution:
    σ-scale (zero point at the mean):  -5  -4  -3  -2  -1   0   1   2   3   4   5
    σ-scale (zero point at -5σ):        0   1   2   3   4   5   6   7   8   9  10
    T-scale (zero point at -5σ):        0  10  20  30  40  50  60  70  80  90  100]

The process of T – scaling is outlined in a series of steps.
In Table 8.1 the test scores are entered in column (1), and in column (2) the
frequencies (the number of subjects achieving each score) are listed.

TABLE – 8.1
(1) Test scores   (2) f   (3) cum f   (4) Cum freq. below score   (5) Col (4) in %   (6) T-scores
                                          + 1/2 on given score
10                  1       62          61.5                        99.2               74
9                   4       61          59                          95.2               67
8                   6       57          54                          87.1               61
7                  10       51          46                          74.2               56
6                   8       41          37                          59.7               52
5                  13       33          26.5                        42.7               48
4                  18       20          11                          17.7               41
3                   2        2           1                           1.6               29

In column (3) the scores have been cumulated from the low to the high end of the
frequency distribution.

Column (4) shows the number of subjects who fall below each score plus one
half of those who earn the given score. The entries in this column may readily be
computed from columns (2) and (3). There are no scores below 3, and there are 2
scores of 3, so the entry for 3 is 0 + 1, or 1; the number of scores below 4 is 2, so
the entry for 4 is 2 + 9, or 11.

The score of 4, for example, covers the interval 3.5 – 4.5, midpoint 4.0. Each
sum in column (4) is taken up to the midpoint of the score interval. In column (5)
the entries in column (4) are expressed as percentages of N. Thus, 99.2% of the
scores lie below 10.0, the midpoint of the interval 9.5 – 10.5; 95.2% of the scores lie
below 9.0, the midpoint of 8.5 – 9.5; and so on.

The percentages in column (5) are turned into T-scores by means of a table of
percentages and T scores, which is given in all books on psychological statistics.
The T scores corresponding to the percentages nearest those wanted are taken
without interpolation. Thus for 1.6% we take 1.79% (T-score = 29); for 17.7% we
take 18.41% (T-score = 41); and so on.
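The whole T-scaling procedure (cumulate the frequencies, take the proportion below each score's midpoint, convert it to a normal deviate, and apply 10Z + 50) can be sketched in Python; `statistics.NormalDist.inv_cdf` stands in for the printed table of percentages and T scores. With the Table 8.1 frequencies this reproduces the T column:

```python
from statistics import NormalDist

def t_scale(freqs):
    """freqs: dict mapping score -> frequency. Returns score -> T (rounded)."""
    nd = NormalDist()
    n = sum(freqs.values())
    below, result = 0, {}
    for score in sorted(freqs):
        mid = below + freqs[score] / 2        # cases below + half of those at the score
        result[score] = round(50 + 10 * nd.inv_cdf(mid / n))
        below += freqs[score]
    return result

# Frequencies from Table 8.1 (N = 62)
print(t_scale({3: 2, 4: 18, 5: 13, 6: 8, 7: 10, 8: 6, 9: 4, 10: 1}))
```

The printed dictionary matches column (6) of Table 8.1: 29, 41, 48, 52, 56, 61, 67 and 74.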

[Fig. 8.2 Normalised distribution of the scores in Table 8.1. The original scores
(3 to 10) and their T-score equivalents (29, 41, 48, 52, 56, 61, 67, 74) are shown
on the baseline of the normal curve.]

When the raw scores are transformed into normalising standard scores – into T-
scores – they occupy the positions in the normal curve shown in Figure 8.2. The
unequal scale distances between the scores in the figure show clearly that the
original scores do not represent equal difficulty steps. In other words, normalising a
distribution of test scores alters the original test units, and the more skewed the
raw score distribution, the greater the change in the units.

T-scores from different tests are comparable and have the same meaning, since
reference is always to a standard scale of 100 units based upon the normal
probability curve.

A comparison of T – scores and standard scores

With respect to original scores, T – scores represent equivalent scores in a
normal distribution. Standard scores always have the same form of distribution as
the raw scores and are simply original scores expressed in σ units. Standard scores
represent the kind of conversion we make when we change inches into centimetres;
that is, the transformation is linear. Standard scores correspond exactly to T-scores
when the distribution of raw scores is strictly normal.

QUARTILES AND DECILES


Quartiles and Deciles refer to divisions of the percentile scale into groups. The
quartile system divides the percentile scale into four groups, and the decile system
divides the scale into ten groups.

The first quartile is the 25th percentile, the second quartile is the median or the
50th percentile, and the third quartile is the 75th percentile. These are abbreviated
Q1, Q2 and Q3 respectively. One quarter of the cases fall below Q1 one half fall
below Q2 and three quarters fall below Q3.

The interquartile range is the interval of scores bounded by Q1 and Q3; it
contains the middle 50% of the distribution.

Deciles are similar to quartiles except that they use points that mark 10% rather
than 25% intervals. Thus the top decile, or D9, is the point below which 90% of the
cases fall. The next decile, D8, marks the 80th percentile, and so forth.

The quartiles Q1 and Q3 mark off the middle 50% of scores in the distribution,
and the distance between these points is called the interquartile range. Q is one-
half the range of the middle 50%, or the semi-interquartile range.

Since, Q measures the average distance of the quartile points from the median,
it is a good measure of score density around the middle of the distribution. If the
scores of a distribution are packed closely together, the quartiles will be nearer to
one another and Q will be small. If the scores are widely scattered, the quartiles will
be relatively far apart, and Q will be large.

When the distribution is skewed, Q1 and Q3 are at unequal distances from the
median, and the difference between (Q3 - Mdn) and (Mdn - Q1) gives a measure of
the amount and direction of the skewness. When the distribution is symmetrical or
normal, Q marks off exactly the 25% of cases just above, and the 25% of cases just
below, the median. The median then lies exactly halfway between the two quartiles
Q1 and Q3.

The semi-interquartile range is found from the formula

    Q = (Q3 - Q1) / 2
Figure 8.3 illustrates the quartiles Q1, Q2 and Q3, the interquartile range and
the quarters of the sample in the normal distribution.

[Fig. 8.3: a normal curve with vertical lines erected at the first quartile (Q1), the
second quartile (Q2, also the median) and the third quartile (Q3), marking off the
lowest, low-middle, high-middle and highest quarters of the distribution.]
TABLE 8.2
Determination of Q3, Q1 and Q (the semi-interquartile range) for the Ink Blot test scores

Scores    f    cum f
55-59     1     50
50-54     1     49
45-49     3     48
40-44     4     45
35-39     6     41   (Q3 lies within this interval; exact lower limit 34.5)
30-34     7     35
25-29    12     28
20-24     6     16   (Q1 lies within this interval; exact lower limit 19.5)
15-19     8     10
10-14     2      2   (N = 50)
To find Q1:

    Q1 = LQ1 + ((N/4 - cum f) / FQ1) × i

Steps
1. Find N/4 = 50/4 = 12.5.
2. Locate 12.5 in the cumulative frequency column: it is first reached within the
   cumulative frequency 16, so Q1 lies in the interval 20-24, whose frequency is
   FQ1 = 6.
3. The total frequency below this interval is cum f = 10.
4. The lower (exact) limit of the class interval in which Q1 is located is LQ1 = 19.5.
Substituting N/4 = 12.5, LQ1 = 19.5, cum f = 10, FQ1 = 6 and i = 5 (the width of
the class interval):

    Q1 = 19.5 + ((12.5 - 10)/6) × 5 = 19.5 + 2.08 = 21.58

To find Q3:

    Q3 = LQ3 + ((3N/4 - cum f) / FQ3) × i

Steps
1. Find 3N/4 = (3 × 50)/4 = 150/4 = 37.5.
2. Locate 37.5 in the cumulative frequency column: it is first reached within the
   cumulative frequency 41, so Q3 lies in the interval 35-39, whose frequency is
   FQ3 = 6.
3. The total frequency below this interval is cum f = 35.
4. The lower (exact) limit of the class interval in which Q3 is located is LQ3 = 34.5.
Substituting 3N/4 = 37.5, cum f = 35, FQ3 = 6, LQ3 = 34.5 and i = 5:

    Q3 = 34.5 + ((37.5 - 35)/6) × 5 = 34.5 + 2.08 = 36.58

    Q1 = 21.58; Q3 = 36.58
    Q = (Q3 - Q1)/2 = (36.58 - 21.58)/2 = 15.00/2 = 7.50
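Both interpolations use the same formula, so they can be sketched once and reused. Two assumptions in this sketch: the frequency of the 25-29 interval is entered as 12 so that the cumulative frequencies sum to N = 50, as the worked steps require; and since 3N/4 = 37.5 is first reached within the 35-39 interval (exact lower limit 34.5), the sketch gives Q3 = 36.58.

```python
def grouped_quartile(intervals, which):
    """intervals: list of (exact_lower_limit, width, frequency), low to high.
    which: 1 for Q1, 3 for Q3."""
    n = sum(f for _, _, f in intervals)
    target = which * n / 4
    cum = 0
    for lower, width, f in intervals:
        if cum + f >= target:                    # the quartile falls in this interval
            return lower + (target - cum) / f * width
        cum += f
    raise ValueError("quartile not reached")

# Ink-blot data of Table 8.2, entered low to high
data = [(9.5, 5, 2), (14.5, 5, 8), (19.5, 5, 6), (24.5, 5, 12), (29.5, 5, 7),
        (34.5, 5, 6), (39.5, 5, 4), (44.5, 5, 3), (49.5, 5, 1), (54.5, 5, 1)]
q1 = grouped_quartile(data, 1)      # 21.58
q3 = grouped_quartile(data, 3)      # 36.58: 3N/4 = 37.5 falls in the 35-39 interval
q = (q3 - q1) / 2                   # semi-interquartile range
print(round(q1, 2), round(q3, 2), round(q, 2))
```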
STEN SCORES
Cattell proposed a 10-step sten scale which has some unique advantages.
There are two slightly different kinds of stens (and of stanines too). The
conventional S-sten (standard-score based sten) scale takes the raw-score mean of
the population as the central value, which is therefore exactly midway between sten
5 and sten 6; and it extends out one sten for each half standard deviation (0.5σ) of
raw score.
Thus, the mean of the scale has the precise value of 5.5 stens as shown in
figure 8.4. Fig. 8.4 shows the standard scores and the stens on a normal
probability curve.
Any score between the mean (at 5.5) and a point half a standard deviation below
it becomes a sten of 5; and one falling within the half-sigma step just above the
mean gets 6. Thus the range of "average", "normal" scores, a one-sigma range
centred on the mean, is represented by stens 5 and 6. Only when we get to stens of
4 and 7 should we think of a person as definitely "departing from the average".
The S-stens are based on the calculated raw-score standard deviation, so if the
raw scores are skewed, the distribution of the S-stens can be somewhat skewed
as well.

[Fig. 8.4 Relations of stens to standard scores and to centile ranks, on a normal
curve (mean and median coincide):
    Standard scores:  -2.5  -2.0  -1.5  -1.0  -0.5   0   +0.5  +1.0  +1.5  +2.0  +2.5
    Stens:              1     2     3     4     5         6     7     8     9     10
    Centile ranks of central sten values: 1.2, 4.0, 10.6, 22.7, 40.1, 59.9, 77.3, 89.4, 96.0, 98.8
    Centile ranks at sten boundaries: 2.28, 6.68, 15.87, 30.85, 50.00, 69.15, 84.13, 93.32, 97.72]
The mean and median do not exactly coincide, and in some instances the
extreme scores, 1 and 10, may not have any cases in them. Therefore, we can
introduce the concept of normalised stens, or n-stens. The n-sten has the
advantage of agreeing with the assumption that equal-interval scale units are
those which give a normal distribution.
For either S-stens or n-stens, the translation of raw scores (rounded) given in the
main tables is therefore simply a matter of convenience for most users.
Some attractive features of n-stens are that once one is working in this system
(a) the population scores will always be normally distributed, (b) the mean and
median will coincide and (c) there is some rationale for considering the stens to be
equal interval units.
To convert the raw scores into stens, it is necessary to find the mean and
standard deviation of the obtained scores.
TABLE 8.3
Raw scores    Sten scores
10             1
15             2
20             6
22             7
33            10
(Mean = 20; σ = 2.67)

To transform the scores, the score range for each sten should be calculated,
using the mean of 20 and the standard deviation of 2.67:
    sten 5.5 = 20 = Mean
    sten 6.0 = 20 + (1/4)(2.67) = 20.67
    sten 6.5 = 20 + (1/2)(2.67) = 21.33
    sten 7.5 = 20 + (1)(2.67) = 22.67
    sten 8.5 = 20 + (1 1/2)(2.67) = 24.00
In the same way, stens lower than the mean are calculated:
    sten 4.5 = 20 - (1/2)(2.67) = 18.66
    sten 3.5 = 20 - (1)(2.67) = 17.33

The stens thus calculated and the corresponding score ranges are presented in the following table.
TABLE 8.4: STEN SCORES BASED ON THE DATA IN TABLE 8.3
Stens            Scores (rounded numbers)
1 (0.5 – 1.5)    14 and below
2                15
3                16
4                17, 18
5                19
6                20, 21
7                22
8                23
9                24
10               25 and above
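The boundary arithmetic above amounts to flooring a Z-score into half-sigma bands and clamping the result to 1-10. A minimal sketch under that reading; at a rounded band edge (e.g. a raw score of 17 here, which lies just below the 3.5/4.5 boundary at 17.33) the result can differ by one sten from the rounded ranges of Table 8.4:

```python
def raw_to_sten(raw, mean, sd):
    """Convert a raw score to an S-sten (1..10). Each sten is 0.5 SD wide,
    with the boundary between stens 5 and 6 at the mean."""
    z = (raw - mean) / sd
    sten = int(z // 0.5) + 6     # floor into a half-sigma band
    return max(1, min(10, sten))

# Mean 20, SD 2.67, as in Table 8.3
for raw in (10, 15, 19, 20.5, 22, 33):
    print(raw, raw_to_sten(raw, 20, 2.67))
```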
THE STANINE SCALE
The stanine scale takes its name from a contraction of "standard nine". It is a
condensed form of the T scale. Stanine scores run from 1 to 9 along the baseline of
the normal curve, constituting a scale in which the unit is 0.5σ and the median is 5.
SUMMARY
McCall's T score is a standard score which helps us transform a raw score into a
standard one. 'T' scores are normalised standard scores with a mean of 50 and a
standard deviation of 10. The semi-interquartile range is also used with standard
scores, since it is based on the interval of scores that contains the middle 50% of
the distribution. Sten scores ("standard ten") help us convert a raw score into a
standard one. Stanine scores are also standard scores, ranging from 1 to 9. All
these standard scores have wide application in comparing the behaviour of people.
KEY TERMS
McCall's 'T' score    Quartiles
Deciles               Sten score
Stanine score
QUESTIONS
1. Explain McCall's 'T' score.
2. Describe quartiles and deciles with their uses.
3. How are sten scores computed?



LESSON – 9

BIVARIATE ANALYSIS
OBJECTIVES
After reading this lesson the student should
 Understand the meaning of correlation
 Differentiate positive, negative and zero-order correlation
 Calculate the correlation coefficient by using Karl Pearson's product-moment method
 Explain Spearman's rank-order correlation
 Compute the 't' value and interpret the same.
SYNOPSIS
Correlation – Rank order – Product moment. Test of significance: 't' test –
Calculation and interpretation – The 't' ratio and its assumptions.
CORRELATION
Lots of things seem to be related in our day-to-day life. For example, stress is
associated with our performance; intelligence is associated with achievement;
overeating is associated with indigestion; and so on. Hence, it is important to have
a precise index to express the association between two variables. The index of
association used most frequently in testing is the correlation.
The correlation is a fascinating and powerful statistical tool that we use
whenever we examine the relationship between two or more variables. A correlation
can be used to answer many interesting questions: we can use it to determine the
presence, direction and magnitude of the relation between variables. In
psychological research, correlations are typically computed between two organismic
variables, between two dependent variables, or between one dependent and one
organismic variable.
SCATTERPLOT
Whenever it is essential to examine data, the important recommendation is that
you look at each of the variables and then make a graph of the data. Since
graphing the data is one of the best ways to learn about the results of research, it
is advisable to use a scatterplot to represent the relationship between two
variables. A scatterplot is a type of graph used to represent the data collected in a
correlational research design. Each point represents the values of the x and y
variables for one individual.

TABLE 9.1 INDIVIDUALS SCORES IN TWO DIFFERENT TESTS


Individuals        I                   II                  III
               X score  Y score   X score  Y score   X score  Y score
A 2 4 2 11 20 19
B 4 6 4 13 19 14
C 5 7 5 9 15 13
D 6 8 6 10 17 11
E 7 9 7 12 16 19
F 8 10 8 7 13 17
G 9 11 9 5 15 12
H 10 12 10 8 14 18
I 12 14 12 3 12 16
J 13 15 13 7 11 19

For example, Table 9.1 shows the data of 10 individuals. Three types of
relationships exist between the X and Y scores, in situations I, II and III. Let us
plot the scatter diagrams to examine the relationship between these variables.

[Fig. 9.1 Scatter diagrams showing various degrees of relationship between two
variables: r = 1 for situation I, r = -1 for situation II, and r = 0 for situation III.]

Figure 9.1 represents the positive, negative and zero correlation between the
variables in Table 9.1. When we describe a relationship, we look for the straight
line that best describes the data. This straight line represents a linear
relationship: if the value of variable X increases, the value of variable Y also
increases. The scatterplot for relationship I in Fig. 9.1 represents a positive
relation, or positive correlation: as the X value increases, the Y value increases
correspondingly. In contrast, the figure for relationship II represents a negative
correlation: as the value of X increases, the value of Y decreases. Notice that the
closer the correlation is to either +1.0 or -1.0, the more closely the data fall along a
straight line.

The correlation allows us to examine the degree to which two variables are
interrelated. The closer the correlation is to either +1.0 or -1.0, the greater the
relatedness between the two variables. When the correlation equals zero there is no
systematic relationship between the two variables. In Fig. 9.1, the scatterplot for
relationship III shows the zero correlation between the variables X and Y. A
correlation of zero tells us that there is no systematic linear relation between X and
Y; in other words, the two variables are independent of each other.
CORRELATION COEFFICIENT
The relationship between two variables (or sets of scores) is expressed in terms
of a single number which is never greater than one in absolute value; it is called
the correlation coefficient. In other words, the quantitative expression of the
magnitude of the relationship is given in terms of a single number called the
correlation coefficient.
A coefficient of correlation is a single number indicating the going-togetherness
of two variables. It tells us to what extent the two variables are related and to what
extent variations in one variable go with variations in the second variable. The
correlation coefficient is denoted by 'r' or 'ρ' (rho).

Interpretation of ‘r’
Value of ‘r’ Verbal description
0.00 to  0.20 Negligible relationship
 0.21 to  0.40 Low correlation
 0.41 to  0.70 Substantial correlation
 0.71 to  1.00 High to very high correlation
Uses of the correlation coefficient
1. It is useful in validating a test, e.g. a group intelligence test.
2. It is useful in determining the reliability of a test.
3. It gives us an indication of the degree of objectivity of a test.
4. It can answer the validity of arguments for or against a statement or belief,
   e.g. "Women are more intelligent than men."
5. It indicates the nature of the relationship between the variables.
6. It predicts the value of one variable from the other. For example, from an
   aptitude test battery we measure a certain aptitude; based on the correlation
   between that aptitude and success in a particular occupation, we can suggest
   the occupation for individuals.
7. The knowledge of the correlation coefficient is helpful in educational and
   vocational guidance, in prognosis, in the selection of workers for an
   organization and in educational decision-making.
I) KARL PEARSONS PRODUCT – MOMENT CORRELATION COEFFICIENT
If the data are available on a ratio or an interval scale, the product-moment
correlation coefficient is calculated. When the raw scores are small in number, or
when a good calculating machine is available, it is very easy to compute Pearson's
'r'. Moreover, if the bivariate distribution of scores is arranged to form a scatter
diagram, the product-moment method of correlation can be used.

Karl Pearson's product-moment correlation coefficient is given by the formula

    r = (NΣXY - (ΣX)(ΣY)) / √{[NΣX² - (ΣX)²][NΣY² - (ΣY)²]}

Here,
    r    = correlation between X and Y
    N    = total number of pairs
    ΣX   = total of the scores in X
    ΣY   = total of the scores in Y
    ΣX²  = sum of the squares of the X scores
    ΣY²  = sum of the squares of the Y scores

Steps
1. Find the sum of all X scores, i.e. ΣX.
2. Find the sum of all Y scores, i.e. ΣY.
3. Square the values in X and find ΣX².
4. Square the values in Y and find ΣY².
5. Multiply each X by its Y and find ΣXY.
6. Substitute all the values in the formula and compute 'r'.
Example
The scores of two groups on a 15-point scale are given below:
X : 1 7 2 3 4 12 11 5 10 5
Y : 2 5 6 4 1 5 8 2 6 1
Find out what relation exists between these two groups.
TABLE 9.2: CALCULATION OF THE CORRELATION COEFFICIENT
Sl.No     X        Y        X²         Y²         XY
1         1        2        1          4          2
2         7        5        49         25         35
3         2        6        4          36         12
4         3        4        9          16         12
5         4        1        16         1          4
6         12       5        144        25         60
7         11       8        121        64         88
8         5        2        25         4          10
9         10       6        100        36         60
10        5        1        25         1          5
N=10      ΣX=60    ΣY=40    ΣX²=494    ΣY²=212    ΣXY=288

    r = (NΣXY - ΣXΣY) / √{[NΣX² - (ΣX)²][NΣY² - (ΣY)²]}
      = ((10 × 288) - (60 × 40)) / √{[10 × 494 - (60)²][10 × 212 - (40)²]}
      = 480 / √(1340 × 520)
      = 480 / √696800
      = 480 / 834.74
    r = 0.575

The correlation coefficient is found to be 0.575, with a positive sign. Hence, a
substantial positive relationship exists between the variables X and Y.
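The six steps can be sketched directly from the formula; the data are the X and Y scores of Table 9.2:

```python
from math import sqrt

def pearson_r(x, y):
    """Karl Pearson's product-moment correlation from raw scores."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)            # sum of X squared
    syy = sum(v * v for v in y)            # sum of Y squared
    sxy = sum(a * b for a, b in zip(x, y)) # sum of the XY products
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

x = [1, 7, 2, 3, 4, 12, 11, 5, 10, 5]
y = [2, 5, 6, 4, 1, 5, 8, 2, 6, 1]
print(round(pearson_r(x, y), 3))   # 0.575, as in the worked example
```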
II) SPEARMAN’S RANK ORDER CORRELATION COEFFICIENT
If the data are on an ordinal scale we can compute Spearman's 'r'. Especially
when the samples are very small, i.e. fewer than 30, this is the convenient method
of correlation for finding the relationship between two variables. It should be
applied when the data are already in terms of rank orders rather than interval
measurements. For example, persons may be ranked in order of merit for honesty,
athletic ability and social adjustment when it is too difficult or impossible to
measure these complex behaviours. Similarly, various products or specimens, such
as advertisements, colour combinations, handwriting and compositions, which are
definitely difficult to measure, may be arranged in order for beauty, quality or some
other characteristic.
In Spearman's rank-order correlation the essential step is to arrange the scores
in order of merit and assign ranks. For example, the problem is to find the
relationship between the length of service and the selling efficiency of 12 salesmen
(Table 9.3). Column (2) has the number of years of service, and the average sales in
the past year are given in column (3). In columns (4) and (5) the salesmen are
ranked in accordance with their length of service and their average sales. For
example, the individual who has the longest service with the company is ranked 1,
the next longest 2, and so on.

TABLE 9.3 ILLUSTRATION OF THE RANK DIFFERENCE METHOD OF CORRELATION


Sl.No   Years of service   Efficiency (sales)   Rank for service R1   Rank for efficiency R2   R1 ~ R2 = d   d²
1 5 13 7.5 6 1.5 2.25
2 2 7 11.5 12 0.5 0.25
3 10 22 2 1 1.0 1.00
4 8 10 4 9 5.0 25.00
5 6 11 6 8 2.0 4.00

6 4 14 9 5 4.0 16.00
7 12 20 1 2 1.0 1.00
8 2 9 11.5 10 1.5 2.25
9 7 17 5 3 2.0 4.00
10 5 12 7.5 7 0.5 0.25
11 9 15 3 4 1.0 1.00
12 3 8 10 11 1.0 1.00

N = 12                                   Σd² = 58.00
Note that individuals 1 and 10 have the same period of service, and each is
ranked 7.5 (they share the 7th and 8th positions, so the average of 7 and 8, i.e.
7.5, is assigned as the rank to each). Similarly, all the entries in sales efficiency
have also been assigned ranks. Now, the correlation between the two orders of
merit is computed by using Spearman's formula

    r = 1 - (6Σd²) / (N(N² - 1))

where
    N   = number of individuals
    d   = difference in the ranks of an individual in the two series
    Σd² = sum of the squares of all such differences.
Then substitute the values in the formula:

    r = 1 - (6 × 58) / (12(12² - 1))
      = 1 - 348 / (12 × 143)
      = 1 - 348/1716
      = 1 - 0.202
    r = 0.798

The rank-order correlation coefficient is obtained as 0.798. Hence, we can state
that a strong positive relationship exists between the length of service and the
sales efficiency of the salesmen.
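The ranking-with-ties step and the formula can be sketched as below. With the salesmen data the exact value is 0.797; the lesson's 0.798 comes from rounding 348/1716 to 0.202 before subtracting:

```python
def ranks(values):
    """Rank values (1 = largest), averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                          # extend over a run of tied values
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1   # average of rank positions i+1 .. j+1
        i = j + 1
    return r

def spearman_rho(x, y):
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Years of service and sales figures from Table 9.3
service = [5, 2, 10, 8, 6, 4, 12, 2, 7, 5, 9, 3]
sales = [13, 7, 22, 10, 11, 14, 20, 9, 17, 12, 15, 8]
print(round(spearman_rho(service, sales), 3))   # 0.797
```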
PROPERTIES OF CORRELATION CO-EFFICIENT
1. The range of r
The correlation coefficient always has a value between -1 and +1, including
zero. r = 1 indicates a perfect positive relationship, whereas r = -1 indicates a
perfect negative relationship. Hence, the value of 'r' is never greater than +1 and
never less than -1.

2. The coefficient of determination

The correlation coefficient 'r' is interpreted in terms of the coefficient of
determination, r². When multiplied by 100, r² gives us the percentage of variance in
Y accounted for by X. For example, r = 0.88 means r² = 0.77, and the percentage of
variance is 0.77 × 100 = 77. This means that 77% of the variance in the Y scores is
accounted for by the variance in X. Hence, r² indicates the degree of relationship
between the two variables.

3. The coefficient of alienation

The proportion of the variance in Y which is not determined by, or not
associated with, the variance in X is given by K², which is called the coefficient of
non-determination:

    K² = 1 - r²,    K = √(1 - r²)

The coefficient of alienation 'K' indicates the degree of lack of relationship.
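A quick numeric check of the two coefficients for r = 0.88, the value used above:

```python
from math import sqrt

r = 0.88
r2 = r ** 2             # coefficient of determination, about 0.77
k = sqrt(1 - r ** 2)    # coefficient of alienation
print(round(r2, 4), round(k, 3))
```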

4. The value of r is invariant under linear transformations

The correlation coefficient does not change if every score in either or both
distributions is increased by a constant or multiplied by a positive constant. This
property leads to a number of applications: whatever the units of measurement,
the correlation between the variables will be the same. When working with large
values it is advisable to subtract a constant from all the scores, so that we can
avoid calculations with large numbers.

5. Correlation and causation

The correlation is sometimes misunderstood as indicating a causal relationship
between the two variables, at times to the extent of assuming that the first variable
is the cause and the second variable the effect. But causation cannot be read off in
this way. For example, noise pollution is associated with headache, but we cannot
say that headache creates noise pollution. Causality cannot be inferred solely on
the basis of a correlation between two variables. In other words, a high correlation
coefficient does not ensure a causal relationship between the variables.

6. Conversion of 'r' into 'z' scores

For purposes of interpretation, the 'r' value may also be converted into a
standard 'z' score using a conversion table. Then the test of significance is applied
to 'z' and we make the inferences.
TEST OF SIGNIFICANCE: THE 't' TEST
The statistic employed in the testing of hypotheses when the population
parameter is not known is the familiar 't' ratio. The 't' distribution is a theoretical
distribution discovered by the English statistician W.S. Gosset in 1908. It is also
called Student's t, since Gosset published it under the pen name 'Student'. The 't'
ratio is obtained by the formula

    t = (M - μ) / SM

Here,
    M  = mean of the sample
    μ  = population mean
    SM = standard error of the mean

The 't' test is the familiar parametric statistic which is used to make inferences
about a number of things. The main purpose of this test is to evaluate the null
hypothesis (Ho).

In the two-sample case, the denominator SM in the 't' formula is the standard
error of the difference between means; this part of the t-ratio estimates the amount
of variability in the difference between means. The numerator of the t-ratio is the
difference between the means. Hence, the t-ratio compares the obtained difference
between the sample means with its standard error.

Calculation of the 't' ratio

Usually the t-ratio is used to test the significance of the difference between the
means of two samples. The general formula for the t-test is given as

    t = (X̄1 - X̄2) / √{[(SS1 + SS2) / ((n1 - 1) + (n2 - 1))] × [1/n1 + 1/n2]}

Here,
    X̄1  = mean of the first group
    X̄2  = mean of the second group
    SS1 = sum of squares of group I  = ΣX1² - (ΣX1)²/n1
    SS2 = sum of squares of group II = ΣX2² - (ΣX2)²/n2
    n1  = number of cases in group I
    n2  = number of cases in group II
Before computing the ‘t’ value let us discuss the basic concepts in t-ratio
calculation.

i) Null hypothesis (Ho)

    Ho : μ1 = μ2

The null hypothesis states that the mean of group I is equal to the mean of
group II. If we accept Ho, then we conclude that there is no statistically significant
difference between the two groups.

ii) Alternate hypothesis (Ha)

    Ha : μ1 < μ2 (or) μ1 > μ2

The alternate hypothesis states that the mean of group I is lower or higher than
the mean of group II. In other words, a significant difference exists between the
means of the two groups.

iii) Level of significance

These are arbitrary standards which serve as cut-off or critical points along the
probability scale. Generally, the 0.05 and 0.01 levels of significance are the most
popular in social science research. The confidence with which an experimenter
rejects or retains a null hypothesis depends upon the level of significance adopted.

The 0.95 level indicates 95% confidence: if the experiment were repeated 100
times, only on five occasions would the obtained mean fall outside the limits
μ ± 1.96 SE. The 0.99 level indicates 99% confidence: if the experiment were
repeated 100 times, only on one occasion would the obtained mean fall outside the
limits μ ± 2.58 SE.

iv) Degrees of freedom

Degrees of freedom have a geometric interpretation, relating to the freedom of
movement of a point in relation to the number of dimensions to which it is
attached. For the two-sample t-test they are given by the formula df = N - 2.

v) Decision rule

The decision rule rests on the statistical value obtained from the table of 't' for
the particular df. If the calculated 't' value is equal to or more than the table 't'
value, we reject the null hypothesis, and the alternate hypothesis becomes the
solution to the problem. If the calculated 't' value is less than the table value, the
null hypothesis should be accepted.

Example
The scores of two groups on a psychological test are given below. Check whether the two groups differ significantly from each other.
Group I 10 11 11 12 15 16 16 17
Group II 8 9 12 12 13 14 15 17 12 16

Null Hypothesis (Ho) : M1 = M2
There is no significant difference between the means of groups I and II.
The formula for ‘t’ is given as,

t = (X̄1 − X̄2) / √{ [(SS1 + SS2) / ((n1 − 1) + (n2 − 1))] × (1/n1 + 1/n2) }

Here,

Group I: ΣX1 = 108, ΣX1² = 1512, n1 = 8, X̄1 = 108/8 = 13.5
Group II: ΣX2 = 128, ΣX2² = 1712, n2 = 10, X̄2 = 128/10 = 12.8

SS1 = ΣX1² − (ΣX1)²/n1
    = 1512 − (108)²/8
SS1 = 54

SS2 = ΣX2² − (ΣX2)²/n2
    = 1712 − (128)²/10
SS2 = 73.6

t = (13.5 − 12.8) / √{ [(54 + 73.6) / ((8 − 1) + (10 − 1))] × (1/8 + 1/10) }
  = 0.70 / √1.7944
t = 0.523

Here, df = N − 2, where N = n1 + n2
N = 8 + 10 = 18
Therefore df = 18 − 2 = 16
The table t-value for df = 16 at the 0.05 level is 2.12.
Since the calculated ‘t’ value is less than the table t-value, we accept the null hypothesis. Hence, it is concluded that there is no significant difference between the means of the two groups.
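The hand computation above can be checked with a short program. This is an illustrative sketch, not part of the lesson; it implements the pooled sums-of-squares t formula given earlier, using the same two groups of scores:

```python
def t_ratio(group1, group2):
    # Independent-samples t using pooled sums of squares:
    # t = (M1 - M2) / sqrt([(SS1 + SS2)/((n1-1)+(n2-1))] * (1/n1 + 1/n2))
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum(x * x for x in group1) - sum(group1) ** 2 / n1
    ss2 = sum(x * x for x in group2) - sum(group2) ** 2 / n2
    pooled = (ss1 + ss2) / ((n1 - 1) + (n2 - 1))
    se_diff = (pooled * (1 / n1 + 1 / n2)) ** 0.5
    return (m1 - m2) / se_diff

group1 = [10, 11, 11, 12, 15, 16, 16, 17]
group2 = [8, 9, 12, 12, 13, 14, 15, 17, 12, 16]
t = t_ratio(group1, group2)  # about 0.523, below the table value 2.12 at df = 16
```

Because 0.523 < 2.12, the program leads to the same decision as the worked example: retain the null hypothesis.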

Interpretation of t-ratio
Usually the t-ratio is used to establish the cause and effect relationship between the variables. If the computed ‘t’ value is equal to or more than the table t-value, we reject the null hypothesis and accept the alternate hypothesis. In this case, the conclusion is that there is a significant difference between the means of the two groups, which is attributed to the effect of the independent variable. On the other hand, if the computed t-value is less than the table value, we accept the null hypothesis and conclude that there is no significant difference between the means of the two groups.
The t-Ratio and its Assumptions
The t-ratio is known as a parametric test. A parametric test is a test that is
based on a specified sampling distribution when samples are drawn at random
from the population.

The assumptions we make when performing a t-ratio are basically about the distribution of error. It is assumed that:
1) Sampling error is distributed normally in each treatment population.
2) The distribution of error is the same for each population, σ1² = σ2². This assumption is known as the homogeneity of variance.
3) The observations in each sample are independent of each other.
In other words, we assume that the scores we obtain from subjects will be normally distributed, that the spread in the distributions will be the same, and that the two groups are independent. To meet these assumptions we should carefully consider the procedures we use to select subjects, to assign them to treatment groups, and to examine our data before we conduct the analysis.
For example, when we use the t-ratio for independent means, we assume that the sampling distribution for the differences between the means is best described by Student's t-distribution. Indeed, whenever we use the t-test, we make several specific assumptions. Specifically, the assumptions for the t-ratio are as follows.
a) The subjects are selected at random
Random selection means that each member of the population has an equal opportunity/probability of being chosen; when the sample is selected randomly, we have a basis for making inferences to the population. Without this critical element in the t-distribution we would be unable to say that the results of the experiment can be generalized beyond the type of subjects in the study.

b) Subjects are randomly assigned to groups


It is important that in any experimental design the observations in the treatment and control groups be independent of each other. This is sometimes called the independent-groups assumption. In other words, in an independent-groups design, subjects are randomly selected from a population and assigned to treatment groups in such a way that the selection and observation of one subject have no effect on any other subject or observation. Once the assignments of the subjects to the groups are made, the treatment and data collection techniques should be identical for all subjects except for the manipulation of the independent variable.

c) Data are Normally Distributed and σ1² = σ2²
It is important that the distribution of error in the population be normal and the same across treatment groups. But since we do not know the population parameters, we must use sample statistics to determine whether we have met the assumptions of the t-ratio. Further, we assume that both samples are drawn from populations whose variances are equal, i.e. the homogeneity of variance.
If the homogeneity of variance is not satisfied, i.e. there is heterogeneity of
variance, then the t-ratio is adjusted for the inequality of variances. The advantage
of this statistic is that it accounts for the inequality of variances and the inequality
of sample sizes. This t-ratio is called a corrected t-ratio and is calculated as

 1  SS1  1  SS1 
f̂  x1  x 2 /     
 n1  n1  1  n2  n2  1 
Hence, using the t-ratio for independent groups, we can compare two groups to
determine whether the sample means were drawn from the same population or
from different populations with different means.
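The corrected t-ratio can likewise be sketched in code. The example below is illustrative only; it applies the corrected formula to the same two groups used in the worked example earlier in this lesson:

```python
def corrected_t(group1, group2):
    # t-ratio adjusted for unequal variances and unequal sample sizes:
    # t = (M1 - M2) / sqrt(SS1/(n1*(n1-1)) + SS2/(n2*(n2-1)))
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum(x * x for x in group1) - sum(group1) ** 2 / n1
    ss2 = sum(x * x for x in group2) - sum(group2) ** 2 / n2
    se_diff = (ss1 / (n1 * (n1 - 1)) + ss2 / (n2 * (n2 - 1))) ** 0.5
    return (m1 - m2) / se_diff

t_corr = corrected_t([10, 11, 11, 12, 15, 16, 16, 17],
                     [8, 9, 12, 12, 13, 14, 15, 17, 12, 16])
# close to the uncorrected value here, because the two variances are similar
```

When the group variances are nearly equal, the corrected and uncorrected t-ratios agree closely; the correction matters when the variances (or sample sizes) differ markedly.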
SUMMARY
Bivariate analyses are powerful measures which assess the relationship between two variables. The correlation coefficient and Student's ‘t’ ratio are the most widely used bivariate measures in the field of assessment. A correlation coefficient tells us the relationship between two variables along with its direction. Karl Pearson's product moment correlation coefficient and Spearman's rank order correlation coefficient are frequently used. These are highly helpful in finding out the relationship between variables, predicting the value of one variable from another, validating a test, etc. Student's ‘t’ ratio is a powerful statistical method for testing a null hypothesis; the ‘t’ ratio is appropriate when there is homogeneity of variance. Using the ‘t’ ratio we can determine a number of things about sample means.
KEY TERMS
Correlation Positive correlation
Negative correlation Zero-order correlation
Scatter plot Correlation coefficient
Spearman's rank order correlation coefficient
Karl Pearson's product moment correlation coefficient
t-ratio Homogeneity of variance
QUESTIONS
1) What is correlation? Explain positive, negative and zero-order correlation with suitable examples.
2) Calculate the product moment correlation coefficient for the following pair.
X : 15 14 11 13 14 16 17
Y : 16 20 26 15 22 19 12
3) What is rank difference correlation? Explain its computation with a suitable example.

4) What are the basic assumptions in the use of the t-ratio? How is the t-ratio interpreted?
5) Two statistics classes of 25 students each obtained the following results on the final examination:
X̄1 = 82, SS1 = 384.16, X̄2 = 77, SS2 = 1536.64
Employing a level of significance of 0.05, test the hypothesis that the two classes are equal in ability.



LESSON – 10

AN OVERVIEW OF EXPERIMENTATION
OBJECTIVES
After reading this lesson the student should
 Understand the characteristics of science.
 Explain why the psychological research is the application of scientific
method.
 Describe the stages in a scientific method.
SYNOPSIS
The nature of science – Psychological Experimentation – An application of the
scientific method – An example of a psychological experiment.
THE NATURE OF SCIENCE
Man has a great ability to engage in abstract thinking: he can survey a number of diverse items and abstract certain characteristics that the items have in common. We can arrive at a general definition of science when we
proceed in such a manner. That is, we can consider the various sciences as a group
and abstract the salient characteristics that distinguish them from other
disciplines.
There are three groups of disciplines which man studies. The first is the sciences; the second is the arts and humanities; and the third is the metaphysical disciplines. These three groups are similar to each other in some important ways, but they certainly differ among themselves in a number of ways. One common
characteristic of the sciences is that they all use the same general approach in
solving problems - the scientific method. The scientific method is a serial process by
which all the sciences obtain answers to their questions. The other two disciplines
do not use the scientific method. Individuals who study the sciences and the arts and humanities consider only problems that can be solved. The metaphysical disciplines generally study unsolvable problems. A solvable problem is one that poses a question that can be answered with the use of man's normal capacities. An unsolvable problem raises a question that is essentially unanswerable.
Individuals who study the sciences and arts and humanities simply believe that
they must limit their study to problems that they are capable of solving.
In brief, first sciences use the scientific method and they study solvable
problems. Second, the other disciplines like arts and humanities do not use the
scientific method but their problems are typically solvable. Third, the metaphysical
disciplines neither use the scientific method nor do they pose solvable problems.
These considerations lead to the definition – Science is the application of the
scientific method to solvable problems.

METHODS OF ACQUIRING KNOWLEDGE IN SCIENCE


There are various methods of acquiring knowledge in science, such as authority, tradition, experience, common sense, intuition and revelation, and logic. Let us
discuss these methods with suitable examples.

i) Authority
Appeal to authority and seeking its advice was a well-established method of solving problems in the earliest civilizations.
E.g. When floods, famines and disease terrified man, people accepted their ancestral explanations and prayed to supernatural powers for help.
During the Middle Ages, ancient scholars such as Plato and Aristotle were accepted as sources of truth rather than first-hand experience and analysis of facts.
The authorities may be persons who have had experience with the problem. When factual evidence cannot be obtained to solve a problem, one may have to rely upon authoritative opinion temporarily as the only possible method of solution.
In such a situation, care must be employed in choosing authorities and evaluating their claims to knowledge. One should check not only the credentials of authorities but also the arguments and evidence upon which they base their judgements.

ii) Tradition
People unquestioningly accept many traditions of their forefathers or culture
such as customary styles, food, speech and worship.
Although automatic acceptance of tradition and custom is often necessary, one should not always assume that everything that has customarily been done is right and valid. One should evaluate custom and tradition carefully before accepting them as truth.

iii) Experience
Our own personal experience is the most primitive, familiar and fundamental source of knowledge.
E.g. In olden times, facts such as when grains ripened during the year, or when sudden floods occurred in certain seasons, were known through experience.
According to Van Dalen, a person may make errors when observing, or when reporting what he has seen or done. He may:
1. Omit evidence that does not agree with his opinion.
2. Use measuring instruments that require many subjective estimates.
3. Establish a belief on insufficient evidence.
4. Fail to observe significant factors relevant to a specific situation.
5. Draw improper conclusions or inferences owing to personal prejudices.
In the light of these remarks, one should continuously and critically evaluate experience as an avenue for obtaining reliable knowledge.

iv) Common sense


The appeal to common sense can be considered from two points of view:
a) as a means for justifying preconceived beliefs
b) as a way of referring to knowledge that has been previously verified.
As long as common sense dictated that the study of difficult subjects such as Latin, Greek or Mathematics would ‘exercise the mind’, there were few demands for experimental investigation of the matter. These common sense beliefs were not modified until the first quarter of the 20th century, when experimental evidence had been accumulated to discredit them.
The second meaning of common sense refers to generally accepted and empirically verified knowledge, i.e. information derived through common sense represents a body of already investigated beliefs which may be modified as additional evidence accumulates. These common sense beliefs can be re-examined and reformulated as new information is accumulated.

v) Intuition and Revelation


Revelations are direct and immediate insights concerning ‘truth’ or ‘reality’; they are often presumed to originate from God. If these hunches or experiences are believed to have natural origins, they are called intuitions.
Revelations are more difficult to analyze than intuitions, since their source is presumed to originate outside of human experience; belief in a supernatural existence rests upon faith.
While a revelation may have tremendous importance for the person experiencing it, its private nature does not provide a very convincing argument for others.

vi) Logic
Rationalism is the philosophic position that places reason over both revelation and experience. In general, rationalists take the position that concepts such as God, causality or mathematical proofs do not depend upon experience or revelation but can be proven through rational processes, especially deduction.

a) Deductive Reasoning
Deduction is the process of drawing specific conclusions from general premises in a form known as a syllogism.
E.g. All humans are mortal.
Socrates is a human.
∴ Socrates is mortal.
i.e. the syllogism contains (i) a major premise, (ii) a minor premise and (iii) the conclusion.
In deductive reasoning if the premises are true the conclusion is necessarily
true. Deductive reasoning enables one to organize premises into patterns that
provide conclusive evidence for the validity of a conclusion.

Deductive reasoning provides a means for linking theory and observations. It


enables the researchers to deduce from existing theory, what phenomena should be
observed. Deductions from theory can provide hypotheses, which are vital part of a
scientific inquiry.

b) Inductive Reasoning
In inductive reasoning, inferences are made about a whole class of events on the basis of the observed events.
E.g. Every rabbit that has ever been observed has lungs.
∴ Every rabbit has lungs.
In inductive reasoning, a conclusion is reached by observing instances. In order to be absolutely certain of an inductive conclusion, the investigator must observe all examples. This is known as perfect induction under the Baconian system; it requires that the investigator examine every example of a phenomenon.
Inductive conclusions can be absolute only when the group about which they are asserted is small. Since one can make perfect inductions only on small groups, we commonly use imperfect induction, a system in which one observes a sample of a group and infers from the sample the characteristics of the entire group.
RESEARCH
Research is a search for knowledge. It helps us to find out a solution to the
problem as well as to make certain decision. There are various definitions available
for research.
 It is an investigation undertaken in order to discover new facts, to get
additional information in a field of study.
 It is a scientific and systematic search for potential information on a specific
topic.
 It is a careful investigation specially through search for new facts in any
branch of knowledge.
 It is a systematized effort to gain new knowledge.
Objectives of Research

1. To gain familiarity with a phenomenon (or) to achieve new insights into it.
2. To study and analyse the characteristics of a particular individual or group.
3. To determine the frequency with which something occurs or with which it is
associated with something else.
4. To test a hypothesis of a causal relationship between variables.
Characteristics of a Research
1. Research is directed towards the solution of the problem.
2. Its goal is to discover the cause and effect relationship between the
variables.

3. It goes beyond specific objects, groups or situations and emphasizes the development of generalizations, principles or theories.
4. It is based on observable experience or empirical evidence.
5. It demands accurate observation and description.
6. It involves gathering new data from primary or first hand sources or
using existing data for a new purpose.
7. It requires expertise i.e. The researcher should know what is already
known about the problem and what others have investigated on it.
8. It is the quest for discovering answers to unsolved problems.
9. The findings should be reported clearly and concisely and it should be
verifiable.
Psychological Research
It is the process of applying the scientific method to the solution of
psychological problems.
It is the method of obtaining objective solutions to problems encountered in the
theory and practice of Psychology.
It is the process of arriving at verifiable solutions in the field of Psychology.

Research is a Scientific and Disciplined Inquiry


Research is a relatively new activity in the history of psychology. Yet even in early centuries, when behaviour was already a common focus, individuals developed knowledge of the world around them through their own experiences and through observing the experiences of others. They then passed their knowledge to the next generation by telling stories. These stories provided an understanding, a repertoire of wisdom from which a person could extrapolate, or extend known experience into an unknown area, so as to arrive at a useful conjecture or image of the future.
The ultimate aim of science is the generation and verification of theory. A theory
predicts and explains natural phenomena. Instead of explaining each and every separate behaviour of adolescents, for example, the scientist seeks general explanations which link together different behaviours. Theory generation and verification are central concepts of scientific inquiry. A scientist values the empirical approach for “its manner of exposing to falsification, in every conceivable way, the system to be tested”. The purpose is not to promote false knowledge, but to select that which is most accurate and reliable by exposing competing theories to empirical testing. Scientific inquiry thus contrasts sharply with other ways of arriving at valid and trustworthy knowledge.
Scientific inquiry is the search for knowledge by using recognized methods in
data collection, analysis and interpretation. The term “scientific” refers to an
approach and is not synonymous with science. Science is a body of established
knowledge, whereas “scientific” refers to the way the knowledge was generated. The scientific method has a number of sequential processes. Similarly, in Psychology we have a number of sequential processes which lead to the development of theory. Hence, it is stated that psychological research is the application of the scientific method.
SCIENTIFIC METHOD
Science may be defined as follows:
The observation and classification of facts using verifiable and objective
methods.
The interpretation of gathered facts.
The procedures used in science for developing its body of knowledge have come to be known as the scientific method. The scientific method:
i) has acquired a highly specific meaning in modern science;
ii) stands for a systematic, acceptable set of procedures used for generating new knowledge, the validity of which is self-evident because of the logical constructs implied in their ordering.

Steps in Scientific Method


1. Identification and definition of the problem.
2. Formulation of a hypothesis.
3. Implication of hypothesis through deductive reasoning.
4. Collection and analysis of evidence.
5. Verification, rejection or modification of the hypothesis.
Another equally valid classification of the steps is presented below:
1. Delimitation of the problem.
2. Collection and classification of the necessary facts.
3. Weighing possible methods of solution
4. Selection of criteria for the solution
5. Setting up suitable hypotheses and their try out
6. Interpretation and generalization of the findings
7. Revision and detail of hypothesis.
Characteristics of scientific method
The scientific method is characterized by a fixed sequence of procedures. The characteristics of the scientific method are:
(i) Use of facts in developing its structure
i.e. use of empirically verifiable observations for building up scientific knowledge.
(ii) Quantitative description of facts
i.e. anything that is subjected to study is converted into numbers, in which form it is used for further discussion.
(iii) Suspended judgements
i.e. willingness to disbelieve what has not been scientifically proved, even when it conflicts with popular and accepted notions.

iv) Concern with relevant facts


i.e. From a complex situation involving many variables the ability to select and
study only what is relevant to the problem on hand.
v) Sensitivity to problems
i.e. searching for problems even in ordinary and non-challenging situations.
vi) Effort to discover rather than prove
i.e. neutrality to the problem on hand, unconcerned about the consequences of the final solution.
vii) Continuous appraisal
i.e. looking for loopholes in existing knowledge; acceptance of the assumption that no knowledge is perfect and that existing knowledge is capable of infinite improvement.
viii) Developing more inclusive generalizations
i.e. viewing any discovered truth as part of a bigger whole, trying to find out the whole from the part that is revealed.

The Nature of Science


There are certain aspects of the scientific approach which make its knowledge reliable. They are:
1. the assumptions made by scientists
2. the attitudes of scientists
3. the culmination of scientific inquiry in terms of theoretical formulations.
Assumptions underlying scientific method
1. Events investigated are lawful or ordered, i.e. no event is capricious.
2. Truth can ultimately be derived only from direct observation.
3. Reliance upon empirical evidence.
4. Utilizing relevant concepts.
5. Commitment only to objective considerations.
6. It results in probabilistic predictions.
7. The methodology is known to all concerned, for critical scrutiny.
8. It aims at formulating the most general axioms.
In brief, first sciences use the scientific method, and they study solvable
problems. Second, the other disciplines arts and humanities do not use scientific
method, but their problems are typically solvable. Third, the metaphysical
disciplines neither use the scientific method nor do they pose solvable problems.
These considerations lead to the definition – “Science is the application of the scientific method to solvable problems”.

PSYCHOLOGICAL EXPERIMENTATION AN APPLICATION OF THE SCIENTIFIC METHOD


The most powerful application of the scientific method is experimentation. A psychological experiment starts with the formulation of a problem. The problem is usually best stated in the form of a question. The only requirement that the problem must meet is that it be solvable – the question that it raises must be answerable with the tools that are available to the psychologist. The problem may be concerned with any aspect of behaviour, whether it is important or unimportant.
The experimenter generally expresses a tentative solution to the problem. This tentative solution is called a hypothesis; it may be a reasoned potential solution or only a vague guess. Following the statement of his hypothesis, the experimenter seeks to determine whether the hypothesis is (probably) true or (probably) false, i.e. whether it solves the problem he has set for himself. To answer this question he collects data, as the data are his only criterion. Though various techniques are available for data collection, experimentation is the most powerful.
The first step the experimenter will take in actually collecting his data is to select a group of subjects. The type of subject is determined by the nature of the problem. If he is concerned with psychotherapy, he may select a group of mentally disturbed persons. Or, if he is interested in the function of parts of the brain, he would use animals as his subjects, because few persons volunteer to serve as subjects for brain operations. But whatever the type of subjects, the
experimenter will assign them to groups.
The assignment of subjects to groups must be made in such a way that the groups will be approximately equivalent when the experimenter starts the experiment. This is achieved through randomization. Next, the experimenter typically administers an experimental treatment that he wishes to evaluate, and it is administered to the experimental group. The other group, called the control group, usually receives a normal or standard treatment.
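The randomization step described above can be illustrated with a short sketch. The subject IDs and the use of Python's random module here are illustrative assumptions, not part of the text:

```python
import random

def randomly_assign(subjects, seed=None):
    # Shuffle the pool, then split it in half, so that membership in the
    # experimental or control group is determined by chance alone.
    rng = random.Random(seed)
    pool = list(subjects)
    rng.shuffle(pool)
    half = len(pool) // 2
    return pool[:half], pool[half:]  # (experimental, control)

# 20 hypothetical subject IDs split into two groups of 10
experimental, control = randomly_assign(range(1, 21), seed=1)
```

Because each subject is equally likely to land in either group, pre-existing differences between subjects tend to balance out across the two groups, which is what makes them approximately equivalent at the start of the experiment.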
In his study of behaviour, the Psychologist generally seeks to establish empirical
relationships between aspects of the environment and the aspects of behaviour.
These relationships are known by a variety of names, such as hypotheses, theories
or laws. These relationships in psychology essentially state that if a certain environmental characteristic is changed, behaviour of a certain type also changes. The aspect of the environment which is experimentally studied is called the independent variable; the resulting change in behaviour is called the dependent variable. Roughly, a ‘variable’ is anything that can change in value; it may assume different numerical values. According to E.L. Thorndike, anything that exists is a variable: he asserted that anything that exists, exists in some quantity.
Psychological variables change in value from time to time for any given
organism and according to various experimental conditions. Some examples of
variables are the height of men, the weight, the speed with which a rat runs a maze,
the number of trials required to learn a poem, the brightness of light, the number of
words a patient says in a psychotherapeutic interview and the pay a worker
receives for performing a given task. A variable that may assume any fractional value is called a continuous variable; a continuous variable is one that is capable of changing by any amount, even a very small one. A variable that is not continuous is called a discontinuous or discrete variable. A discrete variable can assume only numerical values that differ by clearly defined steps, with no intermediate values possible. For example, the number of students in a classroom is a discrete variable: one might count 20, 40, 60 or 125 students in a classroom but not 1.5 or 19.5 students. Similarly, gender (male or female) or eye colour (brown or blue) are examples of discrete variables.
As the psychologist seeks to find relationships between independent and dependent variables, a great number of independent variables are available in nature for the psychologist to examine. But he is interested in discovering the relatively few that affect a given kind of behaviour. So an independent variable is any variable that is investigated for the purpose of determining whether it influences behaviour. Some independent variables that have been scientifically investigated are age, drugs, anxiety, intelligence, socioeconomic status and home environment.
To determine whether a given independent variable affects behaviour, the
experimenter administers one value of it to his experimental group and a second
value of it, to his control group. The value administered to the experimental group
is called the ‘experimental treatment’, while the control group is usually given a ‘normal treatment’. Thus, the experimental and normal treatments are the specific values of the independent variable that are assigned to each group. For example, the
independent variable may be the number of trials. The experimenter may provide
15 trials to experimental group and 5 trials to the control group. The experimental
group practices a task 15 times, the control group 5 times.
In another instance, if one group is administered a ‘zero’ value of the
independent variable and a second group is given some positive amount of that
variable, then the zero treatment group would be called the ‘control group’ while the
other would be the experimental group. So it is only arbitrary, which group is
labelled the control group and which is called the experimental group.
The dependent variable is usually some well-defined aspect of behaviour (a response) which the experimenter measures in his experiment. It may be the number of words the subject can recall or the number of items a worker turns out in an hour. The value obtained for the dependent variable is the criterion of whether or not the independent variable is effective. It is in this sense that it is called a dependent variable – the value it assumes is expected to be dependent on the value assigned to the independent variable. Thus, an experimenter will vary
the independent variable and note whether the dependent variable changes. If the
dependent variable changes in value as the independent variable is manipulated,
then it may be said that there is a relationship between the two. If the dependent
variable does not change, we say there is no relationship between them.
The most important principle of experimentation is that the experimenter must hold constant all of the variables that may affect his dependent variable, except the independent variable(s) that he is attempting to investigate or study. There are a number of variables that may affect the dependent variable, but the experimenter is interested in the relationship between his independent and his dependent variables. If the experimenter allows a number of these other variables to operate in the experimental situation, they are called extraneous variables.
So far we know that a scientist starts his investigation with the statement of a
problem, after which he advances a hypothesis as a tentative solution to that
problem. He then conducts an experiment to collect data, which should indicate the
probability that his hypothesis is true or false. He may use certain types of
apparatus or equipment in his experiment. The particular type of apparatus used
will depend on the nature of the problem.
The hypothesis being tested will predict whether or not the experimental group will perform better than the control group. By comparing the hypothesis with the dependent variable values of the two groups, the experimenter can determine if the hypothesis accurately predicted the results. But it is difficult to
tell whether the dependent variable values for one group are higher or lower than
the values for the second group by simply looking at the data. Therefore, the
experimenter has to reduce his data to numbers that can be reasonably handled
because they provide him with an answer. For this reason he must resort to
statistics. For example, he may compute an average (mean) score for both the
experimental group and control group. The experimental group may have a higher
mean score, say 75, than the control group, say 60. The difference between the two
groups is 15. We do not know whether this difference is a ‘real’ difference or it is
only a chance difference. So, to know whether the difference is real or is due to
random fluctuation (chance), the experimenter resorts to a variety of statistical
tests. More appropriately, the statistical tests indicate whether or not the difference
is statistically significant, and this is what is meant by a ‘real and reliable’ difference. If the
difference between the dependent variable scores of the groups is significant, the
difference is not due to random fluctuations; and it is concluded that the
independent variable is effective.
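The notion of a ‘chance’ versus a ‘real’ difference can be illustrated with a small simulation. The sketch below uses hypothetical scores (chosen only so that the group means come out to 75 and 60, as in the example above) and a simple permutation test: it repeatedly shuffles the group labels and counts how often chance alone produces a mean difference as large as the observed one.

```python
import random

# Hypothetical scores, assumed for illustration only
# (constructed so the means are 75 and 60, as in the text).
experimental = [78, 72, 80, 69, 75, 77, 74, 81, 70, 74]
control = [62, 58, 65, 55, 60, 63, 59, 61, 57, 60]

def mean(xs):
    return sum(xs) / len(xs)

observed_diff = mean(experimental) - mean(control)  # 75 - 60 = 15

# Permutation test: if group membership were irrelevant (chance only),
# reshuffling the labels should often yield a difference this large.
random.seed(0)
pooled = experimental + control
n = len(experimental)
trials = 10000
count = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = mean(pooled[:n]) - mean(pooled[n:])
    if abs(diff) >= abs(observed_diff):
        count += 1

p_value = count / trials
print(f"observed difference = {observed_diff:.1f}, p = {p_value:.4f}")
# A small p (e.g. below 0.05) suggests the difference is 'real',
# not a random fluctuation.
```

With these particular scores the two groups barely overlap, so a shuffled difference as large as 15 almost never occurs and the difference would be judged significant.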
Thus, suppose that we start with two equivalent groups, administer the
experimental treatment to one but not to the other, and find a significant
difference between the two groups. We may assume that the two groups differ
because of the experimental treatment, and therefore the hypothesis is supported
or confirmed. On the other hand, if the control group is found to be equal to the
experimental group, the hypothesis is not supported.
An additional step of the scientific method that is closely related to hypothesis
testing is generalization. Usually, the experimenter must specify the particular
conditions under which he tests his hypothesis, but he wants to generalize his
results; that is, to state that his hypothesis is applicable to a wider set of
conditions.
The next step in the scientific method concerns making predictions on the
basis of the hypothesis; that is, a hypothesis may be used to predict certain
events in a new situation. The final step in the scientific method is replication. By
replication is meant that an additional experiment is conducted in which the
method of the first experiment is precisely repeated. The confirmed hypothesis may
be used with a new sample as the basis for predicting the same result as did the
original sample. If the prediction made by the use of the previously confirmed
hypothesis is found to hold in the new situation, the probability that the
hypothesis is true is tremendously increased.
Hence, in a psychological experimentation the scientist states a problem that he
wishes to investigate. Next, he formulates the hypothesis, a tentative solution to the
problem. Third, he collects data relevant to the hypothesis. Following this he tests
the hypothesis by confronting it with the data and makes the appropriate
inferences. He organizes the data through statistical means and determines
whether the data support or refute the hypothesis. Fifth, assuming that the
hypothesis is supported, he may wish to generalize to all things with which the
hypothesis is concerned. Sixth, he may wish to make a prediction to new
situations, to events not studied in the original experiment. And finally, he may
wish to test the hypothesis again in the novel situation; that is he might replicate
the experiment (conduct the experiment with a new sample of subjects). In
experimentation a psychologist adopts all the steps of the scientific method, and
hence psychological experimentation is an application of the scientific method.
AN EXAMPLE OF A PSYCHOLOGICAL EXPERIMENT
A psychologist may be interested in studying the impact of anxiety on the problem
solving ability of students. This problem may be stated as “Does anxiety influence
the problem solving ability of students?” The psychologist may take two groups of
subjects consisting of 10 each; based on their scores on an anxiety scale they may
be categorized as a low anxiety group and a high anxiety group. These subjects
may be administered some numerical problems, and their scores on them
constitute the dependent variable. The level of anxiety constitutes the independent
variable. The design of this experiment may be a simple two randomized groups
design. Assuming that the extraneous variables are adequately controlled, the
psychologist collects data on the numerical ability test. The scores of the two
groups may be subjected to a simple statistical test called the ‘t’ test. The
psychologist formulates a hypothesis such as “There would be a significant
difference between the two groups with regard to their numerical ability”. If the
obtained ‘t’ value is higher than the table value, he confirms that the high anxiety
group performs lower than the low anxiety group, and we may say that anxiety
significantly influences the numerical ability of the subjects.
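Under the assumption that such data are available, the ‘t’ test in this example can be carried out in a few lines of code. The scores below are invented purely for illustration; the sketch uses SciPy’s independent-samples t test, and the ‘table value’ is obtained from the t distribution with n1 + n2 − 2 = 18 degrees of freedom.

```python
from scipy import stats

# Hypothetical numerical-ability scores (invented for illustration,
# 10 subjects per anxiety group as in the text).
low_anxiety = [18, 22, 20, 24, 19, 23, 21, 25, 20, 22]
high_anxiety = [14, 17, 15, 13, 16, 18, 12, 15, 17, 14]

# Independent-samples t test on the two groups
t_stat, p_value = stats.ttest_ind(low_anxiety, high_anxiety)

# The 'table value': two-tailed critical t at the 0.05 level, df = 18
critical = stats.t.ppf(0.975, df=18)

print(f"t = {t_stat:.2f}, table value = {critical:.2f}, p = {p_value:.4f}")
if abs(t_stat) > critical:
    print("The difference is significant: the hypothesis is supported.")
else:
    print("The difference is not significant.")
```

Here the obtained t is compared with the table value exactly as described in the text; equivalently, the reported p value can be compared with the chosen significance level.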
SUMMARY
Science refers to systematic observational methods. Scientific methods are the
procedures used to develop the body of knowledge of science. Science uses the
scientific method and studies solvable problems. There are various methods of
acquiring knowledge in science. Inductive reasoning helps us to link theory and
observations. Research is a scientific and systematic search for pertinent
information on a specific topic. Psychological experimentation is an application of
science and the scientific method, since it adopts the procedures of science. In
psychological experimentation we state a problem, formulate a hypothesis, collect
data, test the hypothesis and generalize the findings. These steps are termed
“theory building” in science.
KEY TERMS
Science Scientific Method
Deductive Reasoning Inductive Reasoning
Research Psychological Research
Variable Independent Variable
Dependent Variable Extraneous Variable
Control Group Experimental Group
QUESTIONS
1. Discuss the characteristics of science.
2. How is the scientific method applied in psychological research?
3. Write notes on
(a) Variable (b) Experimental group.
4. Explain why the psychological experimentation is called the application of
scientific method.
LESSON – 11
MAJOR STAGES IN RESEARCH
OBJECTIVES
After reading this lesson the student should
 Know the various stages in Research.
 Understand the purpose of related literature.
 Explain the criteria of a good research problem.
 Describe how the problems are manifested.
SYNOPSIS
Defining a research problem – Sources of research problem: Study of Related
Literature – Criteria for Selecting a problem.
INTRODUCTION
Research may be defined as an investigation undertaken in order to discover
new facts, get additional information etc. Also, it is defined as the application of the
scientific method to the study of a problem. Research is a way to acquire
dependable and useful information. The important characteristics of research are,
1. Research is directed towards the solution of a problem.
2. The goal is to discover cause and effect relationship between the variables.
3. It goes beyond the specific objects, groups or situation and emphasizes the
development of generalization principles or theories.
4. It is based on observable experience or empirical evidence.
5. It demands accurate observation and description.
6. It involves gathering new data from primary or first hand sources or using
existing data for a new purpose.
7. It requires expertise, i.e., the researcher should know what is already known
about the problem and what others have investigated on it.
8. It is objective and logical. The researchers should eliminate personal bias. The
attempt is on testing rather than proving the hypothesis.
9. It is a quest for discovering answers to unsolved problems.
10. The findings should be reported clearly and concisely, and they should be verifiable.
MAJOR STAGES IN RESEARCH
Research is a scientific method or investigation which involves a number of
concrete steps (or) stages. The major stages in research are given below:
1. Formulating the Research Problem
Research should begin with a statement of the problem. First, the researcher
has to identify and select a research problem. Initially the problem may be stated in
a broad general way. The feasibility of a particular solution has to be considered
before a working formulation of the problem can be set up. Hence, the formulation
of a general topic into a specific research problem is the first step in any scientific
enquiry.
2. Literature Survey
After the formulation of a research problem, an extensive survey of the related
literature should be done. The summary of previous theories and research
findings regarding the problem should be reviewed. This stage helps us:
i) to avoid duplication of work;
ii) to avoid waste of time;
iii) to provide the researcher with a background on the problem.
3. Development of Hypothesis
After the literature survey, a researcher should clearly state the hypotheses of
his study. A research hypothesis is a tentative answer to the problem. It is a guess
based on the prior studies and previous experience of the researcher. It should be
formulated before the data are gathered. This helps the investigator remain
unbiased.
A research hypothesis should be
a) self-explanatory.
b) testable.
c) consistent with the existing body of knowledge.
d) stated as simply and concisely as possible.
4. Preparing the Research Design
The function of a research design is to provide a plan for the collection of
relevant evidence with minimal expenditure of effort, time and money. Usually the
research design is of four types viz. Exploration, Description, Diagnosis and
Experimentation. Hence, depending on the purpose any one of these four methods
may be selected as a research design.
5. Methodology / Strategy
This indicates the method in which the experimenter is going to conduct the
research. This consists of the following parts.
a) Sample: The population (universe) from which the sample for the study is to be
drawn has to be defined. Then, the method of sampling used to select the sample
for the study should be clearly described.
b) Collection of data: The collection of appropriate data is essential for any
research problem. There are several ways of collecting the data. Usually,
primary data can be collected through experiments (or) survey. In the case of
survey the data can be collected by observation, through personal interviews,
through telephone interviews, by mailing the questionnaire and schedules.
c) Analysis of data: After the collection, the data should be tabulated and
analysed. The suitable statistical techniques may be planned well in advance.
Hence, the researcher can analyse the collected data with the help of various
statistical measures.
d) Hypothesis testing: After the analysis of data the hypothesis should be tested
using ‘t’ test, ‘z’ test, ‘F’ test, correlation etc. This leads to either accepting or
rejecting the hypothesis.
e) Generalisation and Interpretation: If the hypothesis is tested and upheld
several times, it may be possible for the researcher to arrive at generalisations. If
the research has no hypothesis to start with, the researcher has to explain his
findings on the basis of some theory, and this is called interpretation.

6. Preparation of Research Report
Finally, the researcher has to prepare the report of what has been done by him.
In the research report the researcher should write the following things.
a) introduction b) summary of findings
c) main report d) conclusion
At the end of the report the bibliography (list of journals, books, etc.) and
appendices if any should be provided.
Hence, in any scientific investigation the above stages should be clearly planned
so that we conduct the research in a systematic manner.
STUDY OF RELATED LITERATURE
Research can never be done in isolation from the work that has already been done
on problems which are directly or indirectly related to the problem under study. A
careful review of related literature is one of the major stages or steps in any
research study. The literature review enables a researcher to gain further insight
into the purpose and the results of a study. Review of literature includes many
types of sources: professional journals, reports, scholarly books and monographs,
government documents and dissertations. It may include empirical research,
theoretical discussions, reviews of the status of knowledge, philosophical papers
and methodological treatises.
Related literature is literature obviously relevant to the problem, such as previous
research investigating the same variables or a similar question; references to the
theory and the empirical testing of the theory; and studies of similar practices.
New or little
researched topics usually require a review of any literature related in some
essential way to the problem to provide the conceptual framework and a rationale
for the study. Related literature may be found outside the field of education such as
in sociological research on small-group interaction, political studies of voter
behaviour or psychological research on cognitive processes.
A study of related literature must precede any well planned research. A review
of the literature serves several purposes in research. Knowledge from the literature
is used in stating the significance of the problem, developing the research design,
relating the results of the study to previous knowledge and suggesting further
research. A review of the literature enables a researcher to:
1. Define and limit the problem
Most research studies that add to knowledge investigate only one aspect
of the larger topic. The researcher initially becomes familiar with the major works in
that topic and the possible breadth of the topic. The research problem is eventually
limited to a subtopic within a larger body of previous theory, knowledge or practice
and stated in the appropriate terms.
2. Place the study in a historical and associational perspective
To add to the knowledge in any subfield, researchers analyze the way their
studies will relate to existing knowledge. A researcher may thus state that the
research of A, B and C has added a certain amount to knowledge; the work of D
and E has further added to our knowledge; and this study extends our knowledge
by investigating the stated question.
3. Avoid unintentional and unnecessary replication
A thorough search of the literature enables the researcher to avoid
unintentional replication and to select a different research problem. The
researcher, however, may deliberately replicate a study for verification. A research topic that
has been investigated with similar methods that failed to produce significant results
indicates a need to revise the problem or the research design. Evaluation studies
may seem to duplicate prior research, but this duplication is necessary if the study
is designed for site decision-making.
4. Select Promising Methods and Measures
As researchers sort out the knowledge on a subject, they assess the research
methods that have established that knowledge. Previous investigations provide a
rationale and insight for the research design. Analysis of measures, sampling and
methods of prior research may lead to a more sophisticated design, the selection of
a valid instrument, a more appropriate data analysis procedure or a different
methodology for studying the problem.
5. Relate the findings to previous knowledge and suggest further research
The results of a study are contrasted with those of previous research in order to
state how the study added additional knowledge. If the study yielded non-significant
results, the researcher’s insight may relate to the research problem or to the design.
Most researchers suggest directions for further research based on insights gained
from conducting the study and from the literature review.
Hence, in essence, the review of related literature serves many purposes in
research. In particular, it helps the researcher to limit the problem to a restricted
scope, so that overloading of work, boredom and duplication can be avoided.
NATURE OF RESEARCH PROBLEM
Researchers can ask many questions about existing psychological theories and
practices. Consider the following questions.
What is the present status of a group’s performance?
What is the best way to conduct one’s work?
How does intelligence affect achievement?
What are the assumptions in creativity assessment?
These questions seek some answers. Hence, the questions comprise the initial
step in research.
Some questions, although important to an individual or a group, may not
constitute research problems as stated. The noun “problem” has conventional and
technical meanings. In the conventional sense, a problem is a set of conditions
needing discussion, a decision, a solution or information. A research problem implies
the possibility of empirical investigation – that is, of data collection and analysis.
Explanations of how to do something, vague propositions and value questions
are not research problems as such, since they are beyond empirical research. In the
process of asking such questions, however, a researchable problem may emerge. A
research problem, in contrast with a practical problem, is formally stated to indicate
a need for empirical investigation. Quantitative research problems may be phrased as
questions or hypotheses. Let us look at the following examples:
What are the attitudes of 10th standard students toward professional courses? Is
there any difference in the attitudes of 10th standard students toward professional
courses and ITI courses? Each of these statements implies data collection and analysis.
Qualitative research problems are phrased as research statements or questions
but never as hypotheses. A research hypothesis implies deductive reasoning;
whereas qualitative research uses primarily inductive reasoning to suggest an
understanding of a particular situation or historical period. Qualitative problems
usually are phrased more broadly than Qualitative problems by using terms such
as “how”, “what” and “why”. Quantitative problems state the situation or context in
such a way as to limit the problem. A qualitative problem might be a study of one
specific situation, a person, one state or a historical period. Hence, any research
problem must be selected in such a way that there should be a possibility of
collecting the relevant data.
Criteria of a good Research Problem
To select a good research problem the following criteria should be kept in mind.
1. Variables: The problem should involve two or more variables, and the
relationship between the variables should be expressed.
2. Clarity and unambiguity: The problem should be stated clearly and
unambiguously. It may be stated either in question form or in declarative form.
3. Empirical verification: The problem should have the possibilities for
empirical verification. There should be no room for guessing. The variables should
be measurable.
4. Availability of guidance: Every research activity needs the help of others to
some extent. Hence, we should ensure that competent guidance is available in the
chosen field of study before selecting the problem.
5. Novelty: The problem should have originality. There should be no duplication
or copying from others’ work. Ignorance of prior literature leads to this kind of
problem.
6. Interest: The problem should be interesting to the investigator, so that he is
able to face and overcome the obstacles which may arise at every step.
7. Significance / Justification: A good research problem should involve an
important principle or practice. It should be worth the money, effort and time
spent. Further, it should add something to knowledge.
8. Experience and creativity: The problem should require a clear
understanding of the theoretical, empirical and practical aspects of the subject,
derived from personal experience and from a thorough review of the literature, so
that there is scope for creativity.
Sources of a Research Problem
Problems are identified initially as general topics. After doing the preliminary
work, the general topic is narrowed down to a specific problem. We get these general
topics in many ways. The most common sources are casual observations, deduction
from theory, review of the literature, current social issues, practical situations,
replication of previous work, and personal experiences and insights. Let us
discuss these below:
1. Casual observations: These are rich sources of questions and hunches.
Research questions can be suggested by observations of certain relationships for
which no satisfactory explanation exists. Even by observing the existing practices
we may select a research question.
E.g. By observing the relationship between systematic counseling in schools and
school effectiveness, we may select a research problem such as “Does systematic
counseling enhance school effectiveness?”
2. Deduction from theory: Theories are general principles whose applicability
to specific educational or psychological problems is unknown until tested
empirically. The validity and scope of organizational and many other theories might
be tested under educational conditions. Such studies could verify the usefulness of
a theory for explaining educational occurrences.
E.g. Organizational theory suggests that fringe benefits enhance workers’
motivation. This may be taken as a research problem: “Will fringe benefits
enhance the motivation of workers in a particular organization?”
3. Related literature: A thorough review of related journals, text books,
monographs, government reports etc. may provide a research problem. While
reading these, we may find a gap between theory and existing practice. Hence, we
may select a research problem in such a way that it fills this gap. Reports on
psychological problems, opinions of psychologists, or general news on
psychological issues appearing in newspapers and magazines may also provide
the researcher with material to work on.
E.g. From the H.S.C. results in a newspaper, we can select a problem such as
“Why do female students score higher than males in the H.S.C. examination?”
4. Current social and political issues: The women’s movement raised
questions about sex equity in general and about sex stereotyping of educational
materials and practices. The civil rights movement led to research on the education
of minority children and the effects of desegregation on racial attitudes, race
relations, self-concept, achievement etc. So, by observing the current social and
political issues we may select the research problem.
E.g. The voting patterns of educated and uneducated voters in the current election
may be taken as a research problem.
5. Practical situations: Questions for research problem may focus on
educational needs; information for program planning, development and
implementation; or the effectiveness of a practice.
E.g. “What is the impact of the D.P.E.P. programme in villages?” may be taken as a
research problem.
6. Replication of previous work: Certain research works may appear
incomplete, unverified or insufficiently elaborated. A researcher may take up the
same problem and try to revise, modify or develop it.
E.g. Emotional intelligence is viewed as a set of emotional competencies. We may
take this view up again for testing and, if possible, elaborate it.
7. Personal experiences and insights: A researcher may come across certain
problems in his own life. Hence, personal experiences and insights may suggest
research problems that should be examined more in depth through qualitative
methodologies.
E.g. A teacher who has worked with exceptional children can recognize the
meaning in a situation involving exceptional children more readily than anyone
else, and he can bring greater concern to research on the problems of exceptional
children.
Conditions for a Research Problem
The following conditions are to be satisfied when a research problem is selected.
1. There must be an individual or a group that has some difficulty or the
problem.
2. Certain objectives need to be achieved.
3. Alternative courses of action should be available.
4. The researcher should be able to arrive at tentative solutions.
5. The environment to which the difficulty relates should also be specified.
Techniques Used to Define a Research Problem
The research problem should be defined in a systematic manner, giving due
weightage to all relevant points. The technique for this purpose should involve the
following steps one after the other.
i) Statement of the problem in a general way.
ii) Understanding the nature of the problem.
iii) Surveying the available literature.
iv) Developing the ideas through discussion.
v) Rephrasing the research problem.
In addition to the above, the following points must also be observed while
defining a research problem.
1. Technical terms and words or phrases with special meaning used in the
statement of the problem should be clearly defined.
2. Basic assumptions or postulates (if any) relating to the research problem should
be clearly stated.
3. A straightforward statement of the value of the investigation (i.e. the criteria for
the selection of the problem) should be provided.
4. The suitability of the time-period and the sources of data available must be
considered by the researcher in defining the problem.
5. The scope of the investigation and the limits within which the problem is to be
studied should be mentioned explicitly in a research problem.
Formulation and Stating the Problem
After the selection of a good research problem, it should be formulated and
stated in precise terms. It may be stated in declarative form or in the interrogative
form. While stating the problem the following steps have to be followed.
1. Definition of the problem: To define a research problem, it has to be
specified in detail and in precise terms. Each question and subordinate question
may also be specified. Technical and unusual terms should be explained, to
minimize misinterpretation. For this purpose, operational definitions, if any,
may be provided. An operational definition assigns meaning to a variable by
specifying the activities or operations necessary to measure, categorize or
manipulate the variable. Operational definitions tell us what is necessary for
answering the question or testing the hypothesis.
2. Limitations and delimitations: The limitations, scope and delimitations of a
problem should be stated clearly and precisely. Limitations are factors which are
beyond the control of the investigator, and delimitations are the boundaries within
which the problem has been worked out. Also, the scope of the problem must
be presented.
3. Justification of the problem: The need, worthiness and requirements of the
problem should be justified. This will prevent the pursuit of unimportant problems
and the wastage of time, money and personnel.
4. Evaluating the problem: Before taking the problem for consideration, the
following questions should be raised to ensure the significance of the problem.
1. Can this type of problem be effectively solved through the process of
research?
2. Does this problem involve any principle or practice?
3. Would the solution make any difference as far as theory or practice is
concerned?
4. Is the problem a new one?
5. Is researching this problem feasible?
If the above questions are answered in the affirmative, then the problem
is considered important, appropriate and significant. Further, there is no clear
distinction between an important problem and an unimportant problem. Some
problems are more likely to contribute to the advancement of psychology than
others. Experimenters should try to choose what they consider an important
problem rather than a relatively unimportant one. Any researcher is free to work on
whatever problem he chooses.
Hence, we should select a research problem in accordance with the present
requirements. A formal problem statement may be phrased as a statement of
research purpose, as specific research questions or as a research hypothesis,
depending on the purpose of the study and the selected research design.
SUMMARY
Research is a means to acquire dependable and useful information. The study of
related literature (or literature survey) is one of the important stages in research,
which helps the researcher to avoid duplication of work, to define the problem, and
to select promising methods and measures. Any research should start with the
statement of a problem. A research problem implies the possibility of empirical
investigation. We may use eight important criteria to select a good research
problem. There are various sources of research problems, such as casual
observations, deduction from theory, review of the literature and so on. There are
several conditions to be satisfied in selecting a research problem, and various
techniques are used to define it. After formulation, the research problem should be
stated clearly and precisely.
KEY TERMS
Related literature Research problem
Research design Research methodology / strategy
Research report Scope
Limitations Delimitations
QUESTIONS
1. Discuss the major stages in research.
2. What is “review of related literature”? Delineate the important uses of review
of related literature.
3. What is a research problem? Explain the criteria for selecting a good
research problem.
4. Explain the various sources of a research problem.
LESSON – 12
MAJOR STAGES IN RESEARCH (Contd… )
OBJECTIVES
After reading this lesson the student should
 Understand the meaning of hypothesis.
 Explain the various types of hypothesis.
 Describe the method of deriving hypothesis.
 Explain the basic concepts in testing the hypothesis.
SYNOPSIS
Hypothesis meaning – Types and formulation of hypothesis.
INTRODUCTION
A scientific investigation starts with the statement of a solvable problem.
Following this, a tentative solution to that problem is offered in the form of a
proposition. The proposition must be testable – it must be possible to determine
whether it is probably true or false. Thus a “hypothesis is a testable proposition
that may be the solution of a problem”. If, it is found that the relevant hypothesis is
probably true, then we may say that the hypothesis solves the problem. If, the
relevant hypothesis is false we may say that it does not solve the problem. For
example, take a problem “which factors causes higher achievement”? Our
hypothesis is “people who are intelligent and who show a strong aptitude in
learning cause higher achievement”. If the collection and interpretation of data
confirm the hypothesis we can say that we have solved the problem. On the other
hand, if we fail to confirm our hypothesis, we can say that we have not solved the
problem. i.e.) We have failed to obtain definite information that these specific
factors contribute higher achievement.
When we obtain a true hypothesis which solves a problem, we may state
that the hypothesis explains the phenomenon with which the problem is concerned.
A problem exists because we are in possession of a certain fact. This fact requires
an explanation. The need to explain the fact presents us with our problem. If we
can relate that fact to some other fact in an appropriate manner, we can say that
the first fact is explained. A hypothesis is a tool by which we seek to accomplish
such an explanation. Hence, we use a hypothesis to state a possible relationship
between one fact and another. If we find that the two facts are actually related in
the manner stated by the hypothesis we have explained the first fact. If we use the
term “variable” in the place of fact now we can say that a hypothesis is a testable
statement of a potential relationship between two or more variables.
MEANING OF HYPOTHESIS
A hypothesis may be defined as a tentative proposition suggested as a solution to
a problem. It is a suggested solution to a problem which is later tested by an
investigator.
Lundberg defined a hypothesis as “a tentative generalisation, the validity of which
remains to be tested”. In its elementary stage, “it is a guess, an imaginative idea,
which becomes the basis for further investigation”.
Barr and Scates defined a hypothesis as “a statement temporarily accepted as true
in the light of what is at the time known about a phenomenon, and it is employed as
a basis for action in the search for new truth”.
A hypothesis must be stated as a synthetic statement. Only in a synthetic form is
it capable of being proven (probably) true or false. (A synthetic statement is one
that may be either true or false; more precisely, it has a probability of being true
or false.) Usually, in stating a hypothesis, we attempt to say something informative
about the natural world, and hence we have a chance of proving it either true or
false.
Hence, we can understand that a hypothesis is a statement of a potential
empirical relationship between two or more variables and also it is possible to
determine whether a hypothesis is probably true or false.

Uses / Purposes of Hypothesis


1. A hypothesis provides tentative explanations and facilitates the thinking process
– it tells the investigator where to begin and how to begin.
2. It provides the researcher with a relational statement that can be directly tested
in a study – questions cannot be tested directly. For example, we state the
following hypothesis: “Special tuition enhances the test performance of
students”. The hypothesis contains no question, but we can test the
relationship between the two variables, i.e., special tuition and performance.
3. A hypothesis gives direction to the researcher – it provides the basis for selecting
the sample and the methodology to be used.
4. It helps the researcher to keep the study within a restricted scope – it prevents
the broadening of the study.
5. It provides a framework for reporting the conclusions of the study.
6. It prevents blind research and the indiscriminate gathering of masses of data
which may later prove irrelevant (P.V. Young).
Hence, a hypothesis serves as the investigator's “eyes” in looking for answers to the
tentative statement. It focuses research by providing clear and specific goals.
Hypotheses are “a sort of guiding light in the world of darkness”, “islands in the
stream of thought”, and they serve “as a powerful beacon that lights the way for the
research worker”.

Types of Hypothesis
The goal of any scientific enquiry is to assert a hypothesis in universal fashion.
There are two ways of stating the hypothesis viz. universal and existential
hypothesis.

i) Universal hypothesis: It asserts that the relationship in question holds for
all the variables that are specified, for all time and at all places.
E.g. “Students who experience anxiety would perform very poorly”.
This is a universal hypothesis because it applies to any group of students
anywhere in the world who experience anxiety, at any time. The relationship holds
for all time and at all places.
ii) Existential Hypothesis: It asserts that the relationship stated in the
hypothesis holds for at least one particular case (‘existential’ implies that one
exists).
E.g. “At least one group of students who experience anxiety performs poorly”.
This is an existential hypothesis since it pertains to one group. This form
of hypothesis is very useful in psychological work, since many times a psychologist
asserts that a given phenomenon exists regardless of how frequently it occurs.
Usually, scientists prefer universal statements to specific statements. A more
general statement has greater predictive power, whereas a very specific statement
has extremely limited predictive power. A hypothesis with limited predictive power
would not be very useful for predicting future occurrences. But to assert a
relationship in a specific phenomenon, or for specific criteria, we should use an
existential hypothesis.

Forms of Hypothesis
There are three forms of stating a hypothesis, viz. the declarative form, the null
form and the question form.

a) Declarative form
This form generally states the relationship between the variables that the
researcher expects to emerge.
E.g. There is a significant relationship between anxiety and performance.

b) Null Form
A null hypothesis states that no relationship exists between the variables
concerned.
E.g. There is no relationship between anxiety and performance.
The logic behind the null hypothesis is that we should start a scientific
study without any prejudice. If we state at the outset that some relationship exists,
it may bias the findings. Hence it is assumed that the null hypothesis carries no
bias and can be used in any scientific inquiry.

c) Question Form
A hypothesis can also be stated in a question form.
E.g. Is there any significant relationship between anxiety and performance?

Criteria for a Hypothesis


After formulating a hypothesis we must determine whether it is a good one.
We test a hypothesis to determine whether the data confirm or
disconfirm it. A confirmed hypothesis is better than a disconfirmed hypothesis
since it offers a solution to the problem and provides some additional knowledge
about nature. The following criteria may be used to test the goodness of any
research hypothesis.
1. The hypothesis should be testable
2. It should state the relationship between the variables.
3. It should be limited in scope.
4. It should be consistent with known facts.
5. It should be in general harmony with other hypotheses in the field of
investigation.
6. It should be stated in simple and specific terms (i.e. self – explanatory).
7. It should be expressed in a quantified form or be susceptible to convenient
quantification.
8. It should answer the problem (i.e. to be relevant to the problem)
9. The hypothesis should be parsimonious.
(i.e. if two hypotheses are advanced to answer a problem, the simpler one
should usually be preferred).
10. It should have logical simplicity.
(i.e. if one hypothesis can account for a problem by itself, and another
hypothesis can also account for the problem but requires a number of
supporting hypotheses, the former is preferred for its greater logical
simplicity).
11. The hypothesis should have a large number of consequences.
(i.e. a hypothesis that yields a large number of consequences will explain
more facts and make more predictions about events that are not yet studied or
established; such a hypothesis will be a more fruitful one).
Methods of Deriving Hypothesis
There are two important methods of deriving hypothesis. They are inductive and
deductive methods.

i) Inductive Method
This method involves the formulation of a hypothesis as a generalization from
observed relations. Usually, the researcher observers or notices the trends or
relationship and then formulates a hypothesis as an explanation of the observed
phenomenon.
E.g. A psychologist observing a student may state the hypothesis “The
individual's performance is affected by test anxiety”.

ii) Deductive Method
A hypothesis derived from a theory is a deductive hypothesis.
Knowledge can be cumulative only by building on the existing body of theory. It
cannot develop to a sufficient level if each study remains an isolated one. Hence,
from a theory we form generalizations, from which we may derive hypotheses.
There are three ways of deducing hypotheses.

a) Deduction from Existing Theories
Every important theory provides adequate material for deducing new
hypotheses.
E.g. Spearman's theory of general intelligence asserts that all forms of human
mental performance are based on the ‘g’ factor of intelligence.
From this theory, we can deduce an experimentally testable hypothesis such as
“a higher level of intelligence in pupils enhances educational performance”.

b) Deduction from Existing Studies
The results of existing research studies can be used either singly or in
combination to deduce new hypotheses.
E.g.
1. School adjustment can be increased through systematic counselling.
2. School achievement increases if school adjustment is increased.
From the above two findings we may deduce the hypothesis “systematic
counselling can be used to improve school achievement”.

c) Deduction from Personal Experiences
The first-hand experience of an investigator (as a psychologist, counsellor,
headmaster, principal, curriculum specialist etc.) in the area under investigation
will provide the basis for a research hypothesis.
E.g. A psychologist who finds a difference between government organizations
and private organizations may form the hypothesis “workers' morale is lower in
government organisations than in private organisations”. He may then start a
study to find out the reasons for the low morale in government organizations.

TESTING OF HYPOTHESIS
After formulating the hypothesis, it should be tested to prove it either true or
false. To test a hypothesis a researcher:
1. Deduces the consequences that should be observable if the hypothesis is
correct.
2. Selects the research methods that will permit the observation or
experimentation.
3. Applies the methods and gathers data that can be analysed to indicate whether
or not the hypothesis is confirmed.

Further, for testing a hypothesis the following basic concepts should be explained.
Basic Concepts in Testing the Hypothesis
1. Null Hypothesis and Alternative Hypothesis
The null hypothesis and an alternative hypothesis are chosen before the sample
is drawn. A null hypothesis states that there is no significant relationship between
the variables (it is denoted as Ho), whereas the alternative hypothesis states that
a significant relationship exists between the variables (it is denoted as Ha).
In the choice of the null hypothesis the following considerations are usually taken
into account.
a) The null hypothesis represents the hypothesis we are trying to reject, and the
alternative hypothesis should represent all other possibilities.
b) If rejecting a certain hypothesis when it is actually true involves great risk, that
hypothesis should be taken as the null hypothesis.
c) A null hypothesis should always be a specific hypothesis, i.e. it should state an
exact value rather than an approximate or range value for the criterion.
Generally, in hypothesis testing we proceed on the basis of the null hypothesis,
keeping the alternative hypothesis in view. This is because, if the null hypothesis is
true, one can assign probabilities to the different possible sample results, but this
cannot be done if we proceed with the alternative hypothesis. Hence, most of the
time we proceed with a null hypothesis.
2) The Level of Significance
The level of significance is the maximum probability of rejecting Ho when it is
true, and it is usually determined in advance, before testing the hypothesis.
Usually the 5% level is chosen, with great care, thought and reason. The
5% level of significance (0.05 level) implies that Ho will be rejected when the
sampling result has a less than 0.05 probability of occurring if Ho is true. In other
words, the 5% level means the researcher is willing to take as much as a 5% risk of
rejecting the null hypothesis when Ho happens to be true.
3) Decision Rule
The decision rule states, for the particular level of significance chosen, the
conditions under which the null hypothesis is to be rejected or accepted. For
example, suppose the rule is that if five or fewer defective items are found among
100 tested items we will accept Ho, and otherwise we will reject Ho. Such a
statement is known as the decision rule.
4) Type I and Type II error
There are two possible errors in testing a hypothesis. We may reject Ho when Ho
is true, or we may accept Ho when it is not true. The former is a Type I error (α
error) and the latter a Type II error (β error).
111

                        Decision
                 Accept H0               Reject H0
H0 (true)        Correct decision        Type I (or α) error
H0 (false)       Type II (or β) error    Correct decision

A Type I error can easily be minimized by taking the 0.01 level of significance,
which reduces the probability of rejecting Ho. But if the Type I error is reduced,
the Type II error increases; it is not possible to reduce both errors simultaneously.
Hence, in testing a hypothesis, we should make every possible effort to strike an
adequate balance between Type I and Type II errors.
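This trade-off can be seen in a small simulation. The sketch below (all numbers hypothetical, standard library only) repeatedly samples from a population in which Ho is actually true and counts how often a two-tailed z test rejects it at the 0.05 level; the rejection rate settles near α, and re-running with alpha = 0.01 lowers it at the cost of a greater Type II risk.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)                                  # reproducible illustration
mu0, sigma, n, alpha = 50.0, 10.0, 25, 0.05     # all values hypothetical
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value (about 1.96)

trials, rejections = 2000, 0
for _ in range(trials):
    # draw a sample from a population in which Ho is actually true
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    if abs(z) > z_crit:
        rejections += 1                         # every rejection here is a Type I error

print(rejections / trials)                      # close to alpha, i.e. about 0.05
```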
5) Two-tailed and One-tailed tests
In a two-tailed test there are two rejection regions. It rejects the null hypothesis
if the sample mean is significantly higher or lower than the hypothesised value of
the mean of the population.
In a one-tailed test there is only one rejection region. It rejects the null
hypothesis if the sample mean is significantly lower (or, depending on the direction
stated, significantly higher) than the hypothesised value of the mean of the
population. In most psychological testing we use one-tailed tests rather than
two-tailed tests.
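The relationship between the two kinds of test can be illustrated with a short sketch using Python's standard-library NormalDist; the z value of 1.8 is a hypothetical sample statistic, not one taken from the text.

```python
from statistics import NormalDist

def p_values(z: float) -> tuple[float, float]:
    """Return (one_tailed, two_tailed) p-values for a z statistic."""
    phi = NormalDist().cdf(abs(z))      # P(Z <= |z|) under the null hypothesis
    one_tailed = 1 - phi                # rejection region in one tail only
    two_tailed = 2 * (1 - phi)          # rejection regions in both tails
    return one_tailed, two_tailed

one, two = p_values(1.8)                # hypothetical z statistic
print(round(one, 4), round(two, 4))
```

With z = 1.8 the one-tailed p is about 0.036 (significant at the 0.05 level) while the two-tailed p is about 0.072 (not significant), which is why the form of the test must be fixed before the data are examined.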

Procedure for hypothesis testing
The procedure for hypothesis testing involves all the steps that we undertake in
making a choice between two actions, i.e. rejecting or accepting a null hypothesis.
The various steps in hypothesis testing are as follows:
1. Stating the null and alternative hypotheses.
2. Selecting a significance level.
3. Deciding the distribution.
(Here, the choice is generally between the normal distribution and the ‘t’
distribution.)
4. Selecting a random sample and computing an appropriate value (test statistic).
5. Calculating the probability.
6. Comparing the probability with the significance level and either accepting or
rejecting Ho.
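The six steps can be sketched for a one-sample z test (normal distribution, population sigma assumed known); every number below is hypothetical, and the z test is a standard illustration rather than a procedure prescribed by the text.

```python
from math import sqrt
from statistics import NormalDist, mean

# Step 1: state Ho (population mean = 100) and Ha (mean != 100, two-tailed)
mu0 = 100.0
# Step 2: select a significance level
alpha = 0.05
# Step 3: decide the distribution -- normal, since sigma is assumed known
sigma = 15.0
# Step 4: a hypothetical random sample and its test statistic
sample = [108, 112, 96, 104, 99, 110, 103, 107, 95, 106]
z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
# Step 5: calculate the probability of a result at least this extreme under Ho
p = 2 * (1 - NormalDist().cdf(abs(z)))
# Step 6: compare with alpha and decide
decision = "reject H0" if p < alpha else "accept H0"
print(round(z, 2), round(p, 3), decision)
```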
SUMMARY
A hypothesis is a tentative proposition which may become a solution to a
problem. There are various types and forms of hypotheses, and a hypothesis can be
derived either by the inductive or the deductive method. It is a testable proposition
which can be tested using a decision rule. The basic concepts in testing a
hypothesis and the procedure for testing a hypothesis indicate that it is in the
hands of the researcher to minimize the effects of Type I and Type II errors. By
selecting large random samples we can minimize these errors as much as possible.

KEY TERMS
Hypothesis Null hypothesis
Alternative hypothesis Universal hypothesis
Existential hypothesis Inductive method
Deductive method Level of significance
Type I and Type II error One tailed and two-tailed test.
QUESTIONS
1. Define “hypothesis”. State the important uses of hypothesis.
2. Discuss the types and various forms of stating hypothesis.
3. Discuss the criteria for selecting a “good hypothesis”
4. Explain the basic concepts in testing the hypothesis and state the various
steps in the testing of hypothesis.
LESSON – 13

MAJOR STAGES IN RESEARCH (Contd… )

OBJECTIVES
After reading this lesson the student should
• Know the meaning of sampling.
• Understand the steps in sampling.
• Describe various methods of sampling.
• Explain the sample size.
SYNOPSIS
Sampling: Meaning – Probability and Non-probability sampling.

INTRODUCTION
The primary purpose of research is to discover principles that have universal
application. But, a study of whole population in the universe for arriving at
generalization is impracticable and also not possible. Most of the time, populations
are so large that their characteristics cannot be measured also the populations may
be changed before the measurement is completed. The process of sampling makes it
possible to draw valid inferences or generalizations on the basis of careful
observation of variables within a relatively small proportion of the population. A
measured value based upon sample data is a ‘statistic’. A population value inferred
from a statistic is ‘parameter’.
A population is any group of individuals that have one or more characteristics
in common that are of interest to the researcher. The population may consist of all
individuals of a particular type or a more restricted part of that group. For
example, suppose we want to measure the attitude of high school students toward
normal education in Tamil Nadu. All the high school students in Tamil Nadu are
then our “population”.
A sample is a small proportion of a population selected for observation and
analysis. By observing the characteristics of the sample, one can make certain
inferences about the characteristics of the population from which it is drawn. In
the previous example, it would be too difficult to conduct the attitude survey in all
the high schools in Tamil Nadu. Hence, we may select a few high schools in and
around Annamalainagar for conducting the survey. These schools constitute the
“sample”.

SAMPLING
In order to collect the data for a research problem, the researcher has to select a
representative sample from the population so as to enable him to draw conclusions
from the sample which will hold good for the population.
Sampling is the process by which a relatively small number of individuals, or
measures of individuals, objects or events, is selected and analysed to enable the
researcher to make generalisations about the entire population from which the
sample is drawn. The essence of sampling is the selection of a “part” (sample) from
the “whole” (population) in order to make inferences about the “whole”.

Sampling Design
A sampling design is a definite plan for obtaining a sample from a specific
population. It refers to the technique or procedure the researcher adopts in
selecting items for the sample. Any researcher must prepare a sampling design
before he starts the data collection. He must plan how a representative sample
should be selected and of what size such a sample would be. There are many
sampling designs from which a researcher can select a representative sample. The
researcher must select or prepare a sampling design which is reliable and
appropriate for his research study.

Steps in Sampling
In any sampling design the researcher should adopt the following steps:
1. Defining the Population / Universe
The first step in any sampling design is to clearly define the set of objects,
technically termed the ‘universe’, to be studied. The universe may be a specified
group of human beings or of non-human entities such as objects, educational
institutions, time units and so on. A population containing a finite number of
units, individuals or members is called a finite population (e.g. students studying
in a particular school). A population with an infinite number of units or members
is called an infinite population (e.g. listeners of a specific radio programme,
readers of a specific newspaper). Hence, we should avoid ambiguity when we
define the population.
2. Sampling Unit
A sampling unit may be a geographical unit such as a village, district or state; a
social unit such as a family, school, college or university; or an individual. A
decision must be taken concerning the sampling unit before selecting a sample.
The researcher should decide on one or more of such units for his study.
3. Source List / Sampling Frame
After deciding the sampling unit, the researcher has to prepare a source list or
sampling frame. In order to select a sample from the population, it is necessary to
have a complete, accurate and up-to-date list of all the units in the population;
this is termed the ‘sampling frame’. It should be prepared before a sample is
drawn to represent the population.
4. Size of Sample
A good sample must be representative of the entire population. In the case of a
homogeneous population, a small sample (fewer than 30) is enough. But in the
case of a heterogeneous population a large sample must be drawn. For survey
research with a heterogeneous population the sample size should be considerably
larger, otherwise the sample may not be representative of the whole population.
5. Sampling Procedure
The technique used to select a representative sample from the population is
known as the sampling procedure. A number of sampling designs are available in
research methodology; we should select the appropriate design depending on the
specific purpose, the situation and availability.

Characteristics of a good sample design


1. It should result in a truly representative sample.
2. A good sampling design should minimise the sampling error as much as
possible.
3. Systematic bias should be controlled by the sampling design.
4. Its results should be generalisable to the universe with a reasonable level of
confidence.

Types of Sampling
The sampling design is divided into two broader categories viz. probability and
non-probability sampling. Probability sampling is based on the random selection,
whereas non-probability sampling is ‘non-random’ sampling. In each design there
are different types of sampling designs. Let us discuss the types of sampling here.

SAMPLING

Probability sampling:
  1. Unrestricted (simple) random sampling
  2. Stratified random sampling
  3. Systematic sampling
  4. Cluster and multistage sampling
  5. Incidental sampling
  6. Double sampling

Non-probability sampling:
  1. Judgement (or purposive) sampling
  2. Quota sampling

a) Probability Sampling
In probability sampling, the units are selected by means of procedures which
ensure that every unit of the population has a known probability of being included
in the sample. This is also called “random sampling”.

Randomness
The concept of randomness is basic to scientific observation and research. It
rests on the assumption that, while individual events cannot be predicted with
accuracy, aggregate events can. For instance, although one may not be able to
predict with great accuracy an individual's academic achievement, one can predict
the average academic performance of a group.
Randomization has two important applications in research, viz.
i) selecting a group of individuals for observation who are representative of the
population about which the researcher wishes to generalise; and
ii) equating experimental and control groups in an experiment. Assigning
individuals by random assignment is the best method of providing for equivalence.
It is important to note that a random sample is not necessarily an identical
representation of the population. The characteristics of successive random
samples drawn from the same population may differ to some degree, but it is
possible to estimate their variation from the population characteristics and from
each other. This variation, known as ‘sampling error’, does not suggest that a
mistake has been made in the sampling process; sampling error refers to the
chance variations that occur in sampling. With randomization, these variations
are predictable and can be taken into account in the data analysis. Let us discuss
the various random sampling techniques.

i) Unrestricted Random Sampling
In simple or unrestricted random sampling each unit of the population is given
an equal chance of being selected in the sample. The selection of units from the
population is done in such a manner that every unit has an equal chance of being
chosen and the selection of any one unit is in no way tied to the selection of any
other. The ‘law of chance’ is allowed to operate freely in the selection of such a
sample, and carefully controlled conditions are created to ensure that each unit in
the population has an equal chance of being included in the sample. To prevent
the researcher from biasing the results by exercising direct control over the
selection of units, several devices are employed to draw samples from the
population. The lottery method and the table of random numbers by Fisher and
Yates are usually used in the selection of random samples.
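The lottery method can be mimicked in a few lines with Python's `random.sample`, which draws without replacement and gives every unit an equal chance of selection; the population of 500 roll numbers is hypothetical.

```python
import random

random.seed(42)                          # fixed seed for a reproducible draw
population = list(range(1, 501))         # hypothetical roll numbers 1..500
sample = random.sample(population, 25)   # 25 units, drawn without replacement

print(sorted(sample))
```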

ii) Stratified Random Sampling
At times a random sample, particularly a small one, may by chance have an
undue proportion of one type of unit in it. It is therefore necessary to make
certain that the units included in the sample are selected in proportion to their
occurrence in the population. When the units in a sample are proportional to their
presence in the population, the sample is said to be stratified.
In stratified random sampling it is advisable to subdivide the population into
smaller homogeneous groups to get more accurate representation. When
employing this method a researcher divides his population into different strata by
some characteristic which is already known from previous theory, research and so
on. Thus, in addition to randomness, stratification introduces a secondary
element of control as a means of increasing precision and representativeness.
The usual stratification factors are sex, age, socio-economic status, educational
background, locality (rural/urban), occupation, religion, caste and so on. The
efficiency of the stratified random sample depends upon the allocation of sample
units among the strata. The simplest and most common system is to allocate
sample units among the strata in proportion to the size of the strata. If the
number of units selected from the different strata is proportional to the total
number of units in each stratum, the sample is said to be selected with
proportional allocation.
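Proportional allocation can be sketched as follows; the strata sizes are hypothetical, and with these round numbers the allocated counts sum exactly to the sample size (in general, a remainder rule is needed after rounding).

```python
import random

random.seed(7)
strata = {"rural": 600, "urban": 300, "semi-urban": 100}   # hypothetical strata sizes
n = 50                                                     # total sample size
total = sum(strata.values())

# each stratum contributes n * (stratum size / population size) units
allocation = {name: round(n * size / total) for name, size in strata.items()}
print(allocation)            # {'rural': 30, 'urban': 15, 'semi-urban': 5}

# a simple random sample of the allocated size is then drawn within each stratum
samples = {name: random.sample(range(size), allocation[name])
           for name, size in strata.items()}
```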

iii) Systematic Sampling
If a population can be accurately listed, or is finite, a type of systematic
selection will provide a random sample. A systematic sample consists of the
selection of every kth unit from a list. In systematic sampling the researcher
generally starts with a list in which all the units of the population appear in
alphabetical or some other order. To select a sample of size n, he selects a unit at
random from the first k units of the list and then every kth subsequent unit. The
number k is so chosen that nk is equal to, or just less than, the size of the
population.
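The selection can be sketched as below (population size and sample size hypothetical): the interval k is fixed first, a random start is chosen within the first k units, and every kth unit is taken thereafter.

```python
import random

random.seed(3)
N, n = 500, 25                 # hypothetical population and sample sizes
k = N // n                     # sampling interval, so that n * k <= N
start = random.randrange(k)    # random start among the first k units (0-based)
sample = list(range(start, N, k))[:n]   # every kth unit thereafter

print(k, start, len(sample))
```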

iv) Cluster and Multistage Sampling
The area or cluster sample is a variation of the simple random sample. In this
method, instead of an individual, a group is taken as the unit of sampling. This
method is appropriate when the population of interest is infinite, when a list of the
members of the population does not exist, or when the geographic distribution of
the individuals is widely scattered.
For example, in order to obtain a sample of 10th standard students in Tamil
Nadu, the researcher can take 20 high schools at random and study all the 10th
standard students in those 20 schools. Here, the school is taken as the unit of
sampling. Alternatively, we may take the classroom as the unit, so that all the
10th standard class divisions are treated as the population and a random selection
of class divisions yields the sample.
This method is very popular in education and social research. Multistage
sampling is used in large-scale surveys for more comprehensive investigation; the
researcher may have to use two-, three- or four-stage sampling. This method of
sampling is likely to introduce an element of sample bias due to the unequal size
of some of the subsets selected. Hence, it should be adopted only when a simple
random sample would be impracticable.

v) Incidental Sampling
The term is applied to a sampling procedure in which a sample is used just
because it is the only sample available for the purpose. For example, consider a
study of the reactions of students toward educational broadcasts. The usual
method is to obtain a representative sample and conduct a survey to collect
opinions. But often the officials do not have the time or facilities to conduct such
a study, yet they want to make a quick (and crude) assessment of the reactions.
Hence, they depend upon the letters received by the office about the programmes
and use these letters to study the students' reactions. We know that the sample is
not really representative of the population of students; however, we use it as a
reasonably good indication of the students' reactions.

vi) Double Sampling
This method consists of obtaining a smaller sample from an already selected
larger sample; in other words, it involves sampling within a sample. Suppose a
large sample of 3000 individuals, selected by a random sampling technique, is
used to standardize an attitude scale, and we wish to take a further sample of 500
from this sample using stratified sampling. Suppose the required numbers of boys
and girls in the smaller sample are 300 and 200 respectively. We isolate the
scores of all the boys in the original sample and select the scores of 300 boys by
random selection; similarly, 200 girls are selected. The two small samples are
then pooled to yield the required ‘sub-sample’.
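The two-phase selection described above can be sketched as follows; the first-phase sample and its sex tags are fabricated for illustration.

```python
import random

random.seed(11)
# hypothetical first-phase random sample: 3000 respondents tagged by sex
first_phase = [("boy" if i % 2 == 0 else "girl", i) for i in range(3000)]

boys = [unit for unit in first_phase if unit[0] == "boy"]
girls = [unit for unit in first_phase if unit[0] == "girl"]

# second phase: a stratified sub-sample of 500 drawn within the first sample
sub_sample = random.sample(boys, 300) + random.sample(girls, 200)
print(len(sub_sample))
```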

b) Non-Probability Sampling
In non-probability sampling the units are selected at the discretion of the
researcher. Such a sample derives its character from the judgement of the
researcher. If the researcher wants to select a sample of 200 school students, he
may exercise his own judgement, based on his experience, in including a given
student in the sample. The sample so selected is called a judgement (or purposive)
sample. Such samples are arbitrarily selected because there is good evidence that
they are representative of the total population; here the evidence always depends
on the experience of the researcher.
Non-probability sampling methods are very convenient in situations where the
sample to be selected is very small and the researcher wants to get some idea of
the characteristics of the population in a short period. Such samples are also
used where the researcher does not want a representative sample but wants to
gain insight into the problem by selecting only informed persons who can provide
the maximum degree of insight along with comprehensive information. There are
two kinds of non-probability sampling, viz. judgement or purposive sampling and
quota sampling.
i) Judgement or Purposive Sampling
In this method the sample consists of whatever subjects are available, or the
sample is selected at the discretion of the researcher for a specific purpose, and
hence it is called a purposive sample. The researcher uses his experience as the
criterion for selecting the sample, and hence it is also called judgement sampling.

ii) Quota Sampling


This method is an extension of judgement sampling. In addition to the specific
purpose or judgement, the researcher adopts a certain strategy to select the
sample. For example, to select a sample of 200 high school students, the
researcher may select 100 boys and 100 girls. Among the 100 boys or girls, 50
from high and 50 from low socio-economic status may be selected; likewise, 50
from rural areas and 50 from urban areas may be selected. The researcher thus
uses a strategy, but selects the sample based on his judgement; since fixed quotas
are filled, it is called quota sampling.
The limitations of non-probability sampling are:
1. The results cannot be generalised.
2. The variance of the sample and the sampling errors cannot be
determined.
3. Sometimes the sample is based on an obsolete frame which does not
adequately cover the population.

SAMPLE SIZE
There is usually a trade-off between the desirability of a large sample and the
feasibility of a small one. The ideal sample is large enough to serve as an adequate
representation of the population about which the researcher wishes to generalize,
and small enough to be selected economically in terms of subject availability,
expense in time and money, complexity of data analysis and so on. There is no
fixed number or percentage of subjects that determines the size of an adequate
sample. It may depend upon the nature of the population of interest or the data to
be gathered and analysed.
A statistical test makes a distinction between large and small samples. Usually
a sample size of 30 and above is treated as a large sample, and a sample below 30
is treated as a small sample. Sampling distributions for large samples usually
follow the well-known normal distribution. But sampling distributions for small
samples follow different distributions for different sample sizes.
More important than the size of the sample is the care with which the sample is
selected. The ideal method is random selection. If random sampling is used,
whether the sample is large or small, the errors of sampling may be estimated,
giving researchers an idea of the confidence that they may place in their findings.
Hence, the following practical considerations about sample size should be
observed while selecting a sample.
1. The larger the sample, the smaller the magnitude of sampling error.
2. Survey type studies probably should have larger samples than needed in
experimental studies.

3. When sample groups are to be subdivided into smaller groups to be compared,
the researcher initially should select large enough samples so that the
subgroups are of adequate size for his purpose.
4. Subject availability and cost factors are legitimate considerations in determining
appropriate sample size.
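The first consideration above — larger samples yield smaller sampling error — can be checked with a small simulation. This is an illustrative sketch; the population, sample sizes and number of trials are arbitrary choices of mine:

```python
import random
import statistics

def empirical_se(population, n, trials=2000, seed=1):
    """Estimate the sampling error of the mean empirically: draw many
    random samples of size n and measure the spread of their means."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(trials)]
    return statistics.stdev(means)

population = list(range(1000))            # an arbitrary finite population
se_small = empirical_se(population, 10)   # small sample (n = 10)
se_large = empirical_se(population, 100)  # large sample (n = 100)
# se_large comes out noticeably smaller than se_small
```

Because the samples are drawn at random, the spread of the sample means shrinks as n grows, which is the sense in which a larger random sample carries a smaller sampling error.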

SUMMARY
Sampling is very essential for any type of research. The choice of sampling
method depends upon many considerations unique to each individual study. These
include the definition of the population, available information about the structure of
the population, the parameters to be estimated, the objectives of the analysis
including the precision required, and the financial and other resources available for
the study. However, we must ensure that the sample is representative of the
population so that the results hold good for the whole population.

KEY TERMS
Population Sample
Sampling Probability sampling
Non-probability sampling Sample size

QUESTIONS
1. What is sampling? Discuss the various steps in sampling.
2. Describe the various probability sampling with suitable examples.
3. Elucidate the difference between probability and non-probability sampling.
4. Write short notes on
a) Judgement sampling.
b) Quota sampling.




LESSON – 14

TOOLS OF RESEARCH AND TESTS

OBJECTIVES
After reading this lesson the student should
 Understand the criteria of selecting a research tool.
 Explain the factors related to tool construction.
 Describe how to conduct an interview
 Discuss the various forms of questionnaire.
SYNOPSIS
Criteria for selection of tools – Factors related to construction of tools – Tools of
different types: Observation – Interview – Questionnaire.

INTRODUCTION
To carry out the research investigation, data must be gathered from which to make
inferences. Many different tools and procedures have been developed to aid in the
acquisition of data. These tools employ distinctive ways of describing and
quantifying the data. Each one is particularly appropriate for certain sources of
data, yielding information of the kind and in the form that can be most effectively
used.
The frequently used methods in the psychological investigation are:
1. Observations 2. Interview
3. Questionnaire 4. Check list
5. Inventories 6. Rating scales
Most researchers become preoccupied with one method of inquiry and
neglect the potential of the others. Too much dependence upon a single method of
inquiry may not be sufficient to explain human behaviour as a whole. Since each
data-gathering procedure or device has its own particular weaknesses and biases,
there is merit in using multiple methods; supplementing one with others
counteracts bias and generates more adequate data.

CRITERIA FOR SELECTION OF TOOLS


There are two major classifications in research, viz. qualitative and quantitative
studies. Qualitative studies are those in which the description of observations is not
ordinarily expressed in quantitative terms. In qualitative research the
investigator may use observations, interviews and the examination of documentary
materials; little measurement may be involved. However, observations may be
classified into discrete categories, yielding nominal-level data. Further, projective
tests and opinions are also used in qualitative research.
In contrast to qualitative research, empirical data are collected in quantitative
research. In behavioural research many of the qualities or variables of interest are
abstractions and cannot be observed directly. It is necessary to define them in
terms of observable acts from which the existence and amount of the variables are
inferred. The operational definition of a particular psychological construct tells us
what the researcher must do to measure the variable. For example, intelligence is
an abstract quality that cannot be observed directly. Intelligence may be defined
operationally as scores achieved on a particular intelligence test. The interpretation
of quantitative research is sometimes subjective, which may lead experts to
disagree about its validity. The fact here is that the numerical data generated do
not ensure valid observation and description, for ambiguities and inconsistencies
are often represented quantitatively. But even with all these limitations,
quantitative methods are widely used as powerful research techniques. They have
played an essential role in the history and development of science as it progressed
from pure philosophical speculation to modern empirical and verifiable
observations. Effort is being made to develop more valid operational definitions and
better observation techniques.
Before employing the tools or tests we should ensure that they meet the following
important criteria. The criteria are:
1. Reliability (Discussed in lesson 15)
2. Validity (Discussed in lesson 16)
3. Economy – The expense of administering a test is often a significant factor
when a testing programme is operated on a limited budget.
4. Administration and Interpretation – The test should easily be administered,
scored and interpreted.
5. Interest – Tests that are interesting and enjoyable help to gain the
cooperation of the subject. Those that are dull or seem silly may discourage
or antagonize the subject. Under these unfavourable conditions, the test is
not likely to yield useful results.
Hence, in selecting a test, it is important to recognize that a good test does not
necessarily possess all the desirable qualities for all subjects on all levels of
performance. Within a certain range of age, maturity or ability a test may be
suitable. The selection should be made after careful examination of the
standardizing data contained in the test manual and extensive analysis of
published evaluations of the instrument. Research workers should select the most
appropriate standardized tests available.

Factors related to construction of tools


Although there are many instruments available to psychological researchers,
there are occasions when researchers have to develop their own instruments. The
most common situation that requires a locally developed measure is evaluation
research for a specific setting. A common approach is to develop an instrument
that seems reasonable and to gather pilot data on it to revise the instrument.
While developing an instrument, the researcher should follow a few basic steps if
faced with a situation that requires its development.
1. Become acquainted with common approaches to measure the trait or behaviour
of interest. There are many existing sources that summarize approaches for
measuring such things as attitudes, interests, personality and self-concept.
2. Write out specific objectives for your instrument with one objective for each trait
or behaviour of interest.
3. After reading about the area and discussions with others about what approach
would best measure the trait, brainstorm several items for each objective.
4. Ask professionals who are knowledgeable in the assessed area to review the
items, i.e. to examine whether each item is clear, unbiased, concise and so on.
5. Revise if necessary and find a small sample of individuals that is similar to
those that will be used in the actual study and administer the instrument to
them. This is called “pilot study”.
6. Check for clarity, ambiguity in sentences, time for completion, directions and
any problems that may have been experienced.
7. Check for an adequate distribution of scores for each item in the instrument.
i.e. There should be an opportunity for responding to both the extremes
(agreement to disagreement).
8. Revise, delete and add items where necessary, depending on feedback from the
sample subjects in the pilot study.
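Step 7 above — checking each item for an adequate distribution of scores — could be automated over pilot data roughly as follows. This is a sketch under stated assumptions: the 5-point agree/disagree scale and the 90% cut-off are illustrative choices, not prescriptions from the text:

```python
from collections import Counter

def item_fails_to_discriminate(responses, threshold=0.9):
    """Flag a pilot-study item when one response category (e.g. one
    point on a 5-point agree/disagree scale) absorbs nearly all the
    answers, so the item cannot distinguish between respondents."""
    counts = Counter(responses)
    return max(counts.values()) / len(responses) >= threshold

# nearly everyone answered 5 ("strongly agree"): revise or delete
skewed_item = [5] * 19 + [4]
# answers spread over the whole scale: keep
spread_item = [1, 2, 3, 4, 5] * 4
```

An item flagged this way offers no opportunity for responses at both extremes, which is exactly the defect step 7 asks the researcher to look for before revising or deleting items in step 8.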
The stages in test construction are discussed elaborately in lesson 17.
TOOLS OF DIFFERENT TYPES
i) OBSERVATION
To evaluate the overt behaviour of individuals in controlled and uncontrolled
situations, observation is used as a tool of research. In order to make the
observation reliable and valid, it must be conducted by an expert and be
purposive, systematic, carefully focussed, and thoroughly recorded. The
conditions necessary for sound and accurate observation are:
1) attention, 2) sensation, 3) perception, 4) conception

Observation as a tool of research is useful in descriptive research, in the
evaluation of certain personality traits, in the evaluation of physical aspects of the
school plant, and in observing the physical activities of students in games and in
classroom situations.

Types
Observation may be classified as participant and non-participant. In the first,
the observer is one of the participants, while in the second the observer does not
participate in the situation under watch.
A sound observation needs careful planning, expert execution and adequate
recording.

Planning
In planning for an observation, the observer should distinctly define the
behaviour to be observed in specific activities. He has to decide whether the
observation will be centred on an individual or a group of individuals. The
time to be spent in observation, the mode of observation and the mode of recording
should be decided upon before the actual execution of the observation.

Execution
Proper conditions should be arranged for the observation.

Recording
For a systematic collection of data through observation, sometimes recording is
done with the help of
i) Check lists, ii) Rating scales, iii) Score cards, iv) Scaled specimens, v) Blank
forms.
Interpretation of data based on observation needs professional skill and careful
handling, as it is quite complex and is likely to be distorted if care is not taken.

Guidelines to observe
1. Observe one situation at a time.
2. Have specific criteria for making observations.
3. Observations should be made over a period of time.
4. The thing should be observed in differing and natural situations.
5. The thing should be observed in the context of the total situation.
Merits
1. Being a record of actual overt behaviour, observation is more reliable and
objective.
2. It can be done in a natural situation.
3. It does not require any special tools or equipment.
4. It can be used in every situation.
5. It is adaptable both to individuals and groups.
Limitations

1. There is a great scope for personal prejudice and bias of the observer.
2. There may be some time lag in observing and recording things.
3. It is very difficult to observe everything.
4. It reveals overt behaviour only – behaviour that is expressed, not that which
is within.

ii) INTERVIEW
Meaning
Interview is one of the important and powerful tools for data collection in social
research. It is mostly a verbal method, but it is not only the words spoken which
matter; the gestures, facial expressions, pauses, modifications of voice etc.
also matter in an interview. Hence, Pauline V. Young defines interview as “an
effective, informal verbal and non-verbal conversation”.
All interviews have three elements in common:
1. A person to person relationship
2. A means of communication with each other
3. An awareness on the part of at least one of the persons of the purpose of
the interview.
Objectives
The main purpose of the interview as a tool of data collection is to gather
information needed to solve a research problem. P.V. Young brings out that the
objectives of the interview are:
1. Exchange of ideas and experience.
2. To elicit information pertaining to a particular phenomenon.
Steps in interviewing
1. Preparation for the interview and establishing rapport.
2. The unfolding of the problem.
3. The joint working out of the problem.
4. The closing of the interview.
5. Evaluation of the interview.
6. Follow-up of the interview.
Types (or) classification of interviews
i) Formal and Informal Interviews
If the interview is carried out with a well defined purpose and with a carefully
prepared list of questions, it is a formal interview. When the interviewer has full
freedom to make suitable alterations in the questions to suit a particular situation,
it is an informal interview.
ii) Personal and Group Interviews

A personal interview is carried out with a single individual and a group interview
is carried out with a group of persons.
iii) Qualitative and Quantitative Interviews
Qualitative interviews are about complex subject matter. e.g. Interview held for
case studies.
Quantitative interviews are those in which certain facts are gathered about a
large number of people e.g. census interviews.
iv) Research Interviews
These interviews are held to gather information pertaining to certain problems.
Questions are prepared and the responses are collected for the purpose of solving a
research problem.

v) Some Other Types of Interviews


a) Diagnostic Interview – To understand the causes of a malady, i.e. to grasp the
nature and causes of a disease.
b) Treatment Interview – To understand the psychological malady of a patient in
order to treat it.
c) Telephone Interview – Used in place of personal interviews. It has some
substantial advantages and disadvantages. The advantages of telephone
interviews are:
i) Low cost
ii) Access to remote places
iii) Quality (i.e. reduced bias)
Interview schedule or guide
The interviewer uses either a schedule or a guide. They are prepared after a
thorough study of the subject concerned and careful thinking and planning. A
schedule is a structured set of questions, which are usually asked orally, the
responses being recorded by the interviewer. An interview guide is like a “map of a
road”: a list of topics, questions or areas which the interviewer uses more as a
prompter during the interview. There is no fixed order in which the points in the
guide should be covered; the interviewer may change the order according to his own
convenience.
Schedule or guide focuses the attention on salient points on the study.

Interview – Questionnaire and Observation.


The interview involves the presentation of oral-verbal stimuli and the return of oral-
verbal responses. It also gives information regarding the interviewee’s perceptions,
beliefs, motivation, future plans, attitudes etc.
In a questionnaire, the information obtained is limited to the written responses to
prearranged questions and nothing more. Unlike in the interview, here the
investigator has no opportunity to observe the respondent and the total situation in
which he is placed.
The observational techniques are restricted mostly to non-verbal acts, i.e. to
understanding behaviour and describing it as it occurs. Observational technique,
like the questionnaire, is quite ineffective in giving information about the inner
feelings and thoughts of the respondent.

Preparation for Research Interview


1. The category (rank, grade, office etc) and number of persons to be interviewed
should be first decided.
2. The interviewer should have a clear conception of the purpose and the
information required.

3. A schedule or guide with the best sequence of questions should be prepared.
This should be done after a careful study of the literature concerning the study
in question.
4. The type of interview (telephone, personal, group) should be decided.
5. A ready plan for recording the responses should be made available at hand (i.e.
recording the response on the schedule or check list).
6. The time for the interview should be fixed with the interviewee well in advance.
Executing an interview and collection of data
The interviewer should be friendly and courteous to keep the respondent at
ease. No sign of surprise or disapproval should be shown, because it may
embarrass the respondent. If some questions are misunderstood or complex, they
should be repeated and paraphrased.
While collecting the responses the interviewer should write down the actual
words of the respondent. Paraphrasing the reply, polishing any slang or
correcting bad grammar may distort the respondent’s meaning and emphasis.
Recording and evaluating the responses simultaneously should be avoided.

Merits
1. It helps to reduce tension by “talking it out”.
2. It serves to accept and clarify negative feelings
3. It helps to recognize and reinforce positive relationships.
4. It may develop the insights of the individual.
5. It is the most dynamic way of understanding the individuals as a whole.
6. It is natural like conversation.
7. It can be made very flexible so as to suit many situations and conditions.
8. It can be used for a variety of purposes.
9. It is relatively easy to conduct.
10. It is very useful in those cases where the individuals are illiterate.
Demerits
1. Interview method is very expensive. Visiting people at different places involves
travelling expenditure.

2. Sometimes the respondent may give biased responses instead of giving what
he thinks to be true. Questions on politics or influential persons are
likely to be answered in a biased manner.
3. Interview technique is time consuming. A questionnaire can be distributed to
a large section at a time. But, interview can be held only with one person or a
small group of individuals at a time.
4. Interview method requires a high level of expertise. The investigator should
possess qualities of objectivity, insight and sensitivity.
iii) QUESTIONNAIRE
Questionnaire is a research tool containing a set of questions to be answered by
the respondent. Questionnaires are useful:
i) When the people from whom we desire responses cannot be personally contacted.

ii) When we cannot personally interview so many respondents.


Questionnaires should be properly designed, clearly printed or cyclostyled, and
systematically classified.
Classification
P.V. Young classifies questionnaires into two types:
i) Structured questionnaire ii) Non-structured questionnaire

i) Structured questionnaire
It contains definite and concrete questions.

ii) Non-structured questionnaire


This questionnaire is often known as an interview.
George A. Lundberg classifies questionnaires as:
i) Questionnaire of facts – which requires certain information from the respondent
without any reference to his opinion or attitude about it.
ii) Questionnaire of opinion and attitude – which requires the respondent’s
opinion or attitude towards the phenomenon.
The most common classification of questionnaires is (1) the closed form, which
contains a set of questions that require a “yes” or “no” (or “correct” or “incorrect”)
response, and (2) the open form, which contains questions to which the respondent
gives a free response.
e.g. Closed form: 1. Are you married? Yes …… No …… (Please check  (or) 
against the answer)
Open form: 2. What is your qualification?
The closed form is easier for the respondent to fill out, and it takes less time
than the open form.
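Because closed-form answers fall into fixed categories, tabulating them is straightforward; the following is a minimal sketch (the function name and the case-folding rule are my own choices, not part of the text):

```python
from collections import Counter

def tabulate_closed_item(answers):
    """Count yes/no answers to one closed-form item and return the
    counts together with the percentage answering 'yes'."""
    counts = Counter(a.strip().lower() for a in answers)
    pct_yes = 100.0 * counts["yes"] / len(answers)
    return counts, pct_yes

# e.g. four respondents answering the "Are you married?" item
counts, pct = tabulate_closed_item(["Yes", "no", "YES", "yes"])  # pct is 75.0
```

Open-form responses, by contrast, have to be read and coded before any such counting is possible, which is one reason the closed form is quicker to analyse as well as to answer.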
Construction of a Questionnaire
A questionnaire should be constructed in such a manner that
1. It includes important items only.
2. Responses expected are simple [i.e., in the form “yes” or “no” etc.]
3. It should not be suggestive (or) ambiguous.
4. It should not embarrass the respondent.
5. It should be clearly worded.
Improving a Questionnaire
It takes a great deal of time to frame suitable items. It is always desirable to
analyse the problem and then see what types of questions need to be framed. The
items have to be carefully analysed by the investigator and discussed with friends,
experts and the supervisor before the questionnaire is mailed out to the
respondents. Terms and words such as value, age, democracy etc., which may
convey different meanings to different persons, should not be used unless properly
defined. Double negatives should be avoided, and double-barrelled questions are
undesirable. A point of reference is necessary where ratings or comparisons are
needed. Systematic quantification of responses should be provided. The items
should be complete and appropriate to all respondents.
Characteristics of a good questionnaire
1. All questions are relevant and significant i.e. focus on the topic.
2. It seeks information which cannot be procured by any other mode, i.e.
information not available from other sources.
3. It is short but complete.
4. It looks attractive
5. The questions are objective, the directions are clear.
6. The items are categorically arranged.
7. The order of presentation of question item is logical
8. It can be easily tabulated and interpreted.
9. It is clearly printed or cyclostyled, as short as possible, and clear, i.e. not
ambiguous.
Problems in preparing the questionnaire
1. Information we need for realizing the objectives of the study.
2. The language used (The language should be simple)
3. Leading questions (There should be no place for this)
4. Length of the questionnaire (it should not be too long)
5. Cross checking (it should be avoided)
6. Types of questions (open and closed ended).
7. Different categories of questions (it is necessary).
8. Ordering of questions (logical and sequence is necessary).
9. Shotgun questions (there should be no place for these, e.g. “What is your
age?” – it is difficult to say whether it is to be answered in months, years, etc.)
Administering the Questionnaire
While administering a questionnaire, the following points should be kept in
mind.
1. You should make sure that the respondent is a fit person to answer your
questions, i.e. the status of the respondent should be kept in mind.
2. The questionnaire must take a minimum of the respondent’s time. Reasonable
time should be given to him to fill out the questions.
3. The purpose of the investigation should be made clear to him.
4. Direction to fill out the questions should be brief and clear.
5. The respondents can be selected in such a way that we may get the
maximum return of the questionnaire.
Advantages
1. It is less expensive and less time consuming.
2. Once it is skillfully constructed, the investigator can employ anyone to
administer it. The technical skill required for conducting an interview or
observation is not needed.
3. Responses of the subjects are available in their own handwriting and
therefore they are fully authentic.

4. It places less pressure on the subject for immediate response.


5. It can be administered to a large group of people simultaneously.
Disadvantages
1. Questionnaire cannot be used with illiterates and children.
2. If a subject misinterprets a question there is no way of instructing him.
3. It is not helpful in finding information about the emotional attitude, private
sentiments of the subjects.
4. Some respondents may not like to put their views on controversial issues in
writing. Such views can be had only through interviews.
5. If certain answers are not clear to the investigator, he has no chance of
clarifying it from the respondent.
SUMMARY
Different tools and procedures have been developed to aid in the acquisition of
data. Observation, interview, questionnaire, check lists, inventories and rating
scales are the frequently used tools of research. In the selection of research tools
the following criteria are used, viz. reliability, validity, economy, administration and
interpretation, and interest. Observations are used to evaluate the overt behaviour
of individuals in controlled and uncontrolled situations. A sound observation needs
careful planning, expert execution and adequate recording. Interviews are widely
used to collect data since they provide the option of direct interaction between the
interviewer and interviewee. Questionnaires are also a powerful research tool which
measures a number of tendencies. All the research tools have their own advantages
and limitations.

KEY TERMS
Observation Attention Sensation
Perception Conception Interview
Questionnaire (Q) Structured “Q” Non-structured “Q”
Q – of facts Q- of opinion
QUESTIONS
1. Discuss the various criteria for selecting research tools.
2. Explain how observation is conducted.
3. Describe the various types of interviews with suitable examples.

4. What are the characteristics of a questionnaire? Elucidate the merits and
demerits of a questionnaire.


LESSON – 15

TOOLS OF RESEARCH AND TESTS (Contd… )

OBJECTIVES
After reading this lesson the student should
 Understand the meaning and types of rating scales.
 Explain the various errors in rating scales.
 Distinguish the Thurstone and Likert type of attitude scales.
 Explain the case study.
 State the merits and limitations of case study.
SYNOPSIS
Tools of different types: Check List – Schedule – Rating Scales – Attitude scales:
Thurstone’s Method and Likert Scale – Case Studies.

INTRODUCTION
The researcher chooses the most appropriate instruments and procedures that
provide for the collection and analysis of data upon which hypotheses may be
tested. The data-gathering devices that have proven useful in psychological
research include psychological tests and inventories, questionnaires, rating scales,
observations, interviews and so on. In the previous lesson we discussed the
importance of observation, questionnaire and interview. In this lesson we will
discuss the checklist, rating scales and attitude scale construction. Apart from all
these, the ‘case study’ method is discussed, since it is a powerful tool in
descriptive research.

I) CHECK LIST
The checklist, the simplest of all the devices, is a prepared list of behaviours or
items. The presence or absence of a behaviour may be indicated by checking ‘yes’
or ‘no’, or the type or number of items may be indicated by inserting an appropriate
word or number. This simple “laundry-list” type of device systematizes and
facilitates the recording of observations and helps to ensure consideration of the
important aspects of the object or act observed. We are familiar with checklists
prepared to help buyers purchase a car, T.V., refrigerator and so on. Checklists
can also be used to count the number of behaviours occurring in a given time
period.
For example, we can construct a check list to check about the procedures in a
research method. The questions can be constructed as follows:
i) The research procedure is described in detail Yes / No
ii) The procedure has an adequate sample Yes / No
iii) It has appropriate research design Yes / No
iv) The experimental variables are controlled appropriately Yes / No
v) Effective data collection tools are used Yes / No

Likewise, we can construct a check list to measure or observe any phenomenon,
as well as to evaluate programmes, methods and procedures.
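Such a checklist reduces naturally to a tally of the criteria met. The following small sketch uses items paraphrased from the example above; the scoring convention is my own illustration:

```python
CHECKLIST = [
    "The research procedure is described in detail",
    "The procedure has an adequate sample",
    "It has an appropriate research design",
    "The experimental variables are controlled appropriately",
    "Effective data collection tools are used",
]

def checklist_score(marks):
    """marks: 'yes'/'no' answers aligned with the CHECKLIST items.
    Returns the number of criteria met."""
    return sum(1 for m in marks if m == "yes")
```

A procedure meeting three of the five criteria would score 3; the same tallying idea supports counting how often a listed behaviour occurs in a given time period.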

II) RATING SCALES


A rating scale is a selected list of words, phrases, sentences or paragraphs,
following which an observer records a value or rating based on some objective scale
values. It is a special kind of check list in which the items or characteristics
checked are rated quantitatively or qualitatively according to the degree of presence
or absence of a trait, the degree of perfection of a skill, or the degree of completion
of a task.

Types of Rating scales


1) Numerical Scales
In a numerical scale a sequence of defined numbers is supplied to the
observer. For example, to the question “How was the lesson introduced to the
class?”, the defined numbers for this judgement may be as follows:
(1) very unsatisfactory (2) unsatisfactory (3) no comment (4) satisfactory (5) very
satisfactory.
Sometimes the numerical scale carries no numbers and the experimenter
assigns numbers after having obtained the responses of the rater or observer.
Numerical scales are among the easiest to construct and to apply, and the results
are also easy to handle.
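Scoring a numerical scale amounts to mapping each verbal anchor to its defined number and then handling the numbers arithmetically; a sketch using the five anchors above (averaging is my own illustrative choice of summary):

```python
# the defined numbers from the example above
ANCHORS = {
    "very unsatisfactory": 1,
    "unsatisfactory": 2,
    "no comment": 3,
    "satisfactory": 4,
    "very satisfactory": 5,
}

def mean_rating(responses):
    """Convert verbal responses to their defined numbers and average
    them -- the kind of easy handling of results the text refers to."""
    scores = [ANCHORS[r] for r in responses]
    return sum(scores) / len(scores)
```

Four raters answering "satisfactory", "very satisfactory", "satisfactory" and "unsatisfactory" would yield a mean rating of 3.75 on this keying.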
2. Graphic scales
In graphic scales the responses are written along a line, which may be
segmented into units or may be continuous. The lines can be drawn horizontally or
vertically.
For example, “Is he a slow or a quick reader?” may be expressed as:
Extremely slow, Slow, Ordinary speed, Quick, Extremely quick
They are simple and easily administered, and they look more attractive than other
types of rating scales. The scoring pattern may be assigned as one desires.
3. Standard Scales

These scales present to the rater a set of standards. With the set of standards at
hand, new stimuli can be rated very effectively. Man-to-man ratings, portrait
matching and handwriting comparisons fall under this category. It is no doubt a
cumbersome task to prepare a set of standards, but once it has been finalized
ratings become easy and more meaningful.
4. Rating by Cumulated Points
The quality of this category of ratings lies in its scoring method. The score of an
individual is cumulated over different items. The check-list method, or a list of
adjectives used to evaluate the self-concept of individuals, falls in this category of
ratings.

The Guess-who technique devised for use with children by Hartshorne and May is
also included in this category. The numbers of favourable and unfavourable
responses are counted and a cumulated score is found for each individual.
In terms of quantitative description these scales require the least discrimination
on the part of the rater. One may say that for the rater each item has two responses,
to be weighted as +1 and 0.
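The +1/0 weighting just described can be sketched directly. The adjective list below is invented purely for illustration of a self-concept check list:

```python
def cumulated_score(checked_items, keyed_favourable):
    """Cumulated-points rating: each checked item scores +1 if it is
    keyed favourable and 0 otherwise; the item scores are summed."""
    return sum(1 for item in checked_items if item in keyed_favourable)

# hypothetical favourably-keyed adjectives on a self-concept check list
FAVOURABLE = {"friendly", "honest", "dependable", "cheerful"}
```

A respondent who checked "friendly", "honest" and "rude" would accumulate 2 points under this keying; the rater never has to make finer discriminations than checked/not checked.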
5. Forced Choice Ratings
In both graphic and descriptive rating scales the rater finds it difficult to have
control over the final results of his rating.
A forced-choice rating scale forces the rater to choose between two alternatives,
so that the responses made are more accurate. The two alternatives appear of equal
strength; nevertheless one statement is better than the other, as is implied by the
pairing. A number of such paired statements are given and the rater has to choose
one of the two from each pair.

Errors in Rating Scale


There are six types of errors in the rating scales viz.
i) The Error of Leniency
The raters have a tendency to rate those in whom they are interested higher
than they deserve.
ii) The Error of Central Tendency
The raters hesitate to give extreme ratings and restrict themselves in the central
zone of the scale
iii) The Halo Effect
The raters tend to rate some traits in the same direction, carried over by their
general impression of the individual rated. Judgements in this case are
contaminated and the ratings on some traits become less valid.
iv) A Logical Error
The raters tend to give similar rating for traits that seem to be logically related
in the mind of raters.

v) A Contrast Error
There is a tendency among raters to rate others in the opposite direction from
themselves with regard to a trait.
vi) A Proximity Error
Due to closeness, the ratings on adjacent traits are likely to correlate higher
than those on traits remotely placed.

Cues for constructing rating scales


1. Traits should be described objectively and univocally.
2. A trait should not be a composite of two or more independent traits.

3. Each trait should refer to a single type of activity.


4. In describing traits, the use of adjectives like very, extreme, etc. should be avoided.
5. Traits should be judged on the basis of past and present accomplishments.
6. In self ratings there should be no trait in which all individuals over or under
estimate themselves.
7. Rating scales should not be used for such traits for which more reliable data
can be obtained by other sources or tools.
8. Select only very significant traits to be rated.
Advantages
1. Rating requires less time than ranking methods
2. With graphic methods the procedure is very interesting.
3. They can be used with persons having minimum training for making ratings.
4. They have a much wider range of application.
5. They can be used with large numbers of stimuli.
6. Best judgements are made when stimuli are presented individually.
Limitations
1. Unless the rater understands the purpose of rating clearly, the data obtained
from his ratings are of no use.
2. Rating scales require great ability and understanding on the part of the
observers, and not everyone is suited for this job.
3. If a rating scale is prepared by a single individual, it may contain a number of
errors.
4. Ratings by some raters are too low while, by others, they are too high. With
proper care this shortcoming can be removed.

iii) ATTITUDE MEASUREMENT


An attitude represents an individual's feeling for or against something. It is also
defined as the degree of acceptance given by an individual to something.
An attitude is one's mental disposition or degree of acceptance directed towards
an object, which may be either concrete or abstract.

Definitions
i) “An attitude is a mental and neural state of readiness, organized through
experience, exerting a directive or dynamic influence upon the individual’s response
to objects and situations with which he is related” – Allport.
ii) “A state of readiness for motive arousal” – Newcomb.
iii) “An affect for or against a psychological object” – Thurstone.

Characteristics of Attitudes
1. Attitudes represent predispositions to the behaviours implied in the attitudinal
concept.
2. They are mostly learned and hence are not inherited or innate
dispositions.
3. They are more or less permanent and persist for a reasonable period of time.
4. They are directed towards a goal or an object and hence, it may manifest in a
positive or negative way towards the attitudinal concept.
5. They can be indirectly inferred from one's covert or overt behaviour.
6. They are essential components of one's personality and are organized within
the personality system of the person.
Attitude Scales
Attitude scales are used to evaluate the expressed opinions of individuals.
Attitude scales present bipolar continua. While attitudes are subject to change,
they tend to be sufficiently enduring over periods of time.
Two important types of attitude scales are in common use. They are:
i) The Thurstone Scale
The scaling procedure follows some steps which are worth consideration. Some
100 to 200 statements are collected which describe the object of interest favourably
or unfavourably. Then groups that differ along the continuum are selected for
obtaining all types of opinions. The statements are rated along an eleven-point
continuum from extremely unfavourable to extremely favourable. Judges are
selected for the purpose and their ratings on all statements are obtained. The scale
value of each statement is calculated as the median of the judges' ratings, and for
each statement a 'Q' value (the spread of the ratings) is also found out. Those
statements are selected which represent the continuum evenly and have low 'Q' values.
When the final attitude scale is administered to individuals, they are asked to
check their opinion on each statement. The score is taken as the mean or median
of the expressed opinions, which indicates the overall attitude towards the
psychological object under investigation.

ii) The Likert Scale
The Likert method of attitude scaling is popular and simple. This method
consists of multiple-choice type statements categorized in three, five or seven
categories along the continuum. The construction of the Likert type scale has four
important steps, which are discussed below:

Step – 1 Collection of Items


Collecting a large set of items relating to the social or psychological object in
question. The statements should be selected in such a way that they represent
different degrees of acceptance of the object.

The statements can be collected from a wide variety of sources like authoritative
books dealing with the subject, research literature, newspaper statements, etc. A
very large pool of items will be adequate for developing a good attitude scale.

Step – 2: Editing of items


The collected statements will have to be edited or modified to:
1. avoid double statements
2. avoid abstract or complex ideas or terminology.
3. cover all statements expressing all degrees of acceptance (rejection).
4. cover aspects and dimensions relating to the object.
5. include approximately equal number of positive and negative items.
Step – 3 : Preliminary administration and item analysis
The preliminary pool of items is printed in the form of an attitude questionnaire
with a five-point response form against each item. The subject is instructed to
enter an X mark in one of the five boxes to indicate his agreement with the
item.
strongly agree    partly agree    undecided    partly disagree    strongly disagree
      □                □               □                □                   □

Each item in the test is scored using the following method. In scoring we
distinguish the positive and negative items.
An opinion unfavourable to the purpose is taken as a negative item and the others
are taken as positive items. The scoring for positive and negative items is shown
below:
Item        Strongly agree    Partly agree    Undecided    Partly disagree    Strongly disagree
Positive          5                 4               3                2                   1
Negative          1                 2               3                4                   5
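The scoring key above can be sketched in code. In this hypothetical example, responses are coded 1 to 5 with 5 standing for "strongly agree", and negative items are reverse-keyed:

```python
def likert_score(responses, negative_items):
    """Total attitude score; negative items are reverse-keyed (5 -> 1 ... 1 -> 5)."""
    total = 0
    for item_no, answer in responses.items():   # answer is in 1..5
        total += (6 - answer) if item_no in negative_items else answer
    return total

# item number -> chosen response category (hypothetical answer sheet)
responses = {1: 5, 2: 1, 3: 4, 4: 2}
score = likert_score(responses, negative_items={2, 4})   # 5 + 5 + 4 + 4 = 18
```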

The draft is administered to a representative sample.
If the test is administered to a sample of 370, the calculations required for item
analysis are made easy. The answer sheets are arranged in descending order of
the total scores. From the 370 answer sheets, the top 27% (top 100 answer sheets)
and the bottom 27% (bottom 100 answer sheets) are used for item selection. The t
value can be calculated using the following formula:
t = (MH - ML) / √{ [Σ(XH - MH)² + Σ(XL - ML)²] / n(n - 1) }

Where,
MH = mean score on the item for the high (top) group
ML = mean score on the item for the low (bottom) group
XH = any score of the high group
XL = any score of the low group
n = sample size of either the high or the low group (= 100)
The symbol 'Σ' is used to show that the difference is to be obtained for each
score separately, then squared, and then all the squares added up.
Items showing high 't' values are considered better than items with low 't'
values. This is because a good item is one which produces a higher mean score in
the high group and a lower mean score in the low group. Those showing higher 't'
values are selected. If the number of items for the final scale is decided (say, 30),
then the 30 items with the highest 't' values can be taken.
Another condition is that approximately half the chosen items must be positive
and the rest negative. Hence it is usual in item analysis to analyse positive and
negative items separately, and to choose the best items from each category to
include in the final scale.
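A minimal sketch of this item analysis for a single item, using hypothetical high- and low-group scores (ten examinees per group here, rather than the 100 described above):

```python
import math

def item_t(high_scores, low_scores):
    """t value for one item, comparing the top and bottom criterion groups."""
    n = len(high_scores)                    # same n in both groups
    mh = sum(high_scores) / n               # MH: item mean in the high group
    ml = sum(low_scores) / n                # ML: item mean in the low group
    ss_h = sum((x - mh) ** 2 for x in high_scores)   # Σ(XH - MH)²
    ss_l = sum((x - ml) ** 2 for x in low_scores)    # Σ(XL - ML)²
    return (mh - ml) / math.sqrt((ss_h + ss_l) / (n * (n - 1)))

high = [5, 5, 4, 5, 4, 5, 4, 5, 5, 4]   # item scores, top 27% group
low = [2, 1, 2, 3, 2, 1, 2, 2, 1, 2]    # item scores, bottom 27% group
t = item_t(high, low)   # a large t means the item discriminates well
```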
Step - 4: Final scale
The selected items will be arranged in such a way that positive and negative
items alternate. This arrangement helps to conceal the real intention of the
measuring tool. The final test is administered on a large sample and the scores are
used for three purposes, viz.
a) For developing test norms.
b) For estimating the validity of the test.
c) For estimating the reliability of the test.
IV) CASE STUDY
Meaning
Pauline V. Young defines case study as “A comprehensive study of a social unit,
be that unit a person, a group, a social institution, a district or a community”. Case
study is concerned with everything that is significant in the history or development
of a case. The case may be an individual, a family, a group, an institution or a
community.
The case – study method views any social unit as a whole. The case is some
phase of the life history of the unit under study or it may present the entire life
history. Case study is a very good method of collecting information about an
individual, a family and so on.
Case study was primarily limited to the problems of maladjustment such as,
truancy or failure in school, a broken or poverty – stricken home or an under
privileged or malfunctioning community. But more recently this approach has been
extended to the investigation of normal or bright children, successful institutions,
well-organized communities and effectively functioning cultural groups.

Origin and Growth of Case Study Method


Frederic Le Play (1806-1882) introduced the case study method into the social
sciences in the middle of the 19th century. He used it as a handmaid to statistics in
his studies of family budgets.
Herbert Spencer (1820-1903), an English sociologist, was the first to use case
studies in his ethnographic studies (studies of the races of mankind).
Dr. William Healy was the first psychologist to adopt the case study
method, in his studies of juvenile delinquents (young criminals).
Later on historians used this method for describing persons, eras and nations;
anthropologists for describing primitive and modern cultures; novelists for
sketching their characters and families. In the field of social psychology, some
of the most extensive case studies have been made by H.A. Murray and his
associates at the Psychological Clinic of Harvard University.

Characteristics
The case study method emphasizes the following two characteristics.

i) The whole as one unit


The most important characteristic of this method is that it treats the unit under
study, whether an individual, an institution or a community, as a whole.

ii) Intensive study


It aims at a deep and thorough study of a unit, dealing with every aspect of it
intensively.

Steps in a Case Study


1. Initial status - The initial status of an individual or group should be stated.
2. Determination of factors
First of all, material about each of the unit's aspects is collected,
from which the malady or the plus point of the unit is learnt.
3. Diagnosis and statement of the problem
The data collected are intensively studied and the malady or the specialty of the
unit is named.
4. Analysis and Remedy (or) Treatment
After studying the factors and defining the problem, an analysis is made as to
how it can be treated, after which conclusions are drawn.
Example
i) Initial status : Reading disability of a child
ii) Factors : Physical, Intelligent, Social environment
iii) Diagnosis : Defective vision
iv) Remedy : Fitting corrective glasses.

Sources of data in case study


i) Personal Documents
Necessary information may be collected from personal diaries, letters or
autobiographies. The evidences or secrets of an individual, his objectives and mode
of life are usually written in such personal documents. So, personal documents
are one of the best sources for the study of the problem.
ii) Life History
Here, the entire life of a person is taken into account. It is studied, tested, and
enquired by means of observation and interviews.

Aim of the Case study


The main aim of the case study technique is to find out the behaviour pattern of
the unit under study and the causes for such behaviour. It tries to understand the
complex factors that are operative within a social unit as an integrated totality.

Types of case Studies


i) Case Studies of Individuals
a) Study of child development.
b) Autobiography of a mental patient or criminal
c) Study of personality development
ii) Case studies of Institutions
a) Case study of institution of learning
b) Case study of hospitals.
c) Case study of prisons.
iii) Case studies of Communities or Cultural Groups
a) Case study of rural villages.
b) Case study of an industrial community.
c) Case study of a factory.
Limitations

i) False Sense of Confidence
In this method, a researcher may become overconfident and he may think that
he knows everything about that person under study. It should be remembered that
there may be many hidden aspects about which even the respondent is ignorant.
ii) Difficulties in Collection of Historical Data
Proper data collection is very difficult because the respondents do not reveal the
actual facts to the researcher.
iii) False Generalisations
Since it is not possible to collect all the information about an individual,
the generalizations become defective.

iv) Possibility of Error


There are many chances of errors in the selection of cases, observation etc. due
to failures of memory, suppression of unpleasant facts, tendency to exaggerate and
so on.
v) Lack of Quantitative Study
The case study method is qualitative in nature. It deals with only the
psychological aspects of a human being. Hence quantification of results is not
possible in this type of study.
vi) Unorganized and Unsystematic
This method is unorganized and unsystematic because there is no control
whatsoever on the researcher and on the respondent. Thus, verification is not
possible. Hence, the data collected by this technique are often unreliable and
generalizations drawn from the data are inaccurate.

Suggested improvements for a case study


In spite of the drawbacks of the case study method, social scientists are in great
need of this method for conducting researches. Scientists like Carl Rogers, Alfred
Kinsey and John Dollard have suggested some methods for the improvement of the
case study. Some of their suggestions are as follows:
1. The subject of study must be regarded as a specimen in a series of similar
problems.
2. The life – history material of the cases should be well organised and properly
understood.
3. The technique of elaboration of organic materials into social behaviour should
be properly specified.
4. The important role of any group or institution which is responsible for the
transmission of a culture should be recognized.
5. In a case study relating to individuals the continuously related experience from
childhood should be stressed.
6. The social situation in which the unit under study operates should be specified
as a part and parcel of the study.
Case study technique is still considered to be important in the studies of
juvenile delinquents, mentally retarded persons and criminals.

SUMMARY
In research the researcher must select the most appropriate procedures and
instruments. Checklists, rating scales and case studies are also used to collect
objective data. The main advantage of the rating scale is the quantification of a
psychological entity. At the same time, we may commit a variety of errors in our
ratings. Attitude measurement is one of the important areas in psychological
assessment. Thurstone's scale and the Likert scale are widely used to measure the
attitudes of people. The construction of the Likert scale involves a standardized
procedure. Case study is a
comprehensive study of a social unit; the case may be an individual, a family, a
group, an institution or a community. Case study was primarily connected with the
problems of maladjustment. All the psychological tools have their own merits and
demerits; in spite of the demerits they are widely used for data collection.

KEY TERMS
Check List Rating Scale
Numerical Scales Graphic Scales
Standard Scales Halo Effect
Logical Error Contrast Error
Attitude Thurstone’s Scale
Likert’s Scale Case Study

QUESTIONS
1. Write a short note on i) check list, ii) rating scales.
2. Explain the various types of errors in rating scales.
3. Define attitude. Explain the characteristics of attitude and the Thurstone’s
scale for measuring attitude.
4. Describe the various stages in the construction of Likert type of attitude
scales.
5. Explain the case study with suitable examples.




LESSON – 16

TOOLS OF RESEARCH AND TESTS (CONTD …)

OBJECTIVES
After reading this lesson the student should
 Understand the concept of reliability
 Explain the sources of errors in test scores
 Describe the different types of reliability
 Examine the factors influencing the reliability of test scores.
 Interpret the reliability co-efficient.
SYNOPSIS
Characteristics of a research tool - Reliability – Methods of obtaining reliability
co-efficient.

INTRODUCTION
The two essential characteristics of a good test are its Reliability and Validity.
Whenever anything is measured, whether physical, biological or behavioural,
there is some possibility of chance error. This is applicable to psychological tests as
well. Variations in results with the same test using the same persons are due to (1)
individual differences, (2) chance factors and (3) defects inherent in the test itself.
The components of a high-quality or “good” test are defined by the professional
organisations involved in developing standards of quality in tests. These
standards were developed by a Joint Committee of the American Psychological
Association (APA), the American Educational Research Association (AERA) and the
National Council on Measurement in Education (NCME). These standards were
published in 1974.

MEANING OF RELIABILITY
The reliability of a test refers to its ability to yield consistent results from one set
of measures to another; it is the extent to which the obtained test scores are free
from internal defects. Reliability also refers to the extent to which a test yields
consistent results upon testing and re-testing.
Reliability includes such terms as consistency, stability, replicability and
repeatability.
The concept of reliability is defined using three assumptions.
(1) Each person being tested has some fixed amount of the attribute which is
known as the person’s “true score”. For example, a subject being tested is assumed
to possess some “true” level of intelligence.
(2) Every observation of an attribute, intelligence for example, contains some
degree of error.

(3) Following the first two assumptions is that any observed score reflects both
the “true” score and some “error”.
Thus a score on an Intelligence Test for example, reflects some degree of “true”
variance and some degree of “error” variance. The proportion of observed score
variance that is “true” score variance is an index of the test's quality. A reliability
coefficient indicates this kind of test quality rather than the error variance.

SOURCES OF ERROR IN THE TEST SCORES


There are many ways of categorizing the types of error in test scores. Lyman
(1978) presents one useful system, in which errors in test scores are related to five
major factors:
1. Time influence
2. Test content
3. The test administrator
4. The test situation and
5. The examinee herself/himself
1. Time influence: This is the major source of error in test scores. This error is
due to fluctuations in test performance over time. Error scores occur when
measurements taken at two different points of time do not yield the same results.
Sometimes, this error occurs when individuals remember their response in the
previous administration of the same test. Or it may be because the subjects have
practiced the specific test items used in the test. In certain other cases, changes in
scores over time reflect real changes in the individual being tested; in such cases,
changes in scores are reflecting ‘true’ rather than ‘error’ variance.
2. Test Items: The second major source of error variance involves the content of
the test items. The items selected represent only a sample of those that could
measure that characteristic. The items on a test represent a sample from a
population, or “domain” of possible test items. Therefore the particular selection of
test content could greatly influence an individual’s test performance.
3. The test Administrator: A third kind of error is that which can occur if the test
administrator fails to administer the test correctly or if the test is improperly
scored. A test administrator may lose track of the passage of time and through
carelessness, give examinees 25 minutes of work instead of the 20 minutes
specified in the instructions. The administrator may forget to ask the examinees to
read the instructions or to complete the practice questions, thus affecting the
proper performance on the test. Errors in scoring can occur through lack of correct
scoring information or through carelessness.
4. The test Situation: The situation in which testing occurs can contribute to
error if conditions are not constant from one situation to another. If conditions
distract the individual from giving full attention to the testing process, for example
a noisy, poorly lit, or excessively warm or humid testing room, performance may be
affected.

5. The examinee: The examinee himself/herself may also be a source of error.


Sickness or fatigue during testing may cause an individual's score to reflect
his/her true score inaccurately and to be poorly related to performance under good
conditions. Similarly, individuals who are not motivated to do well on the test,
who don't care about providing accurate information, or who purposely make mistakes
or provide inaccurate information can contribute error to their test scores.
The objective of developing reliable tests is to minimise these kinds of errors and
their influence on test scores.
The basic approach to minimise the effects of errors is to include the
development of detailed instructions both for administration and scoring. It is also
necessary to make sure that individuals administering and scoring tests are
responsible and careful in following the instructions.
It is important to establish testing conditions that permit the examinee to work
without distractions and physical discomfort. It is essential to ensure that
examinees understand the instructions for taking the test and are motivated to do
their best on the test.
Beyond these methods of standardizing and optimizing testing conditions
different types of reliability focus on the reduction of different types of error
variance in test scores.
These different types of reliability can be used as guidelines to the construction
of high quality tests and also as a means of evaluating the quality of a test.
The different types of reliability include:
1. Test – Retest Reliability
2. Alternate Forms Reliability
3. Split – half Reliability and
4. Internal consistency Reliability
1. Test – Retest Reliability: This type of reliability is known as “stability”. Stability
is the extent to which individuals tend to obtain a similar score, upon retaking the
same test. If test scores are relatively stable across repeated testing separated by a
time interval, there is some basis for believing that the test is measuring something
in a consistent and generalizable manner across time.
The study of test – retest reliability involves two administrations of the same test
to the same individuals with a time interval between the two test administrations.
Typically, a test – retest design should involve a time interval of at least one week
between test and retest. The reliability coefficient is the correlation between the two
sets of scores. That is, the correlation of scores obtained at the first administration
of the test with those obtained from the same individuals at the second
administration of the test.
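The stability coefficient is thus an ordinary Pearson correlation between the two score sets; a minimal sketch with hypothetical scores for eight examinees tested twice:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

first = [10, 12, 15, 11, 14, 13, 9, 16]    # scores at first administration
retest = [11, 12, 14, 10, 15, 13, 10, 17]  # scores after the time interval
stability = pearson_r(first, retest)       # the test-retest reliability
```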
Stability coefficients can be obtained using any time interval, but generally, the
longer the time interval, the less stability is likely. In other words, the longer the
time interval between testing, the greater the likelihood that individuals will
actually change their responses. Thus, for interpreting a stability coefficient, it is


important to know the time interval observed. For example, a coefficient of r = 0.80
representing a one-year interval would indicate greater stability than an identical value
representing a one-week interval.
A test – retest design is appropriate only if the trait or dimension itself is stable
over time. For example, characteristics like intelligence and extroversion are
relatively enduring characteristics of an individual; thus the stability of these
variables / measures is an appropriate dimension to measure.
Some characteristics of individuals, such as mood states, tend to vary
considerably across time. The Minnesota Multiphasic Personality Inventory (MMPI)
depression scale, which reflects mood at the time of testing, is relatively unstable upon
repeated testing. Thus stability may not be an appropriate criterion. Test-retest
reliability can be strongly influenced by experience, training or practice specifically
related to the content of the test. Consider, for example, a test of typing speed and
accuracy given to two individuals. Following the first testing, one extensively
practices his/her typing, while the other doesn't. On retest, it is reasonable to
expect the first individual's scores to improve considerably, but we would not expect
such improvement from the second individual. Practice or experience effects such
as this reduce the magnitude of the correlation between the test and retest
scores. To obtain meaningful indices of the stability of such tests, it is important to
control for the nature of experiences relevant to test performance that occur in the
test-retest interval.
A person's memory of previous responses may artificially inflate the stability
coefficient. Practice with the test materials themselves (e.g. verbal analogies or
numerical word problems) is likely to improve the performance of some people
more than others and may thus reduce the stability of scores. Generally, the longer
the interval between testing, the less influence memory and practice with the
specific test items will have. Longer time intervals, however, increase the subjects'
experiences and learning. Usually studies include both short-term and longer-term
designs.
Another problem with test-retest designs, especially those using longer time
intervals, involves loss of subjects between the first and the second testing. For
example, if college students are tested during the last term of the final year and
re-tested in the following first term, a percentage of those initially tested may have left
the college or changed address before the second test is administered.
It is important to attempt to retest as many of the initially tested subjects as
possible and to determine whether subjects who do not take the second test differ in
important respects from those individuals tested twice.
In summary, when a test-retest reliability study is designed, an appropriate time
interval should be chosen considering the conceptualization of the characteristic to be
measured, the possible effects of memory, practice and experience, and the possible
loss of subjects.

2. Alternate Forms Reliability: Alternate forms reliability assesses the degree to


which two different forms of the same test yield similar results. This type of
reliability refers to the variance associated with test content. If an individual's score
on one form of a test is similar to his score on an “alternate” form of the test, it
indicates that the test items are measuring the dimension in a relatively consistent
manner.
For this purpose, two forms of a test, that are equivalent or “alternate”
measures of the same attribute or dimensions are developed. For example, if a test
constructor developed a pool of test items designed to measure numerical ability,
those items could be used to construct two 50 – item tests of numerical ability
designated Form A and Form B. It is essential that pairs of similar items be
assigned to the two forms of the test, one to each form. If there are ten subtraction
and ten multiplication items, for example, five of each should be assigned to Form A
and five of each to Form B.
The alternate forms thus constructed should be administered to a single group
of examinees. The reliability coefficient is the correlation between the two sets of
scores. If the interval between the first form and the second form is very short
(usually the two forms are administered in the same testing session), the resulting
reliability coefficient reflects the adequacy of the content sampling and the extent to
which the items on the two forms are measuring the same thing. In addition to
this, alternate forms can be used to estimate the test’s stability, if a longer time
interval for example – two weeks or a month – separates the administration of the
two forms. Stability coefficients obtained in this way will generally be lower than
coefficients obtained from re-administering the same test. Stability coefficients of
this type reflect not only changes over time, but also differences in the two forms of
the test.
Care should be exercised in ensuring the equivalence of alternate forms. The
two forms should contain the same number and type of items. In tests of ability or
achievement, the items should be approximately equal in difficulty; that is, alternate
forms of a test should contain the same number of easier items and more difficult
items.

When alternate forms are administered, the order in which the two forms are given
should be counterbalanced; that is, half the subjects should receive Form A first,
while the other half should receive Form B first. This counterbalancing controls for
the possible occurrence of fatigue, practice or motivational effects.
The construction of alternate forms is somewhat time consuming and requires
considerable care. Alternate forms reliability provides a good estimate of the degree
to which performance fluctuates from one set of the test items to a second set of
theoretically equivalent items. In addition, it can provide information regarding the
degree to which test scores fluctuate or are relatively stable over time.
3. Split-Half Reliability: From the discussion of alternate forms reliability, it
may be noted that it should be possible to divide a test into two halves and examine
the relationship between the two half-scores. This approach is known as split-half
reliability. Conceptually, it is similar to alternate forms reliability. If scores
obtained from one half of the test are similar to those obtained from the other half,
there is reason to believe that the test items are measuring the same dimension or
characteristic.
To arrive at the split-half reliability, the test is administered as a whole, but test
scores are computed separately for each half. The correlation between the two half-
scores is taken as the index of reliability. Temporal stability of test scores does not
enter into this type of reliability, since only one administration of the test is used
for the calculation.
Generally, the division of test should lead to the same kind of equivalence
expected by alternate forms. A common approach to split – half test division is to
assign the odd-numbered items to one half and the even numbered items to the
other half.
In other cases, item assignment may be accomplished using a random numbers
table. It is usually not advisable to divide a test at its midpoint, since factors such
as fatigue or practice may operate to affect the latter portion of the test.
The correlation between the two-half scores gives an estimate of the reliability of
each half – test, but not of the reliability of the full test.
Other things being equal, the more items in a test, the more reliable it will be.
To estimate the reliability of the full complement of items, the Spearman – Brown
formula is used. The general form of this formula can be used to estimate the effect
on reliability of lengthening or shortening a test by any number of items. A
simplified version of the formula used in split-half designs is given below:
rtest = 2r12 / (1 + r12)
In this formula, rtest is the estimate of the reliability of the whole test, and r12
is the observed correlation between scores obtained from the two halves of the test.
In summary, split – half reliability provides a relatively simple method of
evaluating the degree to which the test items measure something in a relatively
consistent manner.
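A sketch of the full procedure, an odd-even split followed by the Spearman-Brown correction, on hypothetical right/wrong item scores for five examinees:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))

def split_half_reliability(item_matrix):
    """Odd-even split-half r, corrected to full length by Spearman-Brown."""
    odd = [sum(row[0::2]) for row in item_matrix]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]   # items 2, 4, 6, ...
    r12 = pearson_r(odd, even)                       # half-test correlation
    return (2 * r12) / (1 + r12)                     # full-test estimate

scores = [  # one row of right (1) / wrong (0) item scores per examinee
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
r_full = split_half_reliability(scores)
```

Note how the corrected coefficient exceeds the raw half-test correlation, reflecting the greater reliability of the longer test.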
4. Internal – consistency Reliability: This type of reliability refers to the degree to
which each item on a test is measuring the same thing as each other item. This
inter-item consistency is known as internal consistency, or homogeneity. A test is
internally consistent, or homogeneous, if an individual’s responses to, or
performance on one item is related to his/her responses to all of the other items in
the test.
Internal consistency is directly related to the “unidimensionality” of the test.
The term unidimensionality refers to the extent to which the test items reflect one
dimension rather than several dimensions.
For example, a test of intelligence measuring the verbal ability alone is known
as measuring only one dimension; thus it is a unidimensional test. A test
like Wechsler's Adult Intelligence Scale, by contrast, measures intelligence in terms
of verbal ability, numerical ability and performance ability; it is thus a
multidimensional test measuring ability on different dimensions of intelligence.
An internal consistency reliability coefficient is a reflection of the
unidimensionality versus multi dimensionality of test items; the more internally
consistent a test is, the more evidence there is for the unidimensionality of test
items. Internal consistency is also often referred to as homogeneity (sameness of
content), as distinguished from heterogeneity (differences in content) of test items.
Estimates of internal consistency reliability will not necessarily be similar to
estimates of alternate forms or split-half reliability, even though all three types of
reliabilities refer to the content of the test items. The homogeneity or heterogeneity
of test content is best studied using estimates of internal consistency reliability.
To obtain internal consistency reliability, the test is administered only once.
Following administration of the test, internal consistency (inter-item consistency)
is computed using
a) Cronbach's Alpha Coefficient (1951), or
b) the Kuder and Richardson Coefficient (1937), known as K-R 20.
K – R 20 estimates the proportion of true variance, relative to the amount of
observed score variance.
The Kuder-Richardson method can be applied when the items are of equal
difficulty and have equal inter-correlations.
The most accurate of the K-R formulas, known as formula 20, is

    rtt = [n / (n − 1)] × [(St² − Σpq) / St²]

where
    n = the number of items in the test;
    p = the proportion passing an item;
    q = 1 − p;
    St² = the total test variance;
    Σpq = the sum of the variances of all the items.
The K-R formula is applied only when items are scored either 0 or 1 (for
example, right-wrong or yes-no).
Cronbach's Alpha is appropriate for use with non-dichotomous items, for
example, items scored on a five-point response continuum ranging from
"strongly agree" to "strongly disagree".
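Assuming 0/1 item scoring, K-R 20 as given above can be computed directly. The data matrix and function name here are illustrative, not part of the original text.

```python
# Sketch: K-R 20 internal consistency for dichotomous (0/1) items,
# following the formula in the text; the data are illustrative only.
def kr20(item_scores):
    """item_scores: one row per person, one column per item, scored 0 or 1."""
    n_items = len(item_scores[0])
    n_people = len(item_scores)
    totals = [sum(row) for row in item_scores]
    mean_t = sum(totals) / n_people
    st2 = sum((t - mean_t) ** 2 for t in totals) / n_people  # total-score variance
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_people  # proportion passing item j
        sum_pq += p * (1 - p)                              # pq is the item variance
    return (n_items / (n_items - 1)) * (st2 - sum_pq) / st2

scores = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 1],
]
print(round(kr20(scores), 2))  # → 0.75
```

For a 0/1 item the variance is exactly pq, so this computation is the dichotomous special case of Cronbach's Alpha.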
Internal consistency reliability coefficients are inappropriate for use with tests
having time limits. In tests which assess abilities and skills, differences in
scores depend on the speed of performance. Inter-item consistency is not
appropriate for such (speeded) tests, because the difficulty level of each item and
the time allotted for testing both affect performance. A restrictive time limit
results in correct responses to items appearing early in the test and largely
incorrect (or unattempted) responses to later items. This reduces the inter-item
correlations and, correspondingly, the internal consistency reliability.
RELIABILITY AND LENGTH OF TEST
The whole test is more reliable than either half. In general, reliability increases
with the length of the test.
The mean of five measurements of the height will be more reliable than a single
measurement. And the mean of ten measurements will be more reliable than the
mean of five. In the same way, increasing the length of the test or averaging the
results obtained from several applications of the same test, or from alternate forms,
will tend to increase reliability.
The effect of lengthening or repeating a test may be obtained from the
“Spearman – Brown” prophecy formula
    rnn = n r12 / [1 + (n − 1) r12]
(Spearman – Brown formula for estimating the correlation between n forms of a
test and n other similar forms) in which
rnn = the correlation between n forms of a test and n alternate forms (or the
mean of n forms against the mean of n other forms);
r12 = The reliability coefficient
The subscript '12' shows that the correlation is between two forms of the same
test.
To illustrate the use of the formula, suppose that in a group of 100 adults the
self-correlation of a test is 0.70. The effect of tripling the length of the test upon
test reliability (r12 = 0.70, n = 3) is

    rnn = (3 × 0.70) / (1 + (3 − 1) × 0.70) = 2.10 / 2.40 = 0.88
Tripling the test’s length, therefore, increases its reliability coefficient from 0.70
to 0.88. Instead of tripling the length of the test, three parallel forms of the test
may be given and the average of the scores for each person can be calculated. The
reliability of these mean scores will be the same as the reliability got by tripling the
length of the test.
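The prophecy formula can be sketched as a one-line function; values other than the worked example above are hypothetical. A fractional n (e.g. n = 0.5) estimates the effect of shortening the test.

```python
# Sketch: Spearman-Brown prophecy formula for a test lengthened n times,
# following the formula in the text.
def spearman_brown(r12, n):
    """Estimated reliability of a test n times the length of the original,
    given the original reliability coefficient r12."""
    return (n * r12) / (1 + (n - 1) * r12)

print(round(spearman_brown(0.70, 3), 2))    # tripling: 0.70 -> 0.88
print(round(spearman_brown(0.70, 0.5), 2))  # halving the test lowers reliability
```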
Predictions of test reliability by the Spearman – Brown formula are valid only
when the items or questions added to the test cover the same concept, are of equal
range of difficulty, and are comparable in other aspects to the items of the original
test.
The Spearman – Brown formula may be applied to ratings, judgements, and
other estimates as well as to test items.
FACTORS INFLUENCING THE RELIABILITY OF TEST SCORES
Chance and Constant Errors
Many factors affect the reliability of a test besides fluctuations in interest and
attention, shifts in emotions, and the differential effects of memory and practice.
Environmental disturbances such as distractions, noises, interruption, errors in
scoring should be added to these psychological factors mentioned above. All of
these variable influences (environmental and psychological) are included under the
heading "chance errors". Chance errors influence a score in such a way that
the score varies above or below its "true" value. Thus, the reliability coefficient is
a quantitative estimate of the influence of chance (variable) factors upon test scores.
Constant errors, as distinguished from chance errors, work only in one
direction. Constant errors may raise or lower all of the scores on a test or the
alternate forms of the test, but will not affect the reliability coefficient.
How high should the self-correlation of a test be for its reliability to be
considered satisfactory? It depends upon the nature of the test, the size and
variability of the group tested, and the purpose for which the test was given.
To detect a difference between the means of two relatively small groups of narrow
ability range, a reliability coefficient of 0.50 or 0.60 may suffice.
If the test is used to differentiate among the individuals in the group, however,
its reliability should be 0.90 or more.
THE INDEX OF RELIABILITY
An individual’s “true score” on a test is defined as the mean of a very large
number of observations or measurements for the person on the same test or
parallel forms of the test administered under approximately identical conditions.
The correlation between a series of obtained scores and their theoretically "true"
scores may be found by the formula

    r1∞ = √r12

(correlation between obtained scores on a given test and true scores in the
aspect measured by the test) in which
    r12 = the reliability coefficient of the test;
    r1∞ = the correlation between obtained and true scores.
The symbol ∞ (infinity) designates "true" scores, that is, scores obtained from
an "infinite" number of administrations of the test to the same group.
The coefficient r1∞ is called the index of reliability; it measures the
trustworthiness of test scores. This coefficient shows how well obtained scores
agree with their theoretically true counterparts.
For example, suppose for a given test the self-correlation is 0.64. Then
r1∞ = √0.64 = 0.80, and 0.80 is the highest correlation of which this test is capable.
If the self-correlation of a test is only 0.25, so that r1∞ = √0.25 = 0.50, it is a
waste of time to use this test without lengthening or otherwise improving it. A test
whose index of reliability is only 0.50 is an extremely poor estimate of the concept
it is trying to measure.
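These two worked examples can be reproduced with a one-line helper (the function name is ours):

```python
# Sketch: index of reliability = square root of the reliability coefficient,
# as defined in the text.
def index_of_reliability(r12):
    return r12 ** 0.5

print(round(index_of_reliability(0.64), 2))  # → 0.8
print(round(index_of_reliability(0.25), 2))  # → 0.5
```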
INTERPRETING RELIABILITY COEFFICIENTS
There are several ways of interpreting reliability coefficient of a given value.
The first method is to interpret a reliability coefficient as the proportion of
observed score variance that is true variance: a reliability of 0.80, for example,
means that 80 percent of the observed score variance is true variance, while the
remaining 20 percent is error variance.
Related to this definition of the reliability coefficient is the concept of the
standard error of measurement (SEM). This is based on the assumption that observed scores
consist of a true score component plus an error component; SEM is an estimate of
the relative size of the error component for any individual tested. The formula for
measuring standard error of measurement is as follows:
    SEM = σ1 √(1 − rtt)

where
    SEM = the standard error of an obtained score (also called the standard error of measurement);
    σ1 = the standard deviation of the test scores;
    rtt = the reliability coefficient of the test.
The standard error of an obtained score is the best method of expressing the
reliability of a test, since it takes into account the self-correlation of the test as
well as the variability of the group.
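As a sketch of how SEM is computed and used; the SD, reliability and obtained score below are hypothetical values, and the 68% band follows from the usual normal-error assumption.

```python
# Sketch: standard error of measurement, SEM = sd * sqrt(1 - r_tt),
# following the formula in the text; the values are hypothetical.
def sem(sd, r_tt):
    return sd * (1 - r_tt) ** 0.5

s = sem(15, 0.89)        # e.g., a scale with SD = 15 and reliability 0.89
obtained = 100
band = (obtained - s, obtained + s)  # roughly a 68% band around the true score
print(round(s, 2), [round(x, 2) for x in band])
```

Note that as the reliability rtt approaches 1, the SEM shrinks toward zero: more reliable tests yield more trustworthy individual scores.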
SUMMARY
The important characteristics of a research tool are reliability and validity.
Reliability refers to the consistency of scores upon testing and retesting.
Reliability includes consistency, stability, replicability and repeatability. There are
numerous sources of error in a test score; establishing reliability helps to minimize
the error variance in test scores. There are four important methods of establishing
the reliability of a test, viz., test-retest reliability, alternate forms reliability,
split-half reliability and internal consistency reliability. An increase in the length
of a test tends to increase reliability. Many factors affect the reliability of a test;
chance errors and constant errors are two common kinds. The reliability
coefficient is interpreted as the proportion of observed score variance that is true
variance. The standard error of measurement indicates the precision of an
obtained score.
KEY TERMS
Reliability
Test-retest reliability
Alternate forms reliability
Internal consistency reliability
Split-half reliability
Index of reliability
Standard error of measurement
QUESTIONS
1. Explain the concept of reliability with suitable examples.
2. Examine the several sources of errors in the test scores.
3. Describe the split half and internal consistency reliability.
4. Discuss the retest and alternate forms reliability.
5. How do you interpret the reliability coefficients?
LESSON – 17

TOOLS OF RESEARCH AND TESTS (Contd…)
OBJECTIVES
After reading this lesson the student should
- Understand the meaning and importance of validity.
- Explain the different types of validity.
- Understand the rationale for test construction.
- Explain the various approaches to test construction.
- Describe the principles involved in item analysis.
SYNOPSIS
Validity – Methods of obtaining validity coefficients. Test construction: Rational
test construction – Empirical test construction – Factor analytic test
construction – Steps in test construction – Sources of information about tests.
INTRODUCTION
Reliability indicates the degree to which a test is measuring some attribute in a
consistent manner. The term ‘validity’ refers to the extent to which the test actually
measures the characteristic or dimension we intend to measure. If a test that is
designed to measure intelligence measures something else instead, then it is not a
valid measure of intelligence. Validity is the second major requirement for a test of
good quality that has some usefulness for applied purposes.
In addition, the concept of validity is concerned with the theoretical and
applied usefulness of a test. The usefulness of tests depends upon the ability to
make inferences about people or environments from their scores.
The three major kinds of validity are content validity, criterion-related
validity and construct validity. Other types of validity are appropriate for
particular uses.
i) Content Validity
The term content validity refers to how well the content of a test measures the
entire domain and is related to the characteristic being measured. Content
validation allows us to judge whether or not the content of a test is representative of
the desired universe of content. Content validity describes the extent to which our
sample of items or situations reflects the dimensions, domain or characteristic to
which we wish to generalise.
To ensure content validity, it is necessary to carefully define the dimension of
interest; the test items should reflect that dimension. For example, in defining the
construct of 'anxiety', it is necessary to specify what behaviours may indicate it:
increased heart rate, sweaty palms or simply the person's report of feeling anxious
are all indicators of anxiety. It is also necessary to specify a universe of situations
in which anxiety might be felt. Some people may feel anxious in social situations,
some in academic or job situations, some while driving a car or while shopping,
and some in all of these situations. For an estimate of a person's overall tendency
toward becoming anxious, it is important to sample from the entire range of
situations in which the behaviour could be exhibited.
Evidence in support of content validity is the judgement of those who construct
the test or other experts familiar with the subject area or definition. Since this
kind of evidence is usually subjective, it should be accompanied by a detailed
definition of the behavioural domain and by a clear specification of the methods
used to select items.
Content validity can be indirectly evaluated through the degree to which the test
shows high internal consistency reliability, or homogeneity.
ii) Criterion-Related Validity
This type of validity refers to the extent to which a measure of a concept
indicates an association with some independent or external indicator of the same
concept.
This external indicator, "the criterion", often represents the behaviour we are
actually interested in. For example, scholastic aptitude tests are used to predict
success in completing a college curriculum.
There are two kinds of criterion-related validity: (1) predictive validity and
(2) concurrent validity.
1. Predictive Validity: This is studied when the criterion is measured sometime
after the scores are obtained on the predictor. This is to study how present status
on the test predicts future status on the criterion variable.
2. Concurrent validity: This kind of validity is studied when both the predictor
and the criterion scores are obtained at the same time. Here, the interest is in the
relationship between the present status on the test and also on the criterion. For
example the observed relationship between scores on scales of the MMPI and
present psychiatric status would be concurrent validity data.
Two major ways of demonstrating a relationship of this kind are through
correlational data and through data regarding differences between groups.
'Predictive validity' asks whether the test provides the information users need
for predicting outcomes on specific kinds of tasks and assignments, for example,
clerical performance, academic achievement or salesmanship. A test intended to
predict outcomes in clerical work is valid if it correlates with an assessment of
clerical proficiency; a sales test is valid if it predicts the extent of sales under
standard conditions.
The Pearson – Product moment correlation is appropriate for criterion – related
validity studies. If such correlations are zero or in a direction other than that
predicted, the criterion-related validity of the test is questionable. For example, if
people scoring high in introversion on a personality test turn out to have more
friends and belong to many social groups, the validity of the test is low; the test is
not valid.
The practical use of criterion – related validity coefficients, based on the
methods of regression – analysis, is in prediction. Regression equations of this sort
are used extensively in the selection of people for educational and job-training
programmes.
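The regression-based use of validity coefficients described above can be sketched as follows; the data and names (aptitude, performance) are invented for illustration.

```python
# Sketch: simple linear regression of a criterion on test scores, as used
# in selection. The data and variable names are invented.
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for predicting y from x."""
    mx, my = mean(x), mean(y)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    return my - b * mx, b          # intercept, slope

aptitude = [40, 55, 60, 70, 85]      # predictor (test) scores
performance = [50, 58, 65, 72, 90]   # criterion scores
a, b = fit_line(aptitude, performance)
predicted = a + b * 65               # predicted criterion for a new applicant
print(round(predicted, 1))
```

The slope and intercept play the same role here as the regression weights used in selecting people for educational and job-training programmes.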
If the relationship between two variables is assumed to be linear, Pearson –
Product – moment correlation is an appropriate index of the relationship. If the
relationship is curvilinear, the eta coefficient is the appropriate index of
relationship. For example, if it is assumed that very high or very low levels of test
anxiety are associated with lower levels of test performance, the relationship of
anxiety to performance will not be linear; thus the eta coefficient would be
appropriate.
Group Difference Approach to Criterion-Related Validity
Another approach to the study of criterion-related validity involves the extent to
which test scores can differentiate between groups of people.
In applied situations, scores can be used to predict success or failure in job
training programmes or to separate people who have psychotic tendencies from
normal individuals or to separate people who will be satisfied in a particular
occupation from those who will not be satisfied.
For predictive purposes, an important aspect of group-differences data is not
only the degree of difference between group means, but also the amount of overlap
between the score distributions of the two groups. The amount or percentage of
overlap between the two score distributions may be described exactly through the
calculation of Tilton’s (1937) overlap statistic. A high percentage of overlap (for
example 75%) indicates a test with little utility for predictive (classification)
purposes, while overlap percentages below 50 per cent indicates that the test can
be useful in prediction of group membership, for example, the 'successes' versus
the 'failures' (Dunnette, 1966). Any statistical index of relationship between
variables may also be appropriate.
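The overlap idea can be illustrated with a simplified normal-curve approximation; this is not Tilton's exact procedure, and the function name and values are ours.

```python
# Sketch: approximate percentage overlap of two score distributions,
# assuming both are normal with a common SD. This is a simplification of
# the idea behind Tilton's (1937) overlap statistic.
from statistics import NormalDist

def percent_overlap(mean1, mean2, sd):
    d = abs(mean1 - mean2) / sd            # standardized mean difference
    return 200 * NormalDist().cdf(-d / 2)  # shared area of the two curves, in %

print(round(percent_overlap(100, 110, 15), 1))  # large overlap -> poor separation
```

By the rule of thumb above, an overlap near 74 percent would indicate little utility for classification, while values below 50 percent suggest the test can help predict group membership.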
iii) Construct Validity
Construct validation is the process of gathering data to show that the designed
test is actually a reflection of the construct or attribute.
Construct validation occurs within the context of a theory or set of hypotheses
concerning the construct. The constructs of the theory are made observable
through the tests or assessments and then the relationship among variables can be
studied through the relationship between tests.
The process of construct validation involves the following steps:
1. The construct is carefully defined, and hypotheses regarding the nature and
extent of its relationship to other variables are postulated.
2. An instrument designed to measure that construct is developed.
3. The reliability of the test is examined. After establishing reliability, the
relationships of the test to other variables are examined.
If the hypotheses about the construct are supported in research studies using
our measure of that construct, the construct validity of the test is supported.
The validation of construct or theories may require the use of Factor Analysis.
Construct validity is an important characteristic by which the test acquires
meaning, interpretability and usefulness.
iv) Convergent and Discriminant Validity
The process of validation involves the study of the relationship of the new test to
other variables designed to measure the same construct. For example, if a new
measure of creativity is developed, it should have some relationship to other valid
measures of creativity, or to others' judgements regarding a person's artistic,
literary, musical or scientific productivity. The relationship of the new test to
independent tests or indices of the same trait is known as convergent validity.
The validation process should also demonstrate moderate or negligible
relationships between the test and certain other variables. For example, if the new
test of creativity is found to have a very high relationship to intelligence, there
would be no need for the test of creativity. A very high correlation between two
theoretically different constructs suggests that one of them is unnecessary.
It is also necessary to demonstrate that the new test is unrelated to variables
that are entirely different; this is known as 'discriminant validity'. If the concept of
creativity assumes that it is unrelated to an individual's weight, we expect to find a
zero correlation between creativity scores and weight.
It is also necessary to consider "method variance" when testing for convergent
and discriminant validity. Method variance concerns the extent to which scores on
two different tests are correlated because they use the same format (for example,
true-false, multiple choice or essay), and also whether or not speed is required in
taking the test.
If three different abilities are tested using the same method, the correlations
among scores on the three tests could be artificially inflated by the common
method of measurement. Method variance operates to increase estimates of
convergent validity because it inflates the correlations between test scores. For
example, three tests of intelligence might show high intercorrelations simply
because all use the same method of measurement, even though they measure
three different abilities.
v) Incremental Validity
Tests are used widely for selection and placement in educational institutions
and in business and industry.
Before using a test for selection and placement purposes, it is evaluated as to
what extent it improves the accuracy of decision-making beyond already
available or less expensive methods of assessment. The capacity of a test to lead to
improved prediction over already available predictive tools is known as
'incremental validity'.
For example, a test having a correlation of r = 0.90 with the criterion measure
(e.g. job performance) would not have incremental validity if some available method
of selection had the same degree of predictive accuracy. On the other hand, a test
with low-to-moderate predictive validity (for example, r = 0.30 or r = 0.50) might
have considerable incremental validity if there were few or no other bases upon
which to make the necessary decision.
vi) Face Validity
The term face validity concerns the extent to which a test appears to measure
the concept. For example, if you are told you are beginning an intelligence test but
all the items ask about your feelings toward your parents, the test would lack face
validity.
Face validity is not necessarily equivalent to content validity. In some cases,
indirect ways of getting at certain characteristics may be necessary, especially if
people’s knowledge of themselves or their honesty in self-description is in doubt.
Thus a test item may have content validity as judged by experts in the field, but
may not have face validity for test users and test takers.
A test is not valid just because it appears to be valid. Face validity is not
required by the Standards for Educational and Psychological Tests (1974),
developed by a joint committee of the AERA, APA and NCME.
However, the effective use of tests depends on the establishment and
maintenance of trust and rapport between test users and the public, and face
validity may be very important in this regard.
RELIABILITY AND VALIDITY
The essential difference between reliability and validity is that reliability
coefficients always reflect the correlation of an examination with itself (or with a
parallel form of itself), whereas validity coefficients require some criterion external
to the test itself.
TEST CONSTRUCTION
The method of test construction follows from the rationale underlying the test
itself and is thus important for its interpretation. There are three major approaches
to test development.
(1) the rational, (2) the empirical and (3) the factor-analytic.
i) Rational Test Construction
This approach to test construction is based on the assumption that the content
of the test items directly reflects the characteristic or dimension we are interested
in measuring. If we are measuring knowledge of Indian history, for example, we
assume that the question who was the first prime minister of India? is direct
157

reflection of an individual’s knowledge. In essence, then, the test items bear a


logical or intuitive relationship to the dimension of interest.
Rational test construction begins with a careful detailed definition of the
characteristic or construct to be measured. If we are measuring the knowledge of
Indian history, we need to define the term ‘Indian’ and delineate the time period to
be included in our measure.
Once the dimension of interest (say, the construct) is defined, the test
constructor develops test items that are related to the content – that is, the
definition – of that dimension. For example, for a scale assessing interest in being
democratic, the item would bear obvious relationship of interest. Generally, a test
constructor collects large number of items more than that needed in the final
version of the test. A large number of items is necessary, so that items that do not
relate to the dimension of interest may be eliminated.
Finally, based on item analysis findings the test constructor selects from the
initial collection of items a number of good items desired in the final version of the
test. These items are logically related to the construct of interest. In
interpreting scores on a rationally constructed test, we not only know the person's
score on the dimension of interest, but can also make inferences about the specific
ways in which he/she can be described.
ii) Empirical Test Construction
In contrast to the rational approach, empirical test construction is based on
differences in individuals' responses on the dimension of interest, rather than on
item content. In other words, test construction begins with a search for items that
will predict a criterion measure.
An example is the occupational scales of the Strong-Campbell Interest Inventory
(SCII). In this case, the criterion groups were people satisfied in various
occupations (e.g. Engineers, physicians, etc.). Items on which the responses of
people in a given occupation differed from the responses of people in general were
used in constructing the occupational scale.
Item content was not considered in the construction of the SCII scale. This
scale, then, could contain items bearing no obvious relationship to the dimension
being measured.
Several points concerning empirical test construction should be noted. First,
the practical purposes of psychological testing have emphasized the capability of
test scores to predict real-world phenomena. Since empirical test construction is
based on the capacity of a test item to predict a criterion, rather than on item
content, this approach has had considerable utility for applied purposes. Second,
empirical test construction is based on the rationale that criterion groups differ
from each other as reflected in test scores; however, the ways in which the groups
will differ are not known in advance. Thus the underlying rationale for test
construction differs from that of the rational approach.
Finally, interpretation of empirically based test scores is in terms of the
similarity of the individual's responses to those of the various criterion groups,
rather than in terms of item content. Unless we are familiar with the specific
items used in constructing the scale, we cannot make inferences about the
person's item responses simply from knowledge of the test score.
iii) Factor-Analytic Test Construction
In this approach a large pool of items, perhaps taken from many different tests,
is factor – analysed to determine the basic dimensions underlying the test items.
These basic dimensions become the scale and the items loading highly on each
factor are those used in constructing the test. One of the first test batteries
constructed in this way was the General Aptitude Test Battery, developed by the
United States Employment Service.
Factor analysis has also been used in the construction of personality
inventories. For example, the Sixteen Personality Factor Questionnaire was based
on factor-analytic research directed at identifying primary or basic personality
traits.
The primary objective of this approach to test construction is the derivation of
scales that are homogeneous and relatively independent of each other; the primary
criterion for selecting an item is that it loads highly on its factor. Thus
factor-analytic approaches may provide a blend of the content relevance of items
found in the rational approach and the high correlation with a criterion found in
the empirical approach to test construction.
STEPS IN TEST CONSTRUCTION
One way of evaluating a constructed test is by its reliability and validity, but
the best way to ensure a high-quality test is through proper and careful methods of
test construction. The desirable approach to test construction is a combination of
the rational and the empirical approaches, which includes the following steps:
1. Careful definition of the attribute to be measured.
1. Careful definition of the attribute to be measured.
2. Writing a large pool of items/statements logically related to the attribute of
interest.
3. Administration of the items to a large sample of subjects, often called the
‘development sample’.
4. Refinement of the original item pool through item analyses and expert
judgement.
5. Administration of the revised test to a new sample of subjects.
6. Based on the new sample, examination of evidence for reliability and
validity and computation of normative data.
i) Definition of the Construct
Test construction begins with a careful, detailed definition of the
characteristic to be measured. This definition should specify both what the
attribute is and what it is not: both the behaviours/tasks included in the definition
and those not included.
In many cases, actually defining the meaning of the construct is the most
difficult step. Although we all have an intuitive, commonsense understanding of
characteristics such as intelligence, anxiety, dominance and sociability, it is not
easy to specify exactly what behaviours and responses should be assumed to
indicate the construct. This is a very important and challenging step, since
everything that follows depends on the specificity and clarity of the definition of the
construct.
ii) Developing Test Items
Next, the test constructor develops test items that are related to the content
(definition) of that dimension. This kind of item development requires that the test
developer carefully examines the definition of the construct and then infer specific
behaviours or responses that should reflect components of that definition. The test
constructor would examine the definition and then write items that reflect the
characteristics indicated by the definition. The test constructor should develop a
pool of items of a larger number than that is needed in the final version of the test.
A large pool of items are necessary so that items, that do not relate as postulated to
the dimension of interest may be eliminated.
iii) Administration of Test Items to a Developmental Sample
The next stage of test construction involves administration of the items to a
preliminary sample of subjects. The subjects in this group should be representative
of the population of subjects for whom the test itself is intended. Subjects in the
development sample are administered the test under conditions identical to those
that will be used in the administration of the completed test. The responses
obtained from this pilot testing are used in item analysis.

iv) Item Analysis


ANNAMALAI UNIVERSITY
Although definition of the construct and development of relevant test items are
intended to ensure that the items in the initial item pool are reasonably good
representatives of the construct, knowledge of other properties of the test items is
used in refining the item pool. Sometimes the process of item refinement requires
us to write new items, because in the process of eliminating items we may not have
enough items left to constitute a test of the desired length.
The two major statistical properties useful in the selection or elimination of test
items are item difficulty and item discrimination.
a) Item difficulty and indices of difficulty: When item difficulty is used in an
item-analysis context, it also covers the concept of 'popularity'. There are two
problems with using the item mean as an index of difficulty for an item. One is
that the larger the pi (the proportion 'passing' the item), the easier the item, i.e.
the scale of difficulty is reversed. The other is that the scale of proportions is not
an interval scale. A correct rank ordering of items can nevertheless be achieved,
and items having the same pi may be assumed to be of equal difficulty for the
same population.
A Rational Scale for Item Difficulty: A rational scale of difficulty for items is
achieved by transforming proportions to corresponding standard Z-score values.
The conventional rational-scale procedure assumes that the proportions represent
areas under the normal-distribution curve. A given proportion pi is taken to
represent the proportion of the area under the normal distribution 'above' a
certain corresponding Zi value on the base line of the curve. Thus, the greater the
proportion, the lower the Zi value.
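The proportion-to-Z transformation described above can be sketched in a few lines of Python (a hypothetical illustration, not part of the original text), using the standard normal distribution from the standard library:

```python
from statistics import NormalDist

def difficulty_z(p_passing):
    """Z-scale difficulty for an item: the base-line point that leaves
    a proportion p_passing of the normal curve ABOVE it.  As noted in
    the text, the greater the proportion passing, the lower the Z value."""
    return NormalDist().inv_cdf(1 - p_passing)
```

For example, an item passed by half the sample gets a difficulty of 0, an easy item (p = 0.84) a negative Z, and a hard item (p = 0.16) a positive Z.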
Scaling items for difficulty on Z scales is not always necessary. Doing so,
however, makes feasible a number of things that would otherwise not be possible.
Having equal-unit or interval-scale values for items, we can speak of functional
relationships between the difficulty of the items and other properties. For
example, if items are sounds to be judged for differences in pitch or loudness,
stimulus properties can be related to item difficulty. Relating difficulty to
chronological age and other variables also becomes possible in quantitative terms.
Item difficulty when there is chance success: Most frequently used tests are
composed of items for which the response is to be chosen from several (commonly
2 to 5) possible answers. In such cases the proportion of right answers is inflated
by an increment due to chance success. The smaller the number of alternatives,
the greater the chance contribution.
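The text does not give a formula, but one common correction for guessing (assumed here: k equally attractive alternatives and blind guessing by those who do not know the answer) deflates the observed proportion correct as follows:

```python
def correct_for_chance(p_observed, k):
    """Estimate the 'true' proportion knowing the answer on a
    k-alternative item, assuming blind guessing by non-knowers:
        p_true = (p_observed - 1/k) / (1 - 1/k)
    With fewer alternatives, 1/k is larger, so the chance
    increment removed is greater, as the text notes."""
    chance = 1.0 / k
    return (p_observed - chance) / (1.0 - chance)
```

A two-alternative item on which 60 per cent respond correctly thus yields an estimated 20 per cent who actually know the answer.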
Item Discrimination: This is an important property for almost all types of tests
designed to assess some unitary attribute. Item discrimination refers to the extent
to which individuals' responses to a given item measuring a construct are related
to their scores on the measure as a whole. If item responses do not relate to
performance on the scale as a whole, we have reason to suspect that the item is not
measuring what it was intended to measure. For example, suppose the brightest
child in the class scores 99 on an achievement test, and the one item he or she
answered incorrectly was an item that the poorest student answered correctly; we
would question the extent to which that item was actually measuring achievement
in the subject matter.
Items such as these may be poorly discriminating because they are not clearly
worded, are ambiguous in meaning, are subject to alternative interpretations, or
simply are not related to the dimension of interest. Indices of item discrimination
allow the test constructor to select the 'good' items and to eliminate the 'bad'
items from the test.
Item correlation with an external criterion: There are many indices of item
discrimination. Two correlation coefficients used extensively in the examination of
discriminating power are the biserial rbis and the point-biserial rpbi. The
point-biserial rpbi, which is a Pearson product-moment correlation coefficient,
describes the relationship between a dichotomous variable and a continuous
variable, such as the total score on a test.

1) A special formula, which does not resemble the basic Pearson formula, reads

    rpbi = [(X̄p − X̄q) / St] × √(pq)

Here,
rpbi = point-biserial coefficient of correlation.
X̄p = mean of the scores for those passing the item.
X̄q = mean of the scores for those failing the item.
St = standard deviation of the total scores.
p = proportion passing the item.
q = proportion failing the item.
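As a sketch (hypothetical data and function name, not part of the original text), the formula can be computed directly from 0/1 item responses and total scores:

```python
import math
from statistics import mean, pstdev

def point_biserial(passed, totals):
    """r_pbi = ((Xp_bar - Xq_bar) / St) * sqrt(p * q), where `passed`
    holds 0/1 responses to one item and `totals` the total test scores."""
    p = sum(passed) / len(passed)              # proportion passing the item
    q = 1 - p                                  # proportion failing the item
    mp = mean(t for t, x in zip(totals, passed) if x == 1)   # X-bar p
    mq = mean(t for t, x in zip(totals, passed) if x == 0)   # X-bar q
    st = pstdev(totals)                        # SD of the total scores
    return (mp - mq) / st * math.sqrt(p * q)
```

Despite its special appearance, the result is identical to the ordinary Pearson r between the 0/1 item variable and the total scores.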
The biserial rbis, on the other hand, describes the relationship between a
dichotomized variable and a continuous variable.
2) The general formula for the biserial correlation coefficient is

    rbis = [(X̄p − X̄q) / St] × (pq / y)

where,
X̄p = mean of X values for the higher group in the dichotomized variable, the
one having more of the ability on which the sample is divided into two sub-groups.
X̄q = mean of X values for the lower group.
p = proportion of cases in the higher group.
q = proportion of cases in the lower group.
y = ordinate of the unit normal distribution curve at the point of division
between the segments containing the p and q proportions of the cases.
St = standard deviation of the total sample in the continuously measured
variable X.
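A sketch of the same computation in Python (hypothetical names, not from the text; the normal ordinate y is obtained from the standard library's NormalDist):

```python
from statistics import mean, pstdev, NormalDist

def biserial(passed, totals):
    """r_bis = ((Xp_bar - Xq_bar) / St) * (p * q / y), where y is the
    unit-normal ordinate at the point dividing the p and q proportions."""
    p = sum(passed) / len(passed)                # higher (passing) group
    q = 1 - p                                    # lower (failing) group
    mp = mean(t for t, x in zip(totals, passed) if x == 1)
    mq = mean(t for t, x in zip(totals, passed) if x == 0)
    st = pstdev(totals)
    z = NormalDist().inv_cdf(q)                  # leaves proportion p above it
    y = NormalDist().pdf(z)                      # ordinate at that point
    return (mp - mq) / st * (p * q / y)
```

Because pq/y exceeds √(pq), rbis is always larger in magnitude than rpbi for the same data and, unlike a Pearson r, can exceed 1.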
The computation of both rpbi and rbis is based on comparing the mean test
scores of those who pass the item with those of those who fail it, in relation to
the difficulty of the item itself. Evidence for item discrimination requires that the
overall test scores of those who respond correctly to an item be higher than the
mean scores of those who respond incorrectly to the item.
Another index of item discrimination is the phi coefficient (φ), also a Pearson
product-moment correlation coefficient. Phi expresses the relationship between
two dichotomous variables. As an index of item discrimination it tells us the
degree to which the percentage of subjects passing the item is related to their
position in the upper or lower group of overall scorers.
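For a 2 × 2 table of upper/lower scorers against pass/fail, phi can be sketched as follows (hypothetical cell labels, not from the text):

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 frequency table:
                     pass   fail
      upper group      a      b
      lower group      c      d
    phi = (a*d - b*c) / sqrt((a+b)(c+d)(a+c)(b+d))."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
```

Phi is 1 when passing the item coincides perfectly with membership in the upper group, and 0 when the pass rates in the two groups are equal.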
Figure 10.1 A normal distribution of the cases along the scale of ability to pass
the course of training. The area to the right of the ordinate shown represents the
65 percent who graduated and the area to the left represents the 35 percent who
failed to graduate. The Y ordinate is 0.3704.

Other indices of item discrimination include a simple comparison of the
percentages of subjects answering correctly in extreme groups. This has been
referred to as the 'U-L index' or simply as 'D'.
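The U-L (D) index is simply the difference between the proportions answering correctly in the two extreme groups; a minimal sketch with invented counts:

```python
def discrimination_index(upper_correct, n_upper, lower_correct, n_lower):
    """D = proportion correct in the upper group minus proportion
    correct in the lower group; ranges from -1 to +1."""
    return upper_correct / n_upper - lower_correct / n_lower
```

For example, if 24 of 30 upper-group subjects and 9 of 30 lower-group subjects answer an item correctly, D = 0.8 − 0.3 = 0.5.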


SUMMARY
Validity refers to the extent to which a test actually measures the characteristic
we intend to measure. Validity is concerned with the theoretical and applied
usefulness of a test. There are three types of validity: content validity,
criterion-related validity and construct validity. Apart from these we have
convergent and discriminant validity, incremental validity and face validity. All
reliable tests need not be valid, but all valid tests should be reliable. There are
three approaches to constructing a test, viz. the rational, empirical and
factor-analytic methods. Test construction involves many steps, including the
analysis of item difficulty and item discrimination. After constructing a test, its
reliability and validity should be established.

KEY TERMS
Validity
Content validity
Criterion-related validity
Predictive validity
Concurrent validity
Construct validity
Convergent and discriminant validity
Incremental validity
Face validity
Test construction
Rational test construction
Empirical test construction
Factor analytic test construction
Item difficulty
Item discrimination
QUESTIONS
1. Define validity and explain the different types of validity with suitable
examples.
2. Explain the importance of establishing validity of a test.
3. Bring out the rationale for test construction in detail.
4. Discuss the empirical and factor analytic test construction.
5. Elaborate the principles and procedures of item analysis.


LESSON – 18

RESEARCH METHODS

OBJECTIVES
After reading this lesson the student should be able to:
• Understand the various research methods.
• Describe historical research.
• Explain the normative survey with suitable examples.
• Describe experimental research.
• Explain the cautious approach to experimentation.

SYNOPSIS
Historical research – Normative survey – Experimental Research.

INTRODUCTION
Scientific research is a systematic and objective attempt to provide answers to
certain questions. The purpose of scientific research is to discover and develop an
organised body of knowledge. Scientific research is therefore the systematic and
empirical analysis and recording of controlled observation, which may lead to the
development of theories, concepts, generalizations and principles, resulting in the
prediction and control of those activities that may have some cause-effect
relationship.
Various methods of research have been widely used in the social sciences.
Best and Kahn (1992) classified research in the social sciences under three broad
categories: historical research, descriptive research and experimental research.
This comprehensive classification provides an idea of social science research, and
all research falls under one of the above three types or a combination thereof. In
this lesson let us discuss historical, descriptive and experimental research in
brief.

i) Historical Research

History is a meaningful record of human achievement. It is not merely a list of
chronological events but a truthful, integrated account of the relationships between
persons, events, times and places. We use history to understand the past and to
try to understand the present in the light of past events and developments.
Historical analysis may be directed toward an individual, an idea, a movement or
an institution. However, none of these objects of historical observation can be
considered in isolation. People cannot be subjected to historical investigation
without some consideration of their interaction with the ideas, movements and/or
institutions of their times. The focus merely determines the points of emphasis
toward which historians direct their attention. Those who wish to engage in
historical research should read the works of historians regarding the methods and
approaches to conducting historical studies in psychology.

Meaning
Historical research is research which investigates, records, analyses and
interprets the events of the past for the purpose of sound generalizations that are
helpful in understanding the past and the present and, to a limited extent, in
anticipating the future. Thus historical research describes "what was?".
Historical research refers to the application of the scientific method to the
description and analysis of past events. It seeks to test the truthfulness of the
reports made by others. It is concerned with the past and with the social forces
which have shaped the present, discovering past trends of events, facts and
attitudes. It traces lines of development in human thought and action. A great
deal of social insight and historical orientation is necessary for historical research.

Purpose
Historical research is useful in psychology in many ways.
1. It helps us to establish a link between the past experiences on the one hand
and the present attitudes and values on the other.
2. It helps us to demonstrate the roots of prejudice and other psychological
factors on the basis of heredity and environment.
3. It enables us to understand the dynamic nature of human beings and the
dynamic changes in human behaviour from time to time.
4. It increases our understanding of the relationships between education,
psychology and the culture in which it operates.
5. It is useful for removing our prejudices.
6. It helps us to avoid the mistakes of the past.
7. It helps us to take up the best in the past.
Steps in historical research
There are five important stages or steps in historical research.
i) Selection of the Problem: Not all problems are suited for historical research.
Only those problems which need a scanning of historical records and have some
social utility should be selected for historical research. Usually, a current problem
cannot be studied meaningfully by historical research. For example, to explain the
genetic basis of I.Q. we can take up historical research.
ii) Formulation of Hypotheses: Hypotheses should be formulated in historical
research in psychology. The historian gathers evidence and then carefully
evaluates it against the hypotheses. If the evidence is compatible with the
hypotheses and their consequences, the hypotheses are confirmed. Through the
synthesis of hypotheses, historical generalizations are established.
iii) Collection of data: Historical data are collected from primary sources or
secondary sources. Primary sources are eyewitness accounts reported by an actual
observer of, or participant in, the event. Secondary sources are accounts of an
event that are gathered after the reporter has discussed it with the actual observer
or has seen the accounts of an observer.
iv) Criticism of the data: To test the trustworthiness, authenticity and worth of
the historical data, a careful analysis and criticism must be made. Historical
evidence is usually derived from historical data by the process of external or
internal criticism, or both.
v) Data analysis and generalization: The collected historical evidence should be
analyzed carefully and generalizations arrived at during this stage. In reaching
conclusions the historian employs the principles of probability, as is done by a
physical scientist. Further, the results are subjected to further analysis in order
to establish perfect authenticity and correctness.

Sources of Data in historical research


There are two major sources of data in historical research:

a) Primary Sources
Primary sources are eyewitness accounts of events reported by an actual
observer or participant in an event. The important sources which are commonly
used are presented below.
i) Documents: Documents are those records which are kept and written by
actual participants in, or direct witnesses of, an event. These sources are produced
for the purpose of transmitting information to be used in the future. Documents
which are included as primary sources are official minutes or records,
constitutions, certificates, declarations, licenses, deeds, wills, affidavits, books,
diagrams, paintings, pictures, films, newspapers, catalogues, advertisements,
research findings, reports and so on.
ii) Remains or relics: Remains or relics are objects which are associated with
some person, group or period. Fossils, skeletons, tools, weapons, clothing,
buildings, coins, furniture, pictures and art objects are examples of such remains
or relics. They provide clear evidence about past life and happenings.
iii) Oral Testimony: Oral testimony is the spoken account of a witness of, or a
participant in, an event. Such evidence can be obtained in a personal interview, or
can be recorded or transcribed as the witness or participant relates his or her
experiences. Even oral traditions like ballads, anecdotes, tales and songs may be
used as primary sources of data.

b) Secondary Sources
These are sources provided by others who have not seen the past events. The
writer of a secondary source was not on the actual scene of the event; rather, he
merely reports what the person who was there said or wrote. Many books and
encyclopedias are examples of secondary sources because they are often far
removed from the original accounts. Secondary sources of data are used less than
primary sources because of the errors that may result when information is passed
from one person to another.

Historical Criticism
Historians often face difficulty in obtaining trustworthy and authenticated
data, since they do not directly observe the events. Since past events cannot be
repeated at the will of investigators, historians have been forced to depend upon
those who witnessed or participated in those events. They have to make a careful
analysis so as to distinguish unerringly between a true event and a false event,
and between relevant and irrelevant information. Data which are trustworthy,
authentic and worth using are called historical evidence. Historical evidence is
derived from historical data by the process of criticism, which is of two types, viz.
external and internal.
a) External criticism: This is primarily aimed at determining whether the
documents evaluated are authentic or not. Historians employ various tests of the
genuineness of data. For this purpose, they thoroughly examine the signature,
handwriting, script, language and documentation, in the light of the knowledge
available at that period and what is otherwise known. Sometimes historians resort
to physical and chemical tests of ink, paint, paper, cloth, metal or wood. They try
to establish whether or not these elements are consistent with known facts about
the person, the knowledge available and the technology of the period from which
the remains have been obtained.
b) Internal criticism: In internal criticism the investigator tries to evaluate the
accuracy or worth of the data. It aims at determining whether the writer who has
made a report or observation about a past phenomenon has done justice to the
topic or not. It involves questions like:
Did the writer have any bias?
Did he possess adequate knowledge of the facts?
Did he have any motive in writing such a report?
Did he have any religious, political or class prejudices?
How does his account correspond with those of other competent writers? and so on.
Although these questions are difficult to answer, historians try to establish
that the data are not only authentic but also accurate, failing which they cannot
introduce historical evidence worth consideration.

Evaluation of Historical Research
1. Historical research is never complete. It is derived from the surviving records
of a limited number of events that took place in the past. Historical factors
are examined as critically as scientific facts.
2. Hypotheses are tested mostly through observation and analysis of
documents and relics.
3. The reliability of a research report is determined by the depth and breadth
of knowledge about the past events.
4. Historical findings usually do not permit broad generalisations. They are
factual and true in one situation in the context of a given time and place.
Future events are not predicted by these findings.

5. Recognizing all these limitations, the historian must endeavour to make his
work as accurate as possible.

Limitations
Most historical research studies reveal serious limitations. Some of these are:
1. The depth and reliability of the available data are limited.
2. No life-size writing of historical events is possible.
3. There is a wide dispersal of documents, which makes it difficult to collect
them.
4. Neither the data collected nor the inferences made are verifiable.
5. Personal biases and private interpretations often enter unconsciously into
the interpretation of historical events.
Even though historical research has a number of limitations, it is still a
demanding undertaking. The gathering of historical evidence requires long hours
of careful examination of documents or other primary sources of data. In fact, any
significant historical study would make demands that few students have the time,
financial resources, patience and expertise to meet. For these reasons, good
historical studies are not often attempted for the purpose of meeting academic
degree requirements.

ii) Normative Survey (or) Descriptive Research


A descriptive study describes and interprets “what is?” It is concerned with
conditions or relationships that exist, opinions that are held, processes that are
going on, effects that are evident or trends that are developing. It is primarily
concerned with the present although it often considers past events and influences
as they relate to current conditions.

Meaning
Descriptive research, sometimes known as non-experimental or correlational
research, deals with the relationships between variables, the testing of hypotheses
and the development of generalizations, principles or theories that have universal
validity. It is concerned with functional relationships. The expectation is that if
variable A is systematically associated with variable B, prediction of future
phenomena may be possible, and the results may suggest additional or competing
hypotheses to test.
In carrying out descriptive research, the researcher does not manipulate any
variables or treatments as in experimental research. In fact, the events that are
observed and described would have happened even if there had been no
observation or analysis. Descriptive research involves events that have already
taken place and may be related to a present condition.

Purpose
The descriptive method is particularly appropriate in the behavioural sciences
because many of the types of behaviour that interest the researcher cannot be
arranged in realistic settings. Further, introducing significant variables may be
harmful or threatening to human subjects. In a descriptive study we can conduct
the research under conditions that occur naturally in the home, the classroom, the
office or the workplace, where human behaviour can be systematically examined
and analysed.

Characteristics
1. Descriptive study is characterized by disciplined inquiry, which requires
expertise, objectivity and careful execution.
2. It develops knowledge, adding to what is already known.
3. It uses the important techniques such as observation, description and
analysis.
4. Descriptive studies lead to the generalizations beyond the given sample and
situation.
Process of Descriptive Research
1. Examination of problematic situation.
2. Definition of problem.
3. Statement of hypotheses.
4. List of assumptions.
5. Selection of appropriate subject (sample) and source materials.
6. Selection or construction of tools for data collection.
7. Descriptive analysis and interpretation of data.
Types of Descriptive Studies
i) Assessment Studies
Assessment describes the status of a phenomenon at a particular time. It
describes, without value judgement, a situation that prevails. It attempts no
explanation of underlying reasons and makes no recommendations for action. It
may deal with prevailing opinion, knowledge, practices or conditions. As ordinarily
used in psychology and education, assessment describes the progress that
students have made toward educational goals at a particular time, or the
behaviour modification that has taken place in an individual over a particular
period.
Various types of assessment studies are used in psychology. They are:
a) Surveys – like social surveys
b) Activity analysis – like job analysis and motivation analysis
c) Trend analysis – like public opinion surveys

ii) Evaluation Studies


Evaluation is the process used to determine what has happened during a given
activity or in an institution. The purpose of evaluation is to see whether a given
programme is working, whether an institution is successful according to the goals
set for it, or whether the original intent is being successfully carried out. To
assessment, evaluation adds the ingredient of value judgement of the social
utility, desirability or effectiveness of a process, product or programme, and it
sometimes includes a recommendation for some course of action.
The important types of evaluation studies are
a) School surveys
b) Programme evaluation
c) Case studies
d) The follow-up study
Survey Research
It is noticed that in most assessment and evaluation studies, survey research
becomes the key method. In survey research, the investigator selects a sample of
respondents and administers a questionnaire or conducts interviews to collect
information on variables of interest. Researchers collect detailed descriptions of
existing phenomena with the intent of employing the data to justify current
conditions and practices, or to make more intelligent plans for improving
socio-economic or educational conditions and processes. Surveys are used to
learn about people's attitudes, beliefs, values, demographic facts, behaviour,
opinions, habits, desires, ideas and other types of information. They are used
frequently in business, politics, government, sociology, public health, psychology
and education because accurate information can be obtained for large numbers of
people with a small sample. Most surveys describe the incidence, frequency and
distribution of the characteristics of an identified population. In addition to being
descriptive, surveys can also be used to explore relationships between variables in
an explanatory way.
Any survey requires expert and imaginative planning, careful analysis and
interpretation of the data, and logical and skilful reporting of the findings. Survey
research depends upon three important factors:
1. As survey research deals with the characteristics, attitudes and behaviours of
individuals or a group of individuals called a sample, direct contact with those
persons must be established by the survey researcher.
2. The success of survey research depends upon the willingness and co-operation
of the sample selected for the study.
3. Survey research requires that the researcher be an expert, i.e. a trained
person. He should possess the manipulative skills and insight needed in
research. He should possess social intelligence so that he may deal with people
effectively and be able to extract the desired information from them.

Major Steps in Survey Research


1. Preparation of plan / Design / Proposal
2. Preparation of measuring instruments or tools like (a) Questionnaire (b)
Interview (c) observation etc.
3. Data Collection
4. Analysis and interpretation of data
5. Reporting the results.
Types of Survey Research
a) Personal Interview (Discussed in Lesson 14)
b) Mail Questionnaire (Discussed in Lesson 14)
c) Panel Technique
Some survey techniques require successive interviews with the same sample.
The panel technique is one of them: a re-interview design is used and the same
sample is interviewed more than once. If the purpose of the survey is wide and
extensive, multiple interviews are conducted with the same sample. The panel
technique:
i) Enables the investigator to know how various factors bring changes through
time in the attitudes of the sample being studied.
ii) When the same sample is interviewed twice or more, it provides a more
sensitive and accurate measure of change than when two different samples from
the same population are tested.

Merits
1. Survey research has a wide scope. A great deal of information can be
obtained by studying a large population.
2. Survey research is more accurate.
3. It has been frequently used in almost all the social sciences.
4. Survey research is considered a very important and indispensable tool for
studying social attitudes, beliefs, values, etc. with maximal accuracy at an
economical rate.

Limitations
1. Survey research remains at the surface and does not penetrate into the
depth of the problem being investigated.
2. It is a time-consuming method and demands a high cost when a large
survey is to be conducted.
3. It is subject to sampling errors; it is difficult to select a random or stratified
random sample for survey research.
4. Some techniques used in survey research are too sensitive, and such
techniques take the respondent out of his own social context.

5. It demands expertise, research knowledge and sophistication on the part of
the researcher.
With all these limitations, survey research is still widely used by psychologists,
educationists, sociologists and economists.

iii) Experimental Research


The experiment is the basic tool of the physical sciences for tracing
cause-and-effect relationships and for verifying inferences. Experimental studies
have as their purpose the testing of a hypothesis of a causal relationship between
variables. Experimentation differs from normative (or descriptive) research in that
the experimenter has some degree of control over the variables involved and the
conditions under which they are observed.
Experimentation in psychological research describes "what will be?" when all
relevant conditions are controlled.
In experimentation the investigator controls the psychological factors to which a
child or a group of children is subjected during the period of study and observes
the resulting achievement. He starts the experiment with some measurement of
the initial attainment of the children in the trait to be influenced, then subjects the
group to the experimental factor (such as a particular type of …) for a particular
duration, and by applying a test determines the gain in achievement that has
resulted from the application of the experimental factor.
Greenwood mentions five types of experiments:
1. Trial and error experiments
2. Controlled observational study
3. Ex-post facto experiment
4. Natural experiment
5. Laboratory experiment
The trial and error approach is conducted in the first or pioneering stage of
experimentation, to know what the child is capable of learning and what is the
level of his self-learning skill. The natural or uncontrolled approach is suited for
sciences like astronomy, which is made up mostly of observation. The laboratory
type of research is conducted in natural sciences like physics, chemistry and
biology. The controlled observational study is conducted with a group of children
with certain conditions controlled, to see how far each child varies from the
others according to their intelligence, interest and aptitude.

Ex-Post Facto Research
It is a systematic empirical inquiry in which the scientist does not have direct
control over the independent variables because:
1. their manifestations have already occurred, or
2. they are inherently not manipulable.
Hence he makes inferences about relations among variables from the
concomitant variation of independent and dependent variables. Independent
variables are studied for the possible relations to, and effects on, the dependent
variables.
In ex-post facto research, the researcher's control over independent variables is
weak and in many cases almost nil. The social sciences do not often afford the
possibility of controlling independent variables. Ex-post facto research therefore
has to take things as they are and examine their impact on the explained
variables.

CONDITION OF AN EXPERIMENT
Experimentation is controlled observation. The researcher is expected to
control all variables that are not involved in the actual experiment. He has not
only to manipulate the independent variable but also to control certain other
variables arising out of the population, the testing procedures or other external
sources. He exercises three types of control, singly or in combination. These
controls are achieved through selection, physically or statistically. The controls
are necessary for isolating the determinants individually or in combination, for
varying them, and for describing quantitatively the extent of their expression.

Major Steps in Experimental Research


1. Selecting and delimiting the problem: Investigating the needs in the field and
selecting a problem and then converting the problem in to a hypothesis.
2. Reviewing the related literature: Study of the literature related to the
same problem.
3. Defining the population: Selecting the population and the sample.
4. Planning the procedure: Determining the method of the experiment, the place
of experimentation, its duration, etc.
5. Conducting the experiment: Control of variables and non-experimental
factors, keeping record of the steps in the process, applying the experimental
factor.
6. Measuring the Results: Evaluating the observation.
7. Analyzing and Interpreting the Results: Statistical analysis
8. Drawing up conclusions: The conclusions must be restricted to the population
investigated and should not be generalised beyond it.

ANNAMALAI UNIVERSITY
9. Reporting the Results: In clear, unambiguous words. The study should be
reported in such a way that the reader can form a judgement as to its adequacy.
Control and Experimental Factors
The control factor is the usual, customary method or device; the experimental
factor is the new method or device introduced to develop pupils.
A comparison is made between the results of the two situations, and the
difference between the means of the two groups determines the relative
superiority of the method.

   Control Group                      Experimental Group
1. Pre-test                          Pre-test
2. Application of control factor     Application of experimental factor
3. Final test                        Final test
4. Measurement of pupil mean         Measurement of pupil mean
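The comparison just described can be sketched as a short computation. A minimal illustration in Python, with hypothetical final-test scores (the numbers are invented for the example):

```python
# Comparing a control group (customary method) with an experimental group
# (new method) by the difference between their mean final-test scores.
# All scores below are hypothetical.
control = [52, 48, 50, 55, 47]       # final-test scores, customary method
experimental = [58, 61, 55, 60, 57]  # final-test scores, new method

mean_control = sum(control) / len(control)
mean_experimental = sum(experimental) / len(experimental)
difference = mean_experimental - mean_control

print(mean_control, mean_experimental, round(difference, 1))  # 50.4 58.2 7.8
```

A positive difference favours the experimental factor; whether it is large enough to matter is a question for the statistical tests discussed later.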
Need for a Cautious Approach to Experimentation
1. Subjects should be representative as to age, sex, grade and intelligence.
2. Subjects should be so selected that they are available throughout the
experiment.
3. The materials must be appropriate to the subjects and the experimenter.
4. The places in which the experiment is conducted should be typical of both
the situations.
5. Appropriate measuring instruments should be used.
SUMMARY
The purpose of research is to discover and develop an organized body of knowledge.
Historical research, descriptive research and experimental research are frequently
used in psychology and other social sciences. Historical research may be directed
toward an individual, an idea, a movement or an institution. In historical research
we should read the previous works regarding the methods and approaches of
conducting historical studies in psychology. In historical research we have data
from primary and secondary sources. Historical research involves both internal and
external criticism. Descriptive research is also termed normative survey,
non-experimental research or correlational research. Descriptive research deals
with the relationship between variables and involves assessment studies, evaluation
studies and surveys. Various tools are employed in the collection of data.
Experimental research assesses the cause-and-effect relationship between variables.

KEY TERMS
Historical research          Remains / Relics
Oral Testimony               Historical Criticism
Normative survey             Assessment studies
Evaluation studies           Panel Technique
Experimental Research        Ex-post facto experiment

QUESTIONS
1. Define historical research. Discuss the steps involved in historical
research.
2. Discuss the major sources of data in historical research. Point out the
limitations of historical research.
3. What is survey research? Discuss the relative merits and limitations of
survey research.
4. What are the conditions required for an experiment? Delineate the important
stages in experimental research.


LESSON – 19

RESEARCH METHODS (Contd …)

OBJECTIVES
After reading this lesson the student should
 Understand the experimental variables
 Explain how to control the experimental variables
 Understand the various types of relationships in psychology
 Describe the nature of experimental control

SYNOPSIS
The experimental variables – Dependent, independent and extraneous variables –
Experimental control – The nature of experimental control – Types of empirical
relationships in psychology.

TYPES OF EMPIRICAL RELATIONSHIPS STUDIED IN PSYCHOLOGY


The primary purpose of an experiment is to test a hypothesis. A hypothesis is a
statement to the effect that two or more variables are related. The two variables are
independent and dependent variables. There are many types of relationships that
may obtain between them.
There are three general classes of variables with which the psychologist deals:
stimulus variables, organismic variables and response variables. The psychologist
attempts to determine relationships between these three. The possible
relationships that may be studied are shown in the figure.

[Figure: the stimulus variables (S) act upon the organism, which emits responses
R1 and R2; the numbered arrows (1)–(4) mark the four types of relationships
listed below.]

S denotes the stimulus variables
O denotes the organismic variables
R denotes the response variables
The possible relationships are indicated symbolically as follows:
1. R1 = f(R2) – response number one (any given response) is a function of
response number two (any other response). When determining whether classes of
responses are related, we are determining the first type of relationship.
However, it is difficult to establish a relationship experimentally between two
responses, and the application of correlational techniques is a more appropriate
approach to this problem, e.g. an experimenter computing the correlation between
the number of errors and the total time taken to run a maze.
2. R = f(S) – a certain response (class) is a function of a certain stimulus
(class). Here, one varies values of the stimulus to see if values of the
response change. The stimulus is the independent variable while the response is
the dependent variable. We are concerned with this type of relationship most
often in experimentation. Examples of the areas of psychology in which this type
of relationship is sought are perception and learning. In studies of perception
we vary stimulus conditions and determine whether the perceptual responses of
the organism also vary, e.g. we vary the lighting conditions of a given object
(a stimulus variable) and observe whether the person's verbal report of its size
changes (a response measure).
3. R = f(O) – a response class is a function of (a class of) organismic
variables. The primary purpose of research aimed at this type of relationship is
to determine whether certain characteristics of the organism lead to certain
types of responses. E.g.: Do people who are short and stout behave differently
from people who are tall and thin? Do these two types of people differ in
happiness, general emotionality, etc.?
4. O = f(S) – a class of organismic variables is a function of a class of stimulus
variables. Here, we are primarily asking what environmental conditions or events
influence the characteristics of organisms. For example, we might be interested in
whether the degree to which a child has been isolated influences his intelligence.
Apart from these basic types of relationships sought by psychologists, there
are more complex relationships, e.g. R1 = f(R2, R3), R = f(S1, S2), R = f(O, S).
The statement of the relationships sought depends on the way we classify our
variables.

THE INDEPENDENT VARIABLE


In the first type of relationship we would vary one response to see if another
is thereby affected. The response that we vary would be the independent
variable, the other response the dependent variable. However, response–response
relationships are not often sought with the use of standard experimental
designs. In the second type of relationship we vary a stimulus and determine its
effects on a response. Hence, the stimulus is the independent variable and the
response is the dependent variable. In the third type of relationship we vary an
organismic variable as the independent variable and determine its relationship
to a response as the dependent variable. In the fourth type of relationship a
stimulus is the independent variable and an organismic variable the dependent
variable. There are thus three types of independent variables – responses,
stimuli and organismic variables.

Response variables are not frequently used as independent variables in
experimentation. Most independent variables are stimulus variables; a stimulus
variable may be any aspect of the physical or social environment that excites
the receptors. Examples are: the effect of different sizes of type on reading
speed; the effect of different styles of type on reading speed; the effect of
the number of people present at dinner on the amount of food eaten; the effect
of social atmosphere on problem-solving ability. These variables differ
considerably in their complexity. Organismic variables are relatively stable
characteristics of the organism, including sex, eye colour, height, weight,
body build and such psychological characteristics as intelligence, educational
level, anxiety, neuroticism and prejudice. Intelligence can also be classified
as a response variable.
There are essentially two ways in which an investigator may exercise
independent variable control: (1) purposive manipulation of the variable (2)
selection of the desired values of the variable from a number of values that already
exist. When purposive manipulation is used, we say that an experiment is being
conducted; but when selection is used, we say that the method of systematic
observation is being used. The experimenter himself creates the values of the
independent variables.
So, purposive manipulation occurs when the investigator determines the values
of the independent variable ‘creates’ those values himself and determines which
group of subjects will receive which value. Selection occurs when the investigator
selects subjects who already possess the desired values of the independent
variables.
In studies involving more than one independent variable, the values of one
variable might be purposively manipulated and the values of the other selected
as they naturally occur. Such an investigation may be referred to as a
quasi-experiment.

THE DEPENDENT VARIABLE


Generally, response measures are viewed as dependent variables in
psychological experimentation. These responses include the number of errors made
in learning experiments, the time taken to solve a problem, the number of words
spoken in a given time, and judgements of people about certain traits. But
whatever the response, it is better to measure it as precisely as possible.
The standard ways of measuring responses are:
1. Accuracy of the responses: Several ways of measuring accuracy are possible,
e.g. counting the number of errors the subject makes, the number of erroneous
movements a person makes in putting a puzzle together, the number of blind
alleys entered in running a maze, etc.
2. Latency of the response: This is a measure of the time that it takes the
organism to begin the response, as in reaction-time studies, e.g. the time
interval between the onset of the stimulus and the onset of the response.

3. Speed of the response: This is a measure of how long it takes the organism
to complete its response. If the response is a simple one, the time measure would
be quite short. But, if it is a complex response such as solving a difficult problem,
the time measure would be long. Latency is the time between the onset of the
stimulus and the onset of the response, and speed is the time between the onset
and termination of the response.
4. Frequency of response: It refers to the number of times a response occurs.
If the frequency of responding is counted for a given period of time, the rate of
response can be computed. If a response is made ten times in one minute, the rate
of responding is ten responses per minute. The rate gives an indication of the
probability of the response – the higher the rate, the greater the probability
that it will occur. Rate of responding is most often used in experiments
involving operant conditioning.
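The rate computation above is simple arithmetic; a sketch with hypothetical operant-conditioning counts (the session names and numbers are invented for the example):

```python
# Rate of responding = frequency of responses / observation time.
# Each entry is (number of responses, observation period in minutes).
sessions = {
    "session 1": (10, 1.0),
    "session 2": (45, 3.0),
    "session 3": (12, 2.0),
}

rates = {name: n / t for name, (n, t) in sessions.items()}
# The higher the rate, the greater the assumed probability of the response.
ranked = sorted(rates, key=rates.get, reverse=True)
print(rates["session 1"])  # 10.0 responses per minute, as in the text
print(ranked[0])           # the session with the highest response rate
```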

THE NATURE OF EXPERIMENTAL CONTROL


Among the most striking advances in methodology was the recognition of the
necessity for control conditions – so-called "normal" conditions – against which
to evaluate experimental treatments. Control is one of the most important phases
in
the planning and conduct of experiments. The word “Control” implies that the
experimenter has a certain power over the conditions of his experiment; he is able
to manipulate variables in an effort to arrive at a sound conclusion. An
experimenter exercises independent variable control when he varies the
independent variable in a known and specified manner. Independent variable
control is the essential defining feature of an experiment as distinguished from the
method of systematic observation. Extraneous variable control refers to the
regulation of extraneous variables. An extraneous variable is one that is
operating in the experimental situation in addition to the independent variable.
Since the
extraneous variable might affect the dependent variable, it must be regulated so
that it will not mask the possible effect of the independent variable. If we
fail to control extraneous variables, the result is a confounded experiment – a
disastrous consequence for an experimenter; i.e., if an extraneous variable is
allowed to operate in an uncontrolled manner, it is mingled with the independent
variable and the experiment is confounded (the dependent variable is not free
from irrelevant influences).
Confounding occurs when there is an extraneous variable that is systematically
related to the independent variable, and it may act on the dependent variable.
Hence, the extraneous variable may affect the dependent variable scores of one
group, but not the other. If confounding is present, then the reason that any
change occurs in the dependent variable cannot be ascribed to the independent
variable.
In summary, confounding occurs when an extraneous variable is systematically
related to the independent variable and it might differently affect the dependent
variable scores of the two groups.

DETERMINING EXTRANEOUS VARIABLES


If extraneous variables are allowed to influence our dependent variable,
however, any change in our dependent variable could not be ascribed to variation of
our independent variables. We would not know which of the numerous variables
caused the change. So, we must control the experimental situation so that these
extraneous variables can be dismissed from further consideration. We can identify
the extraneous variables by referring to our literature survey; we can study
previous experiments concerned with our dependent variable to find out which
variables have been demonstrated to affect the dependent variable. We should also
note what other variables previous experimenters have considered that it is
necessary to control. Discussion sections of earlier articles may also yield
information about variables that had not been previously controlled, but were
recommended for consideration in the future. So with the results of our literature
survey, our general knowledge about potentially relevant variables, we may arrive at
a list of extraneous variables that should be considered.
Once the list of extraneous variables is constructed, we must decide which
should be controlled i.e. those variables that are likely to affect our dependent
variable. It is to these variables that the various techniques of control will be
applied.

Techniques of Control
There are different ways of controlling the extraneous variables.
1. Elimination: The most desirable way to control extraneous variables is to
eliminate them from the experimental situation. Unfortunately, most extraneous
variables cannot be eliminated. Extraneous variables that cannot be eliminated
include the subject's previous experience, sex, level of motivation, age,
weight, intelligence, and so on.
2. Constancy of conditions: When certain extraneous variables cannot be
eliminated, we can attempt to hold them constant throughout the experiment.
Control by this technique means essentially that, whatever the extraneous
variable, the same value of it is present for all subjects. For example, the
time of day is an important variable; one of the standard applications of the
technique of holding conditions constant is to conduct all experimental sessions
in the same room, so that the influence of the characteristics of the room is
the same for all subjects. Similarly, the experimenter may give the same
instructions to all the subjects, except where they must be modified for
different experimental conditions. The attitude of the experimenter should also
be held as constant as possible for all subjects. If he is sociable with one
subject and serious with another, confounding of experimenter attitude with the
independent variable would occur.
3. Balancing: When it is not possible to hold conditions constant in the
experiment, the experimenter may resort to the technique of balancing out the
effect of extraneous variables. There are two general situations in which
balancing may be used: (1) where the experimenter is either unable or
uninterested in identifying the extraneous variables; (2) where he can identify
them and desires to take special steps to control them. For example, the
experimenter can control the effect of gender on his dependent variable by
making sure that it balances out across his two groups, by assigning an equal
number of subjects of each gender to each group. In a similar manner he could
control the age of the subjects: he would make sure that an equal number of each
age classification is assigned to each group. Balancing is also applied where
there is more than one experimenter.
4. Counterbalancing: The general principle of the technique of counterbalancing
may be stated thus: each condition must be presented to each subject an equal
number of times, and each condition must occur an equal number of times at each
stage of practice. Further, each condition must precede and follow all other
conditions an equal number of times. More generally, counterbalanced designs
entail the assumption that there is no differential transfer between conditions.
By differential transfer we mean that the transfer from condition one (when it
occurs first) to condition two is different from the transfer from condition two
(when it occurs first) to condition one. Balancing is used when each subject is
exposed to only one experimental condition, whereas counterbalancing is used
only when there are repeated measures (more than one condition) for the same
subject.
5. Randomization: This technique is used in two general situations: (1) where it
is known that certain extraneous variables operate in the experimental
situation, but it is not feasible to apply one of the above techniques of
control; (2) where we assume that some extraneous variables will operate, but
cannot specify them and therefore cannot apply the other techniques. Here the
assumption is that extraneous variables will "randomize out", i.e. that whatever
their effects, they will influence both groups to approximately the same extent.
The experimenter cannot control such variables as previous learning experiences,
levels of motivation, relationships with boy- or girlfriends, and money
problems. If the experimenter randomly assigns subjects to the experimental and
control groups, he may assume that the effect of such variables is about the
same on both groups. He may expect the two groups to differ on such variables
only within the limits of random sampling. Hence, the extraneous variables
should not differentially affect his dependent variable.
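Two of these techniques lend themselves to a short sketch. Assuming ten hypothetical subject IDs and three conditions labelled A, B and C (all names invented for the example), random assignment and full counterbalancing can be written in Python as:

```python
import random
from itertools import permutations

random.seed(42)  # fixed seed so the sketch is reproducible

# Randomization: assign ten subjects at random to two equal groups.
subjects = list(range(1, 11))
random.shuffle(subjects)
experimental, control = subjects[:5], subjects[5:]

# Counterbalancing: with conditions A, B and C, use every possible order.
# Across the full set of orders, each condition occurs equally often in
# each position and precedes and follows every other condition equally.
orders = list(permutations(["A", "B", "C"]))

print(len(experimental), len(control))  # 5 5
print(len(orders))                      # 6 orders for 3 conditions
```

With more conditions, full counterbalancing grows factorially, which is why incomplete schemes such as Latin squares are often used instead.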

SUMMARY
The experimental method ascertains the relationship between variables. Four
types of empirical relationships can be studied in psychology. In all these
relationships a number of variables, such as independent and dependent
variables, are involved. The independent variable is manipulated by the
experimenter and the dependent variable is measured. Extraneous variables are
uncontrolled variables that have a significant influence on the dependent
variable. Experimental control refers to minimizing the influence of extraneous
variables on the dependent variable.

Elimination, constancy of conditions, balancing, counterbalancing and
randomization are some of the methods used to control the effect of extraneous
variables.

KEY TERMS
Independent variable Dependent variable
Extraneous variable Organismic variable
Balancing Counter balancing
Randomization

QUESTIONS
1. Examine the various types of empirical relationships studied in
psychology.
2. Discuss the characteristics of independent and dependent variable.
3. How can we control the influence of extraneous variables?




LESSON – 20

RESEARCH METHODS (Contd…)

OBJECTIVES
After reading this lesson the student should
 Understand the types of experiments.
 Describe how an experiment has to be planned.
 Explain the condition of an experiment with an example.
 Describe the ethical principles to be followed in human and animal research.

SYNOPSIS
Planning an experiment – A summary and preview – Conducting an experiment –
Ethical principles in the conduct of research with human participants – Ethical
principles for animal research.

THE EXPERIMENT
In the early years of science, the non-experimental methods were more
prominent. In sociology, for example, only non-experimental methods can be
generally used because sociology is concerned with the effect of existing culture and
social institutions on behaviour. It is very difficult to manipulate these two factors
as independent variables in an experiment. As scientific investigations become
more and more searching, the ‘spontaneous’ happenings in nature are not adequate
to permit the necessary observations. This leads to the setting up of special
conditions to bring about the desired events under circumstances favorable for
scientific observations and thus experiments originate. An experimenter takes an
active part in producing the event. The advantages of the experiment, as
expressed by Woodworth and Schlosberg, are:
1. The experimenter can make an event occur when he wishes. So, he can be
fully prepared for accurate observation.
2. He can repeat his observation under the same conditions for verification. He
can describe his conditions and enable other experimenters to duplicate them and

make an independent check of his results.
3. He can vary the conditions systematically and note the variation in results.
Since psychology is concerned with the behaviour of organisms, the
psychologist cannot wait until the behaviour occurs naturally. When a theory is
tested through the use of an experiment, the conclusion is more highly regarded.
The evidence report obtained through experimentation is more reliable, because
the interpretation of results is clearer in an experiment. Ambiguous
interpretation of results can occur in non-experimental methods because of the
lack of control over extraneous variables. Sometimes the experimental method can
lead to errors, and in the hands of poor experimenters the errors are sometimes
great. However, the experimental method is preferred where it can be
appropriately used. But if it

cannot be used, we must use the method of systematic observation. When it is not
possible to produce the events that we wish to study, we must rely on non-
experimental methods.

TYPES OF EXPERIMENTS
The types of experiments that a psychologist uses are the exploratory
experiment, the confirmatory experiment, the crucial experiment, and the pilot
study or pilot experiment. The type of experiment the
psychologist uses depends on the state of the knowledge relevant to the problem
with which he is dealing. If there is little knowledge about a given problem, the
psychologist or experimenter performs an exploratory experiment. In the
exploratory experiment the scientist is interested primarily in finding new
independent variables that affect a given dependent variable. He is simply curious,
and collects data to determine the relationship between the two variables. In the
confirmatory experiment, he is interested in confirming that a given variable is
influential. In the confirmatory experiment he may also want to determine the
extent and precise way in which one variable influences the other or more
generally, to determine the functional (quantitative) relationship between the two
variables. The exploratory experiment refers to the "I wonder what would happen
if I did this" type of problem, while the confirmatory experiment refers to the
"I'll bet this would happen if I did this" type of problem.
The purpose of any experiment is to arrive at an evidence report. In the
exploratory experiment the evidence report can be used as the basis for formulating
a specific, precise hypothesis. In a confirmatory experiment the evidence report is
used to determine whether the hypothesis is probably true or false. The difference
between these two types of experiments has direct implications for the
experimental designs that are used: one type of design is more efficient for the
exploratory experiment, while another design is more efficient for the
confirmatory experiment.
The crucial experiment is used to test one or several “Counter hypothesis”
simultaneously. A crucial experiment is one whose results support one theory and
disconfirm all possible alternative theories.
The pilot study or pilot experiment refers to a preliminary experiment,
conducted prior to the major experiment. It is conducted with only a small
number of subjects to suggest what specific values should be assigned to the
variables being studied, to try out certain procedures to see how well they
work, and more generally to find out what mistakes might be made in conducting
the actual experiment, so that the experimenter can be ready for them.

PLANNING AN EXPERIMENT
When a problem is stated and a hypothesis is formulated which is a tentative
solution to that problem, we must design an experiment that will determine
whether that hypothesis is probably true or false. In designing an experiment the
researcher uses his ingenuity to obtain data that are relevant to the hypothesis.
This involves such problems of experimental technique as: What apparatus will
best allow manipulation and observation of the phenomenon of interest? What
extraneous variables may contaminate the phenomenon of primary interest and are
therefore in need of control? What events should be observed and which
should be ignored? How can the results of the experiment best be observed and
recorded? By considering these problems an attempt is made to rule out the
possibility of collecting irrelevant evidence. There should be adequate planning of
an experiment. If the experiment is improperly designed, then either no inferences
can be made from the results, or it may only be possible to make inferences that
answer questions that the experimenter has not asked. It is a good idea for the
experimenter to draft a thorough plan of the experiment before he conducts the
experiment. Once the complete plan of the experiment is set down it is desirable to
obtain as much criticism of it as possible. The critical review of others may bring
potential errors to the surface. No scientist is beyond criticism, and it is far better
for him to accept criticism before an experiment is conducted than to make errors
that might invalidate the experiment.
There are a series of steps that the experimenter can follow in planning the
experiment.
1. Label the experiment: The title, time and location of the experiment should be
clearly specified. As time passes the experimenter may accumulate a number of
experiments in the same problem area; he can then refer to this information
without much chance of confusing one experiment with another.
2. Survey of the literature: All the previous work that is relevant to the
experiment should be studied. It helps in the formulation of the problem. It tells
the experimenter whether or not the experiment needs to be conducted. If the same
experiment has previously been conducted, there is no need to repeat it unless
the new experiment is specifically designed to confirm previous findings. Other
studies also provide numerous suggestions about extraneous variables that need
to be controlled and tell how to control them.
3. Statement of the problem: The experiment is conducted because there is a
lack of knowledge about something. The statement of the problem expresses this
lack of knowledge. The actual statement of the experimental question should be
made in a single sentence, preferably as a question. Stating the problem as a
question implies that it can be answered in either a positive or a negative
manner. If the question cannot be so answered, in general we can say that the
experiment should not be conducted.
4. Statement of the hypothesis: The variables specified in the statement of the
problem are stated in the hypothesis as a sentence. The if–then relationship is
the basic form for stating a hypothesis.
5. Definition of variables: The independent and dependent variables specified in
the statement of the problem and in the hypothesis must be defined operationally.
6. Apparatus: Every experiment involves two things (1) an independent variable
must be manipulated and (2) the resulting value of the dependent variable must be
recorded. The frequently occurring type of independent variable is the presentation
of certain values of a stimulus and in every experiment a response is recorded.


These functions may be performed manually by the experimenter or sometimes he
may resort to mechanical or electrical assistance. There are two general functions
of apparatus in psychological experimentation: (1) to facilitate the
administration of the experimental treatment, and (2) to aid in recording the
resulting behaviour. The
types of apparatus used in behavioural experimentation are numerous.
7. Control of variables: The scientist must consider all of the variables that
might contaminate the experiment. He should attempt to evaluate any and all
variables which might affect the dependent variable. Any variable that might
invalidate the experiment or leave the conclusion of the experiment open to
question needs to be controlled. We must make sure that no extraneous variable
will differentially affect our groups, i.e. that no such variable will affect
one group differently than it will affect another group.
8. Selection of a design: There are a number of designs which can be used by
the experimenter. There is the two-groups design, where the results for an
experimental group are compared to those for a control group. There are other
designs among which the experimenter may choose the one most appropriate to his
problem. For example, it may be more advantageous to use a multi-group design
(several groups) instead of just a two-group design. Another type of design that
is most efficient, and which is being used more and more in psychology, is the
factorial design.
9. Selection and assignment of subjects to groups: The experimenter conducts
an experiment because he wants to conclude something about certain subjects. The
larger group of subjects is the population (or universe) under study; those who
participate in the study or experiment constitute the sample. In designing an
experiment one should specify with great precision the population being studied.
If we are concerned with a population of people we must specify the age, sex,
education and socio-economic status. If it is a small population it may be
possible to study or observe each individual. Where the population is very
large, the experimenter must select a number of subjects and study them. The
technique used in selecting a sample is that of randomization. In random
selection of a sample of subjects each member has an equal chance of being
chosen.

Once the experimenter randomly selects a sample, then he assumes that his
sample is typical of the entire population – that he has drawn a representative
sample. Drawing samples at random is usually sufficient to assure that his sample
is representative.
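The random-selection step just described can be sketched in a few lines of code. This is a minimal illustration, not part of the original text; the population roster and sample size are invented for the example:

```python
import random

# Hypothetical population roster: 500 students (labels are invented)
population = [f"student_{i:03d}" for i in range(1, 501)]

random.seed(42)  # fixed seed only so the illustration is reproducible
# random.sample draws without replacement; each member has an equal
# chance of being chosen, which is the defining property of a random sample
sample = random.sample(population, k=60)

print(len(sample))       # 60 subjects drawn
print(len(set(sample)))  # 60 distinct members: no one chosen twice
```

Because the draw is random, repeating it with a different seed yields a different but equally representative sample.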
Once the population has been specified, a sample drawn from it, and the type of
design determined, it is necessary to divide the sample into groups to be used. The
subjects must be assigned to groups by some random procedure. Here, each
subject has an equal opportunity to be assigned to each group. Some procedure
such as coin flipping can be used for this purpose. The next step is to determine
which group is to be the experimental group and which is to be the control group.
This decision should also be determined in a random manner. By using
randomization process we attempt to eliminate biases (errors) in the experiment. It
is not necessary, however, to have the same number of subjects in each group and
the experimenter may determine the size of each group in accordance with some
criteria.
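The assignment procedure above — random division into two groups, then a random decision about which group is experimental — can be sketched as follows. The subject labels are hypothetical, and a random draw stands in for the coin flip mentioned in the text:

```python
import random

random.seed(7)  # fixed seed only so the illustration is reproducible
subjects = [f"S{i:02d}" for i in range(1, 61)]  # the 60 sampled subjects

# Shuffle and split: every subject has an equal
# opportunity to be assigned to either group
random.shuffle(subjects)
group_a, group_b = subjects[:30], subjects[30:]

# The "coin flip": randomly decide which group is experimental
if random.random() < 0.5:
    experimental, control = group_a, group_b
else:
    experimental, control = group_b, group_a

print(len(experimental), len(control))  # 30 30
```

As the text notes, equal group sizes are not required; the split above could just as well be unequal if some criterion called for it.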
10. Experimental Procedure: The procedure for conducting the data collection
phase of the experiment should be set down in great detail. The experimenter
should carefully plan how the subjects will be treated, how the stimuli will be
administered, how the responses will be observed and recorded. He should specify
the instructions to the subjects and formulate a statement concerning the
administration of the independent variable and the recording of the dependent
variable. It is also advisable to try out a few subjects to see how the procedure
works. This will suggest new points to be covered and modification of procedures
already set down. For this purpose a pilot study might be conducted.
11. Statistical treatment of the data: The data of the experiment are usually
subjected to statistical analysis. There are some powerful techniques developed. In
some manner the reliability of the results of the experiment should be evaluated. A
statistical technique is used to determine whether the difference between the mean
scores of the two groups is significant. The statistical analysis will tell us whether
the difference is due to chance or it is because of the independent variable. If the
difference is significant we may conclude that the experimental group is superior to
the control group.
A number of statistical tests are available to evaluate the data. Some tests are
appropriate to one kind of data or experimental design and some are not. The use
of statistical test requires that certain assumptions about the experimental design
and the kind of data collected must be met. It is advisable to plan the complete
procedure for statistical analysis prior to conducting the experiment.
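As a sketch of the kind of analysis described — testing whether the difference between the mean scores of two groups is significant — the t statistic for two independent groups can be computed by hand. The scores below are invented for illustration, and the pooled-variance formula assumes roughly equal population variances:

```python
import math
import statistics

# Invented scores for an experimental and a control group
experimental = [4.1, 3.8, 4.5, 3.9, 4.2, 4.0, 3.7, 4.3]
control      = [5.2, 5.0, 4.8, 5.5, 5.1, 4.9, 5.3, 5.0]

n1, n2 = len(experimental), len(control)
m1, m2 = statistics.mean(experimental), statistics.mean(control)
v1, v2 = statistics.variance(experimental), statistics.variance(control)

# Pooled-variance t statistic with n1 + n2 - 2 degrees of freedom
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))

# If |t| exceeds the critical value for df = 14 (about 2.145 at the
# .05 level), the difference is unlikely to be due to chance
print(round(t, 2))
```

For these invented data |t| is far above the critical value, so the groups would be judged to differ significantly.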
12. Forming the evidence report: The evidence report is a summary statement of
the findings of the experiment. It tells us whether the antecedent conditions of the
hypothesis held in the experiment, and whether the consequent conditions
predicted by the hypothesis were found to occur or not. If the consequent
conditions were found to
occur, we may refer to the evidence report as positive, and if they were not found to
occur the evidence report is negative.
13. Making inferences from the evidence report to the hypothesis: The evidence
report is related to the hypothesis for the purpose of determining whether the
hypothesis is probably true or false. To do this we must make an inference from
the evidence report to the hypothesis. If the evidence report is positive, the
hypothesis is confirmed (the evidence report and the hypothesis coincide – What
was predicted to happen by the hypothesis actually happened as stated by the
evidence report). If, however, the evidence report is negative, we may conclude that
the hypothesis is not confirmed.
14. Generalisation of the findings: Generalisation of the findings depends on the
extent to which the population has been specified and the extent to which those
populations have been represented in the experiment by random sampling. If the
population is not adequately defined or the sample is not randomly drawn from it,
generalisation cannot be made and the results would apply to the sample studied
only.

CONDUCTING AN EXPERIMENT: AN EXAMPLE


Conducting an experiment affords an opportunity to make errors that can be
avoided in later research. Whether the experiment contributes new knowledge or
not it is very useful for students to work on a problem. Many students feel that
their first experiment should be an important one.
An example
The title of the experiment is “The effect of knowledge of Results on
Performance”. The problem can be stated as “what is the effect of knowledge of
results on performance”?
The hypothesis stated is: "If knowledge of results is furnished to a person, then
that person's performance will be facilitated".
The statement of the problem and the hypothesis determine the variables. The
task that the students perform is the drawing, while blindfolded, of 5-inch lines.
The independent variable is the amount of knowledge of results, which varies from
zero to some large amount. The knowledge of results is operationally defined as
telling the subject whether his line is "too long" (5¼ inches or longer), "too short"
(4¾ inches or shorter) or "right" (between 4¾ inches and 5¼ inches). Zero
amount of knowledge of results is defined as giving the subject no information
about the length of his line. The dependent variable is the actual length of the line
that the subject draws. Each subject is required to draw 50 lines, his proficiency
being determined by his total performance on all 50 lines.
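The operational definition above can be written out as a small function; the thresholds are taken directly from the text:

```python
def knowledge_of_results(length_inches: float) -> str:
    """Feedback per the operational definition: 5 1/4 in. or longer is
    "too long"; 4 3/4 in. or shorter is "too short"; in between is "right"."""
    if length_inches >= 5.25:
        return "too long"
    if length_inches <= 4.75:
        return "too short"
    return "right"

# Experimental subjects hear this feedback after each trial;
# control subjects (zero knowledge of results) hear nothing
print(knowledge_of_results(5.4))  # too long
print(knowledge_of_results(4.5))  # too short
print(knowledge_of_results(5.0))  # right
```

Writing the definition out this way makes clear that every possible line length maps to exactly one of the three feedback categories.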
The apparatus consists of a drawing board on which ruled paper is fixed, a
blindfold and a pencil.
Two values of the independent variables are selected for the study, a positive
amount and zero amount. Two groups are required, an experimental and a control.
The experimental group received knowledge of results, whereas the control group
did not.
The population of the study is the students in a college. From the list of
students 60 subjects are randomly selected for study. The 60 subjects are
randomly divided into two groups; it is then randomly determined that one of the groups
is the experimental group and the other the control group.
Next, the investigator is concerned with controlling extraneous variables. The
same instructions are read to both groups; a constant "experimental attitude" is
maintained in the presence of both groups. Another variable to control here is the
amount of time between the trials. This variable is controlled by holding it constant
for all subjects. The other extraneous variable is the time of day, which is controlled
by conducting the experiment between 2 pm and 4 pm. The subjects are all
psychology students. The other extraneous variable is the distracting influence of
people talking. It is assumed that this affected both groups equally – it is
"randomized out".
The next step is the experimental procedure. The plan is as follows: after the
subject enters the laboratory room, he is seated at a table and given the following
instructions: "I want you to draw some straight lines that are five inches long,
while you are blindfolded. You are to draw them horizontally" (the experimenter
demonstrates by drawing a horizontal line in the air). The subjects perform the
task of drawing lines. The experimental subjects are given the appropriate knowledge
of results. No information is given to the control subjects. After the subject
completes a trial, he waits while the response is recorded. The same procedure is
followed until the subject has drawn 50 lines.
After collecting the data from the 60 subjects, the data are subjected to
statistical treatment; based on this the evidence report is formed, and the
experimenter reaches his conclusions and writes up the experiment.
The main principle to follow in writing up an experiment is that the report must
include every relevant aspect of the experiment so that someone else will be able to
repeat the experiment on the basis of the report. The aspects of the experiment
include: (1) Title; (2) Name and institutional connection; (3) Introduction (4) Method
(subjects, apparatus, design, procedure) (5) Results (6) Discussion and (7)
References.

ETHICAL PRINCIPLES IN THE CONDUCT OF RESEARCH WITH HUMAN PARTICIPANTS


The American Psychological Association (APA) has published its guidelines for
research involving human subjects in a book entitled "Ethical Principles in the
Conduct of Research with Human Participants" (1982). The 10 basic ethical
principles are:
1. In planning a study the investigator has the responsibility to make a careful
evaluation of its ethical acceptability. To the extent that the weighing of
scientific and human values suggests a compromise of any principle, the
investigator incurs a correspondingly serious obligation to seek ethical advice
and to observe stringent safeguards to protect the rights of human participants.
2. Considering whether a participant in a planned study will be a “subject at risk”
or a “subject at minimal risk” according to recognized standards, is of primary
ethical concern to the investigator.
3. The investigator always retains the responsibility for ensuring ethical practice in
research. The investigator is also responsible for the ethical treatment of
research participants by collaborators, assistants, students and employees, all
of whom, however, incur similar obligations.
4. Except in minimal – risk research, the investigator establishes a clear and fair
agreement with research participants, prior to their participation, that clarifies
the obligations and responsibilities of each. The investigator has the obligation
to honor all promises and commitments included in that agreement. The
investigator informs the participants of all aspects of the research that might
reasonably be expected to influence willingness to participate and explains all
other aspects of the research about which the participants inquire. Failure to
make full disclosure prior to obtaining informed consent requires additional
safeguards to protect the welfare and dignity of the research participants.
Research with children or with participants who have impairment that would
limit understanding and / or communication requires special safeguarding
procedures.
5. Methodological requirements of a study may make the use of concealment or
deception necessary. Before conducting such a study, the investigator has a
special responsibility to (1) determine whether the use of such techniques is
justified by the study's prospective scientific, educational or applied value; (2)
determine whether alternative procedures are available that do not use
concealment or deception; and (3) ensure that the participants are provided
with sufficient explanation as soon as possible.
6. The investigator respects the individual's freedom to decline to participate or to
withdraw from the research at any time. The obligation to protect this freedom
requires careful thought and consideration when the investigator is in a position
of authority or influence over the participant. Such positions of authority
include, but are not limited to situations in which research participation is
required as part of employment or in which the participant is a student, client,
or employee of the investigator.
7. The investigator protects the participant from physical and mental discomfort,
harm and danger that may arise from research procedures. If risks of such
consequences exist, the investigator informs the participant about that fact.
Research procedures likely to cause serious and lasting harm to a participant are
not used unless the failure to use these procedures might expose the participant to
the risk of greater harm, or unless the research has great potential benefit and
fully informed and voluntary consent is obtained from each participant. The
participant should be informed of procedures for contacting the investigator
within a reasonable time period following participation should stress, potential
harm, or related questions or concerns arise.
8. After the data are collected the investigator provides the participant with
information about the nature of the study and attempts to remove any
misconceptions that may have arisen. When scientific or human values justify
delaying or withholding this information, the investigator incurs a special
responsibility to monitor the research and to ensure that there are no damaging
consequences for the participant.
9. Where research procedures result in undesirable consequences for the
individual participant, the investigator has the responsibility to detect and
remove or correct these consequences, including long term effects.
10. Information obtained about a research participant during the course of an
investigation is confidential unless otherwise agreed in advance. When the
possibility exists that others may obtain access to such information, this
possibility, together with the plans for protecting confidentiality, is explained to
the participant as part of the procedure for obtaining informed consent.

Ethical issues in human research


There are five ethical issues in research that involves human participants. They
are (1) lack of informed consent (including invasion of privacy), (2) coercion to
participate, (3) potential physical or mental harm, (4) deception, and (5) violation of
confidentiality.

1. Principles of Informed Consent


A primary responsibility of any researcher is to obtain the informed consent of
the individuals who participate in his or her research. Not only does obtaining
informed consent ensure that researchers do not violate people's privacy, but it
ensures that prospective research participants are given enough information about
the nature of a study that they can make a reasoned decision regarding whether
they want to participate.

2. Freedom from Coercion to Participate


All ethical guidelines insist that potential subjects not be coerced into
participating in research. Coercion occurs when subjects agree to participate
because of real or implied pressure from an individual who has authority or
influence over them. Examples include employees in business and industry
who are asked to participate in research by their employers, military personnel who
are asked to serve as subjects, prisoners who are required to volunteer for research,
and clients who are asked to provide data by their therapists or physicians. All
these classes of subjects have one thing in common, i.e. they believe that refusing
to participate will have negative consequences for them. Ethical principles state
that researchers must respect an individual's freedom to decline to participate in
research or to discontinue participation at any time.

3. Minimising Physical and Mental Stress


The ethical principle states that the investigator protects the participant from
physical and mental discomfort, harm and danger that may arise from research
procedures. If risks of such consequences exist, the investigator informs the
participant of that fact. Research procedures likely to cause serious and lasting
harm to a participant are not used unless the failure to use these procedures might
expose the participant to the risk of greater harm, or unless the research has great
potential benefit and fully informed and voluntary consent is obtained from each
participant.
At the extremes, most people tend to agree regarding the amount of discomfort
that is permissible. For example, most people agree that experiments that lead
subjects to think they are dying are highly unethical. Similarly, few people object to
studies that involve minimal risk.

4. Deception in Research
Behavioural scientists use deception to prevent subjects from learning the true
purpose of a study so that their behaviour will not be artificially affected. Other
uses include: presenting subjects with a false purpose of the study; providing false
feedback to subjects; involving subjects without their knowledge; presenting
related studies as unrelated; and giving incorrect information regarding stimulus
materials.
In general, as long as subjects are informed about the details of the study
afterward, they appear not to mind being misled for good reasons. In other
words, research participants do not seem to regard deception in a research setting in
the same way they view lying in everyday life. Instead, they view it as a necessary
aspect of certain research.

Debriefing
APA guidelines require that research participants be debriefed.
"After the data are collected the investigator provides the participant with
information about the nature of the study and attempts to remove any
misconceptions that may have arisen."
Debriefing accomplishes four goals:
1. It clarifies the nature of the study for participants.
2. It removes any stress or other negative consequences.
3. The researcher obtains the subjects' reactions to the study itself.
4. Subjects should leave the study feeling good. Researchers should convey
their genuine appreciation for subjects' time and cooperation, and give
subjects the sense that their participation was important.

5. Confidentiality in Research
The information obtained about research participants in the course of a study is
confidential. Such information may be used for purposes of the research and may
not be disclosed to others. When others have access to participants' data, their
privacy is invaded.
The easiest way to maintain confidentiality is to ensure that subjects' responses
are anonymous; if no information is collected that can be used to identify the
subject, confidentiality will not be a problem.
To solve the problem of confidentiality, subjects are sometimes given codes to
use on their data that allow researchers to connect their data without revealing
their identity. In cases in which the data are in no way potentially sensitive
or embarrassing, names may be collected.
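The coding scheme just mentioned can be sketched as follows. This is an illustrative fragment, not a prescription: the participant names are invented, and in practice the name-to-code key would be stored separately from (and more securely than) the data files:

```python
import random

random.seed(0)  # fixed seed only so the illustration is reproducible

def assign_codes(names):
    """Map each participant to a random code; only the code ever
    appears on the data sheets, so responses stay unlinked to names."""
    codes = random.sample(range(1000, 10000), k=len(names))  # no collisions
    return {name: f"P{code}" for name, code in zip(names, codes)}

key = assign_codes(["Asha", "Ravi", "Meena"])  # hypothetical participants

# Data across sessions can be connected through the code alone
print(len(set(key.values())))  # 3 distinct codes
```

Drawing the codes without replacement guarantees that no two participants share a code, so data can always be linked back unambiguously by whoever holds the key.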

ETHICAL PRINCIPLES FOR ANIMAL RESEARCH


In 1985, the American Psychological Association approved the most recent
version of its Guidelines for Ethical Conduct in the Care and Use of Animals. These
guidelines suggest that nonhuman animals should be treated in a humane and
ethical fashion.
The guidelines stipulate that research that uses nonhuman animals must be
monitored closely by a person who is experienced in the care and use of laboratory
animals and that a veterinarian be available for consultation. Further, all
personnel who are involved in animal research, including students, must be
familiar with these guidelines and adequately trained regarding the use and care of
animals. If one is involved in animal research, he must acquaint himself with
these guidelines and abide by them at all times.
The facilities in which laboratory animals are housed are closely regulated by
law. The animals must be housed under humane and healthful conditions.
According to the guidelines, the facilities should be inspected by a veterinarian at
least twice a year.
Advocates of animal rights are most concerned about the experimental
procedures to which the animals are subjected during research. The APA
guidelines require the investigator to justify the use of all procedures that involve
more than momentary or slight pain to the animal.
The scientific purpose of the research should be of sufficient potential
significance as to outweigh any harm or distress to the animals used (the APA
Guidelines, 1985). Procedures that involve more than minimal pain or distress
require strong justification.
The APA regulations also provide guidelines for the use of surgical procedures,
the study of animals in field settings, the use of animals for educational purposes,
and the disposition of animals.
Some animal rights activists argue that, like people, animals have certain moral
rights, and that human beings have no right to subject nonhuman animals to pain,
stress and often death for their own purposes.
In an ideal world we would be able to solve problems of human suffering
without using animals in research.

SUMMARY
As scientific investigation becomes more and more searching, the spontaneous
happenings in nature are not adequate to permit the necessary observations. This
leads to experimentation in research. In an experiment the researcher can
manipulate the variables and find out the relationship between them.
Experimentation has many advantages. Exploratory, confirmatory, crucial and
pilot experiments are the four types of experiments used in psychology.
Experimentation involves careful planning and it follows various steps/stages. APA
has published guidelines for conducting research/experimentation with human
participants as well as research with animals. The important ethical principles
include informed consent; freedom from coercion to participate; minimising
physical and mental stress; deception in research; and confidentiality.

KEY TERMS
Exploratory experiment Confirmatory experiment
Crucial experiment Pilot experiment
Ethical Principles Informed consent
Freedom from coercion Deception
Debriefing Confidentiality

QUESTIONS
1. Examine the different types of experiments.
2. Explain the steps in planning an experiment.
3. State the ethical principles in human and animal research.




IMPORTANT QUESTIONS
1. Discuss the various scales used in the assessment.
2. What is a psychological test? Explain the basic assumption behind a
psychological test.
3. Explain the frequency distributions in measurement with suitable example.
4. Describe the measures of central tendency.
5. Calculate the mean, median and mode for the following frequency distribution.

6. Describe the measures of variability


7. Calculate the quartile deviation, average deviation and standard deviation for
the following scores.
C–I 119-125 126-132 133-139 140-146 147-153
F 8 16 20 14 12
8. Discuss the properties and applications of a normal probability curve.
9. Explain skewness and Kurtosis with suitable example.
10. Describe the standard scores and their applications.
11. Elucidate the importance of correlation co-efficient in research methods
12. Calculate the Karl Pearson’s Product moment correlation co-efficient for the
data given below
X: 17 13 14 10 18 18 20
Y: 7 6 13 8 8 9 3
13. The achievement scores of 8 students in physics and mathematics are given
below. Find out whether there is any significant relationship between physics
and mathematics.
Physics 68 73 80 43 69 75 60 49
Maths 90 100 94 96 94 80 70 60
14. Discuss the ‘t’ ratio with its basic assumption.
15. “Psychological experimentation is the application of science and scientific
method” – Discuss.
16. Describe the major stages of research
17. Elucidate the importance of “review of related literature”
18. Explain the sources of a research problem and the criteria for selecting a
problem.
19. Discuss criteria for selecting a research hypothesis and explain the types and
forms of hypothesis.
20. What are the basic concepts in testing a hypothesis? Discuss the various steps
in the testing of hypothesis.
21. Elucidate the importance of sampling in research.
22. Discuss the various methods of probability and non-probability sampling.
23. Write a short note on (i) check list, (ii) Case study
24. Discuss the steps in the construction of a Likert type attitude scale.
25. Elucidate the different types of rating scales.

26. Examine the sources of error in test scores.


27. What is reliability? Explain the methods of reliability with suitable example.
28. Define validity and discuss the different types of validity.
29. Delineate the importance of test construction and explain the various methods
of constructing a test.
30. Describe the stages in test construction
31. Write a note on “Item analysis”.
32. Define historical research. What are the major sources of data in historical
research?
33. What is a survey? Discuss the types of survey with the steps involved.
34. Explain the conditions for an experiment and discuss the various types of
experiments.
35. Discuss the types of empirical relationships studied in psychology.
36. Explain the various variables used in experimental research.
37. Write a note on “experimental control”.
38. Explain the different types of experiments with suitable example.
39. Discuss the “Experimental Plan”.
40. Write a note on “ethical principles in human research”.
*****


M.Sc. APPLIED PSYCHOLOGY


First Year
COURSE – II: PSYCHOLOGICAL STATISTICS AND RESEARCH METHODS

Model Question Paper


Time : 3 hours Maximum: 100 Marks
Section-A

Answer any Five questions in about 300 words


Each question carries equal marks
5 × 8 = 40
1. Elucidate the importance of testing and assessment in Psychology.
2. Discuss the measures of variability with suitable examples.
3. Explain the uses of normal probability curve.
4. Elucidate the basic assumptions in the use of the ‘t’ ratio.
5. Write a note on variables in experimental research.
6. Explain the sources of research problem.
7. Discuss the case study with suitable examples.
8. Differentiate reliability and validity.
Section-B
Answer any Three questions in about 1200 words
Each question carries equal marks
3 X 20 = 60
9. Explain the basic concepts involved in the testing of hypotheses.
10. Describe the steps in the construction of Likert type attitude scales.
11. What are the conditions required for an experiment? Delineate the important
stages in experimental research.
12. Examine the influence of psychological theories on the assessment process.
13. Explain rank order and product moment correlation co-efficient with suitable
example.