Chapter 10: Processing, Analysis and Interpretation of Data

The document outlines the processes involved in the processing, analysis, and interpretation of data, emphasizing steps such as data preparation, editing, coding, and data cleaning. It describes methods for summarizing data through classification, tabulation, and graphical representation, as well as the importance of descriptive statistics and measures of central tendency. The document serves as a guide for researchers to ensure accurate and effective data handling for analysis.

PROCESSING, ANALYSIS AND INTERPRETATION OF DATA

PROCESSING OF DATA

Before the report is written, the processing, analysis and interpretation of data begin. Once the data are received from the field, the researcher has an important duty to process them for subsequent statistical analysis. The processing of data includes:
(i) Data preparation
(ii) Classification
(iii) Tabulation
(iv) Graphical representation
(v) Diagrammatic representation
(vi) Computation of statistical derivatives

Data preparation consists of five important steps:
1. Editing
2. Coding
3. Data entry
4. Transcribing
5. Data cleaning

EDITING

Editing is the review of the questionnaires with the objective of increasing accuracy and precision. It is needed to detect and, if possible, eliminate errors in the filled-in questionnaires. Editing work, although uninteresting and dull in nature, is no doubt necessary for faultless analysis of survey data. Three points are to be checked while editing the data: completeness, accuracy and uniformity.

Completeness: While checking the questionnaires for completeness, it should be verified that there is an answer to every question. If not, the answer should be deduced from other data or referred back to the concerned respondent.

Accuracy: Besides checking that all questionnaires are provided with answers, one must check whether the answers are accurate. Inconsistencies in answers should be looked for and resolved. Questionnaires with unsatisfactory responses may be returned to the field, where the interviewers recontact the respondents. If returning to the field to correct the unsatisfactory responses and to fill in the missing values is not feasible, the editor may assign seemingly appropriate values to the unsatisfactory responses. Alternatively, the editor may discard the unsatisfactory responses if they are few in number.
Uniformity: While editing the schedules, efforts should be made to check whether the interviewers have interpreted questions and instructions uniformly. In case of lack of uniformity, the editing staff may try to make corrections, refer the schedules back to the respondents, or omit the schedules from analysis.

CODING

After editing the data, most surveys require the answers to be coded before summarization and analysis begin. This may also conveniently be carried out at the time of editing, if the answers are not already coded in the questionnaires/schedules. The purpose of coding in surveys is to put the answers to a particular question into meaningful and unambiguous categories, so as to bring out the essential pattern concealed in the mass of information. Essentially, coding means assigning a code, usually a number, to each possible response to each question. Closed questions can easily be handled by the researcher for coding. As regards open-ended questions, the researcher should note the variety of answers and, after preliminary evaluation, create response categories for coding. Although most responses can be accounted for by the derived categories, an additional category might be established to meet the coding rule of exhaustiveness. It is to be noted that open questions are more difficult to code, since answers are not prepared in advance; however, they encourage disclosure of complete information without the restriction imposed by prior suggestive answers.

Transcribing: Transcribing data involves transferring the coded data from the questionnaires or coding sheets into a computer for subsequent data treatment.

DATA CLEANING

Data cleaning involves:
(i) Range checks
(ii) Consistency checks
(iii) Treatment of missing observations

Range checks compare each data item to the set of usual and permissible values for that variable.
Range checks are used to (a) detect and correct invalid values, (b) note and investigate unusual values, and (c) note outliers which may need special statistical treatment.

Consistency checks examine each pair of related data items in relation to the set of usual and permissible values for the variables as a pair. Consistency checks are used to (a) detect and correct impermissible combinations and (b) note and correct unusual combinations. Inconsistent responses may have a noticeable impact on estimates and can alter comparisons across groups.

Missing responses correspond to unknown values of a variable, arising because respondents provide ambiguous answers or because interviewers fail to record answers properly. The treatment of missing responses may be made by:
- substituting a neutral value, such as the mean response;
- substituting an imputed response obtained by studying the pattern of responses;
- deleting the missing responses from the analysis;
- taking into account only the available responses for each question.

DATA ENTRY

Data entry implies conversion of information gathered from secondary or primary sources to a medium for viewing and manipulation. Keyboarding helps researchers who need to create a data file immediately and store it in minimum space on a variety of media. For large research projects involving a large bulk of data, database programs serve as valuable data entry devices. A database is a collection of data organized for computerized retrieval. Spreadsheets are a specialized type of database; they provide an easy-to-learn mechanism for organizing and tabulating data and computing simple statistics. Data entry on a spreadsheet uses numbered rows and lettered columns, with a matrix of thousands of cells into which entries may be placed. A data warehouse organizes large volumes of data into categories to facilitate retrieval, interpretation and sorting by end users.
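The range and consistency checks described above can be illustrated with a small sketch. Python is assumed; the variables, permissible ranges and records are invented for illustration only.

```python
# Sketch of range and consistency checks during data cleaning.
# Variable names and permissible ranges are assumed for illustration.
records = [
    {"age": 34, "years_married": 10},
    {"age": 25, "years_married": 30},   # inconsistent pair
    {"age": 250, "years_married": 5},   # out-of-range age
]

def range_check(rec):
    """Flag values outside the permissible range for each variable."""
    errors = []
    if not (0 <= rec["age"] <= 120):
        errors.append("age out of range")
    if not (0 <= rec["years_married"] <= 100):
        errors.append("years_married out of range")
    return errors

def consistency_check(rec):
    """Flag impermissible combinations of related items."""
    errors = []
    if rec["years_married"] > rec["age"]:
        errors.append("married longer than lived")
    return errors

for i, rec in enumerate(records):
    problems = range_check(rec) + consistency_check(rec)
    if problems:
        print(f"record {i}: {problems}")
```

Each flagged record would then be corrected, imputed or dropped, following one of the treatments of missing and inconsistent responses listed above.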
SUMMARIZATION OF DATA

Raw data are often bulky and difficult to comprehend for ready reference. This calls for suitable methods of processing the data, such as condensing/summarizing them for easy comprehension, and for providing direction for subsequent analysis in computing the required statistical derivatives and applying further statistical treatment. The most prominent and universally adopted data processing methods which present data in a summary form are:
1. Classification
2. Tabulation
3. Graphical representation
4. Diagrammatic representation
5. Computation of basic statistical derivatives for univariate/multivariate data

CLASSIFICATION OF DATA

Classification is a process by which the data are arranged according to resemblances, affinities, internal homogeneity and common characteristics. This process is a forerunner to the tabular, graphical and diagrammatic representation of data. For example, in a socio-economic enquiry, data can be classified according to age, sex, educational qualification, religion, caste, income group, occupation, etc.

TABULATION

After classification of data, the next step in summarization is to put the classified data in rows and columns having special characteristics. Such representation of data in an orderly and easily comprehensible fashion is called tabulation. Classification is a prerequisite for tabulation.

General format of a table:
- Table number
- Title
- Head note
- Row headings and column headings
- Body of the table
- Foot note (if any)
- Source note (if any)

An ideal table should have (1) a title, (2) a stub, (3) a caption (or box head) and (4) a body.

Frequency Table: In a frequency table, data are classified according to class intervals. Suppose we have monthly income data of 1000 households in a large community. What is the number of households whose monthly incomes lie between Rs.4000 and Rs.5000?
Suppose from the given data we find there are 400 households having monthly income between Rs.4000 and Rs.5000. The class interval is then stated as Rs.4000 - Rs.5000, and the frequency in that class is 400. Rs.4000 is the lower limit of the class and Rs.5000 is the upper limit. The difference between the upper limit and the lower limit is defined as the size or length of the class interval; in the present case the size is Rs.1000.

Class intervals may be either continuous or discontinuous, depending on the nature of the variable. In practice, values observed on a continuous variable (say height, weight, etc.) are classified into continuous class intervals, and values observed on a discontinuous/discrete variable (number of children in households, number of books on shelves, etc.) are classified into discontinuous class intervals.

Illustration 1:
Continuous class intervals: 1000-2000, 2000-3000, 3000-4000, etc.
Discontinuous class intervals: 1000-1999, 2000-2999, 3000-3999, etc.

The principal steps in the construction of a frequency distribution are:
(i) deciding the number and size of class intervals;
(ii) selecting the class limits;
(iii) counting the frequencies, e.g. through tally marks.

The size of the class interval depends on the number of class intervals. Too large a number of class intervals defeats the purpose of condensation/summarization of data, while if the number of class intervals is very small, essential features of the data may be concealed in the frequency distribution. In practice it is expected that the number of class intervals should not exceed 15. After the class limits are fixed, the number of observations falling in each class interval is counted, either by the oldest method, tally marks, or by the use of computers. The frequency distribution may consist of equal-sized class intervals or unequal class intervals.
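The tallying step above can be sketched in a few lines. Python and an invented income data set are assumed; equal-sized continuous class intervals are filled by simple arithmetic rather than tally marks.

```python
# Sketch: tallying observations into k equal-sized continuous class
# intervals [lower, lower+width), [lower+width, lower+2*width), ...
def frequency_distribution(values, lower, width, k):
    """Count how many values fall in each of the k intervals."""
    freqs = [0] * k
    for v in values:
        idx = int((v - lower) // width)   # which interval v falls in
        if 0 <= idx < k:
            freqs[idx] += 1
    return freqs

# hypothetical monthly incomes (Rs.), invented for illustration
incomes = [1200, 1800, 2500, 2700, 3100, 3600, 3900, 4200, 4700, 4900]
freqs = frequency_distribution(incomes, lower=1000, width=1000, k=4)
print(freqs)  # frequencies for 1000-2000, 2000-3000, 3000-4000, 4000-5000
```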
Equal-sized class intervals help comparison of frequencies across different class intervals. The beginning and end class intervals may be either open or closed. In open-ended class intervals, the first class interval is written as 'less than' the lower limit of the succeeding class interval, and the last class interval is expressed as 'greater than or equal to' the upper limit of the preceding class interval.

Illustration 2:

TABLE 1: Frequency Distribution

Class interval | Frequency (f) | Relative freq. | Cum. freq. (less-than type) | Cum. freq. (greater-than type)
<30            | 4             | 0.08           | 4                           | 50
30-40          | 12            | 0.24           | 16                          | 46
40-50          | 18            | 0.36           | 34                          | 34
50-60          | 14            | 0.28           | 48                          | 16
>=60           | 2             | 0.04           | 50                          | 2
Total          | 50            | 1.00           |                             |

Open-ended class intervals are used when the extreme class intervals contain very small frequencies, so that grouping is resorted to with inequalities.

Relative frequency and cumulative frequency: Relative frequency is defined as the ratio of the frequency in a particular class interval to the total frequency in the data set (Table 1). Cumulative frequency is the consecutive addition of class frequencies when the class intervals are arranged in increasing or decreasing order (Table 1).

GRAPHICAL REPRESENTATION OF DATA

Graphs provide an alternative method of representing data in a condensed and summary form. A graph is a scale-dependent geometrical figure and provides a visual presentation of statistical data. Graphs are immensely useful to depict economic relationships explicitly and have become an indispensable tool for economic analysis (e.g., price vs demand, income vs expenditure, time vs population, time vs agricultural production, economic development vs population growth, etc.). Graphs with more than two variables lose their ready recognisability and are not in common use.
An ideal graph should be self-explanatory, drawn neatly with an indication of the scales on both axes, and should carry a clear and unambiguous title, with head note, foot notes and source notes at suitable places in the body. All descriptions used in the graph should be written horizontally.

Illustration 3: Growth of Population in India

[TABLE 2 and Fig. 1: Population of India (1901-2001), population in millions (Y-axis) plotted against the census years 1901-2001 (X-axis); the figure is not reproduced here.]

Frequency graphs

A frequency distribution can be represented by suitable graphs to show its characteristics. These are:
1. Histogram
2. Frequency curve
3. Frequency polygon
4. Cumulative frequency curves (ogives)

A histogram is a graphical representation of a frequency distribution in which the frequencies, in the form of rectangles, are erected over the respective consecutive class intervals. The areas of the rectangles are proportional to the class frequencies, and the total area of the rectangles represents the total frequency. The frequency polygon is drawn by joining the midpoints of the tops of the rectangles; the area bounded by the frequency polygon is taken to be equal to the area bounded by the histogram, which in turn represents the total frequency. The frequency curve is a smoothed curve passing approximately through the extreme points of the frequency polygon, and the area bounded by the frequency curve and the X-axis represents the total frequency. The cumulative frequency curve is drawn by plotting cumulative frequencies of the greater-than type against the lower class limits, or of the less-than type against the upper class limits, of the class intervals. Cumulative frequency curves are also called ogives.
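The cumulative frequencies plotted when drawing ogives can be computed directly. A small sketch, assuming Python and using the class frequencies of Table 1:

```python
# Sketch: both types of cumulative frequency for the Table 1 data,
# the quantities plotted when drawing ogives.
freqs = [4, 12, 18, 14, 2]          # class frequencies from Table 1
n = sum(freqs)

less_than = []                       # plotted against upper class limits
running = 0
for f in freqs:
    running += f
    less_than.append(running)

greater_than = []                    # plotted against lower class limits
remaining = n
for f in freqs:
    greater_than.append(remaining)
    remaining -= f

print(less_than)     # [4, 16, 34, 48, 50]
print(greater_than)  # [50, 46, 34, 16, 2]
```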
DESCRIPTIVE STATISTICS

Besides the classification and tabulation of data and their graphical and diagrammatic representation, there may be a further need to calculate certain important statistical derivatives (numerical quantities) such as percentages, rates, ratios, indices, measures of central tendency (averages)/measures of location, measures of variation/dispersion, the coefficient of variation, and analytical ones such as differences, correlation coefficients, regression coefficients, etc., which also lead to an ultimate reduction of the data. Such reduction to a few numerical quantities is useful for the sophisticated analysis and interpretation of data. Thus, descriptive statistics provide a clear, concise, useful and informative picture of a mass of numerical figures.

Measures of Central Tendency

Given a data series, a researcher may wish to know the average value around which the observations in the series lie. This is answered by computing the statistical quantities termed averages, grouped under the broad name of measures of central tendency/measures of location. These measures are the arithmetic mean, geometric mean, harmonic mean, median and mode. The most popular measure of central tendency is the arithmetic mean.

Arithmetic Mean

The arithmetic mean of a set of values is the sum of the values divided by the total number of values, and represents the average of the values in the data set.

Illustration 4: Suppose the daily cereal expenditures of five households in a locality are Rs.60, Rs.80, Rs.65, Rs.45 and Rs.55. Then the average, or arithmetic mean, of the daily cereal expenditures of these five households is computed as

AM = (sum of expenditures of five households) / (number of households)
   = (60 + 80 + 65 + 45 + 55) / 5 = Rs.61

Mathematically, given a set of n observed values (ratio or interval data) x1, x2, ..., xn, the arithmetic mean is defined by

x̄ = (x1 + x2 + ... + xn) / n = (1/n) Σ xi
In spite of its simplicity of calculation, the arithmetic mean is affected by extreme values in the data set, which is a major drawback in its application. This has led to two other measures based on the concept of an average:
- Geometric mean
- Harmonic mean

Geometric Mean: The geometric mean of x1, x2, ..., xn is defined as

x̄_G = (x1 × x2 × ... × xn)^(1/n), provided xi > 0 for all i = 1, 2, ..., n.

Harmonic Mean: The harmonic mean of x1, x2, ..., xn is defined as

x̄_H = n / (1/x1 + 1/x2 + ... + 1/xn), provided xi ≠ 0 for all i = 1, 2, ..., n.

The geometric mean is an appropriate average for observations which tend to be logarithmic in form; it is the correct average for percentage rates of increase and for ratios. The harmonic mean is useful when the data are given in terms of rates; it is often conventional to deal with data in kilometres per hour, units purchased per rupee, or units produced per hour. Mathematically it can be proved that x̄ >= x̄_G >= x̄_H.

For example, given the four observations 5, 8, 10 and 117:

x̄ = (5 + 8 + 10 + 117) / 4 = 35
x̄_G = (5 × 8 × 10 × 117)^(1/4) = (46,800)^(1/4) = 14.71
x̄_H = 4 / (1/5 + 1/8 + 1/10 + 1/117) = 9.23

However, in practice the arithmetic mean is invariably used because of its simplicity and its advanced applications in statistical analysis.

Median

The median is another measure of position which divides a data set into two equal halves and is the middle-most value: half of the values lie below the median and half above it. To compute the median from a data set, the values are ordered either from lowest to highest or from highest to lowest.

Illustration 5: The data of Illustration 4 regarding the daily cereal consumption expenditure of five households may be ordered as Rs.45, Rs.55, Rs.60, Rs.65, Rs.80. The median is the value of the middle-most observation, here the 3rd observation, i.e., Rs.60. If the number of observations is even, the median is taken as the mean of the two middle-most observations.
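The central-tendency examples above can be checked numerically. A minimal sketch, assuming Python, covering the AM/GM/HM example and the median of Illustration 5:

```python
# Checking the worked examples: AM, GM and HM for 5, 8, 10, 117,
# and the median of the household expenditure data (Illustration 5).
data = [5, 8, 10, 117]
n = len(data)

am = sum(data) / n
gm = 1.0
for x in data:
    gm *= x
gm **= 1.0 / n                          # (product)^(1/n)
hm = n / sum(1 / x for x in data)
assert am >= gm >= hm                    # the inequality stated above

expenses = [60, 80, 65, 45, 55]          # Illustration 5 data
ordered = sorted(expenses)
median = ordered[len(ordered) // 2]      # odd count: middle value

print(round(am, 2), round(gm, 2), round(hm, 2))  # 35.0 14.71 9.23
print(median)                                    # 60
```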
The median is not affected by extreme values in the data set, but it has the drawback that it is not based on the numerical values of all observations. Further, it lacks the simplicity of the mean in advanced statistical applications.

Mode

Sometimes we may be interested in the most typical or most frequent value in the data array, or the value around which the maximum concentration of items occurs. That value is called the mode. For instance, a garment manufacturer may like to know the size of men's shirts that has the maximum demand in the market, so that his production can concentrate around that shirt size. In a series of 6 observations (10, 15, 18, 15, 20, 15), the modal value is 15, because it is the most frequent value. However, the mode has limited application in statistical studies.

Measures of Dispersion / Variability

Dispersion implies spread or variability in the values of the items in a given data set. Referring to the data of Illustration 4 on the daily cereal consumption expenditure of 5 households (Rs.45, Rs.55, Rs.60, Rs.65, Rs.80), we find that the values are not all the same, and hence there is variability in the data set. To throw light on this variability we need a measure of variation/dispersion. In the statistical literature there are a number of measures of dispersion to measure the variability in a data set. If there is no variability in the data set, it is said to have zero dispersion, and all the measures will result in zero values. Suppose there are n observations having the values x1, x2, ..., xn. The methods of computation of some important measures of dispersion are stated below.

Variance (σ²):

σ² = (1/n) Σ (xi - x̄)², where x̄ = (1/n) Σ xi

Standard deviation (σ):

σ = √[(1/n) Σ (xi - x̄)²]

Mean absolute deviation (MAD):

MAD = (1/n) Σ |xi - x̄|
Note: Variance is measured in squared units of measurement, unlike the standard deviation or the mean absolute deviation.

Range (R): The range is a very simple and rough measure of variation, defined as

R = X_max - X_min

where X_max and X_min are the maximum and minimum values in the set of observations.

Mean difference (Δ):

Δ = (1 / (n(n-1))) Σi Σj |xi - xj|, i ≠ j

The mean difference is attributed to Gini and measures the intrinsic spread of the distribution independently of any central value.

Quartile Deviation: The median divides the data array into two equal parts: 50% of the observations lie above the median and 50% lie below it. The first quartile (Q1) in an ordered data array is the value above which 75% of the values lie and below which 25% of the values lie. The second quartile is the median. The third quartile (Q3) is the value below which 75% of the values lie and above which 25% of the values lie. Thus the interquartile range is Q3 - Q1, and the quartile deviation is defined as

QD = (Q3 - Q1) / 2

This is also termed the semi-interquartile range.

Relative Measures of Dispersion

Sometimes we may be required to compare variations in two data series where the observations are measured in different units of measurement. For example, suppose we are interested in comparing the variations in heights (measured in cm) and weights (measured in kg) of a group of individuals. Here we need relative measures of variation which are free from the units of measurement. Some measures of this type are given below.

Coefficient of variation: Let σ and x̄ be the standard deviation and arithmetic mean of the values in a data series. Then the coefficient of variation is defined as

CV = standard deviation / mean = σ / x̄

which is free from the unit of measurement, whatever it may be. It is usually expressed as a percentage.
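As a sketch, assuming Python, the measures defined above can be computed directly for the household expenditure data Rs.45, 55, 60, 65, 80:

```python
# Sketch: dispersion measures for the expenditure data above.
data = [45, 55, 60, 65, 80]
n = len(data)
mean = sum(data) / n                                   # 61.0

variance = sum((x - mean) ** 2 for x in data) / n      # 134.0
sd = variance ** 0.5                                   # about 11.58
mad = sum(abs(x - mean) for x in data) / n             # 9.2
rng = max(data) - min(data)                            # range = 35
# Gini mean difference over all ordered pairs (i != j; equal pairs add 0)
mean_diff = sum(abs(x - y) for x in data for y in data) / (n * (n - 1))
cv = sd / mean                                         # coefficient of variation

print(variance, round(sd, 2), mad, rng)
print(mean_diff)       # 16.0
print(round(cv, 4))    # 0.1898
```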
Other relative measures of variation are:
(i) (Mean absolute deviation from the mean) / (Arithmetic mean)
(ii) (Mean absolute deviation from the median) / (Median)
(iii) (Q3 - Q1) / (Q3 + Q1)
(iv) Coefficient of range = (X_max - X_min) / (X_max + X_min)

Illustration 6: Referring to the data of Illustration 4, we compute x̄ = 61 and the median X_m = 60.

σ² = [(45-61)² + (55-61)² + (60-61)² + (65-61)² + (80-61)²] / 5 = 134 (squared units)

Thus σ = 11.58 and MAD (from the mean) = (1/5)(46) = 9.20.

Mean absolute deviation from the median = (1/5)(45) = 9

Q1 = 50, Q3 = 72.50, so QD = (72.50 - 50)/2 = 11.25

Relative measures of variation:
(i) σ / x̄ = 11.58 / 61 = 0.1898
(ii) (MAD from median) / median = 9 / 60 = 0.15
(iii) (Q3 - Q1) / (Q3 + Q1) = 22.50 / 122.50 = 0.1836
(iv) Coefficient of range = (80 - 45) / (80 + 45) = 0.28

Measures of Central Tendency from Grouped Data

Let x1, x2, ..., xk represent the mid-values of k class intervals with frequencies f1, f2, ..., fk respectively, and let N = f1 + f2 + ... + fk.

Arithmetic mean: x̄ = (f1 x1 + f2 x2 + ... + fk xk) / (f1 + f2 + ... + fk)

Geometric mean: x̄_G = (x1^f1 × x2^f2 × ... × xk^fk)^(1/N)

Harmonic mean: x̄_H = N / (f1/x1 + f2/x2 + ... + fk/xk)

Median: The median of a frequency distribution is calculated by forming a cumulative frequency table. As the exact value of the median cannot be located in a grouped frequency distribution, a numerical interpolation formula is required, such as

Median = M_d = l1 + ((l2 - l1) / f) (m - c)

where
l1 = the lower limit of the median class
l2 = the upper limit of the median class
f = the frequency in the median class
m = (f1 + f2 + ... + fk) / 2 = N/2, half the total frequency
c = the cumulative frequency of the classes preceding the median class

It may be pointed out here that the computation of the median in a frequency distribution needs continuous ordered class intervals.

Mode: The modal class is the class for which the frequency is maximum. To compute the mode from a grouped frequency distribution with continuous class intervals, the numerical interpolation formula

Mode = M_o = l1 + i Δ1 / (Δ1 + Δ2)

is adopted, where
l1 = the lower limit of the modal class
i = the size of the class interval
Δ1 = the difference between the frequency in the modal class and the frequency in the preceding class
Δ2 = the difference between the frequency in the modal class and the frequency in the succeeding class

Measures of Dispersion for Grouped Data

(i) Variance: σ² = (1/N) Σ fi (xi - x̄)²
(ii) Standard deviation: σ, the positive square root of the variance
(iii) Mean absolute deviation: MAD = (1/N) Σ fi |xi - x̄|

Illustration 7: Given below is the frequency distribution of the number of defective electric tubes in 50 lots, each of size 100.

TABLE 3: Frequency distribution of defective tubes

Defective tubes | Frequency (f) | Cumulative frequency | Mid-value (x) | f x | Remark
0-5             | 4             | 4                    | 2.5           | 10  |
5-10            | 8             | 12                   | 7.5           | 60  |
10-15           | 14            | 26                   | 12.5          | 175 | median class
15-20           | 18            | 44                   | 17.5          | 315 | modal class
20-25           | 6             | 50                   | 22.5          | 135 |
Total           | 50            |                      |               | 695 |

Mean: x̄ = 695 / 50 = 13.9

Median (the median lies in the class 10-15, since m = 50/2 = 25):

M_d = 10 + ((15 - 10)/14)(25 - 12) = 14.64, where l1 = 10, l2 = 15, f = 14, m = 25 and c = 12.

Mode (the mode lies in the class 15-20):

M_o = 15 + 5 × 4 / (4 + 12) = 15 + 1.25 = 16.25

Standard deviation:

σ = √[(1/50) Σ fi (xi - x̄)²] = √(1552/50) = 5.571

Mean absolute deviation:

MAD = (1/50) Σ fi |xi - x̄| = 232.8/50 = 4.656

TABLE 4: Computations for standard deviation and mean deviation

Mid-value (x) | f  | x - x̄             | (x - x̄)² | f(x - x̄)² | |x - x̄| | f|x - x̄|
2.5           | 4  | 2.5 - 13.9 = -11.4 | 129.96   | 519.84    | 11.4    | 45.6
7.5           | 8  | 7.5 - 13.9 = -6.4  | 40.96    | 327.68    | 6.4     | 51.2
12.5          | 14 | 12.5 - 13.9 = -1.4 | 1.96     | 27.44     | 1.4     | 19.6
17.5          | 18 | 17.5 - 13.9 = 3.6  | 12.96    | 233.28    | 3.6     | 64.8
22.5          | 6  | 22.5 - 13.9 = 8.6  | 73.96    | 443.76    | 8.6     | 51.6
Total         | 50 |                    |          | 1552.00   |         | 232.8

ANALYSIS OF DATA

Data do not speak for themselves; researchers or analysts make them speak. Processing and summarization of data help one get a feel for the data. Exploration of the data at certain stages becomes descriptive analysis and involves the statistical representation of data and the logical ordering of data, such that questions can be raised and answered. The statistical representation of data needs classification, frequency distribution or tabulation. Tabulation, in order to have utility, must have internal logic and order.
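The grouped-data computations of Illustration 7 can be reproduced programmatically. A minimal sketch, assuming Python and the Table 3 frequencies:

```python
# Sketch reproducing Illustration 7: mean, median, mode, standard
# deviation and MAD from the grouped distribution of Table 3.
limits = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]
freqs = [4, 8, 14, 18, 6]
n = sum(freqs)

mids = [(lo + hi) / 2 for lo, hi in limits]
mean = sum(f * x for f, x in zip(freqs, mids)) / n       # 695/50 = 13.9

# Median by numerical interpolation in the median class
cum = 0
for (lo, hi), f in zip(limits, freqs):
    if cum + f >= n / 2:                                  # median class found
        median = lo + (hi - lo) / f * (n / 2 - cum)
        break
    cum += f

# Mode by numerical interpolation in the modal class
k = freqs.index(max(freqs))
d1 = freqs[k] - freqs[k - 1]                              # vs preceding class
d2 = freqs[k] - freqs[k + 1]                              # vs succeeding class
lo, hi = limits[k]
mode = lo + (hi - lo) * d1 / (d1 + d2)

sd = (sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / n) ** 0.5
mad = sum(f * abs(x - mean) for f, x in zip(freqs, mids)) / n

print(mean)               # 13.9
print(round(median, 2))   # 14.64
print(mode)               # 16.25
print(round(sd, 3))       # 5.571
print(round(mad, 3))      # 4.656
```

The simple indexing used to find the modal class assumes, as in Table 3, that the modal class is not the first or last class.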
The most common summary statistics/descriptive statistics are ratios, proportions, measures of location/averages, and measures of dispersion. Consider the age-sex two-way classification of a population below 15 years of age given in Table 5. The proportions of males in the age groups 0-4, 5-9 and 10-14 are 0.526, 0.532 and 0.517 respectively.

TABLE 5

Age group | Ratio (F/M) | Proportion of males, M/(M+F)
0-4       | 0.90        | 0.526 (52.6%)
5-9       | 0.88        | 0.532 (53.2%)
10-14     |             | 0.517 (51.7%)
All       |             | 0.525 (52.5%)

A cross tabulation of bivariate data displays the frequency (or percentage) of all combinations of two or more nominal or categorical variables, to detect association/correlation or cause-effect relationships. The foregoing table is a cross table with sex and age group as the two variables (attributes), representing the frequency for each combination.

Diagrammatic and graphical representations of data are among the classical methods of understanding data and opening up the concealed features in them, which are of interest to researchers and also to laymen. Two such devices are (a) the stem-and-leaf plot and (b) the box plot (box-and-whisker plot). To construct a stem-and-leaf plot, divide each number into two groups: the first few digits form the stem and the remaining digits form the leaf. A box plot summarizes the information contained in the quartiles. The Pareto diagram is a bar chart whose percentages sum to 100; the causes of the problem under investigation are sorted in decreasing importance, with bar heights decreasing from left to right.

The computation of statistical measures of central tendency indicates the average picture of the observations in the data set, and the measures of variation indicate the amount of dispersion or variability in the set. Analytical statistics in the statistical analysis include the comparison of two data sets as regards their averages, ratios and variations, and, if the researcher wishes to know the relationship between a dependent variable and an independent variable, correlation and regression analysis.
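A stem-and-leaf display as described above can be built in a few lines. A sketch assuming Python, with an invented set of marks; each value is split into a stem (leading digits) and a one-digit leaf:

```python
# Sketch: building a stem-and-leaf display by splitting each number
# into a stem (leading digits) and a leaf (last digit).
from collections import defaultdict

def stem_and_leaf(values):
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 34 -> stem 3, leaf 4
        plot[stem].append(leaf)
    return dict(plot)

marks = [23, 25, 31, 34, 34, 42, 47, 48, 55]   # invented data
for stem, leaves in stem_and_leaf(marks).items():
    print(f"{stem} | {' '.join(str(l) for l in leaves)}")
# prints:
# 2 | 3 5
# 3 | 1 4 4
# 4 | 2 7 8
# 5 | 5
```

Unlike a histogram, the display retains every original observation, so the raw data can be read back off the plot.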
When the population under study is large, we resort to a sample study because of time and cost constraints and limited infrastructural facilities. The estimates computed from the sample for population characteristics, such as the population mean and proportions, are subject to sampling and non-sampling errors, which ought to be controlled for valid inferences. Statistical procedures are available in the literature to deal with such cases for efficient statistical analysis.

Research questions are translated into research hypotheses. These hypotheses are tested with the help of samples randomly selected from certain populations, using infinite theoretical population models, such as the normal, exponential, binomial and Poisson populations, which fit many real-life situations. The concept of theoretical models is invoked while testing hypotheses in order to generalize the conclusions derived from samples drawn from finite populations.

When sampling from finite or infinite populations, two inferential problems arise:
1. Estimation of unknown population characteristics such as the mean, total, proportion, variance, correlation coefficient, etc.
2. Testing hypotheses about the unknown population characteristics mentioned above.

The first problem requires computing the estimates along with standard errors or confidence intervals. The standard error, which is the square root of the sampling variance, is a measure of sampling error and indicates the extent of reliability to be placed on the sample estimates. Alternatively, we may compute a confidence interval in which the unknown population parameter is expected to lie with a certain probability. For the second problem, we set up hypotheses based on the research questions and decide from the sample evidence whether to reject or accept the hypothesis at a certain level of significance.
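The estimation problem above can be sketched as follows, assuming Python, an invented sample, and the usual normal approximation with z = 1.96 for an approximate 95% confidence interval:

```python
# Sketch: point estimate of a population mean, its standard error,
# and an approximate 95% confidence interval (sample data invented).
sample = [61, 58, 64, 59, 62, 60, 63, 57, 65, 61]
n = len(sample)

mean = sum(sample) / n
s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)   # sample variance
se = (s2 / n) ** 0.5                                  # standard error of the mean

z = 1.96                                              # approx. 95% normal quantile
ci = (mean - z * se, mean + z * se)
print(round(mean, 2), round(se, 3))
print(tuple(round(c, 2) for c in ci))
```

For small samples one would normally replace the normal quantile with the corresponding Student's t quantile; the normal value is used here only to keep the sketch short.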
Sometimes it may not be possible to guess the form of the distribution from which a particular sample is assumed to be drawn, except that the population is continuous and certain moments exist. For such situations, certain non-parametric or distribution-free tests have been developed in the literature. Non-parametric tests are less efficient than parametric tests, for which the assumption of normality of observations is an essential requirement.

INTERPRETATION OF DATA

After the collection of data and its subsequent analysis, the researcher has to accomplish the task of drawing inferences, followed by report writing. This has to be done carefully, otherwise misleading conclusions may be drawn and the whole process of doing research may get vitiated. It is only through interpretation that the researcher can expose the relations and processes that underlie his findings. If the hypotheses are tested and upheld several times, the researcher may arrive at generalizations.

What is interpretation? Interpretation refers to drawing inferences from the collected facts after an analytical and/or experimental study. It has two major aspects: (a) the effort to establish continuity in research by linking the results of a given study with those of another, and (b) the establishment of some explanatory concepts.

Why interpretation? Interpretation is essential for the simple reason that the usefulness and utility of research findings lie in proper interpretation. It constitutes a basic component of the research process. The reasons are:
1. It is through interpretation that the researcher can well understand the abstract principle that works beneath his findings.
2. Interpretation leads to the establishment of explanatory concepts which can serve as a guide for future research.
3. Only through interpretation can the researcher appreciate why his findings are what they are, and make others understand the real significance of his research findings.
Techniques of Interpretation : The task of interpretation is not a very easy one. It requires the knowledge and expertise of the researcher, along with a basic understanding of the subject of research and its theoretical background. A scientific analysis of statistical data and survey findings helps realistic interpretation. The researcher should base his conclusions and interpretations on reliable data collected for the purpose and on the research done earlier. Broad generalizations of the results should be avoided unless verified by repeated experiments and field surveys. The researcher should avoid making scientific comments and interpretations without verifying them from all possible angles. Limited sample data should be interpreted with caution. If necessary, the researcher should consult experts in his subject of study while interpreting the results obtained in his research.

MODEL QUESTIONS

Fill in the blanks :
1. The averages around which the values of observations lie are called ______.
2. The most popular average is ______.
3. Arithmetic mean is affected by ______ values.
4. Median is not affected by ______ values in the data set.
5. The variability in the data set is measured by ______.
6. The variance is measured in ______ units.
7. The standard deviation is the ______ of variance.
8. Mean deviation and quartile deviation are measures of ______.
9. Relative measures of dispersion are free from ______.
10. Coefficient of variation is a ______ measure of dispersion.
Ans. 1. Measures of central tendency, 2. Arithmetic mean, 3. Extreme, 4. Extreme, 5. Measures of variation / dispersion, 6. Squared, 7. Positive square root, 8. Dispersion, 9. Units of measurement, 10. Relative.

Multiple Choice Questions
1. Point out the relative measure of dispersion :
(a) Variance (b) Quartile deviation (c) Range (d) Coefficient of variation
Ans. (d) Coefficient of variation
2. What is the first step in the analysis of data ?
(a) Data entry to computer (b) Tabulation (c) Editing (d) None of the above
Ans. (c) Editing
3. Which of the following measures is free from the unit of measurement ?
(a) Mean (b) Variance (c) Standard deviation (d) Coefficient of variation
Ans. (d) Coefficient of variation
4. What happens if you change the origin of measurement of a variable ?
(a) Variance may be increased (b) Variance does not change (c) Standard deviation may be reduced (d) None of the above
Ans. (b) Variance does not change

Find out the True / False statements :
1. The variance of a set of observations corresponding to the weights of 10 objects measured in kg is 5 kg.
2. Depending on the data, the correlation coefficient can be greater than unity.
3. The two sets of observations (1, 3, 5) and (2, 4, 6) have the same variance.
4. Variance or standard deviation is invariant under a change of origin.
5. Variance or standard deviation is invariant under a change of scale.
6. Coefficient of variation is the ratio of the standard deviation to the mean, usually expressed as a percentage.
Ans. 1. False, 2. False, 3. True, 4. True, 5. False, 6. True.

Short answer-type questions
1. (a) What are the steps for data preparation ?
Ans. Editing, coding and data entry.
(b) What is transcribing ?
Ans. Transferring coded data into a computer.
(c) What is coding ?
Ans. Survey coding is the process of taking open-ended responses and categorizing them into groups (codes). Once coded, they can be analyzed in the same way as multiple-response questions. The codes used for open-ended comments may vary from person to person.
2. How would you summarize data ?
Ans. Classification, tabulation, graphs, diagrams, and computation of statistical derivatives.
3. (a) What is classification of data ?
Ans. Representing data in different classes.
(b) What is frequency ?
Ans. The number of entities in a particular category or class.
4. (a) What is a class interval ?
Ans. A particular numerical interval, having upper and lower limits, within which certain values fall.
(b) What are closed class intervals and open class intervals ?
Ans. Example : In an age classification one can group the ages of a population into intervals such as 15-20 years, 20-25 years, etc. (closed intervals), and above 20 years (> 20) or below 15 years (< 15) (open intervals).
(c) What is the size of a class interval ?
Ans. Upper limit minus lower limit.
(d) What is relative frequency ?
Ans. The ratio of the frequency in a class interval to the total frequency.
5. (a) What is tabulation of data ?
Ans. Representing classified data in an orderly fashion, enclosed by horizontal and vertical lines.
(b) What is the general format of a table ?
Ans. Table number, title, head note, row and column captions, body of the table, foot note, source note.
(c) What is a master table ?
Ans. A reference table / information table / general purpose table — the detailed table presenting the available data of the enquiry in an orderly fashion.
(d) What is a summary table ?
Ans. A special purpose table derived from a general purpose table.
(e) Given a frequency table, how would you reproduce the data ?
Ans. Graphs and diagrams.
(f) What is a graph ?
Ans. A scale-dependent geometrical figure.
(g) What is a histogram ?
Ans. A graphical representation of a frequency distribution.
(h) What is a diagram ?
Ans. A diagrammatic representation of data in one, two or more dimensions.
(i) What is the importance of diagrams ?
Ans. They permit easy visual interpretation.
(j) What are statistical derivatives from a data set ?

Essay-type questions
1. What do you understand by processing of data ? What are its different components ? Explain.
2. What are the different steps for data preparation ? Discuss.
3. Write brief notes on :
(a) Graphical representation of data
(b) Diagrammatical representation of data
(c) Classification of data
(d) Tabulation of data
4. What is cross tabulation of data ? Explain its uses.
5. Mention the considerations for the analysis of data.
6. What is interpretation ? Why do you interpret ? What are the techniques for interpretation ?
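The classification ideas asked about above — closed class intervals, frequency, and relative frequency — can be sketched in a few lines of Python. The ages and the interval width of 10 are hypothetical choices for illustration.

```python
from collections import Counter

# Hypothetical ages of 20 respondents (illustrative data only)
ages = [16, 22, 19, 34, 28, 45, 31, 23, 38, 27,
        52, 41, 29, 33, 24, 18, 36, 47, 26, 30]

def class_of(age, width=10):
    """Return the closed class interval [low, low + width) containing age."""
    low = (age // width) * width
    return (low, low + width)

freq = Counter(class_of(a) for a in ages)   # frequency of each class interval
total = sum(freq.values())

# Tabulate each class interval with its frequency and relative frequency
for (low, high), f in sorted(freq.items()):
    print(f"{low}-{high}: frequency = {f}, relative frequency = {f / total:.2f}")
```

The relative frequencies sum to 1 by construction, which is the check one applies to any frequency table before graphing it as a histogram.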
