
Anova

Uploaded by

Danish Vohra
Copyright
© © All Rights Reserved
ANALYSIS OF VARIANCE

16.1. INTRODUCTION

Analysis of variance, developed by R.A. Fisher, is one of the most powerful tools of statistical analysis. It is used to partition total variation into various components according to the nature of classification of the data. In the Chapter on 'Tests of Significance' we discussed the methods to infer whether a pair of samples in question came from the same population or from two divergent populations. For large samples this was done by using the ratio of the difference in sample means to its standard error, where the said ratio followed the normal distribution, while in the case of small samples the ratio followed the t-distribution. The method discussed in this Chapter is useful in drawing inferences regarding differences between several means simultaneously. The main objective of this method is to partition the variation present in the given individual observations into different components and to perform tests of significance.

For example, in an agricultural experiment an experimenter applies four kinds of fertilizers to eight plots, in each of which wheat has been sown (a particular fertilizer is applied to two plots), and records the yield from each plot. He may be interested in knowing the effects of the various fertilizers on the yield. The answer to this problem and many such problems is provided by the technique of analysis of variance. In the following paragraphs we shall discuss various aspects of one way and two way classification to pinpoint the use of analysis of variance.

In one way classification, where the observations are classified according to only one criterion, there are two types of variation in the data, namely the variation between samples and the variation within samples.
If the variation within the samples and the variation between the samples do not differ from each other, the samples are said to belong to the same population, as in such a situation the variation between the samples as well as within the samples will be estimating the same population variation. Under these circumstances, if we had taken one sample of all the items, it would not have made any difference as far as variation of items is concerned. On the other hand, larger variation between the samples as compared to variation within the samples indicates that the samples come from divergent populations.

In the two way classification, where the observations are classified according to two factors and the number of observations in each cell is one, we partition the total variation in the observations into three components, namely the variation due to each of the two factors and error (the left out portion), and compare the variation due to each factor with the variation due to error. On the other hand, if the number of observations in each cell is r (r > 1), we can estimate and test the significance of the interaction of the two factors as well.

16.2. ASSUMPTIONS IN ANALYSIS OF VARIANCE

The ratio of variation between samples to variation within samples obtained during the analysis of variance follows the F-distribution under the following realistic assumptions:

1. All the populations from which we draw samples should be normally distributed.
2. The variances of all the populations from which samples are taken should be equal, i.e., $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_t^2$ (t = number of samples drawn).
3. The individuals or units constituting the sample have been randomly selected from the population.

Violation of any of the above assumptions seriously affects the validity of the test of significance.

16.3. ANALYSIS OF VARIANCE TECHNIQUE

Now we shall discuss the analysis of variance under the following heads:
I. One Way Classification

In one way classification, $y_{ij}$, the j-th observation affected by the i-th factor, can be expressed in the form of a model as below:

$y_{ij} = m + \alpha_i + e_{ij}$, $i = 1, 2, \ldots, t$; $j = 1, 2, \ldots, n_i$; $\sum_i n_i = n$

where m denotes the general mean, $\alpha_i$ the effect of the i-th factor and $e_{ij}$ the error in the observation $y_{ij}$. Here t denotes the number of factors, $n_i$ represents the number of times the i-th factor is applied and n denotes the total number of observations. In this classification the hypothesis of interest is $H_0$: $\alpha_i = 0$ for all values of i, while the alternative hypothesis is $H_1$: at least two of the $\alpha_i$'s differ.

To infer about the above cited hypothesis, the various steps involved are outlined below:

1. Calculate the grand total of all the observations in question and denote it by G. Take the square of G and divide it by the number of observations on which G is based. The result thus obtained is called the correction factor (C.F.). Mathematically, C.F. $= G^2/n$.
2. Subtract C.F. from the sum of squares of all the individual observations. The result obtained is called the total corrected sum of squares (SST): SST $= \sum_i \sum_j y_{ij}^2 - \text{C.F.}$
3. Find the totals of all the observations in each sample. Let $T_i$ denote the total of the i-th sample. Divide the square of each sample total by the number of observations in that sample. Add all these values and subtract the correction factor from the sum. The result thus obtained is the sum of squares between samples (SSC), i.e., SSC $= \sum_i T_i^2/n_i - \text{C.F.}$, where $n_i$ represents the number of observations of the i-th sample.
4. Obtain the sum of squares within samples (SSE) by subtracting the sum of squares between samples from the total corrected sum of squares: SSE = SST - SSC.
5. Obtain the variance within samples and the variance between samples by dividing the sum of squares of each by its corresponding degrees of freedom (d.f.). Mathematically,
   Variance within samples = SSE / d.f. within samples
   Variance between samples = SSC / d.f. between samples
6. Calculate the F value, F = Variance between samples / Variance within samples.
7. Compare the value obtained in step 6 with the table value of F under the appropriate degrees of freedom at the desired level of significance. If the calculated value of F is greater than or equal to the table value, $H_0$ is rejected, and vice-versa.

It is customary to summarize the various results obtained in the form of a table called the analysis of variance (ANOVA) table.

ANOVA TABLE

Source of variation    d.f.     Sum of squares    Mean sum of squares    F ratio
Between samples        K - 1    SSC               MSC = SSC/(K - 1)      F = MSC/MSE
Within samples         n - K    SSE               MSE = SSE/(n - K)
Total                  n - 1    SST

Here K is the number of samples (denoted t in the model).

Note. The analysis of variance technique is independent of the shift of origin. If the individual observations are large values, we can subtract any constant quantity from each individual value and afterwards analyse the data with the new observations by following the given steps. The calculated value of F and all other results will remain the same and the calculation work will be considerably reduced.

Illustration. Following are the yields obtained in kgs of three varieties of wheat, WL 711, WG 357 and a third variety, sown in 14 plots.

WL 711           12   13   14   13
WG 357           10    9   10    9    9
Third variety    13   14   13   12   14

Is there any significant difference among the production of the three varieties?

Solution. Let us take the null hypothesis that there is no significant difference among the production of the three varieties. We shall analyse the above data using the analysis of variance technique.

         WL 711              WG 357              Third variety
         $X_1$   $X_1^2$     $X_2$   $X_2^2$     $X_3$   $X_3^2$
         12      144         10      100         13      169
         13      169          9       81         14      196
         14      196         10      100         13      169
         13      169          9       81         12      144
                              9       81         14      196
Total    52      678         47      443         66      874

Grand total G = 52 + 47 + 66 = 165, n = 14

Correction factor C.F. $= (165)^2/14 = 1944.64$

Total corrected sum of squares (SST) = 678 + 443 + 874 - C.F. = 1995 - 1944.64 = 50.36
Sum of squares between varieties (SSC) $= \frac{(52)^2}{4} + \frac{(47)^2}{5} + \frac{(66)^2}{5} - \text{C.F.} = 1989 - 1944.64 = 44.36$

Sum of squares within varieties (SSE) = SST - SSC = 50.36 - 44.36 = 6.00

ANALYSIS OF VARIANCE TABLE

Source of variation    d.f.    Sum of squares    Mean sum of squares    F ratio
Between varieties      2       44.36             22.18                  40.33
Within varieties       11      6.00              0.55
Total                  13      50.36

The table value of F at 5 per cent level of significance for (2, 11) d.f. is 3.98. Since the calculated value of F = 40.33 is greater than the table value of F, we reject the null hypothesis and conclude that the three varieties differ significantly as far as their yields are concerned.

The calculations involved in the above example can be further reduced if we subtract a constant number, say 9, from each of the observations and then analyse the data.

New Data:

         WL 711              WG 357              Third variety
         $X_1$   $X_1^2$     $X_2$   $X_2^2$     $X_3$   $X_3^2$
         3       9           1       1           4       16
         4       16          0       0           5       25
         5       25          1       1           4       16
         4       16          0       0           3       9
                             0       0           5       25
Total    16      66          2       2           21      91

Now G = 16 + 2 + 21 = 39, n = 14

C.F. $= (39)^2/14 = 108.64$

Total corrected sum of squares (SST) = 66 + 2 + 91 - 108.64 = 50.36

Sum of squares between varieties (SSC) $= \frac{(16)^2}{4} + \frac{(2)^2}{5} + \frac{(21)^2}{5} - 108.64 = 153 - 108.64 = 44.36$

Sum of squares within varieties (SSE) = 50.36 - 44.36 = 6.00

Thus we can see that the results are not at all affected if we shift the origin of the individual observations. The new values obtained in this way can also be negative. One should be careful while handling such observations.

II. Two Way Classification without Interaction (one observation in each cell)

In two way classification let $y_{ij}$ be the observation obtained from the unit to which the i-th level of factor A and the j-th level of factor B have been applied. We can express $y_{ij}$ as

$y_{ij} = m + \alpha_i + \beta_j + e_{ij}$

where m denotes the general mean, $\alpha_i$ the effect of the i-th level of factor A, $\beta_j$ the effect of the j-th level of factor B and $e_{ij}$ the error in the observation $y_{ij}$. In this classification the two null hypotheses of interest are $H_0$: $\alpha_i = 0$ for all i and $H_0$: $\beta_j = 0$ for all j. The corresponding alternate hypotheses are $H_1$: at least two $\alpha$'s differ and $H_1$: at least two $\beta$'s differ. To infer about the above cited hypotheses, the various steps involved are outlined below.
We assume that when the various observations are written in the form of a two-way table, it contains r rows (levels of A) and c columns (levels of B).

1. Calculate the grand total $G = \sum_i \sum_j y_{ij}$ of all the observations and divide its square by the total number of observations to obtain the correction factor. Mathematically, C.F. $= G^2/n = G^2/(rc)$, where n = number of rows × number of columns = rc. The total corrected sum of squares (SST) is given as SST $= \sum_i \sum_j y_{ij}^2 - \text{C.F.}$
2. Find the row totals. Each row total is a total of c observations. Let $y_{i.}$ denote the total of the observations of the i-th row. The sum of squares due to factor A (SSA) is SSA $= \sum_{i=1}^{r} y_{i.}^2/c - \text{C.F.}$
3. Find the column totals. Each column total is a total of r observations. Let $y_{.j}$ denote the total of the observations of the j-th column. Then the sum of squares due to factor B (SSB) is obtained as SSB $= \sum_{j=1}^{c} y_{.j}^2/r - \text{C.F.}$ The error sum of squares and the variances follow as
   SSE = SST - SSA - SSB
   Variance due to factor A = SSA / d.f. for factor A
   Variance due to factor B = SSB / d.f. for factor B
   Variance due to error = SSE / d.f. for error
4. Calculate the F values as: F = Variance due to a factor / Variance due to error.
5. Compare the value of F obtained in step 4 with the table value of F under the appropriate degrees of freedom at the desired level of significance.

The above results can be written in table form as:

Source of variation    d.f.              Sum of squares    Mean sum of squares            F ratio
Factor A               r - 1             SSA               MSA = SSA/(r - 1)              MSA/MSE
Factor B               c - 1             SSB               MSB = SSB/(c - 1)              MSB/MSE
Error                  (r - 1)(c - 1)    SSE               MSE = SSE/[(r - 1)(c - 1)]
Total                  rc - 1            SST

Illustration. A company appoints four salesmen P, Q, R and S and observes their sales in three seasons: summer, winter and monsoon. The figures (in lakhs) are given below:

              Salesmen
Season        P     Q     R     S
Summer        13    16    16    14
Winter        17    16    17    16
Monsoon       13    14    15    15

Carry out the analysis of variance. What conclusions do you draw from the analysis?

Solution. These data are classified according to (i) salesmen and (ii) seasons.
To make the calculations easier we subtract 15 from each observation and then analyse the data according to the above procedure.

                    Salesmen
Season              P     Q     R     S     Season's total
Summer              -2    1     1     -1    -1
Winter              2     1     2     1     6
Monsoon             -2    -1    0     0     -3
Salesman's total    -2    1     3     0     G = 2

Correction factor $= G^2/n = (2)^2/12 = 0.33$

Total corrected sum of squares (SST) $= (-2)^2 + (1)^2 + (1)^2 + (-1)^2 + (2)^2 + (1)^2 + (2)^2 + (1)^2 + (-2)^2 + (-1)^2 + (0)^2 + (0)^2 - \text{C.F.} = 22 - 0.33 = 21.67$

Sum of squares due to seasons (SSA) $= \frac{(-1)^2 + (6)^2 + (-3)^2}{4} - \text{C.F.} = 11.50 - 0.33 = 11.17$

Sum of squares due to salesmen (SSB) $= \frac{(-2)^2 + (1)^2 + (3)^2 + (0)^2}{3} - \text{C.F.} = 4.67 - 0.33 = 4.34$

Sum of squares due to error (SSE) = SST - SSA - SSB = 21.67 - 11.17 - 4.34 = 6.16

ANOVA TABLE

Source of variation    d.f.    Sum of squares    Mean sum of squares    F ratio
Seasons                2       11.17             5.58                   5.4
Salesmen               3       4.34              1.45                   1.4
Error                  6       6.16              1.03
Total                  11      21.67

To test the hypothesis that there is no difference between the sales of salesmen and of seasons, we compare the calculated value of F with the table value at the appropriate degrees of freedom and the desired level of significance. The table value of F for (2, 6) d.f. at 5 per cent level of significance is 5.14. Since $F_{cal} = 5.4 > F_{table} = 5.14$, we reject the null hypothesis and conclude that seasons have a significant effect on sales. Also, the table value of F for (3, 6) d.f. at 5 per cent level of significance is 4.76. Since $F_{cal} = 1.4 < F_{table} = 4.76$, we accept the null hypothesis and conclude that the salesmen do not differ significantly in their sales.

[...] $F_{table}$ = 3.01, so we reject the null hypothesis and conclude that the varieties differ significantly. To test the difference between spacings we compare the calculated value of F with the table value of F at (2, 24) d.f. at 5 per cent level of significance. Since $F_{cal} = 5.24 > F_{table}$, we reject the null hypothesis and conclude that the various spacings differ significantly. To test the interaction due to varieties and spacings, we compare the calculated value of F with the table value at 5 per cent level of significance. Since $F_{cal} = 1.44 < F_{table} = 2.51$, we accept the null hypothesis and conclude that the interactions of varieties and spacings do not differ from each other.
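The sum-of-squares steps for the one way classification and for the two way classification without interaction can be sketched in Python as below. This is only an illustrative sketch: the function names `one_way_anova` and `two_way_anova` and the dictionary keys are assumptions made here, not part of the chapter. The code keeps full precision, so the one-way F for the wheat data comes out slightly higher (about 40.7) than the 40.33 in the text, which divides by the error mean square rounded to 0.55.

```python
"""Sketch of the ANOVA sum-of-squares steps (names are illustrative only)."""


def one_way_anova(samples):
    """One way classification; samples is a list of lists, sizes may differ."""
    k = len(samples)                               # number of samples (K)
    n = sum(len(s) for s in samples)               # total number of observations
    grand_total = sum(sum(s) for s in samples)     # G
    cf = grand_total ** 2 / n                      # correction factor G^2 / n
    sst = sum(y * y for s in samples for y in s) - cf
    ssc = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    sse = sst - ssc
    msc = ssc / (k - 1)                            # variance between samples
    mse = sse / (n - k)                            # variance within samples
    return {"SST": sst, "SSC": ssc, "SSE": sse,
            "df": (k - 1, n - k), "MSC": msc, "MSE": mse, "F": msc / mse}


def two_way_anova(table):
    """Two way classification, one observation per cell.

    table[i][j] is the observation for level i of factor A (row)
    and level j of factor B (column).
    """
    r, c = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / (r * c)
    sst = sum(y * y for row in table for y in row) - cf
    ssa = sum(sum(row) ** 2 for row in table) / c - cf        # row totals
    ssb = sum(sum(col) ** 2 for col in zip(*table)) / r - cf  # column totals
    sse = sst - ssa - ssb
    msa, msb = ssa / (r - 1), ssb / (c - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {"SST": sst, "SSA": ssa, "SSB": ssb, "SSE": sse,
            "MSA": msa, "MSB": msb, "MSE": mse,
            "F_A": msa / mse, "F_B": msb / mse}


# Wheat yields from the one-way illustration (three varieties, 14 plots).
wheat = [[12, 13, 14, 13], [10, 9, 10, 9, 9], [13, 14, 13, 12, 14]]
# Sales from the two-way illustration: rows are seasons, columns are salesmen.
sales = [[13, 16, 16, 14], [17, 16, 17, 16], [13, 14, 15, 15]]

print(one_way_anova(wheat))
print(two_way_anova(sales))
```

Calling `one_way_anova` on the wheat data after subtracting 9 from every observation gives the same sums of squares and the same F, which is the shift-of-origin property noted in the text. The calculated F values still have to be compared with tabulated F values at the chosen level of significance; the sketch does not look those up.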
EXERCISES

1. Explain the meaning and uses of analysis of variance. How is an analysis of variance table set up and how is a test based on it performed? What are the assumptions underlying such a test?
2. What is analysis of variance and what is the significance of its study? Explain the F-ratio used in the study.
3. Describe the technique of analysis of variance for a one way classification.
4. State the mathematical model with assumptions in analysis of variance of a two way classification. Explain the hypotheses to be tested. Write down the analysis of variance table for a two way layout.
5. The following data give the increase in weight of chicks (noted after a certain period) when they were fed four different kinds of feeds. Can we infer from the data that the feeds differ from each other as far as their effect on increasing the weights is concerned?

[Table: increase in weight of chicks by feed; legible entries: 50, 58, 62, 54, 60, 62]

6. A test was given to five students taken at random from the tenth class of three schools of a town. The individual scores are:

School I      8    7    9    6    8
School II     7    5    4    6    5
School III    6    5    4    6    4

Carry out the analysis of variance and state your conclusions.

7. The following table represents the yield of wheat in bushels per acre for trial plots treated with four different levels of fertilizer. Each level was applied to five plots randomly chosen over a field.

                Treatments
Plot number     1      2      3      4
1               ...    24     ...    ...
2               25     33     26     39
3               ...    ...    ...    ...
4               32     ...    39     ...
5               ...    ...    ...    ...

Carry out the analysis of variance and state your conclusions.

8. The following table gives the wholesale price of a commodity (Rs. per kg) in some shops selected at random in four cities.

[Table: prices by city P, Q, R and S; legible entries: R: 20, ..., 19, ...; S: 23, 24, ...]

Carry out the analysis of variance to test the significance of the difference among the prices of the commodity in the four cities.

9. A tea company appoints four salesmen A, B, C and D and observes their sales in three seasons, i.e., summer, winter and monsoon. The figures (Rs. in lakhs) are given in the following table:

              Salesmen
Season        A     B     C      D
Summer        25    20    ...    ...
Winter        28    27    ...    ...
Monsoon       23    25    ...    30

Carry out the analysis of variance and comment on the results thus obtained.

10. An experimenter applied Nitrogen (N) and Phosphorus (P) at different levels in an experiment. The yields recorded from each plot are given below (each treatment combination was tried on two plots). Analyse the data.

[Table: yields by nitrogen level and phosphorus level, two plots per combination]

11. A certain company had four salesmen A, B, C and D, each of whom was sent for a month to three types of area: country side K, outskirts of a city O and shopping centre S. The sales in thousands of rupees per month are shown below:

[Table: sales by area (K, O, S) and salesman (A, B, C, D); legible entries: 70, 30, 50, 40, 70, 100, 60, 80]

Carry out the analysis of variance and interpret the results.

12. Perform the analysis of variance from the following information:
(a) Total yield from all plots = 84 kgs.
(b) Total sum of squares = 624.
(c) Block totals: 21, 15, 24, 24.
(d) Variety totals: 24, 28, 32.

13. An experiment was conducted to determine the effects of five different varieties of cowpeas ($C_1$, ..., $C_5$) and three different spacings ($S_1$, $S_2$ and $S_3$), i.e., 4", 6" and 8" apart in a row with rows 3' apart, and also to see whether the varieties behave differently at different spacings. The data below give the yield of each of 3 plots taken for each variety-spacing combination:

            Spacings
Variety     $S_1$           $S_2$           $S_3$
$C_1$       55  48  52      60  55  57      65  62  58
$C_2$       58  50  62      62  57  59      63  64  65
$C_3$       52  56  50      60  55  58      60  58  62
$C_4$       50  55  57      61  57  59      62  64  66
$C_5$       52  56  58      62  64  66      68  66  64

Carry out the analysis of variance for the above data.
