Maths (Module 4) 1

MODULE-IV (PART-1)
Testing of Hypothesis
* One and Two-tailed Test
* Test on Single Mean
* Test on Two Means

TEST OF HYPOTHESIS

HYPOTHESIS: It is a claim, belief, or assumption about a population parameter. A statistical hypothesis is an assertion or conjecture concerning one or more populations. It is of two types:

(1) Null hypothesis ($H_0$): The hypothesis which is initially assumed to be true (although it may in fact be either true or false based on the sample data) is called the null hypothesis. Null hypothesis refers to any hypothesis we wish to test. It is denoted by $H_0$.

(2) Alternative hypothesis ($H_1$ or $H_a$): The hypothesis which goes against the null hypothesis is called the alternative hypothesis. The rejection of the null hypothesis leads to the acceptance of an alternative hypothesis.

The decisions in hypothesis testing can be classified as follows; two of them lead to wrong conclusions (errors):

| | $H_0$ is true | $H_0$ is false |
|---|---|---|
| ACCEPT $H_0$ | Correct decision ($1-\alpha$) | Type-II error ($\beta$) |
| REJECT $H_0$ | Type-I error ($\alpha$) | Correct decision ($1-\beta$) |

From the above table, it is clear that two types of errors can be committed in hypothesis testing. They are:

Type-I Error: The error committed when we reject the null hypothesis although it is true is called Type-I error. That is, the rejection of a true hypothesis is called Type-I error.

Type-II Error: The error committed when we accept the null hypothesis although it is false is called Type-II error. That is, the acceptance of a false hypothesis is called Type-II error.

Level of significance (LOS) ($\alpha$): The probability of committing a Type-I error is called the level of significance. It is denoted by $\alpha$. That is, $\alpha = P(\text{reject } H_0 \text{ when } H_0 \text{ is true})$, i.e., $\alpha = P(H_1 \mid H_0)$.

Power of Test ($1-\beta$): The ability (or probability) to reject the null hypothesis when it is false in a hypothesis test is called the power of a test. It is denoted by $1-\beta$. That is, $1-\beta = P(\text{reject } H_0 \text{ when it is false})$,
i.e., $1-\beta = P(H_1 \mid H_1)$, where we define $\beta = P(\text{committing a Type-II error}) = P(\text{accept } H_0 \text{ when it is false})$.

1. [...] the sample size [...]
2. The Type-I error and Type-II error are related. A decrease in the probability of one generally results in an increase in the probability of the other.
3. The size of the critical region, and therefore the probability of committing a Type-I error, can always be reduced by adjusting the critical value(s).
4. An increase in the sample size $n$ will reduce $\alpha$ and $\beta$ simultaneously.
5. If the null hypothesis is false, $\beta$ is a maximum when the true value of a parameter approaches the hypothesized value. The greater the distance between the true value and the hypothesized value, the smaller $\beta$ will be.

Direction of the hypothesis test (Type of Test): Hypothesis tests are of two types: one-tailed and two-tailed.

(i) One-tailed Test: It is a test of any statistical hypothesis where the alternative is one-sided. The type of test where the critical region (region of rejection) under the sampling distribution curve lies in only one tail of the curve is called a one-tailed test. It is again of 2 types:

(a) Left one-tailed test: It has the form $H_0: \theta = \theta_0$ or $\theta \ge \theta_0$; $H_1: \theta < \theta_0$. Here the critical region lies in the left tail of the sampling distribution curve and the area of the critical region is $\alpha$. The critical region is $z < -z_\alpha$.

(b) Right one-tailed test: It has the form $H_0: \theta = \theta_0$ or $\theta \le \theta_0$; $H_1: \theta > \theta_0$. Here the critical region lies in the right tail of the sampling distribution curve and the area of the critical region is $\alpha$. The critical region is $z > z_\alpha$.

(ii) Two-tailed Test: It is a test where the alternative is two-sided. It has the form $H_0: \theta = \theta_0$; $H_1: \theta \ne \theta_0$. Here the critical region lies in both the left tail and the right tail of the sampling distribution curve and the total area of the critical region is $\alpha$ ($\alpha/2$ in each tail). The critical region is $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$.

PROCEDURE FOR HYPOTHESIS TESTING:

1. State the null and alternative hypotheses.
2. Choose a fixed significance level $\alpha$.
3. Choose an appropriate test statistic by using the formula
   Test statistic = (sample statistic - hypothesized parameter) / (standard error of the sampling distribution of the statistic).
   Compute the value of the test statistic as $z_{\text{comp}} = \dfrac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$.
4. Establish the critical region based on $\alpha$.
5. Reject $H_0$
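if the computed test statistic is in the critical region; otherwise, do not reject.

P-Value Approach: A P-value is the lowest level (of significance) at which the observed value of the test statistic is significant.

For the right one-tailed test, the P-value is $P = P(Z > z_{\text{comp}})$.
For the left one-tailed test, the P-value is $P = P(Z < z_{\text{comp}})$.
For the two-tailed test, the P-value is $P = 2P(Z > |z_{\text{comp}}|)$.

Decision step: If $P \le \alpha$, reject $H_0$; if $P > \alpha$, accept $H_0$.

Procedure for Tests on a Single Mean (variance known)

1. State the null and alternative hypotheses and identify the type of test.

| | LEFT ONE-TAILED | RIGHT ONE-TAILED | TWO-TAILED |
|---|---|---|---|
| $H_0$ | $\mu = \mu_0$ or $\mu \ge \mu_0$ | $\mu = \mu_0$ or $\mu \le \mu_0$ | $\mu = \mu_0$ |
| $H_1$ | $\mu < \mu_0$ | $\mu > \mu_0$ | $\mu \ne \mu_0$ |

2. Choose a fixed significance level $\alpha$ and find the critical value.

| | LEFT ONE-TAILED | RIGHT ONE-TAILED | TWO-TAILED |
|---|---|---|---|
| Critical value | $-z_\alpha$ | $z_\alpha$ | $-z_{\alpha/2}$ and $z_{\alpha/2}$ |

3. Choose an appropriate test statistic and compute its value as $z_{\text{comp}} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.

4. Establish the critical region (CR) based on $\alpha$.

| TEST | LEFT ONE-TAILED | RIGHT ONE-TAILED | TWO-TAILED |
|---|---|---|---|
| CR | $Z < -z_\alpha$ | $Z > z_\alpha$ | $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$ |

5. Reject $H_0$ if the computed test statistic is in the critical region. Otherwise, do not reject.

Example: A random sample of 100 recorded deaths in the United States during the past year showed an average life span of 71.8 years. Assuming a population standard deviation of 8.9 years, does this seem to indicate that the mean life span today is greater than 70 years? Use a 0.05 level of significance.

Solution:
1. $H_0: \mu \le 70$ years.
2. $H_1: \mu > 70$ years (right one-tailed test).
3. $\alpha = 0.05$, so the critical value is $z_\alpha = 1.645$.
4. Critical region: $Z > z_\alpha$, i.e., $z > 1.645$.
5. Computations: $\bar{x} = 71.8$ years, $\sigma = 8.9$ years, $n = 100$, and hence the computed value of the test statistic is $z_{\text{comp}} = \dfrac{71.8 - 70}{8.9/\sqrt{100}} = 2.02$.
6. Decision: Since $2.02 > 1.645$, reject $H_0$ and conclude that the mean life span today is greater than 70 years.

The P-value is the area under the standard normal curve to the right of $z_{\text{comp}}$: $P = P(Z > z_{\text{comp}}) = P(Z > 2.02) = 0.0217 < \alpha$.

Example: A sample of $n = 50$ items has a mean breaking strength of $\bar{x} = 7.8$ kilograms, and the population standard deviation is $\sigma = 0.5$ kilogram. Test the hypothesis that $\mu = 8$ kilograms against the alternative $\mu \ne 8$ kilograms at the 0.01 level of significance.

Solution:
1. $H_0: \mu = 8$ kilograms.
2. $H_1: \mu \ne 8$ kilograms (two-tailed test).
3. $\alpha = 0.01$, so the critical values are $\pm z_{\alpha/2} = \pm 2.575$.
4. Critical region: $z < -2.575$ or $z > 2.575$.
5. Computations: $\bar{x} = 7.8$ kilograms, $\sigma = 0.5$, $n = 50$, and hence the computed value of the test statistic is $z_{\text{comp}} = \dfrac{7.8 - 8}{0.5/\sqrt{50}} = -2.83$.
6. Decision: Reject $H_0$ and conclude that the average breaking strength is not equal to 8 but is, in fact, less than 8 kilograms.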
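The two z-test examples above can be checked numerically. The following is a minimal Python sketch and is not part of the original notes: the helper names `normal_cdf` and `z_test` are hypothetical, and the standard normal CDF is computed from `math.erf` instead of being read from a table, so the P-values may differ from the tabulated ones in the last digit.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_test(xbar, mu0, sigma, n, tail):
    """Single-mean z test (variance known).

    tail is "left", "right", or "two" (hypothetical labels for this sketch).
    Returns the computed statistic and its P-value.
    """
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    if tail == "right":
        p = 1.0 - normal_cdf(z)
    elif tail == "left":
        p = normal_cdf(z)
    elif tail == "two":
        p = 2.0 * (1.0 - normal_cdf(abs(z)))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p

# Life-span example: right one-tailed test at alpha = 0.05
z1, p1 = z_test(xbar=71.8, mu0=70, sigma=8.9, n=100, tail="right")
print(f"life span:         z = {z1:.2f}, P = {p1:.4f}, reject H0: {p1 <= 0.05}")

# Breaking-strength example: two-tailed test at alpha = 0.01
z2, p2 = z_test(xbar=7.8, mu0=8, sigma=0.5, n=50, tail="two")
print(f"breaking strength: z = {z2:.2f}, P = {p2:.4f}, reject H0: {p2 <= 0.01}")
```

Both decisions agree with the critical-region approach, since rejecting when $P \le \alpha$ is equivalent to the statistic falling in the critical region.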
Since the breaking-strength test is two-tailed, the desired P-value is twice the area under the standard normal curve to the left of $z = -2.83$. Therefore, using the normal table, we have

$P = P(|Z| > 2.83) = 2P(Z < -2.83) = 0.0046 < \alpha$,

which allows us to reject the null hypothesis that $\mu = 8$ kilograms.

Procedure for Tests on a Single Mean (variance unknown)

1. State the null and alternative hypotheses and identify the type of test.

| | LEFT ONE-TAILED | RIGHT ONE-TAILED | TWO-TAILED |
|---|---|---|---|
| $H_0$ | $\mu = \mu_0$ or $\mu \ge \mu_0$ | $\mu = \mu_0$ or $\mu \le \mu_0$ | $\mu = \mu_0$ |
| $H_1$ | $\mu < \mu_0$ | $\mu > \mu_0$ | $\mu \ne \mu_0$ |

2. Choose $\alpha$ and find the critical value (at $n-1$ degrees of freedom).

| | LEFT ONE-TAILED | RIGHT ONE-TAILED | TWO-TAILED |
|---|---|---|---|
| Critical value | $-t_\alpha$ | $t_\alpha$ | $-t_{\alpha/2}$ and $t_{\alpha/2}$ |

3. Here we use the t-test statistic and compute its value as $t_{\text{comp}} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$.

4. Establish the critical region (CR) based on $\alpha$.

| TEST | LEFT ONE-TAILED | RIGHT ONE-TAILED | TWO-TAILED |
|---|---|---|---|
| CR | $t < -t_\alpha$ | $t > t_\alpha$ | $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$ |

5. Reject $H_0$ if the computed test statistic is in the critical region. Otherwise, do not reject.

Example: The Edison Electric Institute has published figures on the number of kilowatt hours used annually by various home appliances. It is claimed that a vacuum cleaner uses an average of 46 kilowatt hours per year. If a random sample of 12 homes included in a planned study indicates that vacuum cleaners use an average of 42 kilowatt hours per year with a standard deviation of 11.9 kilowatt hours, does this suggest at the 0.05 level of significance that vacuum cleaners use, on average, less than 46 kilowatt hours annually? Assume the population of kilowatt hours to be normal.

Solution:
1. $H_0: \mu = 46$ kilowatt hours.
2. $H_1: \mu < 46$ kilowatt hours (left one-tailed test).
3. $\alpha = 0.05$.
4. Critical region: $t < -1.796$, where $-t_{0.05} = -1.796$ at $v = 11$ degrees of freedom.
5. Computations: $\bar{x} = 42$ kilowatt hours, $s = 11.9$ kilowatt hours, and $n = 12$. Hence $t_{\text{comp}} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{42 - 46}{11.9/\sqrt{12}} = -1.16$. Since $t_{\text{comp}} > -t_\alpha$, we accept $H_0$.
6. Decision: $P = P(T < -1.16) \approx 0.135 > \alpha$, so we accept $H_0$ by the P-value approach as well.
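The vacuum-cleaner t test can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not the method of the original notes: the helpers `t_pdf` and `t_cdf` are hypothetical names, and the t-distribution CDF is obtained by numerically integrating the density with Simpson's rule instead of looking it up in a table. The critical value $-1.796$ is taken from the example itself.

```python
import math

def t_pdf(t, v):
    """Density of Student's t distribution with v degrees of freedom."""
    c = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return c * (1 + t * t / v) ** (-(v + 1) / 2)

def t_cdf(t, v, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry about 0.

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, v) + t_pdf(a, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

# Vacuum-cleaner example: H0: mu = 46, H1: mu < 46, alpha = 0.05
xbar, mu0, s, n = 42.0, 46.0, 11.9, 12
v = n - 1
t_comp = (xbar - mu0) / (s / math.sqrt(n))
p_value = t_cdf(t_comp, v)      # left one-tailed P-value
t_crit = -1.796                 # -t_0.05 at v = 11, from the example

print(f"t = {t_comp:.2f}, P = {p_value:.3f}")
print("reject H0" if t_comp < t_crit else "do not reject H0")
```

The statistic does not fall below the critical value and the P-value exceeds 0.05, so both approaches lead to the same conclusion as the worked solution.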
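Table 10.3: Tests Concerning Means

| $H_0$ | Value of test statistic | $H_1$ | Critical region |
|---|---|---|---|
| $\mu = \mu_0$ | $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$; $\sigma$ known | $\mu < \mu_0$; $\mu > \mu_0$; $\mu \ne \mu_0$ | $z < -z_\alpha$; $z > z_\alpha$; $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$ |
| $\mu = \mu_0$ | $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$; $v = n - 1$; $\sigma$ unknown | $\mu < \mu_0$; $\mu > \mu_0$; $\mu \ne \mu_0$ | $t < -t_\alpha$; $t > t_\alpha$; $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$ |
| $\mu_1 - \mu_2 = d_0$ | $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$; $\sigma_1$ and $\sigma_2$ known | $\mu_1 - \mu_2 < d_0$; $\mu_1 - \mu_2 > d_0$; $\mu_1 - \mu_2 \ne d_0$ | $z < -z_\alpha$; $z > z_\alpha$; $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$ |
| $\mu_1 - \mu_2 = d_0$ | $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{1/n_1 + 1/n_2}}$; $v = n_1 + n_2 - 2$; $\sigma_1 = \sigma_2$ but unknown; $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ | $\mu_1 - \mu_2 < d_0$; $\mu_1 - \mu_2 > d_0$; $\mu_1 - \mu_2 \ne d_0$ | $t < -t_\alpha$; $t > t_\alpha$; $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$ |

Two Samples: Tests on Two Means

Example: An experiment was performed to compare the abrasive wear of two different laminated materials. Twelve pieces of material 1 were tested by exposing each piece to a machine measuring wear. Ten pieces of material 2 were similarly tested. In each case, the depth of wear was observed. The samples of material 1 gave an average (coded) wear of 85 units with a sample standard deviation of 4, while the samples of material 2 gave an average of 81 with a sample standard deviation of 5. Can we conclude at the 0.05 level of significance that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units? Assume the populations to be approximately normal with equal variances.

Solution: Let $\mu_1$ and $\mu_2$ represent the population means of the abrasive wear for material 1 and material 2, respectively.
1. $H_0: \mu_1 - \mu_2 = 2$.
2. $H_1: \mu_1 - \mu_2 > 2$.
3. $\alpha = 0.05$.
4. Critical region: $t > 1.725$, where $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{1/n_1 + 1/n_2}}$ with $v = 20$ degrees of freedom.
5. Computations: $\bar{x}_1 = 85$, $s_1 = 4$, $n_1 = 12$, $\bar{x}_2 = 81$, $s_2 = 5$, $n_2 = 10$. Hence
   $s_p = \sqrt{\dfrac{(11)(16) + (9)(25)}{12 + 10 - 2}} = 4.478$,
   $t = \dfrac{(85 - 81) - 2}{4.478\sqrt{1/12 + 1/10}} = 1.04$,
   $P = P(T > 1.04) \approx 0.16$. (See Table A.4.)
6. Decision: Do not reject $H_0$. We are unable to conclude that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units.

Testing a proportion (single sample)

1. $H_0: p = p_0$.
2. One of the alternatives $H_1: p < p_0$, $p > p_0$, or $p \ne p_0$.
3. Choose a level of significance equal to $\alpha$.
4. Test statistic: Binomial variable $X$ with $p = p_0$.
5.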
Computations: Find $x$, the number of successes, and compute the appropriate P-value as follows:
   For the left one-tailed test, $P = P(X \le x \text{ when } p = p_0)$.
   For the right one-tailed test, $P = P(X \ge x \text{ when } p = p_0)$.
   For the two-tailed test, $P = 2P(X \le x \text{ when } p = p_0)$ if $x < np_0$, or $P = 2P(X \ge x \text{ when } p = p_0)$ if $x > np_0$.
6. Decision: Draw appropriate conclusions based on the P-value: reject $H_0$ if the computed P-value is less than or equal to $\alpha$; otherwise accept it.

EXAMPLE: A builder claims that heat pumps are installed in 70% of all homes being constructed today in the city of Richmond, Virginia. Would you agree with this claim if a random survey of new homes in this city showed that 8 out of 15 had heat pumps installed? Use a 0.10 level of significance.

Solution:
1. $H_0: p = 0.7$.
2. $H_1: p \ne 0.7$.
3. $\alpha = 0.10$.
4. Test statistic: Binomial variable $X$ with $p = 0.7$ and $n = 15$.
5. Computations: $x = 8$ and $np_0 = (15)(0.7) = 10.5$. Since $x < np_0$, from the binomial table,
   $P = 2P(X \le 8 \text{ when } p = 0.7) = 2\sum_{x=0}^{8} b(x; 15, 0.7) = 0.2622 > 0.10$.
6. Decision: Do not reject $H_0$. Conclude that there is insufficient reason to doubt the builder's claim.

NOTE: For large $n$, approximation procedures are required. When the hypothesized value $p_0$ is very close to 0 or 1, the Poisson distribution, with parameter $\mu = np_0$, may be used. However, the normal curve approximation, with parameters $\mu = np_0$ and $\sigma^2 = np_0q_0$, is usually preferred for large $n$ and is very accurate as long as $p_0$ is not extremely close to 0 or to 1. If we use the normal approximation, the z-value for testing $p = p_0$ is given by

$z = \dfrac{x - np_0}{\sqrt{np_0q_0}} = \dfrac{\hat{p} - p_0}{\sqrt{p_0q_0/n}}$,

which is a value of the standard normal variable $Z$. Hence, for a two-tailed test at the $\alpha$-level of significance, the critical region is $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$. For the one-sided alternative $p < p_0$, the critical region is $z < -z_\alpha$, and for the alternative $p > p_0$, the critical region is $z > z_\alpha$.

EXAMPLE: A commonly prescribed drug for relieving nervous tension is believed to be only 60% effective.
Experimental results with a new drug administered to a random sample of 100 adults who were suffering from nervous tension show that 70 received relief. Is this sufficient evidence to conclude that the new drug is superior to the one commonly prescribed? Use a 0.05 level of significance.

Solution:
1. $H_0: p = 0.6$.
2. $H_1: p > 0.6$.
3. $\alpha = 0.05$.
4. Critical region: $Z > z_\alpha$, i.e., $z > 1.645$.
5. Computations: $x = 70$, $n = 100$, $\hat{p} = 70/100 = 0.7$, and $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0q_0/n}} = \dfrac{0.7 - 0.6}{\sqrt{(0.6)(0.4)/100}} = 2.04$.
6. Decision: Since $2.04 > 1.645$, reject $H_0$ and conclude that the new drug is superior.

Two Samples: Tests on Two Proportions

1. $H_0: p_1 = p_2$.
2. One of the alternatives $H_1: p_1 - p_2 < 0$, $p_1 - p_2 > 0$, or $p_1 - p_2 \ne 0$.
3. Choose a level of significance equal to $\alpha$.
4. Identify the critical region.
5. Computations: Compute the value of the test statistic as
   $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\,(1/n_1 + 1/n_2)}}$, where $\hat{p}_1 = \dfrac{x_1}{n_1}$, $\hat{p}_2 = \dfrac{x_2}{n_2}$, $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$, and $\hat{q} = 1 - \hat{p}$.
6. Decision: Draw appropriate conclusions.

EXAMPLE: A vote is to be taken among the residents of a town and the surrounding county to determine whether a proposed chemical plant should be constructed. The construction site is within the town limits, and for this reason many voters in the county believe that the proposal will pass because of the large proportion of town voters who favor the construction. To determine if there is a significant difference in the proportions of town voters and county voters favoring the proposal, a poll is taken. If 120 of 200 town voters favor the proposal and 240 of 500 county residents favor it, would you agree that the proportion of town voters favoring the proposal is higher than the proportion of county voters? Use an $\alpha = 0.05$ level of significance.

Solution: Let $p_1$ and $p_2$ be the true proportions of voters in the town and county, respectively, favoring the proposal.
1. $H_0: p_1 = p_2$.
2. $H_1: p_1 > p_2$.
3. $\alpha = 0.05$.
4. Critical region: $z > 1.645$.
5. Computations: $\hat{p}_1 = 120/200 = 0.60$, $\hat{p}_2 = 240/500 = 0.48$, $\hat{p} = \dfrac{120 + 240}{200 + 500} = 0.51$, and the value of the test statistic is
   $z = \dfrac{0.60 - 0.48}{\sqrt{(0.51)(0.49)(1/200 + 1/500)}} = 2.9$.
6. Decision: Reject $H_0$ and agree that the proportion of town voters favoring the proposal is higher than the proportion of county voters. Note that $P = P(Z > 2.9) = 0.0019 < 0.05$.

Test of variance (single sample)

1. $H_0: \sigma^2 = \sigma_0^2$.
2. One of the alternatives $H_1: \sigma^2 < \sigma_0^2$, $\sigma^2 > \sigma_0^2$, or $\sigma^2 \ne \sigma_0^2$.
3.
Choose a level of significance equal to $\alpha$.
4. Establish the critical region.
   For the left one-tailed test, $\chi^2 < \chi^2_{1-\alpha}$.
   For the right one-tailed test, $\chi^2 > \chi^2_{\alpha}$.
   For the two-tailed test, $\chi^2 < \chi^2_{1-\alpha/2}$ or $\chi^2 > \chi^2_{\alpha/2}$.
5. Computations: Compute the value of the test statistic $\chi^2_{\text{comp}} = \dfrac{(n-1)s^2}{\sigma_0^2}$.
6. Decision: Draw appropriate conclusions: reject $H_0$ if the computed value of $\chi^2$ lies in the critical region; otherwise accept it.

EXAMPLE: A manufacturer of car batteries claims that the life of the batteries is approximately normally distributed with a standard deviation equal to 0.9 year. If a random sample of 10 of these batteries has a standard deviation of 1.2 years, do you think that $\sigma > 0.9$ year? Use a 0.05 level of significance.

Solution:
1. $H_0: \sigma^2 = 0.81$.
2. $H_1: \sigma^2 > 0.81$.
3. $\alpha = 0.05$. The critical value is $\chi^2_{0.05} = 16.919$ at $v = 9$ degrees of freedom.
4. Critical region: $\chi^2 > \chi^2_{\alpha}$, i.e., $\chi^2 > 16.919$.
5. Computations: $s^2 = 1.44$, $n = 10$. The computed value of the test statistic is $\chi^2_{\text{comp}} = \dfrac{(n-1)s^2}{\sigma_0^2} = \dfrac{(9)(1.44)}{0.81} = 16.0$.
6. Decision: As $\chi^2_{\text{comp}}$ does not lie in the critical region, we accept the null hypothesis.

Note that the P-value is $P = P(\chi^2 > \chi^2_{\text{comp}}) = P(\chi^2 > 16) \approx 0.07 > 0.05$.

Two Samples: Tests on two variances

1. $H_0: \sigma_1^2 = \sigma_2^2$.
2. One of the alternatives $H_1: \sigma_1^2 < \sigma_2^2$, $\sigma_1^2 > \sigma_2^2$, or $\sigma_1^2 \ne \sigma_2^2$.
3. Choose a level of significance equal to $\alpha$.
4. Establish the critical region.
   For the left one-tailed test, $f < f_{1-\alpha}(v_1, v_2)$.
   For the right one-tailed test, $f > f_{\alpha}(v_1, v_2)$.
   For the two-tailed test, $f < f_{1-\alpha/2}(v_1, v_2)$ or $f > f_{\alpha/2}(v_1, v_2)$.
5. Computations: Compute the value of the test statistic $f = \dfrac{s_1^2}{s_2^2}$.
6. Decision: Draw appropriate conclusions: reject $H_0$ if the computed value of $f$ lies in the critical region; otherwise accept it.

Example: An experiment was performed to compare the abrasive wear of two different laminated materials. Twelve pieces of material 1 were tested by exposing each piece to a machine measuring wear. Ten pieces of material 2 were similarly tested. In each case, the depth of wear was observed.
The samples of material 1 gave an average (coded) wear of 85 units with a sample standard deviation of 4, while the samples of material 2 gave an average of 81 with a sample standard deviation of 5. Assume the populations to be approximately normal with equal variances. Were we justified in making this assumption? Use a 0.10 level of significance.

Solution: Let $\sigma_1^2$ and $\sigma_2^2$ be the population variances for the abrasive wear of material 1 and material 2, respectively.
1. $H_0: \sigma_1^2 = \sigma_2^2$.
2. $H_1: \sigma_1^2 \ne \sigma_2^2$.
3. $\alpha = 0.10$.
4. Critical region: We see that $f_{\alpha/2}(v_1, v_2) = f_{0.05}(11, 9) = 3.11$, and
   $f_{1-\alpha/2}(v_1, v_2) = f_{0.95}(11, 9) = \dfrac{1}{f_{0.05}(9, 11)} = 0.34$.
   Therefore, the null hypothesis is rejected when $f < 0.34$ or $f > 3.11$, where $f = s_1^2/s_2^2$ with $v_1 = 11$ and $v_2 = 9$.
5. Computations: Given that $s_1^2 = 16$, $s_2^2 = 25$, $v_1 = 11$, $v_2 = 9$, we get $f_{\text{comp}} = \dfrac{16}{25} = 0.64$.
6. Decision: Do not reject $H_0$. Conclude that there is insufficient evidence that the variances differ.

Simple Linear Regression:

The statistical technique that expresses the relationship between two or more variables in the form of an equation, in order to estimate the value of one variable based on the values of another, is called regression analysis. The variable whose value is estimated using the algebraic equation is called the dependent (or response) variable, and the variable whose value is used to estimate the response is called the independent (or regressor) variable. The linear algebraic equation used for expressing a dependent variable in terms of the independent variable is called the regression equation. A reasonable form of a relationship between the response $Y$ and the regressor $x$ is the linear relationship $Y = \alpha + \beta x$, where $\alpha$ is the intercept and $\beta$ is the slope.

Method of Least-squares: Here we fit a straight line to the given set of data points such that the sum of the squares of the vertical deviations (sum of squares of residuals, or sum of squares of error) is minimum.

Suppose the regression equation of $y$ on $x$ is $y = a_0 + a_1 x$; here $a_1$ is called the regression coefficient of $y$ on $x$ and is denoted by $b_{yx}$.
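The coefficients are determined from the normal equations:

$n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$ and $a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$.

Similarly, the regression equation of $x$ on $y$ is $x = b_0 + b_1 y$; here $b_1$ is called the regression coefficient of $x$ on $y$ and is denoted by $b_{xy}$. The coefficients are determined from the normal equations:

$n b_0 + b_1 \sum_{i=1}^{n} y_i = \sum_{i=1}^{n} x_i$ and $b_0 \sum_{i=1}^{n} y_i + b_1 \sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} x_i y_i$.

Alternative method to calculate the coefficients: let

$S_{xx} = \sum (x - \bar{x})^2$, $S_{yy} = \sum (y - \bar{y})^2$, $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$.

Then $a_1 = \dfrac{S_{xy}}{S_{xx}}$, $a_0 = \bar{y} - a_1\bar{x}$. Similarly $b_1 = \dfrac{S_{xy}}{S_{yy}}$, $b_0 = \bar{x} - b_1\bar{y}$.

EXAMPLE: The grades of a class of 9 students on a midterm report ($x$) and on the final examination ($y$) are as follows:

| x | 77 | 50 | 71 | 72 | 81 | 94 | 96 | 99 | 67 |
|---|---|---|---|---|---|---|---|---|---|
| y | 82 | 66 | 78 | 34 | 47 | 85 | 99 | 99 | 68 |

(a) Estimate the linear regression line.
(b) Estimate the final examination grade of a student who received a grade of 85 on the midterm report.

Solution: Here $n = 9$, $\sum x = 707$, $\sum y = 658$, $\sum x^2 = 57557$ and $\sum xy = 53258$.

Suppose the regression equation of $y$ on $x$ is $y = a + bx$. The coefficients are determined from the normal equations:

$na + b\sum x = \sum y$ and $a\sum x + b\sum x^2 = \sum xy$.

So $9a + 707b = 658$ and $707a + 57557b = 53258$. Solving these two equations, we get $a = 12.0623$ and $b = 0.7771$.

Hence $y = 12.0623 + 0.7771x$.

(b) For $x = 85$, $y = 12.0623 + (0.7771)(85) \approx 78$.

Alternative Method: (do yourself)

| $x$ | $y$ | $x - \bar{x}$ | $y - \bar{y}$ | $(x - \bar{x})^2$ | $(x - \bar{x})(y - \bar{y})$ |
|---|---|---|---|---|---|
| 77 | 82 | | | | |
| 50 | 66 | | | | |
| 71 | 78 | | | | |
| 72 | 34 | | | | |
| 81 | 47 | | | | |
| 94 | 85 | | | | |
| 96 | 99 | | | | |
| 99 | 99 | | | | |
| 67 | 68 | | | | |
| $\sum x = 707$ | $\sum y = 658$ | | | $S_{xx} =$ | $S_{xy} =$ |

$b = \dfrac{S_{xy}}{S_{xx}} =$ ____ and $a = \bar{y} - b\bar{x} =$ ____. The regression equation of $y$ on $x$ is $y = a + bx =$ ____.

EXAMPLE: A study was made on the amount of converted sugar in a certain process at various temperatures. The data were coded and recorded as follows:

| Temp ($x$) | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 | 1.5 | 1.6 | 1.7 | 1.8 | 1.9 | 2.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Sugar ($y$) | 8.1 | 7.8 | 8.5 | 9.8 | 9.5 | 8.9 | 8.6 | 10.2 | 9.3 | 9.2 | 10.5 |

(a) Estimate the linear regression line.
(b) Estimate the mean amount of converted sugar produced when the coded temperature is 1.75.

Solution: Here $n = 11$, $\sum x = 16.5$, $\sum y = 100.4$, $\sum x^2 = 25.85$ and $\sum xy = 152.59$.

Suppose the regression equation of $y$ on $x$ is $y = a + bx$.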
The coefficients are determined from the normal equations:

$na + b\sum x = \sum y$ and $a\sum x + b\sum x^2 = \sum xy$.

So $11a + 16.5b = 100.4$ and $16.5a + 25.85b = 152.59$. Solving these two equations, we get $a = 6.4136$ and $b = 1.8091$.

Hence $y = 6.4136 + 1.8091x$.

(b) For $x = 1.75$, $y = 6.4136 + (1.8091)(1.75) = 9.580$.
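Both regression examples can be verified by solving the normal equations in code. The sketch below is illustrative only and not part of the original notes: `fit_line` is a hypothetical helper that uses the closed-form solution of the two normal equations, $b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$ and $a = \dfrac{\sum y - b\sum x}{n}$, with the data taken from the two examples.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x from the normal equations.

    Returns (a, b): intercept and slope.
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    a = (sy - b * sx) / n                          # intercept
    return a, b

# Grades example: midterm report (x) and final examination (y)
grades_x = [77, 50, 71, 72, 81, 94, 96, 99, 67]
grades_y = [82, 66, 78, 34, 47, 85, 99, 99, 68]
a1, b1 = fit_line(grades_x, grades_y)
pred_grade = a1 + b1 * 85
print(f"grades: y = {a1:.4f} + {b1:.4f} x, y(85) = {pred_grade:.1f}")

# Converted-sugar example: coded temperature (x) and sugar (y)
temp_x = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
sugar_y = [8.1, 7.8, 8.5, 9.8, 9.5, 8.9, 8.6, 10.2, 9.3, 9.2, 10.5]
a2, b2 = fit_line(temp_x, sugar_y)
pred_sugar = a2 + b2 * 1.75
print(f"sugar:  y = {a2:.4f} + {b2:.4f} x, y(1.75) = {pred_sugar:.3f}")
```

The same slope follows from the alternative method, since $S_{xy}/S_{xx}$ is algebraically identical to the closed-form expression used for `b`.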
