Statistika Ekonomi Dan Bisnis Lanjutan
Assume that among the 600 consumers in the population, 200 are heavy drinkers and 400 are light drinkers. If a researcher values the opinion of the heavy drinkers more than that of the light drinkers, more people will have to be sampled from the heavy drinkers group. If a sample size of 60 is desired, a 10 percent inversely proportional stratified sampling is employed. The selection probabilities are computed as follows:
Appropriate when reaching small, specialized populations
Each respondent, after being interviewed, is asked to identify one or more others in the field
Convenience
Used to obtain information quickly and inexpensively
Quota
Minimum number from each specified subgroup in the population
Often based on demographic data
Quota Sampling - Example
Cluster Sampling
Involves dividing population into subgroups
Random sample of subgroups/clusters is selected and
all members of subgroups are interviewed
Very cost effective
Useful when subgroups can be identified that are
representative of entire population
Comparison of Stratified & Cluster Sampling Processes
Where $\hat{\sigma}^2_{\hat{p}_j}$ is the estimate of the variance of the sample proportion in the jth stratum
Provided the sample size is large, 100(1 - α)% confidence intervals for the population proportion for stratified random samples are obtained from
$\hat{p}_{st} \pm z_{\alpha/2}\,\hat{\sigma}_{\hat{p}_{st}}$
Often it is more convenient to specify directly the desired width of the confidence interval for the population mean rather than the standard deviation of the sample mean, σx̄. Thus the researcher specifies the desired margin of error for the mean.
Calculations are simple since, for example, a 95% confidence interval for the population mean will extend an approximate amount 1.96σx̄ on each side of the sample mean, x̄
Required Sample Size Example
2000 items are in a population. If σ = 45, what sample size is needed to estimate the mean within ± 5 with 95% confidence?
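The required sample size example (N = 2000, σ = 45, margin ± 5, 95% confidence) can be worked through numerically. The sketch below assumes the usual finite-population formula for simple random sampling, n = Nσ² / ((N − 1)σx̄² + σ²), with σx̄ set so that 1.96σx̄ equals the margin of error; the function name `required_sample_size` is only illustrative.

```python
import math
from statistics import NormalDist


def required_sample_size(N, sigma, margin, confidence=0.95):
    """Sample size for estimating a mean by simple random sampling
    from a finite population of size N (assumed formula, see lead-in)."""
    # z is about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Desired variance of the sample mean: margin = z * sigma_xbar
    var_xbar = (margin / z) ** 2
    return N * sigma ** 2 / ((N - 1) * var_xbar + sigma ** 2)


n_exact = required_sample_size(N=2000, sigma=45, margin=5)
n_needed = math.ceil(n_exact)  # always round up to a whole item
print(f"n = {n_exact:.1f}, so sample {n_needed} items")
```

Under these assumptions the formula gives roughly 269.4, so 270 items would be sampled.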
Cluster Sampling
Population is divided into several “clusters,” each representative of the population. A simple random sample of clusters is selected. Generally, all items in the selected clusters are examined. An alternative is to choose items from selected clusters using another probability sampling technique
Consider estimating the proportion P of individuals in a population of size N who possess a certain attribute. If the desired variance, $\sigma^2_{\hat{p}}$, of the sample proportion is specified, the required sample size to estimate the population proportion through simple random sampling is
$n = \dfrac{N P(1-P)}{(N-1)\,\sigma^2_{\hat{p}} + P(1-P)}$
SESSION 2
What is a Hypothesis?
A hypothesis is a claim (assumption) about a population parameter:
population mean
Example: The mean monthly cell phone bill of this city is μ = $42
population proportion
Example: The proportion of adults in this city with cell phones is p = .68
The Null Hypothesis, H0
States the assumption (numerical) to be tested
Example: The average number of TV sets in U.S. homes is equal to three (H0: μ = 3). The null hypothesis is always about a population parameter, not about a sample statistic
Level of Significance, α
• Defines the unlikely values of the sample statistic if the null hypothesis is true (defines the rejection region of the sampling distribution)
• Is designated by α (level of significance). Typical values are .01, .05, or .10
• Is selected by the researcher at the beginning
• Provides the critical value(s) of the test
Level of Significance and the Rejection Region
Example: Decision
Reach a decision and interpret the result:
Lower-Tail Tests
There is only one critical value, since the rejection area is in only one tail
H0: μ ≥ 3
H1: μ < 3
Two-Tail Tests
In some settings, the alternative hypothesis does not specify a unique direction. There are two critical values, defining the two regions of rejection
H0: μ = 3
H1: μ ≠ 3
Example: p-Value Solution
Calculate the p-value and compare to α (assuming that μ = 52.0)
Type II Error
Assume the population is normal and the population variance is known. Consider the test
Calculating β
Suppose n = 64 , σ = 6 , and α = .05
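To make the β calculation concrete, here is a minimal sketch using n = 64, σ = 6 and α = .05 from the notes. Two things are assumptions made only for illustration: that the test is the lower-tail test H0: μ ≥ 52 against H1: μ < 52, and that the true mean under the alternative is 50. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_lower_tail(mu0, mu_true, sigma, n, alpha):
    """Return (critical x-bar, beta) for H0: mu >= mu0 vs H1: mu < mu0
    with known population standard deviation sigma."""
    std_norm = NormalDist()
    se = sigma / sqrt(n)
    # Reject H0 when the sample mean falls at or below this value
    critical = mu0 - std_norm.inv_cdf(1 - alpha) * se
    # beta = P(do not reject H0 | mu = mu_true) = P(x-bar > critical)
    beta = 1 - std_norm.cdf((critical - mu_true) / se)
    return critical, beta


# mu0 = 52 and mu_true = 50 are assumed values, not taken from the notes
critical, beta = type_ii_error_lower_tail(mu0=52, mu_true=50, sigma=6, n=64, alpha=0.05)
print(f"reject H0 if x-bar <= {critical:.3f}; beta = {beta:.3f}; power = {1 - beta:.3f}")
```

With these assumed inputs the critical value is about 50.766 and β is about 0.15; moving the assumed true mean closer to 52 would raise β.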
Matched Pairs
Tests Means of 2 Related Populations
1. Paired or matched samples
2. Repeated measures (before/after)
3. Use difference between paired values: di = xi - yi
Assumptions: Both populations are normally distributed
The test statistic for the mean difference is a t value, with n – 1 degrees of freedom:
$t = \dfrac{\bar{d} - D_0}{s_d / \sqrt{n}}$
Where
D0 = hypothesized mean difference
sd = sample standard dev. of differences
n = the sample size (number of pairs)
Matched Pairs: Solution
Has the training made a difference in the number of complaints (at the α = 0.01 level)?
H0: μx – μy = 0
H1: μx – μy ≠ 0
Test Statistic: t value with d.f. = n - 1 = 4
Critical Value = ± 4.604
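The matched-pairs statistic can be sketched as follows. The complaint counts below are hypothetical (the data table did not survive in these notes); only the critical value ± 4.604 for 4 degrees of freedom at α = 0.01 is taken from the text.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(x, y, d0=0.0):
    """t statistic for the mean of paired differences d_i = x_i - y_i."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    # stdev() is the sample standard deviation (divides by n - 1)
    t = (mean(d) - d0) / (stdev(d) / sqrt(n))
    return t, n - 1


# Hypothetical complaint counts for five employees (illustration only)
after = [4, 6, 2, 0, 0]
before = [6, 20, 3, 0, 4]

t_stat, df = paired_t_statistic(after, before)
critical = 4.604  # two-tail critical value quoted in the notes for d.f. = 4
reject = abs(t_stat) > critical
print(f"t = {t_stat:.3f}, d.f. = {df}, reject H0: {reject}")
```

For this made-up data the statistic is well inside ± 4.604, so H0 would not be rejected; different data could of course lead to the opposite decision.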
σx² and σy² Known
Assumptions:
1. Samples are randomly and independently drawn
2. Both population distributions are normal
3. Population variances are known
When σx² and σy² are known and both populations are normal, the variance of X̄ – Ȳ is
$\sigma^2_{\bar{X}-\bar{Y}} = \dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}$
And the random variable
$Z = \dfrac{(\bar{x}-\bar{y}) - (\mu_x-\mu_y)}{\sqrt{\sigma_x^2/n_x + \sigma_y^2/n_y}}$
has a standard normal distribution
σx² and σy² Unknown, Assumed Equal
Assumptions:
1. Samples are randomly and independently drawn
2. Populations are normally distributed
3. Population variances are unknown but assumed equal
Forming interval estimates:
The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ. Use a t value with (nx + ny – 2) degrees of freedom. The test statistic for μx – μy is:
$t = \dfrac{(\bar{x}-\bar{y}) - (\mu_x-\mu_y)}{\sqrt{s_p^2/n_x + s_p^2/n_y}}$
Where t has (nx + ny – 2) d.f., and
$s_p^2 = \dfrac{(n_x-1)s_x^2 + (n_y-1)s_y^2}{n_x+n_y-2}$
σx² and σy² Unknown, Assumed Unequal
Assumptions:
1. Samples are randomly and independently drawn
2. Populations are normally distributed
3. Population variances are unknown and assumed unequal
Forming interval estimates:
The population variances are assumed unequal, so a pooled variance is not appropriate. Use a t value with v degrees of freedom, where
$v = \dfrac{\left(s_x^2/n_x + s_y^2/n_y\right)^2}{\dfrac{(s_x^2/n_x)^2}{n_x-1} + \dfrac{(s_y^2/n_y)^2}{n_y-1}}$
Pooled Variance t Test: Example
Muhammad Firman (University of Indonesia - Accounting ) 166
Masterbook of Business and Industry (MBI)
Assuming both populations are approximately normal with equal variances, is there a difference in average yield (α = 0.05)?
Calculating the Test Statistic
The test statistic is:
Decision: Reject H0 at α = 0.05
Conclusion: There is evidence of a difference in means.
Two Population Proportions
Goal: Test hypotheses for the difference between two population proportions, Px – Py
Assumptions: Both sample sizes are large, nP(1 – P) > 9
Test Statistic for Two Population Proportions
The test statistic for H0: Px – Py = 0 is a z value:
$z = \dfrac{\hat{p}_x - \hat{p}_y}{\sqrt{\hat{p}_0(1-\hat{p}_0)/n_x + \hat{p}_0(1-\hat{p}_0)/n_y}}$, where $\hat{p}_0 = \dfrac{n_x\hat{p}_x + n_y\hat{p}_y}{n_x+n_y}$
Decision Rules: Population proportions
Critical values = ± 1.96 for α = .05
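A short sketch of the two-proportion z test with a pooled estimate of the common proportion under H0: Px – Py = 0. The counts used here (36 of 72 and 31 of 50) are hypothetical illustration data, and the helper name is an assumption; only the ± 1.96 critical value comes from the notes.

```python
from math import sqrt


def two_proportion_z(successes_x, n_x, successes_y, n_y):
    """z statistic for H0: Px - Py = 0 using the pooled proportion p0."""
    p_x = successes_x / n_x
    p_y = successes_y / n_y
    p0 = (successes_x + successes_y) / (n_x + n_y)  # pooled estimate under H0
    se = sqrt(p0 * (1 - p0) / n_x + p0 * (1 - p0) / n_y)
    return (p_x - p_y) / se, p0


# Hypothetical survey counts (illustration only)
z, p0 = two_proportion_z(36, 72, 31, 50)

# Large-sample check from the notes: nP(1 - P) > 9 in both samples
large_enough = min(72, 50) * p0 * (1 - p0) > 9
reject = abs(z) > 1.96  # two-tail test at alpha = .05
print(f"z = {z:.2f}, large samples: {large_enough}, reject H0: {reject}")
```

For these made-up counts z is about -1.31, which lies between the critical values, so H0 would not be rejected.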
The statistic is :
Total variation
$SST = \sum_{i=1}^{K}\sum_{j=1}^{n_i}(x_{ij}-\bar{x})^2$
Where:
SST = Total sum of squares
K = number of groups (levels or treatments)
ni = number of observations in group i
xij = jth observation from group i
x̄ = overall sample mean
Between-Group Variation
$SSG = \sum_{i=1}^{K} n_i(\bar{x}_i-\bar{x})^2$
Where:
SSG = Sum of squares between groups
K = number of groups
ni = sample size from group i
x̄i = sample mean from group i
x̄ = grand mean (mean of all data values)
Variation Due to Differences Between Groups
Mean Square Between Groups = SSG/degrees of freedom
$MSG = \dfrac{SSG}{K-1}$
Within-Group Variation
$SSW = \sum_{i=1}^{K}\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i)^2$
Where:
SSW = Sum of squares within groups
K = number of groups
ni = sample size from group i
x̄i = sample mean from group i
xij = jth observation in group i
Summing the variation within each group and then adding over all groups
$MSW = \dfrac{SSW}{n-K}$
K = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
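The sums of squares defined above can be computed directly. The following is a minimal sketch of a one-factor ANOVA; the three groups of five observations are hypothetical illustration data (chosen so that df1 = 2 and df2 = 12), and the F statistic is taken as the ratio of the between-group to the within-group mean square.

```python
def one_way_anova(groups):
    """Return the sums of squares, mean squares and F for a one-factor ANOVA."""
    n = sum(len(g) for g in groups)          # total number of observations
    K = len(groups)                          # number of groups
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)
    msg = ssg / (K - 1)                      # mean square between groups
    msw = ssw / (n - K)                      # mean square within groups
    return {"SSG": ssg, "SSW": ssw, "SST": sst, "MSG": msg, "MSW": msw,
            "F": msg / msw, "df1": K - 1, "df2": n - K}


# Hypothetical data: three groups of five observations each
groups = [
    [254, 263, 241, 237, 251],
    [234, 218, 235, 227, 216],
    [200, 222, 197, 206, 204],
]
result = one_way_anova(groups)
print({k: round(v, 3) for k, v in result.items()})
```

The computed F would then be compared with the critical F value for (K – 1, n – K) degrees of freedom from a table; note that SST = SSG + SSW holds as a check on the arithmetic.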
One-Factor ANOVA (F Test Statistic)
H0: μ1= μ2 = … = μK
H1: At least two population means are different
Test statistic
$F = \dfrac{MSG}{MSW}$
One-Factor ANOVA Example Solution
H0: μ1 = μ2 = μ3
H1: μi not all equal
α = .05
df1 = 2
df2 = 12
Test Statistic: F = MSG / MSW
Assumptions (Kruskal-Wallis test):
1. The samples are random and independent
2. variables have a continuous distribution
3. the data can be ranked
4. populations have the same variability
5. populations have the same shape
Kruskal-Wallis Test Procedure
1. Obtain relative rankings for each value. In event of tie, each of the tied values gets the average rank
2. Sum the rankings for data from each of the K groups. Compute the Kruskal-Wallis test statistic. Evaluate using the chi-square distribution with K – 1 degrees of freedom
$W = \dfrac{12}{n(n+1)}\sum_{i=1}^{K}\dfrac{R_i^2}{n_i} - 3(n+1)$
where:
n = sum of sample sizes in all groups
K = Number of samples
Ri = Sum of ranks in the ith group
ni = Size of the ith group
Complete the test by comparing the calculated W value to a critical χ² value from the chi-square distribution with K – 1 degrees of freedom
Decision rule :
Reject H0 if W > χ²(K–1, α)
Otherwise do not reject H0
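The Kruskal-Wallis procedure (rank the pooled data, average tied ranks, sum ranks per group, compute W) can be sketched as below. The three small samples are hypothetical, the helper names are illustrative, and the critical value 5.991 is the standard chi-square table value for 2 degrees of freedom at α = .05.

```python
def average_ranks(values):
    """Rank values in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (W, rank sums) using W = 12/(n(n+1)) * sum(Ri^2/ni) - 3(n+1)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    rank_sums, start = [], 0
    for g in groups:
        rank_sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)
    w = 12 / (n * (n + 1)) * sum(r ** 2 / len(g) for r, g in zip(rank_sums, groups)) - 3 * (n + 1)
    return w, rank_sums


# Hypothetical samples; the two 15s are tied and each gets rank 3.5
groups = [[12, 15, 18, 20], [14, 15, 22, 25], [30, 28, 27, 35]]
W, rank_sums = kruskal_wallis(groups)
chi2_critical = 5.991  # table value, K - 1 = 2 d.f., alpha = .05
print(f"rank sums = {rank_sums}, W = {W:.3f}, reject H0: {W > chi2_critical}")
```

This sketch applies no tie correction to W, which is a simplification that matters little when ties are rare.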
Kruskal-Wallis Example
H0 : Mean M = Mean E = Mean B
H1 : Not all population means are equal
The W statistic is :
Do not reject H0
Two-Way Notation
Let xji denote the observation in the jth group and ith block. Suppose that there are K groups and H blocks, for a total of n = KH observations. Let the overall mean be x̄
Denote the group sample means by
SESSION 7 - 8
Test the hypothesis that there is no overall package preference using α = 0.10
The test-statistic S for the sign test is
S = the number of pairs with a positive difference = 2
S has a binomial distribution with P = 0.5 and n = 9 (there was ...)
The p-value for this sign test is found using the binomial distribution with n = 9, S = 2, and P = 0.5:
For a lower-tail test, p-value = P(S ≤ 2 | P = 0.5) = 46/512 ≈ 0.090
• Find the sums of the positive ranks and the negative ranks
• The smaller of these sums is the Wilcoxon Signed Rank Statistic T:
T = min(T+ , T- )
Where T+ = the sum of the positive ranks
T- = the sum of the negative ranks
n = the number of nonzero differences
The null hypothesis is rejected if T is less than or equal to the value in Appendix Table 10
Signed Rank Test Example
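Both calculations in this part can be sketched briefly. The first function reproduces the lower-tail sign test p-value for n = 9 and S = 2 from the notes; the second computes the Wilcoxon signed rank statistic T = min(T+, T-) for a list of paired differences. The differences used for the Wilcoxon part are hypothetical, and both function names are illustrative.

```python
from math import comb


def sign_test_lower_p(n, s, p=0.5):
    """Lower-tail p-value P(S <= s) for S ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(s + 1))


def wilcoxon_T(differences):
    """Return (T, T_plus, T_minus, n); zero differences are discarded."""
    nonzero = [d for d in differences if d != 0]
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences
        while j + 1 < len(order) and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank for ties
        i = j + 1
    t_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    return min(t_plus, t_minus), t_plus, t_minus, len(nonzero)


p_value = sign_test_lower_p(n=9, s=2)
print(f"sign test lower-tail p-value = {p_value:.4f}")

# Hypothetical paired differences (illustration only)
T, t_plus, t_minus, n = wilcoxon_T([5, -2, 3, 0, -1, 4, 6, -3.5, 2.5])
print(f"T+ = {t_plus}, T- = {t_minus}, T = {T}, n = {n}")
```

The resulting T would be compared with the tabulated critical value for the given n, as the notes describe; that table is not reproduced here.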
Mann-Whitney U-Test
Used to compare two samples from two populations
Assumptions:
1. The two samples are independent and random
2. The value measured is a continuous variable
3. The two distributions are identical except for a possible difference in the central location
4. The sample size from each population is at least 10
Consider two samples
• Pool the two samples (combine into a single list) but keep track of which sample each value came from
• rank the values in the combined list in ascending order
• For ties, assign each the average rank of the tied values
• sum the resulting rankings separately for each sample
If the sum of rankings from one sample differs enough from the sum of rankings from the other sample, we conclude there is a difference in the population medians
Mann-Whitney U Statistic
Consider n1 observations from the first population and n2 observations from the second. Let R1 denote the sum of the ranks of the observations from the first population. The Mann-Whitney U statistic is
$U = n_1 n_2 + \dfrac{n_1(n_1+1)}{2} - R_1$
Mann-Whitney U-Test Example
Claim: Median class size for Math is larger than the median class size for English. A random sample of 10 Math and 10 English classes is selected (samples do not have to be of equal size). Rank the combined values and then determine rankings by original sample. Suppose the results are:
Ranking for combined samples :
Rank by original sample:
Use α = 0.05
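Suppose two samples are obtained: n1 = 40 , n2 = 50
When rankings are completed, the sum of ranks for sample 1 is R1
Under the null hypothesis, U has mean
$E(U) = \mu_U = \dfrac{n_1 n_2}{2}$
And variance
$\sigma_U^2 = \dfrac{n_1 n_2 (n_1+n_2+1)}{12}$
where the di are the differences of the ranked pairs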
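For the Mann-Whitney setting with n1 = 40 and n2 = 50, the large-sample test can be sketched as follows. The rank sum R1 = 1475 is a hypothetical value (the figure from the notes did not survive), and the code assumes the standard definitions U = n1·n2 + n1(n1 + 1)/2 − R1, E(U) = n1·n2/2 and Var(U) = n1·n2(n1 + n2 + 1)/12.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_z(n1, n2, r1):
    """Return (U, z) using the normal approximation to the U statistic."""
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 * (n1 + n2 + 1) / 12
    return u, (u - mean_u) / sqrt(var_u)


# r1 = 1475 is an assumed rank sum for sample 1 (illustration only)
U, z = mann_whitney_z(n1=40, n2=50, r1=1475)
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"U = {U:.0f}, z = {z:.2f}, two-sided p-value = {p_two_sided:.4f}")
```

With this assumed rank sum the p-value falls below α = 0.05; a different R1 would naturally change the conclusion.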
SESSION 11
The population correlation coefficient is denoted ρ (the Greek letter rho). The sample correlation coefficient is
$r = \dfrac{s_{xy}}{s_x s_y}$
where
$s_{xy} = \dfrac{\sum (x_i-\bar{x})(y_i-\bar{y})}{n-1}$
Independent variable: the variable used to explain the dependent variable (also called the exogenous variable)
Linear Regression Model
The relationship between X and Y is described by a linear function. Changes in Y are assumed to be caused by changes in X. Linear regression population equation model
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
Decision Rules
Graphical Presentation
And the constant or y-intercept is
where:
ȳ = Average value of the dependent variable
yi = Observed values of the dependent variable
ŷi = Predicted value of y for the given xi value
Excel Output
SST = total sum of squares
Measures the variation of the yi values around their mean, ȳ
SSR = regression sum of squares
Explained variation attributable to the linear
relationship between x and y
SSE = error sum of squares
Variation attributable to factors other than the linear
relationship between x and y
Graphical Presentation
House price model: scatter plot and regression line
house price = 98.24833 + 0.10977 (square feet)
Interpretation of the Intercept, b0
b0 is the estimated average value of Y when the value of X is zero (if X = 0 is in the range of observed X values). Here, no houses had 0 square feet, so b0 = 98.24833 just indicates that, for houses within the range of sizes observed, $98,248.33 is the portion of the house price not explained by square feet
Interpretation of the Slope Coefficient, b1
b1 measures the estimated change in the average value of Y as a result of a one-unit change in X. Here, b1 = .10977 tells us that the average value of a house increases by .10977($1,000) = $109.77, on average, for each additional square foot of size
Coefficient of Determination, R2
The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable. The coefficient of determination is also called R-squared and is denoted as R2
$R^2 = \dfrac{SSR}{SST}$
note: $0 \le R^2 \le 1$
Examples of Approximate r2 Values
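The least squares quantities discussed here (b0, b1, SST, SSR, SSE and R2) can be computed in a few lines. The sketch below fits a tiny hypothetical data set and, separately, evaluates the house price equation from the notes at 2,000 square feet. The function names and the five data points are illustrative assumptions.

```python
def least_squares(x, y):
    """Return b0, b1 and R^2 for the simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sst = sum((yi - y_bar) ** 2 for yi in y)                      # total variation
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) # unexplained
    ssr = sst - sse                                               # explained
    return b0, b1, ssr / sst


def predict_house_price(square_feet, b0=98.24833, b1=0.10977):
    """Predicted price in $1,000s using the equation quoted in the notes."""
    return b0 + b1 * square_feet


# Hypothetical data (illustration only)
b0, b1, r2 = least_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, R2 = {r2:.2f}")
print(f"predicted price for 2,000 sq ft = {predict_house_price(2000):.2f} ($1,000s)")
```

With the coefficients exactly as printed, the prediction is about 317.79; the figure 317.85 quoted later in these notes corresponds to the coefficients rounded to 98.25 and 0.1098.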
Correlation and R2
The coefficient of determination, R2, for a simple regression is equal to the simple correlation squared
$R^2 = r^2$
H0: β1 = 0
H1: β1 ≠ 0
$t = \dfrac{b_1 - \beta_1}{s_{b_1}}$, with d.f. = n – 2
where:
b1 = regression slope coefficient
β1 = hypothesized slope
sb1 = standard error of the slope (estimate of the standard error of the least squares slope)
From Excel output:
Confidence Interval Estimate for the Slope
$b_1 \pm t_{n-2,\alpha/2}\, s_{b_1}$
Estimated Regression Equation:
Prediction
The regression equation can be used to predict a value for y, given a particular x. For a specified value, xn+1, the predicted value is
$\hat{y}_{n+1} = b_0 + b_1 x_{n+1}$
This extra term adds to the interval width to reflect the added uncertainty for an individual case
Estimation of Mean Values: Example
Find the 95% confidence interval for the mean price of 2,000 square-foot houses
Predicted Price ŷi = 317.85 ($1,000s)
The confidence interval endpoints are 280.66 and 354.90, or from $280,660 to $354,900
Estimation of Individual Values: Example
Confidence Interval Estimate for yn+1
Find the 95% confidence interval for an individual house with 2,000 square feet
Predicted Price ŷi = 317.85 ($1,000s)
Graphical Analysis
The linear regression model is based on minimizing the sum of squared errors. If outliers exist, their potentially large squared errors may have a strong influence on the fitted regression line. Be sure to examine your data graphically for outliers and extreme points. Decide, based on your model and logic, whether the extreme points should remain or be removed
SESSION 12
MULTIPLE REGRESSION
The Multiple Regression Model
Idea: Examine the linear relationship between 1 dependent (Y) & 2 or more independent variables (Xi)
Multiple Regression Model with k Independent Variables:
Prediction
Given a population regression model
For the quadratic model $y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \varepsilon_i$
where:
β0 = Y intercept
β1 = regression coefficient for linear effect of X on Y
β2 = regression coefficient for quadratic effect on Y
εi = random error in Y for observation i
Linear vs. Nonlinear Fit
Testing the Quadratic Effect
Compare R2 from simple regression to R2 from the quadratic model
If R2 from the quadratic model is larger than R2 from the simple model, then the quadratic model is a better model (since R2 never decreases when a term is added, adjusted R2 or a test of β2 = 0 gives a fairer comparison)
Example: Quadratic Model
Purity increases as filter time increases:
Interpretation of coefficients
For the multiplicative model:
Let:
y = Pie Sales
x1 = Price
x2 = Holiday (x2 = 1 if a holiday occurred during the week; x2 = 0 if there was no holiday that week)
Interpreting the Dummy Variable Coefficients (with 3 Levels)
Identify model form (linear, quadratic…)
Determine required data for the study
Coefficient Estimation
• Estimate the regression coefficients using the
available data
• Form confidence intervals for the regression
coefficients
• For prediction, goal is the smallest se
• If estimating individual slope coefficients,
examine model for multicollinearity and
specification bias
Model Verification
• Logically evaluate regression results in light of
the model (i.e., are coefficient signs correct?)
• Are any coefficients biased or illogical?
• Evaluate regression assumptions (i.e., are
residuals random and independent?)
• If any problems are suspected, return to model specification and adjust the model
Interpretation and Inference
• Interpret the regression results in the setting and units of your study
• Form confidence intervals or test hypotheses about regression coefficients
• Use the model for forecasting or prediction
Dummy Variable Models (More than 2 Levels)
Dummy variables can be used in situations in which the categorical variable of interest has more than two categories. Dummy variables can also be useful in experimental design
• Experimental design is used to identify possible causes of variation in the value of the dependent variable
• Y outcomes are measured at specific combinations of levels for treatment and blocking variables
• The goal is to determine how the different treatments influence the Y outcome
Consider a categorical variable with K levels. The number of dummy variables needed is one less than the number of levels, K – 1
Example:
y = house price ; x1 = square feet
If style of the house is also thought to matter:
Style = ranch, split level, condo
Three levels, so two dummy variables are needed
Let “condo” be the default category, and let x2 and x3 be used for the other two categories:
y = house price
x1 = square feet
x2 = 1 if ranch, 0 otherwise
x3 = 1 if split level, 0 otherwise
The multiple regression equation is:
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$
Experimental Design
Consider an experiment in which
• four treatments will be used, and
• the outcome also depends on three environmental factors that cannot be controlled by the experimenter
Let variable z1 denote the treatment, where z1 = 1, 2, 3, or 4. Let z2 denote the environment factor (the “blocking variable”), where z2 = 1, 2, or 3
• To model the four treatments, three dummy variables are needed
• To model the three environmental factors, two dummy variables are needed
Define five dummy variables, x1, x2, x3, x4, and x5
Let treatment level 1 be the default (z1 = 1)
• Define x1 = 1 if z1 = 2, x1 = 0 otherwise
• Define x2 = 1 if z1 = 3, x2 = 0 otherwise
• Define x3 = 1 if z1 = 4, x3 = 0 otherwise
Let environment level 1 be the default (z2 = 1)
• Define x4 = 1 if z2 = 2, x4 = 0 otherwise
• Define x5 = 1 if z2 = 3, x5 = 0 otherwise
The dummy variable values can be summarized in a table:
The experimental design model can be estimated using the equation
$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{4i} + \beta_5 x_{5i} + \varepsilon_i$
The estimated value for β2 , for example, shows the amount by which the y value for treatment 3 exceeds the value for treatment 1
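The dummy coding for the experimental design (treatment z1 in 1..4 with level 1 as default, environment z2 in 1..3 with level 1 as default) follows mechanically from the definitions of x1 to x5. The sketch below generates the full coding table; the function name `dummy_row` is illustrative.

```python
def dummy_row(z1, z2):
    """Return (x1, x2, x3, x4, x5) for treatment z1 and environment z2."""
    x1 = int(z1 == 2)  # treatment 2 vs default treatment 1
    x2 = int(z1 == 3)  # treatment 3 vs default treatment 1
    x3 = int(z1 == 4)  # treatment 4 vs default treatment 1
    x4 = int(z2 == 2)  # environment 2 vs default environment 1
    x5 = int(z2 == 3)  # environment 3 vs default environment 1
    return (x1, x2, x3, x4, x5)


# One row for each of the 4 x 3 = 12 treatment/environment combinations
table = [(z1, z2) + dummy_row(z1, z2) for z1 in range(1, 5) for z2 in range(1, 4)]

print("z1 z2 | x1 x2 x3 x4 x5")
for z1, z2, *xs in table:
    print(f"{z1:2d} {z2:2d} | " + "  ".join(str(v) for v in xs))
```

The default combination (z1 = 1, z2 = 1) is the row of all zeros, which is why the intercept of the regression represents that baseline case.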
Lagged Values of theDependent Variable
Heteroscedasticity vs. Homoscedasticity
Homoscedasticity
The probability distribution of the errors has constant variance
Heteroscedasticity
The error terms do not all have the same variance. The size of the error variances may depend on the size of the dependent variable value, for example
When heteroscedasticity is present
• least squares is not the most efficient procedure to estimate regression coefficients
• The usual procedures for deriving confidence intervals and tests of hypotheses are not valid
Residual Analysis for Independence
Tests for Heteroscedasticity
To test the null hypothesis that the error terms, εi, all have the same variance against the alternative that their variances depend on the expected values
Estimate the simple regression
• Leads to sb estimates that are too small (i.e., biased)
• Thus t-values are too large and some variables may appear significant when they are not
Autocorrelation
Autocorrelation is correlation of the errors (residuals) over time. Violates the regression assumption that residuals are random and independent
Negative Autocorrelation
Negative autocorrelation exists if successive errors are negatively correlated. This can occur if successive errors alternate in sign
Decision rule for negative autocorrelation:
reject H0 if d > 4 – dL
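The statistic d in the decision rule for negative autocorrelation is the Durbin-Watson statistic, the sum of squared successive differences of the residuals divided by the sum of squared residuals. A minimal sketch follows; the residual series are hypothetical, and the lower bound `d_lower` passed to the decision function is a placeholder that in practice comes from a Durbin-Watson table for the given n and number of regressors.

```python
def durbin_watson(residuals):
    """d = sum (e_t - e_{t-1})^2 / sum e_t^2."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def negative_autocorrelation(d, d_lower):
    """Decision rule from the notes: reject H0 if d > 4 - dL."""
    return d > 4 - d_lower


# Residuals that alternate in sign push d toward 4 (hypothetical data)
d_alternating = durbin_watson([1, -1, 1, -1, 1, -1])
# Residuals that stay on one side for long runs push d toward 0
d_runs = durbin_watson([1, 1, 1, -1, -1, -1])

print(f"alternating residuals: d = {d_alternating:.3f}")
print(f"long runs of one sign: d = {d_runs:.3f}")
# d_lower = 1.10 is a placeholder, not a table lookup
print("negative autocorrelation flagged:", negative_autocorrelation(d_alternating, 1.10))
```

Values of d near 2 are consistent with no first-order autocorrelation, which is why the alternating series stands out.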
Index Numbers
• Index numbers allow relative comparisons
over time
• Index numbers are reported relative to a Base
Period Index
• Base period index = 100 by definition
• Used for an individual item or measurement
Consider observations over time on the price of a single item
• To form a price index, one time period is chosen as a base, and the price for every period is expressed as a percentage of the base period price
• Let p0 denote the price in the base period
• Let p1 be the price in a second period
• The price index for this second period is $100\left(\dfrac{p_1}{p_0}\right)$
Unweighted aggregate price index
Unweighted aggregate price index for period t for a group of K items:
$100\left(\dfrac{\sum_{i=1}^{K} p_{it}}{\sum_{i=1}^{K} p_{i0}}\right)$
i = item
t = time period
K = total number of items
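Both index numbers described here, the simple price index for one item and the unweighted aggregate index for a group of K items, reduce to a ratio multiplied by 100. The sketch below uses hypothetical prices and illustrative function names.

```python
def simple_price_index(p_base, p_current):
    """Price of one item as a percentage of its base-period price."""
    return 100 * p_current / p_base


def unweighted_aggregate_index(prices_base, prices_t):
    """100 * (sum of period-t prices) / (sum of base-period prices)."""
    if len(prices_base) != len(prices_t):
        raise ValueError("both periods must list the same K items")
    return 100 * sum(prices_t) / sum(prices_base)


# Hypothetical prices for K = 3 items (illustration only)
base_prices = [2.0, 5.0, 3.0]
period_t_prices = [2.5, 5.5, 3.0]

single = simple_price_index(4.0, 5.0)
aggregate = unweighted_aggregate_index(base_prices, period_t_prices)
print(f"single-item index = {single:.1f}, aggregate index = {aggregate:.1f}")
```

Because the aggregate index simply sums prices, expensive items dominate it, which is the usual motivation for weighted alternatives.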
Time-Series Data
Numerical data ordered over time
The time intervals can be annually, quarterly, daily, hourly, etc.
The sequence of the observations is important
Example:
Seasonal Component
• Short-term regular wave-like patterns
• Observed within 1 year
• Often monthly or quarterly
Cyclical Component
• Long-term wave-like patterns
• Regularly occur but may vary in length
• Often measured peak to peak or trough to trough
Irregular Component
Unpredictable, random, “residual” fluctuations
Due to random variations of
• Nature
• Accidents or unusual events
“Noise” in the time series
Time-Series Component Analysis
• Used primarily for forecasting
• Observed value in time series is the sum or product of components
Additive Model
$X_t = T_t + S_t + C_t + I_t$
where
Tt = Trend value at period t
St = Seasonality value for period t
Ct = Cyclical value at time t
It = Irregular (random) value for period t
Smoothing the Time Series
Calculate moving averages to get an overall impression of the pattern of movement over time. This smooths out the irregular component
Moving Average: averages of a designated number of consecutive time series values
(2m+1)-Point Moving Average
• A series of arithmetic means over time
• Result depends upon choice of m (each average uses 2m+1 data values)
Examples:
For a 5 year moving average, m = 2
For a 7 year moving average, m = 3
Calculating Moving Averages
Each moving average is for a consecutive block of (2m+1) years
Let m = 2
Moving Averages
Example: Five-year moving average
First average: $(x_1 + x_2 + x_3 + x_4 + x_5)/5$
Second average: $(x_2 + x_3 + x_4 + x_5 + x_6)/5$
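A (2m+1)-point moving average can be sketched in a few lines. The series below is hypothetical; with m = 2 each average covers five consecutive values, so a series of length n yields n − 2m averages, each centered on the middle period of its block.

```python
def moving_average(series, m):
    """Centered (2m+1)-point moving averages of a time series."""
    width = 2 * m + 1
    if len(series) < width:
        raise ValueError("series is shorter than one averaging window")
    return [sum(series[start:start + width]) / width
            for start in range(len(series) - width + 1)]


# Hypothetical annual values (illustration only)
sales = [23, 40, 25, 27, 32, 48, 33, 37, 37, 50, 40]

five_year = moving_average(sales, m=2)
print("first average :", five_year[0])   # mean of periods 1-5
print("second average:", five_year[1])   # mean of periods 2-6
print("number of averages:", len(five_year))
```

The first m and last m periods have no centered average, which is the price paid for smoothing with a symmetric window.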
Exponential Smoothing
A weighted moving average
• Weights decline exponentially
• Most recent observation weighted most
Used for smoothing and short term forecasting (often
one or two periods into the future)
The weight (smoothing coefficient) is α
• Subjectively chosen
• Range from 0 to 1
• Smaller α gives more smoothing, larger α gives less smoothing
The weight is:
• Close to 0 for smoothing out unwanted cyclical and irregular components
• Close to 1 for forecasting
Exponential smoothing model
$\hat{x}_1 = x_1, \qquad \hat{x}_t = \alpha x_t + (1-\alpha)\hat{x}_{t-1}$
where:
x̂t = exponentially smoothed value for period t
xt = observed value in period t
α = weight (smoothing coefficient), 0 < α < 1
Sales vs. Smoothed Sales
Fluctuations have been smoothed
NOTE: the smoothed value in this case is generally a little low, since the trend is upward sloping and the weighting factor is only .2
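A minimal sketch of simple exponential smoothing follows. It assumes the recursive form in which α multiplies the current observation (smoothed value = α·observed + (1 − α)·previous smoothed value, starting from the first observation), which is the convention consistent with the statement that smaller α gives more smoothing. The sales figures are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing; alpha weights the newest observation."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    smoothed = [series[0]]  # initialise with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical upward-trending sales (illustration only)
sales = [10, 12, 14, 16]
smoothed = exponential_smoothing(sales, alpha=0.2)

for actual, s in zip(sales, smoothed):
    print(f"actual = {actual:5.1f}   smoothed = {s:7.3f}")
```

With α = 0.2 the smoothed values trail the upward trend, which illustrates why a small weighting factor tends to run low on a rising series. The last smoothed value is what this method would use as the forecast for the next period.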
• Choose p
• Form a series of “lagged predictor” variables
xt-1 , xt-2 , … ,xt-p
• Run a regression model using all p variables
• Test model for significance
• Use model for forecasting