Ch2 Two Variable Analysis
• Weekly consumption expenditure (Y)
• The 60 families are divided into 10 income groups (from $80 to $260).
The weekly expenditures of each family in the various groups are as shown in the table. Therefore, we have 10
fixed values of X and the corresponding Y values against each of the X values; so to speak, there
are 10 Y subpopulations.
• There is considerable variation in weekly consumption expenditure in each income group.
• But the general picture that one gets is that, despite the variability of weekly consumption expenditure within each income bracket, on the average, weekly consumption expenditure increases as income increases.
Conditional vs unconditional
• The mean values of Y within each income group are called conditional expected values, as they depend on the given values of the (conditioning) variable X. Symbolically, we denote them as E(Y | X), which is read as the expected value of Y given the value of X.
• The unconditional expected value of weekly consumption expenditure is E(Y). If we add the weekly consumption expenditures of all 60 families in the population and divide the total by 60, we get $121.20 ($7272/60), which is the unconditional mean, or expected, value of weekly consumption expenditure. It is unconditional in the sense that, in arriving at this number, we have disregarded the income levels of the various families.
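The distinction can be sketched numerically. The snippet below is a minimal illustration only: the dictionary `expenditure_by_income` holds made-up numbers (it is not the table for the 60 families), and the function names are hypothetical. It computes E(Y | X) for each income level and E(Y) over all families pooled together.

```python
# Hypothetical data (made up for illustration, NOT the 60-family table):
# weekly income level X -> weekly expenditures Y of families at that level.
expenditure_by_income = {
    80: [55, 65],
    100: [70, 80, 90],
    120: [90, 100, 110, 120],
}


def conditional_means(groups):
    """Return E(Y | X) for each income level: the mean of Y within the group."""
    return {x: sum(ys) / len(ys) for x, ys in groups.items()}


def unconditional_mean(groups):
    """Return E(Y): the mean of Y over all families, ignoring income level."""
    all_y = [y for ys in groups.values() for y in ys]
    return sum(all_y) / len(all_y)


cond = conditional_means(expenditure_by_income)
uncond = unconditional_mean(expenditure_by_income)

for x, mean_y in cond.items():
    print(f"E(Y | X = {x}) = {mean_y:.2f}")
print(f"E(Y) = {uncond:.2f}")
```

In this toy data the conditional means rise with income, while the single unconditional mean pools every family regardless of income.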
Conditional Probabilities:
The dark circled points in the figure show the conditional mean values of Y against the various X values.
• In simple terms, the conditional mean E(Y | X), viewed as a function of X, tells how the mean or average response of Y varies with X.
THE MEANING OF THE TERM LINEAR
Linearity in the Variable
• (a) is a linear function because the variable X appears with a power or index of 1.
• (b) is not a linear function because the variable X appears with a power or index of 2.
Linearity in the Parameters
The second interpretation of linearity is that the conditional expectation of Y, E(Y | Xi), is
a linear function of the parameters, the β’s; it may or may not be linear in the variable X.
By contrast, consider the model
E(Y | Xi) = β1 + β2^2 Xi
Now suppose X = 3; then we obtain E(Y | Xi) = β1 + 3β2^2, which is nonlinear in the
parameter β2. This model is an example of a nonlinear (in the parameter)
regression model.
Of the two interpretations of linearity, linearity in the parameters is relevant for the
development of the regression theory to be presented shortly.
• Therefore, from now on the term “linear” regression will always mean a regression that is linear in the parameters, the β’s (that is, the parameters are raised to the first power only). It may or may not be linear in the explanatory variables, the X’s.
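Linearity in the parameters can be spot-checked numerically. The sketch below is illustrative only: the function names and the test values are assumptions, and the first model (β1 + β2 X^2) is a hypothetical example of a function that is linear in the β’s but not in X. The check uses the superposition property: a function that is linear in (β1, β2) must satisfy f(w·b + v·c) = w·f(b) + v·f(c) for any parameter vectors b, c and weights w, v.

```python
def linear_in_params(beta1, beta2, x):
    """Hypothetical model beta1 + beta2 * x^2: linear in the betas, not in x."""
    return beta1 + beta2 * x ** 2


def nonlinear_in_params(beta1, beta2, x):
    """Model beta1 + beta2^2 * x: linear in x, but not in beta2."""
    return beta1 + beta2 ** 2 * x


def passes_superposition(model, x, b=(1.0, 2.0), c=(3.0, -1.0), w=2.5, v=-0.5):
    """Spot check at one pair of parameter vectors (arbitrary test values).

    Passing is necessary, not sufficient, for linearity in the parameters;
    failing proves the model is nonlinear in the parameters.
    """
    combined = model(w * b[0] + v * c[0], w * b[1] + v * c[1], x)
    separate = w * model(b[0], b[1], x) + v * model(c[0], c[1], x)
    return abs(combined - separate) < 1e-9


# Hold X fixed at 3, as in the text, and vary only the parameters.
print(passes_superposition(linear_in_params, x=3))     # True
print(passes_superposition(nonlinear_in_params, x=3))  # False
```

With X held at 3, the second model becomes β1 + 3β2^2, and the squared parameter is what breaks the check.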
Error Term:
• Because an individual family's expenditure varies around the mean for its income group, we can express the deviation of an individual Yi around its expected value as follows:
ui = Yi - E(Y | Xi) or Yi = E(Y | Xi) + ui
• Here the deviation ui is an unobservable random variable taking positive or negative values. Technically, ui is known as the stochastic disturbance or stochastic error term.
• The expenditure of an individual family, given its income level, can be expressed as the sum of two components: (1) E(Y | Xi), which is simply the mean consumption expenditure of all the families with the same level of income, known as the systematic, or deterministic, component; and (2) ui, which is the random, or nonsystematic, component.
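The two-component decomposition can be sketched as follows. This is a minimal illustration on made-up numbers (not the 60-family table); `expenditure_by_income` and `decompose` are hypothetical names. Each Yi is split into its group mean E(Y | Xi) and the deviation ui = Yi - E(Y | Xi).

```python
# Hypothetical data (made up for illustration, NOT the 60-family table):
# weekly income level X -> weekly expenditures Y of families at that level.
expenditure_by_income = {
    80: [55, 65],
    100: [70, 80, 90],
    120: [90, 100, 110, 120],
}


def decompose(groups):
    """Split every Y_i into a systematic part E(Y | X_i) and a deviation u_i."""
    rows = []
    for x, ys in groups.items():
        systematic = sum(ys) / len(ys)  # E(Y | X = x): mean within the group
        for y in ys:
            u = y - systematic          # u_i = Y_i - E(Y | X_i)
            rows.append((x, y, systematic, u))
    return rows


rows = decompose(expenditure_by_income)
for x, y, systematic, u in rows:
    print(f"X={x:>3}  Y={y:>3}  E(Y|X)={systematic:6.2f}  u={u:+6.2f}")
```

The deviations take both signs, and within each income group they sum to zero because they are measured from that group's own mean.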