3.5 Multinomial Choice Models: Goodness of Fit and Hypothesis Testing

Goodness of Fit

The most naïve (simplest) model is one that assigns a 50% probability to each of the two alternatives for every traveler. This is called the "equal shares" model.

Let ℒℒ(0) be the log-likelihood value of the equal shares model:

ℒℒ(0) = Σ_n ln(1/2) = N · ln(1/2) = −N · ln 2

Let ℒℒ(β̂) be the log-likelihood value of the model evaluated at the optimal (maximum-likelihood) values of the parameters:

ℒℒ(β̂) = Σ_n Σ_i δ_in · ln P_n(i), where δ_in = 1 if traveler n chose alternative i and 0 otherwise

1
Goodness of Fit
How much of an improvement does the model with parameters β̂ provide over the simplest model possible?

ρ² = 1 − ℒℒ(β̂) / ℒℒ(0)

The best possible model (the perfect model) would assign a probability of 1 to the alternative actually chosen by every traveler. The likelihood function therefore becomes

ℒ* = 1 × 1 × 1 × … × 1 = 1

Therefore, ℒℒ(*) = ln(ℒ*) = 0 and ρ² = 1

2
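To make these two quantities concrete, here is a minimal Python sketch that computes ℒℒ(0) for an equal-shares binary model and the resulting ρ². The sample size and the log-likelihood at convergence are hypothetical values chosen only for illustration.

```python
import numpy as np

def equal_shares_loglik(n_travelers, n_alternatives=2):
    """Log-likelihood of the 'equal shares' model: every traveler gets
    probability 1/J for each of the J alternatives, so LL(0) = N * ln(1/J)."""
    return n_travelers * np.log(1.0 / n_alternatives)

def rho_squared(ll_beta, ll_zero):
    """Likelihood ratio index: rho^2 = 1 - LL(beta_hat) / LL(0)."""
    return 1.0 - ll_beta / ll_zero

# Hypothetical numbers, for illustration only.
N = 1000                          # travelers in the sample
ll_0 = equal_shares_loglik(N)     # = 1000 * ln(0.5) ~ -693.1
ll_beta = -450.0                  # assumed log-likelihood at convergence
print(rho_squared(ll_beta, ll_0)) # ~ 0.35
```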
Goodness of Fit
When the model has no parameters (i.e., the simplest possible case), ℒℒ(β̂) = ℒℒ(0).
Therefore, ρ² = 1 − ℒℒ(0)/ℒℒ(0) = 1 − 1 = 0.

Thus 0 ≤ ρ² ≤ 1, and the higher the value, the better the model fit.

Two models with the same number of parameters estimated using the same dataset can be compared using their ρ² values.
 The ρ² value is not an absolute measure of how good a model is – its most appropriate use is for comparing models.

3
Goodness of Fit
Rho-square with respect to the constants-only model:

ρ²_c = 1 − ℒℒ(β̂) / ℒℒ(C)

Log-likelihood for the perfect prediction model: ℒℒ(*) = 0

The log-likelihood values are ordered as: ℒℒ(0) ≤ ℒℒ(C) ≤ ℒℒ(β̂) ≤ ℒℒ(*) = 0

4
Goodness of Fit
A problem with the rho-squared measures is that they improve no matter what variable is added to the model, regardless of its importance.
This follows directly from the fact that the objective function is being maximized with one or more additional degrees of freedom, and that the same data used for estimation is used to assess the goodness of fit of the model.
One approach to this problem is to replace the rho-squared measure with an adjusted rho-squared measure designed to take account of these factors:

ρ̄² = 1 − (ℒℒ(β̂) − K) / ℒℒ(0)

ρ̄²_c = 1 − (ℒℒ(β̂) − K) / (ℒℒ(C) − K_c)

where K is the number of parameters in the estimated model and K_c is the number of constants in the constants-only model.

5
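A small Python sketch of the adjusted measure, assuming the formulas as reconstructed above; all log-likelihood values and parameter counts below are hypothetical.

```python
def adjusted_rho_squared(ll_beta, ll_ref, k, k_ref=0):
    """Adjusted rho-square: penalizes the fitted model for its K parameters.
    With ll_ref = LL(0) and k_ref = 0 this adjusts relative to the equal-shares
    model; with ll_ref = LL(C) and k_ref = number of constants it adjusts
    relative to the constants-only model."""
    return 1.0 - (ll_beta - k) / (ll_ref - k_ref)

# Hypothetical log-likelihood values and parameter counts, for illustration only.
ll_0, ll_c, ll_beta = -1000.0, -900.0, -750.0
print(adjusted_rho_squared(ll_beta, ll_0, k=8))           # ~0.24, w.r.t. the zero model
print(adjusted_rho_squared(ll_beta, ll_c, k=8, k_ref=3))  # ~0.16, w.r.t. the constants-only model
```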
Independence from Irrelevant Alternatives (IIA) Property
The IIA property states that for any individual, the ratio of the probabilities of
choosing two alternatives is independent of the presence or attributes of any other
alternative.
The premise is that other alternatives are irrelevant to the decision of choosing
between the two alternatives in the pair.
The ratios of probabilities for each pair of alternatives depend only on the attributes
of those alternatives and not on the attributes of the third alternative and would
remain the same regardless of whether that third alternative is available or not.

The IIA property may not properly reflect the behavioral relationships among groups
of alternatives. That is, other alternatives may not be irrelevant to the ratio of
probabilities between a pair of alternatives.

6
Independence from Irrelevant Alternatives (IIA) Property
Consider the case of a commuter who has a choice of going to work by auto or taking a blue
bus.
Assume that the attributes of the auto and the blue bus are such that the probability of
choosing auto is two-thirds and blue bus is one-third so the ratio of their choice probabilities is
2:1.
Now suppose that a competing bus operator introduces red bus service (the bus is painted red,
rather than blue) on the same route, operating the same vehicle type, using the same schedule
and serving the same stops as the blue bus service. Thus, the only difference between the red
and blue bus services is the color of the buses.
The most reasonable expectation, in this case, is that the same share of people will choose
auto and bus and that the bus riders will split equally between the red and blue bus services.
That is, the addition of the red bus to the commuters’ choice set should have no, or very little,
effect on the share of commuters choosing auto since this change does not affect the relative
quality of drive alone and bus.
Therefore, we expect choice probabilities following the initiation of red bus service to be auto,
two-thirds; blue bus, one-sixth and red bus, one-sixth.

7
Independence from Irrelevant Alternatives (IIA) Property
However, due to the IIA property, the multinomial logit model will maintain the
relative probability of auto and blue bus as 2:1.

If we assume that people are indifferent to color of their transit vehicle, the two bus
services will have the same representative utility and consequently, their relative
probabilities will be 1:1

The share probabilities for the three alternatives will then be: Pr(Auto) = 1/2, Pr(Blue Bus) = 1/4, and Pr(Red Bus) = 1/4

That is, the probability (share) of people choosing auto will decline from two-thirds to
one half as a result of introducing an alternative which is identical to an existing
alternative.

The red bus/blue bus paradox provides an important illustration of the possible consequences of the IIA property. Although this is an extreme case, the IIA property can be a problem in other, less extreme cases.

8
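The red bus/blue bus result can be reproduced numerically with a few lines of Python. The utilities below are hypothetical values chosen only so that the initial auto share is two-thirds.

```python
import numpy as np

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities from representative utilities."""
    expV = np.exp(np.asarray(utilities, dtype=float))
    return expV / expV.sum()

# Utilities chosen (hypothetically) so that P(auto) = 2/3 and P(blue bus) = 1/3.
v_auto, v_bus = np.log(2.0), 0.0
print(mnl_probabilities([v_auto, v_bus]))         # [0.667, 0.333] -> ratio 2:1

# Add an identical red bus (same representative utility as the blue bus).
print(mnl_probabilities([v_auto, v_bus, v_bus]))  # [0.50, 0.25, 0.25]
# IIA keeps auto : blue bus at 2:1, so the auto share drops from 2/3 to 1/2
# instead of the behaviorally expected 2/3, 1/6, 1/6.
```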
Hypothesis Testing: Standard Errors of Parameters
Statistical tests may be used to evaluate formal hypotheses about individual
parameters or groups of parameters taken together
There is sampling error associated with the model parameters because the
model is estimated from only a sample of the relevant population
The magnitude of the sampling error in a parameter is provided by the
standard error associated with that parameter
◦ the larger the standard error, the lower the precision with which the
corresponding parameter is estimated
The standard error plays an important role in testing whether a particular
parameter is equal to some hypothesized value

9
The t-statistic
The statistic used for testing the null hypothesis that a parameter β_k is equal to some hypothesized value β*_k is the asymptotic t-statistic, which takes the following form:

t = (β̂_k − β*_k) / SE(β̂_k)

where
β̂_k is the estimate of the parameter,
β*_k is the hypothesized value for the parameter, and
SE(β̂_k) is the standard error of the estimate.

10
The t-statistic
Sufficiently large absolute values of the t-statistic lead to the rejection of the
null hypothesis that the parameter is equal to the hypothesized value
When the hypothesized value β*_k is zero, the t-statistic becomes the ratio of the estimated parameter to its standard error
The rejection of this null hypothesis implies that the corresponding variable has
a significant impact on the modal utilities and suggests that the variable should
be retained in the model
Low absolute values of the t-statistic imply that the variable does not
contribute significantly to the explanatory power of the model and can be
considered for exclusion

11
The t-statistic
The selection of a critical value for the t-statistic test is
a matter of judgment and depends on the level of
confidence with which the analyst wants to test his/her
hypotheses
The critical t-values for different levels of confidence for sample sizes larger than 150 (which is the norm in discrete choice analysis) are listed in the table on the next slide
The critical t-value increases with the desired level of confidence
The hypothesis that a particular variable has no influence on choice (or, equivalently, that the true parameter associated with the variable is zero) can be rejected at the
 90% level of confidence if the absolute value of the t-statistic is greater than 1.65
 95% level of confidence if the absolute value of the t-statistic is greater than 1.96

12
The t-test
The hypothesis testing procedure
(1) Formulate the null (H0) and alternative (H1) hypotheses:

    H0: β1 = 0   and   H1: β1 ≠ 0

(2) Pick a confidence level (1 − α)
    e.g. a 95% confidence level ⇒ (1 − α) = 0.95 ⇒ α = 0.05

(3) Obtain the critical t-value t_cr,α for the chosen confidence level

Confidence Level    α       Critical t-value
90%                 0.100   1.65
95%                 0.050   1.96
99%                 0.010   2.58
99.5%               0.005   2.81
99.9%               0.001   3.29

13
The t-test
(4) Compute t = β̂1 / SE(β̂1)

(5) If |t| > t_cr,α:
    We are 100(1 − α)% sure that β1 ≠ 0
    We are 100(1 − α)% sure that the null hypothesis is incorrect
    Reject the null hypothesis with 100(1 − α)% confidence
    The parameter is statistically significant

(6) If |t| ≤ t_cr,α:
    We are not 100(1 − α)% sure that β1 ≠ 0
    We are not 100(1 − α)% sure that the null hypothesis is incorrect
    Unable to reject the null hypothesis with 100(1 − α)% confidence
    The parameter is statistically insignificant

14
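The six-step procedure above can be sketched in Python as follows. The large-sample critical values come from the standard normal distribution (matching the table on the earlier slide), and the coefficient estimate and standard error are hypothetical.

```python
from scipy.stats import norm

def t_statistic(beta_hat, se, beta_null=0.0):
    """Asymptotic t-statistic: (estimate - hypothesized value) / standard error."""
    return (beta_hat - beta_null) / se

def reject_null(t, confidence=0.95):
    """Two-sided test: reject H0 if |t| exceeds the large-sample critical value."""
    alpha = 1.0 - confidence
    t_critical = norm.ppf(1.0 - alpha / 2.0)  # 1.96 for 95% confidence
    return abs(t) > t_critical

# Hypothetical estimate: travel-time coefficient of -0.045 with SE 0.003.
t = t_statistic(-0.045, 0.003)
print(t, reject_null(t, 0.95))  # -15.0, True -> retain the variable
```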
Sample Estimation Results
[Estimation results table not reproduced; the t-statistics discussed below refer to it]

Testing Statistical Significance
Both the travel cost and travel time parameters have large absolute t-statistic values (20.6
and 16.6, respectively)
Reject the hypothesis that these variables have no effect on modal utilities at a confidence
level higher than 99.9%.
Thus, these variables should be retained in the model.
All the other t-statistics, except those for Income–Shared Ride 2, Income–Shared Ride 3+, and the walk constant, are greater than 1.960 in absolute value (95% confidence), supporting the inclusion of the corresponding variables.
The t-statistics on the shared-ride-specific income variables are even less than 1.645 in absolute value (90% confidence).
The effect of income on the utilities of the shared ride modes may not differentiate them
from the reference (drive alone) mode.
The analyst should consider removing these income variables from the utility function
specifications for the shared ride modes.
16
Testing Statistical Significance
It is important to recognize that a low t-statistic does not require removal of the
corresponding variable from the model.
If the analyst has a strong reason to believe that the variable is important, and the
parameter sign is correct, it is reasonable to retain the variable in the model.
One should be cautious about prematurely deleting variables which are expected to
be important as the same variable may be significant when other variables are added
to or deleted from the model.
The lack of significance of the alternative specific walk constant is immaterial since
the constants represent the average effect of all the variables not included in the
model and should always be retained despite the fact that they do not have a well-
understood behavioral interpretation.

17
Testing Statistical Significance
It is often interesting to determine whether two parameters are statistically different from one another, or whether two parameters are related.
These tests are similarly based on the t-statistic; however, the formulation of the test is somewhat different from that described earlier.
To test the hypothesis H0: β̂_k = β̂_l, we use the asymptotic t-statistic, which takes the following form:

t = (β̂_k − β̂_l) / √( SE(β̂_k)² + SE(β̂_l)² − 2·cov(β̂_k, β̂_l) )

18
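A sketch of this test in Python, assuming the variance-of-a-difference formula shown above; the coefficient values, standard errors, and covariance are hypothetical.

```python
import numpy as np

def t_statistic_difference(b1, b2, se1, se2, cov12):
    """t-statistic for H0: beta_1 = beta_2, using
    Var(b1 - b2) = SE1^2 + SE2^2 - 2*Cov(b1, b2)."""
    return (b1 - b2) / np.sqrt(se1**2 + se2**2 - 2.0 * cov12)

# Hypothetical shared-ride income coefficients and their covariance.
t = t_statistic_difference(b1=-0.0022, b2=-0.0018,
                           se1=0.0015, se2=0.0025, cov12=1.0e-6)
print(t)  # small |t| -> cannot reject that the two coefficients are equal
```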
Comparing Entire Models
The t-statistic is used to test the hypothesis that a single parameter is equal to some pre-selected value or that there is a linear relationship between a pair of parameters.
Sometimes, we wish to test multiple hypotheses simultaneously.
This is done by formulating a test statistic which can be used to compare two models, provided that one is a restricted version of the other.

The restricted model can be obtained by imposing restrictions (setting some parameters to zero, setting pairs of parameters equal to one another, and so on) on parameters in the unrestricted model.

19
Log-Likelihood Ratio (LR) Test
If all the restrictions that distinguish the restricted model from the unrestricted model are valid, one would expect the difference in log-likelihood values (at convergence) of the restricted and unrestricted models to be small.
If some or all of the restrictions are invalid, the difference in log-likelihood values of the restricted and unrestricted models will be "sufficiently" large to reject the hypotheses.
This underlying logic is the basis for the likelihood ratio test.

The test statistic is:

−2 [ ℒℒ_R − ℒℒ_U ]

where
ℒℒ_R is the log-likelihood of the restricted model, and
ℒℒ_U is the log-likelihood of the unrestricted model.

The likelihood ratio test can be applied to test null hypotheses involving the exclusion of groups of variables from the model.

20
Log-Likelihood Ratio (LR) Test
This test statistic is chi-squared distributed.
As with the test for individual parameters, the critical value for determining if the statistic is
“sufficiently large” to reject the null hypothesis depends on the level of confidence desired
by the model developer.
It is also influenced by the number of restrictions between the models.

21
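A minimal sketch of the likelihood ratio test; the log-likelihood values and the number of restrictions are hypothetical, and scipy's chi-squared distribution supplies the critical value and p-value.

```python
from scipy.stats import chi2

def likelihood_ratio_test(ll_restricted, ll_unrestricted, n_restrictions, confidence=0.95):
    """LR statistic -2*(LL_R - LL_U), chi-squared with df = number of restrictions."""
    lr = -2.0 * (ll_restricted - ll_unrestricted)
    critical = chi2.ppf(confidence, df=n_restrictions)
    p_value = chi2.sf(lr, df=n_restrictions)
    return lr, critical, p_value

# Hypothetical: imposing 4 restrictions lowers LL from -740.0 to -745.3.
lr, crit, p = likelihood_ratio_test(-745.3, -740.0, n_restrictions=4)
print(lr, crit, p)  # 10.6 vs critical 9.49 -> reject the restrictions at 95% confidence
```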
Testing Behavioral Hypothesis
Behavioral null hypothesis: the time and cost variables have no impact on the mode choice decision
H0: β_TravelTime = β_TravelCost = 0

Behavioral null hypothesis: income has no effect on the travel mode choice
H0: β_Income,i = 0 for every alternative-specific income coefficient

22
Comparing Non-Nested Models: I
In statistics, the Bayesian information criterion (BIC) is used for selection among a
finite set of models
The model with the lowest BIC is preferred
When fitting models, it is possible to increase the likelihood by adding parameters, but doing so may result in overfitting. BIC attempts to resolve this problem by introducing a penalty term for the number of parameters in the model:

BIC = −2 ℒℒ(β̂) + K ln(N)

where,
K : number of parameters
N : sample size (number of observations)
ℒℒ(β̂) : log-likelihood value at convergence
23
Comparing Non-Nested Models: II
Another metric to assess the relative quality of models is the Akaike Information Criterion (AIC).
It also penalizes the number of parameters in the model and is given by:

AIC = −2 ℒℒ(β̂) + 2K

where,
K : number of parameters
ℒℒ(β̂) : log-likelihood value at convergence

24
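Both criteria are simple functions of the log-likelihood and the number of parameters; the sketch below compares two hypothetical non-nested specifications.

```python
import numpy as np

def bic(ll_beta, k, n):
    """Bayesian information criterion: -2*LL + K*ln(N)."""
    return -2.0 * ll_beta + k * np.log(n)

def aic(ll_beta, k):
    """Akaike information criterion: -2*LL + 2*K."""
    return -2.0 * ll_beta + 2.0 * k

# Hypothetical comparison of two non-nested specifications on the same data.
print(bic(-750.0, k=8, n=5000), aic(-750.0, k=8))    # model A
print(bic(-748.5, k=11, n=5000), aic(-748.5, k=11))  # model B
# The model with the lower BIC (or AIC) is preferred; BIC penalizes the extra
# parameters of model B more heavily because its penalty grows with N.
```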
Comparing Non-Nested Models: III
Horowitz's non-nested hypothesis test
The likelihood ratio test can only be applied to compare models which differ due
to the application of restrictions to one of the models. Such cases are referred to
as nested hypothesis tests.
However, there are important cases when the rival models do not have this type
of restricted – unrestricted relationship. For example, we might like to compare
the base model to an alternative specification in which the variable cost divided
by income is used to replace cost.
This reflects the expectation that the importance of cost diminishes with
increasing income. This analysis can be performed by using the non-nested
hypothesis test proposed by Horowitz (1982).

25
Comparing Non-Nested Models: III
Horowitz's non-nested hypothesis test
The null hypothesis that the model with the lower ρ̄² value is the true model is rejected at the significance level determined by the following equation:

Significance level = Φ[ −( −2 (ρ̄²_H − ρ̄²_L) ℒℒ(0) + (K_H − K_L) )^(1/2) ]

where,
ρ̄²_L : adjusted likelihood ratio index for the model with the lower value
ρ̄²_H : adjusted likelihood ratio index for the model with the higher value
K_L : number of parameters in model L
K_H : number of parameters in model H
Φ : standard normal cumulative distribution function

26
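A sketch of the test, assuming the significance-level formula as reconstructed above; the adjusted rho-square values, parameter counts, and ℒℒ(0) are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def horowitz_significance(rho_low, rho_high, k_low, k_high, ll_zero):
    """Upper bound on the probability that the model with the lower adjusted
    rho-square is nevertheless the true model:
    Phi( -sqrt( -2*(rho_high - rho_low)*LL(0) + (K_high - K_low) ) )."""
    arg = -2.0 * (rho_high - rho_low) * ll_zero + (k_high - k_low)
    return norm.cdf(-np.sqrt(arg))

# Hypothetical: two cost specifications estimated on the same sample.
print(horowitz_significance(rho_low=0.310, rho_high=0.315,
                            k_low=9, k_high=9, ll_zero=-1500.0))
# A very small value -> the lower-fit specification can be rejected.
```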
27
Null Hypothesis
◦ The effect of income relative to drive alone is the same for the two shared ride modes (shared ride 2 and shared ride 3+) but is different from drive alone and different from the other modes.

◦ The effect of income relative to drive alone is the same for both shared ride modes and transit but is different for the other modes. This is represented in the model by constraining the income coefficients of both shared ride modes and the transit mode to be equal:
β_Income,SR2 = β_Income,SR3+ = β_Income,Transit

◦ The effect of income on all the automobile modes (drive alone, shared ride 2, and shared ride 3+) is the same, but the effect is different for the other modes.
28
Marginal Effects
MEASURES OF RESPONSE TO CHANGE IN ATTRIBUTES OF ALTERNATIVES
One measure for evaluating the response to changes is to calculate the derivatives of the choice probabilities of each alternative with respect to the variable in question. The mathematical expression for this measure, termed the direct derivative of P_n(i) with respect to x_ink, is:

∂P_n(i) / ∂x_ink = P_n(i) [1 − P_n(i)] · ∂V_in / ∂x_ink

Typically the utility function is specified to be linear in parameters, where β_k is the coefficient of attribute x_ink. Thus,

∂P_n(i) / ∂x_ink = β_k · P_n(i) [1 − P_n(i)]

The sign of the derivative is the same as the sign of the parameter describing the impact of x_ink in the utility of alternative i. Thus, an increase in x_ink will increase (decrease) P_n(i) if β_k is positive (negative).
The value of the derivative is largest at P_n(i) = 0.5 and becomes smaller as P_n(i) approaches 0 or 1.

29
Marginal Effects
MEASURES OF RESPONSE TO CHANGE IN ATTRIBUTES OF ALTERNATIVES
Often it is important to understand how the choice probability of other alternatives changes in response to a given change in the attribute level of the action alternative. The mathematical expression for this measure, called the cross derivative of P_n(j) with respect to x_ink (for j ≠ i), is:

∂P_n(j) / ∂x_ink = −β_k · P_n(i) P_n(j)

 The sign of the derivative is opposite to the sign of the parameter describing the impact of x_ink in the utility of alternative i. Thus, an increase in x_ink will decrease (increase) P_n(j) if β_k is positive (negative).
Note: The sum of the derivatives over all the alternatives must be equal to zero.

Since the sum of all probabilities is fixed at 1, the sum of the derivatives of the probabilities due to a change in any attribute of any alternative must be equal to 0.

30
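The direct and cross derivatives above can be evaluated together for a logit model. The utilities and coefficient in the sketch below are hypothetical, and the printout confirms that the derivatives sum to zero.

```python
import numpy as np

def mnl_probs(V):
    """Multinomial logit probabilities from representative utilities."""
    expV = np.exp(np.asarray(V, dtype=float))
    return expV / expV.sum()

def prob_derivatives(V, i, beta_k):
    """Direct and cross derivatives of MNL probabilities with respect to
    attribute x_ik of alternative i (utility linear in parameters):
      dP_i/dx_ik =  beta_k * P_i * (1 - P_i)
      dP_j/dx_ik = -beta_k * P_i * P_j   for j != i."""
    P = mnl_probs(V)
    d = -beta_k * P[i] * P          # cross derivatives for all alternatives
    d[i] = beta_k * P[i] * (1.0 - P[i])  # overwrite with the direct derivative
    return d

# Hypothetical three-alternative example with a travel-time-like coefficient.
d = prob_derivatives([0.5, 0.0, -0.3], i=0, beta_k=-0.04)
print(d, d.sum())  # derivatives sum to (numerically) zero
```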
Elasticities of Choice Probabilities
MEASURES OF RESPONSE TO CHANGE IN ATTRIBUTES OF ALTERNATIVES

Elasticity is another measure used to quantify the extent to which the choice probabilities of each alternative will change in response to changes in the value of an attribute. It is defined as the percentage change in the response variable with respect to a one percent change in an explanatory variable.

In the context of logit models, the response variable is the choice probability of an alternative, such as P_n(i), and the explanatory variable is the attribute x_ink.

Elasticities are different from derivatives in that elasticities are normalized by the variable units.

Let P_n⁰(i) and P_n¹(i) be the choice probabilities of alternative i at attribute levels x⁰_ink and x¹_ink, respectively. In this case, the elasticity is the proportional change in the probability divided by the proportional change in the attribute under consideration:

Elasticity = [ΔP_n(i) / P_n(i)] / [Δx_ink / x_ink] = [ΔP_n(i) / Δx_ink] · [x_ink / P_n(i)]

31
Elasticities of Choice Probabilities
MEASURES OF RESPONSE TO CHANGE IN ATTRIBUTES OF ALTERNATIVES

There is some ambiguity in the computation of this elasticity measure in terms of whether it should be normalized using the original probability–attribute combination (P_n⁰(i), x⁰_ink) or the new probability–attribute combination (P_n¹(i), x¹_ink).

This confusion can be avoided by computing elasticities for very small changes; this is termed the point elasticity.

Using the expression for the direct derivative derived earlier, the direct point elasticity is:

η = [∂P_n(i) / ∂x_ink] · [x_ink / P_n(i)] = β_k · x_ink · [1 − P_n(i)]
32
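A sketch of the point elasticities implied by the formulas above; the attribute value, coefficient, and choice probability are hypothetical.

```python
def direct_point_elasticity(beta_k, x_ik, p_i):
    """Direct point elasticity of P_i w.r.t. its own attribute x_ik
    (MNL, utility linear in parameters): beta_k * x_ik * (1 - P_i)."""
    return beta_k * x_ik * (1.0 - p_i)

def cross_point_elasticity(beta_k, x_ik, p_i):
    """Cross point elasticity of P_j (j != i) w.r.t. x_ik: -beta_k * x_ik * P_i.
    Note it is the same for every j != i, a consequence of IIA."""
    return -beta_k * x_ik * p_i

# Hypothetical: transit in-vehicle time of 30 min, coefficient -0.02, P(transit) = 0.25.
print(direct_point_elasticity(-0.02, 30.0, 0.25))  # -0.45: a 1% rise in time cuts P(transit) by ~0.45%
print(cross_point_elasticity(-0.02, 30.0, 0.25))   #  0.15: every other alternative gains ~0.15%
```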
Marginal Effects
MEASURES OF RESPONSE TO CHANGE IN DECISION-MAKER ATTRIBUTES
The important difference from the earlier discussion is that previously we assessed the impact on the probability of an alternative of a change in an attribute of a single alternative, either the same alternative (direct derivative and direct elasticity) or another alternative (cross derivative and cross elasticity).

In the case of traveler characteristics, those characteristics may appear in alternative-specific form in all alternatives (except for one reference alternative). Thus, we are considering what, in effect, becomes a combination of one direct response and multiple cross responses.

Consider, for example, the derivative of the probability of choosing alternative i with respect to a change in income entering the utility of alternative i itself:

∂P_n(i) / ∂Inc_n (direct component) = β_i · P_n(i) [1 − P_n(i)]

where β_i is the alternative-specific income coefficient for alternative i.

33
Marginal Effects
MEASURES OF RESPONSE TO CHANGE IN DECISION-MAKER ATTRIBUTES
However, since an identical change in income will occur for all alternatives in which income appears as an alternative-specific variable, we also consider the cross-derivative of the probability of choosing alternative i with respect to the income term specific to another alternative j:

∂P_n(i) / ∂Inc_n (cross component for j ≠ i) = −β_j · P_n(i) P_n(j)

Note: the total derivative of P_n(i) with respect to income is the sum of the direct component and all of these cross components; this sum over all alternatives is derived on the next slide.

34
Marginal Effects
MEASURES OF RESPONSE TO CHANGE IN DECISION-MAKER ATTRIBUTES
The corresponding sum over all alternatives, including i, is:

∂P_n(i) / ∂Inc_n = β_i · P_n(i) [1 − P_n(i)] − Σ_{j≠i} β_j · P_n(i) P_n(j)
                 = P_n(i) [ β_i − Σ_j β_j P_n(j) ]
                 = P_n(i) ( β_i − β̄ )

where β̄ = Σ_j β_j P_n(j) is the probability-weighted average of the alternative-specific parameters.
That is, the derivative of the probability with respect to a change in income is equal to the probability times the amount by which the income coefficient for that alternative exceeds the probability-weighted average income coefficient over all alternatives.

35
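A sketch of the derivative with respect to a decision-maker attribute, assuming the P_n(i)(β_i − β̄) expression derived above; the probabilities and alternative-specific income coefficients are hypothetical, and the derivatives again sum to zero.

```python
import numpy as np

def income_derivatives(P, beta_inc):
    """Derivative of each MNL probability with respect to a decision-maker
    attribute (e.g. income) that enters every utility in alternative-specific
    form: dP_i/dInc = P_i * (beta_i - beta_bar), where beta_bar is the
    probability-weighted average of the alternative-specific coefficients."""
    P = np.asarray(P, dtype=float)
    beta_inc = np.asarray(beta_inc, dtype=float)
    beta_bar = np.dot(P, beta_inc)
    return P * (beta_inc - beta_bar)

# Hypothetical: drive alone is the reference alternative (coefficient 0).
P = [0.60, 0.15, 0.25]            # drive alone, shared ride, transit
beta_inc = [0.0, -0.002, -0.005]  # alternative-specific income coefficients
d = income_derivatives(P, beta_inc)
print(d, d.sum())                 # derivatives sum to (numerically) zero
```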
Elasticities of Choice Probabilities
MEASURES OF RESPONSE TO CHANGE IN DECISION-MAKER ATTRIBUTES
As discussed earlier, elasticity is another measure that can be used to quantify the extent to which the choice probabilities are influenced by changes in a variable; in this case, a variable that describes the characteristics of the traveler.
The elasticity of the probability of alternative i with respect to a change in income is given by:

η = [∂P_n(i) / ∂Inc_n] · [Inc_n / P_n(i)] = Inc_n · ( β_i − β̄ )

As before, the elasticity represents the proportional change in the probability of an alternative for a proportional change in the explanatory variable.

36
