Binomial Mixture Modeling of University Credits

Article in Communication in Statistics- Theory and Methods · January 2013

DOI: 10.1080/03610926.2013.804565


All content following this page was uploaded by Leonardo Grilli on 22 May 2014.


Binomial mixture modelling of university credits
Leonardo Grilli*, Carla Rampichini* and Roberta Varriale**
*Department of Statistics, Computer Science, Applications - University of Florence
**ISTAT, Rome

Corresponding author: Leonardo Grilli, Department of Statistics, Computer Science, Applications,
Viale Morgagni 59, 50134 Firenze. E-mail: [email protected]ﬁ.it

Pre-print version of the article to appear in 2013 in
COMMUNICATIONS IN STATISTICS - THEORY AND METHODS
http://www.tandfonline.com/loi/lsta20#.UZNHoMrmDtk

Abstract

The paper reviews ﬁnite mixture models for binomial counts with concomitant variables.
These models are well known in theory, but they are rarely applied. We use a binomial

ﬁnite mixture to model the number of credits gained by freshmen during the ﬁrst year at the

School of Economics of the University of Florence. The ﬁnite mixture approach allows us to
appropriately account for the large number of zeroes and the multi-modality of the observed

distribution. Moreover, we rely on a concomitant variable speciﬁcation to investigate the role

of student background characteristics and of a compulsory pre-enrolment test in predicting

gained credits. In the paper we deal with model selection, including the choice of the number
of components, and we devise numerical and graphical summaries of the model results in

order to exploit the information content of the concomitant variable speciﬁcation. The main

ﬁnding is that the introduction of the pre-enrolment test gives additional information for
student tutoring, even if the predictive power is modest.

Key Words: concomitant variables; excess zeroes; latent class; prediction; pre-enrolment

test.

1 Introduction

In the statistical literature there has been a growing interest in ﬁnite mixture modelling as a

tool to increase the ﬂexibility of conventional parametric models (McLachlan and Peel, 2000;
Schlattmann, 2009). Finite mixture models can be seen as a compromise between a simple

parametric model and a non-parametric approach. Moreover, these models make it possible to account
for unobserved heterogeneity due to latent sub-populations, often called latent classes.
In our application the response of interest is the number of credits gained by university
freshmen during the first year. This is a count variable with a maximum of sixty common
to all students, thus the binomial distribution is a natural candidate. However, the large

number of zeroes and the multi-modality of the observed distribution call for a ﬂexible model,

such as a binomial ﬁnite mixture. This model should be considered as an approximation

to the observed distribution and it is not intended to be an accurate representation of the
process of credit accumulation, which would require a more complex model taking into
account that each student can choose her own exam sequence and that exams have different
success rates.
Indeed, the main purpose of the analysis is to identify predictors of student performance

among background characteristics and the results of a pre-enrolment test. To this end we

use a concomitant variable approach (Dayton and Macready, 1988).

In applied research there are many examples of ﬁnite mixture models for unbounded

count data, where the outcome distribution is assumed to be Poisson or negative binomial.

In the case of bounded counts, the usual strategy is to use a truncated Poisson distribution
(Saffari et al., 2013), whereas binomial finite mixture models are rarely used. The only
two examples we found are: a study of fetal deaths in litters by Brooks et al. (1997), who

apply diﬀerent types of ﬁnite mixture models without covariates; and a study of welfare

participation by Melkersson and Saarela (2004), who specify a hurdle model with a binomial
finite mixture for non-zero counts. In our application the zero counts are not modelled
separately, so there is no hurdle stage. Moreover, Melkersson and Saarela (2004) adopt a

mixture regression approach where the covariates aﬀect the binomial probabilities, while we
adopt a concomitant variable approach where the covariates aﬀect the mixture probabilities.

The application of binomial mixtures raises several issues that must be carefully addressed.
Most of the issues are common to all mixture models, such as the choice of the
number of components, whereas other issues pertain to models with concomitant variables,
such as the strategy to summarize the effects of the covariates. In fact, in models with
concomitant variables the effects are difficult to interpret and results are often presented
hastily, so that the information content of the model is largely unexploited. Here the results
are presented in the form of tables and graphs for the component probabilities and
the expected responses, considering some relevant student profiles. Moreover, the predictive
performance of the model is evaluated via cross-validation techniques.

The structure of the paper is as follows. Section 2 outlines the binomial ﬁnite mixture

model with concomitant variables and reviews some of the methods for selecting the number

of components. Section 3 describes the data and discusses model speciﬁcation. Section 4
illustrates the results through numerical and graphical summaries, including the assessment

of the predictive ability. Section 5 concludes.

2 Finite mixture models for binomial counts

Let us consider a discrete random variable $y_i$ observed on a random sample of subjects
$i = 1, \ldots, n$. A finite mixture model for $y_i$ assumes that its probability mass function $P(y_i)$
is defined by a finite mixture of conditional distributions $P(y_i \mid u_i)$, where $u_i$ is a categorical
latent variable taking values $k = 1, \ldots, K$ with prior probabilities $\pi_k = P(u_i = k)$, where
$\pi_k > 0$ and $\sum_{k=1}^{K} \pi_k = 1$:

$$P(y_i) = \sum_{k=1}^{K} \pi_k \, P(y_i \mid u_i = k). \qquad (1)$$

In this paper we assume that all the conditional distributions $P(y_i \mid u_i)$ are binomial with
common number of trials $t$ and component-specific probabilities of success $\theta_k$:

$$P(y_i \mid u_i = k) = \binom{t}{y_i} \theta_k^{y_i} (1 - \theta_k)^{t - y_i}. \qquad (2)$$

Titterington et al. (1985) show that, in general, finite mixtures of distributions of the exponential
family are identified, although for the binomial distribution the number of components
$K$ must be limited with respect to the number of trials $t$. In particular, the $K$-component
binomial mixture model (1) with $0 < \theta_k < 1$ is identifiable if and only if $K \le \frac{1}{2}(t + 1)$
(McLachlan and Peel, 2000).
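
As a minimal sketch of equations (1) and (2), the following Python snippet evaluates the mixture mass function and checks the identifiability bound K ≤ (t + 1)/2. The function names and the two-component parameter values are illustrative assumptions, not estimates from the paper.

```python
import math

def binom_pmf(y, t, theta):
    """Binomial mass function of equation (2)."""
    return math.comb(t, y) * theta**y * (1 - theta)**(t - y)

def mixture_pmf(y, t, pis, thetas):
    """Finite mixture of equation (1): weighted sum of binomial components."""
    return sum(p * binom_pmf(y, t, th) for p, th in zip(pis, thetas))

def is_identifiable(K, t):
    """Identifiability condition K <= (t + 1) / 2, valid for 0 < theta_k < 1."""
    return 2 * K <= t + 1

# Hypothetical two-component example with t = 20 trials.
t = 20
pis = [0.3, 0.7]
thetas = [0.05, 0.60]
pmf = [mixture_pmf(y, t, pis, thetas) for y in range(t + 1)]
total = sum(pmf)  # a proper mass function sums to one over y = 0, ..., t
```

With t = 20 the bound allows at most ten components, so the five-component model selected later in the paper satisfies the condition.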

A common interpretation of the latent variable ui is in terms of latent classes, namely

the population is assumed to be partitioned into K latent classes, where ui = k for subject

i belonging to the k-th latent class. Thus, the prior probability πk corresponds to the
proportion of subjects in the k-th latent class (class size).

The covariates can enter a ﬁnite mixture model in two ways: through the conditional

distributions P (yi | ui ), yielding a Mixture Regression Model (Wedel and DeSarbo, 1995),
and through the component probabilities πk , yielding a Concomitant variable mixture model

(Dayton and Macready, 1988). The mixture regression approach allows the relationship

between the response variable and the covariates to diﬀer across the latent classes. This
approach is not suitable in our application on university credits, where mixture modelling

is a way to account for the multi-modality of the response variable. Moreover, our interest

lies in predicting the performance of a student, which requires computing the component

probabilities using the available covariates. We therefore rely on the concomitant variable
approach.

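In a concomitant variable mixture model the component probabilities of the finite mixture
(1) depend on a vector of covariates $\mathbf{z}_i$:

$$P(y_i \mid \mathbf{z}_i) = \sum_{k=1}^{K} \pi_{k|\mathbf{z}_i} \, P(y_i \mid u_i = k), \qquad (3)$$

where $\pi_{k|\mathbf{z}_i} = P(u_i = k \mid \mathbf{z}_i)$, with $\pi_{k|\mathbf{z}_i} > 0$ and $\sum_{k=1}^{K} \pi_{k|\mathbf{z}_i} = 1$. These
constraints are satisfied by any model for nominal variables, like the multinomial logit model:

$$\pi_{k|\mathbf{z}_i} = \frac{\exp(\mathbf{z}_i \boldsymbol{\beta}_k)}{\sum_{l=1}^{K} \exp(\mathbf{z}_i \boldsymbol{\beta}_l)}, \qquad k = 1, \ldots, K, \qquad (4)$$

with $\boldsymbol{\beta}_1 = 0$ for model identifiability. Therefore, the prior probabilities of class membership
depend on the covariates $\mathbf{z}_i$ through a non-linear function.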

The concomitant variable model (3) involves two sets of parameters: $\theta_1, \ldots, \theta_K$ in the
binomial mass function (2) and $\boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_K$ in the multinomial model (4). The model is
identified if the matrix of the covariates is of full rank, in addition to the condition on the
number of components, $K \le \frac{1}{2}(t + 1)$ (Wang et al., 1996). For given $K$, the parameters
can be estimated with Maximum Likelihood using the EM algorithm (McLachlan and Peel,
2000).
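
To make the estimation step concrete, here is a minimal sketch of the EM algorithm for the simpler binomial mixture without covariates. It is not the Latent Gold routine used in the paper: the function name, the starting values and the simulated data are illustrative assumptions. With concomitant variables the M-step would additionally need a weighted multinomial logit fit for the coefficients of model (4), which is omitted here.

```python
import math
import random

def binom_pmf(y, t, theta):
    return math.comb(t, y) * theta**y * (1 - theta)**(t - y)

def em_binomial_mixture(ys, t, pis, thetas, n_iter=200, eps=1e-9):
    """EM iterations for a K-component binomial mixture without covariates."""
    pis, thetas = list(pis), list(thetas)  # work on copies of the start values
    K, n = len(pis), len(ys)
    for _ in range(n_iter):
        # E-step: posterior class probabilities P(u_i = k | y_i) by Bayes' rule.
        post = []
        for y in ys:
            w = [pis[k] * binom_pmf(y, t, thetas[k]) for k in range(K)]
            s = sum(w)
            post.append([wk / s for wk in w])
        # M-step: update class sizes and success probabilities.
        for k in range(K):
            nk = sum(p[k] for p in post)
            pis[k] = nk / n
            theta = sum(p[k] * y for p, y in zip(post, ys)) / (t * nk)
            thetas[k] = min(max(theta, eps), 1 - eps)  # keep theta inside (0, 1)
    return pis, thetas

# Simulated data (hypothetical): two latent classes with theta = 0.1 and 0.8.
random.seed(0)
t = 20
ys = []
for _ in range(2000):
    theta = 0.1 if random.random() < 0.4 else 0.8
    ys.append(sum(random.random() < theta for _ in range(t)))

pis_hat, thetas_hat = em_binomial_mixture(ys, t, [0.5, 0.5], [0.3, 0.6])
```

A fixed number of iterations keeps the sketch short; a practical implementation would monitor the log-likelihood and use several random starts, since EM can converge to local maxima.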

Class membership can be predicted by assigning each unit to the class with the highest
probability, using either the prior probabilities $\pi_{k|\mathbf{z}_i} = P(u_i = k \mid \mathbf{z}_i)$ or the posterior
probabilities $P(u_i = k \mid y_i, \mathbf{z}_i)$, derived by means of Bayes' rule. When the aim is to classify the
sample units, the prediction is usually based on posterior probabilities. In our application,
however, we are interested in predicting the number of gained credits for a hypothetical new
student on the basis of the available covariates. In other words, we aim at making
out-of-sample predictions. Thus, in order to predict the response $y_*$ for a hypothetical subject
with covariates $\mathbf{z}_*$, we rely on the expected value based on prior, rather than posterior, class
membership probabilities (marginal mean prediction):

$$E(y_* \mid \mathbf{z}_*) = \sum_{k=1}^{K} \pi_{k|\mathbf{z}_*} \, E(y_* \mid u_* = k) = t \sum_{k=1}^{K} \pi_{k|\mathbf{z}_*} \, \theta_k, \qquad (5)$$

where the last equality follows from the assumption of binomial components. The predicted
value ŷ∗ is obtained by plugging the estimated parameters into equation (5).

To assess the predictive ability of the estimated model, we can compare the observed
responses $y_i$ with the corresponding predictions $\hat{y}_i$ for the sample individuals ($i = 1, \ldots, n$).
The prediction error can be summarized in many ways, e.g. by the mean absolute error
(MAE):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} | \hat{y}_i - y_i |. \qquad (6)$$

Cross-validation techniques (Hastie et al., 2009) can be used to obtain a more reliable value
of MAE as a measure of the performance of out-of-sample prediction.
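
The following sketch shows how equation (6) can be combined with K-fold cross-validation. The helper names and the toy "model" (which simply predicts the training mean) are hypothetical placeholders; in the application the `fit` and `predict` steps would be the mixture model and equation (5).

```python
def mean_absolute_error(y_true, y_pred):
    """Equation (6): average absolute difference between predictions and responses."""
    return sum(abs(p - y) for y, p in zip(y_true, y_pred)) / len(y_true)

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of nearly equal size."""
    folds, start = [], 0
    for j in range(k):
        size = n // k + (1 if j < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validated_mae(ys, zs, fit, predict, k=10):
    """Predict each unit from a model fitted on the other folds, then compute MAE."""
    preds = [None] * len(ys)
    for test_idx in k_fold_indices(len(ys), k):
        held_out = set(test_idx)
        train = [i for i in range(len(ys)) if i not in held_out]
        model = fit([ys[i] for i in train], [zs[i] for i in train])
        for i in test_idx:
            preds[i] = predict(model, zs[i])
    return mean_absolute_error(ys, preds)

# Toy usage with a hypothetical model that ignores covariates.
ys = [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]
zs = [None] * len(ys)
fit_mean = lambda y_train, z_train: sum(y_train) / len(y_train)
predict_mean = lambda model, z: model
cv_mae = cross_validated_mae(ys, zs, fit_mean, predict_mean, k=5)
```

The folds here are contiguous for simplicity; with real data the units would normally be shuffled before splitting.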

The choice of the number of mixture components (latent classes) is a critical issue: even if
models with diﬀerent values of K are nested, the Likelihood Ratio Test (LRT) does not have

the standard chi-square distribution since the regularity conditions are not met. In applied

research, the issue is usually solved by comparing models via information criteria, such as

BIC and AIC and their modifications, though the methodological literature suggests using
statistical tests. Here we consider two tests: the LRT, whose asymptotic distribution may

be approximated by parametric bootstrap (McLachlan, 1987); and the EM-test recently

proposed by Li and Chen (2010). The EM-test compares a ﬁnite mixture model with K
components with a model having more than K components. The test statistic is a penalized

version of the LRT based on a few EM iterations. Li and Chen (2010) show that, under

weak conditions, the limiting distribution of the test statistic under the null hypothesis is a
mixture of a point mass at zero and several $\chi^2$ distributions.

Nylund, Asparouhov and Muthén (2007) perform a simulation study comparing various

methods for choosing the number of latent classes. The authors conclude that BIC is the

best information criterion, while bootstrap LRT is the best test (but the EM-test was not
considered).

3 Data description and model specification

We analyze data on 690 freshmen of the School of Economics in Florence in a.y. 2008/2009,

considering the students who took the compulsory pre-enrolment test in September 2008.

The aim is to evaluate their performance in terms of gained credits after one year, whose

distribution is shown in Figure 1. The number of credits ranges from 0 to 60 by 3, namely

{0, 3, 6, . . . , 60}. The sample distribution has a small percentage at the maximum (0.75% of

freshmen gained 60 credits), but it has a peak at the minimum (23% of freshmen did not

gain any credit). Therefore, the phenomenon is characterized by a substantial left censoring
that needs to be accounted for by the model. Moreover, the distribution of positive credits is
quite irregular, showing peaks at 6, 15, 24, 36 and 45 credits. This pattern results from the

paths followed by students, who can take exams yielding 6, 9 or 12 credits. The distribution

of positive credits has a median of 30 and a mean of 29.8.

Figure 1: Number of gained credits after one year. Freshmen of the School of Economics of
the University of Florence a.y. 2008/09. (Horizontal axis: gained credits after one year, from 0 to 60
in steps of 3; vertical axis: percent, from 0 to 25.)

The choice of a parametric distribution for gained credits is challenging due to the excess

zeroes and the multi-modality of the observed distribution. A natural approach is to use
a ﬁnite mixture of binomial distributions, yielding a multi-modal distribution with limited

support. Since each exam gives a number of credits that is a multiple of 3, we define the response
variable as the number of gained credits divided by 3. In this way, the response variable
ranges from 0 to 20, thus it can be modelled with a binomial distribution with a number of

trials t = 20. The ﬁnite mixture approach automatically solves the issue of excess zeroes,

since they are captured by a component with a very low success probability. Note that
zero-inflated models (Hall, 2000) and hurdle models (Saffari et al., 2013) solve the issue of excess
zeroes, but they cannot easily account for the multi-modality of the positive counts. For
example, the hurdle approach could be generalized to allow for multi-modality by modelling
the positive counts through a mixture of shifted binomial distributions or a mixture of truncated
Poisson distributions (Böhning and Kuhnert, 2006), although in the present application
the Poisson distributions would also need to be truncated on the right tail.

The prediction of gained credits may be improved by exploiting background information

and test results through the concomitant variable mixture model (3). The available covariates
are:

• Background variables: Gender, Far-away resident (indicator for residence in the provinces

of Massa-Carrara and Grosseto or in a province out of Tuscany), Type of high school

(HS type: Scientiﬁc, Humanities, Technical, Other), High school irregular career (HS
irreg. career: indicator for age at high school diploma > 19), High school grade (HS

grade: from 60 to 100, centered at 80);

• Pre-enrolment test scores: Total test score, Partial test scores (Logic, Reading, Math-

ematics).

A summary of the number of gained credits by background characteristics is reported in

Table 1. Note that the last column reports the average number of gained credits for students

gaining at least one credit. The most important predictors seem to be the irregular career

and the high school grade. The type of school plays a role in predicting students who did
not gain any credit.

The pre-enrolment test is a compulsory test for evaluating the abilities of the candidates

wishing to enrol in one of the degree programs of the School of Economics of the University

of Florence: Management, Economics, Tourism, Marketing and Statistics. The test is based
on 40 multiple-choice items covering 3 areas: Logic (12 items), Reading (10 items) and

Mathematics (18 items). For each item, one out of 5 alternatives is correct, with the following

scoring system: 1 if correct, 0 if blank, −0.25 if wrong. Thus the total score ranges from
−10 to 40. The threshold for passing the test is fixed at 9: candidates with a lower total
score are advised against enrolment: they can still enrol in a degree program of
the School of Economics, but they are allowed to take examinations only after passing the
test during one of the later editions.

Table 1: Freshmen’s performance after one year by background characteristics. School of
Economics of the University of Florence a.y. 2008/09.

Variable               N    % with credits   % with credits   Avg. credits
                                 = 0              > 0         (credits > 0)
Gender
  Male               357        23.8             76.2             28.7
  Female             333        22.2             77.8             30.9
Far-away resident
  No                 643        22.7             77.3             29.8
  Yes                 47        27.7             72.3             29.5
HS type
  Scientific         254        19.3             80.7             30.7
  Humanities          53        17.0             83.0             30.3
  Technical          274        25.2             74.8             30.6
  Other              109        29.4             70.6             24.9
HS irreg. career
  No                 605        19.8             80.2             30.6
  Yes                 85        45.9             54.1             20.6
HS grade
  ≤ 80               434        27.0             73.0             26.3
  > 80               256        16.4             83.6             35.0
All                  690        23.0             77.0             29.8

The number of gained credits is strongly related to the test result, as shown by Table
2. Overall, the percentage of students gaining credits is 77.0% with a mean of 29.8 credits

out of 60. The performance is worse for students who did not pass the test (58.6% gained

credits, with a mean of 23.5) and it is better for students passing the test with a score above

the median (85.8% gained credits, with a mean of 33.7).

In the analysis we do not use the total score, but the three partial scores in Logic,

Reading and Math. In fact, we are interested in evaluating the role of each of the three

areas in predicting the student performance. For comparison purposes, we use standardized
partial scores.

Table 2: Freshmen’s performance after one year by test result. School of Economics of the
University of Florence a.y. 2008/09.

Test result                          N    % with credits   % with credits   Avg. credits
                                               = 0              > 0         (credits > 0)
Not passed (< 9)                   111        41.4             58.6             23.5
Passed below median (9 − 16.25)    297        24.6             75.4             27.4
Passed above median (> 16.25)      282        14.2             85.8             33.7
All                                690        23.0             77.0             29.8

The relationships between gained credits and partial scores are summarized by the corresponding
simple regression coefficients (Logic 3.49, Reading 4.35 and Math 5.67). However,
due to the correlations among the partial scores, the multiple regression coefficients give a
somewhat different picture (Logic 0.63, Reading 3.06 and Math 4.70): the score in Logic plays
little role in predicting gained credits once Math and Reading scores are known.
The use of test scores as concomitant variables allows us to assess the predictive power

of the test in terms of number of gained credits, thus establishing if the pre-enrolment test

is an eﬀective tool for student evaluation in addition to background characteristics of the

candidates already available from administrative records.

4 Results

Let us define the response variable for the i-th freshman as $y_i = \text{credits}_i / 3$, where $\text{credits}_i$
is the number of gained credits at the end of the first year. Conditionally on the latent

class ui = k, we assume that yi follows a binomial distribution with number of trials t = 20

and class-speciﬁc success probability θk . The marginal distribution of yi is given by the

concomitant variable ﬁnite mixture model (3) with the multinomial logit model (4) for the

component probabilities πk|zi . The model is ﬁtted with maximum likelihood using the Syntax

module of Latent Gold (Vermunt and Magidson, 2008).

In model (3) the covariates aﬀect the latent class probabilities, but they do not aﬀect the

class-specific distribution. Therefore, in order to choose the number K of latent classes, we
first fit the model without covariates. Once the number of latent classes has been chosen,

the covariates are selected using the standard Wald test.

The number of components K of the ﬁnite mixture binomial model without covariates is

selected according to the BIC, the bootstrap LRT, and the EM-test of Li and Chen (2010).
The BIC and the bootstrap LRT are obtained from Latent Gold, whereas the EM-test is

performed using the R code embinom.R developed by Pengfei Li (see footnote 1).

The results are reported in Table 3 for K = 1, . . . , 6: the three criteria agree in selecting
a model with K = 5. Regarding the EM-test, under the null hypothesis the test statistic

has a positive probability of being equal to 0, as for the case K = 5 in Table 3. This means

that there is no evidence to reject the null hypothesis, suggesting that the model with K = 5
components provides a good fit to the data.

Table 3: Selection of the number of components K in the binomial mixture model without
concomitant variables.

Number   Number   Log-likelihood   BIC      LRT statistic         EM-test statistic
comp.    param.                             (bootstrap p-value)   (p-value)
1         1       -4045.3          8097.1   3539.09 (0.0000)      3538.3 (0.0000)
2         3       -2275.8          4571.1    567.98 (0.0000)       700.3 (0.0000)
3         5       -1991.8          4016.2    135.50 (0.0000)       153.2 (0.0000)
4         7       -1924.0          3893.8     18.31 (0.0000)        17.5 (0.0001)
5         9       -1914.9          3888.5      0.03 (0.2640)         0.0 (1.0000)
6        11       -1914.8          3901.6
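
As a quick consistency check on Table 3, the BIC column can be recomputed from the log-likelihoods. The sketch below assumes the textbook definition BIC = −2 log L + p log n with n = 690 and p = 2K − 1 free parameters; the paper does not spell out the formula, so this is an assumption, although it reproduces the printed values up to rounding of the log-likelihoods.

```python
import math

n = 690  # sample size reported in Section 3

# (K, log-likelihood, BIC as printed in Table 3)
table3 = [
    (1, -4045.3, 8097.1),
    (2, -2275.8, 4571.1),
    (3, -1991.8, 4016.2),
    (4, -1924.0, 3893.8),
    (5, -1914.9, 3888.5),
    (6, -1914.8, 3901.6),
]

def n_params(K):
    """K binomial probabilities plus K - 1 free mixing proportions."""
    return 2 * K - 1

def bic(loglik, p, n):
    """Textbook BIC (assumed definition, not quoted from the paper)."""
    return -2.0 * loglik + p * math.log(n)

recomputed = {K: bic(ll, n_params(K), n) for K, ll, _ in table3}
best_K = min(recomputed, key=recomputed.get)  # number of components with the lowest BIC
```

Under these assumptions the lowest recomputed BIC is attained at K = 5, in line with the choice made in the text.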

Table 4 reports the results for the binomial mixture model without concomitant variables
for K = 5 components. The ﬁrst component has a proportion π̂1 = 0.22 and a probability of

success θ̂1 near zero, thus yielding an almost degenerate distribution with mass concentrated

in zero. The other four components, whose distributions are depicted in Figure 2, correspond
to latent classes of students with increasing performance in terms of gained credits (the

expected numbers of credits are 9, 23, 39 and 51, respectively). Table 4 also reports two
relevant conditional probabilities: the probability of getting zero credits, P(credits = 0 | u = k),
and the probability of getting all or almost all of the sixty planned credits, P(credits ≥ 54 | u = k).

Footnote 1: Downloadable from http://www.math.uwaterloo.ca/ p4li/software/index.htmltest

Table 4: Fitted binomial mixture model without concomitant variables for K = 5 components.

Component   πk     θk     E(credits|u = k)   P(credits = 0|u = k)   P(credits ≥ 54|u = k)
1           0.22   0.00          0                 1.000                  0.000
2           0.15   0.14          9                 0.045                  0.000
3           0.25   0.39         23                 0.000                  0.000
4           0.28   0.65         39                 0.000                  0.012
5           0.10   0.85         51                 0.000                  0.381

The predicted marginal probabilities P (credits = c) are obtained by plugging parameter

estimates into equations (1) and (2), recalling that y = credits/3. It is worth noting that

the model yields an excellent ﬁt of the proportions at the extremes of the distribution. In

fact, Table 4 shows that the probability of gaining zero credits is not negligible only for
the ﬁrst two components, so that P (credits = 0) ≈ 0.22 × 1.000 + 0.15 × 0.045 = 0.230,

equal to the observed proportion. Therefore, the ﬁtted model adequately accounts for excess

zeroes. As for the right tail, the probability of gaining at least 54 credits (i.e. at most

one exam of 6 credits left out) is not negligible only for the last two components, so that
P (credits ≥ 54) ≈ 0.28 × 0.012 + 0.10 × 0.381 = 0.040, close to the observed proportion

0.046.
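
The arithmetic above is an application of equation (1) to the class-conditional probabilities. The sketch below repeats it with the entries of Table 4 exactly as printed; because those entries are rounded, the results can differ from the values quoted in the text in the third decimal, presumably because the text uses unrounded estimates.

```python
# Values copied from Table 4 (rounded as printed).
pis = [0.22, 0.15, 0.25, 0.28, 0.10]
p_zero_given_k = [1.000, 0.045, 0.000, 0.000, 0.000]  # P(credits = 0 | u = k)
p_top_given_k = [0.000, 0.000, 0.000, 0.012, 0.381]   # P(credits >= 54 | u = k)

def marginal(pis, cond):
    """Equation (1): mix the class-conditional probabilities with the class sizes."""
    return sum(p * c for p, c in zip(pis, cond))

p_zero = marginal(pis, p_zero_given_k)  # marginal P(credits = 0)
p_top = marginal(pis, p_top_given_k)    # marginal P(credits >= 54)
```

Only the first two components contribute to `p_zero` and only the last two to `p_top`, which is why the text keeps just two terms in each sum.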

The five components of the fitted mixture model all have a relevant size and they are
well separated (see Table 4 and Figure 2, where the first component is not represented since

it has almost all the mass in zero). Thus, the multi-modal distribution of gained credits

depicted in Figure 1 is adequately approximated by the 5-component binomial mixture.

In order to predict the number of gained credits, we exploit the background variables

and the test scores to characterize the ﬁve components via the concomitant variable mixture

model of equations (2) to (4). To select the covariates, we first fit the full model, then we
remove the non-significant covariates. Specifically, for each covariate s, we perform the multiple
Wald test for $H_0: \beta_{2s} = \beta_{3s} = \beta_{4s} = \beta_{5s} = 0$ and discard the covariate when the p-value is
higher than 5%.

Figure 2: Distribution of components 2 to 5 of the fitted binomial mixture model of Table 4.
(Horizontal axis: gained credits, from 0 to 60; vertical axis: P(credits | u = k), from 0.00 to 0.25;
one curve for each of k = 2, 3, 4, 5.)

For the ﬁnal model, Table 5 shows the parameter estimates and p-values of the multiple

Wald test for the covariates. The p-value of the standardized Logic test score is slightly higher
than 5%, but we retained it for comparability with the other test scores. As expected, the

estimates of the binomial probabilities are nearly identical to those of the model without

covariates (see Table 4).

The univariate Wald test for the k-th component (k = 2, 3, 4, 5) allows us to test if a
given covariate aﬀects the odds comparing the k-th component with the ﬁrst one (i.e. the

zero-credit latent class). Considering only signiﬁcant coeﬃcients, we see that students from

Technical or other high schools and students with irregular career have a lower performance
in terms of gained credits, while students with a good high school grade and students with

high scores in Reading and Math have a better performance.

Table 5: Binomial mixture model with concomitant variables: parameter estimates and
p-values of the multiple Wald test for the covariates.

                                       Latent class
                              1       2       3       4       5     p-value
Binomial probability θk     0.00    0.15    0.38    0.64    0.85       -
Multinomial logit model† for πk
  Constant                    -    -0.03    0.22    0.96   -0.57     0.000
  HS Technical/other          -    -0.63    0.18   -0.40   -1.43     0.013
  HS irregular career         -    -0.39   -0.79   -3.08   -0.57     0.012
  HS grade                    -    -0.01    0.01    0.06    0.12     0.000
  Logic (std score)           -    -0.11    0.21    0.26   -0.34     0.052
  Reading (std score)         -     0.51    0.33    0.29    0.79     0.001
  Math (std score)            -    -0.09    0.00    0.25    1.10     0.000

† Estimates are in italic when the p-value of the univariate Wald test is < 0.05.

In general, in a multinomial model the effect of covariates on the probabilities $\pi_k$ is
not linear and may even be non-monotone. In order to better understand the effect of

covariates, it is useful to transform coeﬃcients into probabilities and credits. To this end,
Table 6 reports the predicted latent class probabilities and expected number of gained credits

for several student proﬁles. The expected number of credits for latent class k is obtained

from the corresponding binomial distribution as $60 \times \theta_k$. These values are reported in the
last row of Table 6, whereas the expectations in the last column are the model predictions
obtained as weighted means, namely $\sum_k \pi_k (60 \times \theta_k)$.

Table 6: Binomial mixture model with concomitant variables: predicted latent class probabilities
and expected number of gained credits.

                                    Latent class                 Expected
                           1      2      3      4      5       n. of credits
Predicted probabilities
  Baseline student†      0.16   0.15   0.20   0.41   0.09         25.8
  HS Technical/other     0.20   0.11   0.31   0.36   0.03         22.9
  HS irregular career    0.38   0.25   0.21   0.04   0.12         14.8
  HS grade 60 (min)      0.24   0.30   0.26   0.19   0.01         16.5
  HS grade 100 (max)     0.06   0.04   0.08   0.47   0.35         37.8
  Weak student‡          0.48   0.22   0.28   0.01   0.00          9.0
Expected number of credits
                          0.0    8.8   22.7   38.1   50.8

† Baseline: HS Scientific/Humanities, regular career, HS grade=80, test scores=0.
‡ Weak: HS Technical/other, Irregular career, HS grade=60, test scores=0.
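
The transformation from coefficients to probabilities and credits can be sketched as follows, using the rounded estimates printed in Table 5. The dictionary keys and function names are hypothetical labels introduced for illustration. Because the inputs are rounded to two decimals, the output agrees with Table 6 only approximately.

```python
import math

thetas = [0.00, 0.15, 0.38, 0.64, 0.85]  # binomial probabilities from Table 5

# Multinomial logit coefficients for classes 2..5 (class 1 is the reference).
coef = {
    "const":    [-0.03, 0.22, 0.96, -0.57],
    "hs_tech":  [-0.63, 0.18, -0.40, -1.43],
    "hs_irreg": [-0.39, -0.79, -3.08, -0.57],
    "hs_grade": [-0.01, 0.01, 0.06, 0.12],
    "logic":    [-0.11, 0.21, 0.26, -0.34],
    "reading":  [0.51, 0.33, 0.29, 0.79],
    "math":     [-0.09, 0.00, 0.25, 1.10],
}

def class_probs(profile):
    """Equation (4) with beta_1 = 0; covariates not listed in the profile are 0."""
    x = dict(profile, const=1.0)
    logits = [0.0] + [
        sum(coef[name][k] * value for name, value in x.items())
        for k in range(4)
    ]
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]

def expected_credits(profile):
    """Weighted mean of the class expectations: sum_k pi_k * 60 * theta_k."""
    return sum(p * 60 * th for p, th in zip(class_probs(profile), thetas))

baseline = {}  # all covariates at zero
weak = {"hs_tech": 1, "hs_irreg": 1, "hs_grade": 60 - 80}  # grade is centred at 80

baseline_probs = class_probs(baseline)
baseline_credits = expected_credits(baseline)
weak_credits = expected_credits(weak)
```

Other profiles can be evaluated by changing the dictionary, for example `{"math": 2}` for a baseline student with a Math score two standard deviations above the mean.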

The baseline student is deﬁned by having all the covariates equal to zero, namely she
comes from a Scientiﬁc/Humanities high school, with regular career and a grade at the mid-

point (80), and she obtained average test scores. This student has a low probability to be in

the zero-credit latent class and a high probability to be in the fourth one, with an expected

number of credits equal to 25.8. In Table 6, the four rows below the baseline student refer
to proﬁles that diﬀer from the baseline by changing the covariates one at a time. Students

from Technical or other schools have a higher probability to be in the zero-credit class and

a lower probability to be in the top class, but overall the diﬀerence in terms of expected
number of credits is small (−2.9 credits). Students with irregular high school career have

a remarkably large probability to be in the zero-credit class and they have a substantially

lower expected number of credits (−11.0 credits). The high school grade is a good predictor
of student’s performance: the expected number of credits ranges from 16.5 for a grade at

the minimum to 37.8 for a grade at the maximum.

The weak student has the most unfavorable background characteristics, i.e. she comes

from a Technical or other high school, with irregular career and a grade at the minimum
(60). This kind of student has nearly ﬁfty percent probability to be in the zero-credit class

and an almost null probability to be in the two top classes; as a consequence, the expected

number of gained credits is only 9.

To interpret the effect of the standardized test scores on gained credits, we compute the
expected number of credits for the baseline student by varying the scores one at a time
in a grid between −3 and +3. Figure 3 reports the three curves, showing non-linear patterns
that would be difficult to figure out without transforming the estimated coefficients. The

Logic score has a negligible eﬀect, whereas the Reading score has a small positive eﬀect. The

Math score has the largest eﬀect, especially for high scores (note the asymmetry in Figure

3): given the background characteristics and the other test scores, a high Math score is
associated with a substantial increase of gained credits.

Figure 3: Expected number of gained credits by test scores (the value in zero refers to the
baseline student). (Horizontal axis: standardized test scores, from −3 to 3; vertical axis: expected
number of credits, from 0 to 45; one curve each for Reading, Math and Logic.)

Another interesting aspect is the probability of belonging to the zero-credit latent class.
Such probability strongly depends on background characteristics, for example it is 0.16 for

the baseline student and 0.48 for the weak student (see Table 6). Figure 4 reports the

probability of belonging to the zero-credit latent class for the weak student by varying the
scores one at a time in a grid between −3 and +3. It is worth noting that the Math and
Logic scores do not have any appreciable effect in reducing the size of the zero-credit class,

whereas the Reading score has a strong eﬀect: the probability that a weak student falls in
the zero-credit class ranges from 0.78 for a Reading score equal to −3 to 0.21 for a Reading

score equal to +3. Thus, given the background characteristics and the other test scores, a

high Reading score is associated with a substantial increase in the probability of a successful

start-up of the university career.
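
The Reading curve for the weak student can be approximated from the rounded coefficients of Table 5, as in the sketch below. The variable and function names are illustrative assumptions, and the rounding of the inputs means that the endpoints match the values quoted above only to within a few hundredths.

```python
import math

# Table 5 coefficients for classes 2..5 (class 1, the zero-credit class, is the reference).
const = [-0.03, 0.22, 0.96, -0.57]
hs_tech = [-0.63, 0.18, -0.40, -1.43]
hs_irreg = [-0.39, -0.79, -3.08, -0.57]
hs_grade = [-0.01, 0.01, 0.06, 0.12]
reading = [0.51, 0.33, 0.29, 0.79]

def p_zero_class_weak(reading_score):
    """P(u = 1 | z) for the weak student (Technical/other, irregular career, grade 60)."""
    grade_centred = 60 - 80  # HS grade is centred at 80
    logits = [
        const[k] + hs_tech[k] + hs_irreg[k]
        + hs_grade[k] * grade_centred + reading[k] * reading_score
        for k in range(4)
    ]
    # The reference class has logit 0, hence the leading 1 in the denominator.
    return 1.0 / (1.0 + sum(math.exp(v) for v in logits))

grid = [-3, -2, -1, 0, 1, 2, 3]
curve = [p_zero_class_weak(s) for s in grid]
```

Since all four Reading coefficients are positive, the probability of the zero-credit class decreases steadily along the grid, which is the pattern described for Figure 4.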

[Figure 4: line plot of the probability of belonging to the zero-credit latent class (vertical axis, 0.0 to 1.0) against standardized test scores from −3 to +3 (horizontal axis), with one curve each for Reading, Math and Logic.]

Figure 4: Probability of belonging to the zero-credit latent class by test scores (the value in
zero refers to the weak student).

To evaluate the predictive performance of the concomitant variable mixture model, we
compute the predicted values ŷi from equation (5), using 10-fold cross-validation (Hastie et
al., 2009). The prediction errors (ŷi − yi) have a mean close to zero and a nearly symmetric
distribution, with quartiles equal to Q1 = −12.4, Q2 = −0.4 and Q3 = 11.4. The Mean
Absolute Error (MAE) is 13.0, which is similar to the one computed on the whole sample
(12.7); therefore, in this application, the advantage of using cross-validation is negligible.
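
The cross-validation procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: the data are synthetic, and a least-squares linear predictor (`fit_linear`, `predict_linear`, both hypothetical helpers) stands in for the concomitant variable mixture model. Only the mechanics are shown: out-of-fold prediction errors, their quartiles, and the comparison between the cross-validated and in-sample MAE.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with intercept (placeholder for the mixture model)."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict_linear(beta, X):
    """Predicted values from the placeholder model."""
    return np.column_stack([np.ones(len(X)), X]) @ beta


def kfold_errors(X, y, k=10, seed=0):
    """Out-of-fold prediction errors (y_hat - y) from k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = np.full(len(y), np.nan)
    for held_out in folds:
        train = np.ones(len(y), dtype=bool)
        train[held_out] = False          # fit on all folds except the held-out one
        beta = fit_linear(X[train], y[train])
        errors[held_out] = predict_linear(beta, X[held_out]) - y[held_out]
    return errors


# Synthetic data for illustration only (not the university credits data).
rng = np.random.default_rng(42)
n = 500
X = rng.normal(size=(n, 3))
y = np.clip(30 + 6 * X[:, 1] + rng.normal(scale=12, size=n), 0, 60)

cv_errors = kfold_errors(X, y, k=10)
cv_mae = np.mean(np.abs(cv_errors))
q1, q2, q3 = np.percentile(cv_errors, [25, 50, 75])
in_sample_mae = np.mean(np.abs(predict_linear(fit_linear(X, y), X) - y))
```

A small gap between `cv_mae` and `in_sample_mae` indicates that the in-sample error is only slightly optimistic, which is the situation described above.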

Moreover, to assess the predictive power of the covariates, we compare the average prediction
errors for the following nested models: model without covariates (MAE=15.7), model
with only background characteristics (MAE=13.3, reduction of 15%), and full model with
background characteristics and test scores (MAE=12.7, further reduction of 4%). Therefore,
in terms of predictive ability, the background characteristics make a relevant contribution.
The pre-enrolment test yields a further improvement, even though the predictive ability remains
modest.
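
The percentage reductions can be checked with a few lines of Python. The text does not state which denominator is used for the "further reduction"; the sketch below assumes both reductions are expressed relative to the MAE of the model without covariates, which reproduces the reported 15% and 4%, and also shows the alternative computed relative to the preceding model.

```python
# MAE values as reported in the text.
mae_null = 15.7          # model without covariates
mae_background = 13.3    # background characteristics only
mae_full = 12.7          # background characteristics and test scores

# Reduction due to background characteristics, relative to the null model.
reduction_background = (mae_null - mae_background) / mae_null

# Further reduction due to test scores, under two possible denominators.
further_vs_null = (mae_background - mae_full) / mae_null            # assumed reading
further_vs_previous = (mae_background - mae_full) / mae_background  # alternative

print(f"background: {100 * reduction_background:.1f}%")
print(f"test scores (vs. null model): {100 * further_vs_null:.1f}%")
print(f"test scores (vs. previous model): {100 * further_vs_previous:.1f}%")
```

Under either reading, the gain from the test scores is several times smaller than the gain from the background characteristics.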

5 Concluding remarks

The paper has presented a detailed data analysis based on a binomial mixture model with
concomitant variables. In our application on the number of gained university credits, the
binomial mixture model proved to be a flexible tool to model a complex response variable,
which is a bounded count characterized by a peak in zero and several modes.

The paper addressed the controversial issue of the choice of the number of components
using two standard methods, namely the BIC index and the bootstrap LRT, and a recently
proposed EM-test. In this application, the three methods led to the same conclusion, though
further research is needed to compare the EM-test with standard procedures.

The results of the concomitant variable mixture model have been effectively presented
by means of tables and graphs, based on converting regression coefficients into predicted
component probabilities and expected response for a set of student profiles defined by the
covariates. In this way the information content of the model has been well exploited.

The predictive performance of the model has been evaluated by means of cross-validation,
showing that background characteristics and test scores help predict gained credits. Further
work should be done on the development of suitable diagnostic tools based on generalized
residuals, starting from the work of Wang et al. (1996) on Poisson mixture models.

The analysis of gained university credits confirmed the predictive role of background
characteristics such as the high school type and grade, and the regularity of the school
career. Moreover, the analysis showed that the pre-enrolment test designed by the School
of Economics of the University of Florence gives additional information. Thus, the test
results can be effectively added to background characteristics to yield valuable indications
for student tutoring: in particular, a low Reading score is related to a difficult start-up, while
a low Math score is related to a slow progression.

References
Böhning, D., Kuhnert, R. (2006). Equivalence of Truncated Count Mixture Distributions
and Mixtures of Truncated Count Distributions. Biometrics, 62:1207–1215.

Brooks, S. P., Morgan, B. J. T., Ridout, M. S., Pack, S. E. (1997). Finite Mixture Models
for Proportions. Biometrics, 53:1097–1115.

Dayton, C. M., Macready, G. B. (1988). Concomitant-Variable Latent-Class Models. Journal
of the American Statistical Association, 83:173–178.

Hall, D. B. (2000). Zero-inflated Poisson and binomial regression with random effects: a case
study. Biometrics, 56:1030–1039.

Hastie, T., Tibshirani, R., Friedman, J. (2009). The Elements of Statistical Learning: Data
Mining, Inference, and Prediction. Second Edition. Springer.

Li, P., Chen, J. (2010). Testing the order of a finite mixture model. Journal of the American
Statistical Association, 105:1084–1092.

McLachlan, G. (1987). On Bootstrapping the Likelihood Ratio Test Statistic for the Number
of Components in a Normal Mixture. Applied Statistics, 36:318–324.

McLachlan, G., Peel, D. (2000). Finite Mixture Models. New York: Wiley.

Melkersson, M., Saarela, J. (2004). Welfare Participation and Welfare Dependence among
the Unemployed. Journal of Population Economics, 17:409–431.

Nylund, K. L., Asparouhov, T., Muthén, B. O. (2007). Deciding the number of classes in
latent class analysis and growth mixture modeling: a Monte Carlo simulation study.
Structural Equation Modeling, 14:535–569.

Saffari, S. E., Adnan, R., Greene, W. (2012). Investigating the impact of excess zeros
on hurdle-generalized Poisson regression model with right censored count data. Statistica
Neerlandica, 67:67–80.

Schlattmann, P. (2009). Medical Applications of Finite Mixture Models. Berlin: Springer-Verlag.

Titterington, D. M., Smith, A. F. M., Makov, U. E. (1985). Statistical Analysis of Finite
Mixture Distributions. New York: Wiley.

Vermunt, J. K., Magidson, J. (2008). LG-Syntax User's Guide: Manual for Latent GOLD 4.5
Syntax Module. Belmont, MA: Statistical Innovations Inc.

Wang, P., Puterman, M. L., Cockburn, I., Le, N. D. (1996). Mixed Poisson regression models
with covariate dependent rates. Biometrics, 52:381–400.

Wedel, M., DeSarbo, W. S. (1995). A Mixture Likelihood Approach for Generalized Linear
Models. Journal of Classification, 12:21–55.

