The document presents extended abstracts from the Fall 2015 workshop on Biomedical Big Data and Statistics for Low Dose Radiation Research, organized by the Centre de Recerca Matemàtica. It features contributions from various researchers on topics such as statistical methods for high-dimensional data, joint modeling approaches, and applications in cancer studies and HIV research. The volume aims to quickly communicate new research findings and methodological advancements in the field of biostatistics and bioinformatics.
Extended Abstracts Fall 2015 Biomedical Big Data; Statistics for Low Dose Radiation Research

Trends in Mathematics

Research Perspectives CRM Barcelona

Volume 7

Series editors
Enric Ventura
Antoni Guillamon

Since 1984 the Centre de Recerca Matemàtica (CRM) has been organizing scien-
tific events such as conferences or workshops which span a wide range of
cutting-edge topics in mathematics and present outstanding new results. In the fall
of 2012, the CRM decided to publish extended conference abstracts originating
from scientific events hosted at the center. The aim of this initiative is to quickly
communicate new achievements, contribute to a fluent update of the state of the art,
and enhance the scientific benefit of the CRM meetings. The extended abstracts are
published in the subseries Research Perspectives CRM Barcelona within the Trends
in Mathematics series. Volumes in the subseries will include a collection of revised
written versions of the communications, grouped by events.

More information about this series at http://www.springer.com/series/13332


Elizabeth A. Ainsbury · M.Luz Calle · Elisabeth Cardis · Jochen Einbeck · Guadalupe Gómez · Pere Puig
Editors

Extended Abstracts Fall 2015


Biomedical Big Data

Guadalupe Gómez
Pere Puig
M.Luz Calle
Editors

Statistics for Low Dose Radiation Research

Elizabeth A. Ainsbury
Elisabeth Cardis
Pere Puig
Jochen Einbeck
Editors
Editors
Elizabeth A. Ainsbury, Chemical and Environmental Hazards, Public Health England, Chilton, UK
Jochen Einbeck, Mathematical Sciences, Durham University, Durham, UK
M.Luz Calle, Departament de Biociències, Universitat Central de Catalunya, Vic, Spain
Guadalupe Gómez, Estadística i Investigació Operativa, Universitat Politècnica de Catalunya, Barcelona, Spain
Elisabeth Cardis, Campus Mar, ISGlobal, Barcelona, Spain
Pere Puig, Departament de Matemàtiques, Universitat Autònoma de Barcelona, Barcelona, Spain

ISSN 2297-0215 ISSN 2297-024X (electronic)


Trends in Mathematics
ISSN 2509-7407 ISSN 2509-7415 (electronic)
Research Perspectives CRM Barcelona
ISBN 978-3-319-55638-3 ISBN 978-3-319-55639-0 (eBook)
DOI 10.1007/978-3-319-55639-0
Library of Congress Control Number: 2017936021

Mathematics Subject Classification (2010): 62M10, 62N01, 62P10, 92B15, 92C60, 92D30

© Springer International Publishing AG 2017


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This book is published under the trade name Birkhäuser, www.birkhauser-science.com


The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents

Part I Biomedical Big Data

Extreme Observations in Biomedical Data
Concepción Arenas, Itziar Irigoien, Francesc Mestres, Claudio Toma, and Bru Cormand
An Ordinal Joint Model for Breast Cancer
Carmen Armero, Carles Forné, Montserrat Rué, Anabel Forte, Hector Perpiñán, Guadalupe Gómez, and Marisa Baré
Sample Size Impact on the Categorisation of Continuous Variables in Clinical Prediction
Irantzu Barrio, Inmaculada Arostegui, and María-Xosé Rodríguez-Álvarez
Integrative Analysis of Transcriptomics and Proteomics Data for the Characterization of Brain Tissue After Ischemic Stroke
Ferran Briansó, Teresa García-Berrocoso, Joan Montaner, and Alex Sánchez-Pla
Applying INAR-Hidden Markov Chains in the Analysis of Under-Reported Data
Amanda Fernández-Fontelo, Alejandra Cabaña, Pedro Puig, and David Moriña
Joint Modelling for Flexible Multivariate Longitudinal and Survival Data: Application in Orthotopic Liver Transplantation
Ipek Guler, Christel Faes, Carmen Cadarso-Suárez, and Francisco Gude
A Multi-state Model for the Progression to Osteopenia and Osteoporosis Among HIV-Infected Patients
Klaus Langohr, Nuria Pérez-Álvarez, Eugenia Negredo, Anna Bonjoch, Montserrat Rué, Ronald Geskus, and Guadalupe Gómez
Statistical Challenges for Human Microbiome Analysis
Javier Rivera-Pinto, Carla Estany, Roger Paredes, M.Luz Calle, Marc Noguera-Julián, and the MetaHIV-Pheno Study Group
Integrative Analysis to Select Genes Regulated by Methylation in a Cancer Colon Study
Alex Sánchez-Pla, M. Carme Ruíz de Villa, Francesc Carmona, Sarah Bazzoco, and Diego Arango del Corro
Topological Pathway Enrichment Analysis of Gene Expression in High Grade Serous Ovarian Cancer Reveals Tumor-Stoma Cross-Talk
Oana A. Zeleznik, Gerhard G. Thallinger, John Platig, and Aedín C. Culhane

Part II Statistics for Low Dose Radiation Research

Biological Dosimetry, Statistical Challenges: Biological Dosimetry After High-Dose Exposures to Ionizing Radiation
Joan Francesc Barquinero and Pere Puig
Heterogeneous Correlation of Multi-level Omics Data for the Consideration of Inter-tumoural Heterogeneity
Herbert Braselmann
Overview of Topics Related to Model Selection for Regression
Riccardo De Bin
Understanding Plaque Overlap Is Essential for Modelling Radiation Induced Atherosclerosis
Fieke Dekkers, Arjan Maud-Briels, Teun van van-Dijk, Astrid Dillen, and Kloosterman
On the Use of Random Effect Models for Radiation Biodosimetry
Jochen Einbeck, Elizabeth Ainsbury, Stephen Barnard, Maria Oliveira, Grainne Manning, Pere Puig, and Christophe Badie
Modelling of the Radiation Carcinogenesis: The Analytic and Stochastic Approaches
Krzysztof W. Fornalski, Ludwik Dobrzyński, and Joanna Reszczyńska
Bayesian Solutions to Biodosimetry Count Data Problems and Supporting Software
Manuel Higueras and Elizabeth A. Ainsbury
Empirical Assessment of Gene Expression Biomarkers for Radiation Exposure
Adetayo Kasim, Nolen Joy Perualila, and Ziv Shkedy
Poisson-Weighted Estimation by Discrete Kernel with Application to Radiation Biodosimetry
Célestin C. Kokonendji, Nabil Zougab, and Tristan Senga-Kiessé
R Implementation of the Excess Relative Rate Model: Applications to Radiation Epidemiology
David Moriña and Elisabeth Cardis
Uncertainty Considerations Following a Mechanistic Analysis of Lung Cancer Mortality
Ignacio Zaballa and Markus Eidemüller
Part I
Biomedical Big Data

Foreword
In the last quarter of 2015, from September 8 to November 27, over 100 biostatisticians,
statisticians and mathematicians from 45 different institutions visited the
Centre de Recerca Matemàtica (CRM) in Bellaterra to participate in the Intensive
Research Programme on Statistical Advances for Complex Data. The local organizers
of this research semester were Alejandra Cabaña (Universitat Autònoma de
Barcelona), Malu Calle (Universitat de Vic), Pedro Delicado (Universitat Politècnica
de Catalunya), Anna Espinal (Universitat Autònoma de Barcelona), Guadalupe
Gómez (Universitat Politècnica de Catalunya), Rosa Lamarca (Almirall SA), Pere
Puig (Universitat Autònoma de Barcelona), Montserrat Rué (Universitat de Lleida),
and Àlex Sánchez (Universitat de Barcelona). The program brought together scientists,
from enthusiastic Ph.D. students to respected senior professors, working in
relevant areas such as Modeling and analysis of biological and biomedical data,
Biostatistical methods for clinical trials and for complex time-to-event data, and
Statistics and Big Data. The very dynamic and productive atmosphere we enjoyed
translated into equally active courses, seminars and a workshop on Biomedical (Big)
Data, held on November 26 and 27, closing the program.
The workshop was a meeting point for the researchers who are members of
BIOSTATNET, a pioneering Spanish network of biostatisticians. BIOSTATNET has
almost two hundred members organized around eight different nodes, led by statisticians
from different universities, with their own research projects and teaching experience
in biostatistical matters, and working closely with biomedical researchers. The workshop
included five invited talks, a roundtable, eleven contributed oral presentations
and ten posters.
In this volume of the subseries Research Perspectives CRM-Barcelona (published
by Birkhäuser inside the series Trends in Mathematics), we present ten extended
abstracts corresponding to selected talks given by participants in the workshop on
Biomedical (Big) Data. The variety of topics presented bears testimony to the rich
activity that made a success of the workshop, and also of the Intensive Research
Programme. The selected topics include methodological biostatistical and bioinformatics
advances as well as relevant medical applications. Five abstracts contribute
new procedures and methods for high-dimensional data sets: from integrative
analysis of omics data to statistical models for microbiome data. In three papers,
different approaches to the joint modelling of longitudinal markers and time to
events are the main contribution. Sample size considerations in clinical prediction and
models for under-reported time series count data are other topics addressed by
the authors. As is usual in our field, most of the methodological progress is driven
by relevant scientific questions in the medical and biological fields.
Among our abstracts, we find problems that have arisen in HIV studies, where
the evolution of bone mineral density measurements or the characterization of the
microbiome composition of HIV-infected persons is of interest; three papers discuss
applications in cancer studies, focusing on the selection of genes in a colon cancer
study, analyzing gene expression in ovarian cancer, or assessing breast cancer risk
from longitudinal mammographic breast densities. Other clinical studies, analyzing
autism multiplex families, characterizing brain tissue after ischemic stroke, or
finding the association of post-operative glucose profiles and insulin therapy
with patient survival after an orthotopic liver transplantation, are part of the problems
addressed from statistical and computational perspectives. We hope that this volume
will give the authors the opportunity to quickly communicate their recent research:
most of the short articles here are brief and preliminary presentations of new results
not yet published in regular research journals.
We would like to express our gratitude to the CRM for hosting and supporting
our research program. Also our warm thanks to the CRM staff, its director, Joaquim
Bruna, and all the secretaries, for providing great facilities and a very pleasant working
environment. Last but not least, thanks are due to all those who attended the talks,
for their interest, enthusiasm and their active participation. The program was also
possible thanks to the generous support of the following research projects from the
Ministerio de Economía y Competitividad (Spain): “Applied Stochastic Processes”,
conducted at Universitat Autònoma de Barcelona (MTM2012-31118), “Advanced
Methods in Survival Analysis: Clinical Trials, Longitudinal Data and Interval Censoring”,
conducted at Universitat Politècnica de Catalunya (MTM2012-38067-C02-01),
“Sampling Samples: Relevant Applications of Statistics in Digital Economy and
Society”, conducted at Universitat Politècnica de Catalunya (MTM2013-43992-R),
and the following one from the Departament d’Economia i Coneixement (Catalunya):
“Research Group in Biostatistics and Bioinformatics” at Universitat Politècnica de
Catalunya and Universitat de Barcelona (SGR 464), as well as the Simons Foundation,
and the Fundación Española para la Ciencia y la Tecnología (Ministerio de
Economía y Competitividad, Spain).

July 2016
Barcelona, Spain Guadalupe Gómez
Pere Puig
M.Luz Calle
Extreme Observations in Biomedical Data

Concepción Arenas, Itziar Irigoien, Francesc Mestres, Claudio Toma, and Bru Cormand

Abstract We present a new procedure to detect extreme observations which can be
applied to low or high-dimensional data sets. Continuous features, a known underlying
distribution or parameter estimations are not required. The procedure offers a
ranking by assigning a value to each observation that reflects its degree of outlyingness.
A short computation time is needed.

1 Introduction

In current biomedical research, genetic studies are extensively used to identify the
causes of human diseases, and they provide insights for the eventual development of
therapeutic strategies. Integration of different types of data sets, such as gene expression
data, genotype data or clinical information, is needed to capture information that
may otherwise be lost in separate analyses. Furthermore, it is crucial to be able to
detect extreme observations, since an extreme value may indicate an individual with a
wrong diagnosis, presenting particular clinical features, or classified in the extreme
spectrum of the disease. Moreover, the usual scenario with current data is the lack of
information about the underlying distribution. Thus, non-parametric extreme observation
detection algorithms for any type of features and for data sets of any size and
dimension are desirable. We present a new procedure to detect extreme observations which
can be applied to low or high-dimensional data sets. Continuous features, a known
underlying distribution or parameter estimations are not required. The procedure
offers, in a short computation time, a ranking by assigning to each observation a
value that reflects its degree of outlyingness. The proposed method takes into account
all distances between observations, not only distances between neighbours, in such
a way that the relation of any observation with respect to all the other observations in
the data set and the dispersion of all data are considered. To illustrate our procedure,
we analyzed the data of rare genetic variants from 10 autism multiplex families and
twenty-six high-dimensional class-imbalanced cancer data sets. The results showed
the good performance of the procedure.

C. Arenas (corresponding author)
Department of Statistics, University of Barcelona, Barcelona, Spain
I. Irigoien
Department of Computer Sciences and Artificial Intelligence,
University of the Basque Country, Leioa, Spain
F. Mestres · B. Cormand
Department of Genetics, University of Barcelona, Barcelona, Spain
C. Toma
Neuroscience Research Australia, Sydney, NSW, Australia

© Springer International Publishing AG 2017
E.A. Ainsbury et al. (eds.), Extended Abstracts Fall 2015,
Trends in Mathematics 7, DOI 10.1007/978-3-319-55639-0_1

2 Methods

The starting point is an $n \times p$ data matrix ($p$ can be much larger than the sample
size $n$) where the rows correspond to observations (individuals, samples, ...) and the
columns correspond to any kind of features to be measured, which can be continuous,
binary or multiattribute data (genes, clinical/pathological features, ...). Let $G$ be a
group that is represented by a $p$-random vector $Y = (Y_1, \ldots, Y_p)$, with values in a
metric space $S \subset \mathbb{R}^p$ and a probability density $f$ with respect to a suitable measure
$\lambda$. Let $\delta$ be a distance function between any pair of observations, $\delta_{ij} = \delta(y_i, y_j)$.

Definition 1 The geometric variability of $G$ with respect to $\delta$, a general measure of
dispersion of $G$, is defined by

$$V(G) = \frac{1}{2} \int_{S \times S} \delta^2(y_i, y_j)\, f(y_i)\, f(y_j)\, \lambda(dy_i)\, \lambda(dy_j);$$

see [1]. When $\delta$ is the Euclidean distance, $V(G) = \mathrm{tr}(\Sigma)$ with $\Sigma = \mathrm{cov}(Y)$. The
geometric variability is a variant of Rao’s diversity coefficient; see [2].
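
The Euclidean case can be checked in a few lines. The following worked step is an addition for illustration: $Y'$ (an independent copy of $Y$) and $\mu = E(Y)$ are symbols introduced here, not notation from the text.

```latex
\begin{aligned}
V(G) &= \tfrac{1}{2}\, E\,\lVert Y - Y' \rVert^{2} \\
     &= \tfrac{1}{2}\Big( E\,\lVert Y - \mu \rVert^{2} + E\,\lVert Y' - \mu \rVert^{2}
        - 2\, E\big[(Y - \mu)^{\top}(Y' - \mu)\big] \Big) \\
     &= \tfrac{1}{2}\big( \operatorname{tr}(\Sigma) + \operatorname{tr}(\Sigma) - 0 \big)
      = \operatorname{tr}(\Sigma).
\end{aligned}
```

The cross term vanishes because $Y$ and $Y'$ are independent with the same mean.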

Definition 2 The proximity function of an observation $y$ to $G$ is defined by

$$\phi^2(y, G) = \int_{S} \delta^2(y, y_j)\, f(y_j)\, \lambda(dy_j) - V(G);$$

see [1].

Since, in applied problems, the probability distribution of $Y$ is usually unknown,
estimators are needed. Given a sample of size $n$, $y_1, \ldots, y_n$, natural estimators for
the geometric variability and the proximity function are

$$\hat{V}(G) = \frac{1}{2n^2} \sum_{i,j} \delta^2(y_i, y_j),$$

and

$$\hat{\phi}^2(y, G) = \frac{1}{n} \sum_{i} \delta^2(y, y_i) - \hat{V}(G),$$

respectively. See [3] for a review of these concepts, and for applications see [4, 5]
and references therein.
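
As a minimal sketch of these two estimators, the Python snippet below evaluates them on a small made-up sample with the squared Euclidean distance. The function names and the four sample points are illustrative assumptions, not part of the original work.

```python
def sq_euclidean(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def geometric_variability(sample):
    """V_hat(G): sum of squared distances over all ordered pairs, divided by 2 n^2."""
    n = len(sample)
    total = sum(sq_euclidean(a, b) for a in sample for b in sample)
    return total / (2 * n * n)

def proximity(y, sample):
    """phi_hat^2(y, G): mean squared distance from y to the sample, minus V_hat(G)."""
    mean_sq = sum(sq_euclidean(y, b) for b in sample) / len(sample)
    return mean_sq - geometric_variability(sample)

# Made-up sample: the four corners of a square with side 2.
sample = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
v_hat = geometric_variability(sample)          # equals the trace of the 1/n covariance
phi2_center = proximity((1.0, 1.0), sample)    # the centre of the square
phi2_far = proximity((10.0, 10.0), sample)     # a point far from the group
```

For this sample each coordinate has variance 1 (with the 1/n convention), so the estimate agrees with the trace of the covariance matrix, and the proximity of the centre is zero while that of the distant point is large.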
Definition 3 For each observation $y_i$, the depth function $I(y_i, G)$ is defined by

$$I(y_i, G) = \left[ 1 + \frac{\phi^2(y_i, G)}{V(G)} \right]^{-1}; \qquad (1)$$

see [6].
Proposition 4 Function $I$ takes values in $[0, 1]$ and, according to [7], it is a type
C depth function. Furthermore, it verifies the following desirable properties: (i)
maximality at center; (ii) monotonicity relative to the deepest observation; (iii) vanishing
at infinity; and (iv) depending on the data and the selected distance, it is
affine-invariant.
As $I$ is a depth function, it assigns to any observation a degree of centrality; thus a
small value of $I$, or equivalently a large value of $O = 1/I$, suggests a possible extreme
observation. Note that, by (1) and the sample estimators of $V$ and $\phi^2$,
$\hat{\phi}^2(y_i, G) + \hat{V}(G) = \frac{1}{n} \sum_j \delta^2(y_i, y_j)$, so that
$\hat{O}(y_i, G) = 2n \sum_j \delta^2(y_i, y_j) / \sum_{j,k} \delta^2(y_j, y_k)$, where the double sum
runs over all ordered pairs $(j, k)$.
However, with only one observation taking a very large value, $\hat{O}$ already gives
aberrant values. For this reason, we propose the following version of $\hat{O}(y_i, G)$
in which, for robustness, the mean is replaced by the median.
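Definition 5 For each observation $y_i$ a new statistic $O_R(y_i, G)$ is defined by

$$O_R(y_i, G) = \frac{\mathrm{med}_{\delta,i}}{\mathrm{med}_{\delta}}, \qquad (2)$$

where $\mathrm{med}_{\delta} = \mathrm{median}_{j,k}(\delta^2_{jk})$ and $\mathrm{med}_{\delta,i} = \mathrm{median}_{j}(\delta^2_{ij})$.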
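
To make the contrast between the mean-based and the median-based statistic concrete, here is a small Python sketch on made-up one-dimensional data. It rests on one assumption the text does not settle: the medians leave out self-distances (pairs with $j = k$ or $j = i$). All names are illustrative.

```python
from statistics import median

def squared_distances(values):
    """Full n x n matrix of squared distances for 1-D data."""
    return [[(a - b) ** 2 for b in values] for a in values]

def o_hat(d2, i):
    """Mean-based outlyingness 1 / I_hat for observation i."""
    n = len(d2)
    v_hat = sum(sum(row) for row in d2) / (2 * n * n)   # geometric variability
    return (sum(d2[i]) / n) / v_hat                     # (phi_hat^2 + V_hat) / V_hat

def o_robust(d2, i):
    """Median-based O_R; self-distances are excluded (an assumption)."""
    n = len(d2)
    med_i = median(d2[i][j] for j in range(n) if j != i)
    med_all = median(d2[j][k] for j in range(n) for k in range(j + 1, n))
    return med_i / med_all

values = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]   # made-up data with one gross outlier
d2 = squared_distances(values)
o_mean = [o_hat(d2, i) for i in range(len(values))]
o_med = [o_robust(d2, i) for i in range(len(values))]
```

With the estimators written this way the mean-based values always average to 2 over the sample, so a single huge distance changes every value at once, whereas the median-based ratio for the five regular points stays at or below 1 here and only the outlier stands out.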


Proposition 6 Let $S = \{y_1, \ldots, y_n\}$ be a sample, and let $y_0$ be an outlier. For a
fixed observation, say $y_1$, the sensitivity curve of $O_R(y_1, S)$ at point $y_0$,
$SC(y_0) = O_R(y_1, S') - O_R(y_1, S)$, where $S' = \{y_1, \ldots, y_n, y_0\}$, is bounded, which
implies the robustness of $O_R$.
Proposition 7 Let $\delta$ be a distance function such that $\delta(y_i, y_j) \to \infty$ when $y_i$ or $y_j$
takes arbitrarily large values. The breakdown point of $O_R$ (the largest number of
arbitrarily large observations that $O_R$ can handle before giving arbitrarily large values)
is $n - 1/2 - \sqrt{2n^2 - 6n + 1}/2$, with $n$ the sample size. Note that, as a proportion
of $n$, the breakdown point of $O_R$ is always greater than 25%.
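
The 25% claim can be checked numerically. The sketch below assumes the breakdown expression reads $n - 1/2 - \sqrt{2n^2 - 6n + 1}/2$ and that it counts observations, so it is divided by $n$ to obtain a proportion; both points are a reading of the typeset formula, and the function names are illustrative.

```python
from math import sqrt

def breakdown_count(n):
    """n - 1/2 - sqrt(2 n^2 - 6 n + 1) / 2, the assumed reading of Proposition 7."""
    return n - 0.5 - sqrt(2 * n * n - 6 * n + 1) / 2

def breakdown_fraction(n):
    """Breakdown point expressed as a proportion of the sample size n."""
    return breakdown_count(n) / n

# Proportions for a few sample sizes; they decrease towards 1 - 1/sqrt(2).
fractions = {n: round(breakdown_fraction(n), 4) for n in (5, 10, 50, 100, 1000)}
limit = 1 - 1 / sqrt(2)
```

Under this reading the proportion decreases with $n$ towards $1 - 1/\sqrt{2} \approx 0.293$, which is consistent with the statement that it always exceeds 25%.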

Note that the distribution of O R is not symmetric.

Definition 8 Following Kimber’s criterion (see [8]), an observation $y_i$ will be considered
an extreme observation if

$$O_R(y_i) > \lambda = Q_3 + 1.5\,(Q_3 - M), \qquad (3)$$

where $Q_3$ and $M$ are the third quartile and the median of all the $O_R$ values.
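
A minimal sketch of rule (3) in Python follows. The $O_R$ values are made up, and the quartile convention (the "inclusive" method of the standard `statistics` module) is an assumption, since the text does not say how $Q_3$ is computed.

```python
from statistics import median, quantiles

def kimber_threshold(o_values):
    """Upper threshold Q3 + 1.5 * (Q3 - M) from rule (3)."""
    q3 = quantiles(o_values, n=4, method="inclusive")[2]   # third quartile
    m = median(o_values)
    return q3 + 1.5 * (q3 - m)

def extreme_indices(o_values):
    """Indices of the observations whose O_R value exceeds the threshold."""
    lam = kimber_threshold(o_values)
    return [i for i, o in enumerate(o_values) if o > lam]

# Made-up O_R values: seven unremarkable observations and one large value.
o_values = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 3.0]
threshold = kimber_threshold(o_values)
flagged = extreme_indices(o_values)
```

Because the rule uses the distance from the median to the upper quartile only, it adapts to the right-skewed shape of the $O_R$ distribution instead of assuming symmetry.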

Our simulation studies show that the procedure is robust against the masking effect
and that it can properly identify most of the outliers when mixed data are analyzed.

3 Application to Data in Autism Multiplex Families

Now consider the following study [9] in which 10 autism multiplex families were
analyzed (nine with two affected sibs and one with three affected sibs). First, in
a clinical study, five features were measured in 21 affected individuals: two were
continuous (age and non-verbal intelligence quotient (NVIQ)), and three were categorical
(gender, language delay and autism spectrum category). Using (3) and the
Gower distance [10], the threshold value was $\lambda = 1.519$, and four individuals could
be considered as extreme observations. Three of them were male (13, 17 and 20 years
old) with autism and language delay, and they presented NVIQ values indicative of
mental retardation. The most emblematic extreme value corresponded to another
man (25 years old), also with an autism diagnosis and language delay, and presenting
the smallest NVIQ value. Thus, our method highlighted the four individuals from our
study that had the most severe clinical presentation of the disorder. In a second study,
a genetic analysis was performed in the 21 affected individuals and in their parents.
The full exome sequence (the fraction of the genome that encodes proteins, approximately
$3.4 \times 10^7$ nucleotide positions from 20,000 genes) of all family members
was determined. We selected those rare genetic variants (infrequent in the general
population) leading to an amino acid change in the encoded protein that were transmitted
from one parent to the two (or three) affected sibs. The identified mutations,
an average of 36.3 per family, were ranked according to their predicted damaging
effect using the SIFT and PolyPhen-2 tools. In this case, no extreme observations
were detected. This result is consistent with the fact that this type of mutation may
not have a major role in the aetiology of the disorder (as compared to mutations leading
to truncated proteins, not considered here) in the sample of multiplex families
reported previously in [9].
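
For mixed clinical features such as those above, a Gower-type dissimilarity can be sketched as follows. This is an illustrative version only: it averages range-scaled absolute differences for numeric features and 0/1 mismatches for categorical ones, which is one common form (some authors use the square root of this quantity), and the three records are invented values, not the study data.

```python
def gower_distance(a, b, numeric_ranges):
    """Average per-feature dissimilarity between two records (dicts).

    Numeric features contribute |a - b| / range; all other features contribute
    0 when equal and 1 otherwise. Ranges are assumed to be non-zero.
    """
    total = 0.0
    for key, value in a.items():
        if key in numeric_ranges:
            total += abs(value - b[key]) / numeric_ranges[key]
        else:
            total += 0.0 if value == b[key] else 1.0
    return total / len(a)

# Invented records with two numeric and two categorical features.
people = [
    {"age": 13, "nviq": 60, "gender": "M", "language_delay": True},
    {"age": 25, "nviq": 40, "gender": "M", "language_delay": True},
    {"age": 9, "nviq": 100, "gender": "F", "language_delay": False},
]
ranges = {
    key: max(p[key] for p in people) - min(p[key] for p in people)
    for key in ("age", "nviq")
}
d01 = gower_distance(people[0], people[1], ranges)
d12 = gower_distance(people[1], people[2], ranges)
```

The resulting pairwise values can be squared and fed to the median-based statistic of Definition 5 in the same way as Euclidean distances.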

4 Application to Gene Expression Data in Cancer

In biomedical studies, an important task is to select informative genes that present
altered expression levels in the diseases under study. Selecting adequate marker
genes may be useful in classifying new samples. We considered 26 public
microarray cancer data sets
(http://bioinformatics.rutgers.edu/Static/Supplements/CompCancer/datasets.htm),
which are high-dimensional, class-imbalanced data sets. We compared the results of
a linear discriminant analysis using the original
set of genes and the extreme genes identified under criterion (3). As an evaluation
criterion, we considered the rate of correct classification obtained by the leave-one-out
method. Using only the extreme genes detected by (3), the rate of correct classification
was, in general, maintained or even improved (see Table 1). It is important to note
that the reduction in the number of marker genes facilitates the interpretation of their
biological meaning with regard to the disease.
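
The leave-one-out comparison can be sketched in a few lines of Python. To stay self-contained, the sketch swaps linear discriminant analysis for a simple nearest-centroid classifier, so it illustrates the evaluation loop rather than the classifier used in the study. The toy data, the labels and the function names are all invented.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x (squared Euclidean)."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [r for r, l in zip(train_x, train_y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(
        centroids,
        key=lambda l: sum((a - b) ** 2 for a, b in zip(x, centroids[l])),
    )

def loo_correct_rate(xs, ys, columns):
    """Leave-one-out rate of correct classification (in %) using the chosen columns."""
    hits = 0
    for i in range(len(xs)):
        train_x = [[r[c] for c in columns] for k, r in enumerate(xs) if k != i]
        train_y = [l for k, l in enumerate(ys) if k != i]
        held_out = [xs[i][c] for c in columns]
        hits += nearest_centroid_predict(train_x, train_y, held_out) == ys[i]
    return 100.0 * hits / len(xs)

# Toy expression matrix: rows are samples, columns are genes.
# Column 0 separates the classes; columns 1 and 2 are noise.
xs = [[0.0, 5, 1], [0.2, 1, 4], [0.1, 3, 2], [1.0, 4, 1], [1.2, 2, 5], [0.9, 1, 3]]
ys = ["A", "A", "A", "B", "B", "B"]
rate_all = loo_correct_rate(xs, ys, columns=[0, 1, 2])
rate_selected = loo_correct_rate(xs, ys, columns=[0])
```

Comparing `rate_all` with `rate_selected` mirrors the two classification-rate columns of Table 1: the same samples are classified once with every gene and once with the reduced list.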

Table 1 Columns: cancer data sets; classes ($k$); samples ($n$); original genes ($p$); extreme genes
selected by our criterion ($NG$); total leave-one-out classification rate, in percentage, using all genes
($CR_{all}$) and using the reduced list of genes ($CR_{sel}$)

Data set             k    n     p    NG   CR_all   CR_sel
Alizadeh-2000-v1     2   42  1095   118    90.48    92.86
Alizadeh-2000-v2     3   62  2093   306    98.39    98.39
Armstrong-2002-v1    2   72  1081   193    91.67    98.61
Armstrong-2002-v2    3   72  2194   391    88.89    91.67
Bittner-2000-V1      2   38  2201   279    76.32    84.21
Bittner-2000-V2      3   38  2201   279    63.16    65.79
Bredel-2005          3   50  1739   238    84.00    84.00
Dyrskjot-2003        3   40  1203   217    75.00    82.50
Garber-2001          4   66  4553   391    81.82    83.33
Golub-1999-v1        2   72  1877   321    88.89    90.28
Golub-1999-v2        3   72  1877   321    84.72    88.89
Gordon-2002          2  181  1626   290   100.00    96.69
Khan-2001            4   83  1069   165    53.01    65.06
Laiho-2007           2   37  2202   414    81.08    86.49
Lapointe-2004-v1     3   69  1625   170    81.16    72.46
Lapointe-2004-v2     4  110  2496   249    80.91    70.00
Liang-2005           3   37  1411   179    94.59   100.00
Nutt-2003-v1         4   50  1377   320    72.00    70.00
Nutt-2003-v2         2   28  1070   173    89.29   100.00
Nutt-2003-v3         2   22  1152   246   100.00    90.91
Pomeroy-2002-v1      2   34   857   126    76.47    79.41
Risinger-2003        4   42  1771   255    71.43    71.43
Shipp-2002-v1        2   77   798   137    85.71    75.32
Tomlins-2006-v2      4   92  1288   129    83.70    84.78
West-2001            2   49  1198   180    75.51    75.51
Yeoh-2002-v1         2  248  2526   315    87.90    95.97

Acknowledgements This research was partially supported by Grant 2014 SGR 464 (GRBIO)
from the Departament d’Economia i Coneixement de la Generalitat de Catalunya; by the Basque
Government Research Team Grant (IT313-10) SAIOTEK Project SA-2013/00397; and by the
University of the Basque Country UPV/EHU (Grant UFI11/45 (BAILab)).

References

1. C.M. Cuadras and J. Fortiana, “A continuous metric scaling solution for a random variable”,
J. Multivariate Anal. 32 (1995), 1–14.
2. C.R. Rao, “Diversity: its measurement decomposition apportionment and analysis”, Sankhya
Indian J. Stat. 44 (1982), 1–22.
3. C. Arenas and C.M. Cuadras, “Some recent statistical methods based on distances”, Cont. Sci. 2
(2002), 183–191.
4. I. Irigoien and C. Arenas, “INCA: new statistics for estimating the number of clusters and
identifying atypical units”, Stat. Med. 27 (2008), 2948–2973.
5. I. Irigoien, B. Sierra, and C. Arenas, “ICGE: an R package for detecting relevant clusters and
atypical units in gene expression”, BMC Bioinformatics 13 (2013), 30–41.
6. I. Irigoien, F. Mestres, and C. Arenas, “The depth problem: identifying the most representative
units in a data group”, IEEE ACM T Comput Bi 10 (2013), 161–172.
7. R. Serfling and S. Zuo, “General notions of statistical depth function”, Ann. Stat. 28 (2000),
461–482.
8. A.C. Kimber, “Exploratory data analysis for possibly censored data from skewed distributions”,
Appl. Stat. 39 (1990), 21–30.
9. C. Toma, B. Torrico, A. Hervás, R. Valdés-Mas, A. Tristán-Noguero, V. Padillo, M. Maristany,
M. Salgado, C. Arenas, X.S. Puente, M. Bayés, and B. Cormand, “Exome sequencing in
multiplex autism families suggests a major role for heterozygous truncating mutations”, Mol
Psychiatr 19 (2014), 784–790.
10. J.C. Gower, “A general coefficient of similarity and some of its properties”, Biometrics 27
(1971), 857–871.
An Ordinal Joint Model for Breast Cancer

Carmen Armero, Carles Forné, Montserrat Rué, Anabel Forte,
Hector Perpiñán, Guadalupe Gómez and Marisa Baré

Abstract We propose a Bayesian joint model to analyze the association between
longitudinal measurements of an ordinal marker and time to a relevant event. The
longitudinal process is defined in terms of a proportional-odds cumulative logit model
and the time-to-event process through a left-truncated Cox proportional hazards
model with information of the longitudinal marker and baseline covariates. Both
longitudinal and survival processes are connected by a common vector of random
effects.

C. Armero (corresponding author) · A. Forte · H. Perpiñán
Department of Statistics and Operational Research, Universitat de València, València, Spain
C. Forné
Department of Basic Medical Sciences, Universitat de Lleida, IRB-Lleida and Oblikue
Consulting, Lleida, Spain
M. Rué
Department of Basic Medical Sciences, Universitat de Lleida, IRB-Lleida, Lleida, Spain
H. Perpiñán
Fundación para el Fomento de la Investigación Sanitaria y Biomédica (FISABIO),
Generalitat Valenciana, València, Spain
G. Gómez
Department of Statistics and Operations Research, Universitat Politècnica de Catalunya,
Barcelona, Spain
M. Baré
Clinical Epidemiology and Cancer Screening, Corporació Sanitària Parc Taulí-UAB,
Sabadell, Spain

© Springer International Publishing AG 2017
E.A. Ainsbury et al. (eds.), Extended Abstracts Fall 2015,
Trends in Mathematics 7, DOI 10.1007/978-3-319-55639-0_2
