Introduction to data analysis

Article in Journal of Small Animal Practice · September 2008


DOI: 10.1111/j.1748-5827.2008.00647.x · Source: PubMed

1 author:

Vicki Jean Adams

EDITORIAL

Introduction to data analysis


Data analysis encompasses the use of statistical methods to describe data, test hypotheses and estimate measures of effect such as risk, relative risk and survival probabilities. As a post-graduate student in Canada, I had two excellent teachers who introduced me to the world of epidemiology and statistics. I still remember their words and have paraphrased some of them here to explain how I developed my approach to clinical research.

An approach to your own research
Ask yourself: "What is my hypothesis?"
              "What am I really trying to show?"
              "Can I simplify things?"
Don't ask too many questions - keep it simple.
From the hypothesis, you can create a table of what you expect the results will look like and start to think about what numbers will go into the table.

First, the purpose of the study is stated - often as a research question phrased in the form of a testable hypothesis. Then the study is designed to collect the appropriate data. Once a study has been completed, the data are entered into an electronic database, spreadsheet or statistical package for analysis. Data are normally entered with the variables in columns and the cases in rows, such that the data for each study subject are entered in one row.
A recent study (de Risio and others 2008) investigated the association of clinical and magnetic resonance imaging (MRI) findings with outcome in dogs with presumed ischaemic myelopathy. This study is classified as a retrospective case series (Cardwell 2008) since the exposures (clinical and MRI findings) were recorded at presentation at a referral hospital and the outcome of interest (clinical outcome) was evaluated at a later time. Another study (Koffas and others 2008) used colour M-mode tissue Doppler imaging to detect differences in the myocardium of healthy cats and cats with hypertrophic cardiomyopathy. This study is classified as a cross-sectional study (Cardwell 2008) since the exposure (disease status) and outcome of interest (measurements from tissue Doppler imaging) were evaluated at the same time.

Types of data and variables
Variables are defined in terms of the type of data they represent and are classified as either categorical (discrete) or continuous. The NOIR system is commonly used to define the type of data as nominal, ordinal, interval or ratio (Table 1). A variable can also be considered to be dependent or independent. The value of a dependent variable depends on (or can be predicted by) the value of some other variable. Thus, a dependent variable is also called an outcome or response variable. One of the dependent or response variables that was examined in the ischaemic myelopathy study was the outcome of the cases. Outcome was defined as being either successful or unsuccessful (a dichotomous categorical variable). The dependent or response variables that were examined in the echocardiography study included tissue Doppler imaging measurements such as myocardial velocity gradient and mean myocardial velocities (continuous variables).

Three steps from research question to statistical model
1. Consider the type (NOIR) of the dependent (outcome or response) variable.
   i. Categorical
      binary/dichotomous (nominal), e.g. alive or dead at the end of the study
      2 categories (ordinal), e.g. disease absent (0) or present (1)
      >2 categories (nominal), e.g. blood group (A, B, AB)
      ranked categories (ordinal), e.g. cancer stage (I, II, III)
   ii. Continuous
      interval, e.g. Glasgow Coma Scale score (1-18)
      ratio, e.g. red blood cell count
2. Consider the type (NOIR) of independent (exposure or predictor) variable(s) as above.
3. Choose the statistical test(s) appropriate to the type of dependent and independent variables that you plan to include in your study (Table 2).

An independent variable is defined as an explanatory variable that is measured and hypothesised to be associated with an outcome of interest (dependent variable). Thus, an independent variable is also called an exposure or predictor variable. In the ischaemic myelopathy study, the independent or exposure variables included: neuroanatomic location of the lesion, treatment prior to referral, upper vs lower motor neuron signs on presentation and whether the lesion was symmetrical or not (nominal categorical variables with two or more unordered categories). In the analysis of data from the echocardiography study, the main

Table 1. NOIR system of classification of types of data

Variable     Data type  Description                                   Examples
Categorical  Nominal    Named categories with no implied order        Blood groups, breed, gender, neuter status
             Ordinal    Ordered categories where the differences      Scoring systems, cancer staging, onset of
                        between categories are not necessarily equal  disease (peracute, acute, chronic)
Continuous   Interval   Equal distances between values but the zero   IQ, ordinal data with equal-appearing
                        point is arbitrary                            categories
             Ratio      As above for interval and a meaningful zero;  Weight, age, temperature, blood pressure
                        data usually obtained by measurement
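
The classification in Table 1 can also be written down as a small lookup, which may be a useful check before choosing an analysis. The sketch below is illustrative only: the `Noir` enum, the `broad_class` helper and the variable names are assumptions made for this example and do not come from the editorial; the type given to each example variable follows the examples listed in Table 1.

```python
from enum import Enum


class Noir(Enum):
    """The four NOIR data types listed in Table 1."""
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


def broad_class(data_type):
    """Collapse a NOIR type into the categorical/continuous split of Table 1."""
    if data_type in (Noir.NOMINAL, Noir.ORDINAL):
        return "categorical"
    return "continuous"


# Hypothetical variable names; the assigned types follow the Table 1 examples.
example_variables = {
    "blood_group": Noir.NOMINAL,
    "cancer_stage": Noir.ORDINAL,
    "iq": Noir.INTERVAL,
    "body_weight_kg": Noir.RATIO,
}

for name, data_type in example_variables.items():
    print(f"{name}: {data_type.value} ({broad_class(data_type)})")
```

Only the broad class (categorical or continuous) is needed to pick a row or column in a table of statistical methods, which is why the helper discards the finer NOIR distinction.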

Journal of Small Animal Practice, Vol 49, August 2008. © 2008 British Small Animal Veterinary Association

Table 2. Basic statistical methods table

                        Y = dependent or outcome variable
X = independent or      Categorical (a)                     Continuous (b)
exposure variable(s)
  Categorical           Contingency tables and chi-square   T-tests for 2 groups, analysis of
                        or Fisher's exact tests, logistic   variance (ANOVA) for >2 groups
                        regression
  Continuous            Logistic regression                 Correlation, linear regression

(a) Used to evaluate whether or not an event occurred
(b) Used to evaluate how much of an outcome occurred
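
Table 2 can be read as a lookup keyed by the type of outcome and the type of exposure. The sketch below encodes it that way and then works through the categorical-by-categorical cell with a Pearson chi-square statistic computed by hand. It is a minimal illustration, not the analysis used in either study: the names `TABLE_2`, `suggest_methods` and `chi_square_statistic` are assumptions for this example, and the 2 x 2 counts are invented purely to show the arithmetic.

```python
# Table 2 as a lookup keyed by (outcome type, exposure type).
TABLE_2 = {
    ("categorical", "categorical"): [
        "contingency table with chi-square or Fisher's exact test",
        "logistic regression",
    ],
    ("categorical", "continuous"): ["logistic regression"],
    ("continuous", "categorical"): ["t-test (2 groups)", "ANOVA (>2 groups)"],
    ("continuous", "continuous"): ["correlation", "linear regression"],
}


def suggest_methods(outcome, exposure):
    """Return the Table 2 methods for the given outcome and exposure types."""
    return TABLE_2[(outcome, exposure)]


def chi_square_statistic(table):
    """Pearson chi-square statistic (no continuity correction) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Invented counts (NOT from any study): rows = exposure present/absent,
# columns = successful/unsuccessful outcome.
made_up_counts = [[20, 10],
                  [10, 20]]

print(suggest_methods("categorical", "categorical"))
print(f"chi-square = {chi_square_statistic(made_up_counts):.3f}")
```

With these invented counts every expected count is 15, so the statistic is 4 x 25/15, or about 6.67, which exceeds 3.84, the 5% critical value of the chi-square distribution with one degree of freedom. A statistical package would normally also report the P value, and Fisher's exact test is generally preferred when expected counts are small.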

independent or exposure variable was the cardiac disease status of the cats (normal or affected with hypertrophic cardiomyopathy), a binary categorical variable. Additional independent variables in the analysis included the R-R interval, age and weight (all continuous variables).

Statistical tests
Based on the classification of the independent and dependent variables, there are four basic types of data sets that can occur. For each of these there are different methods of statistical analysis available. The statistical models presented in Table 2 include methods for evaluating how much of an outcome occurred or whether or not an event occurred.
Therefore, in the ischaemic myelopathy study a contingency table or cross-tabulation with chi-square or Fisher's exact test is the appropriate approach to examine the association of each of the independent variables mentioned above with the dichotomous categorical outcome variable (successful or unsuccessful outcome). In the echocardiography study, with a categorical main exposure variable, several continuous independent variables and a continuous outcome variable, linear regression was an appropriate approach to the analysis. This study found statistically significant differences in several of the tissue Doppler imaging measurements.
This approach can be extended to all types of data, including "messy" data that might include repeated measurements on individual animals or unbalanced data sets due to missing data. The ability to extend this approach allows us to consider some of the more advanced methods of statistical analysis such as multiple regression (with ≥2 independent variables that can be a mix of continuous and categorical), mixed or multi-level models (with ≥2 levels of measurements that need to be taken into account, such as when looking at kittens within litters or clinicians within a practice) and survival analysis.

Vicki Adams
Animal Health Trust

Vicki Adams graduated from the Western College of Veterinary Medicine in Saskatoon in 1990 and went on to complete a one-year small animal internship at the University of Minnesota. After seven years in general and emergency small animal practice, she returned to the University of Saskatchewan to do research. Having obtained an MSc in the epidemiology of rabies in wildlife, Vicki completed a PhD in small animal epidemiology investigating owner compliance with veterinary recommendations and prescribed medications. Vicki started working at the Animal Health Trust in January 2003 and is currently Head of the Small Animal Epidemiology Unit.

Acknowledgements
With grateful thanks to Drs Carl Ribble and John Campbell for their very wise words.

References
CARDWELL, J. M. (2008) An overview of study design. Journal of Small Animal Practice 49, 217-218
DE RISIO, L., ADAMS, V., DENNIS, R., MCCONNELL, F. & PLATT, S. (2008) Association of clinical and magnetic resonance imaging findings with outcome in dogs suspected to have ischemic myelopathy: 50 cases (2000–2006). Journal of the American Veterinary Medical Association 233, 129-135
KOFFAS, H., DUKES-MCEWAN, J., CORCORAN, B. M., MORAN, C. M., FRENCH, A., SBOROS, V., SIMPSON, K., ANDERSON, T. & MCDICKEN, W. N. (2008) Colour M-mode tissue Doppler imaging in healthy cats and cats with hypertrophic cardiomyopathy. Journal of Small Animal Practice 49, 330-338

Further reading
DOHOO, I. R., MARTIN, W. & STRYHN, H. (2003) Veterinary epidemiologic research. Atlantic Veterinary College Inc. University of Prince Edward Island, Prince Edward Island, Canada, 706 pp
HULLEY, S. B., CUMMINGS, S. R., BROWNER, W. S., GRADY, D., HEARST, N. & NEWMAN, T. B. (2001) Designing clinical research: An epidemiologic approach. 2nd edn. Lippincott Williams & Wilkins, Philadelphia, PA, USA
KATZ, M. H. (2006) Multivariable Analysis. A Practical Guide for Clinicians. 2nd edn. Cambridge University Press, Cambridge, UK
PETT, M. A. (1997) Nonparametric Statistics For Health Care Research. Sage Publications, London, UK
PFEIFFER, D. U. (2002) Veterinary Epidemiology - An Introduction. Available from: http://www.vetschools.co.uk/EpiVetNet/epidivision/Pfeiffer/files/Epinotes.pdf (accessed 9 July 2008). Royal Veterinary College, Herts, UK
SACKETT, D. L., HAYNES, R. B., GUYATT, G. H. & TUGWELL, P. (1991) Clinical Epidemiology: A Basic Science for Clinical Medicine. 2nd edn. Little, Brown & Co., Boston, MA, USA
