Secondary Data Analysis
ICHTML 2020
Abstract. The article discusses the problem of using secondary data analysis (SDA) in educational
research. The definitions of SDA are analyzed; statistics on journal articles with secondary data
analysis in the fields of sociology, social work and education are discussed; the dynamics of articles with
data in the Journal of Peace Research from 1988 to 2018 are traced; the papers of the Ukrainian conference
“Implementation of European Standards in Ukrainian Educational Research” (2019) are analyzed. The
problems of training PhD students to use secondary data analysis in their dissertations are discussed:
sources of secondary data in the education field for Ukrainian PhD students are proposed, and a
model for training Ukrainian PhD students in secondary data analysis is offered. This model
consists of three components: the theory component includes the theoretical basis of secondary data analysis;
the practice component contains examples and tasks of using SDA in educational research with statistical
software and Internet tools; the third component is PhD student support in the process of thesis
writing.
* Corresponding author: [email protected]
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0
(http://creativecommons.org/licenses/by/4.0/).
SHS Web of Conferences 75, 04005 (2020) https://doi.org/10.1051/shsconf/20207504005
The third source is a population census in Ukraine. We use databases that contain Ukrainian census data since 1959 [19]. For example, one of the tasks is to build and compare the gender-age pyramids of the population of Ukraine in different years. It includes searching for the relevant data and building the pyramid with the standard Excel chart tools, with SPSS tools (Chart Builder, Histogram, Population Pyramid), and with the pyramid package of the R environment. The second task is the calculation of child care and grandparent care load coefficients and the visualization of their dynamics; it includes an introduction to the demographic passport of Ukraine [19].
The Demographic and Social Statistics / Education page of the State Statistics Service of Ukraine (http://www.ukrstat.gov.ua/) provides educational statistics on:
• Preschool educational institutions (1990-2018)
• Secondary education schools (1990-2018)
• Vocational schools (1990-2018)
• Institutions of higher education (1990-2019).
The Women and Men / Demographic and Social Statistics / Education page also presents gender data on:
• Pre-school education in 2017
• Secondary education schools and vocational schools in 2017
• Institutions of higher education in 2017
• Indices of gender parity among students of educational institutions of Ukraine
What are the advantages of using secondary data? We can save time and money; the datasets are ideal for use in classroom examples, course projects, master’s theses, dissertations and supplemental studies; the data may be of higher quality and more representative.
The disadvantages of using secondary data are: the data may not fit a particular research question; information regarding study design and data collection procedures may be scarce; the data may lack depth; the analysis may require knowledge of survey statistics and methods that is not generally provided by basic graduate statistics courses.
Researchers list [20] the following important steps in teaching SDA.
1. Develop student’s research question
2. Identify a secondary data set
3. Evaluate a secondary data set
• What was the aim of the original study?
• Who has collected the data?
• Which measures were employed?
• When was the data collected?
• What methodology was used to collect the data?
• Making a final evaluation
4. Prepare and analyse secondary data.
It is useful to relate these steps to three ways of using SDA: in isolation, in combination of two or more data sets, and in combination with primary data analysis.
What software is used for SDA? We can use software specifically developed for analysing complex survey data [12]. It is generally free, but may lack flexibility and be useful only for initial data analysis. Examples of such tools are PowerStats (http://nces.ed.gov/datalab/), Data Analysis System (DAS) (http://nces.ed.gov/das/), and AM Statistical Software (http://am.air.org/). We can also use general-purpose software that can account for complex sampling. These tools are usually commercial and expensive (except R). They are generally syntax-based and more flexible. Examples of such tools are SAS (certain analyses require the SUDAAN add-on), Stata, SPSS, Mplus and others.
In the R environment there is a special package called “survey” [21]. The package is oriented towards the analysis of complex survey samples and provides the following features: summary statistics, two-sample tests, rank tests, generalized linear models, cumulative link models, Cox models, log-linear models, and general maximum pseudo-likelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. It also supports variance estimation by Taylor series linearization or replicate weights, post-stratification, calibration, and raking, as well as two-phase subsampling designs, graphics, PPS sampling without replacement, principal components and factor analysis. Students therefore need substantial training in order to be able to use this package.
The next section discusses how the use of secondary data analysis is presented in journal articles, and how articles are accompanied by data sets.

2.2. Presenting secondary data analysis and quantitative methods in the journal article

The British scientist E. Smith [4] explores the use of quantitative methods and of numeric secondary data analysis in educational research.
She reviewed the published output of eight well-regarded journals in the fields of Education, Sociology and Social Work over a seven-year period (Table 1). Those journals were:
In the Education field
• British Educational Research Journal
• Oxford Review of Education
• Research Papers in Education
In the Sociology field
• British Journal of Sociology
• Sociology
• Sociological Review
In the Social Work field
• British Journal of Social Work
• International Social Work

Table 1. The number of papers using secondary data analysis and quantitative methods (E. Smith [4, p. 327])

Journal                 Secondary data analysis   Quantitative methods   Total papers
Education journals      80                        192                    627
Sociology journals      89                        119                    706
Social work journals    33                        181                    683
All journals            202                       492                    2016
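The shares implied by the counts in Table 1 can be recomputed with a few lines of code. The following is a minimal sketch in Python, not part of the original study; the names `TABLE_1`, `percent` and `summarise` are hypothetical and introduced only for illustration, and the counts are transcribed from Table 1.

```python
# Counts transcribed from Table 1 (E. Smith [4, p. 327]).
# Each entry: (secondary data analysis, quantitative methods, total papers).
TABLE_1 = {
    "Education journals": (80, 192, 627),
    "Sociology journals": (89, 119, 706),
    "Social work journals": (33, 181, 683),
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def summarise(table):
    """For each row and for the pooled row, compute the share of papers using
    quantitative methods and the share of those papers that use SDA."""
    rows = dict(table)
    # Column sums reproduce the "All journals" row of the table.
    rows["All journals"] = tuple(sum(col) for col in zip(*table.values()))
    return {
        name: {
            "quantitative_share": percent(quant, total),
            "sda_share_of_quantitative": percent(sda, quant),
        }
        for name, (sda, quant, total) in rows.items()
    }


if __name__ == "__main__":
    for name, shares in summarise(TABLE_1).items():
        print(name, shares)
```

With these counts the pooled share of quantitative papers comes out at about 24%, and the per-field shares at roughly 31%, 27% and 17% for Education, Social work and Sociology respectively.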
About one quarter of all the papers (24%) reviewed by E. Smith used some form of quantitative method; of these, around 42% presented secondary data analysis. The use of quantitative methods ranged from 31% of papers in the ‘Education’ journals and 27% in the ‘Social work’ journals to 17% in ‘Sociology’ (Fig. 2).
researcher. The data for calculations for two journals are given in Table 2.

Table 2. Comparison of publications of two educational journals using SDA (calculated with data from [4]).

Journals                               Secondary data analysis, yes   Secondary data analysis, no   Total
                                       n      %                       n      %
British Educational Research Journal   34     12,4                    240    87,6                   274
Oxford Review of Education             30     13,6                    190    86,4                   220
Total                                  64                             430                           494
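The row percentages in Table 2 can be recomputed from the counts, and the two journals can be compared with a Pearson chi-square statistic. The sketch below is an illustration only: the source does not state which test, if any, was applied, so the chi-square comparison is an assumption, and the names `TABLE_2`, `row_percentages` and `pearson_chi_square` are hypothetical.

```python
# Counts transcribed from Table 2: (papers with SDA, papers without SDA).
TABLE_2 = {
    "British Educational Research Journal": (34, 240),
    "Oxford Review of Education": (30, 190),
}


def row_percentages(table):
    """Share of papers with and without SDA in each journal, in percent."""
    return {
        name: tuple(round(100 * n / sum(counts), 1) for n in counts)
        for name, counts in table.items()
    }


def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table.

    Assumption: no continuity correction is applied.
    """
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            # Expected count under independence of journal and SDA use.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


if __name__ == "__main__":
    print(row_percentages(TABLE_2))
    print(round(pearson_chi_square(TABLE_2), 3))
```

For a 2 x 2 table the statistic has one degree of freedom, so it would be compared with the critical value 3.84 at the 0.05 level; with these counts it falls far below that value, which suggests the two journals do not differ noticeably in their share of SDA papers.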
received an average of 52 articles in the last three years (2015-2017).

Table 3. Statistics about articles with data in Journal of Peace Research

Year   Number of articles with data   The average number of articles with data in a single issue
1984   1                              0,2
1998   10                             1,7
1999   22                             3,7
2008   28                             4,7
2014   41                             6,8
2015   52                             8,7
2016   55                             9,2
2017   49                             8,2
2018   45                             7,5

R 3.4.1 and Stata 13.1 software versions are required to reproduce this study; the following R packages need to be installed: “MatchIt”, “dplyr”, “ggplot2”, “haven”, “readr”, “xtable”, “tidyverse”, “RStata”. Among the files that accompany the article are .txt and .csv text files; R and Stata scripts; html files; and Stata (.dta) and R (.rda) data files.
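The per-issue averages in Table 3 and the three-year mean for 2015-2017 can be recomputed from the yearly counts. This is a minimal sketch under one assumption that the source does not state explicitly: six issues per year, which is the value the table's averages imply. The names `ARTICLES_WITH_DATA`, `ISSUES_PER_YEAR`, `per_issue_average` and `mean_over_years` are hypothetical.

```python
# Number of Journal of Peace Research articles with data, by year (Table 3).
ARTICLES_WITH_DATA = {
    1984: 1, 1998: 10, 1999: 22, 2008: 28, 2014: 41,
    2015: 52, 2016: 55, 2017: 49, 2018: 45,
}

# Assumption: six issues per year, as implied by the averages in Table 3.
ISSUES_PER_YEAR = 6


def per_issue_average(count, issues=ISSUES_PER_YEAR):
    """Average number of articles with data per issue, one decimal place."""
    return round(count / issues, 1)


def mean_over_years(counts, years):
    """Arithmetic mean of the yearly counts over the given years."""
    return sum(counts[year] for year in years) / len(years)


if __name__ == "__main__":
    for year, count in ARTICLES_WITH_DATA.items():
        print(year, count, per_issue_average(count))
    print(mean_over_years(ARTICLES_WITH_DATA, (2015, 2016, 2017)))
```

Under this assumption the computed averages match the third column of Table 3, and the mean for 2015-2017 equals the 52 articles quoted in the text.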
including archives of the results of sociological research (quantitative and qualitative); data from statistical agencies; global (international) indexes and their ratings of countries, cities, regions; data from national and international non-governmental research organizations, etc. The purpose is also to teach students to search for data in electronic archives; to acquaint students with the peculiarities of preparing data obtained from archives for analysis, and with the specifics and methods of secondary data analysis; and to provide basic knowledge of data management planning in empirical sociological projects and of preparing their own research data for placement in electronic archives of social data.
An analysis of the content of these courses showed that not all topics related to SDA (Fig. 1) were reflected in their programmes.
In addition, we have not found such courses for master’s and doctoral programs in the field of pedagogical sciences in Ukraine.

3 Conclusion

Advanced information technologies have made data resources more accessible and easier to research. Modern open access data initiatives provide wide opportunities for researchers. The most important initiatives are: UK Data Service (UKDS), Office for National Statistics, Organization for Economic Cooperation and Development (OECD), World Bank.
So, it is important to prepare future researchers for secondary data analysis using new computer tools and technologies. This is especially true for PhD students in the field of education. They should search for, analyze and interpret educational statistics in the framework of their dissertations.
The model of this training may consist of three components. The theory component includes the theoretical basis of secondary data analysis and the strengths and weaknesses of this methodology. The practice component contains examples and tasks of using SDA in educational research with computer tools (specialised and general). These two components are implemented in lectures, seminars and independent work in courses on research methods and courses on quantitative methods. The third component is implemented as PhD student support in the process of writing a dissertation and includes consultations, seminars and peer reviews.
In our opinion, a research methods course needs to contain a mandatory unit on SDA. The further development of the study is the integration of secondary data analysis into research methods courses for PhD students in the field of Education in Ukraine and building a model of their support at the stage of thesis writing. This model can be structural and content [28] or structural and functional.

References

1. Secondary Data Analysis Initiative (SDAI) - open call, https://esrc.ukri.org/funding/funding-opportunities/secondary-data-analysis-initiative-sdai-open-call (2020). Accessed 20 Feb 2020
2. J. Sobal, Teaching with Secondary Data. Teaching Sociology 8(2), 149–170 (1981). doi:10.2307/1316942
3. E. Smith, Pitfalls and promises: the use of secondary data analysis in educational research. British Journal of Educational Studies 56(3), 323–339 (2008). doi:10.1111/j.1467-8527.2008.00405.x
4. E. Smith, Using Secondary Data in Educational and Social Research (Open University Press, Maidenhead, Berkshire, 2008)
5. T.P. Vartanian, Secondary Data Analysis (Oxford University Press, New York, 2011)
6. Practical Methods for Secondary Data Analysis (2017), http://www.sph.umn.edu/site/docs/syllabi/Syllabi/2017/Fall/PubH-6617.pdf. Accessed 30 Dec 2019
7. T. Logan, A practical, iterative framework for secondary data analysis in educational research. The Australian Educational Researcher 47, 129–148 (2020). doi:10.1007/s13384-019-00329-z
8. V. Sherif, Evaluating Preexisting Qualitative Research Data for Secondary Analysis. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 19(2), Art. 7 May (2018). doi:10.17169/fqs-19.2.282
9. M.P. Johnston, Secondary Data Analysis: A Method of which the Time Has Come. Qualitative and Quantitative Methods in Libraries 3(3), 619–626 (2014), http://www.qqml-journal.net/index.php/qqml/article/view/169. Accessed 21 Mar 2020
10. J. Carter, S. Noble, A. Russell, E. Swanson, Developing statistical literacy using real-world data: investigating socioeconomic secondary data resources used in research and teaching. International Journal of Research & Method in Education 34(3), 223–240 (2011). doi:10.1080/1743727x.2011.609553
11. SAGE Publications, MethodSpace - Connecting the Research Community (2020), https://www.methodspace.com. Accessed 21 Mar 2020
12. N. Koziol, A. Arthur, An Introduction to Secondary Data Analysis. CYFS (2011), http://r2ed.unl.edu/presentations/2011/RMS/120911_Koziol/120911_Koziol.pdf. Accessed 30 Dec 2019
13. Archive of Educational Data, http://www.icpsr.umich.edu/IAED/index.html (2019). Accessed 30 Dec 2019
14. Open data ZNO 2019 (2019), https://zno.testportal.com.ua/yearstat/uploads/OpenDataZNO2019.7z. Accessed 30 Dec 2019
15. L. Panchenko, Methodology of using structural equation modeling in educational research. CEUR Workshop Proceedings 2393, 895–904 (2019), http://ceur-ws.org/Vol-2393/paper_411.pdf. Accessed 30 Dec 2019
16. Access the full survey by TALIS methodology raw data (in SPSS): TEACHERS_DATA, https://drive.google.com/open?id=1bzh6U7MnOaFSt_1CV1BsQndCuLX_WpWt. Accessed 20 Feb 2019
17. Questionnaires (in ukr): TEACHERS_Questionnaire, https://drive.google.com/open?id=1L6SHvqpMAGPzeLkp8Ksb9E-KXdnkd0sd. Accessed 20 Feb 2019
18. TALIS - The OECD Teaching and Learning International Survey (2019), http://www.oecd.org/education/talis/. Accessed 20 Feb 2019
19. L.F. Panchenko, Training Sociology Students in Computer Analysis of Demographic Processes and Structure. Information technologies and learning tools 65(3), 166–183 (2018). doi:10.33407/itlt.v65i3.2034
20. Bank danykh (Census) (2019), http://database.ukrcensus.gov.ua/MULT/Database/Census/databasetree_uk.asp. Accessed 30 Dec 2019
21. Package ‘survey’ (2020), https://cran.r-project.org/web/packages/survey/survey.pdf. Accessed 21 Mar 2020
22. Implementatsiia yevropeiskykh standartiv v ukrainski osvitni doslidzhennia (Implementation of European Standards in Ukrainian Educational Research), ed. by S. Shchudlo, O. Zabolotna, L. Zahoruiko. III International Scientific Conference of Ukrainian Association of Educational Research (21 Jun 2019). (TzOV “Trek-LTD”, Kyiv–Drohobych, 2019)
23. Journal of Peace Research, http://journals.sagepub.com/home/jpr. Accessed 30 Dec 2018
24. Data Access & Research Transparency, https://www.dartstatement.org. Accessed 30 Dec 2018
25. F. Haass, Better peacekeepers, better protection? Troop quality of United Nations peace operations and violence against civilians. Journal of Peace Research 6 (2018)
26. L.F. Panchenko, Prozorist doslidzhen z pytan myru ta konfliktiv: Journal of Peace Research (Transparency of Peace and Conflict Studies: Journal of Peace Research), in The Great Wars - the Great Transformations: Conflicts and Peace in the 20th and 21st Centuries. Materials of IX International Scientific and Practice Conference (Kyiv, 26-27 Nov 2018). (Kyiv, 2018), pp. 87–89
27. Program course ‘Use of electronic archives of social data’ (Taras Shevchenko National University of Kyiv), http://www.soc.univ.kiev.ua/uk/courses. Accessed 30 Dec 2019
28. I. Lovyanova, K. Vlasenko, A. Krasnoschok, D. Dmytriiev, R. Shponka, Modeling of ICT Competence Formation of Would-be Mathematics Teacher. Information Technologies and Learning Tools 74(6), 186–200 (2019). doi:10.33407/itlt.v74i6.2421