Big Data and Social Media Analytics
© UCLES 2014
not enter early would have performed worse if they had taken two or more GCSEs early. Further research could also estimate the average treatment effect for the treated in the case of two treatment groups, to see if taking two or more GCSEs early is beneficial to these students or not.
Finally, it will be interesting to see the impact of GCSE reforms on the amount of early entry. Students will still be able to sit GCSEs in Year 10, but changes to accountability measures mean that only the result from the first sitting of a GCSE will count in performance tables. This is likely to lead to a fall in early entry because schools may want to wait until students are ready to achieve their best possible grade, rather than getting them to sit GCSEs early and then re-sit if they underperform.
References
Caliendo, M., & Kopeinig, S. (2008). Some practical guidance for the implementation of propensity score matching. Journal of Economic Surveys, 22(1), 31–72.
Gill, T. (2013). Early entry GCSE candidates: Do they perform to their potential? Research Matters: A Cambridge Assessment Publication, 16, 23–40.
McCaffrey, D.F., Ridgeway, G., & Morral, A.R. (2004). Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychological Methods, 9(4), 403–425.
Morgan, S.L., & Harding, D.J. (2006). Matching Estimators of Causal Effects: Prospects and Pitfalls in Theory and Practice. Sociological Methods & Research, 35(1), 3–60.
Ofsted (2013). Schools’ use of early entry to GCSE examinations. Its usage and impact. Manchester: Ofsted.
Rosenbaum, P.R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55.
12. Are we working with other departments within the organisation to develop a comprehensive policy?
1. Joint Electron Device Engineering Council memory standards
2. International Electrotechnical Commission units
There are many examples of how big data is being used in various fields. Whilst these are not directly associated with the field of education, they give us a picture of the impact of data in our day-to-day lives (Raconteur media, 2013). Examples include:
• IBM’s Deep Thunder weather analytics package: helps farmers know when to irrigate their crops;
• SAS: uses big data to identify fraud in the insurance sector;
• British Airways’ Know Me Programme: uses the data collected to get a better insight into personal preferences and buying patterns of its frequent fliers;
• Transport for Greater Manchester: uses real-time traffic information to avoid congestion on roads;
• Bank of America Merrill Lynch: creates practical and effective solutions for clients based on a more comprehensive and holistic understanding of their requirements;
• East Kent Hospitals University NHS Foundation Trust: staff are given access to data to adapt to real-time changes such as re-allocation of doctors and nurses between sites based on changes in demand across sites;
• Citi: estimates targeted predictive analytics according to customer behaviour;
• Public Health England: creates highly targeted treatments according to how patients respond in real time through a recently announced national cancer database (the data contains 11 million historical records and 350,000 new entries added every year);
• Ocado: delivers groceries purchased online. It keeps track of vehicle location, driving styles and petrol consumption while delivering 1.1 million items every week;
• Royal Dutch Shell: spends £650 million a year compiling big data across a number of sites so that they can more accurately predict the presence of hydrocarbon resources at a site; this may help save them drilling costs (a single offshore drilling can cost up to £65 million);
• Accenture: collects social media analytics for the purposes of sentiment analysis by using data and text mining, semantics, linguistics and syntax processing;
• Facebook: recently started to decode the content of photographs (identifying faces and objects) and video;
• Apple: granted a patent to collect data on body temperature and heart rate through audio buds;
• Google: tunes algorithms in language processing to be culturally relevant (for instance differentiating between American and British idioms) and is also improving its speech recognition capabilities;
• Temetra: collates information on how people use gas and water in their homes and businesses, giving them data every 15 minutes rather than an annual reading;
• Modak Analytics: mined about 18 terabytes of data on an 810 million electorate during the general elections in India held in April to May 2014 on various demographics such as gender, age, and economic status for their client, a political party (Kurmanath, 2014).
An interesting application of the use of big data in developing government policy is the Behavioural Insights Team [...] Government and Nesta (www.nesta.org.uk). This organisation brings together data from a range of inter-related academic disciplines (Behavioural Economics, Psychology, and Social Anthropology) to understand how individuals make decisions in practice and how they are likely to respond to options so as to enable the Government to design its policies or interventions accordingly.

Applications of big data in education
A large amount of data is being generated in schools and higher education. Big data in education could be used to:
• understand performance and behaviour patterns of students;
• keep track of student progress throughout their education, allowing timely intervention if any anomalies are noticed;
• develop personalised content and instructional methodologies for each student in order to provide remedial help without stigmatising or isolating students or embarrassing them in front of their peers;
• estimate how students will perform on standardised tests (i.e. predictive assessment);
• find out which instructional techniques work best for students and to provide customised teaching (i.e. diagnostic assessment);
• provide feedback in real time to help improve student performance;
• conduct adaptive testing;
• merge systems such as learning management and curriculum management;
• integrate ICT devices used by students in classrooms and homes, leading to a large amount of useful information about them under initiatives such as bring your own device (BYOD);
• combine various data sources such as course records, student attendance, class rosters, programme participation, degree attainment, discipline records and test scores, which could enable more efficient management of student recruitment, administration and academic research (Hoit, 2012; West, 2012).
In addition to the applications mentioned above, awarding bodies could use data for more comprehensive research in areas such as test development and marker monitoring. They could also make use of the large amounts of data likely to be generated by the use of computerised assessment and through other IT-enabled initiatives such as computerised, interactive systems for producing questions.

Educational courses in big data
McKinsey reports that by 2018 the United States alone will face a shortage of up to 190,000 people with analytical expertise and 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of big data (Manyika et al., 2011). A recent report prepared by e-skills UK³ for SAS suggests that over the next five-year period the average annual growth rate of demand for big data professionals in the UK is expected to be about 18% per annum (compared to 2.5% for IT staff). This would equate to the generation of approximately 28,000 job opportunities per annum (a total of 132,000) by 2017 (e-skills, 2013).
Various universities in the UK are offering MSc courses in big data/

3. The Sector Skills Council for Business and Information Technology based in the UK.
Big data and social media
Businesses thrive on understanding their customers to the greatest extent possible. The monitoring of people’s online behaviour is therefore becoming important for their success. Organisations are investing in gathering such analytics using big data as a key component for monitoring social media activity, particularly on social networking websites such as Facebook, Twitter and LinkedIn.
Social media analytics are the synthesis of the behaviour of internet users. The availability of data on consumers’ web browsing, online shopping behaviour, customers’ feedback and marketing research on social networks allows organisations to gain timely and extensive insights into consumers. Therefore, organisations can focus their market intelligence strategies based on different objectives such as advertising and product launches; publicity and brand management; promoting customer loyalty; providing personalised services to customers; keeping tabs on market trends and competitors; minimising risk; saving costs; and business expansion in general.

[...] to the development of new tools to access information and produce metrics about visibility of websites. It is possible to gather metrics such as countries/cities where website visitors were based, the web browsers they were using, the keywords they had used to search for a website and the webpages they had visited before and after accessing a particular website. Some such metrics are presented below.

Website rankings
Websites can be ranked to get an estimate of a website’s popularity relative to all other websites over a specified period of time (for instance, six months or one year). The ranks are provided by tools such as www.ranking.com and www.alexa.com. The lower the rank, the higher the popularity of the website (for instance, the rank of Google.com is 1 followed by Facebook.com and YouTube.com). The ranks could be used by organisations to estimate the popularity of their websites in general, as well as in comparison to their competitors. Figure 1 shows a comparison of the ranks of two websites, www.education.gov.uk and www.parliament.uk, from November 2013 to May 2014.
[Figure 1: line chart of website rank (vertical axis, roughly 40,000 to 55,000; a lower rank means more popular) by month (horizontal axis, Dec '13 to May '14) for education.gov.uk and www.parliament.uk.]
Figure 1: Historical traffic trends for the two websites from 12th November 2013 to 9th May 2014. Source: www.alexa.com (retrieved 12th May, 2014).
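The rank comparison described under Website rankings can be illustrated with a minimal sketch. It assumes monthly rank snapshots are already available as plain lists. The numbers below are invented for illustration and are not the values plotted in Figure 1, and the helper names (`more_popular`, `summarise`, `months_ahead`) are hypothetical rather than part of any ranking tool. The only rule taken from the text is that a lower rank means a more popular website.

```python
from statistics import mean

# Hypothetical monthly rank snapshots, Dec '13 to May '14.
# Invented values for illustration only; NOT the data behind Figure 1.
ranks = {
    "education.gov.uk": [48000, 47000, 45500, 44000, 46000, 47500],
    "www.parliament.uk": [52000, 51000, 50500, 49000, 50000, 51500],
}


def more_popular(rank_a, rank_b):
    """Return True if rank_a indicates a more popular site than rank_b.

    A lower rank means higher popularity (rank 1 is the most popular).
    """
    return rank_a < rank_b


def summarise(series):
    """Return the best (lowest), worst (highest) and mean rank of a series."""
    return {"best": min(series), "worst": max(series), "mean": mean(series)}


def months_ahead(site_a, site_b, data):
    """Count the months in which site_a was more popular than site_b."""
    return sum(more_popular(a, b) for a, b in zip(data[site_a], data[site_b]))


if __name__ == "__main__":
    for site, series in ranks.items():
        print(site, summarise(series))
    print(months_ahead("education.gov.uk", "www.parliament.uk", ranks))
```

Because ranks are ordinal, the mean rank is only a rough summary; counting the months in which one site outranks the other is often easier to interpret.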
[...] form of tables and interactive graphs which could be customised by the users. Some tools also [...]

Online traffic analytics
[...] be used for targeting their products and services. The metrics also allow the identification of those website sections which are popular with the visitors and those which are not, which in turn could help organisations improve their websites.

Table 2: Web analytics tools
WebSTAT (http://www.webstat.com/): Its distinctive trait is the measure of visitors' behaviour once on the website. This includes their drivers and conversions, such as the degree to which different landing pages are associated with online purchases.

Social media monitoring
Acknowledgements
We would like to thank our colleagues Tom Benton, Nick Raikes, Sylvia Green and Frances Wilson for their advice.

References
BBC (2013). The age of big data: BBC Horizon. Retrieved from http://www.youtube.com/watch?v=CO2mGny6fFs
Beyer, M. A., & Laney, D. (2012). The importance of 'Big Data': A definition (Gartner Report G00235055). Retrieved from https://www.gartner.com/doc/2057415?ref=clientFriendlyURL
BIG (2014). Big data public private forum. Retrieved from http://big-project.eu
Bradbury, D. (2013, June). Effective social media analytics. The Guardian. Retrieved from http://www.theguardian.com/technology/2013/jun/10/effective-social-media-analytics
Einav, L., & Levin, J.D. (2013). The data revolution and economic analysis (NBER Working Paper no. 19035). Retrieved from http://www.nber.org/papers/w19035
e-skills UK (2013). Big data analytics. An assessment of demand for labour and skills, 2012–2017 (E-skills UK report on behalf of SAS UK). Retrieved from https://www.e-skills.com/Documents/Research/General/BigDataAnalytics_Report_Jan2013.pdf
Raconteur Media (Ed.) (2013, September 4). Big data. The Times [supplemental material].
Shah, S. (2012). SAS launches academy to tackle demand for "£52,000 a year" big data specialists. Retrieved from http://www.computing.co.uk/ctg/news/2230956/sas-launches-academy-to-tackle-demand-for-gbp52-000-a-year-big-data-specialists
Swoyer, S. (2012). Big data – why the 3Vs just don't make sense. Retrieved from http://tdwi.org/articles/2012/07/24/big-data-4th-v.aspx
Villanova University (2014). What is big data? Retrieved from www.villanovau.com/university-online-programs/what-is-big-data
West, D. M. (2012). Big data for education: Data mining, data analytics, and web dashboards (Brookings paper). Retrieved from http://www.brookings.edu/research/papers/2012/09/04-education-technology-west
Wikipedia (2014a). Big data. Retrieved from http://en.wikipedia.org/wiki/Big_data
Wikipedia (2014b). Bounce rate. Retrieved from http://en.wikipedia.org/wiki/Bounce_rate
Wikipedia (2014c). Yotta. Retrieved from http://en.wikipedia.org/wiki/Yotta-
Wikipedia (2014d). Yottabyte. Retrieved from http://en.wikipedia.org/wiki/Yottabyte