ETI solved paper
Scheme - I
Sample Question Paper
Program Name : Diploma in Artificial Intelligence and Machine Learning
Program Code : AN
22684
Semester : Sixth
Course Title : Big Data Analytics
Marks : 70 Time: 3 Hrs.
Instructions:
Healthcare
The healthcare ecosystem consists of numerous entities including healthcare providers
(primary care physicians, specialists, or hospitals), payers (government, private health
insurance companies, employers), pharmaceutical, device and medical service companies, IT
solutions and services firms, and patients. The process of provisioning healthcare involves
massive healthcare data that exists in different forms (structured or unstructured), is stored in
disparate data sources (such as relational databases, or file servers) and in many different
formats. To promote more coordination of care across the multiple providers involved with
patients, their clinical information is increasingly aggregated from diverse sources into
Electronic Health Record (EHR) systems. EHRs capture and store information on patient
health and provider actions including individual-level laboratory results, diagnostic, treatment,
and demographic data. Though the primary use of EHRs is to maintain all medical data for an
individual patient and to provide efficient access to the stored data at the point of care, EHRs
can be the source for valuable aggregated information about overall patient populations [5, 6].
With the current explosion of clinical data, the problems of how to collect data from distributed and
heterogeneous health IT systems and how to analyze massive-scale clinical data have become
critical. Big data systems can be used to collect data from different stakeholders (patients,
physicians, payers, specialists, etc.) and disparate data sources (databases, structured and unstructured
formats, etc.). Big data analytics systems allow massive-scale clinical data analytics, facilitate
development of more efficient healthcare applications, improve the accuracy of predictions and
help in timely decision making.
Let us look at some healthcare applications that can benefit from big data systems:
• Epidemiological Surveillance: Epidemiological Surveillance systems study the distribution
and determinants of health-related states or events in specified populations and apply these
studies for diagnosis of diseases under surveillance at national level to control health
problems. EHR systems include individual-level laboratory results, diagnostic, treatment, and
demographic data. Big data frameworks can be used for integrating data from multiple EHR
systems and for timely analysis of that data to predict outbreaks effectively and accurately and to
support population-level health surveillance, disease detection and public health mapping.
• Patient Similarity-based Decision Intelligence Application: Big data frameworks can be
used for analyzing EHR data to extract a cluster of patient records most similar to a particular
target patient. Clustering patient records can also help in developing medical prognosis
applications that predict the likely outcome of an illness for a patient based on the outcomes
for similar patients.
• Adverse Drug Events Prediction: Big data frameworks can be used to analyze EHR
data and predict which patients are most at risk of an adverse response to a certain
drug, based on the adverse drug reactions of other patients.
• Detecting Claim Anomalies: Health insurance companies can leverage big data systems for
analyzing health insurance claims to detect fraud, abuse, waste, and errors.
• Evidence-based Medicine: Big data systems can combine and analyze data from a variety
of sources, including individual-level laboratory results, diagnostic, treatment and
demographic data, to match treatments with outcomes and predict which patients are at risk for a disease.
Systems for evidence-based medicine enable providers to base decisions not only on
their own perceptions but also on the available evidence.
• Real-time Health Monitoring: Wearable electronic devices allow non-invasive and
continuous monitoring of physiological parameters. These wearable devices may be in various
forms such as belts and wrist-bands. Healthcare providers can analyze the collected healthcare
data to determine any health conditions or anomalies. Big data systems for real-time data
analysis can be used for analysis of large volumes of fast-moving data from wearable devices
and other in-hospital or in-home devices, for real-time patient health monitoring and adverse
event prediction.
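The patient similarity application in the list above can be illustrated with a small sketch. It assumes each patient record has already been reduced to a normalised numeric feature vector and uses cosine similarity as the similarity measure; the source does not name a measure, so that choice, the patient ids and the feature values are all hypothetical. At big data scale the same ranking would be computed on a distributed framework rather than in plain Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar_patients(target, records, k=2):
    """Return the ids of the k records most similar to the target vector."""
    ranked = sorted(
        records.items(),
        key=lambda item: cosine_similarity(target, item[1]),
        reverse=True,
    )
    return [patient_id for patient_id, _ in ranked[:k]]


# Hypothetical normalised features: [age, blood pressure, glucose]
records = {
    "p1": [0.9, 0.8, 0.7],
    "p2": [0.1, 0.2, 0.9],
    "p3": [0.8, 0.9, 0.6],
}
target = [0.85, 0.85, 0.65]

print(most_similar_patients(target, records, k=2))
```

The outcomes recorded for the returned patients could then feed a prognosis estimate for the target patient.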
Joining: Combining data from two data frames based on a common key.
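A minimal sketch of what a key-based inner join computes is given here in plain Python, with each "data frame" represented as a list of row dictionaries. The function name, column names and sample rows are hypothetical. In practice the same operation would normally be done with a library call such as pandas `DataFrame.merge(other, on=...)` or PySpark `DataFrame.join(other, on=...)`.

```python
def inner_join(left, right, key):
    """Inner-join two lists of row dicts on a shared key column."""
    # Index the right-hand rows by key so each left row is matched quickly.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for left_row in left:
        # Rows whose key has no match on the right are dropped (inner join).
        for right_row in index.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


# Hypothetical sample data
patients = [
    {"patient_id": 1, "name": "Asha"},
    {"patient_id": 2, "name": "Ravi"},
    {"patient_id": 3, "name": "Meera"},
]
lab_results = [
    {"patient_id": 1, "test": "HbA1c", "value": 6.1},
    {"patient_id": 3, "test": "HbA1c", "value": 7.4},
]

joined = inner_join(patients, lab_results, "patient_id")
for row in joined:
    print(row)
```

Patient 2 has no lab result, so it does not appear in the joined output; a left join would keep it with empty result columns.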
e) Write and explain the Scala/Python code to create the Spark session.------(V)
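One possible Python answer is sketched below. It assumes the `pyspark` package is installed; the wrapper function `create_spark_session`, the application name and the `local[*]` master URL are illustrative choices, not part of the question. `SparkSession.builder` collects the configuration and `getOrCreate()` returns the active session if one exists, otherwise it starts a new one.

```python
def create_spark_session(app_name="BigDataAnalytics", master="local[*]"):
    """Create (or reuse) a SparkSession; requires the pyspark package."""
    # Imported inside the function so this file can still be loaded on a
    # machine where pyspark is not installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)   # name shown in the Spark web UI
        .master(master)      # "local[*]" runs Spark locally on all cores
        .getOrCreate()       # reuse the active session or start a new one
    )
```

A typical use would be `spark = create_spark_session()`, followed by data frame operations such as `spark.read.csv(...)`, and finally `spark.stop()` to release resources.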
c) Write syntax and example of Hive Query commands for following.---------- (IV)
(i) Create table
(ii) Alter Table
(iii) loading data into table from file
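The general HiveQL forms for the three commands are:
(i) CREATE TABLE [IF NOT EXISTS] table_name (col_name data_type, ...) [ROW FORMAT ...] [STORED AS file_format];
(ii) ALTER TABLE table_name RENAME TO new_name; or ALTER TABLE table_name ADD COLUMNS (col_name data_type, ...);
(iii) LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE table_name;

As a hedged example, the statements below are held as Python strings so they can be printed and pasted into the Hive shell. On a Spark session built with Hive support they could probably also be passed one at a time to `spark.sql(...)`, but that is an assumption about the setup. The table `employee`, its columns and the file path are hypothetical.

```python
# Hypothetical table, columns and file path, chosen only for illustration.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS employee (
    id INT,
    name STRING,
    salary FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
"""

# Adds a new column to the existing table definition.
ALTER_TABLE = "ALTER TABLE employee ADD COLUMNS (dept STRING)"

# LOCAL reads from the local file system; without it the path is on HDFS.
LOAD_DATA = (
    "LOAD DATA LOCAL INPATH '/tmp/employee.csv' "
    "OVERWRITE INTO TABLE employee"
)

HIVE_STATEMENTS = [CREATE_TABLE.strip(), ALTER_TABLE, LOAD_DATA]

for statement in HIVE_STATEMENTS:
    print(statement + ";")
```

OVERWRITE replaces any data already in the table; omitting it appends the file's rows instead.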
Scheme - I
Sample Test Paper - I
Program Name : Diploma in Artificial Intelligence and Machine
Learning
Program Code : AN
22684
Semester : Sixth
Course Title : Big Data Analytics
Marks : 20 Time: 1 Hour
Instructions:
Q.2) Attempt any THREE. 12 Marks
a) Explain Data Science.
Scheme - I
Sample Test Paper - II
Program Name : Diploma in Artificial Intelligence and Machine
Learning
Program Code : AN 22684
Semester : Sixth
Course Title : Big Data Analytics
Marks : 20 Time: 1 Hour
Instructions:
c) Write syntax for loading data into table from file in HIVE