Bsa Assignment
The main aim of healthcare is to prevent, diagnose, and treat health-related issues in human beings. Its
main components are health professionals and health facilities (clinics, hospitals, and other
treatment technologies). Each of these components generates various kinds of data, such as patient
medical history (diagnosis and prescription data), medical and clinical data (for example, data from
imaging and laboratory examinations), and other medical data. Previously, all of this data was stored in
traditional file systems. Nowadays, digitizing clinical exams and medical records has become a widely
adopted standard in healthcare systems.
Because big data in healthcare is huge and highly heterogeneous, conventional technologies extract
little information from it. For efficient analysis, the data must be stored in an easily accessible file
format on a distributed file system. To assist big data
Hadoop
The best logical approach to analyzing huge volumes of complex data is to distribute it and process
it in parallel on multiple nodes. However, data of this size can require thousands of computing machines
to finish processing in a reasonable amount of time. When working with thousands of nodes, one has to
handle issues such as how to distribute the data, how to parallelize the computation, and how to handle
failures. The most popular open-source distributed application for this purpose is Hadoop. The Hadoop
Distributed File System (HDFS) is the file system component that provides scalable, efficient,
replica-based storage of data across the nodes that form a cluster. The MapReduce engine and HDFS can
process thousands of terabytes of data. Hadoop thus helps to process and analyze the data quickly.
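As a rough illustration of the distribute-and-combine idea described above, the following Python sketch
counts diagnoses across partitions of records. It is not Hadoop code: the partitions, the diagnosis
labels, and the function names map_partition, merge_counts, and count_diagnoses are assumptions made for
this example, and a thread pool stands in for the cluster nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical records, already split into partitions the way a
# distributed file system would split a large file across nodes.
partitions = [
    ["diabetes", "asthma", "diabetes"],
    ["asthma", "hypertension"],
    ["diabetes", "hypertension", "hypertension"],
]


def map_partition(records):
    """Map step: count the diagnoses inside a single partition."""
    return Counter(records)


def merge_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def count_diagnoses(parts):
    """Run the map step on every partition in parallel, then reduce."""
    with ThreadPoolExecutor() as pool:
        partial_counts = list(pool.map(map_partition, parts))
    return reduce(merge_counts, partial_counts, Counter())


totals = count_diagnoses(partitions)
print(dict(totals))
```

Each partition is processed independently, so a failed partition could be recomputed without touching
the others, which is the property that makes this pattern suitable for large clusters.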
Hadoop in Monitoring Patient Vitals
Many large hospitals now use Hadoop to work more efficiently. These hospitals place sensors around the
patient's bed to capture and store patient vitals and behavior, such as blood pressure and cholesterol.
The sensors also capture the kinds of issues the patient is facing and report them. These sensors
generate a huge amount of data, which calls for Big Data tools such as Hadoop. The data is later
analyzed and used to improve the hospital's service and treatment.
Real Time Alerting
In hospitals, Clinical Decision Support (CDS) analyzes medical data on the spot and advises health
professionals as they make prescriptive decisions. Various smart devices (IoT devices) carried by the
patient send health data to the cloud. This data is then available to health practitioners, who can act
on it in real time by analyzing it with the help of Big Data Hadoop.
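A minimal Python sketch of the alerting idea is given below. It only shows the shape of a rule-based
check on incoming device readings. The Reading class, the check_reading function, the metric names, and
the limits are all hypothetical placeholders chosen for illustration; they are not clinical guidance and
are not taken from any CDS product.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    patient_id: str
    metric: str
    value: float


# Hypothetical (low, high) limits per metric, for illustration only.
LIMITS = {
    "systolic_bp": (90.0, 140.0),
    "heart_rate": (50.0, 110.0),
}


def check_reading(reading):
    """Return an alert message if the reading is outside its limits, else None."""
    limits = LIMITS.get(reading.metric)
    if limits is None:
        return None  # No rule is defined for this metric.
    low, high = limits
    if reading.value < low:
        return f"{reading.patient_id}: {reading.metric} below limit ({reading.value})"
    if reading.value > high:
        return f"{reading.patient_id}: {reading.metric} above limit ({reading.value})"
    return None


incoming = [
    Reading("patient-1", "systolic_bp", 150.0),
    Reading("patient-1", "heart_rate", 72.0),
]
alerts = [msg for msg in map(check_reading, incoming) if msg is not None]
print(alerts)
```

In a real deployment the readings would arrive as a stream from the cloud rather than as a list, and
the limits would be set by clinicians.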
Lifestyle Analytics
A patient's lifestyle is also a cause of many diseases and has a great impact on them. Lifestyle
analytics helps to prevent medical accidents, increase the accuracy of disease detection, and provide
healthcare solutions based on the lifestyle of the individual.
Conclusion
Big data is a new feature in all sectors, and it represents a major advancement in healthcare in
particular. Healthcare devices now generate large amounts of data. This data is of great importance
because it may enable personalized healthcare and minimize healthcare costs. It is therefore essential
to understand the data by assessing it properly. The analysis of such data can lead to further insights
into medical and other improvements in healthcare. The collective big data analysis of EMRs and EHRs is
continuously helping to build a better framework. Companies that provide healthcare analytics and
clinical transformation services are in fact contributing to better and more effective outcomes. Their
primary goals include developing effective Clinical Decision Support (CDS) systems, providing platforms
for better treatment strategies, reducing the cost of analytics, and identifying and preventing fraud
associated with big data. Although almost all of these companies face challenges around security,
accuracy, and reliability (particularly cloud security), the combined pool of data from healthcare
organizations has resulted in a better outlook on and treatment of various diseases. It has also been
very helpful in constructing a better and healthier personalized framework. The modern healthcare
fraternity has realized the potential of big data and has therefore implemented Big Data analytics in
healthcare and clinical practice.
Storage
Storing large volumes of healthcare data is a major challenge. Many healthcare organizations use an
on-site server network for data storage, but this can be expensive to scale and difficult to maintain.
A cloud-based storage system can be a more reliable option for storing healthcare data. Organizations
must choose cloud partners that understand healthcare-specific compliance and security issues.
Unified format
It is very difficult to maintain big data, especially when the data has no unique identifier. Most
patients do not have a unique identifier, which makes the data they generate difficult to handle. There
is therefore a need to codify all clinically relevant information for clinical analytics and billing
purposes.
Accuracy
Sometimes the patient report data generated by EMRs or EHRs is not entirely accurate because of poor
EHR utility, complex workflows, and similar factors. All of these factors can affect the quality of big
data throughout its lifecycle.
Image pre-processing
Various physical factors can lead to altered data quality and misinterpretation of existing medical
records. Medical images can contain multiple types of noise and artifacts. Improper handling of medical
images can also cause tampering of images.
Security
Many healthcare organizations use a cloud-based storage system to store medical records, which contain
essential information. Security breaches such as hacking, phishing attacks, and malware attacks can
occur, so security is one of the biggest concerns in Big Data analytics.
PART B 1.1
STEPS: -
1. First, we create the JMP data table. It includes the IQ scores for the 20 students.
2. Next, in JMP, we select the Analyze menu.
3. In the Analyze menu, select Distribution.
4. In Distribution, select IQ for the Y, Columns variable and click OK.
5. Next, under IQ, select Test Mean, enter 100 for the mean and 15 for the SD, and click OK.
Student Name   IQ
Kathy          110
Mike           132
Adam           98
Celia          97
Christina      115
Aaron          145
Elaine         77
Jesse          130
Sam            114
Nikki          128
Amanda         89
Steve          101
Jason          92
Tabitha        85
Mindy          112
Drew           79
Shailaja       139
Samir          102
Robert         103
Tiffany        89
Distributions
IQ
[Histogram of IQ, axis range 70 to 150]
Summary Statistics
Mean 106.85
Std Dev 19.879307
Std Err Mean 4.4451482
Upper 95% Mean 116.1538
Lower 95% Mean 97.546198
N 20
Test Mean
z Test
Test Statistic 0.2534
Prob > |z| 0.7999
Prob > z 0.4000
Prob < z 0.6000
Report: -
1. The p-value for the two-sided test is reported next to Prob > |z| (0.7999). Because it is larger
than the conventional 0.05 significance level, the null hypothesis is not rejected.
2. The p-values for the one-sided tests are reported next to Prob > z and Prob < z.
3. The signs in the one-sided p-values correspond to the signs in the alternative hypothesis.
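The z test above can be checked by hand. The sketch below, which assumes only the Python standard
library, recomputes the statistic z = (sample mean - hypothesized mean) / (SD / sqrt(N)) from the 20 IQ
scores in the data table. The function name z_test is our own. Under this formula, the reported
statistic of 0.2534 is reproduced with a hypothesized mean of 106, whereas a hypothesized mean of 100
gives roughly 2.04, so the value entered in JMP may be worth rechecking.

```python
from math import sqrt
from statistics import NormalDist, mean

# IQ scores of the 20 students, read from the data table.
iq = [110, 132, 98, 97, 115, 145, 77, 130, 114, 128,
      89, 101, 92, 85, 112, 79, 139, 102, 103, 89]


def z_test(sample, hypothesized_mean, known_sd):
    """One-sample z test with a known standard deviation.

    Returns (z, two-sided p, P(Z > z), P(Z < z)).
    """
    n = len(sample)
    z = (mean(sample) - hypothesized_mean) / (known_sd / sqrt(n))
    lower = NormalDist().cdf(z)   # Prob < z
    upper = 1.0 - lower           # Prob > z
    two_sided = 2.0 * min(lower, upper)
    return z, two_sided, upper, lower


# Hypothesized mean stated in the steps.
print(z_test(iq, 100, 15))
# Hypothesized mean that matches the reported statistic (an inference, not given in the steps).
print(z_test(iq, 106, 15))
```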
B1.2)
Steps: -
1. First, we create a JMP data table. It includes the source, date, and obliquity.
2. In JMP, we select the Analyze menu.
3. In the Analyze menu, we select Distribution.
4. In Distribution, we select obliquity for the Y, Columns variable and click OK.
5. Click the red triangle next to Obliquity and choose Test Mean.
6. Next, we enter the PS value for the mean and leave the SD blank.
Source          Date   Obliquity
Regiomontanus   1460   23.5
Copernicus      1500   23.47333
Waltherus       1500   23.48778
Danti           1570   23.50778
Tycho           1570   23.525
Distributions
Obliquity
[Histogram of Obliquity, axis range 23.47 to 23.53]
Summary Statistics
Mean 23.498778
Std Dev 0.019613
Std Err Mean 0.0087712
Upper 95% Mean 23.523131
Lower 95% Mean 23.474425
N 5
Test Mean
t Test
Test Statistic 2.4829
Prob > |t| 0.0680
Prob > t 0.0340*
Prob < t 0.9660
Confidence Intervals
Report: -
1. The p-value for the two-sided test is reported next to Prob > |t|.
2. The p-values for the one-sided tests are reported next to Prob > t and Prob < t.
3. The signs in the one-sided p-values correspond to the signs in the alternative hypothesis.
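The t test output can be reproduced in the same way. The sketch below uses only the Python standard
library and obtains the t distribution's cumulative probability by numerically integrating its density
with Simpson's rule. The function names are our own. The hypothesized mean of 23.477 is an assumption:
it is not stated in the steps and was back-calculated from the reported test statistic, mean, and
standard error.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

# Obliquity values from the data table.
obliquity = [23.5, 23.47333, 23.48778, 23.50778, 23.525]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T < t), using Simpson's rule on [0, |t|] and the symmetry of the density."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def one_sample_t(sample, hypothesized_mean):
    """One-sample t test. Returns (t, two-sided p, P(T > t), P(T < t))."""
    n = len(sample)
    std_err = stdev(sample) / sqrt(n)
    t = (mean(sample) - hypothesized_mean) / std_err
    lower = t_cdf(t, n - 1)
    upper = 1.0 - lower
    return t, 2.0 * min(lower, upper), upper, lower


# 23.477 is the back-calculated hypothesized mean (an assumption, see above).
print(one_sample_t(obliquity, 23.477))
```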
Response resale
Whole Model
Actual by Predicted Plot
[Plot against resale Predicted, axes from 0 to 70]
Summary of Fit
RSquare 0.926289
RSquare Adj 0.923027
Root Mean Square Error 3.204219
Mean of Response 18.05937
Observations (or Sum Wgts) 119
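The relation between RSquare and RSquare Adj in the Summary of Fit can be verified with a short
calculation. The sketch below applies the usual formula, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1).
The number of model terms, p = 5, is an assumption taken from the five effects that have leverage plots
(sales, type, price, engine_s, horsepow); the function name is our own.

```python
def adjusted_r_squared(r_squared, n_observations, n_terms):
    """Adjusted R^2 for a model with an intercept and n_terms predictors."""
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / (n_observations - n_terms - 1)


# RSquare and Observations are taken from the Summary of Fit; n_terms = 5 is assumed.
adj = adjusted_r_squared(0.926289, 119, 5)
print(round(adj, 6))
```

With these inputs the result agrees with the reported RSquare Adj of 0.923027 to within rounding.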
Analysis of Variance
Parameter Estimates
Leverage Plots
sales Leverage, P=0.8332
type Leverage, P=0.2344
price Leverage, P<.0001
engine_s Leverage, P=0.0004
horsepow Leverage, P=0.3320
Steps: -