Bsa Assignment

Here are the steps to find the z-score for Tiffany's IQ score of 89: 1. We are given that the mean IQ score is 100 and the standard deviation is 15. 2. To calculate the z-score, we use the formula: z-score = (value - mean) / standard deviation 3. Plugging in the values for Tiffany's IQ score: z-score = (89 - 100) / 15 = -11/15 = -0.733 So the z-score for Tiffany's IQ score of 89 is -0.733.
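The calculation above can be checked with a few lines of Python. This is only a minimal sketch; the function name `z_score` is illustrative and does not come from the assignment.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / std_dev

# Values taken from the text: IQ score 89, mean 100, standard deviation 15
tiffany_z = z_score(89, 100, 15)
print(round(tiffany_z, 3))  # -0.733
```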

Uploaded by

Detective PK

A 1.1 Discuss the problem addressed, technique applied and major conclusions from the research
(max. 200 words for each paper)
A 1.2 Discuss the major challenges faced by each (max. 100 words for each paper)
A 1.3 References of the papers discussed above. Follow the Harvard style of referencing
Big data is not only about the volume of data but also about its power, and data becomes powerful
only when it is analyzed and utilized properly. Without proper analytics, data is just a resource,
not a utilized one. The data sets are large and complex, which challenges conventional techniques
for analyzing them and capturing the outcomes. To overcome these challenges, Big Data Analytics
examines the data to uncover hidden patterns and useful information.

The main aim of healthcare is to prevent, diagnose and treat health-related issues in human beings.
The main components of healthcare are the health professionals and the health facilities (clinics,
hospitals and other treatment technologies). All these components generate various kinds of data,
such as patient medical history (diagnosis and prescription data), medical and clinical data (such
as data from imaging and laboratory examinations) and other medical data. Previously, all these
data were stored in a traditional file system. Nowadays, the digitization of clinical exams and
medical records in healthcare systems has become a widely adopted standard.

Because of its huge size and highly heterogeneous nature, big data in healthcare can be less
informative when handled with conventional technologies. For efficient analysis, the data must be
stored in a distributed file system, in a file format that is easily accessible. To assist big data

Hadoop

The best logical approach to analyzing huge volumes of complex data is to distribute it and process
it in parallel on multiple nodes. However, for data of such size, thousands of computing machines
are required to finish processing in a reasonable amount of time. When working with thousands of
nodes, one has to handle issues such as how to distribute the data, how to parallelize the
computation and how to handle failures. The most popular open-source distributed application for
this purpose is Hadoop. The Hadoop Distributed File System (HDFS) is the file system component that
provides scalable, efficient, replica-based storage of data across the nodes that form a cluster.
The MapReduce engine and HDFS have the capability to process thousands of terabytes of data. Hadoop
technology helps to process and analyze the data quickly.
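The map, group and reduce idea behind MapReduce can be illustrated in plain Python. This is a single-process sketch and not the Hadoop API; the records, the `diagnosis` field and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def map_phase(record):
    """Emit (key, 1) pairs for one input record (here: one diagnosis label)."""
    return [(record["diagnosis"], 1)]

def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)

# Hypothetical records, split into two chunks as if stored on two nodes
node_a = [{"diagnosis": "flu"}, {"diagnosis": "asthma"}]
node_b = [{"diagnosis": "flu"}, {"diagnosis": "flu"}]

pairs = [pair for node in (node_a, node_b) for rec in node for pair in map_phase(rec)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)  # {'flu': 3, 'asthma': 1}
```

In a real cluster the map and reduce calls would run on different machines and the framework would handle data distribution and failures; the sketch only shows the data flow.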

Hadoop in Monitoring Patient Vitals

Nowadays many big hospitals use Hadoop to work more efficiently. Many of them place sensors around
the patient's bed to capture and store patient measurements such as blood pressure and cholesterol.
The sensors also capture the issues the patient is facing and report them. These sensors generate a
huge amount of data, for which Hadoop is needed. The data are later analyzed and used to improve
the hospital's service and treatment.

Real Time Alerting

In hospitals, Clinical Decision Support (CDS) analyzes medical data on the spot and advises health
professionals in making their prescriptive decisions. Various smart devices (IoT devices), when
carried by the patient, send health data to the cloud. These data are then available to health
practitioners, who can act on them in real time by analyzing the data with the help of Hadoop.

Lifestyle Analytics

A patient's lifestyle is one of the causes of many diseases and has a great impact on them.
Lifestyle analytics helps to prevent medical accidents, increase the accuracy of disease detection
and provide healthcare solutions based on the lifestyle of individuals.

Conclusion

Big data is a new feature in all sectors, and it represents a major advancement in healthcare in
particular. Nowadays various healthcare devices generate large amounts of data. Data generated in
healthcare is of great importance because it may enable personalized healthcare and minimize
healthcare costs. It is therefore essential to understand the data by assessing it properly. The
analysis of such data can lead to further insights and to medical and other improvements in
healthcare. The collective big data analysis of EMRs and EHRs is continuously helping to build a
better framework. The companies providing services for healthcare analytics and clinical
transformation are in fact contributing towards better and more effective outcomes. The primary
goals of these companies include developing effective Clinical Decision Support (CDS) systems,
providing platforms for better treatment strategies, reducing the cost of analytics, and
identifying and preventing fraud associated with big data. Though almost all companies face
challenges on issues such as security, accuracy and reliability (particularly cloud security), the
combined pool of data from healthcare organizations has resulted in a better outlook and treatment
of various diseases. This has also been very helpful in constructing a better and healthier
personalized framework. Modern healthcare fraternities have realized the potential of big data and
have therefore implemented Big Data analytics in healthcare and clinical practices.

Challenges of Big Data Analytics in Healthcare

Storage

Storing large volumes of healthcare data is a major challenge. Many healthcare organizations use an
on-site server network for data storage, but this can be expensive to scale and difficult to
maintain. A cloud-based storage system can be a more reliable option for storing healthcare data.
Organizations must choose cloud partners that understand healthcare-specific compliance and
security issues.

Unified format

It is very difficult to maintain big data, especially when the data do not have a unique
identifier. Most patients do not have a unique identifier, which makes the generated data difficult
to handle. There is therefore a need to codify all clinically relevant information for clinical
analytics and billing purposes.

Accuracy

Sometimes the patient report data generated by EMRs or EHRs is not entirely accurate because of
poor EHR utility, complex workflows, etc. All these factors can affect the quality of big data
throughout its lifecycle.

Image pre-processing

Various physical factors can lead to altered data quality and misinterpretations of existing
medical records. Medical images can contain multiple types of noise and artifacts. Improper
handling of medical images can also cause tampering of images.

Security

Many healthcare organizations use a cloud-based storage system to store medical records, which
contain very essential information. Security breaches such as hacking, phishing attacks and malware
attacks can occur, so security is one of the biggest concerns in Big Data analytics.

[1] IBM (2015, September 25) Big Data at the speed of business: Big data and analytics. Retrieved
from URL: http://www-01.ibm.com/software/data/bigdata/

[2] IBM (2013) Descriptive, Predictive, Prescriptive: Transforming asset and facilities management
with analytics. Retrieved from document TIW14162USEN, IBM Corp.

[3] Big Data. Case Study: How Big Data is Solving Healthcare Problems Successfully. Retrieved from
URL: https://www.hdfstutorial.com/blog/big-dataapplication-inhealthcare/

[4] 5 Healthcare applications of Hadoop and Big Data. Retrieved from URL:
https://www.dezyre.com/article/5-healthcareapplications-of-hadoop-and-big-data/85

[5] Sonnati, R. (2017, March 03) Improving Healthcare Using Big Data Analytics. IJSTR. Retrieved
from URL: https://www.ijstr.org/finalprint/mar2017/ImprovingHealthcare-Using-Big-Data-Analytics/
PART B 1.1
Steps: -
1. First, we create the JMP data table. It includes the IQ scores for the 20 students.
2. Next, in the JMP software, we select the Analyze menu.
3. In the Analyze menu, we select Distribution.
4. In Distribution, we select IQ as the Y, Columns variable and click OK.
5. Next, under IQ, we select Test Mean, enter 100 for the mean and 15 for the SD, and click OK.
Student Name    IQ
Kathy           110
Mike            132
Adam             98
Celia            97
Christina       115
Aaron           145
Elaine           77
Jesse           130
Sam             114
Nikki           128
Amanda           89
Steve           101
Jason            92
Tabitha          85
Mindy           112
Drew             79
Shailaja        139
Samir           102
Robert          103
Tiffany          89
Distributions
IQ
[Figure: histogram of IQ, axis range 70 to 150]

Quantiles

100.0%   maximum    145
99.5%               145
97.5%               145
90.0%               138.3
75.0%    quartile   124.75
50.0%    median     102.5
25.0%    quartile   89.75
10.0%               79.6
2.5%                77
0.5%                77
0.0%     minimum    77

Summary Statistics

Mean 106.85
Std Dev 19.879307
Std Err Mean 4.4451482
Upper 95% Mean 116.1538
Lower 95% Mean 97.546198
N 20

Test Mean

Hypothesized Value 106
Actual Estimate 106.85
DF 19
Std Dev 19.8793
Sigma given 15

z Test
Test Statistic 0.2534
Prob > |z| 0.7999
Prob > z 0.4000
Prob < z 0.6000

Report: -
1. The p-value for the two-sided test is reported next to Prob > |z| (0.7999). Because this p-value
is large, the null hypothesis is not rejected in this problem.
2. The p-values for the one-sided tests are reported next to Prob > z and Prob < z.
3. The signs in the one-sided p-values correspond to the signs in the alternative hypothesis.
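The z Test figures in the output above can be reproduced with the Python standard library. This is a sketch rather than the JMP procedure itself: it uses the hypothesized value 106 and sigma 15 as they appear in the Test Mean output, and the helper `normal_cdf` is an illustrative name.

```python
import math
import statistics

# IQ scores of the 20 students from the data table
iq = [110, 132, 98, 97, 115, 145, 77, 130, 114, 128,
      89, 101, 92, 85, 112, 79, 139, 102, 103, 89]

hypothesized_mean = 106  # "Hypothesized Value" in the Test Mean output
sigma = 15               # "Sigma given" in the Test Mean output

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

sample_mean = statistics.mean(iq)
z = (sample_mean - hypothesized_mean) / (sigma / math.sqrt(len(iq)))

p_less = normal_cdf(z)                    # Prob < z
p_greater = 1.0 - p_less                  # Prob > z
p_two_sided = 2.0 * min(p_less, p_greater)  # Prob > |z|

print(f"mean={sample_mean:.2f} z={z:.4f}")
print(f"Prob>|z|={p_two_sided:.4f} Prob>z={p_greater:.4f} Prob<z={p_less:.4f}")
```

The printed values can be compared directly with the Test Statistic and the three probabilities reported by JMP.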

B1.2)

Steps: -
1. First, we create a JMP data table. It includes source, date and obliquity.
2. In the JMP software, we select the Analyze menu.
3. In the Analyze menu, we select Distribution.
4. In Distribution, we select obliquity as the Y, Columns variable and click OK.
5. We click on the red triangle next to obliquity and choose Test Mean.
6. Next, we enter the hypothesized mean (PS) and leave the SD field blank.

Source          Date    Obliquity
Regiomontanus   1460    23.5
Copernicus      1500    23.47333
Waltherus       1500    23.48778
Danti           1570    23.50778
Tycho           1570    23.525

Distributions
Obliquity
[Figure: histogram of Obliquity, axis range 23.47 to 23.53]

Quantiles

100.0%   maximum    23.525
99.5%               23.525
97.5%               23.525
90.0%               23.525
75.0%    quartile   23.51639
50.0%    median     23.5
25.0%    quartile   23.480555
10.0%               23.47333
2.5%                23.47333
0.5%                23.47333
0.0%     minimum    23.47333

Summary Statistics

Mean 23.498778
Std Dev 0.019613
Std Err Mean 0.0087712
Upper 95% Mean 23.523131
Lower 95% Mean 23.474425
N 5

Test Mean

Hypothesized Value 23.477
Actual Estimate 23.4988
DF 4
Std Dev 0.01961

t Test
Test Statistic 2.4829
Prob > |t| 0.0680
Prob > t 0.0340*
Prob < t 0.9660

Confidence Intervals

Parameter   Estimate   Lower CI   Upper CI   1-Alpha
Mean        23.49878   23.47443   23.52313   0.950
Std Dev     0.019613   0.011751   0.056359   0.950
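The confidence limits in the table above follow from the usual t-based interval for the mean and chi-square-based interval for the standard deviation. The sketch below assumes those standard formulas and hard-codes the critical values for 4 degrees of freedom from statistical tables (2.7764 for t, 11.1433 and 0.4844 for chi-square); the constant names are illustrative.

```python
import math
import statistics

# Obliquity values from the data table
obliquity = [23.5, 23.47333, 23.48778, 23.50778, 23.525]

n = len(obliquity)
mean = statistics.mean(obliquity)
sd = statistics.stdev(obliquity)       # sample standard deviation
std_err = sd / math.sqrt(n)

# Table values for df = n - 1 = 4 at the 95% level (assumed, not computed here)
T_CRIT = 2.7764        # t quantile at 0.975
CHI2_UPPER = 11.1433   # chi-square quantile at 0.975
CHI2_LOWER = 0.4844    # chi-square quantile at 0.025

mean_ci = (mean - T_CRIT * std_err, mean + T_CRIT * std_err)
sd_ci = (sd * math.sqrt((n - 1) / CHI2_UPPER),
         sd * math.sqrt((n - 1) / CHI2_LOWER))

print(f"mean CI: {mean_ci[0]:.5f} to {mean_ci[1]:.5f}")
print(f"std dev CI: {sd_ci[0]:.6f} to {sd_ci[1]:.6f}")
```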

Report: -
1. The p-value for the two-sided test is reported next to Prob > |t|.
2. The p-values for the one-sided tests are reported next to Prob > t and Prob < t.
3. The signs in the one-sided p-values correspond to the signs in the alternative hypothesis.
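The t Test figures reported above can be checked without a statistics package, because Student's t distribution has a closed-form cumulative distribution function when there are exactly 4 degrees of freedom. The sketch below relies on that special case; `t_cdf_df4` is an illustrative helper name and would not be valid for other sample sizes.

```python
import math
import statistics

# Obliquity values from the data table
obliquity = [23.5, 23.47333, 23.48778, 23.50778, 23.525]
hypothesized_mean = 23.477  # "Hypothesized Value" in the Test Mean output

def t_cdf_df4(t):
    """CDF of Student's t distribution for exactly 4 degrees of freedom."""
    w = 1.0 + t * t / 4.0
    return 0.5 + 0.375 * (t / math.sqrt(w)) * (1.0 - (t * t) / (12.0 * w))

n = len(obliquity)
std_err = statistics.stdev(obliquity) / math.sqrt(n)
t_stat = (statistics.mean(obliquity) - hypothesized_mean) / std_err

p_less = t_cdf_df4(t_stat)                  # Prob < t
p_greater = 1.0 - p_less                    # Prob > t
p_two_sided = 2.0 * min(p_less, p_greater)  # Prob > |t|

print(f"t={t_stat:.4f}")
print(f"Prob>|t|={p_two_sided:.4f} Prob>t={p_greater:.4f} Prob<t={p_less:.4f}")
```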

Response resale
Whole Model
Actual by Predicted Plot
[Figure: actual resale against resale Predicted, both axes 0 to 70; RMSE=3.2042, RSq=0.93,
PValue<.0001]

Effect Summary

Source     LogWorth   PValue
price      37.347     0.00000
engine_s    3.352     0.00044
type        0.630     0.23442
horsepow    0.479     0.33196
sales       0.079     0.83323
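The LogWorth column in the Effect Summary is the negative base-10 logarithm of the p-value, so larger values indicate stronger effects. A short sketch using the displayed p-values follows. The price row is left out because its p-value is shown only as 0.00000, and the engine_s result differs slightly from the table because the displayed p-value is rounded.

```python
import math

# P-values as displayed in the Effect Summary (price omitted: shown as 0.00000)
p_values = {
    "engine_s": 0.00044,
    "type": 0.23442,
    "horsepow": 0.33196,
    "sales": 0.83323,
}

# LogWorth = -log10(p-value)
logworth = {term: -math.log10(p) for term, p in p_values.items()}

for term, value in logworth.items():
    print(f"{term:10s} {value:.3f}")
```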

Residual by Predicted Plot
[Figure: resale residuals (about -10 to 10) against resale Predicted (0 to 70)]

Summary of Fit

RSquare 0.926289
RSquare Adj 0.923027
Root Mean Square Error 3.204219
Mean of Response 18.05937
Observations (or Sum Wgts) 119

Analysis of Variance

Source     DF    Sum of Squares   Mean Square   F Ratio
Model        5   14579.230        2915.85       284.0012
Error      113    1160.173          10.27       Prob > F
C. Total   118   15739.404                      <.0001*
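The quantities in the Summary of Fit and Analysis of Variance tables are linked by standard regression identities, which can be verified from the sums of squares and degrees of freedom alone. The sketch below uses only numbers printed in the tables; the variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom from the Analysis of Variance table
ss_model, df_model = 14579.230, 5
ss_error, df_error = 1160.173, 113
ss_total, df_total = 15739.404, 118

ms_model = ss_model / df_model           # Mean Square (Model)
ms_error = ss_error / df_error           # Mean Square (Error)

f_ratio = ms_model / ms_error            # F Ratio
rmse = math.sqrt(ms_error)               # Root Mean Square Error
r_square = 1.0 - ss_error / ss_total     # RSquare
r_square_adj = 1.0 - ms_error / (ss_total / df_total)  # RSquare Adj

print(f"F={f_ratio:.4f} RMSE={rmse:.6f}")
print(f"RSquare={r_square:.6f} RSquareAdj={r_square_adj:.6f}")
```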

Parameter Estimates

Term        Estimate    Std Error   t Ratio   Prob>|t|
Intercept    0.3553404  1.117332     0.32     0.7511
sales       -0.000919   0.004353    -0.21     0.8332
type         0.9278507  0.776163     1.20     0.2344
price        0.8410128  0.042996    19.56     <.0001*
engine_s    -2.382365   0.658331    -3.62     0.0004*
horsepow     0.0157612  0.016176     0.97     0.3320
Leverage Plots
[Figure: sales leverage plot, P=0.8332]
[Figure: type leverage plot, P=0.2344]
[Figure: price leverage plot, P<.0001]
[Figure: engine_s leverage plot, P=0.0004]
[Figure: horsepow leverage plot, P=0.3320]

Steps: -
1. First, we create the JMP data table.
2. In the JMP software, we go to the Analyze menu.
3. In the Analyze menu, we select Fit Model.
4. In Fit Model, we set resale as the Y variable, add sales, type, price, engine_s and horsepow as
Construct Model Effects, and click OK.
5. In Response resale, we select Save Columns and then Prediction Formula.
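The saved prediction formula is the linear combination of the terms in the Parameter Estimates table. A sketch of that formula in Python is given below. The coefficients are copied from the table, while the function name and the input row are hypothetical and used only to show how a prediction would be computed.

```python
# Coefficients copied from the Parameter Estimates table
COEFFICIENTS = {
    "Intercept": 0.3553404,
    "sales": -0.000919,
    "type": 0.9278507,
    "price": 0.8410128,
    "engine_s": -2.382365,
    "horsepow": 0.0157612,
}

def predict_resale(sales, type_, price, engine_s, horsepow):
    """Evaluate the fitted linear model for one observation."""
    return (COEFFICIENTS["Intercept"]
            + COEFFICIENTS["sales"] * sales
            + COEFFICIENTS["type"] * type_
            + COEFFICIENTS["price"] * price
            + COEFFICIENTS["engine_s"] * engine_s
            + COEFFICIENTS["horsepow"] * horsepow)

# Hypothetical input row, not taken from the data set
prediction = predict_resale(sales=50, type_=0, price=20, engine_s=2.0, horsepow=150)
print(f"predicted resale: {prediction:.4f}")
```

Since only price and engine_s are significant in the table, most of the predicted value for any row comes from those two terms.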
