
Data Science Practical Manual

Uploaded by

Medha Bandi

1. Write a program to find the mean absolute deviation for the given data set.

[26,46,56,45,19,22,24].

PROCEDURE:
Step 1: Calculate the mean.
Step 2: Calculate the absolute distance of each data point from the mean (ignore the
sign of the difference).
Step 3: Calculate the mean of the distances.
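The three steps above can be sketched in Python as follows. This is a minimal illustration, not the manual's original program; the function name mean_absolute_deviation is chosen here for readability.

```python
from statistics import mean


def mean_absolute_deviation(values):
    """Return the mean of the absolute distances from the mean."""
    centre = mean(values)                            # Step 1: the mean
    distances = [abs(v - centre) for v in values]    # Step 2: absolute distances
    return mean(distances)                           # Step 3: mean of the distances


data = [26, 46, 56, 45, 19, 22, 24]
print("Mean:", mean(data))
print("Mean absolute deviation:", round(mean_absolute_deviation(data), 3))
```

For this data set the mean is 34, the absolute distances add up to 90, and the mean absolute deviation is 90/7, about 12.857.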
OUTPUT:
2. Write a program to find the standard deviation of the following data set.

There are 39 plants in the garden. A few plants were selected randomly and their heights
in cm were recorded as follows: 1,2,3,5,8. Calculate the standard deviation of their
heights.

PROCEDURE:
Step 1: Calculate the mean by adding up all the data values and dividing the sum by the
number of values.
Step 2: Subtract the mean from every value.
Step 3: Square each of the differences.
Step 4: Find the average of the squared differences from Step 3. This is the
variance.
Step 5: Find the square root of the variance. This is the standard deviation.
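The procedure above can be sketched in Python as follows. It divides by the number of values, as Step 4 describes, so it gives the population standard deviation. The function name is chosen here for illustration, and the call to statistics.pstdev is only a cross-check.

```python
from math import sqrt
from statistics import pstdev


def population_std_dev(values):
    """Standard deviation computed by dividing by the number of values."""
    n = len(values)
    centre = sum(values) / n                        # Step 1: the mean
    differences = [v - centre for v in values]      # Step 2: value minus mean
    squares = [d ** 2 for d in differences]         # Step 3: squared differences
    variance = sum(squares) / n                     # Step 4: the variance
    return sqrt(variance)                           # Step 5: the standard deviation


heights = [1, 2, 3, 5, 8]
print("Standard deviation:", round(population_std_dev(heights), 7))
print("Cross-check with statistics.pstdev:", round(pstdev(heights), 7))
```

Because the five heights are a sample from the 39 plants, some texts divide by n - 1 instead of n in Step 4 (statistics.stdev does this), which gives a slightly larger value.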
OUTPUT:

DATA SET   STEP 1:   STEP 2:     STEP 3:       STEP 4:    STEP 5:
           MEAN      DISTANCE    SQUARE OF     VARIANCE   STANDARD
                                 DISTANCE                 DEVIATION
1          3.8       2.8         7.84          6.16       2.4819347
2                    1.8         3.24
3                    0.8         0.64
5                    1.2         1.44
8                    4.2         17.64
3. Write a program to collect data, analyse it and interpret the result. Use the
following scenario for the statistical problem-solving process.

Your residential society is organising a food event. Perform a detailed analysis and
determine the top five cuisines that most people in the society prefer for this
event.

PROCEDURE:
Step 1: Formulate statistical investigative questions.
Step 2: Collect/consider the data.
Step 3: Analyse the data.
Step 4: Interpret the data.
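The four steps above can be sketched in Python as follows. The survey responses in this sketch are HYPOTHETICAL placeholders invented for illustration; they are not the data collected for this manual, so the counts they produce will differ from the figures reported under OUTPUT. The function names are also chosen here.

```python
from collections import Counter

# Step 2: HYPOTHETICAL responses as (block number, preferred cuisine) pairs.
# Replace these with the responses actually collected from each block.
responses = [
    (1, "South Indian"), (1, "Chinese"), (1, "North Indian"),
    (2, "South Indian"), (2, "Italian"), (2, "Chinese"),
    (3, "Chinese"), (3, "South Indian"), (3, "Mexican"), (3, "Continental"),
]


def top_cuisines(rows, k=5):
    """Step 3: consolidate all blocks and return the k most preferred cuisines."""
    counts = Counter(cuisine for _, cuisine in rows)
    return counts.most_common(k)


def count_for(rows, cuisine, block=None):
    """Step 4: answer a question such as 'how many prefer X (in block Y)?'."""
    return sum(1 for b, c in rows if c == cuisine and (block is None or b == block))


print("Top five cuisines:", top_cuisines(responses))
print("Interested in South Indian:", count_for(responses, "South Indian"))
print("Interested in Chinese, block 3:", count_for(responses, "Chinese", block=3))
```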
OUTPUT:
Data collected from each block of the apartment:

Consolidated data for analysis:

Data interpretation:
1. How many are interested in South Indian Cuisine?
26
2. How many people are interested in Chinese cuisine from block 3?
3
4. Write a program to apply the central limit theorem to the following data.
In a country in the Middle East region, the recorded weights of the male population
follow a normal distribution. The mean and the standard deviation are 70 kg and 15
kg, respectively. If a person samples the records of 50 males from the population,
what would be the mean and the standard deviation of the chosen sample?
PROCEDURE:
Step 1: Draw groups of people at random from your area. We will call this a sample.
We will draw multiple samples in this case, each consisting of 30 people.
Step 2: Calculate the individual mean of each sample set.
Step 3: Calculate the mean of these sample means.
Step 4: Plot a histogram of the sample means. It will resemble a normal
distribution.
The formula for the central limit theorem is:

μx̄ = μ and σx̄ = σ / √n

μ = Population mean
σ = Population standard deviation
μx̄ = Mean of the sample means
σx̄ = Standard deviation of the sample means (standard error)
n = Sample size
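The following Python sketch applies the central limit theorem to the problem above: the mean of the sample means equals the population mean, and their standard deviation is the population standard deviation divided by the square root of the sample size. The sample size of 50 is taken from the problem statement. The simulation part (2000 samples, a fixed random seed) is an assumption added here only to check the result empirically.

```python
import random
from math import sqrt
from statistics import mean, stdev

POP_MEAN = 70.0    # kg, from the problem statement
POP_SD = 15.0      # kg, from the problem statement
SAMPLE_SIZE = 50   # from the problem statement


def sampling_distribution(mu, sigma, n):
    """Return (mean, standard deviation) of the sample mean for samples of size n."""
    return mu, sigma / sqrt(n)


def simulate_sample_means(mu, sigma, n, num_samples=2000, seed=0):
    """Draw num_samples random samples of size n and return their means."""
    rng = random.Random(seed)   # assumption: fixed seed so runs are repeatable
    return [sum(rng.gauss(mu, sigma) for _ in range(n)) / n
            for _ in range(num_samples)]


theory_mean, theory_sd = sampling_distribution(POP_MEAN, POP_SD, SAMPLE_SIZE)
means = simulate_sample_means(POP_MEAN, POP_SD, SAMPLE_SIZE)

print("Theoretical mean of sample means:", theory_mean)
print("Theoretical standard deviation:", round(theory_sd, 4))
print("Simulated mean of sample means:", round(mean(means), 2))
print("Simulated standard deviation:", round(stdev(means), 2))
```

For a sample of 50, the expected sample mean is 70 kg and the standard deviation of the sample mean is 15/√50, about 2.12 kg. The simulated values should fall close to these.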
OUTPUT:
5. Write a program to find the quartiles for the following data set with an odd number of values.
34 24 43 5 58 81 29 90 22 67 32 88 57 34 43 44 91 24 62
PROCEDURE:
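Step 1: Sort the data in ascending order.
Step 2: Find N, the number of values.
Step 3: Calculate the position of the lower quartile (Q1) in the sorted data.
Position of Q1 = (N + 1) x 1/4
Step 4: Calculate the position of the middle quartile (Q2) in the sorted data.
Position of Q2 = (N + 1) x 2/4
Step 5: Calculate the position of the upper quartile (Q3) in the sorted data.
Position of Q3 = (N + 1) x 3/4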
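The position formulas above can be sketched in Python as follows. The quartile values are read from the sorted list at the 1-based positions given by (N + 1) x k/4. For this data set every position is a whole number. The linear interpolation used for fractional positions is an assumption added here and is not part of the procedure. The function names are chosen for illustration.

```python
def value_at(sorted_values, position):
    """Return the value at a 1-based position in an already sorted list."""
    lower = int(position)
    fraction = position - lower
    if fraction == 0:
        return sorted_values[lower - 1]
    # Assumption: interpolate linearly between neighbours for fractional positions.
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)


def quartiles(values):
    """Return (Q1, Q2, Q3) using the (N + 1) * k / 4 position rule."""
    ordered = sorted(values)                               # Step 1
    n = len(ordered)                                       # Step 2
    positions = [(n + 1) * k / 4 for k in (1, 2, 3)]       # Steps 3 to 5
    return tuple(value_at(ordered, p) for p in positions)


data = [34, 24, 43, 5, 58, 81, 29, 90, 22, 67, 32, 88, 57, 34, 43, 44, 91, 24, 62]
q1, q2, q3 = quartiles(data)
print("N =", len(data))
print("Q1, Q2, Q3 =", q1, q2, q3)
print("Interquartile range =", q3 - q1)
```

With N = 19 the positions are 5, 10 and 15, and the sorted data gives Q1 = 29, Q2 = 43 and Q3 = 67, so the interquartile range is 38. Note that the values must be read from the sorted data, not from the original order.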
OUTPUT:

N = 19

POSITION   DATASET   SORTED DATA
1          34        5
2          24        22
3          43        24
4          5         24
5          58        29
6          81        32
7          29        34
8          90        34
9          22        43
10         67        43
11         32        44
12         88        57
13         57        58
14         34        62
15         43        67
16         44        81
17         91        88
18         24        90
19         62        91

           Q1   Q2   Q3
POSITION   5    10   15
DATA       29   43   67

INTERQUARTILE RANGE: Q3 - Q1 = 67 - 29 = 38
6. Write a program to find the quartiles for the following data set with an even number of values.
54 28 76 64 41 83 19 71 37 58
PROCEDURE:
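Step 1: Sort the data in ascending order.
Step 2: Find N, the number of values.
Step 3: Calculate the middle quartile (Q2), that is, the median of the data set.
Position of Q2 = (N + 1)/2. For even N this lies between positions N/2 and N/2 + 1,
so Q2 is the average of those two values.
Step 4: Split the sorted data set into a first half and a second half.
Step 5: Calculate the lower quartile (Q1) as the median of the first half.
Position of Q1 = (n + 1)/2, where n is the number of values in the first half.
Step 6: Calculate the upper quartile (Q3) as the median of the second half.
Position of Q3 = (n + 1)/2, where n is the number of values in the second half.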
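The split-into-halves procedure above can be sketched in Python as follows. This is a minimal illustration for data sets with an even number of values; the function names are chosen here.

```python
def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    # Even count: average the two middle values.
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles_even(values):
    """Return (Q1, Q2, Q3) for a data set with an even number of values."""
    ordered = sorted(values)                     # Step 1
    n = len(ordered)                             # Step 2
    if n % 2 != 0:
        raise ValueError("this sketch expects an even number of values")
    q2 = median(ordered)                         # Step 3
    first_half = ordered[: n // 2]               # Step 4
    second_half = ordered[n // 2:]
    q1 = median(first_half)                      # Step 5
    q3 = median(second_half)                     # Step 6
    return q1, q2, q3


data = [54, 28, 76, 64, 41, 83, 19, 71, 37, 58]
q1, q2, q3 = quartiles_even(data)
print("Q1, Q2, Q3 =", q1, q2, q3)
print("Interquartile range =", q3 - q1)
```

For this data set Q2 = (54 + 58)/2 = 56, Q1 = 37 and Q3 = 71, so the interquartile range is 34.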
OUTPUT:

N = 10

POSITION   DATASET   SORTED DATA
1          54        19
2          28        28
3          76        37
4          64        41
5          41        54
6          83        58
7          19        64
8          71        71
9          37        76
10         58        83

           Q2
POSITION   5.5 (between 5 and 6)
DATA       56 = (54 + 58)/2

FIRST HALF   LAST HALF
19           58
28           64
37           71
41           76
54           83

           Q1   Q3
POSITION   3    3
DATA       37   71

INTERQUARTILE RANGE: Q3 - Q1 = 71 - 37 = 34
7. Write a program to find the deciles for the following data set.
4 9 10 10 12 13 88 90 91 96 99 100 16 49 49 52 55 58 60 60 63 64 65 65 65 73 75 81 83
84 86 17 26 27 33 38 42 43 46
PROCEDURE:
Step 1: Arrange the data set in ascending order.
Step 2: Assign a position to each data point.
Step 3: Calculate the position of each decile using the formula
Position of Di = (N + 1) * i / 10
Step 4: Read the deciles D1 to D9 from the sorted data at those positions.
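The decile procedure above can be sketched in Python as follows. Each decile is read from the sorted data at the 1-based position (N + 1) * i / 10. For this data set all nine positions are whole numbers. The linear interpolation for fractional positions is an assumption added here, and the sketch expects at least 10 values. The function name is chosen for illustration.

```python
def deciles(values):
    """Return {'D1': (position, value), ..., 'D9': (position, value)}."""
    ordered = sorted(values)                     # Step 1
    n = len(ordered)
    result = {}
    for i in range(1, 10):                       # Step 4: D1 to D9
        position = (n + 1) * i / 10              # Step 3: 1-based position
        lower = int(position)
        fraction = position - lower
        value = ordered[lower - 1]
        if fraction:
            # Assumption: interpolate linearly for fractional positions.
            value += fraction * (ordered[lower] - ordered[lower - 1])
        result[f"D{i}"] = (position, value)
    return result


data = [4, 9, 10, 10, 12, 13, 88, 90, 91, 96, 99, 100, 16, 49, 49, 52, 55, 58,
        60, 60, 63, 64, 65, 65, 65, 73, 75, 81, 83, 84, 86, 17, 26, 27, 33, 38,
        42, 43, 46]

print("n =", len(data))
for name, (position, value) in deciles(data).items():
    print(name, "position", position, "data", value)
```

With n = 39 the positions are 4, 8, 12, and so on up to 36, and the deciles are 10, 17, 38, 49, 58, 64, 73, 84 and 91.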
OUTPUT:
n = 39

Decile   Position   Data
D1       4          10
D2       8          17
D3       12         38
D4       16         49
D5       20         58
D6       24         64
D7       28         73
D8       32         84
D9       36         91

Position   Data Set
1          4
2          9
3          10
4          10
5          12
6          13
7          16
8          17
9          26
10         27
11         33
12         38
13         42
14         43
15         46
16         49
17         49
18         52
19         55
20         58
21         60
22         60
23         63
24         64
25         65
26         65
27         65
28         73
29         75
30         81
31         83
32         84
33         86
34         88
35         90
36         91
37         96
38         99
39         100
