IT-3006(DA)-CS_END_MAY_2023

Uploaded by

girivinayak0

SPRING END SEMESTER EXAMINATION-2023

6th Semester, B.Tech

DATA ANALYTICS (IT-3006)
Evaluation Scheme and Solution

1. Answer the following questions.

(a) Explain the similarity and difference between JSON and BSON with
suitable examples.
[Evaluation Scheme] Full mark for the correct answer. 0.5 mark for
similarity and 0.5 for difference. No step-wise mark to be awarded.
[Solution]
Similarity: Both represent semi-structured data.
Difference: BSON is a binary encoding that is not human-readable, whereas
JSON is a plain-text, human-readable format.
(b) What is the difference between univariate, bivariate, and multivariate
analysis?
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
should be awarded based on the partial correctness of the solution.
[Solution]
Univariate analysis deals with data that consists of only one variable. It
involves measures of central tendency (mean, median and mode), measures of
dispersion or spread (range, minimum, maximum, quartiles, variance and
standard deviation), and summaries such as frequency distribution tables,
histograms, pie charts, frequency polygons and bar charts.
Bivariate analysis deals with data that consists of two variables. It
involves comparisons, relationships, causes and explanations.
Multivariate analysis deals with data that consists of more than two
variables. It involves techniques such as regression analysis, path analysis,
factor analysis and multivariate analysis of variance (MANOVA).
(c) Consider the below dataset that contains the number of hours of studies
and the actual score received for 3 students in data analytics, and the
predicted score was calculated with linear regression. Calculate R^2.
#   Number of hrs   Actual score   Predicted score
1   2               74             72
2   3               80             83
3   4               76             79
[Evaluation Scheme] Full mark for the correct answer. No step-wise mark
to be awarded.
[Solution]
Mean of actual score = 76.67, which is rounded to 77
SSR = Sum of squares regression = (72 - 77)^2 + (83 - 77)^2 + (79 - 77)^2 =
25 + 36 + 4 = 65
SSE = Sum of squares error = (72 - 74)^2 + (83 - 80)^2 + (79 - 76)^2 = 4 + 9 +
9 = 22
SST = Sum of squares total = SSR + SSE = 65 + 22 = 87
R^2 = SSR / SST = 65 / 87 = 0.747, which indicates a moderately good fit.
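The arithmetic can be checked with a short Python sketch. It follows the convention used in this solution (mean rounded to 77, SST taken as SSR + SSE); the variable names are illustrative only.

```python
# Actual and predicted scores from the table in question 1(c).
actual = [74, 80, 76]
predicted = [72, 83, 79]

# The solution rounds the mean of the actual scores (76.67) to 77.
mean_actual = round(sum(actual) / len(actual))

# Sum of squares regression: predicted values against the mean.
ssr = sum((p - mean_actual) ** 2 for p in predicted)
# Sum of squares error: predicted values against the actual values.
sse = sum((p - a) ** 2 for p, a in zip(predicted, actual))
# This solution takes the total as SSR + SSE.
sst = ssr + sse
r_squared = ssr / sst

print(ssr, sse, sst, round(r_squared, 3))
```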
(d) A time series model is mathematically represented as Y_t = f(T_t, S_t, C_t, I_t),
where Y_t is the time series value at time t, and T_t, S_t, C_t and I_t are the
trend, seasonal, cyclic and irregular component values at time t respectively.
Represent the model:
(1) When the amplitude of seasonal and irregular variations does not
change as the level of trend rises or falls.
(2) When the amplitude of both the seasonal and irregular variations
increase as the level of trend rises.
[Evaluation Scheme] Full mark for the correct answer. 0.5 mark for the 1st
part and 0.5 for the other part. No step-wise mark to be awarded.
[Solution]
(1) When the amplitude of seasonal and irregular variations does not
change as the level of trend rises or falls, the time series follows the
additive model, represented by Y_t = T_t + S_t + C_t + I_t
(2) When the amplitude of both the seasonal and irregular variations
increases as the level of trend rises, the time series follows the
multiplicative model, represented by Y_t = T_t * S_t * C_t * I_t
(e) Suppose a hierarchical clustering to be applied in segmenting the students
and following sample has been collected. Create the proximity matrix for
the below sample. The mark is out of 20 in the mid semester.
Roll No   Sex      Section   Mark
1         Male     CSE-1     10
2         Female   IT-1      17
3         Male     CSSE-1    18
4         Female   CSCE-1    20
[Evaluation Scheme] Full mark for the correct answer. No step-wise mark
to be awarded.
[Solution]
Since the student dataset has 4 observations, a 4 x 4 proximity matrix is
created, in which the diagonal elements are 0 because the distance of a
point from itself is always 0. Applying the Euclidean distance formula to
the marks, the matrix looks as follows:
Roll No   1                    2                    3                    4
1         0                    √((10-17)^2) = 7     √((10-18)^2) = 8     √((10-20)^2) = 10
2         √((17-10)^2) = 7     0                    √((17-18)^2) = 1     √((17-20)^2) = 3
3         √((18-10)^2) = 8     √((18-17)^2) = 1     0                    √((18-20)^2) = 2
4         √((20-10)^2) = 10    √((20-17)^2) = 3     √((20-18)^2) = 2     0
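The matrix can be reproduced with a small Python sketch. It assumes, as the solution does, that only the numeric Mark attribute enters the distance; the dictionary layout is illustrative.

```python
import math

# Mid-semester marks keyed by roll number (from the sample table).
marks = {1: 10, 2: 17, 3: 18, 4: 20}
rolls = sorted(marks)

# Euclidean distance on a single attribute reduces to the absolute difference.
proximity = [
    [math.sqrt((marks[i] - marks[j]) ** 2) for j in rolls]
    for i in rolls
]

for roll, row in zip(rolls, proximity):
    print(roll, row)
```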

(f) Consider the following dataset, wherein TID represents transaction ID and
G to O represents individual products. In the dataset, 1 represents a
transaction that includes the specific product. For instance, TID 1
includes all products and TID 3 includes M and O product. Calculate
Confidence({G, A} => {M}).
TID G A M
1 1 1 1
2 1 0 1
3 0 0 1
4 0 1 0
5 1 1 1
6 1 1 0
[Evaluation Scheme] Full mark for the correct answer. No step-wise mark
to be awarded.
[Solution]
Confidence({G, A} => {M}) = Support(G, A, M)/Support(G, A) = [2/6] /
[3/6] = 0.667
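As a quick check, the same confidence can be computed in Python. This is a minimal sketch; the `support` helper is an illustrative name, not part of the original answer.

```python
# Transactions from the table: the set of products marked 1 for each TID.
transactions = {
    1: {"G", "A", "M"},
    2: {"G", "M"},
    3: {"M"},
    4: {"A"},
    5: {"G", "A", "M"},
    6: {"G", "A"},
}

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(itemset <= items for items in transactions.values())
    return hits / len(transactions)

confidence = support({"G", "A", "M"}) / support({"G", "A"})
print(round(confidence, 3))
```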
(g) Consider the decagon, which has 10 sides. Three sides are marked 1, two
sides are marked 2, one side is marked 3, two sides are marked 4, and two
sides are marked 5. Draw a graph representing the occurrence of each mark
versus its probability.
[Evaluation Scheme] Full mark for the correct answer. No step-wise mark
to be awarded.
[Solution]
Probability(side marked as 1) = 3 / 10 = 0.3
Probability(side marked as 2) = 2/10 = 0.2
Probability(side marked as 3) = 1 / 10 = 0.1
Probability(side marked as 4) = 2/10 = 0.2
Probability(side marked as 5) = 2/10 = 0.2
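The graph (not reproduced here) plots the marks 1 to 5 on the x-axis against
their probabilities 0.3, 0.2, 0.1, 0.2 and 0.2 on the y-axis.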

(h) Consider the following dataset. Consider support count is represented with
SC. Calculate (SC({E}) + SC({A, B}) + SC({C, D})) / (SC({A, B, C, E})
+ SC({A, B, C, D, E}))
Transaction Itemset
T1 A, B
T2 B, D
T3 B, C
T4 A, B, D
T5 A, C
T6 B, C
T7 A, B, C, E

[Evaluation Scheme] Full mark for the correct answer. No step-wise mark
to be awarded.
[Solution]
SC({E}) = 1, SC({A, B}) = 3, SC({C, D}) = 0, SC({A, B, C, E}) = 1, and
SC({A, B, C, D, E}) = 0
Numerator = (SC({E}) + SC({A, B}) + SC({C, D})) = 1 + 3 + 0 = 4
Denominator = SC({A, B, C, E}) + SC({A, B, C, D, E}) = 1 + 0 = 1
Therefore, (SC({E}) + SC({A, B}) + SC({C, D})) / (SC({A, B, C, E}) +
SC({A, B, C, D, E})) = 4/1 = 4
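The support counts can be verified with a short Python sketch; `sc` is an illustrative helper name.

```python
# Itemsets per transaction, from the table in question 1(h).
transactions = [
    {"A", "B"},
    {"B", "D"},
    {"B", "C"},
    {"A", "B", "D"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C", "E"},
]

def sc(itemset):
    """Support count: number of transactions containing the whole itemset."""
    return sum(itemset <= t for t in transactions)

numerator = sc({"E"}) + sc({"A", "B"}) + sc({"C", "D"})
denominator = sc({"A", "B", "C", "E"}) + sc({"A", "B", "C", "D", "E"})
result = numerator / denominator
print(numerator, denominator, result)
```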
(i) A bloom filter with a size of 1000 slots is used to store the information of
100 data stream items using 4 hash functions. Calculate the false positive
probability.
[Evaluation Scheme] Full mark for the correct answer. No step-wise mark
to be awarded.
[Solution]
n = size of bloom filter = 1000
m = number of expected elements to be inserted = 100
k = number of hash functions = 4

False positive probability = (1 - e^(-km/n))^k

e^(-km/n) = (1/2.718)^(4*100/1000) = 0.6703. So, (1 - 0.6703)^4 = 0.3297^4 ≈ 0.0118
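A minimal Python check of this estimate, using the approximation (1 - e^(-km/n))^k with the symbols as defined in this solution (n slots, m items, k hash functions); the variable names are illustrative.

```python
import math

n_slots = 1000   # n: size of the bloom filter
m_items = 100    # m: number of inserted elements
k_hashes = 4     # k: number of hash functions

# Probability that a given slot is still 0 after all insertions.
p_zero = math.exp(-k_hashes * m_items / n_slots)
# A false positive needs all k probed slots to be 1.
false_positive = (1 - p_zero) ** k_hashes

print(round(p_zero, 4), round(false_positive, 4))
```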
(j) What is the probability that a slot is hashed in a bloom filter where n is the
size and k is the number of hash functions?
[Evaluation Scheme] Full mark for the correct answer. No step-wise mark
to be awarded.
[Solution]
Probability that a slot is hashed with one hash function = 1/n, so with k
hash functions it is 1/nk

2. (a) Consider the following dataset. Draw the MapReduce process to find the number
of customers from each city followed by each state, both in the chronological
order.
ID   Name                 City        State
1    Sujay Lila           Ambikapur   Chhattisgarh
2    Geetha Choudhary     Bhilai      Chhattisgarh
3    Anandi D'Cruz        Bilaspur    Chhattisgarh
4    Surendra Nagarkar    Cuttack     Odisha
5    Balwinder Nagarkar   Bangalore   Karnataka
6    Nitin Nibhanupudi    Mangalore   Karnataka
7    Dinesh Sharma        Cuttack     Odisha
8    Raj Chaudhri         Bilaspur    Chhattisgarh
9    Govind Kumar         Mysore      Karnataka
10   Jayanta Begam        Ambikapur   Chhattisgarh
[Evaluation Scheme] Full mark for the correct answer. 2 marks for city
and 2 marks for state. Step-wise mark can be awarded based on the partial
correctness of the solution.
[Solution]
MapReduce process to find the number of customers from each city:

MapReduce process to find the number of customers from each state:
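The two MapReduce flows can be simulated in plain Python. This is a single-process sketch of the map, shuffle/sort and reduce phases, not a Hadoop job; the function names are illustrative and the shuffle sorts keys alphabetically, which is an assumption about the ordering the question asks for.

```python
from collections import defaultdict

# (ID, Name, City, State) records from the customer table.
customers = [
    (1, "Sujay Lila", "Ambikapur", "Chhattisgarh"),
    (2, "Geetha Choudhary", "Bhilai", "Chhattisgarh"),
    (3, "Anandi D'Cruz", "Bilaspur", "Chhattisgarh"),
    (4, "Surendra Nagarkar", "Cuttack", "Odisha"),
    (5, "Balwinder Nagarkar", "Bangalore", "Karnataka"),
    (6, "Nitin Nibhanupudi", "Mangalore", "Karnataka"),
    (7, "Dinesh Sharma", "Cuttack", "Odisha"),
    (8, "Raj Chaudhri", "Bilaspur", "Chhattisgarh"),
    (9, "Govind Kumar", "Mysore", "Karnataka"),
    (10, "Jayanta Begam", "Ambikapur", "Chhattisgarh"),
]

def map_phase(records, field_index):
    """Map: emit one (key, 1) pair per record."""
    return [(record[field_index], 1) for record in records]

def shuffle_phase(pairs):
    """Shuffle and sort: group the emitted values by key, keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_phase(groups):
    """Reduce: sum the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}

city_counts = reduce_phase(shuffle_phase(map_phase(customers, 2)))
state_counts = reduce_phase(shuffle_phase(map_phase(customers, 3)))

print(city_counts)
print(state_counts)
```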

(b) A retail company wants to enhance their customer experience by analysing
the customer reviews for different products, so that they can inform the
corresponding vendors and manufacturers about the product defects and
shortcomings. You have been tasked to analyse the complaints filed under
each product & the total number of complaints filed based on the
geography, type of product, etc. You also have to figure out the complaints
which have no timely response. Discuss and then model your views
concerning descriptive, diagnostic and predictive analytics.
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
Descriptive analytics model: the model should use historical and current
data to answer the question "what has happened", using data analytics
techniques such as box plots. A few examples are as follows:
(1) Find the number of complaints by geography and type of product.
(2) Which geography contributed the maximum number of negative
review comments?
(3) Which product type has the maximum number of positive review
comments?
Diagnostic analytics model: the model should use historical and current
data to answer the question "why has it happened", using data analytics
techniques such as drill-through and root cause analysis with a fishbone
diagram. A few examples are as follows:
(1) Why is the number of complaints high for the Asian geography and
food and beverages products?
(2) Why did males in the Asian geography provide the maximum number
of negative review comments?
(3) Why did the features of the beauty care product type collect the
maximum number of positive review comments?
Predictive analytics model: the model should use historical and current
data to answer the question "what will happen in the future", using data
analytics techniques such as regression, clustering and classification. A
few examples are as follows:
(1) What would be the total number of complaints for the Asian geography
and food and beverages products by the end of this quarter?
(2) What is the expected number of negative review comments from the
Asian geography by the end of this month?
(3) What would be the sales by the end of this year?

3. (a) In the population, the average IQ is 100 with a standard deviation of 15. A
team of scientists wants to test a new medication to see if it has either a
positive or negative effect on intelligence or no effect at all. A sample of
30 participants who have taken the medication has a mean of 140. Using
hypothesis testing, find the answer to the question i.e., did the medication
affect intelligence? The z value (i.e., critical value) from statistical table is
found to be 1.96. The solution must mention the null (H0) and alternative
hypotheses (Ha).
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
Step 1: Set up the null and alternative hypotheses
H0: the medication does not affect intelligence (μ = 100)
Ha: the medication affects intelligence (μ ≠ 100)
Step 2: Determine the type of test to use
Since the population standard deviation is known and the sample size is 30,
the z-test is used.
Step 3: Calculate the test statistic z using the formula
z = ((x̄ - μ0) / σ) * √n
where x̄ is the mean of the sample, μ0 is the population mean under the null
hypothesis, σ is the standard deviation, and n is the sample size.
Using the data given, we have the following:
μ0 = 100, σ = 15, n = 30, x̄ = 140
Plugging the values into the formula: ((140 - 100) / 15) * √30 = 14.606
Step 4: In the question, the critical z value is provided (1.96), hence there
is no need to look into the z table.
Step 5: Draw the conclusion
The calculated test statistic is greater than the critical value obtained
from statistical tables (14.606 > 1.96). Therefore the null hypothesis is
rejected. This means that the medication does affect intelligence.
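The test statistic and decision can be sketched in Python. The sketch assumes the usual convention that the null hypothesis states "no effect" (mean IQ stays at 100) and that the test is two-sided with critical value 1.96; variable names are illustrative.

```python
import math

mu0 = 100          # population mean under the null hypothesis
sigma = 15         # population standard deviation
n = 30             # sample size
sample_mean = 140  # mean IQ of the sample that took the medication
critical = 1.96    # critical z value given in the question

# z = (sample mean - hypothesised mean) / standard error of the mean
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

# Two-sided test: reject H0 when |z| exceeds the critical value.
reject_null = abs(z) > critical

print(round(z, 3), reject_null)
```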
(b) Find the relationships of salary between millennials (between the ages of
18 and 34), gen X (between the ages of 35 and 50) and baby boomers
(aged 51 and above) of below sample by plotting multiple boxplots in one
graph.
Gender Age Salary
Male 20 81600
Female 55 61600
Male 38 64300
Female 25 71900
Male 58 76300
Male 45 68200
Female 30 60900
Female 49 78600
Male 60 81700
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
The data points (i.e., salary) for millennials = {81600, 71900, 60900}
The data points (i.e., salary) for gen X = {64300, 68200, 78600}
The data points (i.e., salary) for baby boomers = {61600, 76300, 81700}

Plotting multiple boxplots in one graph means visualizing the millennials,
gen X and baby boomers boxplots side by side in the same graphic.
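The values each boxplot would display can be sketched with the standard library. The quartile convention (`method="inclusive"`) is an assumption; plotting tools may use a slightly different rule, which matters with only three points per group.

```python
import statistics

# Salaries grouped by the age bands defined in the question.
salaries = {
    "millennials": [81600, 71900, 60900],
    "gen X": [64300, 68200, 78600],
    "baby boomers": [61600, 76300, 81700],
}

# Five-number summary per group: (min, Q1, median, Q3, max).
summary = {}
for group, values in salaries.items():
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    summary[group] = (min(values), q1, median, q3, max(values))

for group, stats in summary.items():
    print(group, stats)
```

Placing the three summaries side by side shows, for example, that the baby boomers group has the highest median salary in this sample.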

4. (a) A consumer electronics company has adopted an aggressive policy to
increase sales of a newly launched product. The company has invested in
advertisements as well as employed salesmen for increasing sales rapidly.
Below dataset presents the sales, the number of employed salesmen, and
advertisement expenditure for 4 randomly selected months. Develop a
regression model to predict the impact of advertisement and the number of
salesmen on sales.
Month No 1 2 3 4
Sales 5000 5200 5700 6300
Salesmen 25 35 15 27
Advertisement 180 250 150 240
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
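One way to develop such a model is ordinary least squares. The following is a minimal sketch, assuming a model of the form Sales = b0 + b1 * Salesmen + b2 * Advertisement and solving the normal equations (X'X) b = X'y by Gaussian elimination; the helper names are illustrative and not part of the original answer.

```python
sales = [5000, 5200, 5700, 6300]
salesmen = [25, 35, 15, 27]
advertisement = [180, 250, 150, 240]

# Design matrix with an intercept column.
X = [[1.0, s, a] for s, a in zip(salesmen, advertisement)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (aug[i][size] - tail) / aug[i][i]
    return x

# Normal equations: (X'X) b = X'y
columns = list(zip(*X))
XtX = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
Xty = [sum(p * y for p, y in zip(ci, sales)) for ci in columns]
b0, b1, b2 = solve(XtX, Xty)

predicted = [b0 + b1 * s + b2 * a for s, a in zip(salesmen, advertisement)]
residuals = [y - p for y, p in zip(sales, predicted)]

print("Sales = %.2f + %.2f * Salesmen + %.2f * Advertisement" % (b0, b1, b2))
```

With only 4 months of data and 3 coefficients, the fitted equation leaves a single degree of freedom, so it should be read as an illustration of the method rather than a reliable forecast.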

(b) Explain non-linear regression with a suitable example. Subsequently,
narrate the second degree (quadratic), third degree (cubic) and n degree
polynomial mathematical models. In general, what techniques are
applied to determine the right degree of the model?
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
In linear and multiple linear regression, the dependent variable is
linearly dependent on the independent variable(s). In several situations,
however, things are not so simple and the two variables may be related in
a non-linear way. This may be the case when correlation analysis shows no
linear relationship, yet the variables are still closely related. If the
data analysis shows a non-linear (also known as curvilinear) association
between the two variables, a non-linear regression model needs to be
developed. Imagine a dataset whose scatter plot follows a curve rather
than a straight line.

Non-linear data can be handled in 2 ways:
- Use a polynomial rather than a linear regression model.
- Transform the data and then use a linear regression model.
The polynomial mathematical models are represented below:
Second degree: y = β0 + β1x + β2x^2 + e
Third degree: y = β0 + β1x + β2x^2 + β3x^3 + e
n degree: y = β0 + β1x + β2x^2 + β3x^3 + ... + βnx^n + e
To determine the right degree of the model, 2 approaches are followed:
Forward Selection: This method increases the degree until it is significant
enough to define the best possible model.
Backward Elimination: This method decreases the degree until it is
significant enough to define the best possible model.

5. (a) Consider the following dataset consisting of 6 observations that depicts
automobile battery sales. Using Simple Exponential Smoothing, calculate
the forecasted value of month 7 by calculating the smoothed observation (S_t)
for each month and the mean of the squared errors. The smoothing constant is
0.5 and the S_1 value is 20.
Month No Actual
1 20
2 22
3 21
4 18
5 17
6 23
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
S_t = α * Y_(t-1) + (1 - α) * S_(t-1), where α = 0.5 and S_1 = 20

S_2 = 0.5 * 20 + 0.5 * 20 = 10 + 10 = 20
S_3 = 0.5 * 22 + 0.5 * 20 = 11 + 10 = 21
S_4 = 0.5 * 21 + 0.5 * 21 = 10.5 + 10.5 = 21
S_5 = 0.5 * 18 + 0.5 * 21 = 9 + 10.5 = 19.5
S_6 = 0.5 * 17 + 0.5 * 19.5 = 8.5 + 9.75 = 18.25
S_7 = 0.5 * 23 + 0.5 * 18.25 = 11.5 + 9.125 = 20.625

Month   Actual (Y_t)   Forecast (S_t)   Error   Squared error
1       20             20               0       0
2       22             20               2       4
3       21             21               0       0
4       18             21               -3      9
5       17             19.5             -2.5    6.25
6       23             18.25            4.75    22.5625
7                      20.625

Sum of squared errors = 0 + 4 + 0 + 9 + 6.25 + 22.5625 = 41.8125
Mean squared error = 41.8125 / 6 = 6.969
The forecasted value for month 7 is 20.625.
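The recurrence S_t = α * Y_(t-1) + (1 - α) * S_(t-1) can be evaluated with a short Python sketch, which is a convenient way to check the hand arithmetic for every month. Variable names are illustrative; the mean squared error is taken over all 6 months, as in this solution.

```python
actual = [20, 22, 21, 18, 17, 23]  # Y_1 .. Y_6
alpha = 0.5                        # smoothing constant
smoothed = [20.0]                  # S_1 is given as 20

# Each new smoothed value uses the previous actual and previous smoothed value.
for y_prev in actual:
    smoothed.append(alpha * y_prev + (1 - alpha) * smoothed[-1])

forecasts = smoothed[:6]           # S_1 .. S_6, matched against Y_1 .. Y_6
forecast_month_7 = smoothed[6]     # S_7

errors = [y - s for y, s in zip(actual, forecasts)]
squared_errors = [e ** 2 for e in errors]
mse = sum(squared_errors) / len(squared_errors)

print(forecasts, forecast_month_7, round(mse, 3))
```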

(b) Consider the following dataset capturing monthly sales of actual vs.
predicted of an Indian B2C (business to customer) firm. The sales figures
are in lakh and presented in INR.

Month No 1 2 3 4
Actual 112 113 122 120
Predicted 113 115 121 119
As a data consultant, the B2C firm hires you for the following and you
need to justify your response.
(1) Determine the hybrid error, where the hybrid error is defined as 0.3 *
MSE + 0.25 * RMSE.
(2) Determine MAPE.
[Evaluation Scheme] Full mark for the correct answer. 2 marks for hybrid
error and rest 2 marks for MAPE calculation. Step-wise mark can be
awarded based on the partial correctness of the solution.
[Solution]
The hybrid error calculation is as follows:
Month No 1 2 3 4
Actual 112 113 122 120
Predicted 113 115 121 119
Error -1 -2 1 1
Squared Error 1 4 1 1
Sum of Square Error = 1 + 4 +1 + 1 = 7
Mean Square Error (MSE) = 7/ 4 = 1.75
Root Mean Square Error (RMSE) = √MSE = √1.75 = 1.323
So, hybrid error = 0.3 * 1.75 + 0.25 * 1.323 = 0.856

The MAPE calculation is as follows:

Month No                          1         2         3         4
Actual                            112       113       122       120
Predicted                         113       115       121       119
| Predicted - Actual |            1         2         1         1
| Predicted - Actual | / Actual   0.00893   0.01770   0.00820   0.00833

SUM(| Predicted - Actual | / Actual) = 0.00893 + 0.01770 + 0.00820 + 0.00833 = 0.04316

MAPE = (100/4) * 0.04316 = 1.079, i.e., about 1.08%
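Both error measures can be recomputed in Python without intermediate rounding. This is a minimal sketch; the weights 0.3 and 0.25 come from the question, everything else is illustrative naming.

```python
import math

actual = [112, 113, 122, 120]
predicted = [113, 115, 121, 119]
n = len(actual)

errors = [a - p for a, p in zip(actual, predicted)]

# Mean squared error and its square root.
mse = sum(e ** 2 for e in errors) / n
rmse = math.sqrt(mse)

# Hybrid error as defined in the question.
hybrid_error = 0.3 * mse + 0.25 * rmse

# Mean absolute percentage error, in percent.
mape = (100 / n) * sum(abs(e) / a for e, a in zip(errors, actual))

print(mse, round(rmse, 3), round(hybrid_error, 3), round(mape, 3))
```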

6. (a) Consider the following transactional data in which minimum support is 2
and minimum confidence is 50%. Find frequent itemsets and generate
association rules for them by illustrating it with step-by-step process.
Transactions List of items
T1 I1, I2, I3
T2 I2, I3, I4
T3 I4, I5
T4 I1, I2, I4
T5 I1, I2, I3, I5
T6 I1, I2, I3, I4
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
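The step-by-step process can be sketched with a level-wise (Apriori-style) search in Python: count candidates, keep those meeting the minimum support count of 2, join the survivors into larger candidates, and finally keep rules whose confidence is at least 50%. The function and variable names are illustrative, not taken from the original answer.

```python
from itertools import combinations

transactions = {
    "T1": {"I1", "I2", "I3"},
    "T2": {"I2", "I3", "I4"},
    "T3": {"I4", "I5"},
    "T4": {"I1", "I2", "I4"},
    "T5": {"I1", "I2", "I3", "I5"},
    "T6": {"I1", "I2", "I3", "I4"},
}
MIN_SUPPORT = 2      # minimum support count
MIN_CONFIDENCE = 0.5

def support_count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(itemset <= items for items in transactions.values())

def find_frequent_itemsets():
    items = sorted({i for t in transactions.values() for i in t})
    level = [frozenset([i]) for i in items
             if support_count(frozenset([i])) >= MIN_SUPPORT]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support_count(itemset)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                 and support_count(c) >= MIN_SUPPORT]
    return frequent

def generate_rules(frequent):
    rules = {}
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                lhs = frozenset(lhs)
                confidence = count / frequent[lhs]
                if confidence >= MIN_CONFIDENCE:
                    rules[(lhs, itemset - lhs)] = confidence
    return rules

frequent = find_frequent_itemsets()
rules = generate_rules(frequent)

for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

For this data the search stops at 3-itemsets: {I1, I2, I3}, {I1, I2, I4} and {I2, I3, I4} are frequent, while {I1, I2, I3, I4} appears only in T6.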

(b) Consider the following dataset.
Basket Product 1 Product 2 Product 3
1 Milk Cheese
2 Milk Apple Cheese
3 Apple Banana
4 Milk Cheese
5 Apple Banana
6 Milk Cheese Banana
Calculate Support, Confidence and Lift for the following:
(1) Apple, Milk
(2) (Apple, Milk) => Cheese
(3) Milk => Cheese
(4) (Apple, Cheese) => Milk
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]

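The support formula, written out:
Support(A) = (number of baskets containing A) / (total number of baskets)

The confidence formula, written out:
Confidence(A => B) = Support(A and B) / Support(A)

The lift formula, written out:
Lift(A => B) = Support(A and B) / (Support(A) * Support(B))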
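The three measures can be evaluated for the basket data with a short Python sketch that uses the standard definitions (support as a fraction of baskets, confidence as Support(A and B) / Support(A), lift as confidence divided by Support(B)). Item (1) is treated here as the rule Apple => Milk when a confidence and lift are needed, which is an assumption; helper names are illustrative.

```python
from fractions import Fraction

# Contents of the six baskets.
baskets = [
    {"Milk", "Cheese"},
    {"Milk", "Apple", "Cheese"},
    {"Apple", "Banana"},
    {"Milk", "Cheese"},
    {"Apple", "Banana"},
    {"Milk", "Cheese", "Banana"},
]

def support(items):
    """Fraction of baskets containing all of the given items."""
    return Fraction(sum(items <= b for b in baskets), len(baskets))

def confidence(lhs, rhs):
    return support(lhs | rhs) / support(lhs)

def lift(lhs, rhs):
    return confidence(lhs, rhs) / support(rhs)

cases = [
    ({"Apple"}, {"Milk"}),
    ({"Apple", "Milk"}, {"Cheese"}),
    ({"Milk"}, {"Cheese"}),
    ({"Apple", "Cheese"}, {"Milk"}),
]
for lhs, rhs in cases:
    print(sorted(lhs), "=>", sorted(rhs),
          support(lhs | rhs), confidence(lhs, rhs), lift(lhs, rhs))
```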

7. (a) Consider the following hypothetical dataset concerning student
characteristics and whether or not each student should be hired. Use a Naive
Bayes classifier to determine whether or not someone with poor GPA and
lots of effort should be hired.
Name GPA Effort Hirable?
Sarah Poor Lots Yes
Dana Average Some No
Alex Average Some No
Annie Average Some Yes
Emily Excellent Lots Yes
Pete Excellent Lots No
John Excellent Lots No
Kathy Poor Some No
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
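The Naive Bayes calculation can be sketched in Python with exact fractions. The sketch assumes plain relative-frequency estimates with no smoothing; the `score` helper is an illustrative name.

```python
from fractions import Fraction

# (GPA, Effort, Hirable?) for each student in the table.
rows = [
    ("Poor", "Lots", "Yes"),       # Sarah
    ("Average", "Some", "No"),     # Dana
    ("Average", "Some", "No"),     # Alex
    ("Average", "Some", "Yes"),    # Annie
    ("Excellent", "Lots", "Yes"),  # Emily
    ("Excellent", "Lots", "No"),   # Pete
    ("Excellent", "Lots", "No"),   # John
    ("Poor", "Some", "No"),        # Kathy
]

def score(label, gpa, effort):
    """P(label) * P(gpa | label) * P(effort | label), unnormalised."""
    in_class = [r for r in rows if r[2] == label]
    prior = Fraction(len(in_class), len(rows))
    p_gpa = Fraction(sum(r[0] == gpa for r in in_class), len(in_class))
    p_effort = Fraction(sum(r[1] == effort for r in in_class), len(in_class))
    return prior * p_gpa * p_effort

score_yes = score("Yes", "Poor", "Lots")
score_no = score("No", "Poor", "Lots")
prediction = "Yes" if score_yes > score_no else "No"

print(score_yes, score_no, prediction)
```

Under these assumptions the score for Yes (3/8 * 1/3 * 2/3 = 1/12) exceeds the score for No (5/8 * 1/5 * 2/5 = 1/20), so the classifier would label a student with poor GPA and lots of effort as hirable.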

(b) Demonstrate the step-by-step process of agglomerative hierarchical
clustering with the following dataset. In addition, illustrate the merges with
a dendrogram (keep the threshold as 5). Use Manhattan distance for the
construction of the matrix.
Roll Mark
1 80
2 90
3 65
4 75
5 95
6 55
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
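The merge sequence can be sketched in Python. The question does not name a linkage criterion, so this sketch assumes single linkage (distance between clusters = smallest pairwise Manhattan distance); with complete or average linkage the later merge heights would differ. Names are illustrative.

```python
# Marks keyed by roll number.
marks = {1: 80, 2: 90, 3: 65, 4: 75, 5: 95, 6: 55}

def manhattan(a, b):
    """Manhattan distance on the single Mark attribute."""
    return abs(marks[a] - marks[b])

def single_link(c1, c2):
    """Assumed linkage: smallest distance between members of two clusters."""
    return min(manhattan(a, b) for a in c1 for b in c2)

clusters = [(roll,) for roll in sorted(marks)]
merges = []  # (cluster, cluster, distance) in merge order

while len(clusters) > 1:
    # Pick the closest pair of clusters; ties go to the earliest pair.
    distance, i, j = min(
        (single_link(clusters[i], clusters[j]), i, j)
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
    )
    merges.append((clusters[i], clusters[j], distance))
    merged = clusters[i] + clusters[j]
    clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
    clusters.append(merged)

heights = [d for _, _, d in merges]
for left, right, d in merges:
    print(left, "+", right, "at distance", d)
```

The merge distances are the heights at which branches join in the dendrogram. Cutting at the threshold of 5 keeps only the merges made at distance 5, leaving the clusters {1, 4}, {2, 5}, {3} and {6}.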

8. (a) Design an optimised algorithm for the updation of an element in a Bloom
filter.
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
can be awarded based on the partial correctness of the solution.
[Solution]
The steps of the optimised algorithm are as follows:
1. Clear the bloom filter.
2. Insert all the elements into the bloom filter except the element to
be updated.
3. Insert the updated value into the bloom filter.
The insert function code is as follows.
insert(e)
begin
    /* Loop over all k hash functions */
    for j : 1 . . . k do
        m ← hj(e)      // apply the jth hash function on e
        Bm ← bf[m]     // retrieve the value at the mth position of Bloom filter bf
        if Bm == 0 then
            /* Bloom filter had a zero bit at index m, so set it */
            bf[m] ← 1
        end if
    end for
end

The clear function code is as follows.

clear()
begin
    for i : 1 . . . n do   // n is the size of the bloom filter
        bf[i] ← 0
    end for
end
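The three steps (clear, re-insert everything except the old element, insert the new value) can be sketched in Python. The hash family below, SHA-256 salted with the hash index, is a hypothetical choice made only so the example runs; the class and function names are also illustrative.

```python
import hashlib

class BloomFilter:
    def __init__(self, size, num_hashes):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size

    def _positions(self, element):
        # Hypothetical hash family: SHA-256 of "j:element" for j = 0..k-1.
        for j in range(self.num_hashes):
            digest = hashlib.sha256(f"{j}:{element}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def insert(self, element):
        for pos in self._positions(element):
            self.bits[pos] = 1

    def clear(self):
        self.bits = [0] * self.size

    def contains(self, element):
        return all(self.bits[pos] for pos in self._positions(element))

def update(bloom, elements, old, new):
    """Rebuild the filter so that old is replaced by new."""
    bloom.clear()                 # step 1
    for element in elements:      # step 2
        if element != old:
            bloom.insert(element)
    bloom.insert(new)             # step 3

elements = ["apple", "banana", "cherry"]
bf = BloomFilter(size=1000, num_hashes=4)
for element in elements:
    bf.insert(element)

update(bf, elements, old="banana", new="blueberry")
print(bf.contains("blueberry"), bf.contains("apple"))
```

Rebuilding is needed because a standard Bloom filter cannot delete an element safely: a bit set by the old element may also belong to another element.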
(b) Consider a Bloom Filter of size 11, with integers as stream elements and
two hash functions as follows:
- H1(x) = take the odd-numbered bits from the right in the binary
representation of x. Subsequently, treat it as an integer i, and the result
is i modulo 11.
- H2(x) = same, but take the even-numbered bits.

(1) Find the filter after the insertion of elements 25, 15 and 35.
(2) Check whether the element y=18 exists in the bloom filter or not. Is it
the case of False Positive or False Negative? Explain.
[Evaluation Scheme] Full mark for the correct answer. Step-wise mark
should be awarded based on the partial correctness of the solution.
[Solution]
Step 1: Initialization of the bloom filter (slots indexed 0 to 10)
0 0 0 0 0 0 0 0 0 0 0
Step 2:
- Insertion of 25:
(25)10 = (11001)2
The odd-numbered bits from the right (positions 1, 3, 5) of (11001)2 give
(101)2 = 5, so H1(25) = 5 mod 11 = 5
The even-numbered bits from the right (positions 2, 4) of (11001)2 give
(10)2 = 2, so H2(25) = 2 mod 11 = 2
The revised bloom filter is as follows:
0 0 1 0 0 1 0 0 0 0 0
- Insertion of 15:
(15)10 = (1111)2
The odd-numbered bits from the right of (1111)2 give (11)2 = 3, so
H1(15) = 3 mod 11 = 3
The even-numbered bits from the right of (1111)2 give (11)2 = 3, so
H2(15) = 3 mod 11 = 3
The revised bloom filter is as follows:
0 0 1 1 0 1 0 0 0 0 0
- Insertion of 35:
(35)10 = (100011)2
The odd-numbered bits from the right (positions 1, 3, 5) of (100011)2 give
(001)2 = 1, so H1(35) = 1 mod 11 = 1
The even-numbered bits from the right (positions 2, 4, 6) of (100011)2 give
(101)2 = 5, so H2(35) = 5 mod 11 = 5
The revised bloom filter is as follows:
0 1 1 1 0 1 0 0 0 0 0
Step 3:
Membership test of 18
(18)10 = (10010)2
The odd-numbered bits from the right (positions 1, 3, 5) of (10010)2 give
(100)2 = 4, so H1(18) = 4 mod 11 = 4
The even-numbered bits from the right (positions 2, 4) of (10010)2 give
(01)2 = 1, so H2(18) = 1 mod 11 = 1
Since slot 4 of the bloom filter is 0, it is concluded that 18 definitely
does not exist in the bloom filter. This is neither a false positive nor a
false negative: a bloom filter never produces false negatives, so a 0 in
any hashed slot is a definite "not present".
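The two hash functions can be checked with a Python sketch. It assumes that "treat it as an integer" means reading the selected bits as a binary number (so the bits 101 mean 5), and it indexes the 11 slots from 0 to 10; the helper names are illustrative.

```python
def select_bits(x, start):
    """Integer formed by every second bit of x, beginning `start` places from
    the right (0 selects bits 1, 3, 5, ...; 1 selects bits 2, 4, 6, ...)."""
    lsb_first = bin(x)[2:][::-1]          # binary digits, rightmost bit first
    chosen = lsb_first[start::2][::-1]    # back to most significant bit first
    return int(chosen, 2) if chosen else 0

def h1(x):
    return select_bits(x, 0) % 11   # odd-numbered bits from the right

def h2(x):
    return select_bits(x, 1) % 11   # even-numbered bits from the right

bloom = [0] * 11
for element in (25, 15, 35):
    bloom[h1(element)] = 1
    bloom[h2(element)] = 1

# Membership test: 18 can be present only if both of its slots are set.
maybe_present = bloom[h1(18)] == 1 and bloom[h2(18)] == 1

print(bloom, (h1(18), h2(18)), maybe_present)
```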
