Lecture 03 - Student Version
Analysis
Going back to the marketing meeting
• Remember working at the Marketing and Analytics division of
Nykaa?
• You propose running more advertisements on the website, this
time featuring a well-noted celebrity to improve Nykaa’s sales.
• Congratulations, you successfully completed it!
• One fine day, suddenly the management calls you in for a
quarterly meeting, and says:
• What kind of data do you think you need to handle such questions?
• At this point, you may only have a vague idea of the kind of data you would need to collect.
• By the end of this course, you should know how to use statistical methods to formally evaluate the impact of
advertising campaigns.
Hands-On Data Exercise: Part 1
• Download the dataset showing Ad data and Nykaa
Sales
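• In the form of y = mx + c, where y is the dependent variable, x is the independent variable, m is the regression coefficient (the slope) and c is the intercept. The regression model adds an error term to this line.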
Scatter Plot
• Value of correlation coefficient = 0.9069, which is high
• Variables are highly positively correlated
[Figure: scatter plot of Ice Tea Orders (y-axis) against Temp (Celsius) (x-axis)]
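A minimal sketch of how a correlation coefficient like the one above can be computed by hand. The temperature and order numbers below are made up for illustration and are not the lecture's dataset, so the result will not equal 0.9069. The function name `pearson_r` is also just a label chosen here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative (made-up) data: temperature in Celsius and iced tea orders
temps = [22, 24, 26, 28, 30, 32, 34]
orders = [45, 52, 58, 66, 75, 80, 92]

r = pearson_r(temps, orders)
print(round(r, 4))  # a value close to +1 means a strong positive linear relationship
```

A value near +1 indicates that orders rise almost linearly with temperature; a value near -1 would indicate the opposite direction.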
• Draw a straight line following the pattern in the data
Scatter Plot
• Select the graph, click on the + sign, and select the “trendline” option
[Figure: scatter plot of Ice Tea Orders against Temp (Celsius) with a trendline]
• The trendline does not pass through every point: there is a vertical distance between the line and the actual values.
• This distance is the residual.
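The idea of a residual can be sketched in a few lines of code. The data points and the trendline's intercept and slope below are hypothetical values chosen only for illustration; they do not come from the lecture's dataset or from a spreadsheet trendline.

```python
def predict(x, intercept, slope):
    """Value on the line for a given x."""
    return intercept + slope * x

def residuals(xs, ys, intercept, slope):
    """Actual value minus the value on the line, for every point."""
    return [y - predict(x, intercept, slope) for x, y in zip(xs, ys)]

# Illustrative (made-up) data and a hypothetical trendline
temps = [22, 26, 30, 34]
orders = [45, 58, 75, 92]
intercept, slope = -40.0, 3.8

res = residuals(temps, orders, intercept, slope)
for t, o, e in zip(temps, orders, res):
    # Positive residual: the point lies above the line; negative: below it
    print(f"temp={t}  actual={o}  residual={e:+.1f}")
```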
[Figure: scatter plot of Ice Tea Orders against Temp (Celsius) with the fitted line]
• Yi = α + βXi + εi
• For a value Xi of the independent variable, the estimated value of the dependent variable Yi is given by Ŷi = α + βXi
• The difference between the actual value Yi and the estimated value Ŷi is the residual:
• ei = Yi - Ŷi
• Fitting the line by minimizing the sum of the squared residuals is known as linear least squares regression.
• We square the residuals to find the sum of squares, which we use to find the regression equation.
• Using calculus, the unknowns ‘a’ and ‘b’ (the estimates of α and β) are obtained by setting the first derivatives of the sum of squared errors with respect to ‘a’ and ‘b’ equal to zero.
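The calculus step can be written out as follows. This is the standard textbook derivation for a single independent variable, with ‘a’ and ‘b’ denoting the estimated intercept and slope; the symbols S, X̄ and Ȳ (the sum of squared errors and the sample means) are notation introduced here, not taken from the slides.

```latex
\begin{align*}
S(a, b) &= \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( Y_i - a - b X_i \right)^2 \\[4pt]
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{n} \left( Y_i - a - b X_i \right) = 0 \\[4pt]
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} X_i \left( Y_i - a - b X_i \right) = 0 \\[4pt]
\Rightarrow \quad b &= \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad a = \bar{Y} - b \bar{X}
\end{align*}
```

The first condition says the residuals sum to zero; substituting a = Ȳ - bX̄ from it into the second condition gives the expression for b.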
Raw materials
of step 2
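Setting those first derivatives to zero gives the standard closed-form solution b = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)² and a = Ȳ - bX̄. The sketch below applies it to made-up numbers (not the lecture's dataset) and checks that nudging the intercept away from the fitted value increases the sum of squared errors. The function names are illustrative choices.

```python
def fit_least_squares(xs, ys):
    """Return (a, b): intercept and slope minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

def sum_squared_errors(xs, ys, a, b):
    """Sum of squared residuals for the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Illustrative (made-up) data: temperature in Celsius and iced tea orders
temps = [22, 24, 26, 28, 30, 32, 34]
orders = [45, 52, 58, 66, 75, 80, 92]

a_hat, b_hat = fit_least_squares(temps, orders)
sse_fit = sum_squared_errors(temps, orders, a_hat, b_hat)
sse_shifted = sum_squared_errors(temps, orders, a_hat + 1, b_hat)

print(f"intercept a = {a_hat:.3f}, slope b = {b_hat:.3f}")
print(f"SSE at fitted line = {sse_fit:.3f}, SSE with shifted intercept = {sse_shifted:.3f}")
```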