Marketing Analytics Unit 3
SHREY SHUKLA
Introduction to Regression Models
1. Feature Selection
1. Use correlation analysis to identify variables strongly related to sales.
2. Apply domain knowledge to ensure variables are logical and meaningful (e.g., promotions, market size).
3. Example: Exclude variables with weak or no relationship to sales performance.
2. Model Estimation
1. Use statistical tools like Excel, programming languages like Python (scikit-learn), or software like R to
perform regression analysis.
2. Steps include:
1. Inputting data.
2. Running the regression algorithm to estimate coefficients.
3. Model Evaluation
1. Adjusted R-Squared: Measures the share of variation in sales explained by the
independent variables, with a penalty for each additional predictor.
2. Residual Analysis: Checks the difference between actual and predicted values to
identify model errors.
3. Example: A high adjusted R-squared and small, randomly scattered residuals indicate a good fit.
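The three stages above (feature selection, estimation, evaluation) can be sketched in a few lines of Python. This is a minimal illustration using NumPy rather than scikit-learn; the helper name `fit_ols` and the eight data points are hypothetical and made up for the example.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... + bp*xp by ordinary least squares.

    Returns the coefficients (intercept first), the residuals,
    R-squared, and adjusted R-squared.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape                          # n observations, p predictors
    A = np.column_stack([np.ones(n), X])    # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef                # actual minus predicted
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-squared penalises each extra predictor
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return coef, residuals, r2, adj_r2

# Hypothetical monthly data: columns are price and advertising spend
X = np.array([[10.0, 5.0], [9.5, 6.0], [9.0, 6.5], [10.5, 4.0],
              [8.5, 8.0], [9.0, 7.0], [10.0, 5.5], [8.0, 9.0]])
sales = np.array([200.0, 215.0, 228.0, 185.0, 255.0, 236.0, 206.0, 270.0])

# Feature selection: correlation of each candidate variable with sales
corr_with_sales = [np.corrcoef(X[:, j], sales)[0, 1] for j in range(X.shape[1])]

# Estimation and evaluation
coef, residuals, r2, adj_r2 = fit_ols(X, sales)
print("correlations:", corr_with_sales)
print("coefficients:", coef)
print("adjusted R-squared:", adj_r2)
```

In this made-up data, price correlates negatively with sales and advertising spend positively. Real data would also need the residual checks described above.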
Real-Life Example
• Scenario: A retail chain forecasts monthly sales using the following variables:
• Price: Discounts or promotions on products.
• Advertising Spend: Budget allocated to marketing campaigns.
• Economic Indicators: Consumer confidence index or inflation rates.
MODEL OUTCOME
Challenges
1. Multicollinearity
1. Occurs when independent variables are highly correlated, making it difficult to determine their individual
impact on the dependent variable.
2. Example: Advertising spend and promotional discounts might both increase sales, but their effects can
overlap.
2. Overfitting
1. Occurs when a model is too complex: it fits the training data very closely but performs poorly on new data.
2. Example: Including too many variables may capture noise instead of true patterns.
3. Data Quality Issues
1. Missing, inaccurate, or inconsistent data can distort the model's output.
2. Example: Sales data recorded differently across branches or with gaps can lead to unreliable forecasts.
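The overfitting challenge can be seen by comparing a simple and a complex model on the same data. The sketch below is only an illustration: the data are simulated, and the polynomial degrees (1 and 6) and the alternating train/test split are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the run is repeatable

# Simulated data: sales depend linearly on ad spend, plus random noise
ad_spend = np.linspace(0.0, 1.0, 30)
sales = 50.0 + 20.0 * ad_spend + rng.normal(0.0, 2.0, size=30)

# Alternate the points between a training set and a test set
train = np.arange(30) % 2 == 0
test = ~train

def mse(degree):
    """Fit a polynomial of the given degree on the training points and
    return (training MSE, test MSE)."""
    coefs = np.polyfit(ad_spend[train], sales[train], degree)
    err = (sales - np.polyval(coefs, ad_spend)) ** 2
    return err[train].mean(), err[test].mean()

simple_train, simple_test = mse(1)      # straight line
complex_train, complex_test = mse(6)    # flexible curve

print("degree 1: train", simple_train, "test", simple_test)
print("degree 6: train", complex_train, "test", complex_test)
```

The more flexible model always has training error at least as low as the straight line's, because it can also chase the noise. Whether its test error is worse depends on the data, which is why the comparison has to be made on points the model has not seen.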
Best Practices
❑Simplify Models
❑ Include only relevant and meaningful variables to avoid overfitting and ensure
interpretability.
❑ Example: Instead of using 20 variables, focus on the top 5 that strongly influence sales.
❑Test Assumptions
❑ Validate the linearity, normality of residuals, and homoscedasticity (constant variance of
errors).
❑ Example: Plot residuals against predicted values; they should scatter randomly around zero with no visible pattern.
❑ Validate with New Data
❑ Use a separate test dataset or cross-validation to evaluate the model's performance on unseen data.
❑ Example: Train the model on 2022 sales data and test it using 2023 data.
❑ Address Multicollinearity
❑ Use Variance Inflation Factor (VIF) to identify highly correlated variables, then drop or combine them (a VIF above roughly 5 to 10 is a common warning sign).
❑ Example: If "ad spend" and "online ad spend" are strongly correlated, retain only one.
❑ Regular Updates
❑ Re-train the model periodically with the latest data to account for market changes.
❑ Example: Update the regression model annually to reflect evolving customer behavior.
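The VIF check mentioned above can be computed directly from its definition: regress each predictor on all the others and take 1 / (1 - R²). The sketch below assumes NumPy; the function name `vif` and the data columns are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance Inflation Factor for each column of X.

    VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing column j
    on all the other columns (with an intercept). A column that is an
    exact linear combination of the others gives a division by zero.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - np.sum(resid ** 2) / np.sum((target - target.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Hypothetical predictors: online ad spend is roughly half of total ad spend
ad_spend = [10, 12, 14, 16, 18, 20, 22, 24]
online_ad_spend = [5.1, 5.9, 7.2, 7.9, 9.1, 9.8, 11.2, 11.9]
price = [9, 11, 10, 12, 9, 11, 10, 12]

vifs = vif(np.column_stack([ad_spend, online_ad_spend, price]))
print(dict(zip(["ad_spend", "online_ad_spend", "price"], vifs)))
```

In this made-up data the two advertising columns should show far larger VIFs than price, which suggests keeping only one of them.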
While plotting
• Independent variable: x-axis
• Dependent variable: y-axis
• Use a scatter chart in MS Excel
Introduction to Trend and Seasonality
• Trend:
Trend refers to the long-term direction in data, either upward or downward, that
occurs over an extended period.
Example: A company’s sales steadily growing year-over-year due to market
expansion.
• Seasonality:
Seasonality refers to predictable and recurring patterns or fluctuations in data that
happen at regular intervals, often tied to specific periods or events.
Example: Retail sales peaking during the holiday season or ice cream sales increasing
in summer.
Importance of Modeling Trend and Seasonality:
• Examples:
• A steady increase in e-commerce sales over the past decade due to the growth of digital adoption.
• Declining newspaper subscriptions as online news consumption rises.
• Methods to Identify Trends:
• Line Plots: Plotting raw data over time to observe long-term patterns.
• Moving Averages: Smoothing fluctuations in data to reveal the underlying trend.
Example: A 12-month moving average applied to monthly sales data, which averages out the within-year seasonal swings.
• Linear Regression: Fitting a line to time-series data to quantify the direction and rate of change in
trends.
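Two of the methods above, moving averages and linear regression on time, can be sketched as follows. The helper names and the 36 months of sales are hypothetical; the series is built as a steady rise of 2 units per month plus a seasonal pattern that sums to zero over each year.

```python
import numpy as np

def moving_average(values, window):
    """Simple moving average over each full window of consecutive points.

    Returns len(values) - window + 1 averages.
    """
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

def linear_trend(values):
    """Fit values = intercept + slope * t for t = 0, 1, 2, ...

    Returns (slope, intercept).
    """
    values = np.asarray(values, dtype=float)
    t = np.arange(len(values))
    slope, intercept = np.polyfit(t, values, 1)
    return slope, intercept

# Hypothetical monthly sales: growth of 2 units per month plus seasonality
months = np.arange(36)
seasonal = np.tile([-5, -3, 0, 2, 4, 6, 5, 3, 0, -2, -4, -6], 3)
sales = 100 + 2 * months + seasonal

smoothed = moving_average(sales, 12)     # 12-month window removes the seasonal swings
slope, intercept = linear_trend(sales)   # slope estimates growth per month

print("first smoothed values:", smoothed[:3])
print("estimated growth per month:", slope)
```

Because the window covers exactly one seasonal cycle, the smoothed series rises by the underlying 2 units per month with no seasonal wobble. The fitted slope is close to 2 but not exact, since the seasonal pattern slightly tilts the line.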
Seasonality
• Examples:
• Increased travel bookings during summer vacations and public holidays.
• Peaks in retail sales during Black Friday, Cyber Monday, or Christmas seasons.
• Visualizing Seasonality:
• Time-Series Plots: Overlaying monthly or weekly sales to identify repeated patterns.
Example: Plotting monthly ice cream sales to observe summer spikes.
• Decomposition Methods: Splitting time-series data into trend, seasonal, and residual
components using techniques like additive or multiplicative decomposition.
Example: Analyzing sales data to separate seasonal holiday spikes from overall growth.
Linear Trend
Nonlinear Trends
• Concept:
When the rate of change is not constant (e.g., exponential growth or decay),
a straight line fits poorly and more flexible models are needed.
• Polynomial Models: Fit curved trends to capture increasing or decreasing rates of change.
• Exponential Models: Capture rapid growth or decline, often seen in tech adoption or viral
trends.
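Both model types above can be fitted with the same least-squares tool. The sketch below assumes NumPy and uses two made-up, noise-free series; the exponential model is written as y = a·exp(b·t) and fitted by regressing log(y) on t, which is one common approach rather than the only one.

```python
import numpy as np

t = np.arange(10, dtype=float)

# Polynomial model: hypothetical series with a quadratic trend
quad_sales = 50 + 3 * t + 0.5 * t ** 2
quad_coefs = np.polyfit(t, quad_sales, 2)     # coefficients, highest power first

# Exponential model: hypothetical series growing 20% per period
exp_sales = 100 * 1.2 ** t
# Taking logs turns y = a * exp(b * t) into log(y) = log(a) + b * t
b, log_a = np.polyfit(t, np.log(exp_sales), 1)
a = np.exp(log_a)
growth_rate = np.exp(b) - 1                   # growth per period as a fraction

print("quadratic coefficients:", quad_coefs)
print("starting level:", a, "growth per period:", growth_rate)
```

The log-linear fit only works when every observation is positive. With noisy real data, it minimises errors in log(y), not in y itself.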
Techniques for Modeling Seasonality
• Seasonal Decomposition
• Additive Model:
• Equation: y=T+S+R
• Components:
• T: Trend (long-term pattern).
• S: Seasonality (recurring variations with a fixed period).
• R: Residual (random noise).
• Use: When the size of the seasonal swings stays roughly constant over time, regardless of the level of the series.
• Multiplicative Model:
• Equation: y=T×S×R
• Components: Same as the additive model, but seasonal effects are assumed to scale
proportionally with the trend.
• Use: When seasonal fluctuations grow or shrink in proportion to the level of the series.
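The two models are closely related. Assuming every component is positive, taking logarithms turns the multiplicative model into an additive one, so additive tools can be applied to the logged series:

```latex
y = T \times S \times R
\quad\Longrightarrow\quad
\log y = \log T + \log S + \log R
```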
Fourier Transform or Dummy Variables
• Fourier Analysis:
• Identifies periodic signals in data and captures complex seasonal patterns.
• Suitable for advanced modeling in large datasets, like temperature variations affecting
sales.
• Dummy Variables:
• Introduce binary indicators (0 or 1) for different seasons or months in a regression
model.
• Helps quantify the seasonal effect of specific periods (e.g., holidays vs. non-holidays).
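The dummy-variable approach can be sketched as a regression of sales on a time trend plus quarter indicators. The helper name and the quarterly data below are hypothetical. One quarter (Q1 here) is left out as the baseline, because including all four dummies alongside an intercept would make the columns perfectly collinear.

```python
import numpy as np

def quarter_dummies(quarters):
    """Build 0/1 indicator columns for Q2, Q3 and Q4 (Q1 is the baseline)."""
    quarters = np.asarray(quarters)
    return np.column_stack([(quarters == q).astype(float) for q in (2, 3, 4)])

# Hypothetical data: three years of quarterly sales with a built-in
# trend of 1.5 per quarter and fixed seasonal effects relative to Q1
quarters = np.tile([1, 2, 3, 4], 3)
t = np.arange(12, dtype=float)
effect = {1: 0.0, 2: 5.0, 3: -3.0, 4: 12.0}
sales = 100 + 1.5 * t + np.array([effect[q] for q in quarters])

# Regression: sales ~ intercept + trend + Q2 + Q3 + Q4
A = np.column_stack([np.ones(12), t, quarter_dummies(quarters)])
coef, *_ = np.linalg.lstsq(A, sales, rcond=None)
intercept, trend, q2, q3, q4 = coef

print("trend per quarter:", trend)
print("seasonal effects vs Q1:", q2, q3, q4)
```

Each dummy coefficient reads as the average difference from the baseline quarter after accounting for the trend. Since the example data contain no noise, the regression recovers the built-in effects exactly.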
Combining Trend and Seasonality for Forecasting
• Step 1: Divide the actual sales data by the corresponding moving average to
determine the seasonal ratio for each period.
Example: If actual sales for Quarter 1 are 120 and the moving average is 100,
the seasonal ratio is 120 ÷ 100 = 1.2.
• Step 2: Average seasonal ratios across the same periods (e.g., all Q1s, Q2s,
etc.) to calculate consistent seasonal factors.
Example: If the seasonal ratios for Q1 across three years are 1.2, 1.1, and
1.15, the seasonal factor for Q1 is (1.2 + 1.1 + 1.15) ÷ 3 = 1.15.
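Steps 1 and 2 can be sketched as below. The slides do not say which moving average to use, so this example assumes a centered moving average over one full seasonal cycle; the function names and the quarterly data are hypothetical.

```python
import numpy as np

def centered_ma(values, period):
    """Centered moving average for an even period (e.g. 4 quarters).

    Averages two adjacent period-length windows so each result lines up
    with an actual observation. Points near the ends, where no full
    window exists, are left as NaN.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    half = period // 2
    weights = np.ones(period + 1)
    weights[0] = weights[-1] = 0.5      # half weight on the two end points
    weights /= period
    out = np.full(n, np.nan)
    out[half:n - half] = np.convolve(values, weights, mode="valid")
    return out

def seasonal_factors(values, period):
    """Step 1: ratio of actual to moving average.
    Step 2: average the ratios for each position in the cycle."""
    values = np.asarray(values, dtype=float)
    ratios = values / centered_ma(values, period)
    positions = np.arange(len(values)) % period
    return np.array([np.nanmean(ratios[positions == k]) for k in range(period)])

# Hypothetical quarterly sales: level of 100 with a fixed seasonal pattern
true_factors = [1.2, 0.9, 0.8, 1.1]
sales = 100 * np.tile(true_factors, 3)

factors = seasonal_factors(sales, 4)
deseasonalized = sales / np.tile(factors, 3)    # seasonally adjusted sales

print("seasonal factors (Q1..Q4):", factors)
```

Dividing actual sales by the seasonal factor for the period gives seasonally adjusted figures. In this noise-free example the estimated factors match the built-in pattern, and the adjusted series is flat at 100.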
Adjusting Data for Seasonality:
1. Historical Data:
1. Gather early sales data and customer adoption rates for the product.
2. Use similar product launches or industry benchmarks if data is limited.
2. Market Size Estimates:
1. Define the potential maximum market size (𝑲), which represents the saturation point.
2. Consider market research reports, surveys, or competitor performance.
3. Adoption Rates:
1. Identify factors like promotional campaigns, consumer demographics, and competitor responses that
influence the growth rate (𝒃).
Mathematical Representation