
BA4206 Business Analytics Batch 17 Study Material

The document provides a comprehensive overview of Business Analytics (BA), including its definition, methods, evolution, scope, and importance in decision-making. It details various analytics types such as descriptive, predictive, and prescriptive, along with the tools and challenges associated with BA. Additionally, it outlines the roles and skills required for business analysts and the significance of data visualization in interpreting data effectively.

Uploaded by

jeromerho2226
Copyright
© © All Rights Reserved

BA4206 Business Analytics

Study Material
Prepared by Prof. S Balaji

Unit – 1

Introduction to Business Analytics

What is BA?

According to Harvard Business School, business analytics is the process of using quantitative methods to derive meaning from data to make informed business decisions.

According to IBM, business analytics refers to the statistical methods and computing technologies for processing, mining and visualizing data to uncover patterns, relationships and insights that enable better business decision-making.

Methods of Business Analysis:

There are 4 primary methods of business analysis:

1. Descriptive: The interpretation of historical data to identify trends and patterns

2. Diagnostic: The interpretation of historical data to determine why something has happened

3. Predictive: The use of statistics to forecast future outcomes

4. Prescriptive: The application of testing and other techniques to determine which outcome
will yield the best result in a given scenario

Evolution of BA:

• Pre-Computer Era (Before 1950s) – Traditional Decision-Making
Decisions relied on intuition, experience, and manual calculations with limited data processing. Basic statistical methods were used for forecasting.

• The Rise of Computers (1950s-1970s) – Early Data Processing
Mainframes enabled large-scale data processing. Decision Support Systems (DSS) emerged, and relational databases improved data storage and retrieval.

• Business Intelligence (BI) Era (1980s-1990s) – Descriptive Analytics
BI tools like OLAP and data warehouses enabled historical data analysis. Dashboards, Excel, and SQL became essential for business insights.

• The Internet Boom (2000s) – Advanced Analytics & Big Data
Big Data grew with the internet and social media. Predictive analytics gained traction, supported by cloud computing and data mining.

• AI & Machine Learning Era (2010s-Present) – Predictive & Prescriptive Analytics
AI, ML, and real-time analytics revolutionized decision-making. Prescriptive analytics and self-service tools empowered businesses with AI-driven insights.

Scope of BA:

1. Helping organisations know their customers

2. Reputation management

3. Improve Operational efficiency

4. Financial management

5. Manage waste, fraud & abuse

6. Proactive Risk Management

Need for BA:

• Data-Driven Decision Making

• Improved Operational Efficiency

• Enhanced Customer Insights

• Competitive Advantage

• Predictive Modeling and Forecasting

• Risk Mitigation

• Performance Measurement

Components of BA:

Data Collection and Aggregation

Data Mining

Association

Text Mining

Forecasting

Optimization

Data Visualization

Types and Techniques of Business Analytics:

• Descriptive analytics: The techniques used are,

1. Data Query

2. Data dashboard

• Predictive analytics: The techniques used are,

1. Data mining

2. Simulation

• Prescriptive analytics: The techniques used are,

1. Simulation Optimisation

2. Decision analysis

Title Descriptive Predictive Prescriptive

Descriptive Statistics
Data Visualization
Linear Regression
Time Series Analysis and Forecasting
Data Mining
Spreadsheet Models
Linear Optimization Models
Integer Linear Optimization Models
Non-linear Optimization Models
Monte Carlo Simulation
Decision Analysis

Business Analytics Process:

Step 1: Define the Business Need

Step 2: Explore the Data

Step 3: Analyze the Data

Step 4: Predict What’s Likely to Happen

Step 5: Optimize – Find the Best Solution

Step 6: Make a Decision & Measure the Outcome

Step 7: Update the System with the Results of the Decision

*Note: It is an iterative process.

Importance of BA:
• Data-Driven Decision Making
• Improved Operational Efficiency
• Competitive Advantage
• Enhanced Customer Experience
• Cost Reduction & Revenue Growth
• Risk Management & Fraud Detection
• Supports AI & Automation

Tools of BA:
Data Visualization Tools
• Tableau
• Power BI
• Google Data Studio
Statistical & Data Analysis Tools
• R
• Python (Pandas, NumPy, SciPy)
• SAS
Business Intelligence (BI) Tools
• SAP BusinessObjects
• IBM Cognos Analytics
• Oracle BI
Big Data & Database Management Tools
• SQL
• Apache Hadoop
• Snowflake
Predictive Analytics & Machine Learning Tools
• IBM SPSS
• KNIME
ETL (Extract, Transform, Load) Tools
• Talend
• Apache Nifi
Customer Analytics & CRM Tools
• Google Analytics
• Salesforce Analytics
• HubSpot
Spreadsheet & Reporting Tools
• Microsoft Excel
• Google Sheets

Challenges of BA:

1) Lack of technical skills in employees

2) Resistance among staff to accepting BA

3) Data Security and Maintenance

4) Integrity of Data

5) Delivering relevant information in the given time

6) Inability to address complex issues

7) Costs involved in implementing BA

8) Investment of staff time in implementation of BA

9) Lack of a proper strategy to implement BA

Application Areas of BA:

Marketing & Customer Analytics

• Customer segmentation

• Market trend analysis

• Personalization & recommendation systems

Finance & Banking

• Fraud detection

• Risk assessment & credit scoring

• Investment portfolio optimization

Retail & E-commerce

• Demand forecasting

• Inventory optimization

• Customer sentiment analysis

Supply Chain & Logistics

• Route optimization

• Warehouse management

• Demand-supply balancing

Human Resources & Workforce Analytics

• Employee performance analysis

• Attrition prediction

• Talent acquisition & workforce planning

Manufacturing & Operations

• Predictive maintenance

• Quality control & defect detection

• Process optimization

Comparison of Business Analytics and Organisation Decision making processes:

Ways Business Analytics can help achieve a Competitive Advantage:

Price Leadership – Business analytics helps optimize pricing strategies by analyzing market
trends, competitor pricing, and consumer behavior to offer competitive yet profitable prices.

Operational Efficiency – It streamlines processes, reduces waste, and enhances productivity by leveraging data-driven insights for better resource allocation and decision-making.

Service Effectiveness – Analytics enables personalized customer experiences and faster issue resolution by predicting customer needs and improving service delivery.

Innovation – By identifying emerging trends and customer preferences, analytics fosters data-driven innovation in products, services, and business models.

Product Differentiation – It helps businesses tailor products to specific customer segments through deep insights into consumer preferences and market gaps.

Unit – 2

Managing Resources for BA

Business analyst:

• A business analyst is a professional who analyses a business or organization to identify vulnerabilities, assess its business model, and devise solutions.
• Business analytics personnel use data to identify patterns, create models, and improve business processes. They are involved in many aspects of a business, including strategy, architecture, and systems. Business analysts analyse data to identify trends and problems, then collaborate with stakeholders to develop solutions. These solutions may include improving processes, changing policies, or introducing new technology.

Skills required for a Business Analyst:

• Verbal and Written communication

• Analytical and Systems thinking

• Technology and Business Knowledge

• Relationship Management & Negotiation

• Evaluation and Decision Analysis

• Planning and Management

• Elicitation and Facilitation

• Modelling (Process, Data, System)

Roles of a Business Analyst:

• As a Contributor: To contribute to the development of Business

• As a Facilitator: To facilitate or to make complex things simpler

• As an Analyst: To identify problems, forecast situations and devise solutions accordingly.

Types of Business Analysts:

• Pure Business Analysts

• IT Business Analysts

• Data Business Analysts

• Functional Business Analysts

• Business System Analysts

• Business Requirements Analysts

• Reporting Business Analysts

• Business Intelligence Analysts

Managing Business Analytics Personnel:

Following are the ways of managing business analytics personnel:

1) Understand What a Business Analyst Does and Their Value

2) Give them tools

3) Make sure they have time to think

4) Keep Business Analysts from over-analysing

5) Spend time with Business Analysts

6) Get comfortable with Business Analysis Methodologies

7) Brush up Stakeholder management skills

8) Go deep into the tech side of the job

Organizational structures that align well with the Business Analyst (BA) role are:

1. Functional Organizational Structure (Department-Based BA Role)

o In this structure, Business Analysts are embedded within specific functional departments (e.g., Finance, HR, Marketing, IT).

o BAs report directly to a functional manager and focus on department-specific projects and improvements.

o Ensures deep domain expertise, as BAs work closely with department stakeholders to optimize processes and implement solutions.

• Example:

o A BA in the Finance department analyses financial data, improves reporting tools, and helps implement an ERP system to enhance financial decision-making.

2. Centralized Business Analysis Department (CoE-Based BA Role)

o A Centralized Business Analysis Department or Center of Excellence (CoE) manages all BAs within an organization.

o BAs work on projects across multiple departments, ensuring standardized BA methodologies and best practices.

o Enables efficient resource allocation, knowledge sharing, and consistency in business analysis practices across teams.

• Example:

o A BA from a CoE team is assigned to lead the business analysis phase of a company-wide CRM implementation, working with multiple departments like Sales, Customer Support, and IT.

Primary data:

Primary data refers to information collected firsthand by researchers or individuals directly from the source. This data is original and has not been previously collected or analyzed.

It is gathered through sources such as,

• Surveys

• Interviews

• Questionnaires & Schedule

• Experiments

• Auditing

• Simulation

• Observation

Secondary data

Secondary data refers to information that has been collected by someone else or for another purpose but is utilized by researchers for their own investigations.

This data is not collected firsthand but rather obtained from sources such as,

Internal Sources:

Sales Analysis

Invoice Analysis

Financial Data

Transportation data

External sources:

Libraries

Literature

References & Bibliography

Government and Private Organization sources

Difference between Primary and Secondary data:

Aspect | Primary Data | Secondary Data
Definition | Data collected firsthand for a specific research purpose | Data that has already been collected by someone else for a different purpose
Source | Surveys, interviews, experiments, observations | Books, articles, reports, government records, online databases
Collection Method | Direct and customized | Indirect and pre-existing
Cost & Time | Expensive and time-consuming | Less expensive and quickly available
Accuracy | High (tailored to the research objective) | May vary depending on the source
Example | Conducting a survey to assess employee job satisfaction | Using a government labour report to analyse employment trends

Management Issues in implementing BA:

Managing Information Policy

Outsourcing

Data Quality

Measuring BA contribution

Managing change

Unit – 3

Descriptive Analytics

What is Descriptive Analytics?

Descriptive analytics is a statistical interpretation used to analyze historical data to identify patterns and relationships. Descriptive analytics seeks to describe an event, phenomenon, or outcome.

Functions of Descriptive Analytics:

Company’s Current Performance

Business’s Historical Trends

Company’s Strong and Weak Points

Types of Descriptive Analytics:

Measures of Frequency – Describe how often a particular value appears in the dataset.

• Example: The number of employees in each department of a company.

• Common Measures: Frequency distribution tables, histograms, bar charts.

Measures of Central Tendency – Identify the central or most representative value in a dataset.

• Example: The average (mean) salary of employees in an organization.

• Common Measures: Mean, Median, Mode.

Measures of Dispersion (Variability) – Show how much the data varies from the central
value.

• Example: The variation in sales performance across different regions.

• Common Measures: Range, Variance, Standard Deviation, Interquartile Range (IQR).

Measures of Position – Indicate where a particular value stands in relation to others in the
dataset.

• Example: The percentile rank of a student in an entrance exam.

• Common Measures: Percentiles, Quartiles, Z-scores.
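
The four families of measures above can be computed with Python's standard library. The following is a minimal sketch; the salary figures are made up for illustration and the variable names are assumptions, not part of the study material.

```python
import statistics
from collections import Counter

# Hypothetical monthly salaries (in thousands) for a small team.
salaries = [30, 35, 35, 40, 45, 50, 55, 60, 100]

# Measures of frequency: how often each value appears.
frequency = Counter(salaries)

# Measures of central tendency.
mean = statistics.mean(salaries)
median = statistics.median(salaries)
mode = statistics.mode(salaries)

# Measures of dispersion (population formulas are used here).
value_range = max(salaries) - min(salaries)
variance = statistics.pvariance(salaries)
std_dev = statistics.pstdev(salaries)
q1, q2, q3 = statistics.quantiles(salaries, n=4)  # quartiles
iqr = q3 - q1

# Measures of position: z-score of the highest salary.
z_score_max = (max(salaries) - mean) / std_dev

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, std_dev={std_dev:.2f}, IQR={iqr}")
print(f"z-score of top salary={z_score_max:.2f}")
```

Note how the single high salary pulls the mean above the median, which is why both are usually reported together.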

Steps involved in Descriptive Analysis:

1. State the Business Metrics
2. Identify the Data Required
3. Extract and Prepare the Data
4. Analyse the Data
5. Present the Data

Data Visualization:

Data Visualization is the graphical representation of data and information using visual elements
like charts, graphs, maps, and dashboards. It helps to identify trends, patterns, and insights in
data, making complex information easier to understand and interpret.

Common types of data visualization include:

• Bar Charts – Compare categories

• Line Graphs – Show trends over time

• Pie Charts – Display proportions

• Heatmaps – Represent data density

• Scatter Plots – Show relationships between variables

Functions of Data Visualization

1. Simplifies Complex Data – Converts raw data into visual formats, making it easier to
interpret and analyze.

2. Identifies Trends & Patterns – Helps in recognizing trends, correlations, and outliers
in large datasets.

3. Enhances Decision-Making – Provides actionable insights, aiding businesses and organizations in making informed decisions.

4. Improves Data Storytelling – Presents data in an engaging and compelling way to communicate findings effectively.

5. Facilitates Quick Analysis – Enables users to analyze and understand large amounts
of data at a glance.

Charts & Graphs in Data Visualization:

Charts and graphs are essential tools in data visualization that help represent numerical data in
a visual format. They make it easier to identify trends, patterns, and relationships between
variables. Choosing the right type of chart depends on the nature of the data and the insights
you want to extract.

Steps to Create a Chart

1. Select Data Set

o Highlight the data you want to visualize, including headers.

2. Go to Insert Tab

o Click on the Insert tab in the Excel ribbon.

3. Choose a Chart Option

o Select a chart type (e.g., Bar, Line, Pie) from the Charts group.

4. Choose a Chart Style

o Click on the Chart Styles option in the Chart Design tab.

5. Select Chart Style & Click OK

o Choose a preferred style and format.

o Click OK or press Enter to apply.

Types of Charts:

Bar chart

In a bar chart, values are indicated by the length of bars, each of which corresponds with a
measured group. Bar charts can be oriented vertically or horizontally; vertical bar charts are
sometimes called column charts. Horizontal bar charts are a good option when you have a lot
of bars to plot, or the labels on them require additional space to be legible.

Line chart

Line charts show changes in value across continuous measurements, such as those made over
time. Movement of the line up or down helps bring out positive and negative changes,
respectively. It can also expose overall trends, to help the reader make predictions or projections
for future outcomes. Multiple line charts can also give rise to other related charts like the
sparkline or ridgeline plot.

Scatter plot

A scatter plot displays values on two numeric variables using points positioned on two axes:
one for each variable. Scatter plots are a versatile demonstration of the relationship between
the plotted variables—whether that correlation is strong or weak, positive or negative, linear
or non-linear.

Heatmap

The heatmap presents a grid of values based on two variables of interest. The axis variables
can be numeric or categorical; the grid is created by dividing each variable into ranges or levels
like a histogram or bar chart. Grid cells are colored based on value, often with darker colors
corresponding with higher values. A heatmap can be an interesting alternative to a scatter plot
when there are a lot of data points to plot, but the point density makes it difficult to see the true
relationship between variables.

Pie chart

A pie chart, sometimes called a circle chart, is a way of summarizing a set of nominal data or
displaying the different values of a given variable (e.g. percentage distribution). This type of
chart is a circle divided into a series of segments. Each segment represents a particular category.

Importance of Data Visualization

1. Enhances Data Interpretation

2. Aids in Better Decision-Making

3. Identifies Trends & Patterns

4. Communicates Complex Data Clearly

5. Increases Efficiency & Productivity

6. Helps in Error Detection & Data Validation

Probability Distribution

A probability distribution is a mathematical function that describes the likelihood of different possible outcomes in an experiment. It provides insights into how probabilities are distributed over values.

Importance of Probability Distribution in Descriptive Analytics

1. Helps in Data Summarization

2. Identifies Patterns & Trends

3. Supports Decision-Making

4. Basis for Statistical Analysis & Machine Learning

5. Assists in Risk Assessment & Uncertainty Measurement

Sampling:

Sampling is the process of selecting a subset of individuals from a larger population to represent
the whole group.

It is used in research to draw conclusions without surveying the entire population.

Types of Sampling:

Probability Sampling:

• Every member of the population has a known, non-zero chance of being selected.
• It ensures objectivity and is suitable for generalizing results to the population.
• Examples: Simple Random, Systematic, Cluster, and Stratified Sampling

Non-Probability Sampling:

• Not all members have a known or equal chance of selection.
• Used when random sampling isn't feasible or necessary.
• Examples: Convenience, Purposive, Panel, and Snowball Sampling.

Probability Sampling Methods

Simple Random Sampling
Every individual has an equal chance of being selected.
Selection is completely random, often using a random number generator.
Ensures unbiased representation of the population.

Systematic Sampling
Selects every kth individual from a list after a random start.
Useful when the population is orderly and large.
Easy to implement but may introduce bias if patterns exist.

Cluster Sampling
Divides population into clusters, randomly selects entire clusters.
Used when population is large and spread out.
Cost-effective but may reduce diversity within samples.

Stratified Sampling
Divides population into subgroups (strata) based on a characteristic.
Samples are taken from each stratum proportionally or equally.
Ensures representation across key subgroups.
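
The four probability sampling methods above can be sketched in a few lines of Python. This is an illustrative sketch only: the employee list, the department split, and all function names are assumptions made for the example.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Select every `step`-th member after a random start."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(population, stratum_of, fraction, rng):
    """Sample the same fraction from every stratum (proportional allocation)."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every member of them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 employees, 25 in Finance and 75 in Sales.
employees = [(i, "Finance" if i % 4 == 0 else "Sales") for i in range(1, 101)]

srs = simple_random_sample(employees, 10, rng)
systematic = systematic_sample(employees, 10, rng)
stratified = stratified_sample(employees, lambda e: e[1], 0.2, rng)

departments = {
    "Finance": [e for e in employees if e[1] == "Finance"],
    "Sales": [e for e in employees if e[1] == "Sales"],
}
cluster = cluster_sample(departments, 1, rng)
```

With a 20% fraction, the stratified sample contains 5 Finance and 15 Sales employees, mirroring the population's 25/75 split.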

Non - Probability Sampling Methods

Convenience Sampling
Selects samples that are easiest to access.
Quick and inexpensive but prone to bias.
Often used in exploratory research or pilot studies.

Purposive Sampling
Samples chosen based on researcher’s judgment and purpose.
Targets specific characteristics relevant to the study.
Useful in qualitative research with defined criteria.

Panel Sampling
Involves studying the same group (panel) over time.
Allows for longitudinal analysis and tracking changes.
Panel members are selected non-randomly and retained.

Snowball Sampling
Existing subjects recruit future subjects from their networks.
Used for hard-to-reach or hidden populations.
Effective but risks selection bias due to homogenous networks.

Sampling error

• Sampling error is the difference between the results obtained from a sample and the
actual values of the population.
• It occurs because only a subset of the population is studied, not the entire group.
• This error is natural and expected, but it can be minimized through proper sampling
techniques.

Types of Sampling error

Random Sampling Error:

Occurs due to chance variations when a random sample does not perfectly represent the
population.

It can be reduced by increasing the sample size or using more precise sampling methods.

Systematic Sampling Error (or Bias):

Happens when the sampling method consistently overrepresents or underrepresents certain groups.

It is usually caused by flaws in the sampling process or design, not by chance.
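
The effect of sample size on random sampling error can be seen in a small simulation. The sketch below assumes a synthetic, normally distributed population; the population parameters, sample sizes, and function name are illustrative choices, not values from the study material.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic population of 10,000 values (assumed mean 50, std. dev. 10).
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

def mean_abs_sampling_error(sample_size, repeats=200):
    """Average |sample mean - population mean| over many random samples."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)
        errors.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(errors)

small_n_error = mean_abs_sampling_error(10)
large_n_error = mean_abs_sampling_error(1000)

print(f"average error with n=10:   {small_n_error:.3f}")
print(f"average error with n=1000: {large_n_error:.3f}")
```

The larger samples land much closer to the population mean on average. A biased sampling method, by contrast, would not be corrected by a larger sample.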

Estimation

Estimation is the process of inferring or approximating a population parameter based on sample data.

It helps researchers make predictions or decisions without studying the entire population.

Estimation is central to statistical analysis and decision-making.

Types of Estimation

Point Estimation:
Gives a single value (point) as an estimate of a population parameter (e.g., mean, proportion).
Example: Using the sample mean to estimate the population mean.
It's simple but doesn't show how accurate the estimate is.

Interval Estimation:
Provides a range (interval) of values within which the population parameter is likely to lie.
Usually includes a confidence level (like 95%) to indicate reliability.
Example: The population mean is estimated to be between 45 and 55 with 95% confidence.
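
Both kinds of estimate can be computed from one sample. The sketch below uses made-up sample values and, as a simplifying assumption, a normal (z) approximation for the 95% interval; for a sample this small, a t-distribution would normally give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of 10 observations.
sample = [48, 52, 50, 47, 53, 49, 51, 50, 46, 54]

n = len(sample)

# Point estimate: the sample mean estimates the population mean.
point_estimate = statistics.mean(sample)

# Interval estimate: mean +/- z * standard error.
std_error = statistics.stdev(sample) / math.sqrt(n)
z = statistics.NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
lower = point_estimate - z * std_error
upper = point_estimate + z * std_error

print(f"point estimate: {point_estimate}")
print(f"95% interval:   ({lower:.2f}, {upper:.2f})")
```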

Probability

Probability is the measure of how likely an event is to occur.

It ranges from 0 (impossible event) to 1 (certain event).

Example: The probability of flipping a coin and getting heads is 0.5.

Probability Distribution

• A probability distribution shows how probabilities are distributed over the values of a
random variable.
• It describes all possible outcomes and their associated probabilities.
• Two main types are Discrete (e.g., Binomial, Poisson) and Continuous (e.g., Normal
distribution).

Binomial Distribution

It models the number of successes in a fixed number of independent yes/no trials.

Each trial has two outcomes (success or failure) and a constant probability of success.

Example: Tossing a coin 10 times and counting the number of heads.

Key Conditions:

Fixed number of trials (n)

Only two outcomes (success/failure)

Constant probability (p)

Independent trials

$P(x) = {}^{n}C_{x} \cdot p^{x} \cdot q^{n-x}$

Example Question: A coin is tossed 5 times. What is the probability of getting exactly 3 heads? (Here, getting a head is considered a success.)

Given:

• n=5

• x=3

• p=0.5 (since probability of heads = 0.5)

• q=1−p=0.5

$P(3) = {}^{5}C_{3} \cdot (0.5)^{3} \cdot (0.5)^{5-3} = 0.3125$

So, there is a 31.25% chance of getting exactly 3 heads in 5 tosses.
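
The coin-toss calculation can be checked with a short Python function. The function name `binomial_pmf` is an illustrative choice; `math.comb` supplies the nCx term.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p
    return math.comb(n, x) * p**x * q**(n - x)

# Exactly 3 heads in 5 tosses of a fair coin.
prob_three_heads = binomial_pmf(3, 5, 0.5)
print(prob_three_heads)  # 0.3125

# Sanity check: the probabilities of 0..5 heads add up to 1.
total = sum(binomial_pmf(x, 5, 0.5) for x in range(6))
```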

Poisson Distribution

It models the number of times an event occurs in a fixed interval of time or space.

It’s used when events happen independently and rarely over a large number of opportunities.

Example: Number of customer calls received at a call center per hour.

Key Conditions:

Events occur independently

Average rate (λ) is constant

Two events can't occur at the exact same moment

$P(x) = \dfrac{e^{-\lambda} \cdot \lambda^{x}}{x!}$

Example Question: A call center receives 4 calls per hour on average. What is the
probability that it will receive exactly 2 calls in an hour?

Given:

λ = 4

x = 2

$P(2) = \dfrac{e^{-4} \cdot 4^{2}}{2!} \approx 0.1465$

So, there is approximately a 14.65% chance that the call center will receive exactly 2 calls in one hour.
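
The call-center example can be evaluated directly from the Poisson formula. This is a minimal sketch; the function name `poisson_pmf` is an illustrative choice.

```python
import math

def poisson_pmf(x, lam):
    """Probability of exactly x events when the average rate is lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

# Average of 4 calls per hour; probability of exactly 2 calls in an hour.
prob_two_calls = poisson_pmf(2, 4)
print(f"{prob_two_calls:.4f}")
```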

Unit – 4

Predictive Analytics

Predictive Analytics

Predictive analytics is the use of statistics and modeling techniques to forecast future outcomes.

Current and historical data patterns are examined and plotted to determine the likelihood that
those patterns will repeat.

Types of Predictive Analytics

• Regression Models: These models estimate the strength of relationships between variables, allowing for predictions based on input data. Examples include linear regression.
• Time Series Models: These models analyze data points over time to identify patterns
and trends, enabling predictions about future values. Examples include ARIMA models
and other time-series forecasting techniques.
• Decision Trees: These models use a tree-like diagram to represent decision-making
processes, allowing for the prediction of outcomes based on different inputs.

Steps involved in Predictive Analysis

1. Define your project’s objectives. What is the desired outcome? What problem are you trying
to solve? The first step is to define your project’s objectives, deliverables, scope, and data
required.

2. Collect your data. Gather all the data you need in one place. Include different types of current
and historical data from a variety of sources – from transactional systems and sensors to call
center logs – for more in-depth results.

3. Clean and prepare your data. Clean, prepare, and integrate your data to get it ready for
analysis. Remove outliers and identify missing information to improve the quality of your
predictive data set.

4. Build and test your model. Build your predictive model, train it on your data set, and test it
to ensure its accuracy. It may take multiple iterations to reach a model that is accurate enough.

5. Deploy your model. Deploy your predictive model and put it to work on new data. Get results
and reports – and automate decision-making based on the output.

6. Monitor and refine your model. Regularly monitor your model to review its performance
and ensure it’s providing the expected results. Refine and optimize your model as needed.

Logic-Driven Predictive Models

Definition:

These models use expert knowledge and predefined rules to make predictions. The logic is
manually encoded, often using "if-then" rules or decision trees.

How it works:

Based on domain expertise or business logic. Doesn’t rely heavily on historical data. Uses
deterministic rules.

Examples:

"If a customer hasn’t logged in for 30 days, flag them as 'at-risk'."

Pros:

Transparent and explainable.

No need for large datasets.

Quick to implement for well-understood domains.

Cons:

Doesn’t adapt to new data or patterns.

Limited scalability and flexibility.

Rule conflicts can arise in complex systems.
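
A logic-driven model of this kind is just a hand-written rule. The sketch below encodes the 30-day inactivity example; the function name, the "active" label, and the choice to treat exactly 30 days as at-risk are assumptions made for illustration.

```python
from datetime import date

INACTIVITY_THRESHOLD_DAYS = 30  # fixed by a domain expert, not learned from data

def flag_customer(last_login, today):
    """Deterministic if-then rule: no training data is involved."""
    days_inactive = (today - last_login).days
    if days_inactive >= INACTIVITY_THRESHOLD_DAYS:
        return "at-risk"
    return "active"

print(flag_customer(date(2024, 1, 1), date(2024, 2, 15)))   # at-risk
print(flag_customer(date(2024, 2, 10), date(2024, 2, 15)))  # active
```

The rule is easy to explain, but the threshold stays at 30 days until someone edits it, which illustrates the "doesn't adapt" drawback listed above.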

Data-Driven Predictive Models

Definition:

These models rely on historical data and use statistical algorithms or machine learning to learn
patterns and make predictions.

How it works:

Trains on past data to find trends or relationships.

Examples include regression, decision trees, random forests, neural networks.

Examples:

Predicting customer churn based on purchase history and behavior.

Forecasting sales using time-series data.

Pros:

High accuracy.

Can uncover complex relationships.

Cons:

Requires large, clean datasets.

Can be a black box.

Difference:

Aspect | Logic-Driven | Data-Driven
Based on | Rules & domain expertise | Historical data
Flexibility | Low | High
Adaptability | Static | Learns and adapts
Data Requirements | Minimal | Extensive
Example Tools | Expert Systems, Rule Engines | Python ML libraries, R, AutoML platforms

Examples of the difference:

Task | Logic-Driven | Data-Driven
Predicting weather | "If it's cloudy and humid, it might rain." | Learns from years of weather data to predict rain accurately.
Detecting fraud | "If a card is used in two countries in 5 minutes, flag it." | Learns from millions of transactions to detect unusual patterns.

Data Mining for Predictive Analytics

Data Mining is the process of discovering patterns, trends, and knowledge from large sets of
data. When used for Predictive Analytics, it helps in forecasting future outcomes based on
historical data.

Data Mining Process

Problem Definition

Understand the business or research goal.

Define what needs to be predicted (e.g., customer churn, sales forecast).

Data Collection

Gather relevant data from different sources like databases, cloud storage, etc.

Data Preprocessing

Clean the data (remove noise, handle missing values).

Transform data (normalization, encoding, etc.) and select relevant features (feature selection
or extraction).

Data Mining / Model Building

Apply appropriate algorithms (e.g., regression, classification, clustering).

Train the model using historical data.

Evaluation

Test the model on new or unseen data.

Use metrics like accuracy, precision, recall, etc.

Deployment

Integrate the model into the business process.

Monitor and update as needed.
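
The metrics named in the Evaluation step can be computed by comparing predictions with actual outcomes. The sketch below uses a made-up set of eight loan-default labels; the function name and data are illustrative assumptions.

```python
def classification_metrics(actual, predicted, positive="Yes"):
    """Return (accuracy, precision, recall) for a two-class problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical test set: did the applicant default?
actual    = ["Yes", "No", "Yes", "Yes", "No", "No",  "Yes", "No"]
predicted = ["Yes", "No", "No",  "Yes", "No", "Yes", "Yes", "No"]

accuracy, precision, recall = classification_metrics(actual, predicted)
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```

Precision asks how many predicted defaulters really defaulted, while recall asks how many real defaulters the model caught. The two coincide here only because the example has one false positive and one false negative.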

Data Mining Techniques for Predictive Analytics

1. Classification

Goal: Predict categorical outcomes.

Example:
A bank wants to predict whether a loan applicant will default or not (Yes/No).

Technique: Decision Tree

Input Data: Age, Income, Credit Score, Loan Amount

Output: "Yes" (will default) or "No" (will not default)

2. Regression

Goal: Predict numeric values.

Example:
A real estate company wants to predict the price of a house.

Technique: Linear Regression

Input Data: Number of bedrooms, Size (sq. ft.), Location, Age of house

Output: ₹45,00,000 (predicted price)
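
A one-variable version of the house-price example can be fitted with ordinary least squares in plain Python. This is a simplified sketch: it uses size as the only input, and the five data points are invented (and deliberately perfectly linear) so the result is easy to verify by hand.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: size in sq. ft. and price in lakh rupees.
sizes_sqft = [800, 1000, 1200, 1500, 1800]
prices_lakh = [32, 40, 48, 60, 72]

slope, intercept = fit_simple_linear_regression(sizes_sqft, prices_lakh)

# Predict the price of a 1,125 sq. ft. house.
predicted_price_lakh = intercept + slope * 1125
print(f"predicted price: {predicted_price_lakh:.1f} lakh")
```

With this toy data the prediction comes to 45 lakh. A real model would include the other inputs listed above (bedrooms, location, age) and would not fit the data exactly.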

3. Time Series Analysis

Goal: Predict values based on time-dependent data.

Example:
A retail store wants to forecast monthly sales for the next 6 months.

Technique: ARIMA or another forecasting method

Input Data: Monthly sales data for the past 3 years

Output: Predicted sales for coming months

4. Association Rule Learning

Goal: Find relationships or patterns among items.

Example:
A supermarket wants to find which items are frequently bought together.

Technique: Apriori Algorithm

Input Data: List of products bought in each transaction

Output Rule: "If a customer buys bread and butter, they are likely to buy jam."
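
Association rules are usually scored by support and confidence. The sketch below computes both for the bread-and-butter rule on five made-up transactions. It shows only the scoring step, not the full Apriori candidate-generation procedure, and the function names are illustrative.

```python
# Hypothetical shopping baskets.
transactions = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "butter", "jam", "milk"},
    {"milk", "eggs"},
    {"bread", "jam"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

rule_support = support({"bread", "butter", "jam"}, transactions)
rule_confidence = confidence({"bread", "butter"}, {"jam"}, transactions)

print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Here the rule appears in 2 of 5 baskets (support 0.40), and 2 of the 3 bread-and-butter baskets also contain jam (confidence about 0.67).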

5. Clustering (supporting technique for segmentation)

Goal: Group data into similar clusters

Example:
An e-commerce site wants to group customers into segments based on purchase behavior.

Technique: K-means Clustering

Input Data: Purchase frequency, Order value, Product categories

Output:

• Cluster 1: High spenders

• Cluster 2: Occasional buyers

• Cluster 3: Discount seekers
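
The K-means idea (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched in plain Python. The customer data, the starting centroids, and the function name are assumptions for illustration; real use would typically scale the features and rely on a library implementation.

```python
import math

def k_means(points, centroids, iterations=10):
    """Basic K-means: alternate assignment and centroid-update steps."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [math.dist(point, c) for c in centroids]
            clusters[distances.index(min(distances))].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (orders per month, average order value).
customers = [
    (1, 20), (2, 25), (1, 30),       # occasional, low spend
    (8, 200), (9, 220), (10, 210),   # frequent, high spend
    (5, 60), (6, 55), (5, 65),       # in between
]
initial_centroids = [(1, 20), (8, 200), (5, 60)]

centroids, clusters = k_means(customers, initial_centroids)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

The algorithm only groups the points. Labels such as "high spenders" are assigned afterwards by a person inspecting each cluster.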

6. Neural Networks (Deep Learning)

Goal: Model complex, non-linear relationships.

Example:
A hospital uses patient data to predict the risk of heart disease.

Technique: Neural Networks

Input Data: Age, Blood Pressure, Cholesterol, ECG, etc.

Output: 0.78 (Probability of having heart disease)

Applications of Data Mining in Predictive Analytics

Business & Marketing

Customer churn prediction

Customer segmentation

Sales forecasting

Healthcare

Disease prediction

Patient readmission prediction

Drug response prediction

Finance

Credit scoring

Fraud detection

Stock market prediction

Challenges in Data Mining for Predictive Analytics

• Data Quality

• Data Privacy and Security

• Dynamic Data

• Interpretability

• Scalability

Unit – 5

Prescriptive Analytics

Prescriptive Analytics

• Prescriptive analytics uses data to recommend actions that can lead to the best possible
outcomes.

• It not only predicts what might happen but also suggests what should be done next.

• By using techniques like optimization and machine learning, it helps businesses make
smarter decisions.

• Prescriptive analytics is often used in areas like marketing, healthcare, and supply
chain management.

Steps involved in Prescriptive Analysis

Define the Objective: Clearly understand and state what decision or goal you want to achieve.

Collect and Prepare Data: Gather relevant internal and external data and organize it for
analysis.

Analyze Data and Build Models: Use optimization models, machine learning, or simulation
techniques to study the data.

Generate Recommendations: Develop actionable suggestions based on the model's results.

Evaluate and Validate Results: Test the recommendations to check if they meet the objectives
and make adjustments if needed.

Implement and Monitor: Put the chosen actions into practice and track outcomes to improve
future decisions.

Types of Prescriptive Modeling

Linear Programming

Definition: Linear programming is used to find the best outcome in a mathematical model with
linear relationships. It is often used for optimization problems.

Example: A factory wants to maximize profits by deciding how many units of each product to
produce, given constraints like labor and materials. Linear programming helps find the optimal
production mix.
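
The factory example can be sketched with two products. For a bounded linear program the best solution lies at a corner of the feasible region, so this small solver simply checks every corner. The profit figures and resource limits are invented, and the approach is only practical for two decision variables.

```python
from itertools import combinations

def solve_lp_2d(profit, constraints):
    """Maximise profit[0]*x + profit[1]*y subject to a*x + b*y <= c for each (a, b, c)."""
    best_point, best_value = None, float("-inf")
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines never meet
        # corner point where the two boundary lines cross (Cramer's rule)
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + 1e-9 for a, b, c in constraints):
            value = profit[0] * x + profit[1] * y
            if value > best_value:
                best_point, best_value = (x, y), value
    return best_point, best_value

# Hypothetical data: x units of product A, y units of product B
constraints = [
    (2, 1, 100),  # labour hours: 2x + y <= 100
    (1, 1, 80),   # material:      x + y <= 80
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]
plan, max_profit = solve_lp_2d((40, 30), constraints)
```

With these numbers the best mix is 20 units of A and 60 units of B, for a profit of 2,600.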

Integer Programming

Definition: Integer programming is a type of optimization where decision variables are
constrained to integer values. It is useful when the solution must be a whole number.

Example: A delivery company needs to assign trucks to different routes. Since a truck cannot be
split between routes, integer programming helps assign whole trucks so as to minimize travel
distance.
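
The truck example can be sketched as an assignment problem in which each truck takes exactly one route. The code below uses brute-force enumeration instead of a real integer programming solver, which is only feasible for a handful of trucks; the distance table is made up.

```python
from itertools import permutations

# Hypothetical distances in km: distance_km[truck][route]
distance_km = [
    [12, 30, 25],
    [20, 10, 28],
    [18, 22, 9],
]

def best_assignment(cost):
    """Try every one-truck-per-route assignment and keep the cheapest."""
    n = len(cost)
    best_routes, best_total = None, float("inf")
    for routes in permutations(range(n)):
        # routes[truck] is the route index given to that truck
        total = sum(cost[truck][route] for truck, route in enumerate(routes))
        if total < best_total:
            best_routes, best_total = routes, total
    return best_routes, best_total

routes, total_km = best_assignment(distance_km)
```

Each candidate solution uses whole trucks only, which is exactly the restriction that separates integer programming from ordinary linear programming.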

Network Optimization

Definition: Network optimization involves finding the most efficient flow of resources through
a network. It aims to minimize costs or maximize efficiency in transportation, logistics, or data
flow.

Example: A shipping company needs to minimize transportation costs by determining the most
efficient routes for its trucks to take between distribution centers.
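
One common network optimization task is finding the cheapest path between two points. The sketch below uses Dijkstra's shortest-path algorithm on a small, invented network of distribution centres; the location names and costs are assumptions.

```python
import heapq

def cheapest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total cost, list of stops) for the cheapest path."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)  # always expand the cheapest option so far
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, edge_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + edge_cost, neighbour, path + [neighbour]))
    return float("inf"), []  # goal cannot be reached

# Hypothetical transport costs between locations
network = {
    "Warehouse": {"Hub A": 4, "Hub B": 2},
    "Hub B": {"Hub A": 1, "Store": 7},
    "Hub A": {"Store": 3},
}
total_cost, route = cheapest_route(network, "Warehouse", "Store")
```

The cheapest route here goes through both hubs (cost 6) even though a direct-looking path through Hub A alone costs 7.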

Simulation Optimization

Definition: Simulation optimization uses simulation models to evaluate different scenarios and
find the best solution. It is particularly useful for complex, uncertain systems.

Example: A hospital wants to optimize the scheduling of surgeries. By running simulations of
different schedules, the best option to minimize waiting times is found.
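
The surgery example can be sketched by simulating each possible order of surgeries many times with random durations and keeping the order with the lowest average wait. The surgery names, mean durations, and the 20% variability are assumptions for illustration.

```python
import random
from itertools import permutations

# Hypothetical mean durations in minutes for three surgeries sharing one theatre
mean_minutes = {"cataract": 30, "hernia": 60, "bypass": 240}

def simulate_average_wait(order, runs=500, seed=42):
    """Average patient waiting time for one schedule, estimated over many random days."""
    rng = random.Random(seed)  # fixed seed keeps the comparison repeatable
    total_wait = 0.0
    for _ in range(runs):
        clock = 0.0
        for surgery in order:
            total_wait += clock  # this patient waits until all earlier surgeries finish
            mean = mean_minutes[surgery]
            clock += max(5.0, rng.gauss(mean, 0.2 * mean))  # uncertain duration
    return total_wait / (runs * len(order))

# Evaluate every possible schedule and keep the one with the shortest average wait
best_order = min(permutations(mean_minutes), key=simulate_average_wait)
```

With only three surgeries every schedule can be tried; for larger problems the simulation is usually paired with a search method that tests only promising schedules.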

Decision Trees

Definition: Decision trees are used to make decisions by breaking down a problem into a tree
structure of possible outcomes. They help in decision-making under uncertainty.

Example: A bank uses a decision tree to determine whether to approve a loan based on the
applicant's credit score and income.

Reinforcement Learning

Definition: Reinforcement learning is a machine learning technique where an agent learns by
interacting with an environment to maximize a cumulative reward.

Example: A self-driving car uses reinforcement learning to determine the best route based on
real-time traffic conditions, continuously improving its driving decisions.
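
A heavily simplified sketch of reinforcement learning is given below using tabular Q-learning. The "road" is a line of five positions with the destination at the end; the rewards, learning rate, and other settings are assumptions, and a real driving system would be far more complex.

```python
import random

ACTIONS = ("left", "right")
GOAL = 4  # positions 0..4 along a single road; the destination is position 4

def step(state, action):
    """Environment: move one position and return (next state, reward)."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 10 if next_state == GOAL else -1  # each extra move costs 1
    return next_state, reward

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error (Q-learning)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # safety cap on moves per episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # nudge the estimate toward reward plus discounted future value
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if state == GOAL:
                break
    return q

q_table = train()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(GOAL)}
```

After training, the learned policy picks "right" at every position, which is the shortest way to the destination.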

Non-linear Optimization

Definition: Non-linear optimization deals with optimization problems where the objective
function or constraints are non-linear, involving complex relationships between variables.

Example: A company in the energy sector uses non-linear optimization to minimize fuel
consumption while meeting fluctuating energy demands, considering non-linear relationships
in energy production costs.
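
As a small sketch of the energy example, assume two generators with quadratic (non-linear) fuel costs that must together meet a fixed demand. The cost formulas and the demand figure are invented, and ternary search is used here as one simple method that works when the cost curve has a single minimum.

```python
DEMAND_MW = 300  # hypothetical demand that the two generators must meet together

def total_fuel_cost(output_1):
    """Fuel cost when generator 1 produces output_1 and generator 2 covers the rest."""
    output_2 = DEMAND_MW - output_1
    cost_1 = 0.01 * output_1 ** 2 + 2.0 * output_1  # assumed cost curve, generator 1
    cost_2 = 0.02 * output_2 ** 2 + 1.0 * output_2  # assumed cost curve, generator 2
    return cost_1 + cost_2

def ternary_search_min(f, low, high, tolerance=1e-4):
    """Minimise a single-valley function on [low, high] by shrinking the interval."""
    while high - low > tolerance:
        m1 = low + (high - low) / 3
        m2 = high - (high - low) / 3
        if f(m1) < f(m2):
            high = m2  # the minimum cannot lie to the right of m2
        else:
            low = m1   # the minimum cannot lie to the left of m1
    return (low + high) / 2

best_output_1 = ternary_search_min(total_fuel_cost, 0, DEMAND_MW)
best_output_2 = DEMAND_MW - best_output_1
```

Setting the derivative of the total cost to zero gives the same answer by hand: about 183.3 MW from generator 1 and 116.7 MW from generator 2.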

Heuristics

Definition: Heuristics are rule-of-thumb techniques to quickly find good-enough solutions.
They are used when exact methods are too slow or complex.

Example: A traveler wants to pack a suitcase quickly for a trip. Instead of trying every packing
combination, they pack items in order of priority until the suitcase is full. The result may not
be perfect, but it is fast and practical.
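
The packing example corresponds to a greedy heuristic: sort the items by priority and take each one that still fits. The item list, weights, priorities, and weight limit below are assumptions.

```python
def pack_by_priority(items, capacity_kg):
    """Greedy heuristic: take items from highest to lowest priority while they fit."""
    packed, remaining = [], capacity_kg
    for name, weight, priority in sorted(items, key=lambda item: item[2], reverse=True):
        if weight <= remaining:
            packed.append(name)
            remaining -= weight
    return packed

# Hypothetical items: (name, weight in kg, priority from 1 to 10)
items = [
    ("passport", 0.1, 10),
    ("laptop", 2.0, 8),
    ("shoes", 1.5, 5),
    ("books", 3.0, 4),
    ("jacket", 1.0, 6),
]
packed = pack_by_priority(items, capacity_kg=5.0)
```

The heuristic makes one pass over the list instead of testing every combination, so it is fast but may miss a slightly better mix of items.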

Pros of Prescriptive Modeling

• Decision Support

• Optimal Resource Use

• Improves Strategic Planning

• Scenario Analysis

• Competitive Advantage

Cons of Prescriptive Modeling

• Complexity

• Data Dependency

• Assumptions May Be Unrealistic

• Limited Flexibility

• Cost & Time

Applications of Prescriptive Modeling

• Supply Chain Management

Optimizing inventory, transportation, and distribution strategies.

• Healthcare Management

Scheduling surgeries, optimizing patient flow, and managing hospital resources.

• Financial Portfolio Management

Recommending investment mixes to maximize returns and minimize risks.

• Manufacturing and Production Planning

Scheduling production lines and minimizing production costs.

• Marketing Campaign Optimization

Allocating budgets across channels for maximum impact.

• Human Resource Planning

Workforce scheduling, recruitment planning, and talent management.

• Transportation and Logistics

Route optimization for shipping and delivery services.

• Retail Pricing and Merchandising

Deciding optimal product prices and promotions.

All the Best !

