
BA Unit IV

Predictive analytics uses data analysis, machine learning, and statistical models to find patterns in historical and current data to forecast and predict future outcomes. There are three main techniques used: regression analysis to determine relationships between variables, decision trees to classify data into categories, and neural networks to model complex nonlinear relationships. Predictive analytics can be applied across many industries for uses like fraud detection, customer segmentation, risk reduction, inventory management, and maintenance forecasting. Models can be logic-driven based on inferences from existing conditions or data-driven by finding links between variables without clear physical relationships.

Uploaded by

It's me Rahul

BUSINESS ANALYTICS

UNIT IV
PREDICTIVE ANALYTICS
Predictive analytics
• Predictive analytics is the process of using data to forecast future
outcomes. The process uses data analysis, machine learning, artificial
intelligence, and statistical models to find patterns that might predict
future behavior. Organizations can use historic and current data to
forecast trends and behaviors seconds, days, or years into the future with
a great deal of precision.
PREDICTIVE ANALYTICS FRAMEWORKS
• Define the problem: A prediction starts with a good thesis and set of requirements. For
instance, can a predictive analytics model detect fraud? Determine optimal inventory levels for
the holiday shopping season? Identify potential flood levels from severe weather? A distinct
problem to solve will help determine what method of predictive analytics should be used.
• Acquire and organize data: an organization may have decades of data to draw upon, or a
continual flood of data from customer interactions. Before predictive analytics models can be
developed, data flows must be identified, and then datasets can be organized in a repository
such as a data warehouse like BigQuery.
• Pre-process data: raw data is only nominally useful by itself. To prepare the data for the
predictive analytics models, it should be cleaned to remove anomalies, missing data points, or
extreme outliers, any of which might be the result of input or measurement errors.
• Develop predictive models: data scientists have a variety of tools and techniques to develop
predictive models depending on the problem to be solved and nature of the dataset. Machine
learning, regression models, and decision trees are some of the most common types of
predictive models.
• Validate and deploy results: check on the accuracy of the model and adjust accordingly. Once
acceptable results have been achieved, make them available to stakeholders via an app,
website, or data dashboard.
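The pre-processing step in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `clean` function, the made-up `raw_sales` values and the 1.5 × IQR fence are choices made for this example, not rules given in the material, and real projects pick cleaning rules that suit their data.

```python
import statistics

def clean(values, k=1.5):
    """Drop missing entries, then drop points outside the IQR fences."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)  # quartiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if low <= v <= high]

# Hypothetical daily sales with two missing readings and one entry error.
raw_sales = [102, 98, None, 105, 97, 5000, 101, None, 99]
cleaned = clean(raw_sales)
print(cleaned)
```

Whether an extreme value is an error or a genuine event is a judgment call, so a rule like this is usually a starting point for review rather than an automatic filter.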
PREDICTIVE ANALYTICS TECHNIQUES:
Predictive analytics tends to be performed with three main types of techniques:

Regression analysis

Regression is a statistical analysis technique that estimates relationships between variables. It is useful for
finding patterns in large datasets and measuring how strongly inputs are related to an outcome. It is best employed on continuous
data that follows a known distribution. Regression is often used to determine how one or more independent
variables affect another, such as how a price increase will affect the sale of a product.
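As a minimal sketch of the price example, the following fits a straight line by ordinary least squares and uses it to predict sales at a new price. The `fit_line` helper and the price and sales figures are invented for illustration; the data is deliberately perfectly linear so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations: unit price and units sold at that price.
prices = [10, 12, 14, 16, 18]
units_sold = [200, 180, 160, 140, 120]

slope, intercept = fit_line(prices, units_sold)
predicted = slope * 20 + intercept  # expected units sold at a price of 20
print(slope, intercept, predicted)
```

Here each one-unit price increase is associated with ten fewer units sold. Real data is noisy, so the fitted line would only approximate the observations.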

Decision trees

Decision trees are classification models that place data into different categories based on distinct variables. The
method is best used when trying to understand an individual's decisions. The model looks like a tree, with each
branch representing a potential choice and the leaf at the end of a branch representing the result of the decision. Decision
trees are typically easy to understand and work well when a dataset has missing values.
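The branch-and-leaf structure can be sketched as a nested dictionary that is walked from the root to a leaf. This tree is hand-built for illustration, not learned from data; the feature names (`income`, `late_payments`), thresholds and labels are all hypothetical.

```python
# Internal nodes test one feature against a threshold; leaves hold a label.
tree = {
    "feature": "income",
    "threshold": 50000,
    "below": {"label": "decline"},
    "at_or_above": {
        "feature": "late_payments",
        "threshold": 2,
        "below": {"label": "approve"},
        "at_or_above": {"label": "review"},
    },
}

def classify(node, record):
    """Follow one branch per test until a leaf is reached."""
    while "label" not in node:
        go_below = record[node["feature"]] < node["threshold"]
        node = node["below"] if go_below else node["at_or_above"]
    return node["label"]

applicant = {"income": 72000, "late_payments": 0}
decision = classify(tree, applicant)
print(decision)
```

In practice a tree-learning algorithm chooses the features and thresholds from historical data; the traversal at prediction time works the same way.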

Neural networks

Neural networks are machine learning methods that are useful in predictive analytics when modeling very complex
relationships. Essentially, they are powerhouse pattern recognition engines. Neural networks are best used to
determine nonlinear relationships in datasets, especially when no known mathematical formula exists to analyze
the data. Neural networks can be used to validate the results of decision trees and regression models.
USES AND EXAMPLES OF PREDICTIVE ANALYTICS
Predictive analytics can be used to streamline operations, boost revenue, and mitigate risk for almost any business or industry, including
banking, retail, utilities, public sector, healthcare, and manufacturing. Sometimes augmented analytics is used, which applies machine
learning to big data. Here are some use case examples.

• Fraud detection: Predictive analytics examines all actions on a company’s network in real time to pinpoint
abnormalities that indicate fraud and other vulnerabilities.
• Conversion and purchase prediction: Companies can take actions, like retargeting online ads to visitors, with data
that predicts a greater likelihood of conversion and purchase intent.
• Risk reduction: Credit scores, insurance claims, and debt collections all use predictive analytics to assess and
determine the likelihood of future defaults.
• Operational improvement: Companies use predictive analytics models to forecast inventory, manage resources, and
operate more efficiently.
• Customer segmentation: By dividing a customer base into specific groups, marketers can use predictive analytics to
make forward-looking decisions to tailor content to unique audiences.
• Maintenance forecasting: Organizations use data to predict when routine equipment maintenance will be required and
can then schedule it before a problem or malfunction arises.
LOGIC AND DATA DRIVEN MODELS
• Predictive modelling is the method of building, testing and validating
a model to best predict the likelihood of an outcome. Several modelling
procedures from artificial intelligence, machine learning and statistics are
present in predictive analytics software solutions. Models can use
one or more classifiers to estimate the probability that a set of data
is related to another set.
• The different models available in predictive analytics software enable
the system to develop new data information and predictive models. Each
model has its own strengths and weaknesses and is best suited to particular
types of problems.
PREDICTIVE MODELLING
• Predictive modelling is at the heart of business decision making
• Building decision models is more an art than a science
• Creating an ideal decision model demands:
• Good understanding of functional business areas
• Knowledge of conventional and in-trend business practices and research
• Logical skillset
• It is always recommended to start simple and keep adding to the
models as required.
LOGIC-DRIVEN MODELS
• Logic-driven models are created on the basis of inferences and
postulations that the sample space and existing conditions provide.
Creating logical models requires a solid understanding of business
functional areas, logical skills to evaluate the propositions better and
knowledge of business practices and research.
• To understand better, take the example of a customer who
visits a restaurant around six times a year and spends around
₹5000 per visit. The restaurant earns a margin of around 40% on each
bill. The annual gross profit from that customer
is 5000 × 6 × 0.40 = ₹12,000. 30% of customers
do not return each year, while 70% do return to provide more
business to the restaurant.
• The average lifetime of a customer (the time for which a consumer remains a
customer) is 1/0.3 ≈ 3.33 years. So the average lifetime gross profit for a typical customer
is 12,000 × 3.33 = ₹39,960.

• Armed with all the above details, we can logically arrive at a conclusion and can derive
the following model for the above problem statement:

• Economic value of each customer (V) = (R × F × M)/D

• Where R = revenue generated per customer per visit

• F = frequency of visits per year

• M = profit margin

• D = defection rate (non-returning customers each year)
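The model V = (R × F × M)/D can be checked against the restaurant example with a short calculation. The function and parameter names are illustrative choices for this sketch. Note that the formula gives exactly ₹40,000, while the ₹39,960 quoted in the example comes from rounding the customer lifetime to 3.33 years before multiplying.

```python
def customer_value(revenue_per_visit, visits_per_year, margin, defection_rate):
    """Economic value of a customer: V = (R * F * M) / D."""
    return revenue_per_visit * visits_per_year * margin / defection_rate

# Figures from the restaurant example: 5000 per visit, 6 visits a year,
# 40% margin, 30% of customers lost each year.
value = customer_value(5000, 6, 0.40, 0.30)
print(round(value))
```

Because D sits in the denominator, small changes in the defection rate move the result sharply: at a 20% defection rate the same customer would be worth ₹60,000.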


Data-driven models
• The main aim of the data-driven modelling concept is to find links between the
system's state variables (input and output) without explicit knowledge of the
physical attributes and behaviour of the system. Data-driven
predictive modelling derives the modelling method from the set of
existing data and uses a predictive methodology to forecast future
outcomes.
• A model is data-driven only when there is no clear knowledge of the
relationships among the variables or within the system, though there is a lot of data. Here,
you are simply predicting the outcomes based on the data. The model is
not based on hand-picked variables, but may contain unobserved,
hidden combinations of variables.
Data driven modeling (DDM)
• Data driven modeling (DDM) is a technique in which configurator model components are
dynamically injected into the model based on data derived from external systems such as a catalog
system, customer relationship management (CRM), Watson, and so on.

• The Omni-configurator engine constructs the model components including option class and option item
during runtime based on the service request parameters, and populates associated properties before
executing business logic contained inside the configurator model.

• Using the DDM technique, a modeler can define a configurator model by using the Sterling Configurator
Visual Modeler tool with DDM properties that define the data source and selection criteria for injecting
the catalog items into the model. The data is retrieved from the system or data source by using the data
source adapters implemented for each system or data source.

• Based on the data source defined in the model, the corresponding data source adapters are invoked to
fetch the data. Model components are dynamically created in the configurator model based on the
data returned by the data source adapter.

• The DDM technique provides the following benefits over the static modeling technique:
• It reduces the total cost of ownership (TCO) because the modeler no longer has to manually construct
the model components that represent products within configurator models.
• It reduces the time to market since the model is dynamically updated with changes in the catalog
system.


1. Differences between Static and dynamic modeling techniques

Static: Each time a new item is added to the catalog, the modeler has to manually update the model to make
the new option items available in the UI.
Dynamic: There is no need to manually update the model. The modeler creates the DDM-based model with
placeholders for dynamic options. Therefore, whenever a new item is added to the catalog, the application
dynamically updates the model and displays the new option items in the UI.

Static: Multiple data sources are not supported. The model can receive data from a single data source only.
Dynamic: Multiple data sources are supported. Data from multiple data sources can be injected into a
DDM-based model.
TYPES OF PREDICTIVE MODELS

• There are many ways of classifying predictive models, and in practice
multiple types of models may be combined for best results. The most
salient distinction is between supervised and unsupervised models.
• Supervised models learn from data that has already been labeled with the
outcome to be predicted. Techniques include logistic regression, decision
trees, time series forecasting and neural networks.
• Unsupervised models identify patterns buried in data that has not been
labeled, using techniques such as clustering.
• Some of the most popular methods include the following:
• Decision trees. Decision tree algorithms take data (mined, open source, internal) and
graph it out in branches to display the possible outcomes of various decisions. Decision
trees classify or predict a response variable based on past decisions,
can be used with incomplete data sets and are easily explainable and accessible for
novice data scientists.
• Time series analysis. This is a technique for the prediction of events through a sequence of
time. You can predict future events by analyzing past trends and extrapolating from there.
• Logistic regression. This is a statistical method that estimates the probability of a
categorical outcome, such as yes or no, from one or more input variables. As more data is
brought in, the fitted model's ability to sort and classify it improves, and so do its predictions.
• Neural networks. This technique reviews large volumes of labeled data in search of
correlations between variables in the data. Neural networks form the basis of many of
today's examples of artificial intelligence (AI), including image recognition, smart assistants
and natural language generation.
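As a small illustration of the time series idea in the list above, the sketch below forecasts the next value of a series with simple exponential smoothing. This is one of many time series methods and is chosen here only because it is short; the function name, the smoothing factor `alpha = 0.5` and the demand figures are assumptions made for the example.

```python
def exponential_smoothing_forecast(series, alpha=0.5):
    """One-step-ahead forecast: a weighted blend of each new observation
    and the previous smoothed level."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical monthly demand.
monthly_demand = [100, 110, 120, 130]
forecast = exponential_smoothing_forecast(monthly_demand)
print(forecast)
```

A higher `alpha` makes the forecast react faster to recent values, while a lower one smooths out noise. For a steadily rising series like this one, plain smoothing lags behind the trend, which is why trend-aware methods are often preferred.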
ALGORITHMS FOR PREDICTIVE MODELING
• Random forest. This algorithm combines unrelated decision trees and uses
classification and regression to organize and label vast amounts of data.
• Gradient boosted model. Similar to random forest, this algorithm uses
several decision trees, but in this method, each tree corrects the flaws of the
previous one and builds a more accurate picture.
• K-means. This clustering algorithm groups similar data points together
and is popular in devising personalized retail offers. It creates
personalized offers by seeking out similarities among large groups of
customers.
• Prophet. A forecasting procedure, this algorithm is especially effective when
dealing with capacity planning. This algorithm deals with time series data
and is relatively flexible.
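To make the k-means entry in the list above concrete, here is a minimal pure-Python sketch that groups customers by two made-up attributes. It is an illustration rather than a production implementation: the initial centroids are simply the first k points (real libraries use better initialization), the iteration count is fixed, and the customer figures are hypothetical.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its assigned points, and repeat."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (visits per month, average spend per visit).
customers = [(1, 20), (2, 22), (1, 25), (9, 200), (10, 210), (8, 190)]
centroids, clusters = kmeans(customers, k=2)
print(clusters)
```

The two groups that emerge (occasional low spenders and frequent high spenders) could then receive different offers. Features on very different scales are normally standardized first so that one of them does not dominate the distance.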
How to build a predictive model
Building a predictive model starts with identifying historical data that's representative of the outcome you
are trying to predict.
"The model can infer outcomes from historical data but cannot predict what it has never seen before,"
Carroll said. Therefore, the volume and breadth of information used to train the model is critical to
securing an accurate prediction for the future.
The next step is to identify ways to clean, transform and combine the raw data that leads to better
predictions.
Skill is required in not only finding the appropriate set of raw data but also transforming it into data
features that are most appropriate for a given model. For example, calculations of time-boxed weekly
averages may be more useful and lead to better algorithms than real-time levels.
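The weekly-average feature mentioned above can be sketched as follows. The `weekly_averages` helper and the daily order counts are invented for illustration; the sketch assumes the series starts on the first day of a week and contains whole weeks.

```python
from statistics import mean

def weekly_averages(daily_values, days_per_week=7):
    """Collapse a daily series into one average per consecutive 7-day block."""
    return [
        mean(daily_values[i:i + days_per_week])
        for i in range(0, len(daily_values), days_per_week)
    ]

# Two hypothetical weeks of daily order counts, with weekend peaks.
daily_orders = [12, 15, 11, 14, 20, 30, 28, 13, 16, 12, 15, 22, 33, 29]
features = weekly_averages(daily_orders)
print(features)
```

The weekly figures hide the weekend spikes and expose the underlying week-over-week change, which is often the signal a model needs.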
It is also important to weed out data that is coincidental or not relevant to a model. At best, the additional
data will slow the model down, and at worst, it will lead to less accurate models.
This is both an art and a science. The art lies in cultivating a gut feeling for the meaning of things and
intuiting the underlying causes. The science lies in methodically applying algorithms to consistently
achieve reliable results, and then evaluating these algorithms over time. Just because a spam filter works
on day one does not mean marketers will not tune their messages, making the filter less effective.
Analyzing representative portions of the available information -- sampling -- can help speed development
time on models and enable them to be deployed more quickly.
Benefits of predictive modeling
Phil Cooper, group VP of products at Clari, a RevOps software startup, said some of the
top benefits of predictive modeling in business include the following:
• Prioritizing resources. Predictive modeling is used to identify sales lead conversion and
send the best leads to inside sales teams; predict whether a customer service case will be
escalated and triage and route it appropriately; and predict whether a customer will pay their
invoice on time and optimize accounts receivable workflows.
• Improving profit margins. Predictive modeling is used to forecast inventory, create pricing
strategies, predict the number of customers and configure store layouts to maximize sales.
• Optimizing marketing campaigns. Predictive modeling is used to unearth new customer
insights and predict behaviors based on inputs, allowing organizations to tailor marketing
strategies, retain valuable customers and take advantage of cross-sell opportunities.
• Reducing risk. Predictive analytics can detect activities that are out of the ordinary such as
fraudulent transactions, corporate spying or cyber attacks to reduce reaction time and
negative consequences.
The techniques used in predictive modeling are probabilistic as opposed to deterministic. This
means models generate probabilities of an outcome and include some uncertainty.
"This is a fundamental and inherent difference between data modeling of historical facts
versus predicting future events [based on historical data] and has implications for how this
information is communicated to users," Cooper said. Understanding this difference is a critical
necessity for transparency and explainability in how a prediction or recommendation was
generated.
Predictive modeling versus predictive analytics
Predictive modeling is but one aspect in the larger predictive analytics process
cycle. This includes collecting, transforming, cleaning and modeling data using
independent variables, and then reiterating if the model does not quite fit the
problem to be addressed.
"Once data has been gathered, transformed and cleansed, then predictive
modeling is performed on the data," said Terri Sage, chief technology officer at
1010data, an analytics consultancy.
Collecting data, transforming and cleaning are processes used for other types
of analytic development.
"The difference with predictive analytics is the inclusion and discarding of
variables during the iterative modeling process," Sage explained.
This will differ across various industries and use cases, as there will be diverse
data used and different variables discovered during the modeling iterations.
For example, in healthcare, predictive models may ingest a tremendous
amount of data pertaining to a patient and forecast a patient's response to
certain treatments and prognosis. Data may include the patient's specific
medical history, environment, social risk factors, genetics -- all of which vary from
person to person. The use of predictive modeling in healthcare marks a shift
from treating patients based on averages to treating patients as individuals.
Similarly, with marketing analytics, predictive models might use data sets
based on a consumer's salary, spending habits and demographics. Different
data and modeling will be used for banking and insurance to help determine
credit ratings and identify fraudulent activities.
Predictive modeling tools
Before deploying a predictive modeling tool, it is crucial for your organization
to clarify the following: who will be running the software, what the use case
will be, what other tools your predictive analytics will interact with, and
what the budget is. Different tools have different data literacy requirements,
are effective in different use cases, integrate best with particular software and
can be expensive. Once your organization has clarity on these issues,
comparing tools becomes easier.
• Sisense. Business intelligence software aimed at a variety of
companies that offers a range of business analytics features. It
requires minimal IT background.
• Oracle Crystal Ball. A spreadsheet-based application focused on
engineers, strategic planners and scientists across industries that can be
used for predictive modeling, forecasting as well as simulation and
optimization.
• IBM SPSS Predictive Analytics Enterprise. A business intelligence
platform that supports open source integration and features descriptive
and predictive analysis as well as data preparation.
• SAS Advanced Analytics. A program that offers algorithms that identify
the likelihood of future outcomes and can be used for data mining,
forecasting and econometrics.
The future of predictive modeling

There are three key trends that will drive the future of data
modeling.
1. First, data modeling capabilities are being baked into more
business applications and citizen data science tools. These
capabilities can provide the appropriate guardrails and templates
for business users to work with predictive modeling.
2. Second, the tools and frameworks for low-code predictive
modeling are making it easier for data science experts to quickly
cleanse data, create models and vet the results.
3. Third, better tools are coming to automate many of the data
engineering tasks required to push predictive models into
production. Carroll predicts this will allow more organizations to
shift from simply building models to deploying them in ways that
deliver on their potential value.
Challenges of predictive modeling
Data preparation. One of the most frequently overlooked challenges of predictive
modeling is acquiring the correct amount of data and sorting out the right data to use
when developing algorithms. By some estimates, data scientists spend about 80% of their
time on this step. Data collection is important but limited in usefulness if this data is not
properly managed and cleaned.
Once the data has been sorted, organizations must be careful to avoid overfitting.
Over-testing on training data can result in a model that appears very accurate but has
memorized the key points in the data set rather than learned how to generalize.
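A deliberately extreme sketch of overfitting: the "model" below simply memorizes every training example and falls back to the most common training label for anything it has not seen. The data, labels and helper names are made up for illustration. The point is only to show why accuracy must be measured on held-out data.

```python
from collections import Counter

# Made-up labeled examples: (feature value, label).
train = [(1, "no"), (2, "no"), (3, "yes"), (4, "no"), (5, "no"), (7, "yes"), (8, "yes")]
test = [(0, "no"), (6, "yes"), (9, "yes"), (10, "yes")]

# "Model" that memorizes each training pair and guesses the majority
# training label for inputs it has never seen.
memory = dict(train)
fallback = Counter(label for _, label in train).most_common(1)[0][0]

def predict(x):
    return memory.get(x, fallback)

def accuracy(data):
    return sum(predict(x) == y for x, y in data) / len(data)

train_accuracy = accuracy(train)
test_accuracy = accuracy(test)
print(train_accuracy, test_accuracy)
```

The model is perfect on the data it was built from and poor on new data. Keeping a test set that is never used during model development is the standard guard against being misled this way.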
Technical and cultural barriers. While predictive modeling is often considered to be
primarily a mathematical problem, users must plan for the technical and organizational
barriers that might prevent them from getting the data they need. Often, systems that
store useful data are not connected directly to centralized data warehouses. Also, some
lines of business may feel that the data they manage is their asset, and they may not
share it freely with data science teams.
Choosing the right business case. Another potential obstacle for predictive modeling
initiatives is making sure projects address significant business challenges. Sometimes,
data scientists discover correlations that seem interesting at the time and build algorithms
to investigate the correlation further. However, just because they find something that is
statistically significant does not mean it presents an insight the business can use.
Predictive modeling initiatives need to have a solid foundation of business relevance.
Bias. "One of the more pressing problems everyone is talking about, but few have
addressed effectively, is the challenge of bias," Carroll said. Bias is naturally introduced
into the system through historical data since past outcomes reflect existing bias.
Nate Nichols, distinguished principal at Narrative Science, a natural language generation
tools provider, is excited about the role that new explainable machine learning methods
such as LIME or SHAP could play in addressing concerns about bias and promoting trust.
PREDICTIVE ANALYSIS PROCEDURE
DATA MINING FOR PREDICTIVE ANALYTICS
• What is data mining?

• Data mining refers to a process of analyzing data from different contexts and
summarizing it into useful information. The information gathered from data mining could
include customer patterns, purchase patterns, transaction times, customer demand, and the
relationship between the sold items. It is a powerful technology with great potential to
assist companies in targeting the most significant information in the data set they have
gathered about the customer behaviours and potential of the customers.

• The data mining process involves the following steps:
• Business understanding
• Data selection
• Data preparation
• Modelling
• Evaluation
• Deployment
APPLICATION OF DATA MINING

• Financial analysis
• Biological data analysis
• Market analysis
• Retail industry
• Manufacturing engineering
• Criminal investigation
Predictive analytics versus data mining

Predictive analytics: Refers to the use of both new and historical data, statistical algorithms, and machine
learning techniques to forecast future activity, patterns, and trends.
Data mining: Refers to the computational technique of discovering patterns in huge data sets involving
methods at the intersection of AI.

Predictive analytics: It helps to make predictions about future events.
Data mining: It helps to understand the gathered information better.

Predictive analytics: Business analysts and other SMEs perform it.
Data mining: Statisticians and engineers perform it.

Predictive analytics: It applies business knowledge to find patterns to get valid business predictions.
Data mining: It applies algorithms such as classification and regression on gathered information to find
hidden patterns.
How does data mining work?
• The entire process of data mining consists of three basic stages:

• Exploration: the first stage usually starts with data preparation, i.e., from cleaning data
to data transformations, selecting subsets of records and so forth. This stage can range
anywhere from a simple choice of straightforward predictors for a regression model to elaborate
exploratory analyses using a wide variety of graphical and statistical methods. Keeping the nature of
the analytic problem in mind, businesses can quickly identify the most relevant variables and at the
same time determine the complexity and/or the general nature of models.

• Model building or pattern identification: the second stage involves considering several candidate models
and choosing the right one for your needs based on predictive performance. This sounds simple but
can be an elaborate process. Several techniques can be taken into account, such
as bagging (voting, averaging), boosting, stacking (stacked generalizations), and meta-learning.
Many of these are based on so-called “competitive evaluation of models.” This
means applying different models to the same data set and then comparing their performance to choose
the best.

• Deployment: the last stage involves applying the selected model to new data to
generate predictions or estimates of the expected outcome. Data mining is becoming increasingly
popular as a business information management tool. The main difference between
data mining and traditional exploratory data analysis (EDA) is that data mining is oriented more
towards applications than towards the fundamental nature of the underlying phenomena. In other words, it is less
concerned with identifying the specific relations between the variables involved.
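The "competitive evaluation of models" described in the second stage can be sketched as fitting several candidates on the same training data and scoring each on the same held-out data. Everything here is illustrative: the two candidates (a mean baseline and a least-squares trend line), the error metric (mean absolute error) and the tiny data set are assumptions made for the example.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def fit_mean(xs, ys):
    """Baseline candidate: always predict the training average."""
    avg = sum(ys) / len(ys)
    return lambda x: avg

def fit_line(xs, ys):
    """Second candidate: least-squares straight line."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Hypothetical data, split into training and held-out portions.
train_x, train_y = [1, 2, 3, 4], [12, 14, 16, 18]
test_x, test_y = [5, 6], [20, 22]

candidates = {"mean_baseline": fit_mean, "trend_line": fit_line}
scores = {}
for name, fit in candidates.items():
    model = fit(train_x, train_y)
    scores[name] = mean_absolute_error(test_y, [model(x) for x in test_x])

best = min(scores, key=scores.get)  # lowest held-out error wins
print(scores, best)
```

Including a trivial baseline among the candidates is a useful habit: a sophisticated model that cannot beat it on held-out data is not adding value.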
ANALYSIS OF PREDICTIVE ANALYTICS

• In predictive analysis, both Excel and IBM SPSS are used to compute
statistics in this step of the BA process.
• Stepwise multiple regression can be used to evaluate which independent
variables are best included in or excluded from a linear model.
• Validation statistics: the multiple correlation coefficient and the F-test
from ANOVA
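The two validation statistics named above can be computed by hand for a small regression. This sketch uses a single predictor and made-up data; the variable names are illustrative. It relies on the standard definitions R² = 1 − SSE/SST and F = (R²/k) / ((1 − R²)/(n − k − 1)), where k is the number of predictors and n the number of observations.

```python
import math

# Hypothetical observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n, k = len(x), 1  # k = number of predictors in the model

# Fit y = intercept + slope * x by least squares.
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * a for a in x]

sse = sum((b - f) ** 2 for b, f in zip(y, fitted))  # unexplained variation
sst = sum((b - mean_y) ** 2 for b in y)             # total variation
r_squared = 1 - sse / sst
multiple_r = math.sqrt(r_squared)                   # multiple correlation coefficient
f_statistic = (r_squared / k) / ((1 - r_squared) / (n - k - 1))
print(multiple_r, f_statistic)
```

The F statistic is then compared against the F distribution with (k, n − k − 1) degrees of freedom to judge significance. Excel and SPSS report that p-value directly; it is not computed in this sketch.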
