DA5.6 Marketing Analytics q&a
[1] Briefly explain the differences between data analytics and data mining.
Data mining covers collecting the data and deriving crude but essential insights from it.
Data analytics then builds on that data and those initial hypotheses to create
a model based on the data. Data mining is thus one step in the process of data analytics.
[2] What is a data mining tool? Mention different types of data mining tools and their
applications.
Data mining tools are software programs that help in framing and executing data mining
techniques to create data models and test them. A tool is usually a framework, such as
RStudio or Tableau, with a suite of programs that help build and test a data model.
Commonly used tools include RapidMiner, Oracle Data Mining, IBM SPSS Modeler, KNIME,
Python, Orange, Kaggle, Rattle, Weka, Teradata, H2O, Apache Spark, Sisense, and Xplenty.
Data mining itself has several types, including pictorial data mining, text mining, social
media mining, web mining, and audio and video mining, amongst others.
A few applications of data mining are described below:
1. Healthcare
Data mining methods have the potential to transform the healthcare system completely. They can be
used to identify best practices based on data and analytics, which can help healthcare facilities
reduce costs and improve patient outcomes. Data mining can be combined with machine learning,
statistics, data visualization, and other techniques to make a difference. It can come in handy
when forecasting the number of patients in different categories, which helps patients receive
intensive care when and where they need it. Data mining can also help healthcare insurers identify
fraudulent activities.
2. Education
The use of data mining methods in education aims to develop techniques that can use data coming
out of educational environments for knowledge exploration. The purposes these techniques
are expected to serve include studying how educational support impacts students, supporting the
future learning needs of students, and promoting the science of learning, amongst others.
Educational institutions can use these techniques not only to predict how students are going to
do in examinations but also to make accurate decisions. With this knowledge, these institutions can
focus more on their teaching pedagogy.
3. Market basket analysis
This is a modelling technique based on the hypothesis that customers who buy a certain group of
items are more likely to buy another group of items. Retailers can use this technique to
understand the buying habits of their customers, and use that information to change
the layout of their stores and make shopping easier and less time-consuming
for customers.
4. Customer relationship management (CRM)
CRM involves acquiring and keeping customers, improving loyalty, and employing customer-centric
strategies. Every business needs to collect and analyze customer data and use the findings
to build long-lasting relationships with its customers.
[3] Mention the different types of methods used in data mining and explain them.
Some Methods are:
Association
Classification
Clustering Analysis
Prediction
Sequential Patterns or Pattern Tracking
Decision Trees
Outlier Analysis or Anomaly Analysis
Neural Network
1. Association
Association is used to find a correlation between two or more items by identifying hidden patterns
in the data set, and is hence also called relation analysis. This method is used in market basket
analysis to predict the behaviour of the customer.
Suppose the marketing manager of a supermarket wants to determine which products are
frequently purchased together.
As an example,
buys(x, "beer") -> buys(x, "chips") [support = 1%, confidence = 50%]
Here x represents a customer, and the rule relates buying beer to buying chips.
Confidence expresses certainty: if a customer buys beer, there is a 50% chance that he/she will
also buy chips.
Support means that 1% of all the transactions under analysis showed that beer and chips were
bought together.
Many similar examples like bread and butter or computer and software can be considered.
There are two types of association rules:
Single-dimensional association rule: These rules involve a single attribute (predicate), such as buys, that is repeated within the rule.
Multidimensional association rule: These rules involve two or more attributes (predicates), such as age, income, and buys.
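The support and confidence of the beer-and-chips rule can be worked out directly from transaction data. The following is a minimal sketch in Python; the ten baskets are invented purely for illustration, so the resulting figures (20% support, 50% confidence) differ from the 1% and 50% quoted in the example above.

```python
# Hypothetical transactions: each set is one customer's basket.
transactions = [
    {"beer", "chips"},
    {"beer", "chips", "bread"},
    {"beer"},
    {"beer", "butter"},
    {"bread", "butter"},
    {"chips"},
    {"bread"},
    {"milk"},
    {"bread", "butter", "milk"},
    {"milk", "chips"},
]


def support(itemset, baskets):
    """Fraction of all baskets that contain every item in `itemset`."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Among baskets containing `antecedent`, the fraction that also contain `consequent`."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


rule_support = support({"beer", "chips"}, transactions)
rule_confidence = confidence({"beer"}, {"chips"}, transactions)
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

Here beer appears in 4 of the 10 baskets and beer with chips in 2 of them, which is where the two figures come from.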
2. Classification
This data mining method is used to assign the items in a data set to classes or groups. It
helps to predict the behaviour of entities within each group accurately. It is a two-step process:
Learning step (training phase): A classification algorithm builds the classifier by analyzing a
training set.
Classification step: Test data are used to estimate the accuracy or precision of the classification
rules.
For example, a bank uses classification to identify loan applicants as low, medium, or high credit risks.
Similarly, a medical researcher analyzes cancer data to predict which medicine to prescribe to a
patient.
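The two-step process can be sketched with the loan applicant example. The sketch below uses a 1-nearest-neighbour classifier, chosen only because it is short; the source does not prescribe any particular algorithm. The attributes (income and debt, in arbitrary units) and all data values are assumptions made up for illustration.

```python
import math

# Hypothetical training set: (income, debt) -> credit risk class.
training_set = [
    ((20, 15), "high"),
    ((25, 12), "high"),
    ((50, 10), "medium"),
    ((55, 8), "medium"),
    ((90, 3), "low"),
    ((100, 2), "low"),
]

# Hypothetical test set with known labels, used only to estimate accuracy.
test_set = [
    ((22, 14), "high"),
    ((52, 9), "medium"),
    ((95, 2), "low"),
    ((70, 6), "medium"),
]


def learn(samples):
    """Learning step: for 1-nearest-neighbour, the 'classifier' is the stored training set."""
    return list(samples)


def classify(classifier, applicant):
    """Return the class of the training example closest to `applicant`."""
    _, label = min(classifier, key=lambda sample: math.dist(sample[0], applicant))
    return label


# Classification step: estimate accuracy on test data.
classifier = learn(training_set)
correct = sum(1 for features, label in test_set if classify(classifier, features) == label)
accuracy = correct / len(test_set)
print(f"accuracy on test data: {accuracy:.0%}")
```

Once the estimated accuracy is acceptable, the same classify function can be applied to new applicants whose risk class is unknown.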
3. Clustering Analysis
Clustering is similar to classification, but here the groups (clusters) are not predefined; they are
formed from the similarities between data items. Objects in different clusters are dissimilar or
unrelated. It is also called data segmentation, as it partitions huge data sets into groups according
to their similarities.
Various clustering methods are used:
Hierarchical Agglomerative methods
Grid-Based Methods
Partitioning Methods
Model-Based Methods
Density-Based Methods
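As an illustration of a partitioning method, the sketch below runs a basic k-means procedure on one-dimensional data. The income values, the choice of three clusters, and the way the starting centroids are picked (evenly spread over the sorted data) are all assumptions for this example, not something the methods above prescribe.

```python
def kmeans_1d(values, k, iterations=10):
    """Basic k-means on numbers; requires k >= 2. Returns (clusters, centroids)."""
    ordered = sorted(values)
    # Deterministic start: pick k values spread evenly across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment: each value joins the cluster with the nearest centroid.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update: move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters, centroids


# Hypothetical applicant incomes (arbitrary units).
incomes = [20, 22, 25, 60, 62, 65, 100, 105]
clusters, centroids = kmeans_1d(incomes, k=3)
for centroid, cluster in zip(centroids, clusters):
    print(f"centroid {centroid:.1f}: {cluster}")
```

No class labels are supplied anywhere; the three groups emerge only from how close the values are to each other.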
A similar example of loan applicants can be considered here also. Some differences are depicted in
the figure below.
4. Prediction
This method is used to predict the future based on past and present trends or data sets.
Prediction is mostly used in combination with other mining methods such as classification, pattern
matching, trend analysis, and relation analysis.
For example, a sales manager may want to predict the amount of revenue that each item will
generate based on past sales data. Prediction models a continuous-valued function that estimates
missing or unavailable numeric data values.
Regression analysis is a common choice for prediction. It can be used to model the relationship
between independent variables and dependent variables.
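A minimal sketch of regression-based prediction, assuming a single independent variable (the month number) and a dependent variable (revenue). It fits a straight line by ordinary least squares; the revenue figures are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past sales: revenue of one item over five months.
months = [1, 2, 3, 4, 5]
revenue = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, revenue)
forecast = slope * 6 + intercept  # predicted revenue for month 6
print(f"revenue = {slope:.2f} * month + {intercept:.2f}; month 6 forecast = {forecast:.2f}")
```

Real sales data would not lie exactly on a line, so the fitted line would only approximate the trend.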
5. Sequential patterns or Pattern tracking
This method is used to identify patterns that frequently occur over a certain period of time.
For example, a clothing company’s sales manager sees that sales of jackets increase just
before the winter season, or that bakery sales increase around Christmas and New Year’s Eve.
Let’s look at an example with a graph.
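The jacket example can also be checked numerically. The sketch below is a deliberately simple stand-in for pattern tracking: it finds the peak sales month in each year and reports the month that recurs most often. The monthly figures are invented for illustration.

```python
from collections import Counter

# Hypothetical monthly jacket sales (January to December) for three years.
jacket_sales = {
    2021: [30, 28, 25, 20, 15, 10, 10, 12, 20, 45, 80, 60],
    2022: [32, 30, 24, 18, 14, 11, 9, 13, 22, 50, 85, 62],
    2023: [35, 29, 26, 21, 16, 12, 11, 14, 25, 48, 90, 70],
}

# Month number (1-12) with the highest sales in each year.
peak_month_by_year = {
    year: sales.index(max(sales)) + 1 for year, sales in jacket_sales.items()
}

# The peak month that recurs most often, and in how many years it occurred.
recurring_month, years_seen = Counter(peak_month_by_year.values()).most_common(1)[0]
print(f"peak month {recurring_month} recurred in {years_seen} of {len(jacket_sales)} years")
```

A peak that falls in the same month year after year is the kind of recurring pattern this method looks for.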
6. Decision Trees
A decision tree is a tree structure (as its name suggests), where
Each internal node represents a test on an attribute.
Each branch denotes a result of the test.
Terminal (leaf) nodes hold the class label.
The topmost node is the root node, which poses a simple question that has two or more answers.
The tree grows accordingly, and a flowchart-like structure is generated.
In this decision tree, the government classifies citizens as below age 18 or above age 18. This would
help it decide whether a license should be issued to a particular citizen or not.
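The structure described above can be sketched as nested nodes. In this minimal example, internal nodes carry a test and branches, and leaf nodes carry a class label. The age test follows the license example; the second test (passed_exam) and the label names are hypothetical additions to show a tree with more than one level.

```python
# Internal nodes hold a "test" and "branches"; leaf nodes hold a "label".
license_tree = {
    "test": lambda citizen: citizen["age"] >= 18,  # root node question
    "branches": {
        False: {"label": "no license"},
        True: {
            # Hypothetical second test, not part of the original example.
            "test": lambda citizen: citizen["passed_exam"],
            "branches": {
                True: {"label": "issue license"},
                False: {"label": "no license"},
            },
        },
    },
}


def classify(node, record):
    """Walk from the root to a leaf, following the branch for each test result."""
    while "label" not in node:
        outcome = node["test"](record)
        node = node["branches"][outcome]
    return node["label"]


print(classify(license_tree, {"age": 16, "passed_exam": True}))
print(classify(license_tree, {"age": 30, "passed_exam": True}))
```

In practice the tests and their order are learned from training data rather than written by hand as they are here.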
7. Outlier Analysis or Anomaly Analysis
This method identifies the data items that do not comply with the expected pattern or expected
behaviour. These unexpected data items are considered outliers or noise. Outlier analysis is useful in
many domains, such as credit card fraud detection, intrusion detection, and fault detection. It is also
called outlier mining.
For example, let’s assume the graph below is plotted using some data sets in our database,
and the best-fit line is drawn. The points lying near the line show expected behaviour, while the
point far from the line is an outlier.
This helps to detect anomalies and take appropriate action.
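The best-fit-line idea can be sketched as follows: fit a least-squares line, measure how far each point lies from it (the residual), and flag points whose residual is unusually large. The data and the cut-off of two standard deviations are assumptions chosen for illustration.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: y is roughly 2 * x, except for one unexpected point.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 4, 6, 8, 40, 12, 14, 16]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Assumed rule of thumb: flag residuals larger than two standard deviations.
threshold = 2 * statistics.pstdev(residuals)
outliers = [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) > threshold]
print("outliers:", outliers)
```

Note that the outlier itself pulls the fitted line toward it, so with noisier data a more robust fitting method or threshold may be needed.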
8. Neural Network
This method or model is based on biological neural networks. It is a collection of neuron-like
processing units with weighted connections between them. They are used to model the relationship
between inputs and outputs. It is used for classification, regression analysis, data processing, etc. This
technique rests on three pillars:
Model
Learning algorithm (supervised or unsupervised)
Activation function
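The three pillars can be seen in the smallest possible network, a single neuron (perceptron). In the sketch below, the model is a weighted sum of the inputs plus a bias, the activation function is a step function, and the learning algorithm is the supervised perceptron rule. The logical AND task, the learning rate, and the number of epochs are assumptions chosen for illustration.

```python
def step(z):
    """Activation function: fire (1) only when the weighted sum is positive."""
    return 1 if z > 0 else 0


def predict(weights, bias, inputs):
    """Model: weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)


def train(samples, epochs=20, learning_rate=1):
    """Learning algorithm (supervised): perceptron rule, adjusting weights by the error."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Training data for logical AND: output is 1 only when both inputs are 1.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(and_gate)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_gate]
print("predictions:", predictions)
```

Practical neural networks stack many such units in layers and use smoother activation functions, but the same three ingredients remain.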