BDA Notes Unit-5
Machine learning enables a machine to automatically learn from data, improve performance
from experiences, and predict things without being explicitly programmed.
With the help of sample historical data, which is known as training data, machine learning
algorithms build a mathematical model that helps in making predictions or decisions without
being explicitly programmed. Machine learning brings computer science and statistics together
for creating predictive models. Machine learning constructs or uses algorithms that learn
from historical data: the more information we provide, the better the performance.
A machine has the ability to learn if it can improve its performance by gaining more data.
Suppose we have a complex problem where we need to make some predictions. Instead
of writing code for it, we just need to feed the data to generic algorithms; with the help
of these algorithms, the machine builds the logic as per the data and predicts the output.
Machine learning has changed our way of thinking about such problems. The block diagram
below explains the working of a machine learning algorithm:
We can train machine learning algorithms by providing them with huge amounts of data and
letting them explore the data, construct models, and predict the required output automatically. The
performance of the machine learning algorithm depends on the amount of data, and it can be
determined by the cost function. With the help of machine learning, we can save both time and
money.
The importance of machine learning can be easily understood by its use cases. Currently,
machine learning is used in self-driving cars, cyber fraud detection, face recognition,
friend suggestions by Facebook, etc. Various top companies such as Netflix and Amazon
have built machine learning models that use vast amounts of data to analyze user
interests and recommend products accordingly.
Machine learning algorithms are broadly classified into three types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
What is R Analytics?
R has become increasingly popular over many years and remains a top analytics
language for many universities and colleges. It is well established today within
academia as well as among corporations around the world for delivering robust,
reliable, and accurate analytics. While R programming was originally seen as
difficult for non-statisticians to learn, the user interface has become more user-
friendly in recent years. It also now supports extensions and tools such as
RStudio and RExcel, making the learning process easier and faster for new
business analysts and other users. It has become the industry standard for
statistical analysis and data mining projects and is due to grow in use as more
graduates enter the workforce as R-trained analysts.
Leveraging Big Data: R can help with querying big data and is used by many
industry leaders to leverage big data across the business. With R analytics,
organizations can surface new insights in their large data sets and make sense of
their data. R can handle these big datasets and is arguably as easy to use as, if not
easier than, any of the other analytics tools available today.
Common applications of R analytics include:
Statistical testing
Prescriptive analytics
Predictive analytics
Time-series analysis
What-if analysis
Regression models
Data exploration
Forecasting
Text mining
Data mining
Visual analytics
Web analytics
Social media analytics
Sentiment analysis
It provides good explanatory code. For example, if you are at the early stage
of a machine learning project and need to explain the work you do, it is
easier to do so in R than in Python, as R provides proper statistical methods
for working with data in fewer lines of code.
The R language is well suited to data visualization and provides good
facilities for prototyping machine learning models.
The R language has strong tools and library packages for machine learning
projects. Developers can use these packages in the pre-modelling, modelling,
and post-modelling stages of a project. Many statistical packages for R are
more advanced and extensive than their Python counterparts, which makes R
a natural choice for such projects. Two examples:
lattice: The lattice package supports the creation of graphs displaying one
variable, or the relationship between multiple variables, conditioned on other
variables.
DataExplorer: This R package focuses on automating data visualization and
data handling so that the user can concentrate on the data insights of the
project.
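A minimal sketch of these two packages in use, assuming the built-in mtcars dataset (DataExplorer must be installed separately):

library(lattice)
# Scatter plots of fuel efficiency vs. weight, conditioned on cylinder count
xyplot(mpg ~ wt | factor(cyl), data = mtcars,
       xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

library(DataExplorer)
plot_missing(mtcars)    # visualize the missing-value profile
plot_histogram(mtcars)  # histograms of all continuous variables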
Many top companies such as Google, Facebook, and Uber use the R language for
machine learning. The applications include:
Social Network Analytics
To analyze trends and patterns
Getting insights into the behaviour of users
To find the relationships between the users
Developing analytical solutions
Accessing charting components
Embedding interactive visual graphics
Web search and voice assistants like Siri, Alexa, Google, Cortana: Recognize
the user’s voice and fulfill the request made
Social Media Service: Help people to connect all over the world and also
show recommendations of people we may know
Online Customer Support: Provide greater convenience for customers and
efficiency for support agents
Intelligent Gaming: Use highly responsive and adaptive non-player
characters with human-like intelligence
Product Recommendation: A software tool used to recommend the
product that you might like to purchase or engage with
Virtual Personal Assistants: Software that can perform tasks according to
the instructions provided
Traffic Alerts: Provide traffic alerts according to the current situation
Online Fraud Detection: Check for unusual actions performed by the user
and detect fraud
Healthcare: Machine learning can manage amounts of data beyond the
capacity of a normal human being and helps to identify a patient's illness
according to symptoms
Real-world example: When you search for some kind of cooking recipe on
YouTube, you will see recommendations below with the title “You May
Also Like This”. This is a common use of machine learning.
Packages
We will be using, directly or indirectly, the following packages through the chapters:
caret
ggplot2
mlbench
class
caTools
randomForest
impute
ranger
kernlab
glmnet
naivebayes
rpart
rpart.plot
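As a minimal sketch of how some of these packages fit together, here is a caret workflow with the randomForest backend on the built-in iris dataset (the model choice and tuning settings are illustrative assumptions):

library(caret)
set.seed(42)

# Split the labelled data into training and test sets
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[idx, ]
test_set  <- iris[-idx, ]

# Train a random forest classifier with 5-fold cross-validation
fit <- train(Species ~ ., data = train_set, method = "rf",
             trControl = trainControl(method = "cv", number = 5))

# Evaluate on the held-out test set
pred <- predict(fit, newdata = test_set)
confusionMatrix(pred, test_set$Species)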
Supervised Learning:
Supervised learning is a type of machine learning in which machines are trained using
well-labelled training data, and on the basis of that data, machines predict the output.
Labelled data means input data that is already tagged with the correct output.
In supervised learning, the training data provided to the machines works as the
supervisor that teaches the machines to predict the output correctly. It applies the
same concept as a student learning under the supervision of a teacher.
In the real world, supervised learning can be used for risk assessment, image
classification, fraud detection, spam filtering, etc.
The working of Supervised learning can be easily understood by the below example and
diagram:
Suppose we have a dataset of different types of shapes which includes square, rectangle,
triangle, and Polygon. Now the first step is that we need to train the model for each shape.
o If the given shape has four sides, and all the sides are equal, then it will be labelled as
a Square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides, then it will be labelled as a hexagon.
Now, after training, we test our model using the test set, and the task of the model is to
identify the shape.
The machine is already trained on all types of shapes, and when it finds a new shape, it
classifies the shape on the basis of its number of sides and predicts the output.
o Evaluate the accuracy of the model by providing the test set. If the model predicts the
correct output, it means our model is accurate.
Supervised learning deals with or learns with “labeled” data. This implies that some data is
already tagged with the correct answer.
Types:-
• Regression
• Logistic Regression
• Classification
• Naive Bayes Classifiers
• K-NN (k nearest neighbors)
• Decision Trees
• Support Vector Machine
Regression:
• Dependent Variable: This is the variable that we are trying to understand or forecast.
• Independent Variable: These are factors that influence the analysis or target variable
and provide us with information regarding the relationship of the variables with the
target variable.
Regression analysis is used for prediction and forecasting. This statistical method is
used across different industries such as,
• Financial Industry- Understand the trend in the stock prices, forecast the prices, and
evaluate risks in the insurance domain
• Marketing- Understand the effectiveness of market campaigns, and forecast pricing
and sales of the product.
• Manufacturing- Evaluate the relationships among variables that determine engine
design, to deliver better performance
• Medicine- Forecast the different combinations of medicines to prepare generic
medicines for diseases.
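A minimal sketch of regression in R using the built-in cars dataset, with stopping distance as the dependent variable and speed as the independent variable:

# Fit a simple linear regression: dist (dependent) ~ speed (independent)
model <- lm(dist ~ speed, data = cars)
summary(model)  # coefficients, R-squared, significance tests

# Forecast stopping distances for new speeds
predict(model, newdata = data.frame(speed = c(10, 20, 25)))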
Logistic Regression
Logistic regression is used when the dependent variable is categorical (for example,
yes/no or 0/1); it models the probability of a class using the logistic function.
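A minimal sketch with base R's glm(), modelling transmission type (a 0/1 variable) from car weight in the built-in mtcars dataset:

# Logistic regression: probability of a manual transmission (am = 1) given weight
logit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probability of a manual transmission for a 3000 lb car (wt is in 1000 lbs)
predict(logit, newdata = data.frame(wt = 3.0), type = "response")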
Classification
In Regression algorithms, we have predicted the output for
continuous values, but to predict the categorical values, we
need Classification algorithms.
What is the Classification Algorithm?
The Classification algorithm is a Supervised Learning technique that is used to identify
the category of new observations on the basis of training data. In classification, a
program learns from the given dataset or observations and then classifies new
observations into a number of classes or groups, such as Yes or No, 0 or 1, Spam or
Not Spam, cat or dog, etc. Classes can be called targets/labels or categories.
Unlike regression, the output variable of classification is a category, not a value, such
as "green or blue" or "fruit or animal". Since the classification algorithm is a
supervised learning technique, it takes labeled input data, which means the input
comes with the corresponding output.
Support Vector Machine (SVM)
Support Vector Machine (SVM) is a supervised learning algorithm used for
classification as well as regression problems, though primarily for classification.
The goal of the SVM algorithm is to create the best line or decision boundary that can
segregate n-dimensional space into classes so that we can easily put the new data point
in the correct category in the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These
extreme cases are called support vectors, and hence the algorithm is termed a Support
Vector Machine. Consider the below diagram, in which two different categories are
classified using a decision boundary or hyperplane:
Example: SVM can be understood with the example that we have used in the KNN
classifier. Suppose we see a strange cat that also has some features of dogs. If we want
a model that can accurately identify whether it is a cat or a dog, such a model can be
created using the SVM algorithm. We will first train our model with lots of images of
cats and dogs so that it can learn their different features, and then we test it with this
strange creature. The SVM creates a decision boundary between the two classes (cat
and dog) and chooses the extreme cases (support vectors) of each. On the basis of the
support vectors, it will classify the new creature as a cat. Consider the below diagram:
SVM algorithm can be used for Face detection, image classification, text
categorization, etc.
Types of SVM
SVM can be of two types:
o Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be
classified into two classes by using a single straight line, then such data is termed
linearly separable data, and the classifier used is called a Linear SVM classifier.
o Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a
dataset cannot be classified by using a straight line, then such data is termed
non-linear data, and the classifier used is called a Non-linear SVM classifier.
The dimensions of the hyperplane depend on the number of features present in the
dataset: if there are 2 features (as shown in the image), the hyperplane will be a straight
line, and if there are 3 features, the hyperplane will be a two-dimensional plane.
We always create the hyperplane that has the maximum margin, i.e., the maximum
distance between the hyperplane and the nearest data points of each class.
Support Vectors:
The data points or vectors that are closest to the hyperplane and which affect the
position of the hyperplane are termed support vectors. Since these vectors support the
hyperplane, they are called support vectors.
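A minimal sketch using the kernlab package from the list above, fitting a non-linear SVM with a radial-basis (RBF) kernel on the built-in iris data (the kernel and cost settings are illustrative assumptions):

library(kernlab)
set.seed(1)

# Train a non-linear SVM classifier; support vectors define the hyperplane
svm_fit <- ksvm(Species ~ ., data = iris, kernel = "rbfdot", C = 1)

# Predict the class of a few observations
predict(svm_fit, iris[c(1, 60, 120), ])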
Unsupervised learning
It uses machine learning algorithms to analyze and cluster
unlabeled datasets.
These algorithms discover hidden patterns or data groupings
without the need for human intervention.
• Unsupervised machine learning finds all kinds of unknown
patterns in data.
• Unsupervised methods help you to find features which can
be useful for categorization.
• It takes place in real time, so all the input data is analyzed
and labeled in the presence of learners.
• It is easier to get unlabeled data from a computer than
labeled data, which needs manual intervention.
Working of Unsupervised Learning
Clustering is one of the most common unsupervised learning problems:
data points are grouped by similarity. Clustering techniques can be of
the following kinds.
Exclusive (partitioning)
In this clustering technique, data are grouped in such a way that one
data point can belong to one cluster only.
Example: K-means
Agglomerative
In this clustering technique, every data point starts as its own cluster.
Iterative unions between the two nearest clusters reduce the number of
clusters.
Example: Hierarchical clustering
Overlapping
In this technique, fuzzy sets are used to cluster data. Each point
may belong to two or more clusters with separate degrees of
membership.
Here, data will be associated with an appropriate membership
value. Example: Fuzzy C-Means
Probabilistic
This technique uses probability distribution to create the
clusters.
Clustering Types
Following are the clustering types of Machine Learning:
• Hierarchical clustering
• K-means clustering
• K-NN (k nearest neighbors)
• Principal Component Analysis
• Singular Value Decomposition
• Independent Component Analysis
Hierarchical Clustering
In this algorithm, we develop the hierarchy of clusters in the form of a tree, and this tree-
shaped structure is known as the dendrogram.
Sometimes the results of K-means clustering and hierarchical clustering may look similar,
but they differ in how they work. Moreover, there is no requirement to predetermine the
number of clusters, as there is in the K-means algorithm.
o Step-1: Create each data point as a single cluster. Let's say there are N data points,
so the number of clusters will also be N.
o Step-2: Take two closest data points or clusters and merge them to form one
cluster. So, there will now be N-1 clusters.
o Step-3: Again, take the two closest clusters and merge them together to form one
cluster. There will be N-2 clusters.
o Step-4: Repeat Step 3 until only one cluster is left. We will then get the following
clusters. Consider the below images:
o Step-5: Once all the clusters are combined into one big cluster, develop the
dendrogram to divide the clusters as per the problem.
The distance between two clusters can be measured using the following linkage methods:
1. Single Linkage: It is the shortest distance between the closest points of the two clusters.
Consider the below image:
2. Complete Linkage: It is the farthest distance between the two points of two different
clusters. It is one of the popular linkage methods as it forms tighter clusters than single-
linkage.
3. Average Linkage: It is the linkage method in which the distance between each pair of
data points across the two clusters is added up and then divided by the total number of
pairs, giving the average distance between the two clusters. It is also one of the most
popular linkage methods.
4. Centroid Linkage: It is the linkage method in which the distance between the centroids
of the clusters is calculated. Consider the below image:
From the above-given approaches, we can apply any of them according to the type of
problem or business requirement.
K-Means Clustering is an Unsupervised Learning algorithm, which groups the unlabeled dataset
into different clusters. Here K defines the number of pre-defined clusters that need to be created
in the process, as if K=2, there will be two clusters, and for K=3, there will be three clusters, and
so on.
It allows us to cluster the data into different groups and is a convenient way to discover
the categories of groups in an unlabeled dataset on its own, without the need for any training.
The k-means algorithm mainly performs two tasks:
o Determines the best value for the K center points or centroids by an iterative process.
o Assigns each data point to its closest k-center. The data points which are near a
particular k-center create a cluster.
Hence each cluster has datapoints with some commonalities and is distinct from the
other clusters.
The below diagram explains the working of the K-means Clustering Algorithm:
Step-1: Select the number K to decide the number of clusters.
Step-2: Select K random points or centroids. (They can be points other than those in the
input dataset.)
Step-3: Assign each data point to its closest centroid, which will form the predefined K
clusters.
Step-4: Calculate the variance and place a new centroid in each cluster.
Step-5: Repeat the third step: reassign each datapoint to the new closest centroid of each
cluster.
Step-6: If any reassignment occurs, go to Step-4; otherwise, the model is ready.
Suppose we have two variables M1 and M2. The x-y axis scatter plot of these two variables
is given below:
o Let's take number k of clusters, i.e., K=2, to identify the dataset and to put them
into different clusters. It means here we will try to group these datasets into two
different clusters.
o We need to choose some random k points or centroids to form the clusters. These
points can be either points from the dataset or any other points. So, here we are
selecting the below two points as k points, which are not part of our dataset.
o Now we will assign each data point of the scatter plot to its closest K-point or
centroid. We will compute it by applying some mathematics that we have studied
to calculate the distance between two points. So, we will draw a median between
both the centroids. Consider the below image:
From the above image, it is clear that points on the left side of the line are near the K1 or
blue centroid, and points on the right of the line are close to the yellow centroid. Let's
color them blue and yellow for clear visualization.
o As we need to find the closest cluster, we will repeat the process by choosing new
centroids. To choose the new centroids, we will compute the center of gravity of
each cluster and place the new centroids there.
o Next, we will reassign each datapoint to the new centroid. For this, we will repeat
the same process of finding a median line. The median will be like the below image:
From the above image, we can see that one yellow point is on the left side of the line, and
two blue points are to the right of the line. So, these three points will be assigned to new
centroids.
Since reassignment has taken place, we again go to Step-4, which is finding new
centroids or K-points.
o We will repeat the process by finding the center of gravity of each cluster, so the new
centroids will be as shown in the below image:
o As we have the new centroids, we will again draw the median line and reassign the
data points. So, the image will be:
o We can see in the above image that there are no dissimilar data points on either side
of the line, which means our model is formed. Consider the below image:
As our model is ready, we can now remove the assumed centroids, and the two final
clusters will be as shown in the below image:
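A minimal sketch of this procedure with base R's kmeans(), using simulated values for the two variables M1 and M2 (the simulated data are an illustrative assumption):

set.seed(7)

# Simulated dataset with two variables, M1 and M2
data <- data.frame(M1 = c(rnorm(25, 5), rnorm(25, 15)),
                   M2 = c(rnorm(25, 5), rnorm(25, 15)))

# K = 2: group the points into two clusters
km <- kmeans(data, centers = 2)

km$centers                            # the final centroids
plot(data, col = km$cluster)          # points coloured by assigned cluster
points(km$centers, pch = 8, cex = 2)  # mark the centroids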
Agglomerative clustering
It is also known as the bottom-up approach or hierarchical agglomerative clustering
(HAC). It produces a structure that is more informative than the unstructured set of
clusters returned by flat clustering. This clustering algorithm does not require us to
prespecify the number of clusters. Bottom-up algorithms treat each data point as a
singleton cluster at the outset and then successively agglomerate pairs of clusters until
all clusters have been merged into a single cluster that contains all the data.
Steps:
Consider each alphabet as a single cluster and calculate the distance of one
cluster from all the other clusters.
In the second step, comparable clusters are merged together to form a single
cluster. Let's say cluster (B) and cluster (C) are very similar to each other,
so we merge them in the second step, and similarly clusters (D) and (E).
At last, we get the clusters [(A), (BC), (DE), (F)].
We recalculate the proximity according to the algorithm and merge the two
nearest clusters([(DE), (F)]) together to form new clusters as [(A), (BC),
(DEF)]
Repeating the same process, the clusters DEF and BC are comparable and
merged together to form a new cluster. We're now left with clusters [(A),
(BCDEF)].
At last, the two remaining clusters are merged together to form a single
cluster [(ABCDEF)].
Dendrogram
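A minimal sketch with base R's hclust(), clustering six points labelled A-F and plotting the resulting dendrogram (the coordinates are an illustrative assumption):

set.seed(3)

# Six points labelled A-F (coordinates made up for illustration)
pts <- matrix(rnorm(12), nrow = 6,
              dimnames = list(LETTERS[1:6], c("x", "y")))

# Agglomerative clustering with single linkage on Euclidean distances;
# other linkage options: "complete", "average", "centroid"
hc <- hclust(dist(pts, method = "euclidean"), method = "single")

plot(hc, main = "Dendrogram")  # visualize the merge hierarchy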
K- Nearest neighbors
K-Nearest Neighbours is one of the most basic yet essential classification algorithms
in Machine Learning. It belongs to the supervised learning domain and finds intense
application in pattern recognition, data mining, and intrusion detection.
It is widely used in real-life scenarios since it is non-parametric, meaning it
does not make any underlying assumptions about the distribution of the data (as
opposed to other algorithms such as GMM, which assume a Gaussian distribution of
the given data). We are given some prior data (also called training data), which
classifies coordinates into groups identified by an attribute.
As an example, consider a table of data points containing two features, where each
point is assigned to a group.
Euclidean Distance
This is nothing but the Cartesian distance between two points in the plane/hyperplane,
which can be visualized as the length of the straight line joining them. For points x and
y in n dimensions:
distance(x, y) = sqrt(sum_i (x_i - y_i)^2)
This metric helps us calculate the net displacement between two states of an object.
Manhattan Distance
This distance metric is generally used when we are interested in the total distance
traveled by an object rather than its displacement. It is calculated by summing the
absolute differences between the coordinates of the points in n dimensions:
distance(x, y) = sum_i |x_i - y_i|
Minkowski Distance
Euclidean distance and Manhattan distance are both special cases of the Minkowski
distance:
distance(x, y) = (sum_i |x_i - y_i|^p)^(1/p)
From this formula, when p = 2 it is the same as the formula for Euclidean distance,
and when p = 1 we obtain the formula for Manhattan distance.
The above-discussed metrics are the most common when dealing with a machine
learning problem, but there are other distance metrics as well, such as the Hamming
distance, which comes in handy for problems requiring element-wise comparisons
between two vectors whose contents can be boolean or string values.
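A minimal sketch computing these metrics with base R's dist() for two points (the coordinates are illustrative):

# Two points in 3-dimensional space; coordinate differences are (3, 4, 5)
pts <- rbind(a = c(1, 2, 3), b = c(4, 6, 8))

dist(pts, method = "euclidean")         # sqrt(3^2 + 4^2 + 5^2) ~= 7.07
dist(pts, method = "manhattan")         # |3| + |4| + |5| = 12
dist(pts, method = "minkowski", p = 3)  # (3^3 + 4^3 + 5^3)^(1/3) = 6
# With p = 2, Minkowski equals Euclidean; with p = 1, it equals Manhattan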
How to choose the value of k for KNN Algorithm?
The value of k defines the number of neighbours considered and is crucial in the KNN
algorithm. It should be chosen based on the input data: if the data contains many
outliers or much noise, a higher value of k is usually better. It is recommended to
choose an odd value for k to avoid ties in classification. Cross-validation methods can
help in selecting the best k value for the given dataset.
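A minimal sketch of selecting k by cross-validation with the caret package (the candidate k values are an illustrative assumption):

library(caret)
set.seed(9)

# 10-fold cross-validation over odd values of k (odd values avoid ties)
fit <- train(Species ~ ., data = iris, method = "knn",
             trControl = trainControl(method = "cv", number = 10),
             tuneGrid = data.frame(k = seq(1, 21, by = 2)))

fit$bestTune  # the k with the best cross-validated accuracy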
Applications of the KNN Algorithm
Data Preprocessing – While dealing with any machine learning problem,
we first perform the EDA part; if we find that the data contains missing
values, multiple imputation methods are available. One such method is the
KNN imputer, which is quite effective and generally used for sophisticated
imputation methodologies (see the sketch after this list).
Pattern Recognition – KNN works very well for pattern recognition: if you
train a KNN classifier on the MNIST handwritten-digit dataset and then
evaluate it, you will find that the accuracy is remarkably high.
Recommendation Engines – The main task performed by a KNN algorithm
is to assign a new query point to a pre-existing group that has been created
using a huge corpus of data. This is exactly what is required in recommender
systems: assign each user to a particular group and then provide
recommendations based on that group’s preferences.
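A minimal sketch of KNN imputation with the impute package from the list above (a Bioconductor package; the toy matrix and the value of k are illustrative assumptions):

library(impute)  # Bioconductor package listed earlier

# Toy numeric matrix with a couple of missing values
m <- matrix(c(1, 2, NA, 4,
              2, NA, 6, 8,
              1, 3, 5, 7), nrow = 3, byrow = TRUE)

# Fill each NA using its k nearest rows (by Euclidean distance)
imputed <- impute.knn(m, k = 2)
imputed$data  # the completed matrix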
Advantages of the KNN Algorithm
Easy to implement as the complexity of the algorithm is not that high.
Adapts Easily – Because KNN stores all the training data in memory,
whenever a new example or data point is added, the algorithm adjusts to
that new example and lets it contribute to future predictions as well.
Few Hyperparameters – The only parameters required in training a KNN
algorithm are the value of k and the choice of distance metric.
Disadvantages of the KNN Algorithm
Does not scale – The KNN algorithm is considered a lazy algorithm: it
defers all computation to prediction time, which requires a lot of computing
power as well as data storage. This makes the algorithm both
time-consuming and resource-exhausting.
Curse of Dimensionality – Owing to what is known as the peaking
phenomenon, the KNN algorithm is affected by the curse of dimensionality:
it has a hard time classifying data points properly when the dimensionality
is too high.
Prone to Overfitting – As the algorithm is affected by the curse of
dimensionality, it is prone to overfitting as well. Hence, feature selection
and dimensionality reduction techniques are generally applied to deal with
this problem.
Principal Component Analysis (PCA)
PCA generally tries to find a lower-dimensional surface onto which to project the
high-dimensional data.
PCA works by considering the variance of each attribute, because high variance
indicates a good split between the classes; in this way it reduces the dimensionality.
Some real-world applications of PCA are image processing, movie recommendation
systems, and optimizing power allocation in various communication channels. It is a
feature extraction technique, so it retains the important variables and drops the least
important ones.
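A minimal sketch with base R's prcomp() on the four numeric measurements in the built-in iris dataset:

# PCA on the numeric iris measurements, scaled to unit variance
pca <- prcomp(iris[, 1:4], scale. = TRUE)

summary(pca)        # proportion of variance explained per component
head(pca$x[, 1:2])  # data projected onto the first two principal components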
Association
Association rules allow you to establish associations amongst
data objects inside large databases. This unsupervised
technique is about discovering interesting relationships
between variables in large databases.
For example, people who buy a new home are most likely to buy new furniture.
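A minimal sketch of association-rule mining with the arules package (arules is not in the package list above, so its use here is an assumption; Groceries is a transactions dataset shipped with arules):

library(arules)
data(Groceries)

# Mine rules meeting minimum support and confidence thresholds
rules <- apriori(Groceries, parameter = list(supp = 0.01, conf = 0.5))

inspect(head(sort(rules, by = "lift"), 3))  # the top rules by lift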
Comparing the two approaches on computational complexity: supervised learning is a
simpler method, while unsupervised learning is computationally complex.
Collaborative filtering
Collaborative filtering is a technique that can filter out items
that a user might like on the basis of reactions by similar users.
It works by searching a large group of people and finding a
smaller set of users with tastes similar to a particular user.
What is a Recommendation system?
There are a lot of applications where websites collect data from
their users and use that data to predict their users' likes and
dislikes. This allows them to recommend content that their users
will like. Recommender systems are a way of suggesting similar
items and ideas suited to a user’s specific way of thinking.
There are basically two types of recommender Systems:
• Model-based
Model-based CF uses machine learning algorithms to
predict users’ rating of unrated items.
There are many model-based CF algorithms; the most
commonly used are matrix factorization models, such as
applying SVD to reconstruct the rating matrix, latent
Dirichlet allocation, or Markov decision process based
models.
• Hybrid
These aim to combine the memory-based and the model-based
approaches. One of the main drawbacks of the above methods
is that you’ll find yourself having to choose between historical
user rating data and user or item attributes.
Hybrid methods enable us to leverage both, and hence tend to
perform better in most cases. The most widely used methods
nowadays are factorization machines.
Memory-based CF
There are 2 main types of memory-based collaborative filtering
algorithms: User-Based and Item-Based. While their difference
is subtle, in practice they lead to very different approaches, so
it is crucial to know which is the most convenient for each case.
Let’s go through a quick overview of these methods:
• User-Based
Starting from a given user, we find other users with similar
rating histories and recommend items that those similar users
rated highly.
• Item-Based
The idea is similar, but instead, starting from a given movie (or
set of movies), we find similar movies based on other users’
preferences.
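A minimal sketch of item-based memory-based CF in base R, using cosine similarity on a toy user-item rating matrix (the ratings are an illustrative assumption):

# Toy rating matrix: rows = users, columns = movies (NA = unrated)
ratings <- rbind(u1 = c(5, 4, NA, 1),
                 u2 = c(4, 5, 1, NA),
                 u3 = c(1, NA, 5, 4))
colnames(ratings) <- paste0("movie", 1:4)

# Cosine similarity between two item columns, ignoring missing ratings
cosine <- function(a, b) {
  ok <- !is.na(a) & !is.na(b)
  sum(a[ok] * b[ok]) / (sqrt(sum(a[ok]^2)) * sqrt(sum(b[ok]^2)))
}

# Items most similar to movie1: candidates to recommend to its fans
sapply(colnames(ratings), function(j) cosine(ratings[, "movie1"], ratings[, j]))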
Mobile Analytics
Mobile analytics involves measuring and analysing data generated by mobile
platforms and properties, such as mobile websites and mobile applications.
The actual installation of mobile analytics involves adding tracking code to the sites and
SDKs to the mobile applications teams want to track. Most mobile analytics platforms will
be set up to automatically track website visits.
Platforms with codeless mobile features will be able to automatically track certain basic
features of apps such as crashes, errors, and clicks, but you’ll want to expand that by
manually tagging additional actions for tracking. With mobile analytics in place, you’ll
have deeper insights into your mobile web and app users which you can use to create
competitive, world-class products and experiences.
Collecting the data necessary for successful mobile analytics is often the
greatest challenge organizations face when attempting to understand
consumer behavior on mobile devices. Many devices do not allow cookies
to track actions, or do not use JavaScript, which can also help with website
data tracking.