Introduction To Machine Learning
Introduction To Machine Learning
Learning
Machine
Learning
Neural
Networks
Deep
Learning
Data Science Data
Mining
Big Data
Analytics
• Analytics is the process of discovering, interpreting, and communicating significant
patterns in data. Quite simply, analytics helps us see insights and meaningful data that we
might not otherwise detect. Business analytics focuses on using insights derived from data
to make more informed decisions that will help organizations increase sales, reduce costs,
and make other business improvements.
https://www.oracle.com/ph/business-analytics/what-is-analytics/
• 4 Types
• Descriptive Analytics - what happened in the past
• Diagnostic Analytics - why something happened in the past
• Predictive Analytics - which predicts what’s most likely to happen in the
future
• Prescriptive Analytics - which recommends actions you can take to affect
those likely outcomes
Descriptive Analytics
• Any activity or method that helps us to describe or summarize raw data into
something interpretable by humans
• Allow us to learn from past behaviors and understand how they might influence future
outcomes
• Examples:
• Company’s business intelligence reports that cover different aspect of the organization to provide
historical insights regarding the company’s production, operations, sales, revenue, financials, inventory,
customers, and market share
• The sales team can learn which customer segments generated the highest peso amount in sales last year.
• The marketing team can uncover which social media platforms delivered the best return on advertising
investment last quarter.
• The finance team can track month-over-month and year-over-year revenue growth or decline.
• Operations can track demand for SKUs (Stock Keeping Units) across geographic locations throughout the
past year.
Diagnostic Analytics
• Examines data or information to answer the question “Why did it happen?”
• Techniques: Drill-down, data discovery, data mining, correlations, causations
• Provides a very good understanding of a limited piece of the problem you want to solve
• Labor intensive – human intervention is required to perform drill-down or data mining
to go deeper into the data to understand why something happened or its root cause. It
focuses on determining the factors and events that contributed to the outcome.
• Examples:
• Decline in sales of a product line on some stores, product manager may want to look backward to
review past trends and patterns for the product line sales across different stores base on its
placement (floor, corner, aisle) within the store. The manager may look at external factors such as
demographic, season, and other factors
Predictive Analytics
• Ability to make predictions or estimations of likelihood about unknown future
events based on the past or historic patterns.
• Give insights into “What might happen?”
• Uses techniques from data mining, statistics, modeling, machine learning, and
AI to analyze current data to make predictions about the future.
• The foundation of predictive analytics is based on probabilities, and the
quality of predictions by statistical algorithms depends a lot on the quality of
input data. 100% Accuracy
• Examples: Weather Forecasting, e-mail spam identification, fraud detection,
probability of customer purchasing a product or renewal of insurance policy,
predicting the chances of a person with known illness, etc.
Prescriptive Analytics
• Area of data or business analytics dedicated to finding the best course of action for
a given situation.
• Endeavors to measure the future decision’s effect to enable the decision makers to
foresee the possible outcomes before the actual decisions are made.
• Combination of business rules, machine learning algorithms, tools that can be
applied against historic and real-time data feed.
• Key objective: not just to predict what will happen but also why it will happen by
predicting multiple futures based on different scenarios to allow companies to
assess possible outcomes base on their actions.
• Examples: simulations in design situations to help users identify system behaviors
under different configurations and ensuring that all key performance metrics are
met such as wait times, queue length, etc.
What is Machine Learning?
• Arthur Samuel (1959)
• Machine Learning is a field of study that gives computers the ability to learn without being explicitly
programmed
• Tom Mitchel(1997)
• A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with experience
E.
• Machine learning is a field of computer science that involves using statistical methods to
create programs that either improve performance over time, or detect patterns in
massive amounts of data that humans would be unlikely to find.
• Machine learning explores the study and construction of algorithms that can learn from
and make predictions on data. Such algorithms operate by building a model from
example inputs in order to make data driven predictions or decisions rather than
following strictly static program instructions
A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance
at tasks in T, as measured by P, improves with experience E.
• Preparation of Data.
• Data pre-processing • EDA (Exploratory Data Analysis).
will be best for the type of data • Choosing the best model.
Analytics
Solution Unified
Method for
Data Mining/
Predictive
Analytics
(ASUM-DM)
Python’s Data Analysis Packages
• Numpy - Core library for scientific computing. Its built-in mathematical
functions enable lightning-speed computation and can support
multidimensional data and large matrices. It is also used in linear algebra.
NumPy Array is often used preferentially over lists as it uses less memory
and is more convenient and efficient.
• Scikit-Learn - one of the most used machine learning libraries in Python.
Built on NumPy, SciPy, and Matplotlib
• Matplotlib - an extensive library for creating fixed, interactive, and
animated Python visualizations.
• Pandas -. It is primarily used for data analysis, data manipulation, and
data cleaning.
Commonly used Algorithms
CLASSIFICATION REGRESSION
• K-Nearest Neighbor • Linear Regression
• Naive Bayes • Support Vector Regression
• Decision Trees/Random Forest • Decision Tress/Random Forest
• Support Vector Machine • Gaussian Progresses
Regression
• Logistic Regression
• Ensemble Methods