General Introduction to
AI/ML/DL/DS
Roopesh Kohad
Artificial Intelligence
● Ability to perform tasks normally requiring human intelligence, such as visual
perception, speech recognition, decision-making, and translation between
languages.
● Ability of a computer program or a machine to think and learn
● Ability to correctly interpret external data, to learn from such data, and to use
those learnings to achieve specific goals and tasks through flexible adaptation
● Ability to mimic human cognition
● A program that can sense, reason, act and adapt
Source: Various sources on the internet, including Wikipedia
Evolution of Industry
Source: https://blogs.worldbank.org/digital-development/what-korea-s-strategy-manage-implications-artificial-intelligence
Impact of Artificial Intelligence
3 Stages of Artificial Intelligence
Source: AI & Data Preparation – Avoiding ‘Garbage-In, Garbage-Out’
Strong vs Weak AI
Source: Gödel, Consciousness and the Weak vs. Strong AI Debate
Artificially Intelligent Systems that we encounter
● Weak or Narrow AI, i.e. AI limited to a narrow field of application
● Examples
○ Recommendation Systems
○ Chatbots
○ Virtual Assistants
○ Robots
● Weak or Narrow AI is what is leading to most of the Automation !!
What is an intelligent System?
● Intelligence is an experience that one gets by interacting with a system
● Intelligence is intangible, like other attributes: fast, secure, usable, intuitive
● An intelligent system is non-deterministic
● Intelligence adds value to businesses
● A system appears to be intelligent
● Are Personal Computers or typical programs (Browser etc.) intelligent?
Artificial Intelligence Scope
Source: What's required for a machine to be intelligent
AI vs ML vs DL vs DS
Source: Link
Data Science
● Data Science
● Extract Knowledge or Insights from Data
● "Understand and analyze actual phenomena" with data
● Whether the data contains enough information to make predictions
The Scientific Method
Source: Getting Insights Using Data Science Skills and the Scientific Method
Source: Data Science is Multidisciplinary
Data Science Venn Diagram
Source: The Data Science Venn Diagram
Activities in Data Science
● Data Exploration & Preparation
○ Collection & loading
● Data Representation & Transformation
○ Tabular, DataFrame etc.
● Computing with Data
○ Programming
● Data Modelling
○ Predictive Modeling
● Data Visualization and Presentation
○ Charting, graphs etc
● Science about Data Science
○ What works, what doesn’t work
What are different types of Data?
● Structured Data
○ Relational Databases
● Semi-structured Data
○ NoSQL Databases
○ XML, JSON
● Unstructured Data
○ Image, audio, text
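As a rough illustration of the first two categories, the sketch below reads the same kind of record from a CSV table (structured: fixed columns) and from a JSON document (semi-structured: nested and optional fields). The field names and values are invented for illustration only.

```python
import csv
import io
import json

# Structured: every row has the same fixed columns, like a relational table.
csv_text = "id,name,city\n1,Asha,Pune\n2,Ravi,Mumbai\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: self-describing records whose fields may be nested or missing.
json_text = '[{"id": 1, "name": "Asha", "tags": ["ml", "ds"]}, {"id": 2, "name": "Ravi"}]'
docs = json.loads(json_text)

print(rows[0]["city"])           # column access by name
print(docs[1].get("tags", []))   # a missing field is handled with a default
```

Unstructured data (images, audio, free text) has no such field layout and usually needs feature extraction before it can be analysed.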
What is a Dataset?
● A collection of related sets of information that is composed of separate
elements but can be manipulated as a unit by a computer.
● Popular datasets
○ Iris Flower Data Set
○ MNIST handwritten digits database
○ Kaggle Datasets
○ Data.World
● Other datasets
○ Open Government Data (OGD) platform of India
● How do we obtain Data if not made available as Datasets?
○ Access to Database
○ APIs
○ Web Scraping
Web Scraping
● Crawling the web to extract information
● In the absence of Database, API access
● Frameworks and libraries (Python unless noted)
○ Apache Nutch (Java-based)
○ Scrapy
○ BeautifulSoup
○ Selenium!!!
● Scriptless
○ import.io
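To make the idea of extracting information from a page concrete, here is a minimal sketch that pulls every link out of an HTML string. It uses Python's standard-library `html.parser` as a stand-in for the frameworks listed above, and the sample page is hypothetical; a real scraper would first download the page over HTTP and should respect the site's terms of use.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page content used only for this illustration.
SAMPLE_HTML = """
<html><body>
  <a href="https://example.com/datasets">Datasets</a>
  <p>No link here</p>
  <a href="/about">About</a>
</body></html>
"""

parser = LinkExtractor()
parser.feed(SAMPLE_HTML)
print(parser.links)
```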
Jupyter Notebook
● Jupyter
● Open-source web application
● REPL programming environment
● Create and share documents that contain
○ Live Code
○ Equations
○ Visualizations
○ Narrative text
● Uses include
○ Data cleaning and transformation
○ Numerical simulation
○ Statistical modeling
○ Data visualization
○ Machine learning
Python ML Ecosystem
NumPy
Pandas
DataFrame in Spreadsheet
Matplotlib
NumFOCUS & PyData
● NumFOCUS is a nonprofit supporting open source scientific computing.
● PyData is its flagship educational program
● Projects include Jupyter, pandas, NumPy, Matplotlib
● PyData
○ A community for developers and users of open source data tools
○ They have a Meetup in PUNE
SciPy Lectures
● One document to learn numerics, science, and data with Python
● The SciPy Lectures give an end-to-end introduction to the SciPy ecosystem libraries
● SciPy Lectures
Analytics
Where does Data come into picture?
Source: Machine Learning for Dummies: Part 1
Machine Learning
● Ability to learn without being explicitly programmed
● A computer program is said to learn from experience 'E', with respect to some
class of tasks 'T' and performance measure 'P', if its performance at tasks in
'T', as measured by 'P', improves with experience 'E'
● Machine Learning is an approach to achieving Artificial Intelligence
● Machine Learning is an algorithm that can learn from data without relying on
conventional programming
● Machine Learning is a field of computer science that gives computers the
ability to learn without being explicitly programmed
● Machine Learning is closely related to Data Mining and statistics
Machine Learning Types
Source: What is machine learning?
Source: Machine Learning Types #2
Source: Regression or Classification? Linear or Logistics?
Workflow of Machine Learning Project
Source: A Tool To Build Future For Non Experienced Candidates: Machine Learning
Steps to build Machine Learning System
Source: Building a Machine Learning Model from A-Z
Data Preparation
Steps:
● Query Data
● Clean Data
○ Deal with missing values
○ Remove outliers
● Format Data
More like an ETL step!!
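The cleaning steps above might look roughly like this in plain Python. The price values are hypothetical, and median filling plus the 1.5 × IQR rule are just one common choice among many, not the only way to handle missing values and outliers.

```python
from statistics import median

# Hypothetical raw values; None marks a missing reading.
raw_prices = [210.0, None, 195.0, 205.0, 5000.0, 200.0, None, 190.0]

# 1. Deal with missing values: fill them with the median of the known values.
known = [p for p in raw_prices if p is not None]
fill_value = median(known)
filled = [fill_value if p is None else p for p in raw_prices]


# 2. Remove outliers: keep values within 1.5 * IQR of the quartiles.
def quartiles(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    lower = ordered[:mid]
    upper = ordered[mid + 1:] if len(ordered) % 2 else ordered[mid:]
    return median(lower), median(upper)


q1, q3 = quartiles(filled)
iqr = q3 - q1
cleaned = [p for p in filled if q1 - 1.5 * iqr <= p <= q3 + 1.5 * iqr]

# 3. Format data: round to whole units for downstream use.
formatted = [round(p) for p in cleaned]
print(formatted)
```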
Feature Engineering
“Process of transforming raw data into features that better represent the
underlying problem to the predictive models, resulting in improved model accuracy
on unseen data.”
Steps:
● Brainstorm features
● Create features
● Check how the features work with the model
● Repeat from the first step until the features work well enough
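A small sketch of the "create features" step, assuming a hypothetical house-sale record with a sale date, an area, and a room count. The derived features (month, weekend flag, area per room) are illustrative guesses at what might help a model, not features taken from the original slides.

```python
from datetime import date

# Hypothetical raw records for a house-sale dataset.
raw = [
    {"sold_on": "2019-03-16", "area_sqft": 1200, "rooms": 3},
    {"sold_on": "2019-07-01", "area_sqft": 800, "rooms": 2},
]


def make_features(record):
    """Turn one raw record into model-friendly numeric features."""
    sold = date.fromisoformat(record["sold_on"])
    return {
        "month": sold.month,                      # possible seasonality signal
        "is_weekend": int(sold.weekday() >= 5),   # Saturday is 5, Sunday is 6
        "area_per_room": record["area_sqft"] / record["rooms"],
    }


features = [make_features(r) for r in raw]
print(features)
```

Whether such features actually help can only be judged by retraining the model and checking its metrics, which is the loop the steps describe.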
Data Modeling
Performance Measure - Metrics
Mathematical / Statistical way of measuring performance of ML Model
● Classification Accuracy
● Logarithmic Loss
● Confusion Matrix
● Area under Curve
● F1 Score
● Mean Absolute Error
● Mean Squared Error
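Most of these metrics are short formulas. The sketch below computes them by hand on small hypothetical label and price lists so the definitions are visible; in practice a library such as scikit-learn would normally be used. Area under curve is left out to keep the example short.

```python
import math

# Hypothetical binary labels, hard predictions, and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

# Confusion matrix counts.
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Logarithmic loss penalises confident but wrong probabilities.
log_loss = -sum(
    t * math.log(p) + (1 - t) * math.log(1 - p) for t, p in zip(y_true, y_prob)
) / len(y_true)

# Regression metrics on hypothetical house prices.
actual = [200.0, 150.0, 320.0]
predicted = [210.0, 140.0, 300.0]
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

print(accuracy, f1, round(log_loss, 3), round(mae, 2), mse)
```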
Performance Measure - Other Approaches
● Testing by End User or Crowd testing
○ Test with real users
● Equivalence classes or ranges of output or tolerance
○ Assert (somewhat expected ~ actual)
● Ranking of output
○ Instead of Pass/Fail, rank outputs
● Comparison Test
○ Compare with a competing system
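Two of these approaches translate directly into test code. The sketch below shows a tolerance-based assertion and a ranking helper; the 5% tolerance, the function names, and the sample values are assumptions made for illustration.

```python
import math


def within_tolerance(expected, actual, rel_tol=0.05):
    """True when actual is within rel_tol (5% by default) of expected."""
    return math.isclose(expected, actual, rel_tol=rel_tol)


def rank_outputs(scored_outputs):
    """Order candidate outputs best-first instead of giving a pass/fail verdict."""
    return sorted(scored_outputs, key=lambda item: item[1], reverse=True)


# Hypothetical model output compared with an expected value.
assert within_tolerance(expected=250_000, actual=255_000)
assert not within_tolerance(expected=250_000, actual=300_000)

# Hypothetical (output, relevance score) pairs.
ranked = rank_outputs([("result B", 0.4), ("result A", 0.9), ("result C", 0.7)])
print(ranked)
```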
Machine Learning Algorithms
Housing Price prediction
● Predict Sale Price of a House based on attributes
● Test Data
Linear Regression
Linear Regression in one variable
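For one input variable, the least-squares line can be computed in closed form. The sketch below fits price against area for the housing example; the numbers are hypothetical and deliberately lie exactly on a line so the result is easy to check. scikit-learn's `LinearRegression` offers the same fit as a ready-made estimator.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: house area (hundreds of sq ft) vs. sale price (thousands).
areas = [8.0, 10.0, 12.0, 15.0, 20.0]
prices = [170.0, 210.0, 250.0, 310.0, 410.0]  # exactly 10 + 20 * area

intercept, slope = fit_line(areas, prices)


def predict(area):
    """Apply the fitted line to a new area value."""
    return intercept + slope * area


print(round(predict(18.0), 2))
```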
Polynomial Regression
Logistic Regression
Scikit-Learn
● https://scikit-learn.org
● Free machine learning library for the Python programming language
● Features various classification, regression and clustering algorithms
● Examples:
○ Linear Regression
○ Support Vector Machines (SVM)
○ Random forests
○ Gradient boosting
○ K-means
● Interoperate with the Python numerical and scientific libraries NumPy and
SciPy.
Sample Code
Logistic Regression
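Logistic regression passes a linear score through the sigmoid function to get a probability for a binary class. The sketch below trains a one-variable version by batch gradient descent on log loss; the data, learning rate, and epoch count are hypothetical choices for illustration, not tuned values.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: hours studied vs. pass (1) or fail (0).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(hours, passed)


def prob_pass(h):
    """Predicted probability of passing after h hours of study."""
    return sigmoid(w * h + b)


print(round(prob_pass(1.0), 3), round(prob_pass(4.0), 3))
```

A prediction is turned into a class by thresholding the probability, commonly at 0.5.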
K-Nearest Neighbours
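K-nearest neighbours has no real training step: it labels a new point by majority vote among the k closest training points. A minimal sketch, assuming Euclidean distance and hypothetical two-feature flower measurements:

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D points (petal length, petal width) with class labels.
points = [(1.4, 0.2), (1.5, 0.3), (1.3, 0.2), (4.7, 1.4), (4.5, 1.5), (5.0, 1.7)]
labels = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

print(knn_predict(points, labels, query=(1.6, 0.4)))
print(knn_predict(points, labels, query=(4.8, 1.6)))
```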
Model vs Algorithm
● Model is what you get when you run the Algorithm over your training data
and what you use to make predictions on new data.
● A Model is a Function which takes inputs and gives an output (prediction)
● You can generate a new Model with the same Algorithm but with different
data, OR
● You can get a new Model from the same data but with a different Algorithm
or different hyperparameters of the same Algorithm
● Model is unique to your project and deployed to make predictions.
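The distinction can be shown in a few lines: the algorithm is a training function, and the model is the prediction function it returns. The nearest-class-mean algorithm and the two training sets below are hypothetical and chosen only because they are short.

```python
def nearest_mean_algorithm(samples):
    """Algorithm: learn one mean per class and return a prediction function (the model)."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}

    def model(value):
        # Predict the class whose learned mean is closest to the input.
        return min(means, key=lambda label: abs(means[label] - value))

    return model


# Same algorithm, two hypothetical training sets, two different models.
model_a = nearest_mean_algorithm([(1.0, "small"), (2.0, "small"), (9.0, "large")])
model_b = nearest_mean_algorithm([(1.0, "small"), (20.0, "large"), (30.0, "large")])

print(model_a(6.0), model_b(6.0))
```

The same input gets different predictions because each model captured different training data.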
Model Deployment
● A model (or “predictor” or “classifier”) is a piece of code or a function that runs and
gives output. It could be:
○ Python module
○ Containerized Docker image
○ A Serverless Function
● How do you deploy a simple ML model on your own?
○ As a RESTful API
○ Serialize the model with the pickle library, then host it behind a Flask web server.
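A rough sketch of the pickle-then-serve idea. To stay free of third-party dependencies it uses the standard-library `http.server` in place of Flask, and the "model" is just a dictionary of hypothetical learned parameters; the endpoint, port, and JSON field names are likewise assumptions.

```python
import json
import pickle
from http.server import BaseHTTPRequestHandler, HTTPServer

# Training side: serialise the learned parameters (normally written to a file).
# These parameter values are hypothetical.
model_bytes = pickle.dumps({"intercept": 10.0, "slope": 20.0})

# Serving side: load the model once at start-up.
# Only unpickle data you trust; unpickling can execute arbitrary code.
model = pickle.loads(model_bytes)


def handle_predict(body):
    """Turn a JSON request body like {"area": 12} into a JSON response."""
    area = float(json.loads(body)["area"])
    prediction = model["intercept"] + model["slope"] * area
    return json.dumps({"prediction": prediction})


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        response = handle_predict(self.rfile.read(length)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    """Start the prediction endpoint; call serve() to run it."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the endpoint, after which a client can POST `{"area": 12}` to it. With Flask the request handling would be a decorated route function, but the load-once, predict-per-request structure stays the same.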
Choosing right Machine Learning Model
What kind of problems can ML solve?
● Problems which could be solved in <1 sec
○ E.g. identifying a picture
● Problems which require experience
○ A doctor can look at an X-ray and make a diagnosis
○ Hiring shortlisting
● Problems which ML cannot solve?
○ Solving mathematical equations
○ Writing prose
Data Science vs Machine Learning
● Machine Learning
○ Collect Data → Train Model → Deploy to start getting predictions or classifications
○ Output is software
○ Could be OUTSOURCED (s/w development)
○ Engineering discipline
○ Make a model which makes good predictions because we have labeled train/test sets
● Data Science
○ Collect Data → Analyze → Hypotheses/Actions/Suggestions
○ Output is a slide deck of recommendations
○ Better INHOUSE (tied to business)
○ Multidisciplinary
○ Ask questions and design experiments: Why? What? What can we do to change the outcome?
Universal Approximation Theorem
A feedforward network with a single layer is sufficient to represent any function,
but the layer may be unfeasibly large and may fail to learn and generalize
correctly.
— Ian Goodfellow
Deep Learning Neural Network
Types of Neural Networks
● CNN (Convolutional Neural Network)
● RNN (Recurrent Neural Network)
● LSTM (Long Short-Term Memory)
● GAN (Generative Adversarial Network)
Convolutional Neural Network - ConvNet
● A ConvNet takes an input image and learns to differentiate one image from another
● Analogous to the connectivity pattern of Neurons in the Human Brain
● Inspired by the organization of the Visual Cortex
● Captures Spatial and Temporal dependencies
● Convolution Layer to extract high level features
○ A kernel (filter), an NxN matrix, scans the entire MxM image
● Pooling layer to reduce the dimensions of the convolved features
● The Convolution and Pooling phases together form the “Feature Extraction” phase
● Flatten the final output and feed it to a regular Neural Network for
classification purposes.
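The feature-extraction phase can be traced by hand on a tiny example. The sketch below slides a kernel over an image, max-pools the result, and flattens it. The 5x5 image and 2x2 edge kernel are hypothetical, and the "convolution" is the unflipped sliding product that deep learning frameworks typically use; in a real ConvNet the kernel values are learned rather than hand-written.

```python
def convolve2d(image, kernel):
    """Slide an NxN kernel over an MxM image (stride 1, no padding)."""
    n = len(kernel)
    out_size = len(image) - n + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(n)
                for j in range(n)
            )
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

features = convolve2d(image, kernel)        # 4x4 feature map
pooled = max_pool(features)                 # 2x2 after pooling
flat = [v for row in pooled for v in row]   # flattened input for a dense layer
print(features[0], pooled, flat)
```

The feature map is non-zero only where the kernel straddles the edge, which is the sense in which a convolution layer "extracts" a feature.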
ConvNet - Convolution Phase
ConvNet - Pooling Phase
Digit Recognizer ConvNet
ML vs DL
Source: ML vs DL
Deep Learning Frameworks
What is TensorFlow?
● TensorFlow is an open source library to help you develop and train ML models
● TensorFlow playground
● TensorFlow Tutorial
What is Keras?
● Keras is a high-level API to develop Neural Networks
● Capable of running on top of TensorFlow, CNTK, or Theano.
● Keras Getting Started
Cloud Platforms
● AWS
● GCP
● Azure
AWS AI Services
AWS ML/DL Services
● SageMaker
○ Build → Train → Deploy Machine Learning Models
● Deep Learning AMIs
AWS AI Learning & Certification
● Training: ML Training
● Certification: Machine Learning Specialty
Microsoft Azure AI Platform
Microsoft Azure - Learning
● Microsoft Professional Program - AI
○ Now being retired
● Microsoft Learn
○ Search via Role/Product
● AI School
○ Dedicated AI academy
● Microsoft has tied up with edX
Google AI Platform
Google AI Training & Certification
● Google has tied up with Coursera for their training
● Training - Data & Machine Learning path
● Machine Learning with TensorFlow on Google Cloud Platform Specialization
● Certification - Data & Machine Learning
Kaggle
● Online community of data scientists and machine learners, owned by Google
● Datasets
● Notebooks
● Competitions
Resources
1. CGP Grey: How Machines Learn
2. 3Blue1Brown: Neural Networks
3. NVIDIA: What’s the Difference Between Artificial Intelligence, Machine
Learning, and Deep Learning?
4. State of AI
5. DataMeet is a community of Data Science and Open Data enthusiasts from
India.
6. A visual introduction to Probability & Statistics
Roles
● Data Scientist
○ Examine Data and provide Insights
○ Make presentation to Team / Executive
○ Storytelling
● Machine Learning Engineer
○ Build, Train, Test & Improve ML/DL models
● Data Engineer
○ Organize Data
○ Make sure data is stored in an easily accessible, secure and cost-effective way
Where to start?
● Get hands-on with Jupyter Notebook and the SciPy stack
● Take part in some Kaggle contests
● Look into your projects and see if they are candidates for ML/DL