Data Science Chapter 0
Mahdi Louati
3 GLID
September 19th, 2022
Contents
0. Welcome to Machine Learning
0.1. Why Machine Learning is the Future
0.2. What is Machine Learning
0.3. Installing Python and Anaconda
1. Data Preprocessing
1.1. Importing the Libraries
1.2. Importing the Dataset
1.3. Missing Data
1.4. Categorical Data
1.5. Training Set and Test Set
1.6. Feature Scaling
2. Regression Models
2.1. Simple Linear Regression (SLR)
2.2. Multiple Linear Regression (MLR)
2.3. Polynomial Regression
2.4. Support Vector Regression (SVR)
2.5. Decision Tree Regression
2.6. Random Forest Regression
2.7. Evaluating Regression Models
These techniques are used in many applications, such as control systems, natural language processing, facial
recognition, voice recognition, business analytics, pattern matching and data mining.
[Figure: AI and its related disciplines: Psychology, Philosophy, Linguistics, Logic, Computer Science]
Artificial Intelligence, Machine Learning, Deep Learning and Data Science are popular terms today, and it is crucial to know
what each one means and how they differ. Although these terms are closely related, they are not interchangeable; see the
image below for a visual comparison.
Machine Learning (ML) is the field of computer science that enables
computer systems to make sense of data in much the same way as
human beings do.
Machine Learning is a subset of Artificial Intelligence that uses statistical learning algorithms to build systems able to
learn and improve automatically from experience, without being explicitly programmed and without human intervention.
Deep Learning is a Machine Learning technique inspired by the way the human brain filters information; in essence, it
learns from examples. It lets a computer model pass the input data through successive layers in order to predict and classify information.
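The idea of passing input data through layers can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the weights, the layer sizes and the helper names (`dense`, `relu`, `sigmoid`) are hypothetical and chosen by hand, whereas a real network would learn its weights from labeled examples.

```python
import math

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights . inputs + bias) per neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def relu(z):
    # Keeps positive signals, blocks negative ones.
    return max(0.0, z)

def sigmoid(z):
    # Squashes any number into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fixed weights (assumption): two inputs -> two hidden neurons -> one output.
hidden_w = [[0.5, -0.2], [0.1, 0.8]]
hidden_b = [0.0, -0.1]
output_w = [[1.0, -1.0]]
output_b = [0.0]

features = [1.0, 2.0]                                    # made-up input
hidden = dense(features, hidden_w, hidden_b, relu)       # first layer filters the input
probability = dense(hidden, output_w, output_b, sigmoid)[0]  # second layer scores it
predicted_class = 1 if probability >= 0.5 else 0         # classification decision

print(hidden, probability, predicted_class)
```

Each layer transforms the output of the previous one; the last layer turns the result into a score that is thresholded to obtain a class.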
1956: Dartmouth Conference, with Claude Shannon (father of Information Theory, 1916-2001) and John McCarthy (creator of the "Lisp" programming language, 1927-2011).
2011: WATSON (DeepQA, IBM's AI/NLP computer program) defeats Brad RUTTER and Ken JENNINGS in Jeopardy! and wins $1 million.
2016: AlphaGo (AI computer program by Google DeepMind) defeats Lee SEDOL (one of the world's best professional Go players).
The branch of computer science concerned with making computers behave like humans. (J. McCarthy 1956)
The science of making machines do things that would require intelligence if done by men. (M. Minsky)
[Figure: "Exabytes?" illustration, with the label "500×10⁶ hectares of trees"]
Machine Learning has grown so quickly over the past decade that you are now almost expected to know
some Machine Learning to call yourself a Data Scientist.
Machine Learning is so pervasive today that you probably use it dozens of times a day
without knowing it. It focuses on the development of computer programs that can access data
and use it to learn for themselves. It has many domains of application.
Topics include:
Regression problems are distinguished from Classification problems: predicting a quantitative
(numeric) variable is a Regression problem, whereas predicting a qualitative (categorical)
variable is a Classification problem.
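The distinction can be made concrete with a toy sketch in plain Python. The data is made up for illustration, and the two helpers (`fit_line`, `nearest_neighbour_label`) are hypothetical names, not library functions; in practice a library such as scikit-learn would normally be used instead.

```python
# Same input feature, two kinds of target (all values are invented).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]            # feature: hours studied
scores = [52.0, 54.0, 56.0, 58.0, 60.0]      # quantitative target -> Regression
passed = ["no", "no", "no", "yes", "yes"]    # qualitative target  -> Classification

def fit_line(x, y):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def nearest_neighbour_label(x, labels, query):
    """1-nearest-neighbour classifier: label of the closest training point."""
    closest = min(range(len(x)), key=lambda i: abs(x[i] - query))
    return labels[closest]

slope, intercept = fit_line(hours, scores)
predicted_score = slope * 6.0 + intercept                        # output is a number
predicted_label = nearest_neighbour_label(hours, passed, 4.6)    # output is a category

print(predicted_score, predicted_label)
```

The regression model returns a value on a continuous scale, while the classifier can only return one of the labels seen during training.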
Although both types of learning belong to Artificial Intelligence, in Supervised Learning a
researcher is there to "guide" the algorithm along the path of learning by providing it with
examples considered representative, each labeled beforehand with the expected result. The
algorithm then learns from each example, with the aim of generalizing what it has learned
to new cases.
In the case of Unsupervised Learning, the algorithm is completely autonomous: data is
given to the machine without any examples of the expected output.
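The difference between the two settings can be sketched on the same toy data. This is a minimal illustration with invented values: `class_means` and `two_means` are hypothetical helpers (the second is a bare-bones 1-D k-means with k=2), not the method of any particular library.

```python
# Made-up 1-D measurements forming two visible groups.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["small", "small", "small", "large", "large", "large"]  # used only when supervised

def class_means(x, y):
    """Supervised: the labels tell the algorithm which group each example belongs to."""
    sums, counts = {}, {}
    for xi, yi in zip(x, y):
        sums[yi] = sums.get(yi, 0.0) + xi
        counts[yi] = counts.get(yi, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def two_means(x, iterations=10):
    """Unsupervised: 1-D k-means with k=2; the groups are discovered without labels."""
    centers = [min(x), max(x)]  # simple deterministic initialisation
    for _ in range(iterations):
        groups = [[], []]
        for xi in x:
            nearest = 0 if abs(xi - centers[0]) <= abs(xi - centers[1]) else 1
            groups[nearest].append(xi)
        # Keep the old centre if a group happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

supervised_model = class_means(values, labels)   # one mean per labeled class
cluster_centers = two_means(values)              # two centres found from the data alone

print(supervised_model, cluster_centers)
```

Both approaches end up near the same two group centres here, but only the supervised one knows what the groups are called, because only it was given labels.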
You will learn about the most effective Machine Learning techniques.
You will gain practice implementing them and getting them to work for yourself.
You will learn not only the theoretical underpinnings of learning but also the practical
know-how needed to apply these techniques quickly and effectively to new problems.
0. Welcome to Machine Learning
0.3. Installing Python and Anaconda
Python is an interpreted, multi-paradigm, cross-platform programming language. It supports
structured (imperative), functional and object-oriented programming. It was created by the
Dutch programmer Guido van Rossum, who began work on it in 1989; the first public release followed in 1991.
Spyder is a development environment: a kind of studio with very practical tools. It is
included in the Anaconda platform.
Anaconda is a widely used open-source Data Science platform that bundles most of the common
Machine Learning libraries. For building models, the most practical of these is the scikit-learn library.
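Once Anaconda is installed, a quick way to check the environment is to ask Python which libraries it can find. The sketch below can be run in Spyder or any Python console; the list of library names is an assumption about what this course will use, and `check_libraries` is a hypothetical helper built on the standard `importlib.util.find_spec` function.

```python
import importlib.util
import sys

# Import names of libraries commonly bundled with Anaconda (assumed list).
LIBRARIES = ["numpy", "pandas", "matplotlib", "sklearn"]

def check_libraries(names):
    """Return {import_name: True/False} without actually importing anything."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

report = check_libraries(LIBRARIES)

print("Python", sys.version.split()[0])
for name, available in report.items():
    print(f"{name:12s}", "found" if available else "missing")
```

Note that scikit-learn is imported under the name `sklearn`. A library reported as missing can usually be added with the `conda install` command.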