Introduction To Data Science

The document provides an introduction to data science. It defines data science as dealing with science and algorithms related to data from various sources to extract useful patterns and insights through visualization. A data scientist's role involves mathematics, statistics, computer science, data analysis, problem solving, machine learning, and data visualization. Data science has applications in recommender systems, voice/image recognition, spam/fraud detection, and more.

Uploaded by

Harsh Ojha

Introduction to Data Science
AGENDA
• Defining Data Science
• What Does a Data Science Professional Do?
• Data Science in Business
• Use Cases for Data Science
• Installation of R and RStudio
DEFINING DATA SCIENCE
• Data Science deals with the science and algorithms related to data.
• Data is generated from many kinds of sources.
• A report says, “Every day, approximately 2 quintillion bytes of data is generated. If it grows at this pace, then by the next 3 years, it is expected that 2 MB of data will be created every second for every individual on this planet.”
• About 90% of the world's data is said to have been created in the last 2 years.
• Data comes from two kinds of sources:
– Structured
– Unstructured
• Structured sources hold information that is compatible with a relational database.
– E.g. ATM transactions and flight tickets, which SQL can query and update.
• Unstructured data comes from tweets and comments on social media and from audio and video files, which SQL cannot process directly.
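The difference between the two kinds of sources can be sketched in a few lines of Python. This is only an illustration: the table name, column names, sample transactions, and the tweet are all made up, and the standard-library `sqlite3` module stands in for whatever relational database is actually used.

```python
import sqlite3

# Hypothetical ATM transactions: every row has the same typed columns.
transactions = [
    (1, "ACC-001", "withdrawal", 200.0),
    (2, "ACC-002", "deposit", 500.0),
    (3, "ACC-001", "withdrawal", 50.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE atm_transactions ("
    "id INTEGER PRIMARY KEY, account TEXT, kind TEXT, amount REAL)"
)
conn.executemany("INSERT INTO atm_transactions VALUES (?, ?, ?, ?)", transactions)

# Structured data: SQL can filter and aggregate directly on the columns.
withdrawn = conn.execute(
    "SELECT account, SUM(amount) FROM atm_transactions "
    "WHERE kind = 'withdrawal' GROUP BY account ORDER BY account"
).fetchall()

# SQL can also change structured records in place.
conn.execute("UPDATE atm_transactions SET amount = 60.0 WHERE id = 3")
updated_amount = conn.execute(
    "SELECT amount FROM atm_transactions WHERE id = 3"
).fetchone()[0]
conn.close()

# Unstructured data: a tweet is free text with no columns to query,
# so it has to be parsed first (here, a naive hashtag extraction).
tweet = "Loving the new phone! #gadgets #happy"
hashtags = [word.lstrip("#") for word in tweet.split() if word.startswith("#")]

print(withdrawn)
print(updated_amount)
print(hashtags)
```

The SQL statements work only because every transaction shares one schema. The tweet has no schema, so some parsing step has to impose structure before any query is possible.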
DEFINITION
“Data Science is a broad field that brings together scientific techniques, methods, and processes used to clean data and then extract useful patterns and insights in the form of visualizations.”

• Visualizations are crucial for making important business decisions and for coming up with strategies that are instrumental to an organization’s well-being.
ROLE OF DATA SCIENCE ON STATISTICS
• Statistics
• Mathematics
• Computer Science
• Data Analysis
• Critical Thinking
• Problem Solving
• Machine Learning
• Data Visualization
• Anyone who wants to carry this “tag” should be well versed in a lot of concepts.

Some of them are:

• Mathematics
• Statistics
• Problem-solving
• Data wrangling or data munging
• Coding prowess in both R and Python
• SQL
• Hadoop
• Machine learning and AI
• Data visualization
• Communication skills
APPLICATIONS
• Data Science has many real-world applications.
• Recommender Systems
– Content based – keeps track of a user's watching habits.
– Collaborative based – recognizes users with similar tastes.
• Voice and Image Recognition
• Spam and Fraud Detection
• Many more…
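As a rough sketch of the collaborative idea (recognizing users with similar tastes), the Python below finds the most similar other user by cosine similarity and suggests what that user rated. The user names, titles, and ratings are invented, and real recommender systems are far more elaborate. A content-based system would instead compare attributes of the items a user has already watched.

```python
from math import sqrt

# Hypothetical ratings (user -> {title: rating}); made up for illustration.
ratings = {
    "asha": {"Movie A": 5, "Movie B": 4, "Movie C": 1},
    "bilal": {"Movie A": 4, "Movie B": 5, "Movie D": 5},
    "chen": {"Movie A": 1, "Movie C": 5, "Movie E": 4},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the titles both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(a[t] ** 2 for t in shared))
    norm_b = sqrt(sum(b[t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)


def recommend(user):
    """Return the most similar other user and the titles only they have rated."""
    others = [u for u in ratings if u != user]
    nearest = max(others, key=lambda u: cosine_similarity(ratings[user], ratings[u]))
    unseen = {t: r for t, r in ratings[nearest].items() if t not in ratings[user]}
    # Highest-rated unseen titles come first.
    return nearest, sorted(unseen, key=unseen.get, reverse=True)


print(recommend("asha"))
```

With this toy data, the user `asha` is matched with `bilal`, whose ratings of the shared titles point the same way, so the suggestion is the one title only `bilal` has rated.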
DATA SCIENTISTS AND THEIR ROLE
• A Data Scientist is a rock star!
• A Data Scientist is an individual who has the power and freedom to
experiment with tons of different kinds of data.
• Based on knowledge of:
– Mathematics
– Problem solving
– Critical thinking
– Careful analysis
WHAT DOES A DATA SCIENCE PROFESSIONAL DO?
DATA ANALYST V/S DATA SCIENTIST
• A Data Analyst mostly converts data into a structured format so that it can be processed further.
• The focus is more on Data Mining and Data Auditing.
• Data mining involves retrieving information from large databases with the help of SQL to extract new data/information.
• Data auditing involves checking the quality of the data and working out whether it is good enough to yield useful insights.
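The two analyst tasks can be sketched as follows. The table, columns, and records are hypothetical, and counting missing values is only one simple form of audit, chosen here for illustration. The standard-library `sqlite3` module again stands in for a real database.

```python
import sqlite3

# Hypothetical customer records; None marks a missing value.
rows = [
    (1, "Delhi", 34, 1200.0),
    (2, "Mumbai", None, 800.0),
    (3, None, 29, None),
    (4, "Delhi", 41, 1500.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, city TEXT, age INTEGER, spend REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)", rows)

# Data mining step: derive new information (average spend per city) with SQL.
avg_spend_by_city = conn.execute(
    "SELECT city, AVG(spend) FROM customers "
    "WHERE city IS NOT NULL GROUP BY city ORDER BY city"
).fetchall()

# Data auditing step: count missing values per column.
# COUNT(column) skips NULLs, while COUNT(*) counts every row.
missing = {}
for column in ("city", "age", "spend"):
    total, present = conn.execute(
        f"SELECT COUNT(*), COUNT({column}) FROM customers"
    ).fetchone()
    missing[column] = total - present
conn.close()

print(avg_spend_by_city)
print(missing)
```

An audit like this tells the analyst whether the data is complete enough to pass on, or whether the gaps need to be fixed first.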
DATA ANALYST V/S DATA SCIENTIST
• A Data Scientist takes the clean data and tries to gain meaningful insights from it.

• A classification or regression algorithm is used to create a model and make it robust enough to yield business insights with the help of visualization tools.
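A minimal regression sketch, assuming a made-up clean dataset of advertising spend and sales: it fits a straight line by ordinary least squares in plain Python and uses it to predict a new value. The variable names and numbers are illustrative only. In practice a library such as scikit-learn would normally be used, and the result would then be plotted.

```python
# Hypothetical clean data: advertising spend (x) and sales (y), arbitrary units.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ad_spend, sales)

# Use the fitted model to predict sales for an unseen spend level.
predicted = slope * 6.0 + intercept

print(slope, intercept, predicted)
```

The fitted slope is the kind of business insight meant above: it estimates how much sales change for each extra unit of spend.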
