DS231 Module 2

The document is a course module on Data Science Programming from the Saudi Electronic University, focusing on the fundamentals of data science, its applications, and career opportunities. It covers key components of data science, including data collection, analysis, and communication of insights, while distinguishing between data science and data engineering. The module also explores various career paths in data science, such as data implementers, leaders, and entrepreneurs.

Uploaded by

azooom64
Copyright
© © All Rights Reserved

Saudi Electronic University

26/12/2021
College of Computing and
Informatics
Introduction to Data Science
Programming

Module 2:
Wrapping Your Head around Data
Science
Contents

1. Seeing Who Can Make Use of Data Science
2. Inspecting the Pieces of the Data Science Puzzle
3. Exploring Career Alternatives That Involve Data Science

Weekly Learning Outcomes
1. Deploying data science methods across various industries
2. Piecing together the core data science components
3. Identifying viable data science solutions to business challenges
4. Exploring data science career alternatives


Required Reading
1. Chapter 1. Wrapping Your Head around Data
Science (Lillian Pierson, Data Science, 3rd
Edition, 2021)

Videos
• Data Science In 5 Minutes | Data Science For Beginners | What Is
Data Science? https://youtu.be/X3paOmcrTjQ
• Data Analyst vs Data Scientist vs Data Engineer Picking The Role
That'll Make You HAPPY & PROSPEROUS
https://youtu.be/DVOvRM7r3gM
Introduction

• Data is coming from every computer, every mobile device,
every camera, and every imaginable sensor.
• Data is generated in every social media interaction we
humans make, every file we save, every picture we take, and
every query we submit.
• Data is even generated when we do something as simple as
ask a favorite search engine for directions to the closest ice
cream shop.
• Data is streaming from almost every activity that takes place
in both the digital and physical worlds.
• Data comes in three varieties: structured, semistructured, and
unstructured.
• The question is: “What’s the point of all this data? Why use
valuable resources to generate and collect it?”
• Specialists known as data engineers are constantly finding
innovative and powerful new ways to capture, collate, and
condense unimaginably massive volumes of data.
• And other specialists, known as data scientists, are leading
change by deriving valuable and actionable insights from that
data.
• In its truest form, data science represents the optimization of
processes and resources. Data science produces data insights.
• Using data science insights is like being able to see in the
dark. For any goal you can imagine, you can find data science
methods to help you predict the most direct route from
where you are to where you want to be.

Welcome to the magic of Data Science


1. Seeing Who Can Make Use of Data
Science
• The terms data science and data engineering are often
misused and confused.
• Data science is the computational science of extracting
meaningful insights from raw data and then effectively
communicating those insights to generate value.
• Data engineering is an engineering domain that’s dedicated
to building and maintaining systems that overcome data
processing bottlenecks and data handling problems for
applications that consume, process, and store large volumes,
varieties, and velocities of data.
Three data varieties:
• Structured: Data that is stored, processed, and manipulated
in a traditional relational database management system
(RDBMS). Example: a MySQL database
• Unstructured: Data that is commonly generated from human
activities and doesn’t fit into a structured database format.
Example: email documents
• Semistructured: Data that doesn’t fit into a structured
database system but is nonetheless organizable by tags that
are useful for creating a form of order and hierarchy in the
data. Example: XML and JSON files.
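The three varieties can be sketched in a few lines of Python. This is only an illustrative example, not part of the course material: the table, the JSON document, and the email text are all made up, and an in-memory SQLite database stands in for an RDBMS such as MySQL.

```python
import json
import sqlite3

# Structured: rows with a fixed schema in a relational database.
# (Assumption: in-memory SQLite stands in for an RDBMS such as MySQL.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Amal", "Riyadh"), (2, "Omar", "Jeddah")],
)
structured_rows = conn.execute(
    "SELECT name, city FROM customers ORDER BY id"
).fetchall()
conn.close()

# Semistructured: no table schema, but keys (tags) give order and hierarchy.
semistructured = json.loads(
    '{"customer": {"name": "Amal", "orders": [{"item": "laptop"}, {"item": "mouse"}]}}'
)
ordered_items = [order["item"] for order in semistructured["customer"]["orders"]]

# Unstructured: free text, such as an email body, with no tags or schema.
unstructured = "Hi team, the laptop arrived late but works fine. Thanks, Amal"
word_count = len(unstructured.split())

print(structured_rows)
print(ordered_items)
print(word_count)
```

The structured data can be queried by column name, the semistructured data has to be navigated through its keys, and the unstructured text offers nothing to query beyond the raw words.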
• It used to be that only large tech companies implemented data
science methodologies to optimize and improve their
business, but that hasn’t been the case for quite a while now.
The proliferation of data has created a demand for insights.
• Data and the need for data-informed insights are found
everywhere.
• Because organizations of all sizes are beginning to recognize
that they’re immersed in a sink-or-swim, data-driven,
competitive environment, data know-how has emerged as a
core and requisite function in almost every line of business.
• The fact is, in order to stay relevant, you need to take the
time and effort to acquire the skills that keep you current.
• Who can use data science? You can. Your organization can.
Your employer can. Anyone who has a bit of understanding
and training can begin using data insights to improve their
lives, their careers, and the well-being of their businesses.
• With data insights, people now have access to the
predictive vision that they need to truly drive change and
achieve the results they want.
2. Inspecting the Pieces of the Data
Science Puzzle
• To practice data science, in the true meaning of the term, you
need the analytical know-how of math and statistics, the
coding skills necessary to work with data, and an area of
subject matter expertise.
• Nowadays, it’s almost impossible to differentiate between a
proper data scientist and a subject matter expert (SME)
whose success depends heavily on their ability to use data
science to generate insights.
The key components that are part of any data science role:
• Collecting, querying, and consuming data
• Applying mathematical modeling to data science tasks
• Deriving insights from statistical methods
• Coding, coding, coding - it’s just part of the game
• Applying data science to a subject area
• Communicating data insights
Collecting, querying, and consuming data
• Data engineers have the job of capturing and collating large
volumes of structured, unstructured, and semistructured big
data.
• Again, data engineering tasks are separate from the work
that’s performed in data science, which focuses more on
analysis, prediction, and visualization. Despite this distinction,
whenever data scientists collect, query, and consume data
during the analysis process, they perform work similar to that
of the data engineer.
• A data scientist can work from several datasets that are
stored in a single database, or even in several different data
storage environments.
• No matter how the data is combined or where it’s stored, if
you’re a data scientist, you almost always have to query data.
• Whether you’re using a third-party application or doing
custom analyses by using a programming language such as R
or Python, you can choose from the following universally
accepted file formats:
• Comma-separated values (CSV): Almost every brand of desktop
and web-based analysis application accepts this file type.
• Script: These script files end with the extension .py or .ipynb
(Python) or .r (R).
• Application: Excel is useful for quick and easy analyses on
small- to medium-size datasets.
• Web programming: If you’re building custom, web-based data
visualizations, you may be working in D3.js (Data-Driven
Documents), a JavaScript library for data visualization.
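As a minimal sketch of consuming and querying a CSV file in Python, the example below uses only the standard library. The column names and values are hypothetical, and the data is held in a string so the example is self-contained; in practice it would be read from a file on disk.

```python
import csv
import io

# Hypothetical CSV content; a real script would use open("some_file.csv", newline="").
csv_text = """region,units,price
North,10,2.50
South,4,3.00
North,6,2.50
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting the numeric columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"region": r["region"], "units": int(r["units"]), "price": float(r["price"])}
        for r in reader
    ]


def revenue_by_region(rows):
    """Sum units * price for each region."""
    totals = {}
    for row in rows:
        revenue = row["units"] * row["price"]
        totals[row["region"]] = totals.get(row["region"], 0.0) + revenue
    return totals


rows = load_rows(csv_text)
totals = revenue_by_region(rows)
print(totals)
```

Libraries such as pandas offer the same load-and-aggregate workflow with less code; the standard-library version is used here only to keep the sketch dependency-free.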
Applying mathematical modeling to data science tasks
• Data science relies heavily on a practitioner’s math skills
precisely because these are the skills needed to understand
your data and its significance.
• These skills can be used to carry out predictive forecasting,
decision modeling, and hypothesis testing.
• Mathematics uses deterministic methods to form a
quantitative (or numerical) description of the world.
• Data scientists use mathematical methods to build decision
models, generate approximations, and make predictions.
Deriving insights from statistical methods
• In data science, statistical methods are useful for better
understanding your data’s significance, for validating
hypotheses, for simulating scenarios, and for making
predictive forecasts of future events.
• If you want to go places in data science, take some
time to get up to speed in a few basic statistical methods, like
linear and logistic regression, naïve Bayes classification, and
time series analysis.
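To make one of these methods concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares and used for a one-step forecast. The monthly sales figures are invented for illustration, and the helper name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: sales observed in months 1 through 5.
months = [1, 2, 3, 4, 5]
sales = [12.0, 15.0, 19.0, 20.0, 24.0]

slope, intercept = fit_line(months, sales)

# Predictive forecast for month 6, extrapolating the fitted trend.
forecast = slope * 6 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
```

In practice a statistics library would also report how well the line fits; this sketch shows only the fitting and forecasting steps.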
Coding, coding, coding - it’s just part of the game
• Coding is unavoidable when you’re working in data science.
• Your code instructs the computer on how to manipulate,
analyze, and visualize your data.
• Programming languages such as Python and R are important for
writing scripts for data manipulation, analysis, and visualization.
• SQL, on the other hand, is useful for data querying.
• Finally, the JavaScript library D3.js is often required for making
cool, custom, and interactive web-based data visualizations.
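The division of labor between SQL and Python can be sketched as follows: SQL expresses the query, and Python runs it and works with the result. The visits table and its values are hypothetical, and an in-memory SQLite database is assumed so the example runs without any setup.

```python
import sqlite3

# Assumption: an in-memory SQLite database with a made-up table of page visits.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (page TEXT, seconds INTEGER)")
conn.executemany(
    "INSERT INTO visits (page, seconds) VALUES (?, ?)",
    [("home", 30), ("pricing", 90), ("home", 50), ("pricing", 110), ("blog", 20)],
)

# SQL does the querying: group, filter, and sort inside the database.
query = """
SELECT page, COUNT(*) AS n_visits, AVG(seconds) AS avg_seconds
FROM visits
GROUP BY page
HAVING COUNT(*) > 1
ORDER BY avg_seconds DESC
"""
summary = conn.execute(query).fetchall()
conn.close()

# Python takes over for any further manipulation, analysis, or reporting.
for page, n_visits, avg_seconds in summary:
    print(f"{page}: {n_visits} visits, {avg_seconds:.1f} s on average")
```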
Applying data science to a subject area
• Many statisticians have cried out, “Data science is nothing new -
it’s just another name for what we’ve been doing all along!”
• In fact, data scientists often use computer languages not used in
traditional statistics and take approaches derived from the field of
mathematics.
• The main point of distinction between statistics and data science
is the need for subject matter expertise.
• Data scientists should have strong subject matter expertise in
the area in which they’re working.
• Data scientists generate deep insights and then use their domain-
specific expertise to understand exactly what those insights mean
with respect to the area in which they’re working.
• An example of coupling data science skills with an area of expertise:
a clinical informatics scientist, who combines healthcare expertise
with data science skills to produce personalized healthcare
treatment plans. (more examples in the book)
Communicating data insights
• A data scientist must have excellent verbal communication skills.
• If a data scientist can’t communicate, all the knowledge and
insight in the world does nothing for the organization.
• Data scientists need to be able to explain data insights in a way
that staff members can understand.
• Data scientists need to be able to produce clear and meaningful
data visualizations and written narratives.
• Data scientists must be creative and pragmatic in their means and
methods of communication.
3. Exploring Career Alternatives That Involve
Data Science
The data implementer
• The main task is to build data and artificial intelligence (AI)
solutions.
• A natural attention to detail helps you code up innovative
solutions that deliver reliable and accurate results.
• A project starts with a simple request and some messy data, but
through perseverance and brainpower, they’re turned into clear
and accurate predictive data insights.
• If you’re a data implementer, math and coding are your superpowers.
The data leader
• A data leader leads teams and project stakeholders through the
process of building successful data solutions.
• The difference between data implementers and data leaders is
that leaders generally love data science for the incredible
outcomes that it makes possible.
• They have a deep passion for using their data science expertise
and leadership skills to create tangible results.
• Data leaders love to collaborate with smart people across the
company to get the job done right.
The data entrepreneur
• If you’re a data entrepreneur, your secret superpower is building
up businesses by delivering exceptional data science services and
products.
• A data entrepreneur shares many traits with, and leans toward,
either the data implementer or the data leader, but
with one important difference: “Data entrepreneurs crave the
creative freedom that comes with being a founder.”
• The goal is to create a vision for a business and then use data
science expertise to guide the business to turn that vision into
reality.
Thank You
