[ISP610] Lesson 1 - Introduction to data analytics _Mdm Ezza2024

Uploaded by

diniimusthafa
ISP610

Chapter 1:
Introduction to Data Analytics

DR. EZZATUL AKMAL KAMARUZAMAN
OC2024
Outline
INTRODUCTION TO DATA ANALYTICS

1. Definition of data analytics


2. Importance of data analytics
3. Type of data analytics
4. Example of applications
5. Data Science
6. Data Analytics Process
7. The people involved in business data analytics.
8. Your role in business data analytics.
9. Necessary skill set to be in business data analytics.
1.1 Definition of data analytics

Data All Around


• Lots of data is being collected
and warehoused
– Web data, e-commerce
– Financial transactions, bank/credit
transactions
– Online trading and purchasing
– Social Network
How Economists Define New Terms as Commodities: Data, AI, IoT

• Data is the new oil
• AI is the new electricity
• IoT is the new nervous system

“Data is the new oil” - Clive Humby
How Much Data Do We Have?

• Google processes 20 PB a day
• Facebook has 60 TB of daily logs
• eBay has 6.5 PB of user data + 50 TB/day
• 1000 Genomes Project: 200 TB
How Big is Big Data?

From the megabyte (10^6 bytes) to the brontobyte (10^27 bytes) and geopbyte (10^30 bytes), these units will be used to describe the tremendous pool of digital data formed by the IoT platform.
Cisco IBSG predicted more than 50 billion devices connected to the internet by 2020, with 75 billion IoT devices by 2025.
By 2025, it’s estimated that 463 exabytes of data will be created each day globally – that’s the equivalent of 212,765,957 DVDs per day.
- Almiani, 2020
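To make the scale above concrete, the unit conversions can be sketched in a few lines of Python. This is only an illustration: the helper names `to_bytes` and `dvds_needed` are hypothetical, and the 4.7 GB single-layer DVD capacity is an assumption that does not appear on the slide.

```python
# Decimal byte units as powers of ten. "brontobyte" and "geopbyte" are
# informal names, included only because the slide mentions them.
UNITS = {
    "megabyte": 10**6,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "brontobyte": 10**27,
    "geopbyte": 10**30,
}

# Assumption: a single-layer DVD holds 4.7 GB (not stated in the slides).
DVD_BYTES = 4_700_000_000


def to_bytes(amount, unit):
    """Convert an amount in the given unit to bytes."""
    return amount * UNITS[unit]


def dvds_needed(total_bytes, dvd_bytes=DVD_BYTES):
    """Number of DVDs needed to hold total_bytes (fractional result)."""
    return total_bytes / dvd_bytes


daily_bytes = to_bytes(463, "exabyte")
print(f"463 exabytes = {daily_bytes:.3e} bytes")
print(f"DVDs per exabyte: {dvds_needed(to_bytes(1, 'exabyte')):,.0f}")
print(f"DVDs for 463 exabytes: {dvds_needed(daily_bytes):,.0f}")
```

Under the 4.7 GB assumption, 212,765,957 DVDs hold roughly one exabyte, so the DVD equivalence quoted for 463 exabytes is best treated as an order-of-magnitude illustration rather than an exact conversion.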
• Data Analytics WordCloud


• What is Data?
Data is a set of values of subjects with respect to qualitative or quantitative variables. The terms data, information and knowledge are often used interchangeably; however, data becomes information when it is viewed in context or after analysis.
• What is Big Data?


Big Data is any data that is expensive to manage and hard to extract value from.
5 Main Characteristics (5V’s) of Big Data (Mohammed, 2020)

• Volume – Data size
• Velocity – Data production & processing speed
• Variety – Data types
• Veracity – Truthfulness, correctness of the data
• Value – Derivation and storytelling



• Volume
– The size of the data, which defines the word “big”
• Velocity
– How fast the data can be processed and accessed (social media posts, YouTube videos, etc., which are uploaded in the thousands every second, should be accessible as early as possible)
• Variety and Complexity
– The diversity of sources, formats, quality and structures
• Variability
– Data that keeps changing constantly; the focus is on understanding and interpreting the correct meaning of raw data
• Veracity
– Making sure the data gathered is accurate and keeping bad data away from the systems
• Visualization
– How to present data to management for decision-making purposes
• Value
– The organization needs to obtain some value after efforts are made and resources are spent on the other V’s (if they are done and processed correctly)
The V’s Evolution of Big Data

• 9V’s (Owais, 2016): Volume, Variety, Velocity, Veracity, Value, Visualization, Variability, Validity, Volatility
• 10V’s (Data Science Central): Volume, Variety, Velocity, Veracity, Value, Variability, Validity, Venue, Vocabulary, Vagueness
• 17V’s (Panimalar, 2017): Volume, Variety, Velocity, Veracity, Value, Variability, Validity, Venue, Vocabulary, Vagueness, Volatility, Visualization, Viscosity, Virality, Verbosity, Voluntariness, Versatility
• Types of Data
– STRUCTURED:
• Relational Data
(Tables/Transaction/Legacy
Data)
– UNSTRUCTURED
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
• Streaming Data
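The difference between structured and semi-structured data can be illustrated with a minimal Python sketch using only the standard library (`sqlite3` and `xml.etree.ElementTree`). The `orders` table, the XML element names and the sample values are hypothetical and chosen purely for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: a relational table has a fixed schema of named columns,
# and every row must fit that schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Aina", 120.0), (2, "Ben", 75.5)],
)
table_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
conn.close()

# Semi-structured: XML tags describe the data, but individual records
# may carry extra or missing fields (only the second order has a <note>).
xml_text = """
<orders>
  <order id="1"><customer>Aina</customer><amount>120.0</amount></order>
  <order id="2"><customer>Ben</customer><amount>75.5</amount><note>gift</note></order>
</orders>
""".strip()
root = ET.fromstring(xml_text)
xml_total = sum(float(order.findtext("amount")) for order in root.findall("order"))
notes = [order.findtext("note") for order in root.findall("order")]

print(table_total, xml_total, notes)
```

Both representations hold the same orders, but only the XML version tolerates a record with an extra field without a schema change.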
• Types of Data: examples
– Relational Data (Tables/Transaction/Legacy Data): organised into rows and columns
– Text Data (Web)
– Semi-structured Data (XML)
– Graph Data, e.g. political polarization during the 2008 US Presidential Campaign
– Streaming Data: https://www.zoomdata.com/solutions/modern-bi/streaming-analytics/
https://www.zoomdata.com/solutions/modern-bi/streaming-analytics/
Data Sources
• Data come from multiple sources, including:
– Medical Information, such as genomic sequencing and MRIs
– Increased use of broadband on the Web – including the 2 billion photos each
month that Facebook users currently upload as well as the innumerable videos
uploaded to YouTube and other multimedia sites.
– Video surveillance.
– Increased global use of mobile devices – the torrent of texting is not likely to
cease.
– Smart devices – sensor-based collection of information from smart electric
grids, smart buildings and many other public and industry infrastructure.
– Non-traditional IT devices – including the use of RFID readers, GPS navigation
systems, and seismic processing.
• What is Data Analytics?
➢ “a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making” - Wikipedia
➢ “leverage data in a particular functional process (or application) to enable context-specific insight that is actionable” - Gartner
➢ the science of analyzing raw data to make conclusions about that information - Investopedia
➢ examines large amounts of data to uncover hidden patterns, correlations and other insights - SAS
1.2 Importance of data analytics
• Data analytics is important because it helps a business in any sector optimize its performance. Building it into the business model can help companies reduce costs by identifying more efficient ways of doing business and of storing large amounts of data. A company or sector can also use data analytics to make better business decisions and to analyze customer trends and satisfaction, which can lead to new (and better) products and services, or at least provide input or guidance.
• Value Chain: Analytics shows how existing information can help the business find its “gold mine”, the path to success for the company.
• Knowledge: The insights act as a guide to how you can run your business in the near future and to what the market already has its hands on. That is how you gain the benefit before anyone else.
• Opportunities: Data analytics gives us analyzed data that helps us see opportunities ahead of time, which is another way of unlocking more options.
1.3 Type of data analytics
There are four types of data analytics:
• Descriptive analytics
– describes what has happened over a given period. Has the number of views gone up? Are sales stronger this month than last?
• Diagnostic analytics
– focuses more on why something happened. This involves more diverse data inputs and a bit of
hypothesizing. Did the weather affect drink sales? Did that latest marketing campaign impact
sales?
• Predictive analytics
– moves to what is likely going to happen in the near term. What happened to sales last time we
had a hot summer? How many weather models predict a hot summer this year?
• Prescriptive analytics
– moves into the territory of suggesting a course of action. If the likelihood of a hot summer, measured as the average of these five weather models, is above 58%, then we should rent an additional drink tank to increase output in the factory.
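The drink-sales questions above can be sketched as small Python functions. This is a minimal illustration, not a real forecasting method: the sales figures and weather-model probabilities are made up, the function names are hypothetical, the 0.5 cutoff for "predicts a hot summer" is an assumption, and the "keep current capacity" fallback is an assumption because the slide only states the action taken above 58%. Diagnostic analytics is left out because it needs hypothesis testing beyond a few lines.

```python
from statistics import mean

# Threshold taken from the slide's prescriptive example (58%).
HOT_SUMMER_THRESHOLD = 0.58


def sales_went_up(monthly_sales):
    """Descriptive: are sales stronger this month than last?"""
    return monthly_sales[-1] > monthly_sales[-2]


def models_predicting_hot_summer(probabilities, cutoff=0.5):
    """Predictive: how many weather models predict a hot summer?

    Assumption: a model "predicts" a hot summer when its probability
    exceeds the cutoff.
    """
    return sum(1 for p in probabilities if p > cutoff)


def prescribe(probabilities, threshold=HOT_SUMMER_THRESHOLD):
    """Prescriptive: suggest an action from the average model likelihood."""
    likelihood = mean(probabilities)
    if likelihood > threshold:
        return "rent an additional drink tank"
    # Assumption: the slide gives no alternative action for this case.
    return "keep current capacity"


# Hypothetical inputs: last month's and this month's sales, five models.
monthly_sales = [900, 1200]
weather_models = [0.55, 0.62, 0.70, 0.48, 0.65]

print(sales_went_up(monthly_sales))
print(models_predicting_hot_summer(weather_models))
print(prescribe(weather_models))
```

The design point is the progression: the descriptive function only summarises the past, the predictive one looks forward, and the prescriptive one turns a prediction into a recommended action.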
1.4 Example of applications
• Nowadays, data analytics has become an important tool for solving business problems in various fields, including:
– Case 1: Customer Analytics
• Analytics is often used to model customer behavior, for example, modeling the events that lead to a customer becoming brand loyal.
– Case 2: Credit Risk Analytics
• Analytics conducted on credit data helps risk managers stay competitive in today’s marketplace. A manager can use analytics to access real credit data, support inference, evaluation and decision-making, conduct low-default portfolio risk modelling and stress testing, and build and validate credit risk management models. Predictive analytics is often used to model business risk, such as the credit risk associated with a particular customer.
– Case 3: Retail Analytics


• Analytics for retail forecasts and operations. For example, a retailer may attempt to
predict demand for a trendy new style of shoe by color and sales region.

– Case 4: Marketing Analytics


• Analytics to look at the results of product, pricing, promotion, advertising and
distribution strategies. For example, analytics might show that female customers in
their 20s are 70% more likely to purchase a particular item at price A as compared
to price B

– Case 5: Business Analytics


• A company would like to identify which of their customers are likely to stop using
their services (to churn). Thus, this company can use data analytics to explore and
understand the customer’s behaviour based on the company’s business data.
Based on the results obtained the company can focus on the retention strategy.
The Value of Data
• Everyone and everything is leaving a digital footprint. The
graph shows the different forms of data being generated
by new applications and the scale and growth rate of the
data. By analysing this immense volume of data, organisations can
reap value.
• Industry case studies:
– Health care – Reducing cost of care
– Public services – Preventing pandemics
– Life sciences – Genomic mapping
ChatGPT / Large Language Model

• TASK :
Find the value of data in the context of Large Language Models
– Situation
– Use of Data
– Key Outcomes
Example of applications. More…
• Netflix
• https://www.edureka.co/blog/data-science-applications/
Competitive Advantage

• To a profit-making organisation, the value of data comes in the form of an advantage over its competitors.
• According to Bain Research, top-performing organisations tend to make decisions based on what their data tells them. By having a good basis to work on, these organisations tend to make decisions faster.
Competitive Advantage: Airlines
• Call centres, for instance, can be made more effective and efficient by
capitalizing on what the company can know about the caller ahead of
time. And airlines have for years been able to route premium-status fliers
to higher-level customer service representatives by recognizing their caller
IDs. Now they can do even more: By making a quick correlation between
your ID, your booked flights and the status of those flights, they may be
able to determine why you’re calling, even before the second ring. If your
next flight has just been delayed, the representative could answer the
phone with a pretty good idea of why you’re calling. More in-depth
analysis could correlate your ID with your social media presence. If you’ve
just tweeted an irate message about being booted from a flight, the rep
answering your call may have already read it.

What’s Driving Analytics in Organisations?
Analytics
• More than just OLTP/MIS reporting.
• Rather than doing standard reporting on these areas, organizations can apply advanced analytical techniques to optimize processes and derive more value from these typical tasks.
• Analytics examines large amounts of data to uncover hidden patterns, correlations and other insights.
• Analytics helps organisations make more accurate decisions when faced with problems.
• Analytics helps organizations harness their data and use it to identify new opportunities. That, in turn, leads to smarter business moves, more efficient operations, higher profits and happier customers.

WHO ARE THE PEOPLE INVOLVED IN
BUSINESS DATA ANALYTICS
AND
WHAT IS YOUR ROLE?

SKILL SET

• Quantitative skills, such as mathematics or statistics.
• Technical aptitude, such as software engineering, machine learning and programming skills.
• Sceptical: This may be a counterintuitive trait, but it is important that data scientists can examine their work critically rather than in a one-sided way.
• Curious & Creative: Must be passionate about data and about finding creative ways to solve problems and portray information.
• Communicative & Collaborative: It is not enough to have strong quantitative or engineering skills. To make a project resonate, you must be able to articulate the business value in a clear way and work collaboratively with project sponsors and key stakeholders.
1.5 Data Science
What is Data Science?

Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.

Diagram 1: Data science process flow


1.6 Process of Data Analytics
Data Science Life Cycle: CRISP-DM

CRISP-DM, or CRoss Industry Standard Process for Data Mining, is a process model with six phases that naturally describes the data science life cycle. It’s like a set of guardrails to help you plan, organize, and implement your data analytics/data science project.
• Business Understanding: CRISP-DM starts with understanding the business objectives and requirements of the project. This aligns with the goal-oriented nature of both data analytics and data science, which aim to address specific business challenges or opportunities using data-driven approaches.

• Data Understanding: In CRISP-DM, data understanding involves exploring and familiarizing oneself with the available data sources. Similarly, data analytics and data science projects require an understanding of the data landscape, including data quality, structure, and relationships, to inform the analysis and modeling processes.

• Data Preparation: Data preparation in CRISP-DM involves cleaning, transforming, and integrating data to make it suitable for analysis. This aligns with the data preprocessing phase in data analytics and data science, where raw data is refined and prepared for analysis using techniques such as data cleaning, normalization, and feature engineering.

• Modeling: The modeling phase in CRISP-DM involves building and evaluating predictive models to address the project objectives. This corresponds to the modeling phase in data science, where various algorithms and techniques are applied to develop models that can uncover patterns, make predictions, or generate insights from the data.

• Evaluation: CRISP-DM emphasizes the importance of evaluating model performance and validating results to ensure they meet the project objectives. Similarly, data analytics and data science projects involve assessing the accuracy, reliability, and relevance of the analysis results to ensure they provide actionable insights and value to the business.

• Deployment: The deployment phase in CRISP-DM focuses on implementing the data analytics/data science solution and integrating it into business processes. In data analytics and data science, this translates to deploying analytical models, dashboards, or reports to stakeholders and decision-makers, enabling them to use the insights generated to inform their actions and decisions.
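The six phases above can be sketched as one small Python function per phase. This is a toy illustration under stated assumptions, not a prescribed CRISP-DM implementation: the churn records, field names and function names are hypothetical, the "model" is a simple midpoint threshold rather than a real learning algorithm, and the evaluation reuses the training records because the sample is too small for a hold-out set.

```python
# Hypothetical customer records; None marks a missing value.
RAW = [
    {"customer": "A", "months_inactive": 0, "churned": False},
    {"customer": "B", "months_inactive": 5, "churned": True},
    {"customer": "C", "months_inactive": None, "churned": False},
    {"customer": "D", "months_inactive": 1, "churned": False},
    {"customer": "E", "months_inactive": 4, "churned": True},
]


def business_understanding():
    # Phase 1: state the objective the analysis must serve.
    return "flag customers who are likely to churn"


def data_understanding(records):
    # Phase 2: explore data quality, here by counting missing values.
    return sum(1 for r in records if r["months_inactive"] is None)


def data_preparation(records):
    # Phase 3: clean the data, here by dropping incomplete records.
    return [r for r in records if r["months_inactive"] is not None]


def modeling(records):
    # Phase 4: fit a deliberately simple rule, the midpoint between the
    # average inactivity of churned and of retained customers.
    churned = [r["months_inactive"] for r in records if r["churned"]]
    stayed = [r["months_inactive"] for r in records if not r["churned"]]
    return (sum(churned) / len(churned) + sum(stayed) / len(stayed)) / 2


def evaluation(records, threshold):
    # Phase 5: accuracy of the rule on the prepared records
    # (no hold-out set in this toy example).
    correct = sum(
        (r["months_inactive"] > threshold) == r["churned"] for r in records
    )
    return correct / len(records)


def deployment(threshold):
    # Phase 6: hand stakeholders something usable, here a scoring function.
    return lambda months_inactive: months_inactive > threshold


objective = business_understanding()
missing = data_understanding(RAW)
clean = data_preparation(RAW)
threshold = modeling(clean)
accuracy = evaluation(clean, threshold)
is_at_risk = deployment(threshold)

print(objective, missing, threshold, accuracy, is_at_risk(3))
```

In practice the phases are iterative: a poor evaluation result typically sends the project back to data preparation or even to business understanding.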
