DA Notes

Data analytics notes

Uploaded by

Aisha Emad

Data Analytics Notes

Types of Data Analytics
 Descriptive Analytics: Summarizes past data to answer "what happened?" It
uses statistical measures such as the mean, median, and mode, along with
visual reports, dashboards, and simple data queries. This type of analytics
helps businesses understand their current status and performance
indicators.
 Exploratory/Diagnostic Analytics: Aims to answer "why did it happen?" by
investigating data relationships, root causes, and patterns. It often involves
correlation analysis, business dashboards, and analysis models to identify
the likely causal factors behind trends or outcomes.
 Predictive Analytics: Looks at historical data patterns to forecast future
events, answering "what is likely to happen?" It applies techniques like
regression analysis, time series forecasting, machine learning, and deep
learning, often using tools like R and Python. Predictive analytics helps
anticipate risks, identify opportunities, and optimize processes.
 Prescriptive Analytics: Provides recommendations on the best actions to
take based on predictive models, addressing "what should we do?". It uses
optimization models, simulation techniques, and sensitivity analysis to
guide decision-making and improve performance. This type of analytics is
beneficial for tasks like pricing strategies, investment planning, and process
optimization.
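
The three question types above can be sketched on one tiny dataset. This is a minimal illustration, not a method taken from these notes: the price and sales figures, UNIT_COST, and the helper names fit_line, predict_units, and best_price are all assumptions made up for the example.

```python
from statistics import mean, median

# Hypothetical observations: (unit price, units sold) for one product.
history = [(8.0, 120), (9.0, 110), (10.0, 100), (11.0, 90), (12.0, 80)]
UNIT_COST = 6.0  # assumed cost per unit, for illustration only

prices = [p for p, _ in history]
units = [u for _, u in history]

# Descriptive: summarize what happened.
summary = {"mean_units": mean(units), "median_units": median(units)}


# Predictive: fit units = intercept + slope * price by ordinary least squares.
def fit_line(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(prices, units)


def predict_units(price):
    return intercept + slope * price


# Prescriptive: pick the candidate price with the highest predicted profit.
def best_price(candidates):
    return max(candidates, key=lambda p: (p - UNIT_COST) * predict_units(p))


print(summary)
print(predict_units(13.0))
print(best_price([9.0, 11.0, 13.0, 15.0]))
```

The prescriptive step here is only as good as the fitted line, which is why prescriptive analytics is described as building on predictive models.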

Evolution of Data Analytics
 Data Analytics 1.0 (Business Intelligence Era): Initial stage focused on
structured data reporting through business intelligence (BI) tools like
Power BI. This era primarily involved data collection and reporting on
company metrics.
 Data Analytics 2.0 (Big Data Era): With the rise of big data and NoSQL
databases, this phase introduced frameworks like Hadoop, along with AI and
machine learning techniques, for handling vast and varied data types. The
emphasis was on processing large data volumes to extract insights.
 Data Analytics 3.0 (Data Science and Decision-Making Era): Leveraging
data science, machine learning (ML), deep learning (DL), and languages like
Python and R, this era focuses on using data analysis for innovation and
sustainable decision-making. It helps companies stay competitive by
developing data-driven products, optimizing services, and supporting R&D.

Data Analytics Process
 Data Collection and Cleaning: Involves gathering data from different
sources (IoT devices, applications, social media, etc.) and storing it in a
centralized system like a cloud database. Cleaning ensures the data is
reliable and free of errors.
 Data Mining: Sorting, processing, and labeling data with metadata to
enable data scientists to identify trends and focus on insights rather than
manual tasks. Machine learning algorithms help automate this step.
 Descriptive and Exploratory Analysis: Summarizes "what is happening?"
and explores "why is it happening?". This stage uses descriptive statistics,
visualizations, and dashboard tools to gain initial insights.
 Predictive and Prescriptive Analysis: Uses trends and historical data to
predict and advise on future actions, supporting decision-makers with
actionable insights.
 Visualization and Reporting: Visualization tools like Power BI, Tableau, and
dashboards simplify complex datasets, enabling managers to understand
alternatives quickly and make informed decisions.

Applications of Data Analytics
Business Functions:
 Marketing: Marketing analytics assess the success of campaigns, customer
engagement, and ROI. AI and machine learning help optimize campaigns,
segment customers, and guide strategic marketing efforts.
 Human Resources (HR): HR analytics reveal insights on talent acquisition,
employee behavior, and retention. Tools allow HR leaders to optimize
recruitment processes, analyze employee decisions, and predict outcomes.
 Sales: Sales analytics identify key factors influencing customer purchases,
like price, seasonality, or availability, allowing teams to improve sales cycles
and forecasts.
 Finance: Financial analytics enhance budget planning, optimize cost
management, and increase profit margins by analyzing spending patterns,
using predictive modeling, and generating machine learning insights.
Industries:
 Transportation: Helps analyze traffic patterns and network congestion.
 Logistics and Delivery: Optimizes shipping routes and delivery times and
tracks shipments.
 Web Services: Enhances search engine algorithms, providing more relevant
search results.
 Manufacturing: Improves operational efficiency through predictive
maintenance, budgeting, and trend analysis.
 Security: Proactively addresses cybersecurity by analyzing and identifying
potential threats.
 Education: Supports student learning and engagement by analyzing
educational data and outcomes.
 Healthcare: Utilizes analytics to provide faster diagnosis, treatment options,
and personalized care by examining patient data in real time.
Benefits of Data Analytics
 Competitive Advantage: Data analytics helps companies understand
industry trends, strategize against competitors, and grow in a changing
environment.
 Efficient Use of Data: Companies collect extensive data; analytics helps
them discern what data is valuable and how best to use it.
 Customer Relationship Building: By analyzing customer behavior,
companies can create personalized experiences, improving loyalty and
satisfaction.
 Strategic Planning and Forecasting: Analytics allows companies to adapt to
trends, optimize processes, and make data-driven decisions.
Future of Data Analytics
 Growing Demand: The data analytics field is expected to expand
significantly, with the market projected to grow by 30% annually, reaching a
value of $77.6 billion.
 Job Opportunities: Data analysts and data scientists are in high demand
across various industries. Roles will likely continue to grow as businesses
across all sectors increasingly rely on data-driven insights.
 Importance Across Sectors: Beyond IT, data analytics is becoming essential
in finance, healthcare, media, entertainment, and mobility, leading to
employment growth.
Data Analytics Tools
 Common Tools: SQL, Excel, R, Python, Tableau, and Power BI. These tools
handle data collection, analysis, visualization, and reporting.
 Specialized Tools:
o Spreadsheets (Excel): Often used for data entry, pivot tables, and
data visualization.
o OLAP (Online Analytical Processing): Used for multidimensional
analysis in databases.
o Statistical and Quantitative Tools: Enable complex data analysis and
decision-making, such as decision trees, TOPSIS, and Bayesian
networks.
o Business Rule Engines (BRE): Help automate business rules and data
handling based on specific criteria.
o Simulation Tools: Model various scenarios to predict outcomes using
mathematical functions.
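
As a rough illustration of what a simulation tool does, the sketch below runs a small Monte Carlo simulation of profit under uncertain demand. The price, costs, demand range, and the function name simulate_profit are all assumptions chosen for the example, not figures from these notes.

```python
import random


def simulate_profit(n_trials, seed=0):
    """Monte Carlo sketch: profit outcomes under uncertain demand."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    price, unit_cost, fixed_cost = 20.0, 12.0, 500.0  # assumed figures
    profits = []
    for _ in range(n_trials):
        demand = rng.uniform(50, 150)  # assumed demand range, in units
        profits.append((price - unit_cost) * demand - fixed_cost)
    return sum(profits) / n_trials, min(profits), max(profits)


avg, worst, best = simulate_profit(10_000)
print(f"average={avg:.0f} worst={worst:.0f} best={best:.0f}")
```

Running many trials gives a range of outcomes (worst, average, best) rather than a single point estimate, which is the main reason to simulate.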
9. Data Cleaning
 Importance: Cleaning is crucial to ensure reliable results. A common saying,
"Garbage in, garbage out," highlights that clean data leads to meaningful
outcomes, whereas unclean data leads to unreliable results.
 Methods:
o Removing Duplicates: Filters out repeat records.
o Deleting Irrelevant Columns: Omits unnecessary data fields to
streamline analysis.
o Handling Missing Data: Missing values can be either imputed based
on other values or excluded from analysis.
o Outlier Management: Identifies and addresses data points that are
extreme or inconsistent with other values.
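
One common rule of thumb for flagging outliers is the interquartile range (IQR) rule. The notes do not prescribe a specific method, so treat this as one possible sketch; the scores list and the function name iqr_outliers are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


scores = [72, 75, 78, 80, 81, 83, 85, 88, 90, 12]
print(iqr_outliers(scores))
```

Flagged values still need a judgment call: an outlier may be a data-entry error to remove or a genuine extreme worth keeping.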
10. Data Analyst Role
 Responsibilities:
o Collect and clean data, identify trends, and create reports, charts,
and dashboards.
o Communicate insights through data storytelling and visualization.
o Collaborate with stakeholders to turn data insights into actionable
business strategies.
 Required Skills:
o Mathematical proficiency, statistical and programming skills (in
languages like SQL, R, Python), problem-solving, analytical thinking,
and effective communication.
 Tools and Software: Data analysts commonly use tools such as Excel, SQL,
Tableau, Power BI, Python, and R to gather, analyze, and present data.
Data Analysis:
 Process: Exploring, cleaning, transforming, and reporting data.
 Goal: Extract useful insights, suggest conclusions, and support decision-
making.
Tools for Data Analysis:
 OpenRefine, Tableau, KNIME, Google Fusion Tables (discontinued in 2019), NodeXL.
Data Analytics:
 Focus: Using data visualization and statistical models for insights and better
decision-making.
 Defined as: Transforming data into actionable insights within an
organizational context.
Tools for Data Analytics:
 SAS, R, Python (with libraries), Tableau, Apache Spark, MS Excel.
Business Intelligence (BI) vs. Data Analytics
 Focus:
o BI: Explains past performance using consistent metrics to guide planning.
o Data Analytics: Provides insights, predictions, and prescriptions based on
statistical and data transformation methods.
 Purpose:
o BI: Helps in decision-making by analyzing business operations and identifying
areas for improvement.
o Data Analytics: Transforms raw data into usable formats, supports decision-
making, and applies predictive analytics.
 Techniques:
o BI: Uses descriptive analytics to review past data.
o Data Analytics: Involves advanced modeling, cleaning, and predictive
techniques.
 Visualization Tools:
o BI: Dashboards for summarizing and presenting data insights.
o Data Analytics: Focus on creating new insights and visualizing results
dynamically.
 Typical Progression:
o Companies often implement BI first to understand their business, then advance to
Data Analytics for deeper insights and actionable recommendations.
Data Analytics vs. Data Science
 Definition:
o Data Analytics: Analyzes data to extract meaningful insights aligned with
business objectives, solving specific questions or problems.
o Data Science: Explores raw data to uncover insights, often answering open-ended
questions using advanced algorithms, statistical models, and programming.
 Focus:
o Data Analytics: Focuses on visualization and decision-making for defined
problems.
o Data Science: Focuses on exploring and modeling raw data to create predictive or
prescriptive solutions.
 Scope:
o Data Analytics: A subset of data science, addressing specific data-related
questions.
o Data Science: Broader, including data analytics, machine learning, data mining,
and other disciplines.
 Methods:
o Data Analytics: Relies on statistical methods to interpret structured data.
o Data Science: Involves coding, machine learning, and advanced statistical
techniques.
 Objective:
o Data Analytics: Solves well-defined business problems.
o Data Science: Creates new methodologies and models to derive insights from raw
data.
 Commonality:
o Both extract insights to support business decisions, but data science is a broader,
more technical field encompassing analytics.
Business Analytics vs. Data Analytics
 Definition:
o Business Analytics: Uses strategies and technologies to analyze industry-specific
data and guide decision-making for business growth.
o Data Analytics: Transforms raw or unstructured data into meaningful formats for
insights, conclusions, and predictions.
 Focus:
o Business Analytics: Prescribes solutions and plans specific to a business based on
metrics.
o Data Analytics: Explains data patterns and visualizes results using statistical
methods.
 Scope:
o Business Analytics: Measures past performance and aligns metrics with business
planning.
o Data Analytics: Explores and models data to discover new insights.
 Relationship:
o Integration: Companies often combine both; data analytics results are tailored for
use in business analytics decisions.
Data Analyst vs. Data Scientist
 Purpose:
o Data Analyst: Answers existing business questions through data analysis and
visualization.
o Data Scientist: Creates questions, builds models, and predicts future trends using
advanced techniques.
 Focus:
o Data Analyst: Works on data preparation, exploratory analysis, and descriptive
analytics.
o Data Scientist: Develops statistical models, machine learning algorithms, and
prescriptive analytics.
 Skills:
o Data Analyst: Proficient in data visualization tools, statistical analysis, and
database management.
o Data Scientist: Skilled in Python, R, Hadoop, machine learning, and software
development.
 Responsibilities:
o Data Analyst: Prepares reports and visualizations for decision-making.
o Data Scientist: Designs systems and models to automate and optimize operations.
 Complexity:
o Data Analyst: Focuses on simpler, analysis-level insights.
o Data Scientist: Tackles complex problems and builds data models.
 Tools:
o Data Analyst: Uses tools like Excel, Tableau, and SQL for visualization and
analysis.
o Data Scientist: Employs programming languages (Python, R) and platforms for
advanced analytics.
 Outcome:
o Data Analyst: Delivers actionable insights for stakeholders.
o Data Scientist: Creates predictive and prescriptive solutions for long-term
strategies.

Dirty Data: Any data that requires cleaning or preparation before analysis. It
includes:

1. Missing Data:
o Example: Missing values in variables essential for analysis, like customer ages
when analyzing purchasing behavior.
2. Duplicate Data:
o Example: Multiple identical records due to merging data from different sources.
3. Inconsistent or Incorrect Data:
o Example: Structural errors, typos, or inconsistent naming, such as mixed labels
like "Pass/Fail" and "G/B" in the same dataset.
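
Before cleaning, it can help to count how much of each kind of dirty data is present. The sketch below is one way to do that on a small in-memory table; the records, the allowed label set, and the function name profile are hypothetical.

```python
from collections import Counter

# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "age": 34, "result": "Pass"},
    {"id": 2, "age": None, "result": "Fail"},
    {"id": 3, "age": 29, "result": "G"},
    {"id": 1, "age": 34, "result": "Pass"},  # exact duplicate of the first row
]


def profile(rows, allowed_labels):
    """Count missing values, duplicate rows, and unexpected labels."""
    missing = sum(1 for r in rows for v in r.values() if v is None)
    row_counts = Counter(tuple(sorted(r.items())) for r in rows)
    duplicates = sum(n - 1 for n in row_counts.values())
    bad_labels = sorted({r["result"] for r in rows} - set(allowed_labels))
    return {"missing": missing, "duplicates": duplicates, "bad_labels": bad_labels}


print(profile(records, allowed_labels={"Pass", "Fail"}))
```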

"Garbage In, Garbage Out" (GIGO): Incorrect data leads to incorrect results.

Foundation for Analysis: Clean data ensures meaningful, reliable, and long-lasting
analysis, similar to a strong foundation for a house.

Cost of Dirty Data: Poor data practices can lead to significant long-term expenses.

Dirty data such as duplicate data, missing data, and de

Goal: Properly cleaned data is essential for extracting accurate and actionable
insights.

Simplified Data Cleaning Methods:

1. Remove Duplicates:
o Filter and eliminate repeated data, often introduced during collection.
2. Delete Irrelevant Columns:
o Remove non-essential data (e.g., IDs or birthdates) that don’t contribute to the
analysis.
3. Handle Missing Data:
o Options:
 Delete rows with missing values.
 Impute missing values based on other data.
 Mark as "0" or "missing."
o Choose the method carefully, as it impacts analysis.
4. Remove Outliers:
o Identify and decide whether to keep or remove values that deviate significantly
(e.g., test scores far below/above the norm).
5. Correct Inconsistencies:
o Resolve issues like typos or irregular naming conventions using manual methods
(e.g., "Find and Replace" in Excel) or filters.
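
Three of the methods above (deleting irrelevant columns, imputing missing values, and correcting inconsistent labels) can be sketched as follows. The rows, the choice of median imputation, and the LABEL_MAP reading of "G"/"B" as pass/fail are assumptions for illustration only.

```python
from statistics import median

# Hypothetical student records; None marks a missing score.
rows = [
    {"student_id": "S1", "score": 80, "result": "Pass"},
    {"student_id": "S2", "score": None, "result": "G"},
    {"student_id": "S3", "score": 60, "result": "fail "},
    {"student_id": "S4", "score": 70, "result": "B"},
]

# Assumed mapping from inconsistent labels to one naming convention.
LABEL_MAP = {"pass": "Pass", "g": "Pass", "fail": "Fail", "b": "Fail"}


def clean(rows, drop=("student_id",)):
    known = [r["score"] for r in rows if r["score"] is not None]
    fill = median(known)  # impute missing scores with the median
    cleaned = []
    for r in rows:
        # Drop columns that do not contribute to the analysis.
        row = {k: v for k, v in r.items() if k not in drop}
        if row["score"] is None:
            row["score"] = fill
        # Normalize whitespace and case, then map to one label set.
        row["result"] = LABEL_MAP[row["result"].strip().lower()]
        cleaned.append(row)
    return cleaned


for row in clean(rows):
    print(row)
```

Median imputation is only one option; deleting the row or marking the value as missing would change the downstream results, so the choice should match the analysis.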

Simplified Steps to Cleanse a Dataset in Excel:

1. Remove Duplicates:
o Select all data.
o Create a table (Insert → Table).
o Go to Data → Remove Duplicates to delete duplicate rows.
2. Handle Missing Data:
o Remove Blank Rows:
 Select all data and sort columns (A → Z or Z → A).
 Locate and delete blank rows.
o Find and Remove Blank Cells:
 Select a specific column (e.g., column F).
 Apply a filter (Data → Filter).
 In the filter dropdown, uncheck "Select All" and check only "Blanks."
 Delete rows with blank cells, then clear the filter.
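
For comparison, the same two Excel steps can be approximated in code on a sheet exported as CSV. This is a sketch under assumptions: the sample data, the column name "city", and the function name cleanse are hypothetical.

```python
import csv
import io

# Hypothetical sheet exported as CSV: the second data row is a duplicate,
# one row is entirely blank, and one row has a blank "city" cell.
raw = """name,city,amount
Ana,Lima,10
Ana,Lima,10
,,
Ben,,7
Cara,Oslo,12
"""


def cleanse(text, required_column):
    reader = csv.DictReader(io.StringIO(text))
    seen, kept = set(), []
    for row in reader:
        key = tuple(row.values())
        if key in seen:  # remove duplicate rows
            continue
        seen.add(key)
        if not row[required_column].strip():  # drop rows with a blank cell
            continue
        kept.append(row)
    return kept


for row in cleanse(raw, "city"):
    print(row)
```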

Frequency Distribution: Shows how often a particular value occurs in a dataset.

Measures of Central Tendency: Include the mean, median, and mode, which estimate the
middle or average values.

Measures of Variability: Include range, standard deviation, and variance, which describe the
spread or variability in the dataset.
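
The three groups of descriptive statistics above can be computed with Python's standard library. The scores list is hypothetical, and the population forms of variance and standard deviation are used here as an assumption; sample forms would give slightly larger values.

```python
from collections import Counter
from statistics import mean, median, mode, pstdev, pvariance

scores = [70, 80, 80, 90, 100]  # hypothetical test scores

frequency = Counter(scores)  # how often each value occurs
central = {"mean": mean(scores), "median": median(scores), "mode": mode(scores)}
spread = {
    "range": max(scores) - min(scores),
    "variance": pvariance(scores),  # population variance
    "std_dev": pstdev(scores),      # population standard deviation
}

print(frequency)
print(central)
print(spread)
```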

A pivot table summarizes large amounts of data by grouping it in meaningful ways (e.g., by sum
or average).
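
What a pivot table does can be sketched as a group-and-aggregate step. The sales rows and the function name pivot are hypothetical, and a spreadsheet pivot table offers more (row and column layouts, filters) than this minimal version.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, amount).
sales = [
    ("North", "A", 100), ("North", "B", 50),
    ("South", "A", 70), ("South", "A", 30), ("South", "B", 20),
]


def pivot(rows, agg):
    """Group amounts by (region, product) and aggregate each group."""
    groups = defaultdict(list)
    for region, product, amount in rows:
        groups[(region, product)].append(amount)
    return {key: agg(vals) for key, vals in groups.items()}


totals = pivot(sales, sum)
averages = pivot(sales, lambda vals: sum(vals) / len(vals))
print(totals)
print(averages)
```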
