Document from Srishti
1|KOMA069310
Unit 1
Introduction
Topics to be covered
Data and Data Science;
Data analytics and data analysis, Classification of
Analytics, Application of analytics in business, Types of
data: nominal, ordinal, scale;
Big Data and its characteristics, Applications of Big
data;
Challenges in data analytics;
What is Data?
Data refers to raw facts and figures collected from various sources. It can be
quantitative (numbers, statistics) or qualitative (descriptions, observations).
In business, data might include sales numbers, customer feedback, website
traffic, or social media interactions.
1. Data Collection
o Gathering data from different sources, including databases,
sensors, websites, and surveys.
o Example: A retail store collects sales data from its POS (Point of
Sale) system.
2. Data Cleaning
o Removing errors, duplicates, and missing values to ensure high-
quality data.
o Example: If customer records contain multiple spellings of the
same name, cleaning ensures consistency.
3. Data Processing
o Organizing and transforming raw data into a structured format for
analysis.
o Example: Converting transaction records into a readable table
format.
4. Data Analysis
o Applying statistical and analytical techniques to understand
patterns and relationships in data.
o Example: Analyzing customer demographics to determine target
markets.
5. Data Visualization
o Representing data through graphs, charts, and dashboards to
communicate insights effectively.
o Example: A sales performance dashboard showing trends over
time.
6. Machine Learning and AI
o Using algorithms to allow computers to learn from data and make
predictions.
o Example: Netflix using machine learning to recommend shows
based on viewing history.
7. Decision Making
o Using insights from data science to guide business strategies.
o Example: A marketing team using data to decide which
advertisements perform best.
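The first few steps above (collection, cleaning, processing, analysis) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the records, field names, and the rule that identical customer, product, and amount means a duplicate are all assumptions made for the example.

```python
from collections import defaultdict

# Collection: hypothetical POS records; all names and amounts are made up.
raw_records = [
    {"customer": "Asha Rao", "product": "Shirt", "amount": 500.0},
    {"customer": "asha rao ", "product": "Shirt", "amount": 500.0},  # same sale, inconsistent spelling
    {"customer": "Vikram Shah", "product": "Lamp", "amount": None},  # missing value
    {"customer": "Meera Iyer", "product": "Lamp", "amount": 1200.0},
    {"customer": "Vikram Shah", "product": "Shirt", "amount": 450.0},
]


def clean(records):
    """Cleaning: normalize names, drop rows with missing amounts and duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec["amount"] is None:
            continue  # skip rows with a missing amount
        # Collapse extra whitespace and use consistent capitalization.
        name = " ".join(rec["customer"].split()).title()
        # Assumption: same customer, product, and amount is treated as a duplicate.
        key = (name, rec["product"], rec["amount"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"customer": name, "product": rec["product"], "amount": rec["amount"]})
    return cleaned


def revenue_by_product(records):
    """Processing and analysis: total revenue per product."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["product"]] += rec["amount"]
    return dict(totals)


cleaned = clean(raw_records)
totals = revenue_by_product(cleaned)
best_seller = max(totals, key=totals.get)  # product with the highest revenue
print(totals, best_seller)
```

In practice a library such as pandas would usually handle these steps; plain Python is used here only to keep the sketch self-contained.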
2. Finance and Banking
Data Analysis
Data Analysis is the process of inspecting, cleaning, transforming, and
modelling data to discover useful information, patterns, trends, and
relationships. It helps in making data-driven decisions.
Data Analytics
Data Analytics is the broader field that involves using technology, statistics,
and machine learning to analyze data and gain actionable business insights.
Data Analytics is generally categorized into the same four types as Data
Analysis, but it also includes real-time and automated analytics.
summarizing data. predictive models.
Techniques Used
o Data Analysis: Statistical methods, descriptive & diagnostic analysis.
o Data Analytics: AI, machine learning, real-time processing, predictive & prescriptive analytics.
Tools
o Data Analysis: Excel, SQL, Tableau, Python (basic stats).
o Data Analytics: Hadoop, Spark, AI, Cloud Platforms, Advanced BI Tools.
Example
o Data Analysis: Analyzing past sales trends to understand customer buying patterns.
o Data Analytics: Predicting future customer behavior and automating personalized marketing.
Classification of Analytics
Analytics is classified into different types based on its purpose and the type
of insights it provides. The four main types of analytics are:
Key Features:
Examples in Business:
Tools Used:
Use Case:
Definition:
Key Features:
Examples in Business:
Tools Used:
Use Case:
Definition:
Key Features:
Examples in Business:
Tools Used:
Use Case:
Definition:
Key Features:
Examples in Business:
Tools Used:
Use Case:
Types of Data
1. Nominal Data (Categorical Data)
Nominal data refers to data that consists of categories or labels that do not
have any intrinsic order or ranking. This is the simplest form of data.
Characteristics:
o No order or ranking: The categories do not have a logical order.
o Qualitative: It's used to classify data into distinct groups or
categories.
Examples:
o Gender (Male, Female, Other)
o Types of products (Electronics, Clothing, Furniture)
o Colors of cars (Red, Blue, Black)
o Customer ID numbers
o Blood type (A, B, O, AB)
Analysis: Nominal data is typically analyzed using frequency counts
(how many data points fall into each category). Measures such as mode
(the most frequent category) are commonly used.
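As a small illustration of frequency counts and the mode, the sketch below uses Python's standard `collections.Counter`. The list of product categories is made up for the example.

```python
from collections import Counter

# Hypothetical nominal data: product category of six purchases (made-up values).
categories = ["Electronics", "Clothing", "Furniture", "Clothing", "Electronics", "Clothing"]

# Frequency count: how many data points fall into each category.
frequency = Counter(categories)

# Mode: the most frequent category and its count.
mode_category, mode_count = frequency.most_common(1)[0]

print(dict(frequency))
print(mode_category, mode_count)
```

Note that no average is computed: with nominal labels there is nothing to add or order, so counts and the mode are the natural summaries.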
2. Ordinal Data
Ordinal data refers to categories that have a meaningful order or ranking, but
the intervals between the categories are not uniform or precisely
measurable.
Characteristics:
o Order or ranking: The categories have a specific order, but the
distance between the categories is not equal.
o Qualitative: Still considered qualitative data, but with a defined
sequence.
Examples:
o Customer satisfaction ratings (Very Unsatisfied, Unsatisfied,
Neutral, Satisfied, Very Satisfied)
o Educational level (High School, Undergraduate, Graduate,
Postgraduate)
o Military ranks (Private, Sergeant, Captain, General)
o Levels of service (Basic, Standard, Premium)
Analysis: Ordinal data can be analyzed by comparing rankings. The
median or mode is typically used, but mean values are not appropriate
due to the uneven intervals between categories. Non-parametric tests
(such as the Kruskal-Wallis test) are often used for analysis.
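The median and mode of ordinal data can be found by mapping each category to its position in the ordering. The sketch below is one way to do this with Python's standard `statistics` module; the responses are made up, and the rank numbers are used only for ordering, never averaged.

```python
import statistics

# The satisfaction scale from the example above, listed from lowest to highest.
SCALE = ["Very Unsatisfied", "Unsatisfied", "Neutral", "Satisfied", "Very Satisfied"]
RANK = {label: position for position, label in enumerate(SCALE)}

# Hypothetical survey responses (made-up values).
responses = ["Satisfied", "Neutral", "Very Satisfied", "Satisfied", "Unsatisfied"]

# Convert labels to rank positions; these encode order only, not distance.
ranks = [RANK[r] for r in responses]

# median_low always returns an actual data value, so it maps back to a real category.
median_label = SCALE[statistics.median_low(ranks)]
mode_label = statistics.mode(responses)

print(median_label, mode_label)
```

Using `median_low` rather than `median` is a design choice for this sketch: with an even number of responses, `median` would average two ranks, which assumes equal spacing between categories.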
3. Scale Data (Interval and Ratio Data)
Scale data is quantitative and includes both interval data and ratio data,
which are higher levels of measurement than nominal or ordinal data. Scale
data supports arithmetic: addition and subtraction are meaningful for both
interval and ratio data, while multiplication and division are meaningful
only for ratio data.
Interval Data
Interval data has ordered values with meaningful differences between them,
but it lacks a true zero point (i.e., zero doesn't mean the absence of the
quantity).
Characteristics:
o Ordered and measurable: The values have a specific order, and
the differences between values are meaningful.
o No true zero: The zero point is arbitrary (e.g., a temperature of 0°C
does not mean there is no temperature).
Examples:
o Temperature in Celsius or Fahrenheit (e.g., 20°C, 30°C, 40°C)
o Calendar dates (e.g., 2020, 2021, 2022)
o IQ scores
Analysis: You can calculate mean, median, and standard deviation.
However, because there is no true zero, you cannot compute ratios like
"twice as much."
Ratio Data
Ratio data is similar to interval data but has a true zero point, meaning zero
represents the complete absence of the quantity.
Characteristics:
o Ordered, measurable, and has a true zero: The presence of a true
zero allows for meaningful ratios and all mathematical operations.
o Absolute zero: Zero means the complete absence of the quantity,
making it a true zero point.
Examples:
o Sales revenue ($0, $500, $1000, etc.)
o Weight (0 kg means no weight)
o Height (0 cm means no height)
o Age (0 years means no age)
o Distance (0 meters means no distance)
Analysis: All statistical measures can be used, including mean, median,
standard deviation, and ratios (e.g., twice as much, three times as
large). It also supports operations like multiplication and division.
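The difference between interval and ratio data can be checked with a short calculation. The sketch below uses made-up revenue figures and the Celsius values from the interval example; the Kelvin conversion is included only to show why a ratio of Celsius readings is not meaningful.

```python
import statistics

# Ratio data: hypothetical sales revenue in dollars (made-up values).
revenue = [500.0, 1000.0, 1500.0]
mean_revenue = statistics.mean(revenue)
revenue_ratio = revenue[1] / revenue[0]  # $1000 really is twice $500

# Interval data: temperatures in Celsius.
celsius_low, celsius_high = 20.0, 40.0
difference = celsius_high - celsius_low   # differences are meaningful
naive_ratio = celsius_high / celsius_low  # computes to 2.0, but is not meaningful


def celsius_to_kelvin(celsius):
    """Convert to Kelvin, a temperature scale with a true zero."""
    return celsius + 273.15


# On a scale with a true zero, 40 degrees C is only about 7% hotter than 20 degrees C.
kelvin_ratio = celsius_to_kelvin(celsius_high) / celsius_to_kelvin(celsius_low)

print(mean_revenue, revenue_ratio)
print(difference, naive_ratio, round(kelvin_ratio, 3))
```

The revenue ratio stays the same in any currency unit, whereas the temperature ratio changes with the choice of scale, which is what the absence of a true zero implies.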
Big Data
Big Data refers to extremely large and complex datasets that cannot be
processed using traditional data processing methods due to their volume,
variety, and velocity. It is often used in business analytics to uncover hidden
patterns, correlations, and trends to inform decision-making.
Big Data is typically defined by the following five characteristics, often
referred to as the 5Vs (the original 3Vs of volume, velocity, and variety,
extended with veracity and value):
1. Volume
2. Velocity
3. Variety
4. Veracity
5. Value
Description: Value refers to the usefulness of big data. Not all data is
valuable, and the goal of big data analytics is to extract meaningful
insights that can be leveraged to drive business decisions and
strategies.
Examples:
o Identifying new business opportunities by analyzing consumer
behavior patterns.
o Improving customer satisfaction through predictive analytics.
Significance: The ultimate goal of big data is to create value by deriving
actionable insights. Businesses need to focus on extracting valuable
knowledge from large datasets to achieve growth and improve
operational efficiency.
Big data has numerous applications across various industries. Here are
some of the key areas where big data is making a significant impact:
1. Healthcare
3. Financial Services
4. Manufacturing
o Urban Planning: Analyzing traffic, energy usage, and population
data to improve city infrastructure.
o Public Safety: Using data to predict and prevent crime or respond
quickly to emergencies.
o Environmental Monitoring: Tracking climate data to inform policy
decisions related to environmental protection.
Impact: Big data enables governments to provide better services,
increase transparency, and make more informed decisions.
Impact: Without effective data integration, businesses may struggle to
gain a comprehensive view of operations, hindering effective analysis
and decision-making.
5. Data Interpretation and Analysis
4. What are the different types of data? Explain Nominal, Ordinal, and Scale
data with examples.
5. What is Big Data? Discuss its characteristics and explain how it differs
from traditional data.
7. What are the main challenges faced in Data Analytics? How can
businesses overcome these challenges?
10. What are the key technologies used in Data Science and Big Data
Analytics? Discuss their applications in modern business practices.