UNIT-I Data Science
UDITC04-DATA SCIENCE
Unit-I INTRODUCTION TO DATA SCIENCE
12/27/2024
Unit-I Asst.Professor,Dept of EEE
Content
Department of EEE, Academy of Maritime Education and Training, Deemed to be University, Chennai 12/27/2024
Where is Data Science Needed?
Data Science can be applied in nearly every part of a business where data
is available.
Examples are
Consumer goods
Stock markets
Industry
Politics
Logistics companies
E-commerce
How Does a Data Scientist Work?
Statistics
Programming (Python or R)
Mathematics
Databases
A Data Scientist must find patterns within the data. Before finding the patterns, the data must be organized in a standard format.
Here is how a Data Scientist works:
Ask the right questions - To understand the business problem.
Explore and collect data - From databases, web logs, customer feedback, etc.
Extract the data - Transform the data to a standardized format.
Clean the data - Remove erroneous values from the data.
Find and replace missing values - Check for missing values and replace them with a suitable value (e.g. an average value).
Normalize data - Scale the values to a practical range (e.g. 140 cm is smaller than 1.8 m, yet the number 140 is larger than 1.8, so scaling is important).
Analyze data, find patterns and make future predictions.
Represent the result - Present the result with useful insights in a way the "company" can understand.
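The extract, missing-value, and normalize steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the height values, the `to_cm` helper, and the choice of min-max scaling are hypothetical and not taken from the slides.

```python
from statistics import mean

# Hypothetical raw heights with mixed units and one missing value (None).
raw_heights = [("140", "cm"), ("1.8", "m"), (None, "cm"), ("165", "cm")]

def to_cm(value, unit):
    """Convert a height to centimetres; keep missing values as None."""
    if value is None:
        return None
    number = float(value)
    return number * 100 if unit == "m" else number

# Extract: bring every value to the same unit (a standardized format).
heights_cm = [to_cm(v, u) for v, u in raw_heights]

# Find and replace missing values with the average of the known values.
known = [h for h in heights_cm if h is not None]
average = mean(known)
filled = [h if h is not None else average for h in heights_cm]

# Normalize: min-max scaling maps the smallest value to 0 and the largest to 1.
low, high = min(filled), max(filled)
normalized = [(h - low) / (high - low) for h in filled]
```

After the unit conversion, 1.8 m correctly ranks above 140 cm, which is the point of the scaling remark above.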
What is Data?
Structured Data
How to Structure Data?
Database Table
Database Table Structure
A row is a horizontal representation of data.
A column is a vertical representation of data.
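As a small illustration of rows and columns, a table can be modelled in Python as a list of records. The names and values below are hypothetical sample data, not from the slides.

```python
# A hypothetical table: each dict is one row, each key is one column.
table = [
    {"name": "Asha", "age": 21, "city": "Chennai"},
    {"name": "Ravi", "age": 23, "city": "Madurai"},
    {"name": "Meena", "age": 22, "city": "Chennai"},
]

# A row: read horizontally, one record with all of its fields.
second_row = table[1]

# A column: read vertically, one field across all records.
age_column = [row["age"] for row in table]
```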
Need for Data Science:
In today's world, data is becoming vast: approximately 2.5 quintillion bytes of data are generated every day, which has led to a data explosion. Researchers estimated that by 2020, 1.7 MB of data would be created every second for every person on earth. Every company requires data to work, grow, and improve its business.
Handling such a huge amount of data is a challenging task for every organization. To handle, process, and analyze it, we require complex, powerful, and efficient algorithms and technology, and that technology came into existence as data science. Following are some main reasons for using data science technology:
With the help of data science technology, we can convert massive amounts of raw and unstructured data into meaningful insights.
Data science technology is being adopted by various companies, whether big brands or startups. Google, Amazon, Netflix, etc., which handle huge amounts of data, use data science algorithms for a better customer experience.
Data science is helping to automate transportation, for example by creating self-driving cars, which are seen as the future of transportation.
Data science can help with different kinds of predictions, such as surveys, elections, flight ticket confirmation, etc.
Subsets of Data Science
Characteristics of Data Science
1. Business Understanding
This is the most important characteristic: unless you understand the business, you cannot build a good model, even if you have good knowledge of machine learning algorithms or statistical skills. A data scientist needs to understand the business requirements and develop analytics according to them, so domain knowledge of the business is also important.
2. Intuition
Although the math involved is proven and foundational, a data scientist needs to pick the right model with the right accuracy, because not all models give the same results. A data scientist needs to sense when a model is ready for production deployment, and also needs the intuition to know when the production model has become stale and needs refactoring to respond to a changing business environment.
3. Curiosity
Data science is not a new field. It has existed for some time, but progress in the field is now very fast. New methods to solve familiar problems are being developed constantly, so for a data scientist the curiosity to learn emerging technologies is very important.
Challenges of Data Science Technology
Data Scientists are data enthusiasts who gather and analyze large
sets of structured and unstructured data. A data scientist's role
combines computer science, statistics, and mathematics. They
analyze, process, and model data and later interpret the results to
create actionable plans for companies and organizations.
Data Scientists are analytical experts who utilize their skills both in
technology and social science to find trends and manage data. They
use their industry knowledge and context-specific understanding to
find solutions to business challenges.
Business Intelligence Analyst
Data Engineer
Data Architect
Senior Data Scientist
Concept of Data science
1. Statistics: Statistics is one of the most important components of data science. Statistics is a way to collect and analyze large amounts of numerical data and to find meaningful insights in it.
2. Domain Expertise: Domain expertise binds data science together. It means specialized knowledge or skills in a particular area. In data science, there are various areas for which we need domain experts.
3. Data engineering: Data engineering is the part of data science that involves acquiring, storing, retrieving, and transforming the data. It also includes adding metadata (data about data) to the data.
4. Visualization: Data visualization means representing data in a visual context so that people can easily understand its significance. It makes huge amounts of data easy to take in visually.
5. Advanced computing: Advanced computing does the heavy lifting of data science. It involves designing, writing, debugging, and maintaining the source code of computer programs.
6. Mathematics: Mathematics is a critical part of data science. It involves the study of quantity, structure, space, and change. For a data scientist, a good knowledge of mathematics is essential.
7. Machine learning: Machine learning is the backbone of data science. It is about training a machine so that it can act like a human brain. In data science, we use various machine learning algorithms to solve problems.
Data Science Lifecycle
Discovery
In the data preparation phase, you require an analytical sandbox in which you can perform analytics for the entire duration of the project. You need to explore, preprocess and condition data prior to modeling. Further, you will perform ETLT (extract, transform, load and transform) to get data into the sandbox. Let’s have a look at the Statistical Analysis flow below.
You can use R for data cleaning, transformation, and visualization. This will help you to spot outliers and establish relationships between the variables. Once you have cleaned and prepared the data, it’s time to do exploratory analytics on it. Let’s see how you can achieve that.
Model planning:
Here, you will determine the methods and techniques to draw the relationships between variables. These relationships will set the base for the algorithms which you will implement in the next phase. You will apply Exploratory Data Analysis (EDA) using various statistical formulas and visualization tools.
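One statistical formula commonly used to examine the relationship between two variables is the Pearson correlation coefficient. The sketch below computes it by hand; the `pearson` helper and the study-hours data are hypothetical, chosen only to illustrate the idea.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: hours studied versus exam score.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 68, 74]

r = pearson(hours_studied, exam_score)
```

A value of `r` close to 1 suggests a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.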
R has a complete set of modeling capabilities and provides a good environment for
building interpretive models.
SQL Analysis services can perform in-database analytics using common data
Model building:
In this phase, you will develop datasets for training and testing purposes. Here you need to consider whether your existing tools will suffice for running the models or whether you will need a more robust environment (such as fast and parallel processing). You will analyze various learning techniques like classification, association and clustering to build the model.
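Creating the training and testing datasets usually means splitting the available records into two disjoint parts. A minimal sketch follows, assuming a simple random split; the `train_test_split` helper here is hypothetical and written from scratch, not a library function.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and split it into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset of 20 records, represented here by their IDs.
records = list(range(20))
train, test = train_test_split(records)
```

The model is then fitted on `train` only, and `test` is held back to check how well it generalizes to unseen data.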
You can achieve model building through the following tools.
Operationalize:
1. Discovery: The first phase is discovery, which involves asking the right questions. When you start any data science project, you need to determine the basic requirements, priorities, and project budget. In this phase, we determine all the requirements of the project, such as the number of people, technology, time, data, and the end goal, and then we can frame the business problem at a first-hypothesis level.
2. Data preparation: Data preparation is also known as data munging. In this phase, we need to perform the following tasks:
   1. Data cleaning
   2. Data reduction
   3. Data integration
   4. Data transformation
After performing all the above tasks, we can easily use this data for our further processes.
3. Model Planning: In this phase, we need to determine the various methods and techniques to establish the relations between input variables. We apply Exploratory Data Analysis (EDA) using various statistical formulas and visualization tools to understand the relations between variables and to see what the data can tell us. Common tools used for model planning are:
   1. SQL Analysis Services
   2. R
   3. SAS
   4. Python
4. Model-building: In this phase, the process of model building starts. We create datasets for training and testing purposes and apply different techniques, such as association, classification, and clustering, to build the model.
Following are some common model-building tools:
   1. SAS Enterprise Miner
   2. WEKA
   3. SPSS Modeler
   4. MATLAB
5. Operationalize: In this phase, we deliver the final reports of the project, along with briefings, code, and technical documents. This phase gives a clear overview of the complete project performance and other components on a small scale before full deployment.
6. Communicate results: In this phase, we check whether we have reached the goal that we set in the initial phase. We communicate the findings and the final result to the business team.
HISTORY
History of Data Science
1962 – Inception
a. The Future of Data Analysis – In 1962, John W. Tukey wrote “The Future of Data Analysis”, where he first argued for the importance of data analysis as a science rather than as a branch of mathematics.
1974
a. Concise Survey of Computer Methods – In 1974, Peter Naur published the “Concise Survey of Computer Methods”, which surveys the contemporary methods of data processing in various applications.
1974 – 1980
a. International Association for Statistical Computing – In 1977, the association was formed with the purpose of linking traditional statistical methodology with modern computer technology to extract useful information and knowledge from data.
1980-1990
a. Knowledge Discovery in Databases – In 1989, Gregory Piatetsky-Shapiro chaired the first Knowledge Discovery in Databases (KDD) workshop, which later went on to become the annual conference on knowledge discovery and data mining.
1990-2000
a. Database Marketing – In 1994, BusinessWeek published a cover story explaining how big organizations were using customer data to predict how likely a customer was to buy a specific product, much like targeted ads work in modern social media campaigns.
b. International Federation of Classification Societies – In 1996, the term “Data Science” was used for the first time in a conference, held in Japan.
2000-2010
a. Data Science – An Action Plan for Expanding the Technical Areas of the Field of Statistics – In 2001, William S. Cleveland published this action plan, which focused on the major areas of technical work in the field of statistics and proposed the name Data Science for the expanded field.
b. Statistical Modeling – The Two Cultures – In 2001, Leo Breiman wrote “There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown”.
c. Data Science Journal – April 2002 saw the launch of a journal that focused on the management of data and databases in science and technology.
2010-Present
a. Data Everywhere – In February 2010, Kenneth Cukier wrote a special report for The Economist that said a new kind of professional has emerged: the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data.
b. What is Data Science? – In June 2010, Mike Loukides described data science as combining entrepreneurship with patience, the willingness to build data products incrementally, the ability to explore, and the ability to iterate over a solution.
Tools for Data Science
AWS Redshift
Data Visualization tools: R, Jupyter, Tableau, Cognos.
APPLICATIONS OF DATA SCIENCE
Applications of Data Science:
Internet search:
When we want to search for something on the internet, we use search engines such as Google, Yahoo, Bing, Ask, etc. All these search engines use data science technology to make the search experience better, so that you get a search result within a fraction of a second.
Transport:
Transport industries are also using data science technology to create self-driving cars. Self-driving cars are expected to make it easier to reduce the number of road accidents.
Healthcare:
In the healthcare sector, data science is providing many benefits. It is being used for tumor detection, drug discovery, medical image analysis, virtual medical bots, etc.
Recommendation systems:
Many companies, such as Amazon, Netflix, Google Play, etc., use data science technology to create a better user experience with personalized recommendations. For example, when you search for something on Amazon and start getting suggestions for similar products, this is because of data science technology.
Risk detection:
Finance industries have always had issues with fraud and the risk of losses, but with the help of data science these can be reduced.
Most finance companies are looking for data scientists to avoid risk and losses while increasing customer satisfaction.
Traits of Big data
Big data is a collection of data from many different sources and is often described by five characteristics: volume, value, variety, velocity, and veracity.
Big Data refers to amounts of data too large to be processed by traditional data storage or processing units. It is used by many multinational companies to process the data and business of many organizations. The data flow would exceed 150 exabytes per day before replication.
5 V's of Big Data
Volume
Veracity
Variety
Value
Velocity
Volume
Variety
The data is categorized as below:
Structured data: Structured data has a defined schema with all the required columns. It is in tabular form and is stored in a relational database management system. OLTP (Online Transaction Processing) systems are built to work with structured data, which is stored in relations, i.e., tables.
Semi-structured: In semi-structured data, the schema is not strictly defined, e.g., JSON, XML, CSV, TSV, and email.
Unstructured Data: Unstructured files such as log files, audio files, and image files are included in unstructured data. Some organizations have a lot of data available but do not know how to derive value from it, since the data is raw.
Quasi-structured Data: This is textual data with inconsistent formats that can be formatted with some effort, time, and tools.
Example: Web server logs, i.e., log files created and maintained by a server, containing a list of activities.
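To illustrate how quasi-structured data can be brought into a structured form, the sketch below parses web server log lines with a regular expression. The log lines and the pattern are assumptions modelled on a common access-log layout; real servers may use different formats.

```python
import re

# Hypothetical log lines in a common access-log layout (an assumption).
log_lines = [
    '192.0.2.10 - - [27/Dec/2024:10:15:32 +0530] "GET /index.html HTTP/1.1" 200 1043',
    '192.0.2.11 - - [27/Dec/2024:10:15:40 +0530] "POST /login HTTP/1.1" 302 0',
    'malformed line that does not match',
]

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)

rows = []
for line in log_lines:
    match = LOG_PATTERN.match(line)
    if match is None:
        continue  # skip lines that do not fit the assumed format
    row = match.groupdict()
    row["status"] = int(row["status"])
    row["size"] = int(row["size"])
    rows.append(row)
```

Each entry in `rows` now has the same named fields, so the result could be loaded into a table like any other structured data.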
Veracity
Value
Velocity
WEB SCRAPING
What is web scraping?
In this context, the word ‘scraping’ means getting something from the web. Two questions arise here: what can we get from the web, and how do we get it?
The answer to the first question is ‘data’. Data is indispensable for any programmer, and a basic requirement of every programming project is a large amount of useful data.
The answer to the second question is a bit tricky, because there are many ways to get data. In general, we may get data from a database, a data file, or other sources. But what if we need a large amount of data that is available online? One way to get such data is to search manually (clicking away in a web browser) and save (copy-pasting into a spreadsheet or file) the required data. This method is quite tedious and time consuming. Another way to get such data is web scraping.
Uses of Web Scraping
The uses of and reasons for web scraping are as endless as the uses of the World Wide Web. Web scrapers can do many things a human can do, such as ordering food online, scanning an online shopping website for you, or buying tickets for a match the moment they become available. Some of the important uses of web scraping are discussed here:
E-commerce Websites: Web scrapers can collect data, especially the price of a specific product, from various e-commerce websites for comparison.
Content Aggregators: Web scraping is widely used by content aggregators, like news aggregators and job aggregators, to provide updated data to their users.
Marketing and Sales Campaigns: Web scrapers can be used to get data such as emails, phone numbers, etc. for sales and marketing campaigns.
Search Engine Optimization (SEO): Web scraping is widely used by SEO tools like SEMRush, Majestic, etc. to tell businesses how they rank for the search keywords that matter to them.
Data for Machine Learning Projects: Retrieval of data for machine learning projects often depends on web scraping.
Data for Research: Researchers can collect useful data for their research work and save time through this automated process.
Components of a Web Scraper
Web Crawler Module: The web crawler module is an essential component of a web scraper. It navigates the target website by making HTTP or HTTPS requests to the URLs. The crawler downloads the unstructured data (HTML contents) and passes it to the extractor, the next module.
Extractor: The extractor processes the fetched HTML content and extracts the data into a semi-structured format. It is also called a parser module and uses different parsing techniques, such as regular expressions, HTML parsing, DOM parsing, or artificial intelligence.
Data Transformation and Cleaning Module: The data extracted above is not yet suitable for use. It must pass through a cleaning module first. Methods like string manipulation or regular expressions can be used for this purpose. Note that extraction and transformation can also be performed in a single step.
Storage Module: After extracting the data, we need to store it as per our requirement. The storage module outputs the data in a standard format that can be stored in a database or in JSON or CSV format.
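The extractor, cleaning, and storage modules described above can be sketched with Python's standard library alone. This is a minimal sketch under stated assumptions: the HTML snippet, the `ProductExtractor` class, and the `clean` helper are hypothetical, and the crawler step is replaced by a hard-coded string so the example runs without network access (a real crawler would download the page, for example with `urllib.request.urlopen`).

```python
import csv
import io
from html.parser import HTMLParser

# Stand-in for the crawler's output (hypothetical page content).
HTML = """
<ul>
  <li class="product"><span class="name"> Laptop </span><span class="price">Rs. 45,000</span></li>
  <li class="product"><span class="name">Phone</span><span class="price">Rs. 12,500 </span></li>
</ul>
"""

class ProductExtractor(HTMLParser):
    """Extractor: collects the text of <span class="name"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._field = None  # name of the span currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "product":
            self.records.append({})
        elif tag == "span" and attrs.get("class") in ("name", "price"):
            self._field = attrs["class"]

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

    def handle_data(self, data):
        if self._field and self.records:
            self.records[-1][self._field] = data

def clean(record):
    """Transformation and cleaning: trim whitespace, turn the price into an integer."""
    digits = "".join(ch for ch in record["price"] if ch.isdigit())
    return {"name": record["name"].strip(), "price": int(digits)}

extractor = ProductExtractor()
extractor.feed(HTML)
products = [clean(r) for r in extractor.records]

# Storage: write the cleaned records as CSV (to an in-memory buffer here).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
writer.writeheader()
writer.writerows(products)
csv_text = buffer.getvalue()
```

Keeping the modules as separate functions and classes mirrors the component breakdown above, so each stage can be changed without touching the others.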
Working of a Web Scraper
Why Python for Web Scraping?
DATA ANALYSIS
What is Data Analysis?
Data Analysis Tools
Types of Data Analysis: Techniques and Methods
Text Analysis
Text Analysis is also referred to as Data Mining. It is a method of data analysis used to discover patterns in large data sets using databases or data mining tools. It is used to transform raw data into business information. Business Intelligence tools available in the market are used to take strategic business decisions. Overall, it offers a way to extract and examine data, derive patterns, and finally interpret the data.
Statistical Analysis
Statistical Analysis shows “What happened?” by using past data in the form of dashboards. It includes the collection, analysis, interpretation, presentation, and modeling of data. It analyses a set of data or a sample of data. There are two categories of this type of analysis: Descriptive Analysis and Inferential Analysis.
Descriptive Analysis
Descriptive Analysis analyses complete data or a sample of summarized numerical data. It shows the mean and deviation for continuous data, and the percentage and frequency for categorical data.
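A short sketch of descriptive analysis in Python follows: mean and standard deviation for a continuous variable, frequency and percentage for a categorical one. The sample values are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical sample: one continuous and one categorical variable.
ages = [21, 23, 22, 25, 24]
cities = ["Chennai", "Madurai", "Chennai", "Salem", "Chennai"]

# Continuous data: mean and (sample) standard deviation.
age_mean = mean(ages)
age_std = stdev(ages)

# Categorical data: frequency and percentage of each category.
frequency = Counter(cities)
percentage = {city: 100 * count / len(cities) for city, count in frequency.items()}
```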
Inferential Analysis
analyses sample from complete data. In this type of Analysis, you can find
Diagnostic Analysis
Diagnostic Analysis shows “Why did it happen?” by finding the cause from
Predictive Analysis
Predictive Analysis shows “what is likely to happen” by using previous data. A simple example: if last year I bought two dresses based on my savings, and this year my salary doubles, then I can buy four dresses. Of course it is not that easy, because you have to think about other circumstances: the prices of clothes may increase this year, or instead of dresses you may want to buy a new bike, or you may need to buy a house!
So this analysis makes predictions about future outcomes based on current or past data. A forecast is just an estimate. Its accuracy depends on how much detailed information you have and how deeply you dig into it.
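One very simple way to predict from past data is to fit a straight line through it and extend the line into the future. The sketch below uses an ordinary least-squares fit; the `fit_line` helper and the yearly figures are hypothetical, and a straight-line trend is an assumption that often will not hold in practice.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    return slope, my - slope * mx

# Hypothetical past data: dresses bought per year.
years = [2020, 2021, 2022, 2023]
dresses_bought = [1, 2, 2, 3]

slope, intercept = fit_line(years, dresses_bought)
forecast_2024 = slope * 2024 + intercept  # extend the fitted trend by one year
```

As the text notes, such a forecast is only an estimate: it ignores price changes and every other circumstance not captured in the past data.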
Prescriptive Analysis
Prescriptive Analysis combines the insights from all the previous analyses to determine which action to take on a current problem or decision. Most data-driven companies use Prescriptive Analysis because predictive and descriptive analysis alone are not enough to improve data performance. Based on current situations and problems, they analyze the data and make decisions.
Data Analysis Process
The Data Analysis Process is nothing but gathering information by using a proper
application or tool which allows you to explore the data and find a pattern in it. Based
on that information and data, you can make decisions, or you can get ultimate
conclusions.
Data Analysis consists of the following phases:
Data Requirement Gathering
Data Collection
Data Cleaning
Data Analysis
Data Interpretation
Data Visualization
DATA REPORTING
What is Data reporting?
Data reporting helps you track what’s happening to your business and evaluate its
performance. It’s the process of collecting, merging, and visualizing raw data from all
available sources. Most often, it's presented in the form of tables, graphs, or charts.
Also, you shouldn’t forget that:
In most cases, data-based records show only historical data, so you see an assessment
context (at least consider the industry and niche in which your company works).
How to write a data report
By analyzing data, you can make informed decisions and test working
hypotheses. You can specify where your company’s resources go, what
progress has been made, and what your company should pay attention
to most.
Want to know how to write a data analysis report? Let’s look at the steps
you need to take to create what perfectly fits your company:
Step 1. Determine the report’s purpose and which specific questions it
should answer. Different specialists need different reports, and, of course,
they need answers to different questions. And too many questions in one
dashboard can overload it.
Step 2. Define metrics and data sources. Once you’ve decided on the
questions you want to answer and the specialists that will use the
dashboard, you should highlight critical metrics. Also, you should
determine what information is needed to build reports and what
sources should be connected to those reports.
Step 3. Make sure data collection works correctly. You must be sure
of the quality of your data to make informed decisions. Make sure
that information is collected accurately and without errors. Also, note
that your attribution model should be tailored to your business needs.
Data report examples and templates
Five Key Differences Between Reporting and Analysis
3. The Final Output: In the case of reporting, outputs such as canned reports,
dashboards, and alerts push information to users. Through analysis, analysts try to
extract answers using business queries and present them in the form of ad hoc
responses, insights, recommended actions, or a forecast. Understanding this key
difference can help businesses leverage analytics better.
4. People: Reporting requires repetitive tasks that can be automated. It is often used by
functional business heads who monitor specific business metrics. Analytics requires
customization and therefore depends on data analysts and scientists. Also, it is used by
business leaders to make data-driven decisions.
5. Value Proposition: This is like comparing apples to oranges. Reporting and analytics serve different purposes. By understanding those purposes and using each correctly, businesses can derive immense value from both.
Orbit for both Reporting and Analytics
Orbit Reporting and Analytics is a single tool that can be used for both generating
different reports and running analytics to meet business objectives. It can work in
multi-cloud environments, extracting data from the cloud and on-prem systems
and presenting them in many ways as required by the user. It enables self-service,
allowing business users to generate their own reports without depending on the IT
team, in real-time. It complies with security and privacy requirements by allowing
access only to authorized users. It also allows users to generate reports in real-time
in Excel.
It also facilitates analytics, enabling businesses to draw insights and convert them
into actions to predict future trends, identify areas of improvement across
functions, and meet the organizational goal of growth.
Data Analysis
Once the data is collected, cleaned, and processed, it is ready for analysis. As you manipulate the data, you may find you have the exact information you need, or you might need to collect more data. During this phase, you can use data analysis tools and software, which will help you to understand, interpret, and derive conclusions based on the requirements.
Data Interpretation
After analyzing your data, it’s finally time to interpret your results. You can choose how to express or communicate your data analysis: simply in words, or with a table or chart. Then use the results of your data analysis process to decide your best course of action.
Data Visualization
Data visualizations are very common in day-to-day life; they often appear in the form of charts and graphs. In other words, data is shown graphically so that it is easier for the human brain to understand and process. Data visualization is often used to discover unknown facts and trends. By observing relationships and comparing datasets, you can uncover meaningful information.
Why Data Analysis?
Use Cases of Data Science
Advantages and Disadvantages of Data Science
Advantages:
It helps us to get insights from historical data with its powerful tools.
It helps to optimize the business, hire the right people, and generate more revenue, since data science helps you make better future decisions for the business.
Companies can develop and market their products better, as they can better select their target customers.
Data science also helps consumers find better goods, especially on e-commerce sites, through data-driven recommendation systems.
Disadvantages:
The disadvantages generally arise when data science is used for customer profiling and infringes on customer privacy.
Customers' information, such as transactions, purchases, and subscriptions, is visible to the parent companies.
The information obtained using data science can be used against a certain group, individual, country, or community.