BIG DATA ANALYTICS

Unit - I: INTRODUCTION TO BIG DATA

Introduction to Big Data Platform - Challenges of Conventional Systems - Intelligent data analysis -
Nature of Data - Analytic Processes and Tools - Analysis vs Reporting.
---------------------------------------------------------------------------------------------------------------

INTRODUCTION TO BIG DATA PLATFORM

INTRODUCTION TO DATA:

Data and Information:

 Data are plain facts.
 The word "data" is the plural of "datum."
 Data are facts and statistics stored in, or flowing freely over, a network; generally they are raw and unprocessed.
 When data are processed, organized, structured or presented in a given context so as to make them useful, they are called information.
 It is not enough to have data (such as statistics on the economy).
 Data by themselves are fairly useless, but when they are interpreted and processed to determine their true meaning, they become useful and can be called information.
 For example: when you visit a website, it might store your IP address (that is data) and add a cookie to your browser recording that you visited the site (that is also data); your name is data, and your age is data.

What is Data?
 The quantities, characters, or symbols on which operations are performed by a computer
 which may be stored and transmitted in the form of electrical signals and
 recorded on magnetic, optical, or mechanical recording media.
Three Actions on Data
 Capture
 Transform
 Store
Big Data
 Big Data may well be the Next Big Thing in the IT world.
 Big data burst upon the scene in the first decade of the 21st century.

What is Big Data?

 Big Data is also data but with a huge size.
 Big Data is a term used to describe a collection of data that is huge in size and yet growing
exponentially with time.
 In short, such data is so large and complex that none of the traditional data management tools can store or process it efficiently.

Big Data is the term for a collection of data sets so large and complex that it becomes difficult to process
using on-hand database management tools or traditional data processing applications.

Examples of Big Data
 The New York Stock Exchange generates about one terabyte of new trade data per day.
 Other examples of Big Data generation include stock exchanges, social media sites, jet engines, etc.

Characteristics of Big Data (6 Vs of Big Data)


1. Volume:
Volume refers to the sheer size of the ever-exploding data of the computing world. It raises the question of the quantity of data collected from different sources over the Internet.
2. Velocity:
In Big Data, Velocity refers to how fast the data is generated. Here the data flows from sources such as machines, networks, social media, mobile phones etc. There is a massive and continuous flow of data. Velocity determines how fast data is generated and how fast it must be processed to meet demand.
3. Variety:
Variety refers to the types of data. In Big Data, the raw data is always collected in a variety of forms: it can be structured, unstructured or semi-structured, because it is collected from various sources. Variety also refers to heterogeneous sources containing a combination of simple text files, images, videos etc.

4. Veracity:
Veracity is about the trustworthiness of the data. If the data is collected from trusted or reliable sources, this concern is largely addressed. Veracity refers to inconsistencies and uncertainty in data: the data that is available can sometimes be messy, of low quality and less accurate. Data is also variable because of the multitude of data dimensions resulting from multiple disparate data types and sources.
Example: Data in bulk could create confusion, whereas too little data could convey only half or incomplete information.

5. Value:
Value refers to the purpose, scenario or business outcome that the analytical solution has to address. Does the data have value? If not, is it worth storing or collecting? The analysis also needs to be performed with ethical considerations in mind.

6. Variability:
It defines the need to get meaningful data considering all possible circumstances.
 How fast is the structure of your data changing?
 How often does the meaning or shape of your data change?

BIG DATA PLATFORM


A big data platform is a tool developed by data management vendors with the aim of increasing the scalability, availability, performance, and security of data operations in organizations that are driven by big data. The platform is designed to handle voluminous, multi-structured data in real time.
A big data platform consists of big data storage, servers, databases, big data management, business intelligence and other big data management utilities. It supports custom development, querying and integration with other systems.
Its primary benefit is reducing the complexity of multiple vendors/solutions into one cohesive solution.

Big data platforms are also delivered through the cloud, where the provider offers big data solutions and services.

New analytic applications drive the requirements for a big data platform
1. Integrate and manage the full variety, velocity and volume of data
2. Apply advanced analytics to information in its native form
3. Visualize all available data for ad-hoc analysis
4. Development environment for building new analytic applications
5. Workload optimization and scheduling

6. Security and Governance

Figure: Big Data Platform


1. Workload Optimization:
Adaptive MapReduce
 Algorithm to optimize execution time of multiple small and large jobs
 Performance gains of around 30% by reducing the overhead of task startup
Hadoop System Scheduler
 Identifies small and large jobs from prior experience
 Sequences work to reduce overhead

Figure: MapReduce
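To make the MapReduce programming model concrete, the following is a minimal sketch in plain Python, not tied to any particular Hadoop distribution or to the Adaptive MapReduce feature described above: a map function emits (word, 1) pairs, the pairs are grouped by key (the "shuffle" step), and a reduce function sums the counts per word.

    # Minimal illustration of the MapReduce model: word count in plain Python.
    # This only sketches the map -> shuffle/group -> reduce flow; a real Hadoop
    # job would distribute these steps across a cluster.
    from collections import defaultdict

    def map_phase(document):
        # Emit a (word, 1) pair for every word in the input record.
        for word in document.split():
            yield (word.lower(), 1)

    def reduce_phase(word, counts):
        # Sum all partial counts that were grouped under the same key.
        return (word, sum(counts))

    documents = ["big data is big", "data flows fast"]

    # Shuffle/group step: collect all values emitted for the same key.
    grouped = defaultdict(list)
    for doc in documents:
        for word, count in map_phase(doc):
            grouped[word].append(count)

    # Reduce step: one call per distinct key.
    result = dict(reduce_phase(w, c) for w, c in grouped.items())
    print(result)   # {'big': 2, 'data': 2, 'is': 1, 'flows': 1, 'fast': 1}

In a real cluster the map calls run on many machines in parallel and the grouping is done by the framework, which is what the scheduler and workload optimizer above are managing.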

2. Big Data Platform - Stream Computing:
 Built to analyze data in motion
 Multiple concurrent input streams
 Massive scalability
 Process and analyze a variety of data:
-- Structured, unstructured content, video, audio
-- Advanced analytic operators
3. Big Data Platform - Data Warehousing:
 Workload optimized systems
-- Deep analytics appliances
-- Configurable operational analytics appliances
-- Data warehousing software
 Capabilities
-- Massively parallel processing (MPP) engine
-- High performance OLAP
-- Mixed operational and analytic workloads
4. Big Data Platform - Information Integration and Governance
 Integrate any type of data to the big data platform
-- Structured
-- Unstructured
-- Streaming
 Governance and trust for big data
-- Secure sensitive data
-- Lineage and metadata of new big data sources
-- Lifecycle management to control data growth
-- Master data to establish single version of the truth

5. Leverage purpose-built connectors for multiple data sources

 Massive volume of structured data movement
-- 2.38 TB / Hour load to data warehouse
-- High-volume load to Hadoop file system
 Ingest unstructured data into Hadoop file system
 Integrate streaming data sources

6. Big Data Platform - User Interfaces


 Business Users
 Visualization of a large volume and wide variety of data
 Developers
 Similarity in tooling and languages
 Mature open source tools with enterprise capabilities
 Integration among environments

7. Administrators: Consoles to aid in systems management

8. Big Data Platform –Accelerators:


 Analytic accelerators
–Analytics, operators, rule sets
 Industry and Horizontal Application Accelerators
–Analytics
–Models
–Visualization / user interfaces
–Adapters

9. Big Data Platform - Analytic Applications:

Big Data Platform is designed for analytic application development and integration.
 BI/Reporting – Cognos BI, Attivio
 Predictive Analytics – SPSS, G2, SAS
 Exploration/Visualization – BigSheets, Datameer
 Instrumentation Analytics – Brocade, IBM GBS
 Content Analytics – IBM Content Analytics
 Functional Applications – Algorithmics, Cognos Consumer Insights, Clickfox, i2, IBM GBS
 Industry Applications – TerraEchos, Cisco, IBM GBS

List of BigData Platforms


 Hadoop
 Cloudera
 Amazon Web Services
 Hortonworks
 MapR
 IBM Open Platform
 Microsoft HDInsight
 Intel Distribution for Apache Hadoop
 Datastax Enterprise Analytics
 Teradata Enterprise Access for Hadoop
 Pivotal HD

CHALLENGES OF CONVENTIONAL SYSTEMS

The rise and development of social networks, multimedia, electronic commerce (e-Commerce) and cloud computing have considerably increased the volume of data. Additionally, since the needs of enterprise analytics are constantly growing, conventional architectures cannot satisfy the demands and, therefore, new and enhanced architectures are necessary.
In this context, new challenges are encountered including storage, capture, processing, filtering,
analysis, search, sharing, visualization, querying and privacy of the very large volumes of data.

These issues are categorized and elaborated as follows:


 Data storage and management: Since big data are dependent on extensive storage capacity and data
volumes grow exponentially, the current data management systems cannot satisfy the needs of big data
due to limited storage capacity. In addition, the existing algorithms are not able to store data effectively
because of the heterogeneity of big data.
 Data transmission and curation:
1. Since network bandwidth capacity is the major drawback in the cloud, data transmission is a
challenge to overcome, especially when the volume of data is very large.
2. For managing large-scale and structured datasets, data warehouses and data marts are good
approaches. Data warehouses are relational database systems that enable the data storage, analysis
and reporting, while the data marts are based on data warehouses and facilitate the analysis of
them.
3. In this context, NoSQL databases were introduced as a potential technology for large and
distributed data management and database design. The major advantage of NoSQL databases is the
schema-free orientation, which enables the quick modification of the structure of data.
 Data processing and analysis:
1. Query response time is a significant issue in big data, as adequate time is needed when traversing
data in a database and performing real-time analytics.
2. A flexible and reconfigurable data grid is required.
3. Enhanced preprocessing methods are demanded.
4. Effective approaches for extracting insights and meaningful knowledge from the given data sets are essential.

 Data privacy and security:


1. Since the host of data or other critical operations can be performed by third-party services or
infrastructures, security issues are witnessed with respect to big data storage and processing.
2. The current technologies used in data security are mainly static data-oriented, although big data entails dynamic change of current and additional data or variations in attributes. Privacy-preserving data mining without exposing sensitive personal information is another challenging field to be investigated.

The differences between Conventional data and Big data:

Conventional Data | Big Data
Conventional data is generated at the enterprise level. | Big data is generated outside the enterprise level.
Its volume ranges from gigabytes to terabytes. | Its volume ranges from petabytes to zettabytes or exabytes.
A conventional database system deals with structured data. | A big data system deals with structured, semi-structured and unstructured data.
Conventional data is generated per hour, per day or over longer periods. | Big data is generated much more frequently, mainly per second.
The data source is centralized and is managed in centralized form. | The data source is distributed and is managed in distributed form.
Data integration is very easy. | Data integration is very difficult.
A normal system configuration is capable of processing conventional data. | A high-end system configuration is required to process big data.
The size of the data is very small. | The size is far greater than that of conventional data.
Conventional database tools are sufficient to perform any database schema-based operation. | Special kinds of database tools are required to perform any database schema-based operation.
Normal functions can manipulate the data. | Special kinds of functions are needed to manipulate the data.
Its data model is strict-schema based and static. | Its data model is flat-schema based and dynamic.
Conventional data is stable, with known inter-relationships. | Big data is not stable, and relationships are unknown.
Conventional data is in manageable volume. | Big data is in huge volume, which becomes unmanageable.
It is easy to manage and manipulate the data. | It is difficult to manage and manipulate the data.
Its data sources include ERP transaction data, CRM transaction data, financial data, organizational data, web transaction data etc. | Its data sources include social media, device data, sensor data, video, images, audio etc.

INTELLIGENT DATA ANALYSIS

 Intelligent data analysis is the scientific process of transforming data into insight for making better decisions. Here, mathematical techniques are used to analyze complex situations, giving the power to make effective decisions and build more productive systems based on:

1. More complete data

2. Consideration of available options

3. Careful predictions of outcomes

4. Estimates of Risk

5. Latest decision tools and techniques

 Analytics relies on the simultaneous application of statistics, computer programming and operations research to quantify performance. Analytics often favours data visualization to communicate the insights as well.

 Business firms may commonly apply analytics to business data to describe, predict and improve
business performance.

 Example areas are: retail analytics, store assessment and stock keeping, marketing, web
analytics, sales force sizing, price modeling, credit risk analysis and fraud analytics.

 The goal of Data Analytics is to get actionable insights resulting in smarter decisions and better
business outcomes.

 There are three types of data analysis:

1. Predictive (forecasting)

2. Descriptive (business intelligence and data mining)

3. Prescriptive (optimization and simulation)

PREDICTIVE ANALYTICS

Predictive analysis turns data into valuable, actionable information. Predictive analytics uses data to determine the probable future outcome of an event or the likelihood of a situation occurring.

Predictive models find patterns in historical and transactional data to identify risks and opportunities. Models capture relationships among many factors to assess the risk.

Three basic cornerstones of predictive analytics are :

1. Predictive Modelling

2. Decision Analysis and Optimization

3. Transaction Profiling.

An example of using predictive analysis:

For an organization that offers multiple products, predictive analysis can help analyze customers' spending, usage and other behaviour, leading to efficient cross-selling or the sale of additional products to current customers. This leads to higher profitability per customer and stronger customer relationships.
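As a hedged sketch of how a predictive model captures relationships among factors, the snippet below fits a logistic regression on a small, made-up customer dataset (the features monthly_spend and tenure_months and the target bought_addon are hypothetical, not taken from the notes above) to score the likelihood that a customer buys an additional product. It assumes scikit-learn is available; the cross-selling example above does not prescribe any particular algorithm.

    # Illustrative only: a tiny predictive model on made-up customer data.
    from sklearn.linear_model import LogisticRegression

    # Hypothetical features: (monthly_spend, tenure_months)
    X = [[120, 3], [80, 24], [200, 12], [40, 2], [150, 36], [60, 6]]
    y = [0, 1, 1, 0, 1, 0]   # 1 = customer bought an additional product

    model = LogisticRegression().fit(X, y)

    # Score a new customer: probability of buying an additional product.
    new_customer = [[100, 18]]
    print(model.predict_proba(new_customer)[0][1])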

DESCRIPTIVE ANALYTICS

Descriptive Analytics looks at past performance and understands that performance by mining historical data to look for the reasons behind past success or failure. Almost all management reporting systems, such as sales, marketing, operations and finance, use this type of post-mortem analysis.

Descriptive models quantify relationships in data in a way that is often used to classify customers or prospects into groups. For example, descriptive models can be used to categorize customers by their product preferences and their age.

Descriptive modelling tools can be utilized to develop further models that can simulate a large number of individuals and make predictions. For example, descriptive analysis examines historical electricity usage data to help plan power needs and allow electric companies to set optimal prices.
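A minimal descriptive-analytics sketch, assuming pandas and a hypothetical customer table (the ages, preferences and spend values are invented): it groups customers by age band and product preference, which is the kind of classification into groups described above.

    # Illustrative only: summarising (not predicting) with made-up data.
    import pandas as pd

    customers = pd.DataFrame({
        "age":        [23, 31, 45, 52, 29, 61],
        "preference": ["mobile", "mobile", "desktop", "desktop", "mobile", "desktop"],
        "spend":      [120, 90, 200, 310, 75, 260],
    })

    # Bucket customers by age, then describe each (age band, preference) group.
    customers["age_band"] = pd.cut(customers["age"], bins=[0, 30, 50, 100],
                                   labels=["<30", "30-50", "50+"])
    summary = customers.groupby(["age_band", "preference"],
                                observed=True)["spend"].agg(["count", "mean"])
    print(summary)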

PRESCRIPTIVE ANALYTICS
Prescriptive Analytics goes beyond predicting future outcomes by also suggesting actions to benefit from the predictions and showing the decision maker the implications of each decision option. Prescriptive Analytics not only anticipates what will happen and when it will happen, but also why it will happen.

Prescriptive analysis combines data, business rules and mathematical models.
 The data may come from multiple sources, internal and external to the organization. The
data may also be structured data which includes numerical and categorical data, as well as
unstructured data such as text, images, audio and video data.
 Business Rules define the business process and include constraints, preferences, policies,
best practices and boundaries.
 Mathematical models are techniques derived from mathematical sciences, applied
statistics, machine learning, operations research and natural language processing.
 One example is energy and utilities. Natural gas prices fluctuate depending on supply, demand, econometrics, geo-politics and weather conditions. Prescriptive analytics can accurately predict prices by modelling internal and external variables simultaneously, and can also provide decision options and show the impact of each decision option.
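A hedged sketch of the optimization side of prescriptive analytics, using SciPy's linear-programming solver on a made-up problem (it is not the natural-gas model mentioned above): choose how much of two products to produce, subject to capacity constraints, to maximize profit. All coefficients are illustrative assumptions.

    # Illustrative only: a tiny prescriptive (optimization) model with SciPy.
    # Maximize 40*x1 + 30*x2 subject to machine-hour and labour-hour limits.
    from scipy.optimize import linprog

    c = [-40, -30]            # linprog minimizes, so negate the profits
    A_ub = [[1, 2],           # machine hours used per unit of x1, x2
            [3, 1]]           # labour hours used per unit of x1, x2
    b_ub = [100, 90]          # available machine hours, labour hours

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
    print(res.x, -res.fun)    # recommended production plan and its profit

The solver's output is a concrete decision option (a production plan) together with its outcome (profit), which is the "suggest actions and show their implications" role of prescriptive analytics.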

NATURE OF DATA
In BigData, data could be found in three forms:
1. Structured
2. Unstructured
3. Semi-structured
What is Structured Data?
 Any data that can be stored, accessed and processed in a fixed format (e.g. a table) is termed 'structured' data.
 Over time, techniques have been developed for working with such data (where the format is well known in advance) and for deriving value out of it.
 One main issue today is that the size of such data can grow to a huge extent, with typical sizes in the range of multiple zettabytes. That is why the name Big Data is given; imagine the challenges involved in its storage and processing.
 Data stored in a relational database management system (RDBMS) is one example of 'structured' data.
Unstructured Data

 Any data with an unknown form or structure is classified as unstructured data.
 In addition to its huge size, unstructured data poses multiple challenges in terms of processing it to derive value out of it.
 A typical example of unstructured data is a heterogeneous data source containing a combination of simple text files, images, videos etc.
 Nowadays organizations have a wealth of data available with them, but unfortunately they don't know how to derive value out of it, since this data is in its raw, unstructured form.
 Example of unstructured data: results returned by a search engine such as 'Google Search'.
Semi-structured Data
 Semi-structured data can contain both forms of data and shares the characteristics of both.
 Semi-structured data refers to data that is not captured or formatted in conventional ways. It does not follow the format of a tabular data model or relational databases because it does not have a fixed schema.
 Examples of semi-structured data are data represented in XML, CSV and JSON files.
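A small sketch of why semi-structured data has no fixed schema, assuming only Python's standard json module and invented records: two records in the same JSON document carry different fields, yet both can still be parsed and queried.

    # Illustrative only: semi-structured records need not share a fixed schema.
    import json

    raw = '''
    [
      {"name": "Asha", "age": 29, "email": "asha@example.com"},
      {"name": "Ravi", "interests": ["cricket", "music"]}
    ]
    '''

    records = json.loads(raw)
    for rec in records:
        # Fields present vary per record, so access them defensively.
        print(rec["name"], rec.get("age", "unknown age"), rec.get("interests", []))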

ANALYTIC PROCESSES AND TOOLS


Big data analytics refers to collecting, processing, cleaning, and analyzing large datasets to
help organizations operationalize their big data.

 BUSINESS UNDERSTANDING
The very first step consists of business understanding. Whenever any requirement occurs,
1. firstly we need to determine the business objective,
2. assess the situation,
3. determine data mining goals and then
4. produce the project plan as per the requirement.
5. Finally, Business objectives are defined in this phase.

 DATA EXPLORATION
The second step consists of Data understanding.
1. For the further process, we need to gather initial data, describe and explore the data and
verify data quality to ensure it contains the data we require.

2. Data collected from the various sources is described in terms of its application and the
need for the project in this phase. This is also known as data exploration.
Data exploration is an essential step to verify the quality of the data collected.

 DATA PREPARATION
1. From the data collected in the last step, we need to select data as per the need, clean it,
construct it to get useful information and then integrate it all.
2. Finally, we need to format the data to get the appropriate data.
3. Data is selected, cleaned, and integrated into the format finalized for the analysis in this
phase.
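A minimal data-preparation sketch, assuming pandas and two made-up source tables: it selects the needed columns, cleans missing values, integrates the sources with a join, and formats the result into the shape needed for analysis, mirroring the steps listed above.

    # Illustrative only: select, clean, integrate and format made-up data.
    import pandas as pd

    sales   = pd.DataFrame({"cust_id": [1, 2, 2, 3], "amount": [100, None, 250, 80]})
    profile = pd.DataFrame({"cust_id": [1, 2, 3], "region": ["south", "north", "south"]})

    # Clean: drop rows with missing amounts.
    sales = sales.dropna(subset=["amount"])

    # Integrate: join the two sources on the customer key.
    data = sales.merge(profile, on="cust_id", how="left")

    # Format: aggregate into the shape needed for the analysis phase.
    prepared = data.groupby("region")["amount"].sum().reset_index()
    print(prepared)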

 ANALYZE DATA
1. The next step is to Analyze. The cleaned data is used for analyzing and identifying trends. It
also performs calculations and combines data for better results.
2. Here, a data model is built to
– analyze relationships between various selected objects in the data,
– test cases are built for assessing the model and model is tested and
implemented on the data in this phase.
3. Where processing is hosted?
– Distributed Servers / Cloud (e.g. Amazon EC2)
4. Where data is stored?
– Distributed Storage (e.g. Amazon S3)
5. What is the programming model?
– Distributed Processing (e.g. MapReduce)
6. How data is stored & indexed?
– High-performance schema-free databases (e.g. MongoDB)
7. What operations are performed on data?
– Analytic / Semantic Processing
8. Big data tools on clouds
– MapReduce model

– Iterative MapReduce model
– Graph model
– Collective model
9. Other BDA tools
– SAS
– R
– Hadoop

 DEPLOYMENT
The final step is to act. After a presentation is given based on your data model, the stakeholders discuss whether to move forward or not. If they agree with your recommendations, they move forward with your solutions. If they don't agree with your findings, you will have to dig deeper to find more possible solutions. Every step has to be re-organized: we have to repeat every step to see whether there are any gaps. The data collected must be reviewed to see if there is any bias and to identify options. After the gaps are identified and the data is analyzed, a presentation is given again.

BIG DATA ANALYTICS TOOLS


Today, almost 2.5 quintillion bytes of data are generated globally and it’s useless until that
data is segregated in a proper structure. It has become crucial for businesses to maintain
consistency in the business by collecting meaningful data from the market today and for that, all it
takes is the right data analytic tool and a professional data analyst.
Tools for Analyzing Big Data
There are five key approaches to analyzing big data and generating insight:
• Discovery tools are useful throughout the information lifecycle for rapid, intuitive exploration
and analysis of information from any combination of structured and unstructured sources. These
tools permit analysis alongside traditional BI source systems. Because there is no need for up-front
modeling, users can draw new insights, come to meaningful conclusions, and make informed
decisions quickly.
• BI tools are important for reporting, analysis and performance management, primarily with
transactional data from data warehouses and production information systems. BI Tools provide
comprehensive capabilities for business intelligence and performance management, including

enterprise reporting, dashboards, ad-hoc analysis, scorecards, and what-if scenario analysis on an
integrated, enterprise scale platform.
• In-Database Analytics include a variety of techniques for finding patterns and relationships in
your data. Because these techniques are applied directly within the database, you eliminate data
movement to and from other analytical servers, which accelerates information cycle times and
reduces total cost of ownership.
• Hadoop is useful for pre-processing data to identify macro trends or find nuggets of information,
such as out-of-range values. It enables businesses to unlock potential value from new data using
inexpensive commodity servers. Organizations primarily use Hadoop as a precursor to advanced
forms of analytics.
• Decision Management includes predictive modeling, business rules, and self-learning to take
informed action based on the current context. This type of analysis enables individual
recommendations across multiple channels, maximizing the value of every customer interaction.
Oracle Advanced Analytics scores can be integrated to operationalize complex predictive analytic
models and create real-time decision processes.
There are hundreds of data analytics tools in the market today, but the selection of the right tool will depend upon your business needs, goals, and variety of data to steer the business in the right direction. The top 10 big data analytics tools are described below.

APACHE Hadoop
1. It's a Java-based open-source platform that is used to store and process big data.
2. It is built on a cluster system that allows the system to process data efficiently and in parallel. It can process both structured and unstructured data, scaling from one server to multiple computers.
3. Hadoop also offers cross-platform support for its users.
4. Today, it is the best big data analytic tool and is popularly used by many tech giants such as
Amazon, Microsoft, IBM, etc.

Cassandra
1. APACHE Cassandra is an open-source NoSQL distributed database that is used to fetch large
amounts of data.

2. It’s one of the most popular tools for data analytics and has been praised by many tech
companies due to its high scalability and availability without compromising speed and
performance.
3. It is capable of delivering thousands of operations every second and can handle petabytes
of resources with almost zero downtime.
4. It was created by Facebook back in 2008 and was later released publicly.

Qubole
1. It's an open-source big data tool that helps in fetching data in a value chain using ad-hoc analysis and machine learning.
2. Qubole is a data lake platform that offers end-to-end service with reduced time and effort
which are required in moving data pipelines.
3. It is capable of configuring multi-cloud services such as AWS, Azure, and Google Cloud.
4. It also helps in lowering the cost of cloud computing by 50%.

Xplenty
1. It is a data analytic tool for building a data pipeline by using minimal codes in it.
2. It offers a wide range of solutions for sales, marketing, and support.
3. With the help of its interactive graphical interface, it provides solutions for ETL, ELT, etc.
4. The best part of using Xplenty is its low investment in hardware & software, and it offers support via email, chat, phone and virtual meetings.
5. Xplenty is a platform to process data for analytics over the cloud and segregates all the data
together.

APACHE Spark
1. APACHE Spark is a framework that is used to process data and perform numerous tasks on
a large scale.
2. It is also used to process data via multiple computers with the help of distributing tools.
3. It is widely used among data analysts as it offers easy-to-use APIs that provide easy data
pulling methods and it is capable of handling multi-petabytes of data as well.

4. Spark set a record by processing 100 terabytes of data in just 23 minutes, breaking Hadoop's previous world record (71 minutes). This is why big tech giants are moving towards Spark now, and it is highly suitable for ML and AI today.
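A hedged sketch of Spark's easy-to-use API, assuming PySpark is installed and run in local mode; the input file name "events.json" and the "country" column are hypothetical, used only to show a distributed read and aggregation.

    # Illustrative only: a local PySpark job; "events.json" is a hypothetical file.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[*]").appName("demo").getOrCreate()

    df = spark.read.json("events.json")          # load semi-structured data
    df.groupBy("country").count().show()         # distributed aggregation
    spark.stop()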
Mongo DB
1. MongoDB, which came into the limelight in 2010, is a free, open-source, document-oriented (NoSQL) database that is used to store a high volume of data.
2. It uses collections and documents for storage; a document consists of key-value pairs, which are the basic unit of MongoDB.
3. It is popular among developers due to its support for multiple programming languages such as Python, JavaScript, and Ruby.
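A short sketch of MongoDB's collection/document model, assuming a MongoDB server running locally and the pymongo driver; the database and collection names are hypothetical. Note how the two documents carry different key-value pairs, consistent with the schema-free nature of NoSQL stores described earlier.

    # Illustrative only: documents are key-value structures inside a collection.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    users = client["demo_db"]["users"]            # hypothetical database/collection

    users.insert_one({"name": "Asha", "age": 29, "interests": ["cricket"]})
    users.insert_one({"name": "Ravi", "city": "Chennai"})   # different fields are fine

    for doc in users.find({"age": {"$gte": 18}}):
        print(doc["name"])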

Apache Storm
1. Storm is a robust, user-friendly tool used for data analytics, especially in small companies.
2. The best part about Storm is that it has no programming-language barrier and can support any of them.
3. It was designed to handle pools of large data in fault-tolerant and horizontally scalable ways.
4. Storm leads the chart because of its distributed real-time big data processing system, due to which many tech giants use APACHE Storm in their systems today.
5. Some of the most notable names are Twitter, Zendesk, NaviSite, etc.

SAS
1. Today it is one of the best tools for creating statistical modeling used by data analysts.
2. By using SAS, a data scientist can mine, manage, extract or update data in different variants
from different sources.
3. Statistical Analytical System or SAS allows a user to access the data in any format (SAS
tables or Excel worksheets).
4. It also offers a cloud platform for business analytics called SAS Viya and, to strengthen its grip on AI & ML, has introduced new tools and products.

Data Pine

1. Datapine is an analytics tool used for BI and was founded back in 2012 (Berlin, Germany).
2. In a short period of time, it has gained much popularity in a number of countries and it’s
mainly used for data extraction (for small-medium companies fetching data for close
monitoring).
3. With the help of its enhanced UI design, anyone can visit and check the data as per their requirements; it is offered in 4 different price brackets, starting from $249 per month.
4. They do offer dashboards by functions, industry, and platform.

Rapid Miner
1. It’s a fully automated visual workflow design tool used for data analytics.
2. It’s a no-code platform and users aren’t required to code for segregating data.
3. Today, it is being heavily used in many industries such as ed-tech, training, research, etc.
4. Though it's an open-source platform, it has a limit of 10,000 data rows and a single logical processor.
5. With the help of Rapid Miner, one can easily deploy their ML models to the web or mobile.
ANALYSIS VS REPORTING

Analytics and reporting can help a business improve operational efficiency and production in
several ways. Analytics is the process of making decisions based on the data presented, while
reporting is used to make complicated information easier to understand.
 Analytics is the technique of examining data and reports to obtain actionable insights that can
be used to make better decisions and improve business performance.
The steps involved in data analytics are as follows:

 Developing a data hypothesis


 Data collection and transformation
 Creating models to analyze and provide insights
 Utilization of data visualization, trend analysis, deep dives, and other tools.
 Making decisions based on data and insights

 On the other hand, reporting is the process of presenting data from numerous sources clearly
and simply. The procedure is always carefully set out to report correct data and avoid
misunderstandings. Today's reporting applications offer cutting-edge dashboards with advanced data visualization features. Companies produce a variety of reports, such as financial
reports, accounting reports, operational reports, market studies, and more.

KEY DIFFERENCES BETWEEN ANALYTICS AND REPORTING

Analytics and reporting can significantly benefit your business. If you want to use both to their full
potential knowing the difference between the two is important. Some key differences are:

Analytics | Reporting
Analytics is the method of examining and analyzing summarized data to make business decisions. | Reporting is an action that includes all the needed information and data, put together in an organized way.
Questioning the data, understanding it, investigating it, and presenting it to the end users are all part of analytics. | Identifying business events, gathering the required information, and organizing, summarizing, and presenting existing data are all part of reporting.
The purpose of analytics is to draw conclusions based on data. | The purpose of reporting is to organize the data into meaningful information.
Analytics is used by data analysts, scientists, and business people to make effective decisions. | Reporting is provided to the appropriate business leaders to perform effectively and efficiently within a firm.
