Big Data

Big Data refers to data sets that are extremely large, typically measured in petabytes, and have been rapidly generated, with 90% created in the last three years. It originates from various sources like social media, e-commerce, weather stations, telecom companies, and stock markets, and is characterized by the 3Vs: velocity, variety, and volume. Real-time analytics enables organizations to process and analyze this data immediately, allowing for quick decision-making and insights into user behavior.

Uploaded by

rohitserver21

What is Big Data

Big Data refers to data sets that are very large in size. We normally work with files measured in
megabytes (Word documents, Excel sheets) or, at most, gigabytes (movies, code bases); data measured in
petabytes, i.e. 10^15 bytes, is called Big Data. It is often stated that almost 90% of today's data was
generated in the past 3 years.
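
To give a feel for the scale, the short sketch below compares the units mentioned above. It assumes decimal (SI) units, where 1 PB = 10^15 bytes; the helper name how_many is made up for this illustration.

```python
# Decimal (SI) storage units, expressed in bytes (assumption: SI, not binary, units).
MB = 10**6
GB = 10**9
PB = 10**15


def how_many(smaller: int, larger: int) -> int:
    """Return how many units of size `smaller` fit into one unit of size `larger`."""
    return larger // smaller


print(how_many(GB, PB))  # 1000000 -> one petabyte holds a million gigabytes
print(how_many(MB, GB))  # 1000
```

In other words, a single petabyte corresponds to roughly a million gigabyte-sized files.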

Sources of Big Data


These data come from many sources, such as:

o Social networking sites: Facebook, Google, and LinkedIn generate huge amounts of data every day
because they have billions of users worldwide.
o E-commerce sites: Sites like Amazon, Flipkart, and Alibaba generate huge volumes of logs from which
users' buying trends can be traced.
o Weather stations: Weather stations and satellites produce very large amounts of data, which are stored
and processed to forecast the weather.
o Telecom companies: Telecom giants like Airtel and Vodafone study user trends and design their plans
accordingly; for this, they store the data of their millions of users.
o Share market: Stock exchanges across the world generate huge amounts of data through their daily
transactions.

3V's of Big Data


1. Velocity: Data is growing at a very fast rate. It is estimated that the volume of data will
double every 2 years.
2. Variety: Data is no longer stored only in rows and columns. It can be structured or
unstructured. Log files and CCTV footage are examples of unstructured data, while data that can be
saved in tables, such as a bank's transaction data, is structured.
3. Volume: The amount of data we deal with is very large, on the order of petabytes.
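
The doubling estimate under Velocity amounts to exponential growth: V(t) = V0 * 2^(t / 2), with t in years. The sketch below only illustrates that arithmetic; the function name and the starting volume of 1 PB are assumptions chosen for the example, not figures from a real system.

```python
def projected_volume(initial_pb: float, years: float,
                     doubling_period_years: float = 2.0) -> float:
    """Project data volume assuming it doubles once per doubling period."""
    return initial_pb * 2 ** (years / doubling_period_years)


# Starting from a hypothetical 1 PB, ten years means five doublings.
print(projected_volume(1.0, 10))  # 32.0
```

Under this assumption, a data set grows 32-fold in a decade.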

Use case
An e-commerce site XYZ (with 100 million users) wants to offer a $100 gift voucher to the top 10
customers who spent the most in the previous year. It also wants to find the buying trends of these
customers so that the company can suggest more items relevant to them.
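
On a small scale, the core of this use case can be sketched in a few lines of Python. This is a single-machine illustration only: the order records and the function name top_spenders are hypothetical, and a data set covering 100 million users would not normally be processed this way.

```python
import heapq
from collections import defaultdict


def top_spenders(orders, n=10):
    """Sum the amounts per customer and return the n largest totals."""
    totals = defaultdict(float)
    for customer_id, amount in orders:
        totals[customer_id] += amount
    # nlargest avoids sorting every customer when only the top n are needed.
    return heapq.nlargest(n, totals.items(), key=lambda item: item[1])


# Hypothetical (customer, amount) order records for the previous year.
orders = [("alice", 120.0), ("bob", 75.5), ("alice", 30.0),
          ("carol", 200.0), ("bob", 10.0)]
print(top_spenders(orders, n=2))  # [('carol', 200.0), ('alice', 150.0)]
```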

Issues
A huge amount of unstructured data needs to be stored, processed, and analyzed.

Solution
Storage: To store this huge amount of data, Hadoop uses HDFS (Hadoop Distributed File System), which
uses commodity hardware to form clusters and stores data in a distributed fashion. It works on the
write-once, read-many-times principle.
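
To make distributed storage more concrete, the sketch below estimates how a file would be split into blocks and replicated. It assumes a 128 MB block size and a replication factor of 3, which are common HDFS defaults but are configurable per cluster; the function name is invented for this example and is not part of any Hadoop API.

```python
def hdfs_footprint(file_size_bytes: int,
                   block_size_bytes: int = 128 * 1024**2,
                   replication: int = 3):
    """Return (number of blocks, raw bytes stored across the cluster).

    Assumed defaults: 128 MB blocks, 3 replicas of every block.
    """
    # Ceiling division: a final partial block still counts as a block.
    blocks = -(-file_size_bytes // block_size_bytes)
    # Each replica holds the file's real bytes, so raw usage is size * replicas.
    raw_bytes = file_size_bytes * replication
    return blocks, raw_bytes


one_gib = 1024**3
print(hdfs_footprint(one_gib))  # (8, 3221225472)
```

Replication is what lets a cluster built from commodity hardware tolerate the failure of individual machines, at the price of storing several times the raw data.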

Processing: The MapReduce paradigm is applied to the data distributed over the network to compute the
required output.
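
The idea behind MapReduce can be sketched in plain Python. This is an in-memory illustration of the map, shuffle, and reduce phases, not Hadoop's actual API; all function names and the sample records are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (key, value) pair for each input record."""
    customer_id, amount = record
    yield customer_id, amount


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


records = [("alice", 120.0), ("bob", 75.5), ("alice", 30.0)]
print(run_job(records))  # {'alice': 150.0, 'bob': 75.5}
```

In a real cluster, the map and reduce functions run in parallel on the nodes that hold the data, and the framework performs the shuffle over the network.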

Analysis: Pig and Hive can be used to analyze the data.

Cost: Hadoop is open source, so software licensing cost is no longer an issue.

Real-Time Analytics in Big Data


This tutorial explores real-time analytics in big data. It gives an overview of real-time analytics,
explains how it operates, and discusses the benefits of using it.

Real-Time Analytics:
Real-time analytics allows users to view, analyse, and understand data as soon as it enters the system.
Logic and mathematics are applied to the data so that it gives users the insight they need to make
decisions in real time.

Overview:
Real-time analytics allows organizations to gain insight and actionable information immediately, or as
soon as the data has entered their systems. Real-time analytics responses are typically completed within
seconds to a few minutes. Such systems process large amounts of data at high speed and with low
response times. For instance, real-time big-data analytics uses data in financial databases to inform
trading decisions. Analytics may be performed on demand or continuously. On-demand analytics delivers
results when the user requests them. Continuous analytics updates users as events occur, and it can also
be programmed to respond to specific circumstances automatically. For instance, real-time web analytics
could alert an administrator if page-load performance falls outside preset boundaries.
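
The difference between on-demand and continuous analytics, including an automatic response to a preset condition, can be sketched as follows. The class name, the 500 ms threshold, and the window size are assumptions chosen for illustration, not values taken from a real product.

```python
from collections import deque


class LoadTimeMonitor:
    """Track recent page-load times and flag when they leave a preset bound."""

    def __init__(self, threshold_ms: float, window: int = 5):
        self.threshold_ms = threshold_ms
        self.samples = deque(maxlen=window)  # keeps only the latest samples

    def observe(self, load_time_ms: float):
        """Continuous mode: called for every event; returns an alert or None."""
        self.samples.append(load_time_ms)
        average = self.current_average()
        if average > self.threshold_ms:
            return f"ALERT: average load time {average:.0f} ms exceeds {self.threshold_ms} ms"
        return None

    def current_average(self) -> float:
        """On-demand mode: called only when the user asks for the result."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0


monitor = LoadTimeMonitor(threshold_ms=500, window=3)
alerts = [monitor.observe(ms) for ms in (300, 400, 900)]
print(alerts[-1])  # ALERT: average load time 533 ms exceeds 500 ms
```

Here observe plays the role of continuous analytics, reacting as each event arrives, while current_average corresponds to an on-demand query.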

Examples:
Examples of real-time customer analytics include the following.

o Monitoring orders as they take place, to track them better and to identify trends.
o Continuously updating customer interaction data, such as the number of page views and shopping cart
usage, to better understand the behaviour of users.
o Targeting customers with promotions while they shop in a store, influencing their decisions in
real time.
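
Continuously updated interaction counts, as in the second example above, can be sketched with a small in-memory tracker. The event names page_view and cart_add and the class name are hypothetical; a production system would receive such events from a stream rather than from direct method calls.

```python
from collections import Counter, defaultdict


class InteractionTracker:
    """Keep per-user counts of interaction events, updated as events arrive."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def record(self, user_id: str, event_type: str) -> dict:
        """Register one event and return that user's up-to-date counts."""
        self.counts[user_id][event_type] += 1
        return dict(self.counts[user_id])


tracker = InteractionTracker()
tracker.record("alice", "page_view")
tracker.record("alice", "page_view")
print(tracker.record("alice", "cart_add"))  # {'page_view': 2, 'cart_add': 1}
```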

The Operation of Real-time Analytics


Real-time analytics tools can either push or pull data. Streaming requires the ability to push huge
amounts of fast-moving data. If streaming consumes too many resources and isn't practical, data can
instead be pulled at intervals ranging from a couple of seconds to hours. The pull can be scheduled
between other business workloads that need computing resources, so that it does not interrupt
operations. The response time for real-time analytics can vary from nearly instantaneous to a few
seconds or minutes. The key components of real-time analytics comprise the following.

o Aggregator
o Broker
o Analytics engine
o Stream processor
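
One way these four components might fit together is sketched below. The text only names the components, so the role given to each one here is an assumption made for illustration, as are the event fields and sample data. The broker also shows both styles of data movement: producers push events in with publish, and the consumer pulls whatever has accumulated with consume.

```python
import queue
from collections import Counter


def aggregator(*sources):
    """Assumed role: merge events from several sources into one stream."""
    for source in sources:
        yield from source


class Broker:
    """Assumed role: buffer events between producers and consumers."""

    def __init__(self):
        self._queue = queue.Queue()

    def publish(self, event):
        """Push: a producer hands over an event as soon as it occurs."""
        self._queue.put(event)

    def consume(self):
        """Pull: a consumer drains whatever has accumulated so far."""
        while not self._queue.empty():
            yield self._queue.get()


def stream_processor(events):
    """Assumed role: filter and reshape events as they flow past."""
    for event in events:
        if event["type"] == "purchase":
            yield event["user"], event["amount"]


def analytics_engine(pairs):
    """Assumed role: combine the processed values into a result."""
    totals = Counter()
    for user, amount in pairs:
        totals[user] += amount
    return dict(totals)


# Hypothetical events from two sources.
web = [{"type": "purchase", "user": "alice", "amount": 20},
       {"type": "view", "user": "bob"}]
app = [{"type": "purchase", "user": "alice", "amount": 5},
       {"type": "purchase", "user": "bob", "amount": 7}]

broker = Broker()
for event in aggregator(web, app):
    broker.publish(event)

result = analytics_engine(stream_processor(broker.consume()))
print(result)  # {'alice': 25, 'bob': 7}
```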

Benefits of Real-time Analytics


Speed is the primary benefit of real-time data analytics. The less time a company has to wait between
the moment data arrives and the moment it is processed, the sooner the business can use data insights to
make changes and act on crucial decisions.

In the same way, real-time analytics tools allow companies to see how users interact with a product as
soon as it is released, so there is no delay in understanding the behaviour of users and making the
necessary adjustments.

Advantages of Real-time Analytics:


Real-time analytics provides the following benefits over traditional analytics.

o Create custom interactive analytics tools.
o Share information through transparent dashboards.
o Monitor behaviour in a customized way.
o Perform immediate adjustments if necessary.
o Make use of machine learning.

Other Benefits:
Other benefits include managing data location, detecting irregularities, and improving marketing and
sales. The following benefits can also be useful.
