Confidential
Zaloni Data Lake Architecture for Data-Driven Decision
Maksym Demianovskyi
Denys Skalskyi co-author
Agenda
➢ Data evolution
➢ Why is data so important?
➢ Data-driven decision process
➢ Zaloni Data Lake architecture
Main stages of information evolution
1. The first revolution is associated with the invention of writing, which led to a giant qualitative and quantitative leap. It
became possible to transfer knowledge from generation to generation.
2. The second (mid-15th century) was caused by the invention of printing, which radically changed society,
culture, and the organization of activities.
3. The third (end of the 19th century) was driven by the practical use of electricity, thanks to which the telegraph, the
telephone, and the radio appeared, allowing the rapid transmission and accumulation of information in any volume.
4. The fourth (the information explosion of the 1970s) is associated with the invention of microprocessor technology and the
appearance of the personal computer. Computers, computer networks, and data transmission systems (information
communications) are built on microprocessors and integrated circuits.
The amount of data produced by humanity
By one commonly quoted estimate, all the information humanity produced before 2003 is less
than the amount of data produced in a single day in 2023.
Estimates of how much data existed by the end of 2022 range from 94 zettabytes
(Source: Bernard Marr & Co.) to 97 zettabytes. 1 ZB is the equivalent of
1,000 exabytes.
Do you know how much 181 zettabytes is? Let’s put it this
way: if you tried to download it by yourself, it would take you
about two billion years!
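As a rough check on the two-billion-year figure, the sketch below computes the average download speed that number implies. The function name and the 365.25-day year are assumptions made for illustration; the slide does not say which connection speed it had in mind.

```python
ZETTABYTE = 10**21  # bytes, using decimal SI prefixes (1 ZB = 1,000 EB)
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed average year length


def implied_speed_mbit_per_s(total_bytes: float, years: float) -> float:
    """Average speed in megabits per second needed to move total_bytes in `years` years."""
    bytes_per_second = total_bytes / (years * SECONDS_PER_YEAR)
    return bytes_per_second * 8 / 1e6


# 181 ZB downloaded over two billion years
speed = implied_speed_mbit_per_s(181 * ZETTABYTE, 2e9)
print(f"{speed:.1f} Mbit/s")
```

Under these assumptions the claim corresponds to a connection of roughly 23 Mbit/s running without interruption, which is an ordinary home broadband speed.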
Data usage facts
● A single person generates 1.7 MB of data every second
● Facebook generates 4 PB of data daily
● One person generates 49.8 GB of IP traffic every month
● YouTubers upload 500 hours of video per minute, which means 30,000 hours of content every hour
● Video traffic makes up 82% of all consumer internet traffic
● 50% of all data will be in the cloud by 2025
● No less than 2.5 quintillion bytes are created every day! (That’s two exabytes plus 500 petabytes.)
● AWS Snowmobile has a capacity of up to 100 petabytes
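The unit conversions behind these facts are easy to verify. The following is a minimal sketch using decimal (SI) byte units; the variable names are illustrative, and the Snowmobile count is only a back-of-the-envelope comparison, not a claim from the slides.

```python
# Decimal (SI) byte units
PB = 10**15
EB = 10**18
ZB = 10**21

# 2.5 quintillion bytes created per day
daily_bytes = 2_500_000_000_000_000_000
whole_exabytes, remainder = divmod(daily_bytes, EB)
leftover_petabytes = remainder // PB
print(whole_exabytes, "EB +", leftover_petabytes, "PB")

# 500 hours of video uploaded per minute, expressed per hour
hours_per_hour = 500 * 60
print(hours_per_hour, "hours of video per hour")

# How many 100 PB containers one day of data would fill (ceiling division)
snowmobile_capacity = 100 * PB
snowmobiles_per_day = -(-daily_bytes // snowmobile_capacity)
print(snowmobiles_per_day, "containers of 100 PB")

# 1 ZB expressed in exabytes
print(ZB // EB, "EB per ZB")
```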
Data is not only numbers
We have a lot of data, and a lot of garbage in
that data. By itself, raw data does not mean anything.
To turn it into useful information, we have to
clean the data (fixing or removing incorrect, corrupted,
incorrectly formatted, duplicate, or incomplete data
within a dataset) and compute statistics on the cleaned
data.
Once we have structured information, we can draw
conclusions and decide on measures. Repeating that process
continuously helps to reach ambitious goals.
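The cleaning step described above can be sketched in a few lines of standard-library Python. The records, field names, and the `clean` function are hypothetical and exist only to illustrate removing corrupted, incomplete, badly formatted, and duplicate entries before computing statistics.

```python
from statistics import mean, median

# Hypothetical raw records with typical quality problems
raw_records = [
    {"customer": " Alice ", "order_total": "120.50"},
    {"customer": "alice", "order_total": "120.50"},  # duplicate once normalized
    {"customer": "Bob", "order_total": "n/a"},       # corrupted value
    {"customer": "", "order_total": "75.00"},        # incomplete record
    {"customer": "Carol", "order_total": " 80 "},    # incorrectly formatted
]


def clean(records):
    """Normalize fields, then drop unparsable, incomplete, and duplicate records."""
    seen = set()
    cleaned = []
    for record in records:
        name = record.get("customer", "").strip().lower()
        try:
            total = float(record.get("order_total", "").strip())
        except ValueError:
            continue  # corrupted: the amount is not a number
        if not name:
            continue  # incomplete: no customer name
        key = (name, total)
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append({"customer": name, "order_total": total})
    return cleaned


cleaned = clean(raw_records)
totals = [r["order_total"] for r in cleaned]
print(cleaned)
print("mean:", mean(totals), "median:", median(totals))
```

Only two of the five records survive here, which is the point: statistics computed on the raw list would either fail or be skewed by the bad rows.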
Why is data so important?
● Helps you make better and smarter decisions
● Keeps your business up to date
● Improves financial management
● Improves performance and makes internal operations more efficient
● Creates a data-driven culture
● Enables better customer service
How companies use data to make decisions
Using Data To Create New Blockbuster Hit Series
The company used its data to run predictive analyses and learn what
exactly its customers would be receptive to and interested in watching.
Providing Faster & More Efficient Rides With Data
The company analyzes historical data and key metrics, including the number of
ride requests and fulfilled trips in different parts of a city and the times at which they
occur. This gives insight into areas that have a supply crunch, allowing the company
to inform drivers ahead of time to move to those areas and capitalize on the
expected rise in demand.
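The ride analysis above can be illustrated with a small sketch that flags area and hour combinations where fulfilled trips fall well short of requests. The data, the function name, and the 80% threshold are all hypothetical; the slides do not describe how the company actually computes this.

```python
from collections import defaultdict

# Hypothetical history: (area, hour_of_day, ride_requests, fulfilled_trips)
history = [
    ("downtown", 8, 120, 90),
    ("downtown", 8, 100, 70),
    ("airport", 8, 60, 58),
    ("suburb", 18, 40, 39),
    ("downtown", 18, 150, 140),
]


def supply_crunch_slots(records, max_fulfillment=0.8):
    """Return {(area, hour): fulfillment rate} for slots below the threshold."""
    requests = defaultdict(int)
    fulfilled = defaultdict(int)
    for area, hour, requested, done in records:
        requests[(area, hour)] += requested
        fulfilled[(area, hour)] += done

    crunch = {}
    for slot, requested in requests.items():
        if requested == 0:
            continue  # no demand recorded, nothing to compare
        rate = fulfilled[slot] / requested
        if rate < max_fulfillment:
            crunch[slot] = round(rate, 2)
    return crunch


print(supply_crunch_slots(history))
```

With this sample data only the downtown morning slot is flagged, so that is where drivers would be directed ahead of time.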
Another company uses geographic information systems to analyze factors such as demographic
information and traffic flow to choose the best locations to expand into. This not only
helps with choosing locations but also helps determine which products would sell best in
a given area.
Who makes decisions?
● Medical diagnosis
● Legal matters
● Human resources
● Ethical decision-making
● Creative industries
● Fraud detection
● Customer service
● Trading and investment
● Route management systems
● Advertising decisions
Data-Driven decision process
High-level component diagram
Producer
● Web and mobile apps
● Services
● Devices and IoT
● Logs and metrics
Storage
● Data Lake
● Data Warehouse
● Databases
● Files
Data Processing
● Apache Spark
● Google BigQuery
● AWS Athena
● Azure Data Factory
Analyze
● Tableau
● Power BI
● Analysts
● Third-party services
Future-proofing data lake stack
● Data collection and integration: data lakes allow for the collection and
integration of various types of data from different sources
● Real-time data processing: data lakes enable real-time data processing
● Data analysis: data lakes allow for the analysis of large amounts of data
● Scalability: data lakes can scale to meet the needs of the business
● Efficiency: data lakes allow for the efficient use of existing
resources, reducing costs associated with data processing and
storage
● Ease of use: data lakes provide quick and easy access to data,
so users can retrieve the information they need
Zaloni Data Lake architecture
● Understanding industry best practices
● Providing a template for solutioning
● Tracking a process
● Understanding structures and elements
Zaloni zones
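The zone diagram from this slide is not reproduced in the text. As a hedged illustration, the sketch below assumes the zone names commonly given for Zaloni's reference architecture (transient landing, raw, trusted, refined) and leaves out the sandbox zone, which sits beside this flow rather than in it. The `Zone` enum and `promote` function are hypothetical, not part of any Zaloni product API.

```python
from enum import Enum


class Zone(Enum):
    # Assumed zone names; the original slide showed them only as a diagram
    TRANSIENT = 1  # short-lived landing area for incoming data
    RAW = 2        # data stored as ingested
    TRUSTED = 3    # validated and cleaned data
    REFINED = 4    # data transformed for analytics and reporting


def promote(current: Zone, checks_passed: bool) -> Zone:
    """Move a dataset to the next zone only if its quality checks passed."""
    if not checks_passed or current is Zone.REFINED:
        return current
    return Zone(current.value + 1)


# A dataset that fails one check on its way through the lake
zone = Zone.TRANSIENT
for passed in (True, True, False, True):
    zone = promote(zone, passed)
print(zone.name)
```

The idea the sketch tries to capture is that data only moves forward after validation, so consumers of a later zone can rely on a known level of quality.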
Pros and cons of Zaloni architecture
Pros
● Intuitively clear
● Access to raw and formatted data
● Flexible and scalable architecture that can accommodate different data types, formats, and sources
● Modular and extensible architecture that can be customized to meet specific needs
Cons
● Can be complex to implement and may require specialized expertise
● May be overkill for smaller organizations or those with limited data needs
● May not be well suited for organizations that require real-time or near-real-time data processing
● May not be easily customizable to fit specific business needs or use cases
Alternative approaches
● Lambda Architecture
● Kappa Architecture
● Data Mesh Architecture
● Virtualized Data Architecture
Summary
● Data is important for businesses because it can help inform decision-making, improve
operational efficiency, and identify new business opportunities
● Real-life examples of data-driven decisions include optimizing website design, improving app
usability, and informing product development
● Data storage options vary, and a data lake is a suitable choice when dealing with diverse and
unstructured data from multiple sources. It provides flexibility and agility for storing
and analyzing data
● The Zaloni Data Lake architecture helps to build a flexible and scalable architecture
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Ad

GlobalLogic Java Community Webinar #16 “Zaloni’s Architecture for Data-Driven Design”

  • 1. Confidential. Zaloni Data Lake Architecture for Data-Driven Decision. Maksym Demianovskyi; Denys Skalskyi (co-author)
  • 2. Agenda
      ➢ Data evolution
      ➢ Why is data so important?
      ➢ Data-driven decision process
      ➢ Zaloni Data Lake architecture
  • 3. Main stages of information evolution
      1. The first revolution is associated with the invention of writing, which led to a giant qualitative and quantitative leap. It became possible to transfer knowledge from generation to generation.
      2. The second (mid-15th century) was caused by the invention of printing, which radically changed society, culture, and the organization of activities.
      3. The third (end of the 19th century) was caused by the harnessing of electricity, thanks to which the telegraph, the telephone, and the radio appeared, allowing the rapid transmission and accumulation of information in any volume.
      4. The fourth, the information explosion (1970s), is associated with the invention of microprocessor technology and the appearance of the personal computer. Computers, computer networks, and data transmission systems (information communications) are built on microprocessors and integrated circuits.
  • 4. The amount of data produced by humanity
      Consider, for instance, that the amount of information produced by humanity before 2003 is less than the amount of data produced in a single day in 2023.
      Consider also how much data had been produced by the end of 2022: 97 zettabytes.
      By the end of 2022, there were 94 zettabytes of data in the world. (Source: Bernard Marr & Co.) 1 ZB is the equivalent of 1,000 exabytes.
      Do you know how much 181 zettabytes is? Let’s put it this way: if you ever tried downloading it by yourself, it would take you about two billion years!
  • 5. Data usage facts
      ● A single person generates 1.7 MB of data every second
      ● Facebook generates 4 PB of data daily
      ● One person generates 49.8 GB of IP traffic every month
      ● YouTubers upload 500 hours of video per minute, which means 30,000 hours of content every hour
      ● Video traffic makes up 82% of all consumer internet traffic
      ● 50% of all data will be in the cloud by 2025
      ● No less than 2.5 quintillion bytes are created every day! (That’s two exabytes plus 500 petabytes.)
      ● AWS Snowmobile has a capacity of up to 100 petabytes
  • 6. Data is not only numbers
      We have a lot of data, and a lot of garbage within that data; by itself it does not mean anything. To turn it into useful information, we have to clean the data (fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset) and compute statistics on the cleaned data. Once we have structured information, we can draw conclusions and decide on measures. Running this process continuously helps to reach ambitious goals.
  • 7. Why is data so important?
      ● Helps you make better and smarter decisions
      ● Keeps your business up to date
      ● Improves financial management
      ● Brings better performance and more efficient internal operations
      ● Creates a data-driven culture
      ● Enables better customer service
  • 8. How companies use data to make decisions
      ● Using data to create a new blockbuster hit series: they used the power of their data to run predictive analyses and learn exactly what their customers would be receptive to and interested in watching.
      ● Providing faster and more efficient rides with data: the company analyzes historical data and key metrics, including the number of ride requests and trips fulfilled in different parts of a city and the times at which they occur. This gives insight into areas that have a supply crunch, allowing the company to pre-emptively tell drivers to move to those areas ahead of time in order to capitalize on the inevitable rise in demand.
      ● Uses geographic information systems to analyze factors such as demographic and traffic flow information to choose the best locations to expand into. This not only helps with choosing locations but also determines which products would sell best in a given area.
  • 9. Who makes decisions?
      ● Medical diagnosis
      ● Legal matters
      ● Human resources
      ● Ethical decision-making
      ● Creative industries
      ● Fraud detection
      ● Customer service
      ● Trading and investment
      ● Route management systems
      ● Advertising decisions
  • 11. High-level component diagram
      Stages: Producer, Storage, Data Processing, Analyze
      ● Web and mobile apps ● Services ● Devices and IoT ● Logs and metrics
      ● Apache Spark ● Google BigQuery ● AWS Athena ● Azure Data Factory
      ● Data lake ● Data warehouse ● Databases ● Files
      ● Tableau ● Power BI ● Analysts ● 3rd-party services
  • 12. Future-proofing data lake stack
      ● Data collection and integration: data lakes allow for the collection and integration of various types of data from different sources
      ● Real-time data processing: data lakes enable real-time data processing
      ● Data analysis: data lakes allow for the analysis of large amounts of data
      ● Scalability: data lakes can scale to meet the needs of the business
      ● Efficiency: data lakes allow for the efficient use of existing resources, reducing costs associated with data processing and storage
      ● Ease of use: data lakes provide quick and easy access to data, so users can retrieve information without friction
  • 13. Zaloni Data Lake architecture
      ● Understanding industry best practices
      ● Providing a template for solutioning
      ● Tracking a process
      ● Understanding structures and elements
  • 15. Pros and cons of Zaloni architecture
      Pros:
      ● Intuitively clear
      ● Access to raw and formatted data
      ● Flexible and scalable architecture that can accommodate different data types, formats, and sources
      ● Offers a modular and extensible architecture that can be customized to meet specific needs
      Cons:
      ● Can be complex to implement and may require specialized expertise
      ● Architecture may be overkill for smaller organizations or those with limited data needs
      ● May not be well-suited for organizations that require real-time or near-real-time data processing
      ● Architecture may not be easily customizable to fit specific business needs or use cases
  • 16. Alternative approaches
      ● Lambda Architecture
      ● Kappa Architecture
      ● Data Mesh Architecture
      ● Virtualized Data Architecture
  • 17. Summary
      ● Data is important for businesses because it can help inform decision-making, improve operational efficiency, and identify new business opportunities
      ● Real-life examples of data-driven decisions include optimizing website design, improving app usability, and informing product development
      ● Data storage options vary, and a data lake is a suitable choice when dealing with diverse and unstructured data from multiple sources. It provides flexibility and agility for storing and analyzing data
      ● The Zaloni Data Lake architecture helps to build a flexible and scalable architecture
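
The cleaning step described on slide 6 (removing incorrect, incorrectly formatted, duplicate, or incomplete records, then computing statistics on what remains) can be illustrated with a minimal Python sketch. This is not taken from the deck: the record fields `user` and `amount`, the rule that negative amounts are invalid, and the function names `clean_records` and `summarize` are all assumptions made for illustration.

```python
from statistics import mean, median


def clean_records(records):
    """Drop incomplete, badly formatted, invalid, and duplicate rows.

    The 'user' and 'amount' fields are hypothetical example fields.
    """
    seen = set()
    cleaned = []
    for row in records:
        user = (row.get("user") or "").strip().lower()
        raw_amount = row.get("amount")
        if not user or raw_amount is None:
            continue  # incomplete record
        try:
            amount = float(raw_amount)
        except (TypeError, ValueError):
            continue  # incorrectly formatted value
        if amount < 0:
            continue  # treated as incorrect here (an assumption of this sketch)
        key = (user, amount)
        if key in seen:
            continue  # duplicate after normalization
        seen.add(key)
        cleaned.append({"user": user, "amount": amount})
    return cleaned


def summarize(cleaned):
    """Compute simple statistics over the cleaned 'amount' values."""
    amounts = [row["amount"] for row in cleaned]
    if not amounts:
        return {"count": 0, "mean": None, "median": None}
    return {"count": len(amounts), "mean": mean(amounts), "median": median(amounts)}


raw_records = [
    {"user": "Alice", "amount": "10.5"},
    {"user": "alice ", "amount": "10.5"},  # duplicate once normalized
    {"user": "Bob", "amount": "n/a"},      # incorrectly formatted
    {"user": "", "amount": "3"},           # incomplete
    {"user": "Carol", "amount": "-4"},     # invalid under the rule above
    {"user": "Dave", "amount": "20"},
]

if __name__ == "__main__":
    print(summarize(clean_records(raw_records)))
```

In a real data lake this logic would more likely run in a processing engine such as Apache Spark, which the component diagram lists; the plain-Python version only shows the order of the steps: normalize, validate, deduplicate, then aggregate.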