SlideShare a Scribd company logo
How to build
a data warehouse?
Dmytro Popovych, SE @ Tubular
Theory vs practice
Цитата #441422
Пока инженеры в белых халатах прикручивают красивый двигатель к
идеальному крылу, бригада взлохмаченных придурков во главе с
безумным авантюристом пролетает над ними на конструкции из
микроавтобуса, забора и двух промышленных фенов, навстречу
второму туру инвестиций.
Красивые проекты не взлетают, потому что они не успевают взлететь.
Agenda
• Problem statement:
• Data Ingestion
• Data Normalisation
• Data Access
• Our way to solve the problem :)
About us
• Video intelligence for the cross-platform world
• 30+ video platforms including YouTube, Facebook, Instagram
• 7M creators
• 3B videos
• 2Tb of newly ingested data a day
• 150Tb of data in the warehouse
What is a data warehouse?
A central repository of data collected from disparate sources.
ANALYST
ENGINEER
SERVICE
DATA
WAREHOUSE
Key features
Ingestion
Store raw data extracted from disparate data sources
Normalisation
Cleanup / combine raw data
Access
Help user to retrieve data
What problems does it solve in Tubular?
• For engineers / analysts:
• data discovery
• prototyping / analyse
• For services:
• data exchange
Data Ingestion
Data Ingestion Problems
• Real time data:
• tweets, comments, shares, views
• Periodical snapshots:
• dump of real time data
• results of the data analysis
• databases from internal services (in some cases)
Real time data
DATABUS / event log / message queue
Powered by KAFKA
Data serialised with AVRO
Keeps all events for the last N days
SERVICE #1 SERVICE #2
PERMANENT
STORAGE
...
Why did we choose Kafka?
• Stores streams of records in a fault-tolerant way
• Designed to serve multiple consumers per topic
• Allows to keep the last N days of records
• Tested in very big companies Linkedin, Twitter, Uber, Airbnb...
• Strict schema definition
• Safe schema evolution
• Compact (binary serialisation format)
• Cross-technology format (Java, Python, …)
• Has some ecosystem around (Schema Registry, CLI consumers, …)
• Hadoop-friendly
Why did we choose Avro?
Periodical Snapshots
DATABUS
SERVICE #1
Powered by ELASTIC
PERMANENT STORAGE
SERVICE #2
Powered by CASSANDRA
SERVICE #3
Powered by MYSQL
...
DATA IMPORT TOOL
Powered by S3
Data serialised with PARQUET
Powered by SPARK
Why did we choose S3?
• There is no need to support it
• Compatible with Hadoop ecosystem
• Relatively stable & cheap
Why did we choose Parquet?
• Column-oriented format (perfect for analytics and partial reads)
• Supports complex data structures
• Compatible with Hadoop ecosystem
Why did we choose Spark?
• Scalable data processing engine
• Faster than Hadoop
• Has connectors to all popular storages: JDBC, Elastic, Cassandra, Kafka
• Has Python bindings
• Built-in support of Parquet
Data Normalisation
Data Normalisation Problems
• Cleanup duplicates
• Partition by year / month / date / hour
• Join various data sources
Normalisation of real time data (example)
SERVICE #1
Powered by ELASTIC
DATABUS
UI
PERMANENT STORAGE
The service joins multiple data streams by sending
partial updates to Elastic.
Note! It isn’t the only way to implement a real time
join, more generic solution could be implemented
with Apache Samza.
Why did we choose Elastic?
• Provides real time search and analytics
• Has relatively cheap partial updates
• Easy to scale
Normalisation of previously imported data
DATA NORMALISATION TOOL
PERMANENT STORAGE
Powered by Spark
Joins various datasets
Removes duplicates
Creates partitions by time range buckets
Why did we choose Spark?
• Scalable data processing engine
• Has built-in SQL api to transform data (perfect for joins and deduplication)
Data Access
Data Access Problems
• Datasets discovery
• Unified data access interface
Metadata Storage
PERMANENT STORAGE
Parquet
# 1
Avro
#1
CSV
#1
Parquet
# 2
...
METADATA STORAGE
Table # 1
Table # 2
Table # 3
...
Powered by Hive Metastore
Why did we choose Hive Metastore?
• Supported by Hadoop ecosystem
• Simple (Thrift api on top of MySQL table)
• Supported by Hue (UI to access metadata)
Let's summarize...
System Overview
ANALYST,
ENGINEER
PERMANENT STORAGE
DATABUS
METADATA
STORAGE
IMPORT
TOOL
NORMALISATION
TOOL
WAREHOUSE
SERVICES
* Data flows for Metadata Storage are explained verbally, too many arrows...
Thanks! Questions?
Check this out: https://github.com/Tubular/sparkly
Ad

More Related Content

What's hot (20)

Eugene Polonichko "Architecture of modern data warehouse"
Eugene Polonichko "Architecture of modern data warehouse"Eugene Polonichko "Architecture of modern data warehouse"
Eugene Polonichko "Architecture of modern data warehouse"
Lviv Startup Club
 
Northwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to CloudNorthwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to Cloud
Databricks
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
SingleStore
 
Building Custom Big Data Integrations
Building Custom Big Data IntegrationsBuilding Custom Big Data Integrations
Building Custom Big Data Integrations
Pat Patterson
 
Integration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data LakeIntegration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data Lake
Tom Kerkhove
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
Kujambu Murugesan
 
Personalization Journey: From Single Node to Cloud Streaming
Personalization Journey: From Single Node to Cloud StreamingPersonalization Journey: From Single Node to Cloud Streaming
Personalization Journey: From Single Node to Cloud Streaming
Databricks
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQL
SingleStore
 
Snaplogic Live: Big Data in Motion
Snaplogic Live: Big Data in MotionSnaplogic Live: Big Data in Motion
Snaplogic Live: Big Data in Motion
SnapLogic
 
Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0
Sridevi Murugayen
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architecture
Sudheer Kondla
 
2016 Spark Summit East Keynote: Ali Ghodsi and Databricks Community Edition demo
2016 Spark Summit East Keynote: Ali Ghodsi and Databricks Community Edition demo2016 Spark Summit East Keynote: Ali Ghodsi and Databricks Community Edition demo
2016 Spark Summit East Keynote: Ali Ghodsi and Databricks Community Edition demo
Databricks
 
Industrializing Machine Learning on an Enterprise Azure Platform with Databri...
Industrializing Machine Learning on an Enterprise Azure Platform with Databri...Industrializing Machine Learning on an Enterprise Azure Platform with Databri...
Industrializing Machine Learning on an Enterprise Azure Platform with Databri...
Databricks
 
Strata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash RamineniStrata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash Ramineni
Avinash Ramineni
 
Scaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big DataScaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big Data
Treasure Data, Inc.
 
TechDays NL 2016 - Building your scalable secure IoT Solution on Azure
TechDays NL 2016 - Building your scalable secure IoT Solution on AzureTechDays NL 2016 - Building your scalable secure IoT Solution on Azure
TechDays NL 2016 - Building your scalable secure IoT Solution on Azure
Tom Kerkhove
 
Data Architecture Brief Overview
Data Architecture Brief OverviewData Architecture Brief Overview
Data Architecture Brief Overview
Hal Kalechofsky
 
Using Premium Data - for Business Analysts
Using Premium Data - for Business AnalystsUsing Premium Data - for Business Analysts
Using Premium Data - for Business Analysts
Lynn Langit
 
LogStash: Concept Run-Through
LogStash: Concept Run-ThroughLogStash: Concept Run-Through
LogStash: Concept Run-Through
Manuj Aggarwal
 
Options for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketOptions for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current Market
Dremio Corporation
 
Eugene Polonichko "Architecture of modern data warehouse"
Eugene Polonichko "Architecture of modern data warehouse"Eugene Polonichko "Architecture of modern data warehouse"
Eugene Polonichko "Architecture of modern data warehouse"
Lviv Startup Club
 
Northwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to CloudNorthwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to Cloud
Databricks
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
SingleStore
 
Building Custom Big Data Integrations
Building Custom Big Data IntegrationsBuilding Custom Big Data Integrations
Building Custom Big Data Integrations
Pat Patterson
 
Integration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data LakeIntegration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data Lake
Tom Kerkhove
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
Kujambu Murugesan
 
Personalization Journey: From Single Node to Cloud Streaming
Personalization Journey: From Single Node to Cloud StreamingPersonalization Journey: From Single Node to Cloud Streaming
Personalization Journey: From Single Node to Cloud Streaming
Databricks
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQL
SingleStore
 
Snaplogic Live: Big Data in Motion
Snaplogic Live: Big Data in MotionSnaplogic Live: Big Data in Motion
Snaplogic Live: Big Data in Motion
SnapLogic
 
Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0
Sridevi Murugayen
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architecture
Sudheer Kondla
 
2016 Spark Summit East Keynote: Ali Ghodsi and Databricks Community Edition demo
2016 Spark Summit East Keynote: Ali Ghodsi and Databricks Community Edition demo2016 Spark Summit East Keynote: Ali Ghodsi and Databricks Community Edition demo
2016 Spark Summit East Keynote: Ali Ghodsi and Databricks Community Edition demo
Databricks
 
Industrializing Machine Learning on an Enterprise Azure Platform with Databri...
Industrializing Machine Learning on an Enterprise Azure Platform with Databri...Industrializing Machine Learning on an Enterprise Azure Platform with Databri...
Industrializing Machine Learning on an Enterprise Azure Platform with Databri...
Databricks
 
Strata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash RamineniStrata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash Ramineni
Avinash Ramineni
 
Scaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big DataScaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big Data
Treasure Data, Inc.
 
TechDays NL 2016 - Building your scalable secure IoT Solution on Azure
TechDays NL 2016 - Building your scalable secure IoT Solution on AzureTechDays NL 2016 - Building your scalable secure IoT Solution on Azure
TechDays NL 2016 - Building your scalable secure IoT Solution on Azure
Tom Kerkhove
 
Data Architecture Brief Overview
Data Architecture Brief OverviewData Architecture Brief Overview
Data Architecture Brief Overview
Hal Kalechofsky
 
Using Premium Data - for Business Analysts
Using Premium Data - for Business AnalystsUsing Premium Data - for Business Analysts
Using Premium Data - for Business Analysts
Lynn Langit
 
LogStash: Concept Run-Through
LogStash: Concept Run-ThroughLogStash: Concept Run-Through
LogStash: Concept Run-Through
Manuj Aggarwal
 
Options for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketOptions for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current Market
Dremio Corporation
 

Similar to Дмитрий Попович "How to build a data warehouse?" (20)

Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatterns
Claudiu Barbura
 
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
HostedbyConfluent
 
Adding structure to your streaming pipelines: moving from Spark streaming to ...
Adding structure to your streaming pipelines: moving from Spark streaming to ...Adding structure to your streaming pipelines: moving from Spark streaming to ...
Adding structure to your streaming pipelines: moving from Spark streaming to ...
DataWorks Summit
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
Spark Summit
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Denodo
 
Serverless SQL
Serverless SQLServerless SQL
Serverless SQL
Torsten Steinbach
 
Yellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time AnalyticsYellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Data
 
Real time analytics
Real time analyticsReal time analytics
Real time analytics
Leandro Totino Pereira
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
AWS Chicago
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
C4Media
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
Timothy Spann
 
Lecture 5- Data Collection and Storage.pptx
Lecture 5- Data Collection and Storage.pptxLecture 5- Data Collection and Storage.pptx
Lecture 5- Data Collection and Storage.pptx
Brianc34
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
Kent Graziano
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
Bigstep
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurge
RTTS
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of Metadata
Jim Dowling
 
DataEngConf: Parquet at Datadog: Fast, Efficient, Portable Storage for Big Data
DataEngConf: Parquet at Datadog: Fast, Efficient, Portable Storage for Big DataDataEngConf: Parquet at Datadog: Fast, Efficient, Portable Storage for Big Data
DataEngConf: Parquet at Datadog: Fast, Efficient, Portable Storage for Big Data
Hakka Labs
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Alluxio, Inc.
 
Apache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoTApache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoT
jixuan1989
 
How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...
PerformanceVision (previously SecurActive)
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatterns
Claudiu Barbura
 
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
HostedbyConfluent
 
Adding structure to your streaming pipelines: moving from Spark streaming to ...
Adding structure to your streaming pipelines: moving from Spark streaming to ...Adding structure to your streaming pipelines: moving from Spark streaming to ...
Adding structure to your streaming pipelines: moving from Spark streaming to ...
DataWorks Summit
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
Spark Summit
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Denodo
 
Yellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time AnalyticsYellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Data
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
AWS Chicago
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
C4Media
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
Timothy Spann
 
Lecture 5- Data Collection and Storage.pptx
Lecture 5- Data Collection and Storage.pptxLecture 5- Data Collection and Storage.pptx
Lecture 5- Data Collection and Storage.pptx
Brianc34
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
Kent Graziano
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
Bigstep
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurge
RTTS
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of Metadata
Jim Dowling
 
DataEngConf: Parquet at Datadog: Fast, Efficient, Portable Storage for Big Data
DataEngConf: Parquet at Datadog: Fast, Efficient, Portable Storage for Big DataDataEngConf: Parquet at Datadog: Fast, Efficient, Portable Storage for Big Data
DataEngConf: Parquet at Datadog: Fast, Efficient, Portable Storage for Big Data
Hakka Labs
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Alluxio, Inc.
 
Apache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoTApache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoT
jixuan1989
 
How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...
PerformanceVision (previously SecurActive)
 
Ad

More from Fwdays (20)

Від KPI до OKR: як синхронізувати продажі, маркетинг і продукт, щоб бізнес ре...
Від KPI до OKR: як синхронізувати продажі, маркетинг і продукт, щоб бізнес ре...Від KPI до OKR: як синхронізувати продажі, маркетинг і продукт, щоб бізнес ре...
Від KPI до OKR: як синхронізувати продажі, маркетинг і продукт, щоб бізнес ре...
Fwdays
 
"Demand Generation: How a Founder’s Brand Turns Content into Leads", Alex Her...
"Demand Generation: How a Founder’s Brand Turns Content into Leads", Alex Her..."Demand Generation: How a Founder’s Brand Turns Content into Leads", Alex Her...
"Demand Generation: How a Founder’s Brand Turns Content into Leads", Alex Her...
Fwdays
 
"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko
Fwdays
 
"Must-have AI-tools for cost-efficient marketing", Irina Smirnova
"Must-have AI-tools for cost-efficient marketing",  Irina Smirnova"Must-have AI-tools for cost-efficient marketing",  Irina Smirnova
"Must-have AI-tools for cost-efficient marketing", Irina Smirnova
Fwdays
 
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5..."Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
Fwdays
 
"Building a Product IT Team in a Defense-Tech Company", Arthur Seletskiy
"Building a Product IT Team in a Defense-Tech Company", Arthur Seletskiy"Building a Product IT Team in a Defense-Tech Company", Arthur Seletskiy
"Building a Product IT Team in a Defense-Tech Company", Arthur Seletskiy
Fwdays
 
"Scaling Smart: GTM Strategies that Fuel Growth for Service IT Companies", V...
"Scaling Smart: GTM Strategies that Fuel Growth for Service IT Companies",  V..."Scaling Smart: GTM Strategies that Fuel Growth for Service IT Companies",  V...
"Scaling Smart: GTM Strategies that Fuel Growth for Service IT Companies", V...
Fwdays
 
"Pushy Sales Don’t Work: How to Sell Without Driving People Crazy", Aliona Ka...
"Pushy Sales Don’t Work: How to Sell Without Driving People Crazy", Aliona Ka..."Pushy Sales Don’t Work: How to Sell Without Driving People Crazy", Aliona Ka...
"Pushy Sales Don’t Work: How to Sell Without Driving People Crazy", Aliona Ka...
Fwdays
 
Performance Marketing Research для запуску нового WorldWide продукту
Performance Marketing Research для запуску нового WorldWide продуктуPerformance Marketing Research для запуску нового WorldWide продукту
Performance Marketing Research для запуску нового WorldWide продукту
Fwdays
 
"Scaling Product Mindset: From Individual Ideas to Team Culture", Oksana Holu...
"Scaling Product Mindset: From Individual Ideas to Team Culture", Oksana Holu..."Scaling Product Mindset: From Individual Ideas to Team Culture", Oksana Holu...
"Scaling Product Mindset: From Individual Ideas to Team Culture", Oksana Holu...
Fwdays
 
"AI-Driven Automation for High-Performing Teams: Optimize Routine Tasks & Lea...
"AI-Driven Automation for High-Performing Teams: Optimize Routine Tasks & Lea..."AI-Driven Automation for High-Performing Teams: Optimize Routine Tasks & Lea...
"AI-Driven Automation for High-Performing Teams: Optimize Routine Tasks & Lea...
Fwdays
 
"Constructive Interaction During Emotional Burnout: With Local and Internatio...
"Constructive Interaction During Emotional Burnout: With Local and Internatio..."Constructive Interaction During Emotional Burnout: With Local and Internatio...
"Constructive Interaction During Emotional Burnout: With Local and Internatio...
Fwdays
 
"Perfectionisin: What Does the Medicine for Perfectionism Look Like?", Manoil...
"Perfectionisin: What Does the Medicine for Perfectionism Look Like?", Manoil..."Perfectionisin: What Does the Medicine for Perfectionism Look Like?", Manoil...
"Perfectionisin: What Does the Medicine for Perfectionism Look Like?", Manoil...
Fwdays
 
"39 offers for my mentees in a year. How to create a professional environment...
"39 offers for my mentees in a year. How to create a professional environment..."39 offers for my mentees in a year. How to create a professional environment...
"39 offers for my mentees in a year. How to create a professional environment...
Fwdays
 
"From “doing tasks” to leadership: how to adapt management style to the conte...
"From “doing tasks” to leadership: how to adapt management style to the conte..."From “doing tasks” to leadership: how to adapt management style to the conte...
"From “doing tasks” to leadership: how to adapt management style to the conte...
Fwdays
 
[QUICK TALK] "Why Some Teams Grow Better Under Pressure", Oleksandr Marchenko...
[QUICK TALK] "Why Some Teams Grow Better Under Pressure", Oleksandr Marchenko...[QUICK TALK] "Why Some Teams Grow Better Under Pressure", Oleksandr Marchenko...
[QUICK TALK] "Why Some Teams Grow Better Under Pressure", Oleksandr Marchenko...
Fwdays
 
[QUICK TALK] "How to study to acquire a skill, not a certificate?", Uliana Du...
[QUICK TALK] "How to study to acquire a skill, not a certificate?", Uliana Du...[QUICK TALK] "How to study to acquire a skill, not a certificate?", Uliana Du...
[QUICK TALK] "How to study to acquire a skill, not a certificate?", Uliana Du...
Fwdays
 
[QUICK TALK] "Coaching 101: How to Identify and Develop Your Leadership Quali...
[QUICK TALK] "Coaching 101: How to Identify and Develop Your Leadership Quali...[QUICK TALK] "Coaching 101: How to Identify and Develop Your Leadership Quali...
[QUICK TALK] "Coaching 101: How to Identify and Develop Your Leadership Quali...
Fwdays
 
"Dialogue about fakapas: how to pass an interview without unnecessary mistake...
"Dialogue about fakapas: how to pass an interview without unnecessary mistake..."Dialogue about fakapas: how to pass an interview without unnecessary mistake...
"Dialogue about fakapas: how to pass an interview without unnecessary mistake...
Fwdays
 
"Conflicts within a Team: Not an Enemy, But an Opportunity for Growth", Orest...
"Conflicts within a Team: Not an Enemy, But an Opportunity for Growth", Orest..."Conflicts within a Team: Not an Enemy, But an Opportunity for Growth", Orest...
"Conflicts within a Team: Not an Enemy, But an Opportunity for Growth", Orest...
Fwdays
 
Від KPI до OKR: як синхронізувати продажі, маркетинг і продукт, щоб бізнес ре...
Від KPI до OKR: як синхронізувати продажі, маркетинг і продукт, щоб бізнес ре...Від KPI до OKR: як синхронізувати продажі, маркетинг і продукт, щоб бізнес ре...
Від KPI до OKR: як синхронізувати продажі, маркетинг і продукт, щоб бізнес ре...
Fwdays
 
"Demand Generation: How a Founder’s Brand Turns Content into Leads", Alex Her...
"Demand Generation: How a Founder’s Brand Turns Content into Leads", Alex Her..."Demand Generation: How a Founder’s Brand Turns Content into Leads", Alex Her...
"Demand Generation: How a Founder’s Brand Turns Content into Leads", Alex Her...
Fwdays
 
"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko
Fwdays
 
"Must-have AI-tools for cost-efficient marketing", Irina Smirnova
"Must-have AI-tools for cost-efficient marketing",  Irina Smirnova"Must-have AI-tools for cost-efficient marketing",  Irina Smirnova
"Must-have AI-tools for cost-efficient marketing", Irina Smirnova
Fwdays
 
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5..."Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
Fwdays
 
"Building a Product IT Team in a Defense-Tech Company", Arthur Seletskiy
"Building a Product IT Team in a Defense-Tech Company", Arthur Seletskiy"Building a Product IT Team in a Defense-Tech Company", Arthur Seletskiy
"Building a Product IT Team in a Defense-Tech Company", Arthur Seletskiy
Fwdays
 
"Scaling Smart: GTM Strategies that Fuel Growth for Service IT Companies", V...
"Scaling Smart: GTM Strategies that Fuel Growth for Service IT Companies",  V..."Scaling Smart: GTM Strategies that Fuel Growth for Service IT Companies",  V...
"Scaling Smart: GTM Strategies that Fuel Growth for Service IT Companies", V...
Fwdays
 
"Pushy Sales Don’t Work: How to Sell Without Driving People Crazy", Aliona Ka...
"Pushy Sales Don’t Work: How to Sell Without Driving People Crazy", Aliona Ka..."Pushy Sales Don’t Work: How to Sell Without Driving People Crazy", Aliona Ka...
"Pushy Sales Don’t Work: How to Sell Without Driving People Crazy", Aliona Ka...
Fwdays
 
Performance Marketing Research для запуску нового WorldWide продукту
Performance Marketing Research для запуску нового WorldWide продуктуPerformance Marketing Research для запуску нового WorldWide продукту
Performance Marketing Research для запуску нового WorldWide продукту
Fwdays
 
"Scaling Product Mindset: From Individual Ideas to Team Culture", Oksana Holu...
"Scaling Product Mindset: From Individual Ideas to Team Culture", Oksana Holu..."Scaling Product Mindset: From Individual Ideas to Team Culture", Oksana Holu...
"Scaling Product Mindset: From Individual Ideas to Team Culture", Oksana Holu...
Fwdays
 
"AI-Driven Automation for High-Performing Teams: Optimize Routine Tasks & Lea...
"AI-Driven Automation for High-Performing Teams: Optimize Routine Tasks & Lea..."AI-Driven Automation for High-Performing Teams: Optimize Routine Tasks & Lea...
"AI-Driven Automation for High-Performing Teams: Optimize Routine Tasks & Lea...
Fwdays
 
"Constructive Interaction During Emotional Burnout: With Local and Internatio...
"Constructive Interaction During Emotional Burnout: With Local and Internatio..."Constructive Interaction During Emotional Burnout: With Local and Internatio...
"Constructive Interaction During Emotional Burnout: With Local and Internatio...
Fwdays
 
"Perfectionisin: What Does the Medicine for Perfectionism Look Like?", Manoil...
"Perfectionisin: What Does the Medicine for Perfectionism Look Like?", Manoil..."Perfectionisin: What Does the Medicine for Perfectionism Look Like?", Manoil...
"Perfectionisin: What Does the Medicine for Perfectionism Look Like?", Manoil...
Fwdays
 
"39 offers for my mentees in a year. How to create a professional environment...
"39 offers for my mentees in a year. How to create a professional environment..."39 offers for my mentees in a year. How to create a professional environment...
"39 offers for my mentees in a year. How to create a professional environment...
Fwdays
 
"From “doing tasks” to leadership: how to adapt management style to the conte...
"From “doing tasks” to leadership: how to adapt management style to the conte..."From “doing tasks” to leadership: how to adapt management style to the conte...
"From “doing tasks” to leadership: how to adapt management style to the conte...
Fwdays
 
[QUICK TALK] "Why Some Teams Grow Better Under Pressure", Oleksandr Marchenko...
[QUICK TALK] "Why Some Teams Grow Better Under Pressure", Oleksandr Marchenko...[QUICK TALK] "Why Some Teams Grow Better Under Pressure", Oleksandr Marchenko...
[QUICK TALK] "Why Some Teams Grow Better Under Pressure", Oleksandr Marchenko...
Fwdays
 
[QUICK TALK] "How to study to acquire a skill, not a certificate?", Uliana Du...
[QUICK TALK] "How to study to acquire a skill, not a certificate?", Uliana Du...[QUICK TALK] "How to study to acquire a skill, not a certificate?", Uliana Du...
[QUICK TALK] "How to study to acquire a skill, not a certificate?", Uliana Du...
Fwdays
 
[QUICK TALK] "Coaching 101: How to Identify and Develop Your Leadership Quali...
[QUICK TALK] "Coaching 101: How to Identify and Develop Your Leadership Quali...[QUICK TALK] "Coaching 101: How to Identify and Develop Your Leadership Quali...
[QUICK TALK] "Coaching 101: How to Identify and Develop Your Leadership Quali...
Fwdays
 
"Dialogue about fakapas: how to pass an interview without unnecessary mistake...
"Dialogue about fakapas: how to pass an interview without unnecessary mistake..."Dialogue about fakapas: how to pass an interview without unnecessary mistake...
"Dialogue about fakapas: how to pass an interview without unnecessary mistake...
Fwdays
 
"Conflicts within a Team: Not an Enemy, But an Opportunity for Growth", Orest...
"Conflicts within a Team: Not an Enemy, But an Opportunity for Growth", Orest..."Conflicts within a Team: Not an Enemy, But an Opportunity for Growth", Orest...
"Conflicts within a Team: Not an Enemy, But an Opportunity for Growth", Orest...
Fwdays
 
Ad

Recently uploaded (20)

Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 

Дмитрий Попович "How to build a data warehouse?"

  • 1. How to build a data warehouse? Dmytro Popovych, SE @ Tubular
  • 2. Theory vs practice Цитата #441422 Пока инженеры в белых халатах прикручивают красивый двигатель к идеальному крылу, бригада взлохмаченных придурков во главе с безумным авантюристом пролетает над ними на конструкции из микроавтобуса, забора и двух промышленных фенов, навстречу второму туру инвестиций. Красивые проекты не взлетают, потому что они не успевают взлететь.
  • 3. Agenda • Problem statement: • Data Ingestion • Data Normalisation • Data Access • Our way to solve the problem :)
  • 4. About us • Video intelligence for the cross-platform world • 30+ video platforms including YouTube, Facebook, Instagram • 7M creators • 3B videos • 2Tb of newly ingested data a day • 150Tb of data in the warehouse
  • 5. What is a data warehouse? A central repository of data collected from disparate sources. ANALYST ENGINEER SERVICE DATA WAREHOUSE
  • 6. Key features Ingestion Store raw data extracted from disparate data sources Normalisation Cleanup / combine raw data Access Help user to retrieve data
  • 7. What problems does it solve in Tubular? • For engineers / analysts: • data discovery • prototyping / analyse • For services: • data exchange
  • 9. Data Ingestion Problems • Real time data: • tweets, comments, shares, views • Periodical snapshots: • dump of real time data • results of the data analysis • databases from internal services (in some cases)
  • 10. Real time data DATABUS / event log / message queue Powered by KAFKA Data serialised with AVRO Keeps all events for the last N days SERVICE #1 SERVICE #2 PERMANENT STORAGE ...
  • 11. Why did we choose Kafka? • Stores streams of records in a fault-tolerant way • Designed to serve multiple consumers per topic • Allows to keep the last N days of records • Tested in very big companies Linkedin, Twitter, Uber, Airbnb...
  • 12. • Strict schema definition • Safe schema evolution • Compact (binary serialisation format) • Cross-technology format (Java, Python, …) • Has some ecosystem around (Schema Registry, CLI consumers, …) • Hadoop-friendly Why did we choose Avro?
  • 13. Periodical Snapshots DATABUS SERVICE #1 Powered by ELASTIC PERMANENT STORAGE SERVICE #2 Powered by CASSANDRA SERVICE #3 Powered by MYSQL ... DATA IMPORT TOOL Powered by S3 Data serialised with PARQUET Powered by SPARK
  • 14. Why did we choose S3? • There is no need to support it • Compatible with Hadoop ecosystem • Relatively stable & cheap
  • 15. Why did we choose Parquet? • Column-oriented format (perfect for analytics and partial reads) • Supports complex data structures • Compatible with Hadoop ecosystem
  • 16. Why did we choose Spark? • Scalable data processing engine • Faster than Hadoop • Has connectors to all popular storages: JDBC, Elastic, Cassandra, Kafka • Has Python bindings • Built-in support of Parquet
  • 18. Data Normalisation Problems • Cleanup duplicates • Partition by year / month / date / hour • Join various data sources
  • 19. Normalisation of real time data (example) SERVICE #1 Powered by ELASTIC DATABUS UI PERMANENT STORAGE The service joins multiple data streams by sending partial updates to Elastic. Note! It isn’t the only way to implement a real time join, more generic solution could be implemented with Apache Samza.
  • 20. Why did we choose Elastic? • Provides real time search and analytics • Has relatively cheap partial updates • Easy to scale
  • 21. Normalisation of previously imported data DATA NORMALISATION TOOL PERMANENT STORAGE Powered by Spark Joins various datasets Removes duplicates Creates partitions by time range buckets
  • 22. Why did we choose Spark? • Scalable data processing engine • Has built-in SQL api to transform data (perfect for joins and deduplication)
  • 24. Data Access Problems • Datasets discovery • Unified data access interface
  • 25. Metadata Storage PERMANENT STORAGE Parquet # 1 Avro #1 CSV #1 Parquet # 2 ... METADATA STORAGE Table # 1 Table # 2 Table # 3 ... Powered by Hive Metastore
  • 26. Why did we choose Hive Metastore? • Supported by Hadoop ecosystem • Simple (Thrift api on top of MySQL table) • Supported by Hue (UI to access metadata)
  • 29. Thanks! Questions? Check this out: https://github.com/Tubular/sparkly