WHAT YOU NEED TO KNOW,
BEFORE MIGRATING DATA
PLATFORM TO GCP
by
SERHII KHOLODNIUK
Serhii Kholodniuk
Senior Big Data
Engineer
Sigma Software
Ukraine
Kyiv office
My interests and goals:
• interested in designing and developing data platforms for the needs of
business intelligence and machine learning.
• constantly looking for opportunities to simplify and optimize solutions, their
implementation and maintenance.
• client value oriented.
Mastering GCP:
• currently building a data platform in GCP
• migrating data pipelines into GCP infrastructure
• optimizing data warehouse structure
AGENDA
Why GCP becomes popular
Migration phases
Pipelines migration
Schema and data migration
Data storages
WHY GCP BECOMES POPULAR?
Cloud Infrastructure
Network
Cloud sustainability
Data cloud
Security out of the box (encrypts data at rest and in transit)
Powerful BigQuery features with ergonomic design
Provide cloud infrastructure for all data needs
Customized solutions for different industries
Provide best practices industry solutions
Artificial intelligence solutions
Prebuilt ML model APIs
Custom Model Building with SQL in BigQuery ML
Custom Model Building with Cloud AutoML
CLOUD INFRASTRUCTURE
Network
29 regions
88 availability zones
146 edge locations
Cloud sustainability
100% renewable energy for all cloud regions
81% waste diverted from landfills
2x more efficient than a typical enterprise data center
DATA CLOUD
Security out of the box (encrypts data at rest and in transit)
Provide cloud infrastructure for all data needs
Powerful BigQuery features with ergonomic design
CUSTOMIZED SOLUTIONS FOR DIFFERENT INDUSTRIES
Provide best practices industry solutions
Industry solutions
Retail
Consumer packaged goods
Manufacturing
Automotive
Supply chain and logistics
Energy
Healthcare and life sciences
Media and entertainment
Gaming
Telecommunications
Financial services
Financial services
Capital markets
Government and public sector
Government
State and local government
Federal government
Education
AI SOLUTIONS
MIGRATION PHASES
1. Pre-migration phase
• complete an inventory of the workloads and assets to be migrated
• calculate Total Cost of Ownership and future
business value
• build a use case backlog
• select use cases for iteration
2. Migration phase
• schema migration
• pipelines migration
• data migration
3. Post-migration phase
• cost and performance optimization
• schema denormalization for BigQuery
• using nested and repeated schema fields where they replace joins
• clustering and partitioning
• slots reservation for BigQuery
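As a rough illustration of the clustering and partitioning item above, the sketch below assembles a BigQuery CREATE TABLE statement with PARTITION BY and CLUSTER BY clauses. The project, dataset, table and column names are hypothetical, and the function only builds SQL text; it does not connect to BigQuery.

```python
from typing import Sequence, Tuple

MAX_CLUSTER_COLUMNS = 4  # BigQuery allows at most four clustering columns


def partitioned_table_ddl(
    table: str,
    columns: Sequence[Tuple[str, str]],
    partition_column: str,
    cluster_columns: Sequence[str] = (),
) -> str:
    """Build CREATE TABLE text for a date-partitioned, clustered table."""
    types = dict(columns)
    if partition_column not in types:
        raise ValueError(f"unknown partition column: {partition_column}")
    if len(cluster_columns) > MAX_CLUSTER_COLUMNS:
        raise ValueError("too many clustering columns")

    partition_type = types[partition_column].upper()
    if partition_type == "DATE":
        partition_expr = partition_column
    elif partition_type == "TIMESTAMP":
        # Daily partitions derived from a TIMESTAMP column.
        partition_expr = f"DATE({partition_column})"
    else:
        raise ValueError("this sketch handles DATE and TIMESTAMP columns only")

    column_sql = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    ddl = (
        f"CREATE TABLE `{table}` (\n  {column_sql}\n)\n"
        f"PARTITION BY {partition_expr}"
    )
    if cluster_columns:
        ddl += "\nCLUSTER BY " + ", ".join(cluster_columns)
    return ddl


if __name__ == "__main__":
    # Hypothetical table used only for illustration.
    print(
        partitioned_table_ddl(
            "my_project.sales.orders",
            [("order_id", "STRING"), ("created_at", "TIMESTAMP"), ("customer_id", "STRING")],
            partition_column="created_at",
            cluster_columns=["customer_id"],
        )
    )
```

Queries that filter on the partition column scan only the matching partitions, which is where most of the cost saving in the post-migration phase tends to come from.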
ITERATIVE APPROACH IN AGILE WAY
Prioritize use case backlog -> Select use cases for iteration -> Execution -> Release
1. Setup and data governance
2. Migrate schema and data
3. Translate queries
4. Migrate services and apps
5. Migrate data pipelines
6. Optimise performance
7. Verify and validate
Next iteration
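The "Verify and validate" step could, for example, compare a row count and an order-independent checksum of the source and the migrated table. This is a minimal sketch, assuming both sides can be read as iterables of tuples with identical Python types; in practice the rows would come from queries against the legacy system and the target in GCP, and the function names are hypothetical.

```python
import hashlib
from typing import Iterable, Tuple

_MODULUS = 2 ** 256  # keeps the running sum the same size as one SHA-256 digest


def table_fingerprint(rows: Iterable[tuple]) -> Tuple[int, str]:
    """Return (row_count, checksum). The checksum does not depend on row order."""
    count = 0
    combined = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
        # Adding per-row hashes is commutative, so ordering differences between
        # the two systems do not matter, while duplicated rows still change the sum.
        combined = (combined + int.from_bytes(digest, "big")) % _MODULUS
        count += 1
    return count, f"{combined:064x}"


def tables_match(source_rows: Iterable[tuple], target_rows: Iterable[tuple]) -> bool:
    """True when both tables have the same number of rows and the same checksum."""
    return table_fingerprint(source_rows) == table_fingerprint(target_rows)


if __name__ == "__main__":
    source = [(1, "alice"), (2, "bob")]
    target = [(2, "bob"), (1, "alice")]  # same rows, different order
    print(tables_match(source, target))
```

Because rows are hashed through `repr`, values must be normalised to the same types on both sides (for example `1` and `1.0` hash differently).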
PIPELINES MIGRATION
Cloud Composer
Cloud Dataflow
Cloud Dataproc
Cloud Compute Engine
WHAT TO CHOOSE?
DATAFLOW vs DATAPROC
Cloud Dataproc
• Recommended for: existing Hadoop/Spark applications, machine learning/data science ecosystem, large-batch jobs, preemptible VMs
• Fully managed: No
• Managed by: DevOps
• Auto-scaling: Yes, based on cluster utilization (reactive)
• Expertise: Hadoop, Hive, Pig, Apache Big Data ecosystem, Spark, Flink, Presto, Druid
Cloud Dataflow
• Recommended for: new data processing pipelines, unified batch and streaming
• Fully managed: Yes
• Managed by: Serverless
• Auto-scaling: Yes, transform-by-transform (adaptive)
• Expertise: Apache Beam
DATAFLOW vs SPARK SERVERLESS
Spark Serverless
• Recommended for: new batch data processing pipelines, existing Spark applications (from Spark 3.2), machine learning/data science ecosystem, large-batch jobs
• Fully managed: Yes
• Managed by: Serverless
• Auto-scaling: Yes, transform-by-transform (adaptive)
• Expertise: PySpark, Spark SQL, SparkR, Spark Java/Scala
Cloud Dataflow
• Recommended for: new data processing pipelines, unified batch and streaming
• Fully managed: Yes
• Managed by: Serverless
• Auto-scaling: Yes, transform-by-transform (adaptive)
• Expertise: Apache Beam
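The two comparisons can be condensed into a small decision helper. This is only a sketch of one possible reading of the slides: the `Workload` fields are hypothetical, and sending existing Spark streaming jobs to Cloud Dataproc rather than Spark Serverless is an assumption, not something the comparison states.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a pipeline that is about to be migrated."""

    uses_hadoop_ecosystem: bool = False  # Hive, Pig, Flink, Presto, Druid, ...
    spark_version: Optional[Tuple[int, int]] = None  # None means no existing Spark code
    needs_streaming: bool = False


def recommend_engine(workload: Workload) -> str:
    """Pick a processing service following the comparison above."""
    if workload.uses_hadoop_ecosystem:
        # Existing Hadoop-ecosystem jobs run on clusters the team manages.
        return "Cloud Dataproc"
    if workload.spark_version is not None:
        # Spark Serverless is listed for Spark 3.2 and later. Keeping streaming
        # Spark jobs on Dataproc is an assumption made for this sketch.
        if workload.spark_version >= (3, 2) and not workload.needs_streaming:
            return "Spark Serverless"
        return "Cloud Dataproc"
    # New pipelines, unified batch and streaming, written with Apache Beam.
    return "Cloud Dataflow"


if __name__ == "__main__":
    print(recommend_engine(Workload(spark_version=(3, 3))))
    print(recommend_engine(Workload(needs_streaming=True)))
```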
SCHEMA AND DATA MIGRATION
Database Migration Service – helps migrate MySQL and PostgreSQL to Cloud SQL
BigQuery Data Transfer Service
Google recommends loading large data volumes with Storage Transfer Service, preferably in Avro, Parquet or ORC format rather than CSV or JSON
Migration strategies for Oracle workloads: rehost (on Bare Metal Solution), replatform, rewrite
HBase to Bigtable migration path: HDFS -> Cloud Storage -> Storage Transfer Service -> Bigtable
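To illustrate the format recommendation, the sketch below builds the argument list for a `bq load` call from files already staged in Cloud Storage. The bucket, dataset and table names are hypothetical, and the command is only assembled, not executed. Avro, Parquet and ORC files carry their own schema, so in this sketch only CSV and JSON get `--autodetect`.

```python
import os
from typing import List

# File extension -> value passed to `bq load --source_format`.
SOURCE_FORMATS = {
    ".avro": "AVRO",
    ".parquet": "PARQUET",
    ".orc": "ORC",
    ".csv": "CSV",
    ".json": "NEWLINE_DELIMITED_JSON",
}
SELF_DESCRIBING = {"AVRO", "PARQUET", "ORC"}  # schema is embedded in the files


def bq_load_command(table: str, gcs_uri: str) -> List[str]:
    """Return the `bq load` argument list for one Cloud Storage URI."""
    extension = os.path.splitext(gcs_uri)[1].lower()
    if extension not in SOURCE_FORMATS:
        raise ValueError(f"unsupported file type: {extension or 'none'}")
    source_format = SOURCE_FORMATS[extension]

    command = ["bq", "load", f"--source_format={source_format}"]
    if source_format not in SELF_DESCRIBING:
        # CSV and JSON have no embedded schema, so ask BigQuery to infer one.
        command.append("--autodetect")
    return command + [table, gcs_uri]


if __name__ == "__main__":
    # Hypothetical dataset, table and bucket.
    print(" ".join(bq_load_command("warehouse.orders", "gs://example-bucket/export/orders-*.parquet")))
```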
DATA STORES FOR DIFFERENT USE CASES
Data
• Unstructured: Cloud Storage
• Structured
  • Transactional workloads
    • NoSQL: Firestore
    • SQL
      • One database is enough: Cloud SQL
      • Horizontal scalability: Cloud Spanner
  • Data analytics workloads
    • Millisecond latency: Cloud Bigtable
    • Latency in seconds: BigQuery
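The decision tree on this slide can be written down as a small function. This is a sketch only: the parameter names are hypothetical, and real choices usually also weigh cost, data volume and consistency requirements that the slide does not cover.

```python
def choose_data_store(
    structured: bool,
    workload: str = "transactional",  # "transactional" or "analytics"
    sql: bool = True,
    needs_horizontal_scalability: bool = False,
    needs_millisecond_latency: bool = False,
) -> str:
    """Walk the decision tree from the slide and return a GCP storage product."""
    if not structured:
        return "Cloud Storage"

    if workload == "analytics":
        # Analytics branch: the latency requirement decides.
        return "Cloud Bigtable" if needs_millisecond_latency else "BigQuery"

    if workload == "transactional":
        if not sql:
            return "Firestore"
        # SQL branch: a single database versus horizontal scalability.
        return "Cloud Spanner" if needs_horizontal_scalability else "Cloud SQL"

    raise ValueError(f"unknown workload type: {workload}")


if __name__ == "__main__":
    print(choose_data_store(structured=True, workload="analytics"))
    print(choose_data_store(structured=True, sql=True, needs_horizontal_scalability=True))
```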
THANK YOU!
