SlideShare a Scribd company logo
Developing Enterprise
Consciousness: Building
Modern Open Data Platforms
Rahul Singh, CEO
Rahul Singh
■ Built hosting companies and data centers in high-school.
(Servers, Switches, DNS, etc.)
■ Built and managed CMS/KMS, Portals, SaaS apps for clients
(.NET, Java, SQL Server, MySQL).
■ Dove deep into big data to get better at enterprise search for
massive knowledge / content systems. (Spark, Scala,
Cassandra, Solr, Elastic)
■ Focus now on global scale real-time data platforms for large
organizations or organizations with a large audience.
■ Published Cassandra.Link, Cassandra.Tools, more on the way.
■ Playbook
■ Design
■ Framework
■ Approach
Presentation Agenda
■ Use Cases
■ Migration
■ Standard Data Fabric
■ Cloud vs. Open Core
Business / Platform Dream
Enterprise Consciousness :
- People
- Processes,
- Information
- Systems
Connected / Synchronized.
Business has been chasing this
dream for a while. As technologies
improve, this becomes more
accessible.
Image Source: Digital Business
Technology Platforms, Gartner 2016
Going Beyond “Reactive Manifesto” / 12 Factor
References: https:/
/12factor.net/,
https:/
/www.reactivemanifesto.org/
- Current Business Information is available to People in the swiftest way
possible within the bounds of reasonable costs.
- Business Information is generally available to the enterprise, siloed only by
security and governance.
- Data platforms make use of appropriate resources for hot vs. cold, raw vs.
enhanced data.
- Data platforms are always available, redundant, always trying to achieve a
RPO/RTO of zero.
Challenges of Managing
Data Platforms in a
Growing Enterprise
Phases of Business Modularity
Business
Silos
Standardized
Platform
Optimized
Core
Business
Modularity
Optimized Core enabled Business Modularity
This process needs to
be done in sequence.
Otherwise we end up
having to redo the
work.
Generic Data Platform Operations
How Distributed Data Helps Transformation
XDCR: Cross datacenter
replication is the ultimate
data fabric.
Resilience, performance,
availability, and scale.
Made widely available by
Cassandra and Couchbase,
expanded and accelerated
by ScyllaDB
Modern Open Data
Platform
Design
Contexts
Responsibilities
Approach
Framework
Tools
So Many Different “Modern Stacks?”
Lots of “reference” architectures
available. They tend not to think about the
speed layer since they are focusing on
batch. What about SPEED?
How Do You Choose From the Landscape?
Lots and lots of components in the Data &
AI Landscape. Which ones are the right
ones for your business?
Playbook for Modern Open Data Platform
Platform Design
Discovery (Inventory)
- People
- Process
- Information (Objects)
- Systems (Apps)
Evaluate Framework
User Experience
- No-Code/Low Code Apps/Form Builders
- Automatic API Generator/Platform
- Customer App/API Framework
Cloud
- Public
- Private
- Hybrid
Data
- Data:Object
- Data:Stream
- Data:Table
- Data:Index
- Processor:Batch
- Processor:Stream
DevOps
- Infrastructure as Code
- Systems Automation
- Application CICD
DataOps
- ETL/ELT/EtLT
- Reverse ETL
- Orchestration
Execute Approach
Architecture (Design)
- Cloud
- Data
- DevOps
- DataOps
Engineering
- Configuration
- Scripting
- Programming
Operation
- Setup / Deploy
- Monitoring/Alerts
- Administration
Framework
Design
Distributed
Realtime
Extendable / Open
Automated
Monitored / Managed
Public Cloud Native - Amazon
100% Serverless Data Platform Architecture on Amazon AWS
Public Cloud Native - Microsoft
100% Azure + Azure Databricks Data
Platform Architecture on Microsoft
Azure Cloud
Public Cloud Native - Google
100% GCP / Google
Data Platform
Architecture on GCP
Use Case: Optimizing
Distributed Data with
Cloud vs. Open Core
Open Core Distributed Data Platform
To create globally distributed and real time platforms, we need to use
distributed realtime technologies to build your platform. Here are some.
Which ones should you choose?
Open Core Data Modernization / Automation
/Integration
In addition to vastly scalable tools, there are also modern
innovations that can help teams automate and maximize
human capital by making data platform management easier.
Framework Components
■ Major Components
■ Persistent Queues (RAM/BUS)
■ Queue Processing & Compute (CPU)
■ Persistent Storage (DISK/RAM)
■ Reporting Engine (Display)
■ Orchestration Framework (Motherboard)
■ Scheduler (Operating System)
■ Strategies
■ Cloud Native on Google
■ Self-Managed Open Source
■ Self-Managed Commercial Source
■ Managed Commercial Source
Customers want options, so we decided to create a
Framework that can scale with whatever Infrastructure
and Software strategy they want to use.
24
Framework
Approach
Approach
Setup
Training
Administration
Configuration
Knowledge
Approach
Sample STACK Outline
Framework
Platform
Components
Resources
Platform
Setup
Training
Administration
Configuration
Knowledge
● Components
○ Infrastructure
■ Source / Git
■ Github
■ Gitlab
■ Cloud / Public
■ AWS
■ Azure
■ GCP
■ DO
■ Orchestration
■ Terraform
■ Terraform / Atlanits
■ Configuration
■ Ansible
■ Ansible / AWX / Semaphore
○ Compute
■ Datastax / Spark
■ Datastax / Livy
■ Databricks
○ Data / Open Core
■ Datastax Enterprise
■ Cassandra
■ Search / Solr
■ Graph
■ Confluent Platform
○ Data / Cloud
■ Datastax / Astra
■ Confluent Cloud
○ Data / Open Source
■ Cassandra
■ Kafka
■ Elassandra
■ YugaByteDB
■ ScyllaDB
■ Pulsar
○ Application
■ Airflow
■ Airbyte
■ Kafka Streams
■ Jupyter
■ Redash
■ Metabase
■ Superset
■ Zeppelin
Use Case: Standard
Data Fabric
How ScyllaDB Allows Us To Go Further…
All the benefits of XDCR and ….
- More Data Density at High Speed /
Multiple Workloads on the Same
Datacenter
- Better Memory / CPU management
due to C++ Seastar Framework,
Faster Caches
- CQL Queries to support Non
Relational / C* CQL like queries.
- DynamoDB Queries to support
legacy Dynamo
- Transactions/Consistency
- …
Let’s Get Data Into ScyllaDB - Easier
Today
Open Source:
- Airbyte / RudderStack makes
ETL Easier and are open source
- Kafka Connect / Pulsar IO can
convert ETL into Streaming ETL
SaaS/PaaS:
- SaaS like Stitch/HevoData/Make
- Supported versions of Airbyte/RudderStack
Once It’s There, Serve it, Do More
Processing
Open Source:
- Flink / Spark / Kafka Streams
can be used to save Analytics /
ML processed data.
- Accelerator can help serve data
as DynamoDB via REST.
- Several GraphQL Solutions
Available
Let’s Send It Back via Reverse ETL!
Open Source:
- Grouparoo / Airbyte ,
RudderStack are free. Others
are paid.
- You can always use Kafka
Connect / Pulsar IO to send
data back also.
Reverse ETL is the process of copying data from a warehouse into business applications like CRM,
analytics, and marketing automation software. You perform this process by using a reverse ETL tool
that integrates with your data source and your business SaaS tools.
- Segment Blog
Let’s Put It All Together Now - ONE DATA
FABRIC
Still need design, but
hopefully less
useless plumbing
code.
One cluster, many workloads.
With any other pure relational
database, this would be
problematic. With ScyllaDB, this
is a core feature.
Key Takeaways for Open Data Platforms
Don’t reinvent the wheel.
Identify the Objectives
Prioritize DevOps / DataOps
Use open tools that are well
supported
Document the STACK
- Identify the objectives so that you
know what success looks like.
- DevOps / DataOps combined with a
true agile approach allows you to
iterate your platform quickly.
- Put the data into ScyllaDB, and
possibly archive it into
Parquet/Iceberg (historical data)
- Get the data out to your Systems using
“Reverse ETL” tools.
Thank You and Dream Big
Check us out
- Design Workshops
- Innovation Sprints
- Service Catalog
- Big Data DLM Toolkit
Anant.us
- Read our Playbook
- Join our Mailing List
- Read up on Data Platforms
- Watch our Videos
- Download Examples
Weekly Webinars
- Data Engineer’s Lunch
- Cassandra Lunch
Thank You
Stay in Touch
Rahul Singh
rahul.singh@anant.us
@xingh
xingh
https://www.linkedin.com/in/xingh
Ad

More Related Content

Similar to Developing Enterprise Consciousness: Building Modern Open Data Platforms (20)

Big Data Analytics PPT - S1 working .pptx
Big Data Analytics PPT - S1 working .pptxBig Data Analytics PPT - S1 working .pptx
Big Data Analytics PPT - S1 working .pptx
VivekChaurasia43
 
Bigdata.sunil_6+yearsExp
Bigdata.sunil_6+yearsExpBigdata.sunil_6+yearsExp
Bigdata.sunil_6+yearsExp
bigdata sunil
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Anant Corporation
 
Journey to SAS Analytics Grid with SAS, R, Python
Journey to SAS Analytics Grid with SAS, R, PythonJourney to SAS Analytics Grid with SAS, R, Python
Journey to SAS Analytics Grid with SAS, R, Python
Sumit Sarkar
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
Hortonworks
 
Above the cloud joarder kamal
Above the cloud   joarder kamalAbove the cloud   joarder kamal
Above the cloud joarder kamal
Joarder Kamal
 
Big Data on Azure Tutorial
Big Data on Azure TutorialBig Data on Azure Tutorial
Big Data on Azure Tutorial
rustd
 
Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...
Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...
Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...
Certus Solutions
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
DataStax
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptx
ElsonPaul2
 
Accelerate Big Data Application Development with Cascading
Accelerate Big Data Application Development with CascadingAccelerate Big Data Application Development with Cascading
Accelerate Big Data Application Development with Cascading
Cascading
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
James Serra
 
Data Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptxData Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptx
ArunPandiyan890855
 
Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Developing and deploying AI solutions on the cloud using Team Data Science Pr...Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Debraj GuhaThakurta
 
2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration
Jeffrey T. Pollock
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
 
Microsoft R Server for Data Sciencea
Microsoft R Server for Data ScienceaMicrosoft R Server for Data Sciencea
Microsoft R Server for Data Sciencea
Data Science Thailand
 
Big Data Simplified - Is all about Ab'strakSHeN
Big Data Simplified - Is all about Ab'strakSHeNBig Data Simplified - Is all about Ab'strakSHeN
Big Data Simplified - Is all about Ab'strakSHeN
DataWorks Summit
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
Trivadis
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
James Serra
 
Big Data Analytics PPT - S1 working .pptx
Big Data Analytics PPT - S1 working .pptxBig Data Analytics PPT - S1 working .pptx
Big Data Analytics PPT - S1 working .pptx
VivekChaurasia43
 
Bigdata.sunil_6+yearsExp
Bigdata.sunil_6+yearsExpBigdata.sunil_6+yearsExp
Bigdata.sunil_6+yearsExp
bigdata sunil
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Anant Corporation
 
Journey to SAS Analytics Grid with SAS, R, Python
Journey to SAS Analytics Grid with SAS, R, PythonJourney to SAS Analytics Grid with SAS, R, Python
Journey to SAS Analytics Grid with SAS, R, Python
Sumit Sarkar
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
Hortonworks
 
Above the cloud joarder kamal
Above the cloud   joarder kamalAbove the cloud   joarder kamal
Above the cloud joarder kamal
Joarder Kamal
 
Big Data on Azure Tutorial
Big Data on Azure TutorialBig Data on Azure Tutorial
Big Data on Azure Tutorial
rustd
 
Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...
Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...
Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...
Certus Solutions
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
DataStax
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptx
ElsonPaul2
 
Accelerate Big Data Application Development with Cascading
Accelerate Big Data Application Development with CascadingAccelerate Big Data Application Development with Cascading
Accelerate Big Data Application Development with Cascading
Cascading
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
James Serra
 
Data Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptxData Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptx
ArunPandiyan890855
 
Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Developing and deploying AI solutions on the cloud using Team Data Science Pr...Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Debraj GuhaThakurta
 
2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration
Jeffrey T. Pollock
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
 
Microsoft R Server for Data Sciencea
Microsoft R Server for Data ScienceaMicrosoft R Server for Data Sciencea
Microsoft R Server for Data Sciencea
Data Science Thailand
 
Big Data Simplified - Is all about Ab'strakSHeN
Big Data Simplified - Is all about Ab'strakSHeNBig Data Simplified - Is all about Ab'strakSHeN
Big Data Simplified - Is all about Ab'strakSHeN
DataWorks Summit
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
Trivadis
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
James Serra
 

More from ScyllaDB (20)

Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Powering a Billion Dreams: Scaling Meesho’s E-commerce Revolution with Scylla...
Powering a Billion Dreams: Scaling Meesho’s E-commerce Revolution with Scylla...Powering a Billion Dreams: Scaling Meesho’s E-commerce Revolution with Scylla...
Powering a Billion Dreams: Scaling Meesho’s E-commerce Revolution with Scylla...
ScyllaDB
 
Leading a High-Stakes Database Migration
Leading a High-Stakes Database MigrationLeading a High-Stakes Database Migration
Leading a High-Stakes Database Migration
ScyllaDB
 
Achieving Extreme Scale with ScyllaDB: Tips & Tradeoffs
Achieving Extreme Scale with ScyllaDB: Tips & TradeoffsAchieving Extreme Scale with ScyllaDB: Tips & Tradeoffs
Achieving Extreme Scale with ScyllaDB: Tips & Tradeoffs
ScyllaDB
 
Securely Serving Millions of Boot Artifacts a Day by João Pedro Lima & Matt ...
Securely Serving Millions of Boot Artifacts a Day by João Pedro Lima & Matt ...Securely Serving Millions of Boot Artifacts a Day by João Pedro Lima & Matt ...
Securely Serving Millions of Boot Artifacts a Day by João Pedro Lima & Matt ...
ScyllaDB
 
How Agoda Scaled 50x Throughput with ScyllaDB by Worakarn Isaratham
How Agoda Scaled 50x Throughput with ScyllaDB by Worakarn IsarathamHow Agoda Scaled 50x Throughput with ScyllaDB by Worakarn Isaratham
How Agoda Scaled 50x Throughput with ScyllaDB by Worakarn Isaratham
ScyllaDB
 
How Yieldmo Cut Database Costs and Cloud Dependencies Fast by Todd Coleman
How Yieldmo Cut Database Costs and Cloud Dependencies Fast by Todd ColemanHow Yieldmo Cut Database Costs and Cloud Dependencies Fast by Todd Coleman
How Yieldmo Cut Database Costs and Cloud Dependencies Fast by Todd Coleman
ScyllaDB
 
ScyllaDB: 10 Years and Beyond by Dor Laor
ScyllaDB: 10 Years and Beyond by Dor LaorScyllaDB: 10 Years and Beyond by Dor Laor
ScyllaDB: 10 Years and Beyond by Dor Laor
ScyllaDB
 
Reduce Your Cloud Spend with ScyllaDB by Tzach Livyatan
Reduce Your Cloud Spend with ScyllaDB by Tzach LivyatanReduce Your Cloud Spend with ScyllaDB by Tzach Livyatan
Reduce Your Cloud Spend with ScyllaDB by Tzach Livyatan
ScyllaDB
 
Migrating 50TB Data From a Home-Grown Database to ScyllaDB, Fast by Terence Liu
Migrating 50TB Data From a Home-Grown Database to ScyllaDB, Fast by Terence LiuMigrating 50TB Data From a Home-Grown Database to ScyllaDB, Fast by Terence Liu
Migrating 50TB Data From a Home-Grown Database to ScyllaDB, Fast by Terence Liu
ScyllaDB
 
Vector Search with ScyllaDB by Szymon Wasik
Vector Search with ScyllaDB by Szymon WasikVector Search with ScyllaDB by Szymon Wasik
Vector Search with ScyllaDB by Szymon Wasik
ScyllaDB
 
Workload Prioritization: How to Balance Multiple Workloads in a Cluster by Fe...
Workload Prioritization: How to Balance Multiple Workloads in a Cluster by Fe...Workload Prioritization: How to Balance Multiple Workloads in a Cluster by Fe...
Workload Prioritization: How to Balance Multiple Workloads in a Cluster by Fe...
ScyllaDB
 
Two Leading Approaches to Data Virtualization, and Which Scales Better? by Da...
Two Leading Approaches to Data Virtualization, and Which Scales Better? by Da...Two Leading Approaches to Data Virtualization, and Which Scales Better? by Da...
Two Leading Approaches to Data Virtualization, and Which Scales Better? by Da...
ScyllaDB
 
Scaling a Beast: Lessons from 400x Growth in a High-Stakes Financial System b...
Scaling a Beast: Lessons from 400x Growth in a High-Stakes Financial System b...Scaling a Beast: Lessons from 400x Growth in a High-Stakes Financial System b...
Scaling a Beast: Lessons from 400x Growth in a High-Stakes Financial System b...
ScyllaDB
 
Object Storage in ScyllaDB by Ran Regev, ScyllaDB
Object Storage in ScyllaDB by Ran Regev, ScyllaDBObject Storage in ScyllaDB by Ran Regev, ScyllaDB
Object Storage in ScyllaDB by Ran Regev, ScyllaDB
ScyllaDB
 
Lessons Learned from Building a Serverless Notifications System by Srushith R...
Lessons Learned from Building a Serverless Notifications System by Srushith R...Lessons Learned from Building a Serverless Notifications System by Srushith R...
Lessons Learned from Building a Serverless Notifications System by Srushith R...
ScyllaDB
 
A Dist Sys Programmer's Journey into AI by Piotr Sarna
A Dist Sys Programmer's Journey into AI by Piotr SarnaA Dist Sys Programmer's Journey into AI by Piotr Sarna
A Dist Sys Programmer's Journey into AI by Piotr Sarna
ScyllaDB
 
High Availability: Lessons Learned by Paul Preuveneers
High Availability: Lessons Learned by Paul PreuveneersHigh Availability: Lessons Learned by Paul Preuveneers
High Availability: Lessons Learned by Paul Preuveneers
ScyllaDB
 
How Natura Uses ScyllaDB and ScyllaDB Connector to Create a Real-time Data Pi...
How Natura Uses ScyllaDB and ScyllaDB Connector to Create a Real-time Data Pi...How Natura Uses ScyllaDB and ScyllaDB Connector to Create a Real-time Data Pi...
How Natura Uses ScyllaDB and ScyllaDB Connector to Create a Real-time Data Pi...
ScyllaDB
 
Persistence Pipelines in a Processing Graph: Mutable Big Data at Salesforce b...
Persistence Pipelines in a Processing Graph: Mutable Big Data at Salesforce b...Persistence Pipelines in a Processing Graph: Mutable Big Data at Salesforce b...
Persistence Pipelines in a Processing Graph: Mutable Big Data at Salesforce b...
ScyllaDB
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Powering a Billion Dreams: Scaling Meesho’s E-commerce Revolution with Scylla...
Powering a Billion Dreams: Scaling Meesho’s E-commerce Revolution with Scylla...Powering a Billion Dreams: Scaling Meesho’s E-commerce Revolution with Scylla...
Powering a Billion Dreams: Scaling Meesho’s E-commerce Revolution with Scylla...
ScyllaDB
 
Leading a High-Stakes Database Migration
Leading a High-Stakes Database MigrationLeading a High-Stakes Database Migration
Leading a High-Stakes Database Migration
ScyllaDB
 
Achieving Extreme Scale with ScyllaDB: Tips & Tradeoffs
Achieving Extreme Scale with ScyllaDB: Tips & TradeoffsAchieving Extreme Scale with ScyllaDB: Tips & Tradeoffs
Achieving Extreme Scale with ScyllaDB: Tips & Tradeoffs
ScyllaDB
 
Securely Serving Millions of Boot Artifacts a Day by João Pedro Lima & Matt ...
Securely Serving Millions of Boot Artifacts a Day by João Pedro Lima & Matt ...Securely Serving Millions of Boot Artifacts a Day by João Pedro Lima & Matt ...
Securely Serving Millions of Boot Artifacts a Day by João Pedro Lima & Matt ...
ScyllaDB
 
How Agoda Scaled 50x Throughput with ScyllaDB by Worakarn Isaratham
How Agoda Scaled 50x Throughput with ScyllaDB by Worakarn IsarathamHow Agoda Scaled 50x Throughput with ScyllaDB by Worakarn Isaratham
How Agoda Scaled 50x Throughput with ScyllaDB by Worakarn Isaratham
ScyllaDB
 
How Yieldmo Cut Database Costs and Cloud Dependencies Fast by Todd Coleman
How Yieldmo Cut Database Costs and Cloud Dependencies Fast by Todd ColemanHow Yieldmo Cut Database Costs and Cloud Dependencies Fast by Todd Coleman
How Yieldmo Cut Database Costs and Cloud Dependencies Fast by Todd Coleman
ScyllaDB
 
ScyllaDB: 10 Years and Beyond by Dor Laor
ScyllaDB: 10 Years and Beyond by Dor LaorScyllaDB: 10 Years and Beyond by Dor Laor
ScyllaDB: 10 Years and Beyond by Dor Laor
ScyllaDB
 
Reduce Your Cloud Spend with ScyllaDB by Tzach Livyatan
Reduce Your Cloud Spend with ScyllaDB by Tzach LivyatanReduce Your Cloud Spend with ScyllaDB by Tzach Livyatan
Reduce Your Cloud Spend with ScyllaDB by Tzach Livyatan
ScyllaDB
 
Migrating 50TB Data From a Home-Grown Database to ScyllaDB, Fast by Terence Liu
Migrating 50TB Data From a Home-Grown Database to ScyllaDB, Fast by Terence LiuMigrating 50TB Data From a Home-Grown Database to ScyllaDB, Fast by Terence Liu
Migrating 50TB Data From a Home-Grown Database to ScyllaDB, Fast by Terence Liu
ScyllaDB
 
Vector Search with ScyllaDB by Szymon Wasik
Vector Search with ScyllaDB by Szymon WasikVector Search with ScyllaDB by Szymon Wasik
Vector Search with ScyllaDB by Szymon Wasik
ScyllaDB
 
Workload Prioritization: How to Balance Multiple Workloads in a Cluster by Fe...
Workload Prioritization: How to Balance Multiple Workloads in a Cluster by Fe...Workload Prioritization: How to Balance Multiple Workloads in a Cluster by Fe...
Workload Prioritization: How to Balance Multiple Workloads in a Cluster by Fe...
ScyllaDB
 
Two Leading Approaches to Data Virtualization, and Which Scales Better? by Da...
Two Leading Approaches to Data Virtualization, and Which Scales Better? by Da...Two Leading Approaches to Data Virtualization, and Which Scales Better? by Da...
Two Leading Approaches to Data Virtualization, and Which Scales Better? by Da...
ScyllaDB
 
Scaling a Beast: Lessons from 400x Growth in a High-Stakes Financial System b...
Scaling a Beast: Lessons from 400x Growth in a High-Stakes Financial System b...Scaling a Beast: Lessons from 400x Growth in a High-Stakes Financial System b...
Scaling a Beast: Lessons from 400x Growth in a High-Stakes Financial System b...
ScyllaDB
 
Object Storage in ScyllaDB by Ran Regev, ScyllaDB
Object Storage in ScyllaDB by Ran Regev, ScyllaDBObject Storage in ScyllaDB by Ran Regev, ScyllaDB
Object Storage in ScyllaDB by Ran Regev, ScyllaDB
ScyllaDB
 
Lessons Learned from Building a Serverless Notifications System by Srushith R...
Lessons Learned from Building a Serverless Notifications System by Srushith R...Lessons Learned from Building a Serverless Notifications System by Srushith R...
Lessons Learned from Building a Serverless Notifications System by Srushith R...
ScyllaDB
 
A Dist Sys Programmer's Journey into AI by Piotr Sarna
A Dist Sys Programmer's Journey into AI by Piotr SarnaA Dist Sys Programmer's Journey into AI by Piotr Sarna
A Dist Sys Programmer's Journey into AI by Piotr Sarna
ScyllaDB
 
High Availability: Lessons Learned by Paul Preuveneers
High Availability: Lessons Learned by Paul PreuveneersHigh Availability: Lessons Learned by Paul Preuveneers
High Availability: Lessons Learned by Paul Preuveneers
ScyllaDB
 
How Natura Uses ScyllaDB and ScyllaDB Connector to Create a Real-time Data Pi...
How Natura Uses ScyllaDB and ScyllaDB Connector to Create a Real-time Data Pi...How Natura Uses ScyllaDB and ScyllaDB Connector to Create a Real-time Data Pi...
How Natura Uses ScyllaDB and ScyllaDB Connector to Create a Real-time Data Pi...
ScyllaDB
 
Persistence Pipelines in a Processing Graph: Mutable Big Data at Salesforce b...
Persistence Pipelines in a Processing Graph: Mutable Big Data at Salesforce b...Persistence Pipelines in a Processing Graph: Mutable Big Data at Salesforce b...
Persistence Pipelines in a Processing Graph: Mutable Big Data at Salesforce b...
ScyllaDB
 
Ad

Recently uploaded (20)

Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Ad

Developing Enterprise Consciousness: Building Modern Open Data Platforms

  • 1. Developing Enterprise Consciousness: Building Modern Open Data Platforms Rahul Singh, CEO
  • 2. Rahul Singh ■ Built hosting companies and data centers in high-school. (Servers, Switches, DNS, etc.) ■ Built and managed CMS/KMS, Portals, SaaS apps for clients (.NET, Java, SQL Server, MySQL). ■ Dove deep into big data to get better at enterprise search for massive knowledge / content systems. (Spark, Scala, Cassandra, Solr, Elastic) ■ Focus now on global scale real-time data platforms for large organizations or organizations with a large audience. ■ Published Cassandra.Link, Cassandra.Tools, more on the way.
  • 3. ■ Playbook ■ Design ■ Framework ■ Approach Presentation Agenda ■ Use Cases ■ Migration ■ Standard Data Fabric ■ Cloud vs. Open Core
  • 4. Business / Platform Dream Enterprise Consciousness : - People - Processes, - Information - Systems Connected / Synchronized. Business has been chasing this dream for a while. As technologies improve, this becomes more accessible. Image Source: Digital Business Technology Platforms, Gartner 2016
  • 5. Going Beyond “Reactive Manifesto” / 12 Factor References: https:/ /12factor.net/, https:/ /www.reactivemanifesto.org/ - Current Business Information is available to People in the swiftest way possible within the bounds of reasonable costs. - Business Information is generally available to the enterprise, siloed only by security and governance. - Data platforms make use of appropriate resources for hot vs. cold, raw vs. enhanced data. - Data platforms are always available, redundant, always trying to achieve a RPO/RTO of zero.
  • 6. Challenges of Managing Data Platforms in a Growing Enterprise
  • 7. Phases of Business Modularity Business Silos Standardized Platform Optimized Core Business Modularity Optimized Core enabled Business Modularity This process needs to be done in sequence. Otherwise we end up having to redo the work.
  • 9. How Distributed Data Helps Transformation XDCR: Cross datacenter replication is the ultimate data fabric. Resilience, performance, availability, and scale. Made widely available by Cassandra and Couchbase, expanded and accelerated by ScyllaDB
  • 12. So Many Different “Modern Stacks?” Lots of “reference” architectures available. They tend not to think about the speed layer since they are focusing on batch. What about SPEED?
  • 13. How Do You Choose From the Landscape? Lots and lots of components in the Data & AI Landscape. Which ones are the right ones for your business?
  • 14. Playbook for Modern Open Data Platform Platform Design Discovery (Inventory) - People - Process - Information (Objects) - Systems (Apps) Evaluate Framework User Experience - No-Code/Low Code Apps/Form Builders - Automatic API Generator/Platform - Customer App/API Framework Cloud - Public - Private - Hybrid Data - Data:Object - Data:Stream - Data:Table - Data:Index - Processor:Batch - Processor:Stream DevOps - Infrastructure as Code - Systems Automation - Application CICD DataOps - ETL/ELT/EtLT - Reverse ETL - Orchestration Execute Approach Architecture (Design) - Cloud - Data - DevOps - DataOps Engineering - Configuration - Scripting - Programming Operation - Setup / Deploy - Monitoring/Alerts - Administration
  • 17. Public Cloud Native - Amazon 100% Serverless Data Platform Architecture on Amazon AWS
  • 18. Public Cloud Native - Microsoft 100% Azure + Azure Databricks Data Platform Architecture on Microsoft Azure Cloud
  • 19. Public Cloud Native - Google 100% GCP / Google Data Platform Architecture on GCP
  • 20. Use Case: Optimizing Distributed Data with Cloud vs. Open Core
  • 21. Open Core Distributed Data Platform To create globally distributed and real time platforms, we need to use distributed realtime technologies to build your platform. Here are some. Which ones should you choose?
  • 22. Open Core Data Modernization / Automation /Integration In addition to vastly scalable tools, there are also modern innovations that can help teams automate and maximize human capital by making data platform management easier.
  • 23. Framework Components ■ Major Components ■ Persistent Queues (RAM/BUS) ■ Queue Processing & Compute (CPU) ■ Persistent Storage (DISK/RAM) ■ Reporting Engine (Display) ■ Orchestration Framework (Motherboard) ■ Scheduler (Operating System) ■ Strategies ■ Cloud Native on Google ■ Self-Managed Open Source ■ Self-Managed Commercial Source ■ Managed Commercial Source Customers want options, so we decided to create a Framework that can scale with whatever Infrastructure and Software strategy they want to use.
  • 28. Sample STACK Outline Framework Platform Components Resources Platform Setup Training Administration Configuration Knowledge ● Components ○ Infrastructure ■ Source / Git ■ Github ■ Gitlab ■ Cloud / Public ■ AWS ■ Azure ■ GCP ■ DO ■ Orchestration ■ Terraform ■ Terraform / Atlanits ■ Configuration ■ Ansible ■ Ansible / AWX / Semaphore ○ Compute ■ Datastax / Spark ■ Datastax / Livy ■ Databricks ○ Data / Open Core ■ Datastax Enterprise ■ Cassandra ■ Search / Solr ■ Graph ■ Confluent Platform ○ Data / Cloud ■ Datastax / Astra ■ Confluent Cloud ○ Data / Open Source ■ Cassandra ■ Kafka ■ Elassandra ■ YugaByteDB ■ ScyllaDB ■ Pulsar ○ Application ■ Airflow ■ Airbyte ■ Kafka Streams ■ Jupyter ■ Redash ■ Metabase ■ Superset ■ Zeppelin
  • 30. How ScyllaDB Allows Us To Go Further… All the benefits of XDCR and …. - More Data Density at High Speed / Multiple Workloads on the Same Datacenter - Better Memory / CPU management due to C++ Seastar Framework, Faster Caches - CQL Queries to support Non Relational / C* CQL like queries. - DynamoDB Queries to support legacy Dynamo - Transactions/Consistency - …
  • 31. Let’s Get Data Into ScyllaDB - Easier Today Open Source: - Airbyte / RudderStack makes ETL Easier and are open source - Kafka Connect / Pulsar IO can convert ETL into Streaming ETL SaaS/PaaS: - SaaS like Stitch/HevoData/Make - Supported versions of Airbyte/RudderStack Once It’s There, Serve it, Do More Processing Open Source: - Flink / Spark / Kafka Streams can be used to save Analytics / ML processed data. - Accelerator can help serve data as DynamoDB via REST. - Several GraphQL Solutions Available Let’s Send It Back via Reverse ETL! Open Source: - Grouparoo / Airbyte , RudderStack are free. Others are paid. - You can always use Kafka Connect / Pulsar IO to send data back also. Reverse ETL is the process of copying data from a warehouse into business applications like CRM, analytics, and marketing automation software. You perform this process by using a reverse ETL tool that integrates with your data source and your business SaaS tools. - Segment Blog
  • 32. Let’s Put It All Together Now - ONE DATA FABRIC Still need design, but hopefully less useless plumbing code. One cluster, many workloads. With any other pure relational database, this would be problematic. With ScyllaDB, this is a core feature.
  • 33. Key Takeaways for Open Data Platforms Don’t reinvent the wheel. Identify the Objectives Prioritize DevOps / DataOps Use open tools that are well supported Document the STACK - Identify the objectives so that you know what success looks like. - DevOps / DataOps combined with a true agile approach allows you to iterate your platform quickly. - Put the data into ScyllaDB, and possibly archive it into Parquet/Iceberg (historical data) - Get the data out to your Systems using “Reverse ETL” tools.
  • 34. Thank You and Dream Big Check us out - Design Workshops - Innovation Sprints - Service Catalog - Big Data DLM Toolkit Anant.us - Read our Playbook - Join our Mailing List - Read up on Data Platforms - Watch our Videos - Download Examples Weekly Webinars - Data Engineer’s Lunch - Cassandra Lunch
  • 35. Thank You Stay in Touch Rahul Singh [email protected] @xingh xingh https://www.linkedin.com/in/xingh