HBase @ Salesforce
Lars Hofhansl
Architect, Father, Meditator, Aikido Blackbelt
http://hadoop-hbase.blogspot.com
Safe harbor statement under the Private Securities Litigation Reform Act of 1995:
This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties
materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results
expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be
deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other
financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any
statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new
functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our
operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any
litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our
relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our
service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to
larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is
included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent
fiscal quarter. These documents and others containing important disclosures are available on the SEC Filings section of the Investor
Information section of our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently
available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions
based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these
forward-looking statements.
Safe Harbor
Why HBase?
• SAN
• RDBMS
• Transactions
ZooKeeper?
Commodity Hardware?
HBase?
HDFS?
Unstructured Data?
A. Why HBase?
B. Interacting with the open source community
C. HBase at Salesforce
Size Matters*
New Salesforce customer:
•“How many rows do you have?”
•We will turn folks away if they have too many!
Data Storage is expensive:
•SAN storage
•Relational Database
•Too many rows → Too expensive
* In a relational world
What if in the future we:
… have cheaper storage?
… never need to ask again about the number of rows?
… grow with the data by just adding more machines?
(Disclaimer: no transactions, no joins, no secondary indexes, …)
(A quick note about) Relational Databases
• We love them. They are core to our infrastructure.
• SQL and NoSQL/NoACID are complementary.
• (Almost) everything we do is SQL based (see Phoenix – the SQL layer for HBase.)
The Search - Requirements
• Consistent
– “Eventually consistent stores are 100% consistent 99% of the time” – Ian Varley
• Scalable
– No “features” impeding horizontal scaling
• Persistent
– Duh...?
• Key lookups
• Range lookups
• Open source (ASL great, GPLv2 OK, GPLv3/AGPL not acceptable)
Enter HBase
“A Sparse, Consistent, Distributed,
Multidimensional, Persistent, Sorted Map”
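The "sorted map" definition, together with the key-lookup and range-lookup requirements, can be pictured with a small in-memory model. This is only a sketch of the data model, not the HBase client API: the class `SortedCellMap`, its methods, and the sample row keys are hypothetical names chosen for illustration, and persistence and distribution are left out entirely.

```python
import bisect
from typing import Dict, Iterator, List, Optional, Tuple


class SortedCellMap:
    """Toy model: (row key, column, timestamp) -> value, with sorted row keys."""

    def __init__(self) -> None:
        self._row_keys: List[str] = []  # kept in sorted order
        self._rows: Dict[str, Dict[str, Dict[int, bytes]]] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        if row not in self._rows:
            bisect.insort(self._row_keys, row)  # keep row keys sorted
            self._rows[row] = {}
        # Sparse: a row only stores the columns that were actually written.
        self._rows[row].setdefault(column, {})[timestamp] = value

    def get(self, row: str, column: str) -> Optional[bytes]:
        """Key lookup: newest version of one cell, or None if absent."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start: str, stop: str) -> Iterator[Tuple[str, Dict[str, bytes]]]:
        """Range lookup over row keys in [start, stop), in sorted order."""
        lo = bisect.bisect_left(self._row_keys, start)
        hi = bisect.bisect_left(self._row_keys, stop)
        for row in self._row_keys[lo:hi]:
            newest = {c: v[max(v)] for c, v in self._rows[row].items()}
            yield row, newest


table = SortedCellMap()
table.put("acct#002", "d:name", 1, b"Globex")
table.put("acct#001", "d:name", 1, b"Acme")
table.put("acct#001", "d:name", 2, b"Acme Corp")  # newer version of the same cell
table.put("user#001", "d:name", 1, b"Lars")

print(table.get("acct#001", "d:name"))
print([row for row, _ in table.scan("acct#", "acct$")])
```

Because row keys are kept sorted, a range lookup is a contiguous slice, which is why row key design matters so much for scan performance.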
Salesforce and the HBase Community
To Fork or not to Fork – that is the question
Fork - pros
• Agility. No waiting for community review. Just get stuff done
• Freedom. Patches that might not be acceptable to the community
Fork - cons
• Lose out on community work
• Patches not useful to other parties
There is no right or wrong. It’s a matter of choice, taste, and requirements.
HBase Development @ Salesforce
• No fork of HBase.
• No fork of HBase.
• Internal HBase/HDFS branch for possible emergency fixes
• All fixes are cleaned and contributed back
• We switch to the next open source point release periodically
PMC member, 2 committers, release manager, contributors
HBASE-11042 HBASE-11040 HBASE-11037 HBASE-11030 HBASE-11029 HBASE-11024 HBASE-11022
HBASE-11010 HBASE-10996 HBASE-10989 HBASE-10988 HBASE-10987 HBASE-10982 HBASE-10969 HBASE-10847
HBASE-10805 HBASE-10722 HBASE-10706 HBASE-10642 HBASE-10594 HBASE-10562 HBASE-10551
HBASE-10546 HBASE-10505 HBASE-10501 HBASE-10489 HBASE-10470 HBASE-10420 HBASE-10416
HBASE-10383 HBASE-10363 HBASE-10320 HBASE-10317 HBASE-10286 HBASE-10284 HBASE-10281
HBASE-10279 HBASE-10259 HBASE-10257 HBASE-10250 HBASE-10181 HBASE-10117 HBASE-10076
HBASE-10058 HBASE-10057 HBASE-10015 HBASE-9993 HBASE-9971 HBASE-9956 HBASE-9915
HBASE-9865 HBASE-9834 HBASE-9807 HBASE-9799 HBASE-9789 HBASE-9778 HBASE-9751 HBASE-9749
HBASE-9732 HBASE-9731 HBASE-9711 HBASE-9658 HBASE-9584 HBASE-9566 HBASE-9534 HBASE-9429
HBASE-9428 HBASE-9377 HBASE-9356 HBASE-9344 HBASE-9301 HBASE-9266 HBASE-9231 HBASE-9221
HBASE-9186 HBASE-9158 HBASE-9103 HBASE-9097 HBASE-9049 HBASE-8971 HBASE-8945 HBASE-8930
HBASE-8912 HBASE-8858 HBASE-8809 HBASE-8767 HBASE-8702 HBASE-8698 HBASE-8684 HBASE-8671
HBASE-8636 HBASE-8525 HBASE-8503 HBASE-8355 HBASE-8316 HBASE-8229 HBASE-8188 HBASE-8166
HBASE-8151 HBASE-8110 HBASE-8108 HBASE-8055 HBASE-8008 HBASE-7999 HBASE-7947 HBASE-7945
HBASE-7817 HBASE-7801 HBASE-7729 HBASE-7725 HBASE-7717 HBASE-7709 HBASE-7702 HBASE-7681
HBASE-7617 HBASE-7602 HBASE-7578 HBASE-7550 HBASE-7499 HBASE-7497 HBASE-7483 HBASE-7466
HBASE-7465 HBASE-7455 HBASE-7438 HBASE-7435 HBASE-7432 HBASE-7431 HBASE-7417 HBASE-7415
HBASE-7371 HBASE-7336 HBASE-7293 HBASE-7279 HBASE-7270 HBASE-7252 HBASE-7240 HBASE-7215
HBASE-7214 HBASE-7180 HBASE-7177 HBASE-7166 HBASE-7165 HBASE-7091 HBASE-7069 HBASE-7051
HBASE-7047 HBASE-7021 HBASE-7010 HBASE-6996 HBASE-6974
PMC member, 2 committers, release manager, contributors
HBASE-6949 HBASE-6946 HBASE-6912 HBASE-6889 HBASE-6879 HBASE-6868 HBASE-6865 HBASE-6863
HBASE-6797 HBASE-6796 HBASE-6784 HBASE-6765 HBASE-6757 HBASE-6755 HBASE-6711 HBASE-6707
HBASE-6690 HBASE-6667 HBASE-6638 HBASE-6637 HBASE-6621 HBASE-6582 HBASE-6580 HBASE-6579
HBASE-6573 HBASE-6571 HBASE-6570 HBASE-6569 HBASE-6568 HBASE-6561 HBASE-6523 HBASE-6522
HBASE-6505 HBASE-6504 HBASE-6496 HBASE-6495 HBASE-6441 HBASE-6439 HBASE-6427 HBASE-6426
HBASE-6421 HBASE-6406 HBASE-6355 HBASE-6347 HBASE-6326 HBASE-6296 HBASE-6293 HBASE-6291
HBASE-6178 HBASE-6138 HBASE-6113 HBASE-6112 HBASE-6110 HBASE-6087 HBASE-5961 HBASE-5955
HBASE-5909 HBASE-5884 HBASE-5871 HBASE-5865 HBASE-5782 HBASE-5775 HBASE-5774 HBASE-5682
HBASE-5670 HBASE-5659 HBASE-5641 HBASE-5609 HBASE-5604 HBASE-5574 HBASE-5569 HBASE-5548
HBASE-5547 HBASE-5541 HBASE-5526 HBASE-5523 HBASE-5509 HBASE-5497 HBASE-5460 HBASE-5455
HBASE-5440 HBASE-5431 HBASE-5368 HBASE-5350 HBASE-5348 HBASE-5318 HBASE-5304 HBASE-5266
HBASE-5229 HBASE-5203 HBASE-5118 HBASE-5096 HBASE-5088 HBASE-5084 HBASE-5070 HBASE-5058
HBASE-5005 HBASE-5001 HBASE-4998 HBASE-4981 HBASE-4979 HBASE-4945 HBASE-4886 HBASE-4874
HBASE-4870 HBASE-4838 HBASE-4805 HBASE-4800 HBASE-4691 HBASE-4682 HBASE-4673 HBASE-4657
HBASE-4626 HBASE-4605 HBASE-4583 HBASE-4561 HBASE-4559 HBASE-4556 HBASE-4536 HBASE-4517
HBASE-4488 HBASE-4454 HBASE-4439 HBASE-4404 HBASE-4387 HBASE-4347 HBASE-4336 HBASE-4335
HBASE-4334 HBASE-4331 HBASE-4296 HBASE-4283 HBASE-4263 HBASE-4242 HBASE-4241 HBASE-4197
HBASE-4178 HBASE-4171 HBASE-4102 HBASE-4071 HBASE-3661 HBASE-3645 HBASE-3584 HBASE-3443
HBASE-3433 HBASE-3387 HBASE-2947 HBASE-2196 HBASE-2195 HDFS-3979 HDFS-744
Managing HBase 0.94
Established monthly release train for 0.94
Contributed >300 features, bug fixes, and perf improvements
Reviewed 1000s of open source patches
Committed 100s of patches
Open Sourced Apache Phoenix – SQL skin on HBase
Salesforce High-level Architecture
Salesforce *is* a database
Salesforce is a Database
Query Parser
Query (SQL)
Parsed Query
Query Optimizer
Plan Generator
Plan Cost Estimator
Evaluation Plan
Query Plan Evaluator
System Catalog
Database
Stats
Tables
Columns
Indexes
Salesforce is a Database
Query Parser
Query (SOQL)
Parsed Query
Query Optimizer
Plan Generator
Plan Cost Estimator
System Catalog
Oracle
Hinted Oracle SQL
Database
Stats
Objects
Fields
Indexes
Salesforce is multi tenant
pod: Tenant A-D
pod: Tenant E-H
pod: Tenant I-O
…
pod = a database instance
•Oracle RAC
•AppServers
•Blob store servers
•Search servers
•Shared SAN storage
•SAN replication for DR
App Server, App Server, App Server, App Server, …
Oracle Node, Oracle Node, Oracle Node, Oracle Node, …
Oracle RAC cluster
Primary Site
Secondary Site
SAN replication
SAN
SAN
SQL/JDBC
Finally: HBase @ Salesforce
Oracle
Hinted Oracle SQL
Query Parser
Query (SOQL)
Parsed Query
Query Optimizer
Plan Generator
Plan Cost Estimator
System Catalog
Database
Stats
Objects
Fields
Indexes
1. External Objects 2. Phoenix SQL
HBase, HBase, HBase, HBase
Where does HBase Fit?
•Separate HBase per pod (close to 50 clusters)
•Logically co-located with Oracle
•Small clusters striped across five racks
•Each cluster’s master service on a different rack
•Identical cluster for DR
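The rack striping described above can be sketched as a round-robin placement. The node names, rack names, and the cluster size of ten are assumptions made for illustration; the slides only state that small clusters are striped across five racks.

```python
from typing import Dict, List

# Hypothetical rack labels; the slides only say "five racks".
RACKS = ["rack-1", "rack-2", "rack-3", "rack-4", "rack-5"]


def stripe(nodes: List[str], racks: List[str]) -> Dict[str, List[str]]:
    """Place nodes round-robin so every rack holds an (almost) equal share."""
    placement: Dict[str, List[str]] = {rack: [] for rack in racks}
    for i, node in enumerate(nodes):
        placement[racks[i % len(racks)]].append(node)
    return placement


def survivors_after_rack_loss(placement: Dict[str, List[str]], lost_rack: str) -> List[str]:
    """Nodes still available when one whole rack goes away."""
    return [
        node
        for rack, rack_nodes in placement.items()
        if rack != lost_rack
        for node in rack_nodes
    ]


nodes = [f"hbase-node-{i}" for i in range(10)]  # assumed cluster size
placement = stripe(nodes, RACKS)
remaining = survivors_after_rack_loss(placement, "rack-3")
print(len(remaining), "of", len(nodes), "nodes survive the loss of rack-3")
```

With an even stripe, losing any single rack removes at most one fifth of the nodes (rounded up), which is the property that makes a rack failure survivable.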
App Server, App Server, App Server, App Server, …
Oracle Node, Oracle Node
HBase Node, HBase Node, …
Oracle Cluster
HBase Node, HBase Node, HBase Node, …
Primary Site
Secondary Site
DR HBase Cluster
Decentralized HBase Replication
SQL/JDBC via Phoenix
HBase Cluster
…
SAN
SAN
Use Cases
1. Audit Trails (Entity History)
• Identity managed in RDBMS
• Indexed in HBase (Phoenix indexes)
• Historical, immutable data only
• No need to reason about updates, split identities, and transactions
2. Archiving (Data Lifecycle Management)
• Objects (rows) moved to HBase
• Identity managed in HBase after move
• Data immutable in HBase
• No Transactions
3. Live data in HBase (BigObjects)
• Mutable data (possibly)
• Everything managed in HBase
• Still no Transactions, yet
• Platform for other teams to use
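For immutable, historical data such as the audit trail, one common HBase row key pattern is tenant, then entity, then a reversed timestamp, so that a prefix scan returns the newest history entries first. The slides do not describe the actual key layout used at Salesforce; the function names, the NUL separator, and the sample identifiers below are all assumptions for illustration.

```python
import struct

MAX_TS = 2**63 - 1  # largest value that fits a signed 64-bit field


def history_row_key(tenant_id: str, entity_id: str, ts_millis: int) -> bytes:
    """Hypothetical key layout: tenant NUL entity NUL reversed-timestamp.

    Assumes the ids never contain a NUL byte.
    """
    if not 0 <= ts_millis <= MAX_TS:
        raise ValueError("timestamp out of range")
    reversed_ts = MAX_TS - ts_millis  # newer events get smaller values
    prefix = f"{tenant_id}\x00{entity_id}\x00".encode("utf-8")
    # Big-endian encoding keeps byte order equal to numeric order.
    return prefix + struct.pack(">q", reversed_ts)


def decode_ts(row_key: bytes) -> int:
    """Recover the original timestamp from the last 8 bytes of the key."""
    (reversed_ts,) = struct.unpack(">q", row_key[-8:])
    return MAX_TS - reversed_ts


# Byte-wise sorting stands in for the sorted order of row keys in a table.
keys = sorted(history_row_key("org42", "acct001", ts) for ts in (1000, 3000, 2000))
newest_first = [decode_ts(k) for k in keys]
print(newest_first)  # [3000, 2000, 1000]
```

Since the data is never updated after it is written, the key can encode the event time directly without any need to reason about rewrites.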
Merrill Lynch Rationalization Data Governance, Audit & Archive
• First Salesforce Enterprise Customer
• On-platform archival compelling versus on-premise
solution from Informatica
• Retention Requirements for 7 Years
Merrill Lynch
“Data Audit, Governance & Lifecycle management is
critical for Merrill; for the entire banking & financial
industry it has become a benchmark requirement”
Heating, ventilation, and air-conditioning in the EU
• Top 10 Platform Users
• Subject to highly variable data governance and
retention requirements
• Significant SAP footprint driving business rules –
need to connect that to Salesforce data for archival
and data retention needs
• Massive service workforce generates significant data
processing challenges
“The Salesforce.com Platform roadmap for Data Archive is
critical for future data management needs”
Michael Roehr, CTO, Vaillant
BMW Enriches Their Customer Perspective
• Sales Cloud available across all German Dealership
Franchises
• All customer data subject to stringent & government-
mandated protection, audit & retention
• Correlations with Car Builder App data enables more
contextual customer interactions
• Car telemetry, used correctly, helps refine product
evolution and customer needs alignment
“Data-driven customer engagement is a
key driver for our enhanced customer
experience”
System Of Record (SOR)
SOR = HA + DR + Backup + M&M
+ Security
HBase at Salesforce.com
Highly Available, Disaster Recovery
• Five-peer ZooKeeper quorum
• Five Quorum Journals (for fs edits)
• Five HMasters
• Three NameNodes (yes, three, we made a patch to run more than one standby)
• HBase Replication to identical hot standby pod in a different data center
– In the event of a disaster we fail a complete pod to the secondary site
• Weekly automated, unattended rolling restarts
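As a worked illustration of the numbers above: majority-based services such as the ZooKeeper ensemble and the HDFS quorum journal stay available as long as a strict majority of peers is alive. The helper names are illustrative and not part of any ZooKeeper or HDFS API, and the arithmetic does not apply to HMasters or NameNodes, which use active/standby failover instead of majority voting.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of peers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be positive")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many peers can be down while a majority is still reachable."""
    return ensemble_size - majority(ensemble_size)


for size in (3, 5):
    print(size, majority(size), tolerated_failures(size))
# prints:
# 3 2 1
# 5 3 2
```

With five peers, two can be unavailable at once. Assuming the peers are spread one per rack, that leaves room for one peer being down for a rolling restart while another is lost to a rack failure.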
Replication
Backup High-level Architecture
Primary pod
HBase 48h
HDFS
Backup per tenant
DR pod
HBase 48h
HDFS
Merkle Tree Verification
Backup per tenant
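The diagram names Merkle tree verification between the primary and DR pods but gives no detail. The following is a generic sketch of the technique, not the Salesforce implementation: each side hashes its data chunks into a tree, the two roots are compared, and only on a mismatch does the comparison descend to find the differing chunks. The chunking into byte strings, the function names, and the sample data are assumptions.

```python
import hashlib
from typing import List, Sequence


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(chunks: Sequence[bytes]) -> List[List[bytes]]:
    """Return the tree as a list of levels: leaf hashes first, root last."""
    if not chunks:
        return [[_h(b"")]]
    level = [_h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        # Pair the last node with itself when the level has odd length.
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [_h(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def root(chunks: Sequence[bytes]) -> bytes:
    return build_levels(chunks)[-1][0]


def differing_chunks(a: Sequence[bytes], b: Sequence[bytes]) -> List[int]:
    """Indexes of chunks that differ, found by descending from the root."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes both sides have equally many chunks")
    la, lb = build_levels(a), build_levels(b)
    if la[-1][0] == lb[-1][0]:
        return []  # roots match: no further comparison needed
    suspects = [0]  # start at the root node
    for depth in range(len(la) - 2, -1, -1):
        next_suspects = []
        for i in suspects:
            for child in (2 * i, 2 * i + 1):
                if child < len(la[depth]) and la[depth][child] != lb[depth][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects


primary = [b"row1=a", b"row2=b", b"row3=c", b"row4=d", b"row5=e"]
dr_copy = list(primary)
dr_copy[3] = b"row4=CORRUPT"  # simulate one diverged chunk on the DR side

print(root(primary) == root(list(primary)))  # True
print(differing_chunks(primary, dr_copy))    # [3]
```

The appeal of the approach is that matching replicas are confirmed by comparing a single root hash, and only diverging subtrees need closer inspection.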
Monitoring & Management (M&M)
• Nagios alerts
• Trending via OpenTSDB.
Custom UI on top of the time series data.
• Rolling upgrades
– Eventually scheduled and unattended
• Absolutely no unscheduled downtime.
Not even during a rack failure.
A. Why HBase?
B. Interacting with the open source community
C. HBase at Salesforce
Lars Hofhansl
http://hadoop-hbase.blogspot.com
10 Best Practices using Flow - Darrell DeVeaux10 Best Practices using Flow - Darrell DeVeaux
10 Best Practices using Flow - Darrell DeVeaux
Salesforce Admins
 
Operationalizing Big Data as a Service
Operationalizing Big Data as a ServiceOperationalizing Big Data as a Service
Operationalizing Big Data as a Service
Salesforce Engineering
 
Df14 Building Machine Learning Systems with Apex
Df14 Building Machine Learning Systems with ApexDf14 Building Machine Learning Systems with Apex
Df14 Building Machine Learning Systems with Apex
pbattisson
 
Data Democracy: Use Lightning Connect & Heroku to Visualize any Data, Anywhere
Data Democracy: Use Lightning Connect & Heroku to Visualize any Data, AnywhereData Democracy: Use Lightning Connect & Heroku to Visualize any Data, Anywhere
Data Democracy: Use Lightning Connect & Heroku to Visualize any Data, Anywhere
Salesforce Developers
 
Forcelandia 2016 Wave App Development
Forcelandia 2016   Wave App DevelopmentForcelandia 2016   Wave App Development
Forcelandia 2016 Wave App Development
Skip Sauls
 
Docker on Heroku のはじめ方
Docker on Heroku のはじめ方Docker on Heroku のはじめ方
Docker on Heroku のはじめ方
Takashi Abe
 
Finding relevant results faster with Elasticsearch
Finding relevant results faster with ElasticsearchFinding relevant results faster with Elasticsearch
Finding relevant results faster with Elasticsearch
Elasticsearch
 
Doc is Dead! How Walkthroughs Changed Salesforce's Content Strategy
Doc is Dead! How Walkthroughs Changed Salesforce's Content StrategyDoc is Dead! How Walkthroughs Changed Salesforce's Content Strategy
Doc is Dead! How Walkthroughs Changed Salesforce's Content Strategy
Gavin Austin
 
Loading Data into the Analytics Cloud with Apex
Loading Data into the Analytics Cloud with ApexLoading Data into the Analytics Cloud with Apex
Loading Data into the Analytics Cloud with Apex
Salesforce Developers
 

HBase at Salesforce.com

  • 1. HBase @ Salesforce Lars Hofhansl Architect, Father, Meditator, Aikido Blackbelt http://hadoop-hbase.blogspot.com
  • 2. Safe harbor statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services. The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent fiscal quarter. 
These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site. Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements. Safe Harbor
  • 3. Why HBase? • SAN • RDBMS • Transactions
  • 5. A. Why HBase? B. Interacting with the open source community C. HBase at Salesforce
  • 6. Size Matters* New Salesforce customer: • “How many rows do you have?” • We will turn folks away if they have too many! Data storage is expensive: • SAN storage • Relational database • Too many rows → Too expensive * In a relational world
  • 7. What if in the future we: … have cheaper storage? … never need to ask again about the number of rows? … grow with the data by just adding more machines? (Disclaimer: no transactions, no joins, no secondary indexes, …)
  • 8. (A quick note about) Relational Databases • We love them. They are core to our infrastructure. • SQL and NoSQL NoACID are complementary. • (Almost) everything we do is SQL based (see Phoenix – the SQL layer for HBase.)
  • 9. The Search - Requirements • Consistent – “Eventually consistent stores are 100% consistent 99% of the time” – Ian Varley • Scalable – No “features” impeding horizontal scaling • Persistent – Duh...? • Key lookups • Range lookups • Open source (ASL great, GPLv2 OK, GPLv3/AGPL not acceptable)
  • 10. Enter HBase “A Sparse, Consistent, Distributed, Multidimensional, Persistent, Sorted Map”
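The quoted definition, together with the key-lookup and range-lookup requirements from the previous slide, can be illustrated with a toy in-memory structure. This is only a sketch of the idea, not HBase's implementation or API: the class `SortedMapSketch` and its methods are hypothetical names, and it assumes cells are addressed by (row, column, timestamp) with the newest version winning.

```python
from bisect import bisect_left, insort


class SortedMapSketch:
    """Toy sorted map: (row, column, timestamp) -> value. Hypothetical, not HBase."""

    def __init__(self):
        # Keys are kept sorted; the timestamp is negated so that the newest
        # version of a cell sorts first within its (row, column).
        self._keys = []
        self._values = {}

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column):
        """Key lookup: newest version of one cell, or None if absent (sparse)."""
        i = bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan(self, start_row, stop_row):
        """Range lookup: cells with start_row <= row < stop_row, in key order."""
        i = bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < stop_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._values[self._keys[i]]
            i += 1
```

Because the keys are sorted, a range lookup is one binary search followed by a sequential walk, which is why sorted row keys make range scans cheap. Distribution and persistence are left out of the sketch entirely.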
  • 11. Salesforce and the HBase Community
  • 12. To Fork or not to Fork – that is the question Fork - pros • Agility. No waiting for community review. Just get stuff done • Freedom. Patches that might not be acceptable to the community Fork - cons • Lose out on community work • Patches not useful to other parties There is no right or wrong. It’s a matter of choice, taste, and requirements.
  • 13. HBase Development @ Salesforce • No fork of HBase. • Internal HBase/HDFS branch for possible emergency fixes • All fixes are cleaned and contributed back • We switch to the next open source point release periodically
  • 14. PMC member, 2 committers, release manager, contributors HBASE-11042 HBASE-11040 HBASE-11037 HBASE-11030 HBASE-11029 HBASE-11024 HBASE-11022 HBASE-11010 HBASE-10996 HBASE-10989 HBASE-10988 HBASE-10987 HBASE-10982 HBASE-10969 HBASE-10847 HBASE-10805 HBASE-10722 HBASE-10706 HBASE-10642 HBASE-10594 HBASE-10562 HBASE-10551 HBASE-10546 HBASE-10505 HBASE-10501 HBASE-10489 HBASE-10470 HBASE-10420 HBASE-10416 HBASE-10383 HBASE-10363 HBASE-10320 HBASE-10317 HBASE-10286 HBASE-10284 HBASE-10281 HBASE-10279 HBASE-10259 HBASE-10257 HBASE-10250 HBASE-10181 HBASE-10117 HBASE-10076 HBASE-10058 HBASE-10057 HBASE-10015 HBASE-9993 HBASE-9971 HBASE-9956 HBASE-9915 HBASE-9865 HBASE-9834 HBASE-9807 HBASE-9799 HBASE-9789 HBASE-9778 HBASE-9751 HBASE-9749 HBASE-9732 HBASE-9731 HBASE-9711 HBASE-9658 HBASE-9584 HBASE-9566 HBASE-9534 HBASE-9429 HBASE-9428 HBASE-9377 HBASE-9356 HBASE-9344 HBASE-9301 HBASE-9266 HBASE-9231 HBASE-9221 HBASE-9186 HBASE-9158 HBASE-9103 HBASE-9097 HBASE-9049 HBASE-8971 HBASE-8945 HBASE-8930 HBASE-8912 HBASE-8858 HBASE-8809 HBASE-8767 HBASE-8702 HBASE-8698 HBASE-8684 HBASE-8671 HBASE-8636 HBASE-8525 HBASE-8503 HBASE-8355 HBASE-8316 HBASE-8229 HBASE-8188 HBASE-8166 HBASE-8151 HBASE-8110 HBASE-8108 HBASE-8055 HBASE-8008 HBASE-7999 HBASE-7947 HBASE-7945 HBASE-7817 HBASE-7801 HBASE-7729 HBASE-7725 HBASE-7717 HBASE-7709 HBASE-7702 HBASE-7681 HBASE-7617 HBASE-7602 HBASE-7578 HBASE-7550 HBASE-7499 HBASE-7497 HBASE-7483 HBASE-7466 HBASE-7465 HBASE-7455 HBASE-7438 HBASE-7435 HBASE-7432 HBASE-7431 HBASE-7417 HBASE-7415 HBASE-7371 HBASE-7336 HBASE-7293 HBASE-7279 HBASE-7270 HBASE-7252 HBASE-7240 HBASE-7215 HBASE-7214 HBASE-7180 HBASE-7177 HBASE-7166 HBASE-7165 HBASE-7091 HBASE-7069 HBASE-7051 HBASE-7047 HBASE-7021 HBASE-7010 HBASE-6996 HBASE-6974
  • 15. PMC member, 2 committers, release manager, contributors HBASE-6949 HBASE-6946 HBASE-6912 HBASE-6889 HBASE-6879 HBASE-6868 HBASE-6865 HBASE-6863 HBASE-6797 HBASE-6796 HBASE-6784 HBASE-6765 HBASE-6757 HBASE-6755 HBASE-6711 HBASE-6707 HBASE-6690 HBASE-6667 HBASE-6638 HBASE-6637 HBASE-6621 HBASE-6582 HBASE-6580 HBASE-6579 HBASE-6573 HBASE-6571 HBASE-6570 HBASE-6569 HBASE-6568 HBASE-6561 HBASE-6523 HBASE-6522 HBASE-6505 HBASE-6504 HBASE-6496 HBASE-6495 HBASE-6441 HBASE-6439 HBASE-6427 HBASE-6426 HBASE-6421 HBASE-6406 HBASE-6355 HBASE-6347 HBASE-6326 HBASE-6296 HBASE-6293 HBASE-6291 HBASE-6178 HBASE-6138 HBASE-6113 HBASE-6112 HBASE-6110 HBASE-6087 HBASE-5961 HBASE-5955 HBASE-5909 HBASE-5884 HBASE-5871 HBASE-5865 HBASE-5782 HBASE-5775 HBASE-5774 HBASE-5682 HBASE-5670 HBASE-5659 HBASE-5641 HBASE-5609 HBASE-5604 HBASE-5574 HBASE-5569 HBASE-5548 HBASE-5547 HBASE-5541 HBASE-5526 HBASE-5523 HBASE-5509 HBASE-5497 HBASE-5460 HBASE-5455 HBASE-5440 HBASE-5431 HBASE-5368 HBASE-5350 HBASE-5348 HBASE-5318 HBASE-5304 HBASE-5266 HBASE-5229 HBASE-5203 HBASE-5118 HBASE-5096 HBASE-5088 HBASE-5084 HBASE-5070 HBASE-5058 HBASE-5005 HBASE-5001 HBASE-4998 HBASE-4981 HBASE-4979 HBASE-4945 HBASE-4886 HBASE-4874 HBASE-4870 HBASE-4838 HBASE-4805 HBASE-4800 HBASE-4691 HBASE-4682 HBASE-4673 HBASE-4657 HBASE-4626 HBASE-4605 HBASE-4583 HBASE-4561 HBASE-4559 HBASE-4556 HBASE-4536 HBASE-4517 HBASE-4488 HBASE-4454 HBASE-4439 HBASE-4404 HBASE-4387 HBASE-4347 HBASE-4336 HBASE-4335 HBASE-4334 HBASE-4331 HBASE-4296 HBASE-4283 HBASE-4263 HBASE-4242 HBASE-4241 HBASE-4197 HBASE-4178 HBASE-4171 HBASE-4102 HBASE-4071 HBASE-3661 HBASE-3645 HBASE-3584 HBASE-3443 HBASE-3433 HBASE-3387 HBASE-2947 HBASE-2196 HBASE-2195 HDFS-3979 HDFS-744
  • 17. Established monthly release train for 0.94
  • 18. Contributed >300 features, bug fixes, and performance improvements
  • 19. Reviewed thousands of open source patches
  • 21. Open Sourced Apache Phoenix – SQL skin on HBase
  • 23. Salesforce *is* a database
  • 24. Salesforce is a Database [Diagram labels: Query (SQL), Query Parser, Parsed Query, Query Optimizer (Plan Generator, Plan Cost Estimator), Evaluation Plan, Query Plan Evaluator, System Catalog (Database Stats, Tables, Columns, Indexes)]
  • 25. Salesforce is a Database [Diagram labels: Query (SOQL), Query Parser, Parsed Query, Query Optimizer (Plan Generator, Plan Cost Estimator), Hinted Oracle SQL, Oracle, System Catalog (Database Stats, Objects, Fields, Indexes)]
  • 28. pod = a database instance • Oracle RAC • AppServers • Blob store servers • Search servers • Shared SAN storage • SAN replication for DR [Diagram labels: App Servers, SQL/JDBC, Oracle Nodes (Oracle RAC cluster), SAN, Primary Site, SAN replication, Secondary Site, SAN]
  • 29. Finally: HBase @ Salesforce
  • 30. Where does HBase Fit? [Diagram labels: Query (SOQL), Query Parser, Parsed Query, Query Optimizer (Plan Generator, Plan Cost Estimator), Hinted Oracle SQL, Oracle, System Catalog (Database Stats, Objects, Fields, Indexes), HBase, 1. External Objects, 2. Phoenix SQL]
  • 31. Where does HBase Fit? • Separate HBase per pod (close to 50 clusters) • Logically co-located with Oracle • Small clusters striped across five racks • Each cluster’s master service on a different rack • Identical cluster for DR [Diagram labels: App Servers, SQL/JDBC via Phoenix, Oracle Nodes (Oracle Cluster), HBase Nodes (HBase Cluster), SAN, Primary Site, Decentralized HBase Replication, Secondary Site, DR HBase Cluster, SAN]
  • 33. 1. Audit Trails (Entity History) • Identity managed in RDBMS • Indexed in HBase (Phoenix indexes) • Historical, immutable data only • No need to reason about updates, split identities, and transactions
  • 34. 2. Archiving (Data Lifecycle Management) • Objects (rows) moved to HBase • Identity managed in HBase after move • Data immutable in HBase • No Transactions
  • 35. 3. Live data in HBase (BigObjects) • Mutable data (possibly) • Everything managed in HBase • Still no Transactions, yet • Platform for other team to use
  • 36. Merrill Lynch Rationalization Data Governance, Audit & Archive • First Salesforce Enterprise Customer • On-platform archival compelling versus on-premise solution from Informatica • Retention requirements for 7 years Merrill Lynch “Data Audit, Governance & Lifecycle management is critical for Merrill for the entire banking & financial industry has become a benchmark requirement”
  • 37. Heating, ventilation, and air-conditioning in the EU • Top 10 Platform Users • Subject to highly variable data governance and retention requirements • Significant SAP footprint driving business rules – need to connect that to Salesforce data for archival and data retention needs • Massive service workforce generates significant data processing challenges “The Salesforce.com Platform roadmap for Data Archive is critical for future data management needs” Michael Roehr, CTO Vaillant
  • 38. BMW Enriches Their Customer Perspective • Sales Cloud available across all German dealership franchises • All customer data subject to stringent, government-mandated protection, audit & retention • Correlations with Car Builder App data enable more contextual customer interactions • Car telemetry, used correctly, helps refine product evolution and alignment with customer needs “Data driven customer engagement is a key driver for our enhanced customer experience”
  • 39. System Of Record (SOR) SOR = HA + DR + Backup + M&M + Security
  • 41. Highly Available, Disaster Recovery • Five-peer ZooKeeper quorum • Five Quorum Journals (for fs edits) • Five HMasters • Three NameNodes (yes, three; we made a patch to run more than one standby) • HBase Replication to identical hot standby pod in a different data center – In the event of a disaster we fail a complete pod to the secondary site • Weekly automated, unattended rolling restarts
  • 42. Replication Backup High-level Architecture [Diagram labels: Primary pod (HBase, 48h, HDFS, backup per tenant); DR pod (HBase, 48h, HDFS, backup per tenant); Merkle Tree Verification]
  • 43. Monitoring & Management (M&M) • Nagios alerts • Trending via OpenTSDB. Custom UI on top of the time series data. • Rolling upgrades – Eventually scheduled and unattended • Absolutely no unscheduled downtime. Not even during a rack failure.
  • 44. A. Why HBase? B. Interacting with the open source community C. HBase at Salesforce
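
Slide 42 names Merkle tree verification between the primary and DR pods without describing how it is done. As a rough illustration of the general technique only, and not of Salesforce's implementation, the sketch below hashes rows into leaves, folds them into a single root, and compares the roots of two copies. The function names `merkle_root` and `replicas_match`, the `(row_key, value)` row encoding, and the use of SHA-256 are all assumptions made for this example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest; the hash choice is an assumption for this sketch."""
    return hashlib.sha256(data).digest()


def merkle_root(rows):
    """Return the Merkle root of a list of (row_key, value) byte pairs.

    Rows are expected in sorted key order so both sides build the same tree.
    """
    # Leaf hash: a 0x00 prefix and a length-prefixed key keep leaves distinct
    # from inner nodes and avoid ambiguous key/value concatenation.
    level = [_h(b"\x00" + len(k).to_bytes(4, "big") + k + v) for k, v in rows]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:
            # Simple padding: duplicate the last node on odd-sized levels.
            level.append(level[-1])
        level = [
            _h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


def replicas_match(primary_rows, dr_rows):
    """Hypothetical check: do two copies hold the same rows?"""
    return merkle_root(sorted(primary_rows)) == merkle_root(sorted(dr_rows))


if __name__ == "__main__":
    primary = [(b"r1", b"a"), (b"r2", b"b"), (b"r3", b"c")]
    drifted = [(b"r1", b"a"), (b"r2", b"b"), (b"r3", b"X")]
    print(replicas_match(primary, list(reversed(primary))))
    print(replicas_match(primary, drifted))
```

Comparing only the roots answers whether the two copies agree. A fuller version would keep the intermediate levels so that a mismatch can be narrowed down to a range of rows without rescanning everything.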

Editor's Notes

  • #2: Spent time with StumbleUpon, Facebook, many others. This is a great community.
  • #3: Salesforce is seeing an increasing shift in the center of gravity of customer data. Driving this forward across verticals such as Banking & Finserv requires data audit, driven by post-2008 regulatory requirements and Sarbanes-Oxley requirements. As this data is generated in a transactional environment, we use HBase as our historical and immutable storage.
  • #4: Their use of the Salesforce.com platform to drive their entire business helps keep their dynamic and highly mobile workforce in touch with their data. Given their operating environment in Germany, they are required to deliver complete data audit and use Field History for this. They are also required to keep all customer data for at least 15 years, which is why Archive is so key for them.
  • #5: Across Germany we've had a successful deployment in each franchise to establish new baselines in customer interactions with BMW customers, leases and service interactions. Looking beyond this use case, the capability of marrying together the customer data generated for the BMW Car Builder application with cleansed and anonymized telemetrics data is pushing Salesforce to deliver the concepts and tools to allow BMW to absorb the full spectrum of their customer event data stream and take business actions on it. Imagine how I would feel as a prospective customer if I walked into a dealership and they had more informed knowledge of who I am and my likely preferences. We are using the notion of BigObjects to absorb, store and act on the data that is behind the Internet of Customers.