SlideShare a Scribd company logo
Mongo & Ecommerce
Mongo & Ecommerce
    A perfect combination
SQL Dilemma
   Pick One
SQL Dilemma
                 Pick One


• Stick to one vertical (Sane schema)
SQL Dilemma
                 Pick One


• Stick to one vertical (Sane schema)
• Flexiblity (Insane schema)
Sane Schema
Sane Schema


• Works fine for a few types of products
Sane Schema


• Works fine for a few types of products
• Not possible when more product types
  introduced.
Let’s Use an Example
Let’s Use an Example
   How about we start with books
Book Product Schema
id:
sku:
product dimensions:                   General Product
shipping weight:
MSRP:
                                      attributes
price:
description:
...

author:       Orson Scott Card
title:      Enders Game
binding:       Hardcover
publication date: July 15, 1994       Book Specific
publisher name: Tor Science Fiction   attributes
number of pages: 352
ISBN:        0812550706
language:       English
...
Seems simple enough
Seems simple enough

What happens when we add another vertical...
            say music albums
Album Product Schema
id:
sku:
product dimensions:                General Product
shipping weight:
MSRP:
                                   attributes stay the
price:                             same
description:
...

artist:      MxPx
title:      Panic                  Album Specific
release date: June 7, 2005
label:       Side One Dummy        attributes are
track listing: [ The Darkest ...   different
language:        English
format:        CD
Okay, it’s getting hairy but
is still manageable, right?
Okay, it’s getting hairy but
is still manageable, right?

    Now the business want to sell jeans
Jeans Product Schema
id:
sku:                         General Product
product dimensions:
shipping weight:             attributes stay the
MSRP:                        same
price:
description:
...

brand:       Lucky
gender:       Mens           Jeans specific
make:        Vintage         attributes are totally
style:      Straight Cut     different ... and not
length:      34
width:       34              consistent across
color:      Hipster          brands & make
material:     Cotten Blend
...
Now we’re screwed
We need a flexible
schema in RDBMS
    We got this ... right?
Common approaches dealing with
unknown unknowns in RDBMS world
Common approaches dealing with
unknown unknowns in RDBMS world
•   EAV (Entity Attribute Value popularized by
    Magento)
Common approaches dealing with
unknown unknowns in RDBMS world
•   EAV (Entity Attribute Value popularized by
    Magento)
•   Single table inheritance aka. Insanely wides tables
    (field 1, field 2, ...)
Common approaches dealing with
unknown unknowns in RDBMS world
•   EAV (Entity Attribute Value popularized by
    Magento)
•   Single table inheritance aka. Insanely wides tables
    (field 1, field 2, ...)
•   Generic columns ( ‘attribute1’, ‘attribute2’, ...)
Common approaches dealing with
unknown unknowns in RDBMS world
•   EAV (Entity Attribute Value popularized by
    Magento)
•   Single table inheritance aka. Insanely wides tables
    (field 1, field 2, ...)
•   Generic columns ( ‘attribute1’, ‘attribute2’, ...)
•   Serialized data in blob (or similar)
Common approaches dealing with
unknown unknowns in RDBMS world
•   EAV (Entity Attribute Value popularized by
    Magento)
•   Single table inheritance aka. Insanely wides tables
    (field 1, field 2, ...)
•   Generic columns ( ‘attribute1’, ‘attribute2’, ...)
•   Serialized data in blob (or similar)
•   Concrete table inheritance
Common approaches dealing with
unknown unknowns in RDBMS world
•   EAV (Entity Attribute Value popularized by
    Magento)
•   Single table inheritance aka. Insanely wides tables
    (field 1, field 2, ...)
•   Generic columns ( ‘attribute1’, ‘attribute2’, ...)
•   Serialized data in blob (or similar)
•   Concrete table inheritance
•   Class table inheritance
None Work Well
    Let’s look at why
EAV
            as popularized by Magento
“For purposes of flexibility, the Magneto database heavily utilizes
         an Entity-Attribute-Value (EAV) data model.

   As is often the case, the cost of flexibility is complexity -
                   Magento is no exception.

  The process of manipulating data in Magento is often more
  “involved” than that typically experienced using traditional
                       relational tables.”
                                          - Varien
EAV
•   Crazy SQL queries.

•   Hundreds of joins in a
    query... or

•   Hundreds of queries
    joined in the application

•   No database enforced
    integrity
Did I say crazy SQL
(this is a single query)
Did I say crazy SQL
(this is a single query)




  You may have trouble reading this in the back
Selecting a single
    product
Single Table Inheritance
       (insanely wide tables)
•   No data integrity
    enforcement

•   Only can use FK for
    common elements

•   Very wasteful (but disk is
    cheap!)

•   Can’t effectively index
Generic Columns
•   No data integrity
    enforcement

•   No data type enforcement

•   Only can use FK for
    common elements

•   Wasteful (but disk is
    cheap!)

•   Can’t index
Serialized in Blob
•   Not searchable

•   No integrity

•   All the disadvantages of a
    document store, but none
    of the advantages

•   Never should be used

•   One exception is Oracle
    XML which operates
    similar to a document store
Concrete Table Inheritance
(a table for each product attribute set)

•   Allows for data integrity

•   Querying across
    attribute sets quite hard
    to do (lots of joins, OR
    statements and full table
    scanning)

•   New table needs to be
    created for each new
    attribute set
Class table inheritance
              (single product table,
         each attribute set in own table)
•   Likely best solution within the
    constraint of SQL

•   Supports data type enforcement

•   No data integrity enforcement

•   Easy querying across categories
    (for browse pages) since
    common data in single table

•   Every set needs a new table

•   Requires a ton of forsight, as
    changes are very complicated
Square Peg Round Hole
Mongo to the Rescue

• Flexible (and sane) Schema
• Easily searchable
• Easily accessible
• Fast
Examples

• grab products.js from http://github.com/
  spf13/mongoProducts
• mongo --shell products.js
• > use mongoProducts
Schema
Flexible Structure




             DVD
CD
Basic Queries
•   db.products.find( { 'details.title': "The
    Matrix" } );
•   db.products.find( { 'details.director': "Frank
    Capra" } );
•   db.products.find( { 'details.actor': "Groucho
    Marx" } );
•   db.products.find( { 'details.tracks': "Violins" } );
•   db.order.find( { 'items.sku': '00e8da9f' } );
Advanced Queries

• db.products.find( { 'pricing.pct_savings':
  { $gt : 15 } } );
• db.products.find( { 'details.actor': 'James
  Stewart', 'details.actor': 'Donna Reed' })
• db.order.find( { 'items.details.actor': /reed$/
  i }, {'items.sku': 1} );
Embedded documents
  great for orders
• Ordered items need to be fixed at the time
  of purchase
• Embed them right in the order
• db.order.find( { 'items.sku': '00e8da9f' } );
• db.order.find( { 'items.details.actor': 'James
  Stewart' } ).count()
What about the $$?
Stricter data
requirements for $$
Stricter data
    requirements for $$
•   For financial systems any data inconsistency is
    unacceptable
Stricter data
    requirements for $$
•   For financial systems any data inconsistency is
    unacceptable
•   Mongo supports atomic single document writes
Stricter data
    requirements for $$
•   For financial systems any data inconsistency is
    unacceptable
•   Mongo supports atomic single document writes
•   Mongo doesn’t support multi-document
    transactions
Stricter data
    requirements for $$
•   For financial systems any data inconsistency is
    unacceptable
•   Mongo supports atomic single document writes
•   Mongo doesn’t support multi-document
    transactions
•   If you need atomic writes across multi-document
    transactions don’t use Mongo.
Stricter data
    requirements for $$
•   For financial systems any data inconsistency is
    unacceptable
•   Mongo supports atomic single document writes
•   Mongo doesn’t support multi-document
    transactions
•   If you need atomic writes across multi-document
    transactions don’t use Mongo.
•   Many ecommerce transactions could be
    accomplished within a single document write.
What about ACID?
What about ACID?
• ACID compliance is difficult to implement
  with a distributed system. For the most
  sensitive data you may want the reliability
  ACID guarantees.
What about ACID?
• ACID compliance is difficult to implement
  with a distributed system. For the most
  sensitive data you may want the reliability
  ACID guarantees.
• Mongo is ACI for single collection writes.
What about ACID?
• ACID compliance is difficult to implement
  with a distributed system. For the most
  sensitive data you may want the reliability
  ACID guarantees.
• Mongo is ACI for single collection writes.
• For many systems ACID compliance for
  recording a purchase is overkill and
  unnecessary.
About Me

• VP Engineering @ OpenSky
• @spf13
• github.com/spf13
• Hacker, Father, Husband, Entrepreneur,
  Skate Punk
About OpenSky
OpenSky is a better way to shop where you buy from people you trust.
OpenSky is a free online platform that helps people discover
amazing products; share them with their friends, family and followers;
and earn money.

Launching in July 2010, the OpenSky is the first ecommerce platform built
on MongoDB.  The OpenSky Platform has been created using MongoDB,
Symfony 2, Doctrine 2, PHP 5.3, jQuery, node.js, Git (with gitflow) and a
touch of Java and Python. We support and contribute to many of these
open source projects.  

OpenSky was founded by John Caplan, former CEO of Ford Models and
former President of the About.com Network. OpenSky is well funded by
Highland Capital Partners, Canaan Partners and Ron Conway and is based
in NYC near Union Square.
Interested in what
        you saw?

• We’re Hiring
• Contact me at jobs@shopOpenSky.com
Questions
Appendix
OpenSky currently
uses a hybrid approach
• OpenSky uses a 3rd party library to
  process transactions it is written to only
  use SQL for transactions. In 2011 we will
  transition off this library.
• OpenSky uses Solr for search.
• All other functions of the site currently on
  MongoDB.
Note on Ecommerce
         and Mongo
  Concerning transactions, it is true that MongoDB doesn’t support
the multi-object kind; however, it does support atomic operations on
    individual documents. And this, combined with the document-
oriented modeling just described and a little creativity, is adequate to
many e-commerce problems. Certainly, if we needed to literally debit
   one account and credit another in the same operation, or if we
  wanted rollback, we’d need full-fledged transactions. However, the
 transactionality provided by MongoDB may be sufficient for most, if
                    not all, e-commerce operations.
                                         - Kyle Banker 10gen

More Related Content

What's hot (20)

PDF
From ActiveRecord to EventSourcing
Emanuele DelBono
 
PPTX
DevSecops: Defined, tools, characteristics, tools, frameworks, benefits and c...
Mohamed Nizzad
 
PDF
Cloud-Native Security
VMware Tanzu
 
PPTX
Introduction to MongoDB.pptx
Surya937648
 
PPT
MYSQL
Ankush Jain
 
PPTX
Software Composition Analysis Deep Dive
Ulisses Albuquerque
 
PPTX
Introduction to MongoDB
MongoDB
 
PDF
분산 트랜잭션 - 큰힘에는 큰 책임이 따른다 [MongoDB]
MongoDB
 
PPTX
Mongo DB 성능최적화 전략
Jin wook
 
PDF
Practical DevSecOps: Fundamentals of Successful Programs
Matt Tesauro
 
PDF
MySQL Tutorial For Beginners | Relational Database Management System | MySQL ...
Edureka!
 
PDF
Bridging the Security Testing Gap in Your CI/CD Pipeline
DevOps.com
 
PPTX
Database Testing.pptx
ssuser88c0fd1
 
PDF
Introduction to SQL Server Security
Jason Strate
 
PPTX
Microservices Architecture - Bangkok 2018
Araf Karsh Hamid
 
PPTX
Overview SQL Server 2019
Juan Fabian
 
ODP
Kong API Gateway
Chris Mague
 
PDF
SQL
kaushal123
 
PPTX
PostgreSQL Database Slides
metsarin
 
PPTX
Introduction to Scala
Mohammad Hossein Rimaz
 
From ActiveRecord to EventSourcing
Emanuele DelBono
 
DevSecops: Defined, tools, characteristics, tools, frameworks, benefits and c...
Mohamed Nizzad
 
Cloud-Native Security
VMware Tanzu
 
Introduction to MongoDB.pptx
Surya937648
 
Software Composition Analysis Deep Dive
Ulisses Albuquerque
 
Introduction to MongoDB
MongoDB
 
분산 트랜잭션 - 큰힘에는 큰 책임이 따른다 [MongoDB]
MongoDB
 
Mongo DB 성능최적화 전략
Jin wook
 
Practical DevSecOps: Fundamentals of Successful Programs
Matt Tesauro
 
MySQL Tutorial For Beginners | Relational Database Management System | MySQL ...
Edureka!
 
Bridging the Security Testing Gap in Your CI/CD Pipeline
DevOps.com
 
Database Testing.pptx
ssuser88c0fd1
 
Introduction to SQL Server Security
Jason Strate
 
Microservices Architecture - Bangkok 2018
Araf Karsh Hamid
 
Overview SQL Server 2019
Juan Fabian
 
Kong API Gateway
Chris Mague
 
PostgreSQL Database Slides
metsarin
 
Introduction to Scala
Mohammad Hossein Rimaz
 

Similar to MongoDB and Ecommerce : A perfect combination (20)

KEY
Augmenting RDBMS with MongoDB for ecommerce
Steven Francia
 
KEY
Blending MongoDB and RDBMS for ecommerce
Steven Francia
 
KEY
Optimize drupal using mongo db
Vladimir Ilic
 
PPT
No SQL and MongoDB - Hyderabad Scalability Meetup
Hyderabad Scalability Meetup
 
KEY
Hybrid MongoDB and RDBMS Applications
Steven Francia
 
PPTX
No sql introduction_v1.1.1
Fan Ang
 
PDF
Multi model-databases
Michael Hackstein
 
PDF
Multi model-databases
ArangoDB Database
 
PPTX
EAV Sytem- Magento EAV Model
Khoa Truong Dinh
 
PPTX
Unify Your Selling Channels in One Product Catalog Service
MongoDB
 
PPTX
Mongo - an intermediate introduction
nklmish
 
PDF
Polyglot Persistence & Multi Model-Databases at JMaghreb3.0
ArangoDB Database
 
PDF
Polyglot Persistence & Multi-Model Databases (FullStack Toronto)
ArangoDB Database
 
PPTX
Prepare for Peak Holiday Season with MongoDB
MongoDB
 
PDF
Hpts 2011 flexible_oltp
Jags Ramnarayan
 
PDF
Managing Database Indexes: A Data-Driven Approach - Amadeus Magrabi
Amadeus Magrabi
 
PDF
10gen telco white_paper
El Taller Web
 
PPTX
Nosql Now 2012: MongoDB Use Cases
MongoDB
 
PPTX
Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...
MongoDB
 
PDF
Multi model-databases 29-10-2014 LJC
ArangoDB Database
 
Augmenting RDBMS with MongoDB for ecommerce
Steven Francia
 
Blending MongoDB and RDBMS for ecommerce
Steven Francia
 
Optimize drupal using mongo db
Vladimir Ilic
 
No SQL and MongoDB - Hyderabad Scalability Meetup
Hyderabad Scalability Meetup
 
Hybrid MongoDB and RDBMS Applications
Steven Francia
 
No sql introduction_v1.1.1
Fan Ang
 
Multi model-databases
Michael Hackstein
 
Multi model-databases
ArangoDB Database
 
EAV Sytem- Magento EAV Model
Khoa Truong Dinh
 
Unify Your Selling Channels in One Product Catalog Service
MongoDB
 
Mongo - an intermediate introduction
nklmish
 
Polyglot Persistence & Multi Model-Databases at JMaghreb3.0
ArangoDB Database
 
Polyglot Persistence & Multi-Model Databases (FullStack Toronto)
ArangoDB Database
 
Prepare for Peak Holiday Season with MongoDB
MongoDB
 
Hpts 2011 flexible_oltp
Jags Ramnarayan
 
Managing Database Indexes: A Data-Driven Approach - Amadeus Magrabi
Amadeus Magrabi
 
10gen telco white_paper
El Taller Web
 
Nosql Now 2012: MongoDB Use Cases
MongoDB
 
Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...
MongoDB
 
Multi model-databases 29-10-2014 LJC
ArangoDB Database
 
Ad

More from Steven Francia (20)

PDF
State of the Gopher Nation - Golang - August 2017
Steven Francia
 
PDF
Building Awesome CLI apps in Go
Steven Francia
 
PDF
The Future of the Operating System - Keynote LinuxCon 2015
Steven Francia
 
PDF
7 Common Mistakes in Go (2015)
Steven Francia
 
PDF
What every successful open source project needs
Steven Francia
 
PDF
7 Common mistakes in Go and when to avoid them
Steven Francia
 
PDF
Go for Object Oriented Programmers or Object Oriented Programming without Obj...
Steven Francia
 
PDF
Painless Data Storage with MongoDB & Go
Steven Francia
 
PDF
Getting Started with Go
Steven Francia
 
PDF
Build your first MongoDB App in Ruby @ StrangeLoop 2013
Steven Francia
 
PDF
Modern Database Systems (for Genealogy)
Steven Francia
 
PPTX
Introduction to MongoDB and Hadoop
Steven Francia
 
PPTX
Future of data
Steven Francia
 
PDF
MongoDB, Hadoop and humongous data - MongoSV 2012
Steven Francia
 
KEY
Big data for the rest of us
Steven Francia
 
KEY
OSCON 2012 MongoDB Tutorial
Steven Francia
 
KEY
Replication, Durability, and Disaster Recovery
Steven Francia
 
KEY
Multi Data Center Strategies
Steven Francia
 
KEY
NoSQL databases and managing big data
Steven Francia
 
KEY
MongoDB, Hadoop and Humongous Data
Steven Francia
 
State of the Gopher Nation - Golang - August 2017
Steven Francia
 
Building Awesome CLI apps in Go
Steven Francia
 
The Future of the Operating System - Keynote LinuxCon 2015
Steven Francia
 
7 Common Mistakes in Go (2015)
Steven Francia
 
What every successful open source project needs
Steven Francia
 
7 Common mistakes in Go and when to avoid them
Steven Francia
 
Go for Object Oriented Programmers or Object Oriented Programming without Obj...
Steven Francia
 
Painless Data Storage with MongoDB & Go
Steven Francia
 
Getting Started with Go
Steven Francia
 
Build your first MongoDB App in Ruby @ StrangeLoop 2013
Steven Francia
 
Modern Database Systems (for Genealogy)
Steven Francia
 
Introduction to MongoDB and Hadoop
Steven Francia
 
Future of data
Steven Francia
 
MongoDB, Hadoop and humongous data - MongoSV 2012
Steven Francia
 
Big data for the rest of us
Steven Francia
 
OSCON 2012 MongoDB Tutorial
Steven Francia
 
Replication, Durability, and Disaster Recovery
Steven Francia
 
Multi Data Center Strategies
Steven Francia
 
NoSQL databases and managing big data
Steven Francia
 
MongoDB, Hadoop and Humongous Data
Steven Francia
 
Ad

Recently uploaded (20)

PDF
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
PPTX
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
PDF
Why Orbit Edge Tech is a Top Next JS Development Company in 2025
mahendraalaska08
 
PDF
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
PDF
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
PDF
CIFDAQ Market Insights for July 7th 2025
CIFDAQ
 
PDF
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
PPTX
UiPath Academic Alliance Educator Panels: Session 2 - Business Analyst Content
DianaGray10
 
PDF
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
PDF
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
PDF
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
PPTX
✨Unleashing Collaboration: Salesforce Channels & Community Power in Patna!✨
SanjeetMishra29
 
PDF
HCIP-Data Center Facility Deployment V2.0 Training Material (Without Remarks ...
mcastillo49
 
PDF
SFWelly Summer 25 Release Highlights July 2025
Anna Loughnan Colquhoun
 
PPTX
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
PDF
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
PDF
Jak MŚP w Europie Środkowo-Wschodniej odnajdują się w świecie AI
dominikamizerska1
 
PPTX
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
PPTX
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
PDF
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
Why Orbit Edge Tech is a Top Next JS Development Company in 2025
mahendraalaska08
 
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
CIFDAQ Market Insights for July 7th 2025
CIFDAQ
 
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
UiPath Academic Alliance Educator Panels: Session 2 - Business Analyst Content
DianaGray10
 
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
✨Unleashing Collaboration: Salesforce Channels & Community Power in Patna!✨
SanjeetMishra29
 
HCIP-Data Center Facility Deployment V2.0 Training Material (Without Remarks ...
mcastillo49
 
SFWelly Summer 25 Release Highlights July 2025
Anna Loughnan Colquhoun
 
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
Jak MŚP w Europie Środkowo-Wschodniej odnajdują się w świecie AI
dominikamizerska1
 
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 

MongoDB and Ecommerce : A perfect combination

  • 2. Mongo & Ecommerce A perfect combination
  • 3. SQL Dilemma Pick One
  • 4. SQL Dilemma Pick One • Stick to one vertical (Sane schema)
  • 5. SQL Dilemma Pick One • Stick to one vertical (Sane schema) • Flexiblity (Insane schema)
  • 7. Sane Schema • Works fine for a few types of products
  • 8. Sane Schema • Works fine for a few types of products • Not possible when more product types introduced.
  • 9. Let’s Use an Example
  • 10. Let’s Use an Example How about we start with books
  • 11. Book Product Schema id: sku: product dimensions: General Product shipping weight: MSRP: attributes price: description: ... author: Orson Scott Card title: Enders Game binding: Hardcover publication date: July 15, 1994 Book Specific publisher name: Tor Science Fiction attributes number of pages: 352 ISBN: 0812550706 language: English ...
  • 13. Seems simple enough What happens when we add another vertical... say music albums
  • 14. Album Product Schema id: sku: product dimensions: General Product shipping weight: MSRP: attributes stay the price: same description: ... artist: MxPx title: Panic Album Specific release date: June 7, 2005 label: Side One Dummy attributes are track listing: [ The Darkest ... different language: English format: CD
  • 15. Okay, it’s getting hairy but is still manageable, right?
  • 16. Okay, it’s getting hairy but is still manageable, right? Now the business want to sell jeans
  • 17. Jeans Product Schema id: sku: General Product product dimensions: shipping weight: attributes stay the MSRP: same price: description: ... brand: Lucky gender: Mens Jeans specific make: Vintage attributes are totally style: Straight Cut different ... and not length: 34 width: 34 consistent across color: Hipster brands & make material: Cotten Blend ...
  • 19. We need a flexible schema in RDBMS We got this ... right?
  • 20. Common approaches dealing with unknown unknowns in RDBMS world
  • 21. Common approaches dealing with unknown unknowns in RDBMS world • EAV (Entity Attribute Value popularized by Magento)
  • 22. Common approaches dealing with unknown unknowns in RDBMS world • EAV (Entity Attribute Value popularized by Magento) • Single table inheritance aka. Insanely wides tables (field 1, field 2, ...)
  • 23. Common approaches dealing with unknown unknowns in RDBMS world • EAV (Entity Attribute Value popularized by Magento) • Single table inheritance aka. Insanely wides tables (field 1, field 2, ...) • Generic columns ( ‘attribute1’, ‘attribute2’, ...)
  • 24. Common approaches dealing with unknown unknowns in RDBMS world • EAV (Entity Attribute Value popularized by Magento) • Single table inheritance aka. Insanely wides tables (field 1, field 2, ...) • Generic columns ( ‘attribute1’, ‘attribute2’, ...) • Serialized data in blob (or similar)
  • 25. Common approaches dealing with unknown unknowns in RDBMS world • EAV (Entity Attribute Value popularized by Magento) • Single table inheritance aka. Insanely wides tables (field 1, field 2, ...) • Generic columns ( ‘attribute1’, ‘attribute2’, ...) • Serialized data in blob (or similar) • Concrete table inheritance
  • 26. Common approaches dealing with unknown unknowns in RDBMS world • EAV (Entity Attribute Value popularized by Magento) • Single table inheritance aka. Insanely wides tables (field 1, field 2, ...) • Generic columns ( ‘attribute1’, ‘attribute2’, ...) • Serialized data in blob (or similar) • Concrete table inheritance • Class table inheritance
  • 27. None Work Well Let’s look at why
  • 28. EAV as popularized by Magento “For purposes of flexibility, the Magneto database heavily utilizes an Entity-Attribute-Value (EAV) data model. As is often the case, the cost of flexibility is complexity - Magento is no exception. The process of manipulating data in Magento is often more “involved” than that typically experienced using traditional relational tables.” - Varien
  • 29. EAV • Crazy SQL queries. • Hundreds of joins in a query... or • Hundreds of queries joined in the application • No database enforced integrity
  • 30. Did I say crazy SQL (this is a single query)
  • 31. Did I say crazy SQL (this is a single query) You may have trouble reading this in the back
  • 33. Single Table Inheritance (insanely wide tables) • No data integrity enforcement • Only can use FK for common elements • Very wasteful (but disk is cheap!) • Can’t effectively index
  • 34. Generic Columns • No data integrity enforcement • No data type enforcement • Only can use FK for common elements • Wasteful (but disk is cheap!) • Can’t index
  • 35. Serialized in Blob • Not searchable • No integrity • All the disadvantages of a document store, but none of the advantages • Never should be used • One exception is Oracle XML which operates similar to a document store
  • 36. Concrete Table Inheritance (a table for each product attribute set) • Allows for data integrity • Querying across attribute sets quite hard to do (lots of joins, OR statements and full table scanning) • New table needs to be created for each new attribute set
  • 37. Class table inheritance (single product table, each attribute set in own table) • Likely best solution within the constraint of SQL • Supports data type enforcement • No data integrity enforcement • Easy querying across categories (for browse pages) since common data in single table • Every set needs a new table • Requires a ton of forsight, as changes are very complicated
  • 39. Mongo to the Rescue • Flexible (and sane) Schema • Easily searchable • Easily accessible • Fast
  • 40. Examples • grab products.js from http://github.com/ spf13/mongoProducts • mongo --shell products.js • > use mongoProducts
  • 43. Basic Queries • db.products.find( { 'details.title': "The Matrix" } ); • db.products.find( { 'details.director': "Frank Capra" } ); • db.products.find( { 'details.actor': "Groucho Marx" } ); • db.products.find( { 'details.tracks': "Violins" } ); • db.order.find( { 'items.sku': '00e8da9f' } );
  • 44. Advanced Queries • db.products.find( { 'pricing.pct_savings': { $gt : 15 } } ); • db.products.find( { 'details.actor': 'James Stewart', 'details.actor': 'Donna Reed' }) • db.order.find( { 'items.details.actor': /reed$/ i }, {'items.sku': 1} );
  • 45. Embedded documents great for orders • Ordered items need to be fixed at the time of purchase • Embed them right in the order • db.order.find( { 'items.sku': '00e8da9f' } ); • db.order.find( { 'items.details.actor': 'James Stewart' } ).count()
  • 48. Stricter data requirements for $$ • For financial systems any data inconsistency is unacceptable
  • 49. Stricter data requirements for $$ • For financial systems any data inconsistency is unacceptable • Mongo supports atomic single document writes
  • 50. Stricter data requirements for $$ • For financial systems any data inconsistency is unacceptable • Mongo supports atomic single document writes • Mongo doesn’t support multi-document transactions
  • 51. Stricter data requirements for $$ • For financial systems any data inconsistency is unacceptable • Mongo supports atomic single document writes • Mongo doesn’t support multi-document transactions • If you need atomic writes across multi-document transactions don’t use Mongo.
  • 52. Stricter data requirements for $$ • For financial systems any data inconsistency is unacceptable • Mongo supports atomic single document writes • Mongo doesn’t support multi-document transactions • If you need atomic writes across multi-document transactions don’t use Mongo. • Many ecommerce transactions could be accomplished within a single document write.
  • 54. What about ACID? • ACID compliance is difficult to implement with a distributed system. For the most sensitive data you may want the reliability ACID guarantees.
  • 55. What about ACID? • ACID compliance is difficult to implement with a distributed system. For the most sensitive data you may want the reliability ACID guarantees. • Mongo is ACI for single collection writes.
  • 56. What about ACID? • ACID compliance is difficult to implement with a distributed system. For the most sensitive data you may want the reliability ACID guarantees. • Mongo is ACI for single collection writes. • For many systems ACID compliance for recording a purchase is overkill and unnecessary.
  • 57. About Me • VP Engineering @ OpenSky • @spf13 • github.com/spf13 • Hacker, Father, Husband, Entrepreneur, Skate Punk
  • 58. About OpenSky OpenSky is a better way to shop where you buy from people you trust. OpenSky is a free online platform that helps people discover amazing products; share them with their friends, family and followers; and earn money. Launching in July 2010, the OpenSky is the first ecommerce platform built on MongoDB.  The OpenSky Platform has been created using MongoDB, Symfony 2, Doctrine 2, PHP 5.3, jQuery, node.js, Git (with gitflow) and a touch of Java and Python. We support and contribute to many of these open source projects.   OpenSky was founded by John Caplan, former CEO of Ford Models and former President of the About.com Network. OpenSky is well funded by Highland Capital Partners, Canaan Partners and Ron Conway and is based in NYC near Union Square.
  • 59. Interested in what you saw? • We’re Hiring • Contact me at [email protected]
  • 62. OpenSky currently uses a hybrid approach • OpenSky uses a 3rd party library to process transactions it is written to only use SQL for transactions. In 2011 we will transition off this library. • OpenSky uses Solr for search. • All other functions of the site currently on MongoDB.
  • 63. Note on Ecommerce and Mongo Concerning transactions, it is true that MongoDB doesn’t support the multi-object kind; however, it does support atomic operations on individual documents. And this, combined with the document- oriented modeling just described and a little creativity, is adequate to many e-commerce problems. Certainly, if we needed to literally debit one account and credit another in the same operation, or if we wanted rollback, we’d need full-fledged transactions. However, the transactionality provided by MongoDB may be sufficient for most, if not all, e-commerce operations. - Kyle Banker 10gen

Editor's Notes

  • #25: actually, just the first 1/3 of it.
  • #27: Ironically this is how magento solves the performance problems associated with EAV, by caching the data into insanely wide tables.
  • #31: Can’t create a FK as each set references a different table. “Key” really made of attribute table name id and attribute table name
  • #37: Notice how it doesn’t matter if it’s an array or not.
  • #41: Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate. kyle idea... but we are fairly atomic with authorize.net
  • #42: Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate. kyle idea... but we are fairly atomic with authorize.net
  • #43: Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate. kyle idea... but we are fairly atomic with authorize.net
  • #44: Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate. kyle idea... but we are fairly atomic with authorize.net
  • #45: Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate. kyle idea... but we are fairly atomic with authorize.net
  • #46: atomicity, consistency, isolation, durability.
  • #47: atomicity, consistency, isolation, durability.
  • #48: atomicity, consistency, isolation, durability.