Nuxeo: from SQL to MongoDB
Florent Guillaume — Director of R&D, Nuxeo
2014-07-03
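The Nuxeo Model
[Diagram: the Nuxeo Platform splits a Document into metadata (<META>) and BLOBS. Metadata goes through the Repository (Store, Read, Cache, Persistence Engine: Insert, Update, Select) to a SQL DB (VCS) or to MongoDB (DBS); BLOBS go through the BlobStore to the FS.]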
Nuxeo Core — Rich Documents
• Scalars
• Strings, Integers, Floats, Booleans, Dates
• Binary blobs (stored using separate BinaryStore service)
• Arrays of scalars
• Complex properties (sub-documents)
• Lists of complex properties
• System properties
• Id, type, facets, lifecycle state, ACL, version flags...
Nuxeo Core — Rich Documents
• Scalar properties and arrays
• dc:title = "My Document"
• dc:contributors = ["bob", "pete", "mary"]
• dc:created = 2014-07-03T12:15:07+0200
• ecm:uuid = 52a7352b-041e-49ed-8676-328ce90cc103
• ecm:primaryType = "MyFile"
• ecm:majorVersion = 2, ecm:minorVersion = 0
• ecm:isLatestMajorVersion = true, ecm:isLatestVersion = false
Nuxeo Core — Rich Documents
• Complex properties and lists of them
• primaryAddress = { street = "1 rue René Clair", zip = "75018", city = "Paris", country = "France" }
• files = [
    { name = "doc.txt", length = 1234, mime-type = "text/plain", data = 0111fefdc8b14738067e54f30e568115 },
    { name = "doc.pdf", length = 29344, mime-type = "application/pdf", data = 20f42df3221d61cb3e6ab8916b248216 }
  ]
Nuxeo Core — Rich Operations
• CRUD
• Create
• Retrieve
• Update
• Delete
• Move
• Copy
• ... but in a Hierarchy
Nuxeo Core — Rich Features
• Security based on ACLs and inheritance
• block bob for Write, allow members for Read
• Proxies (multi-filing)
• Versioning
• Placeless documents (versions, tags, relations...)
• Facets (dynamic typing)
• Locking
• Search (NXQL)

SELECT * FROM File WHERE files/*/name = 'doc.txt'
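As a rough illustration of what the NXQL query above asks for, the following Python sketch evaluates a `files/*/name` style path against a document held as a plain dictionary. The function name `match_path` and the in-memory representation are assumptions made for this example; they are not Nuxeo APIs.

```python
def match_path(doc, path, expected):
    """Return True if any value reached by `path` equals `expected`.

    `path` uses '/' separators; '*' matches every item of a list.
    """
    def walk(value, segments):
        if not segments:
            return value == expected
        head, rest = segments[0], segments[1:]
        if head == "*":
            return isinstance(value, list) and any(walk(v, rest) for v in value)
        return isinstance(value, dict) and head in value and walk(value[head], rest)

    return walk(doc, path.split("/"))


doc = {
    "dc:title": "My Document",
    "files": [
        {"name": "doc.txt", "length": 1234},
        {"name": "doc.pdf", "length": 29344},
    ],
}

print(match_path(doc, "files/*/name", "doc.txt"))
print(match_path(doc, "files/*/name", "other.txt"))
```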
Nuxeo Core — Hierarchy
• Parent-child relationship
• Recursion
• Find all the children to change something
• Lifecycle state
• Security
• Search on a subset of the hierarchy
• ... AND ecm:path STARTSWITH '/workspaces/receipts'
SQL vs DBS/MongoDB
Storage — SQL
• Stores data in a set of JOINed tables
• Star schema, around the main hierarchy
• Lists as JOINed table with item/pos
• Complex properties as sub-documents (children)
• Lists of complex properties as ordered sub-documents
• Id generated by application or database
• String / native UUID / serial integer
Storage — SQL (base hierarchy)
Storage — SQL (simple props)
Storage — SQL (complex props)
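The table layout described above can be sketched with Python's built-in `sqlite3` module. This is a simplified illustration, assuming a `hierarchy` table at the center of the star, one table per schema for simple properties, and one item/pos table per list; the exact table and column names here are assumptions, not the real Nuxeo schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hierarchy (
    id TEXT PRIMARY KEY,
    parentid TEXT REFERENCES hierarchy(id),
    name TEXT,
    primarytype TEXT
);
-- simple properties: one row per document, JOINed on id
CREATE TABLE dublincore (
    id TEXT PRIMARY KEY REFERENCES hierarchy(id),
    title TEXT
);
-- list property: one row per item, ordered by pos
CREATE TABLE dc_contributors (
    id TEXT REFERENCES hierarchy(id),
    pos INTEGER,
    item TEXT
);
""")
conn.execute("INSERT INTO hierarchy VALUES ('doc1', NULL, 'mydoc', 'MyFile')")
conn.execute("INSERT INTO dublincore VALUES ('doc1', 'My Document')")
conn.executemany(
    "INSERT INTO dc_contributors VALUES ('doc1', ?, ?)",
    list(enumerate(["bob", "pete", "mary"])),
)

# Reading one document back needs a JOIN plus a separate ordered list query.
row = conn.execute(
    "SELECT h.primarytype, d.title FROM hierarchy h "
    "JOIN dublincore d ON d.id = h.id WHERE h.id = 'doc1'"
).fetchone()
contributors = [
    item for (item,) in conn.execute(
        "SELECT item FROM dc_contributors WHERE id = 'doc1' ORDER BY pos"
    )
]
print(row, contributors)
```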
Storage — MongoDB
• Standard JSON documents
• Property names fully prefixed
• Lists as arrays of scalars
• Complex properties as sub-documents
• Complex lists as arrays of sub-documents
• Id generated by MongoDB
• Counter using findAndModify, $inc and returnNew
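The counter idea can be emulated in plain Python to show the semantics it relies on: increment a field and get the new value back in one atomic step. This is only an in-memory stand-in, with a lock playing the role of the server's single-document atomicity; the class name, the counter id `"ecm:id"` and the field `"seq"` are assumptions for the example.

```python
import threading


class CounterCollection:
    """In-memory stand-in for a collection holding counter documents."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()  # stands in for per-document atomicity

    def find_and_modify_inc(self, counter_id, field, amount=1):
        """Atomically increment `field` and return the new value (returnNew)."""
        with self._lock:
            doc = self._docs.setdefault(counter_id, {field: 0})
            doc[field] = doc.get(field, 0) + amount
            return doc[field]


counters = CounterCollection()


def worker(out):
    for _ in range(100):
        out.append(counters.find_and_modify_inc("ecm:id", "seq"))


results = []
threads = [threading.Thread(target=worker, args=(results,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every caller received a distinct id, even with concurrent callers.
print(len(results), len(set(results)), max(results))
```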
Storage — MongoDB
"ecm:id": "52a7352b-041e-49ed-8676-328ce90cc103",
"dc:title": "My Document",
"dc:contributors": ["bob", "pete", "mary"],
"dc:created": ISODate("2014-07-03T12:15:07+0200"),
"ecm:primaryType": "MyFile",
"ecm:majorVersion": NumberLong(2),
"ecm:minorVersion": NumberLong(0),
"ecm:isLatestMajorVersion": true,
"ecm:isLatestVersion": false,
Storage — MongoDB
primaryAddress: { street: "1 rue René Clair", zip: "75018",
                  city: "Paris", country: "France" },
files: [{ name: "doc.txt", length: 1234, mime-type: "text/plain",
          data: "0111fefdc8b14738067e54f30e568115" },
        { name: "doc.pdf", length: 29344, mime-type: "application/pdf",
          data: "20f42df3221d61cb3e6ab8916b248216" }],
"ecm:acp": [{
    name: "local",
    acl: [{ grant: false, perm: "Write", user: "bob" },
          { grant: true, perm: "Read", user: "pete" },
          { grant: true, perm: "Read", user: "members" }]
}]
Hierarchy — SQL
• Parent-child relationship
• hierarchy.parentid column
• Recursion optimized through ancestors table
• For each document list all its ancestors
• Maintained by database triggers (create, delete, move, copy)
• Alternative for PostgreSQL: array column with all ancestors
Hierarchy — SQL
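A minimal sketch of the ancestors table, using `sqlite3`. In the deck the table is maintained by database triggers; here a Python helper does it instead, purely to keep the example short. Table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hierarchy (id TEXT PRIMARY KEY, parentid TEXT, name TEXT);
CREATE TABLE ancestors (id TEXT, ancestor TEXT);
""")


def create(doc_id, parent_id, name):
    conn.execute("INSERT INTO hierarchy VALUES (?, ?, ?)", (doc_id, parent_id, name))
    if parent_id is not None:
        # A document's ancestors are its parent's ancestors plus the parent.
        conn.execute(
            "INSERT INTO ancestors (id, ancestor) "
            "SELECT ?, ancestor FROM ancestors WHERE id = ? "
            "UNION ALL SELECT ?, ?",
            (doc_id, parent_id, doc_id, parent_id),
        )


create("root", None, "")
create("ws", "root", "workspaces")
create("receipts", "ws", "receipts")
create("r1", "receipts", "receipt-1")
create("other", "root", "other")

# All descendants of "ws" in one non-recursive query.
descendants = sorted(
    row[0] for row in conn.execute(
        "SELECT id FROM ancestors WHERE ancestor = ?", ("ws",)
    )
)
print(descendants)
```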
Hierarchy — MongoDB
• Parent-child relationship
• ecm:parentId field
• Recursion optimized through ecm:ancestorIds array
• Maintained by framework (create, delete, move, copy)
Hierarchy — MongoDB
"ecm:parentId": "afb488e7",
"ecm:ancestorIds": ["00000000", "18ba9e90", "afb488e7"],

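The following sketch shows one way a framework could maintain `ecm:ancestorIds` on create and move. A plain dictionary stands in for the MongoDB collection, and the helper names (`create`, `move`, `descendants`) are assumptions for illustration only.

```python
docs = {}  # id -> document, standing in for the MongoDB collection


def create(doc_id, parent_id):
    parent = docs.get(parent_id)
    ancestors = parent["ecm:ancestorIds"] + [parent_id] if parent else []
    docs[doc_id] = {"ecm:id": doc_id, "ecm:parentId": parent_id,
                    "ecm:ancestorIds": ancestors}


def descendants(doc_id):
    # Same idea as the query {"ecm:ancestorIds": doc_id}, which matches
    # documents whose array contains that value.
    return [d["ecm:id"] for d in docs.values() if doc_id in d["ecm:ancestorIds"]]


def move(doc_id, new_parent_id):
    if new_parent_id == doc_id or doc_id in docs[new_parent_id]["ecm:ancestorIds"]:
        raise ValueError("move would create a cycle")
    old_prefix_len = len(docs[doc_id]["ecm:ancestorIds"]) + 1
    new_ancestors = docs[new_parent_id]["ecm:ancestorIds"] + [new_parent_id]
    # Rewrite the ancestor prefix of every descendant.
    for d_id in descendants(doc_id):
        d = docs[d_id]
        d["ecm:ancestorIds"] = (new_ancestors + [doc_id]
                                + d["ecm:ancestorIds"][old_prefix_len:])
    docs[doc_id]["ecm:parentId"] = new_parent_id
    docs[doc_id]["ecm:ancestorIds"] = new_ancestors


create("00000000", None)
create("18ba9e90", "00000000")
create("afb488e7", "18ba9e90")
create("child", "afb488e7")
create("other", "00000000")
move("afb488e7", "other")
print(docs["child"]["ecm:ancestorIds"])
```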
Proxies — SQL
• Reference to target document
• proxies.targetid column
• Holds only hierarchy-based information, no content
• Parent, name, ACL...
• Additional JOIN during search
Proxies — MongoDB
• Copy of the target document
• ecm:proxyTargetId field
• Target document knows who's pointing to it
• ecm:proxyIds field
• Maintained by framework
• Copy needs to be kept up to date when target changes
• Maintained by framework
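A small sketch of the proxy-as-copy approach: the proxy duplicates the target's content, the target records its proxies in `ecm:proxyIds`, and an update to the target is propagated to each copy. The dictionary store, the helper names and the `PROXY_OWN_FIELDS` set are assumptions made for this example.

```python
docs = {
    "target": {"ecm:id": "target", "dc:title": "My Document", "ecm:proxyIds": []},
}

# Fields that belong to each document itself and are never copied to a proxy.
PROXY_OWN_FIELDS = {"ecm:id", "ecm:parentId", "ecm:proxyTargetId", "ecm:proxyIds"}


def create_proxy(proxy_id, target_id, parent_id):
    target = docs[target_id]
    proxy = {k: v for k, v in target.items() if k not in PROXY_OWN_FIELDS}
    proxy.update({"ecm:id": proxy_id, "ecm:parentId": parent_id,
                  "ecm:proxyTargetId": target_id})
    docs[proxy_id] = proxy
    target["ecm:proxyIds"].append(proxy_id)


def update_target(target_id, changes):
    target = docs[target_id]
    target.update(changes)
    # Keep every copy in sync with the target.
    for proxy_id in target["ecm:proxyIds"]:
        docs[proxy_id].update(changes)


create_proxy("proxy1", "target", "folderA")
update_target("target", {"dc:title": "Renamed"})
print(docs["proxy1"]["dc:title"])
```

Because the proxy carries the content itself, a search can match it directly without an extra lookup of the target.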
Proxies — Semantics
• What to do when:
• Target removed (→ forbid)
• Proxy removed
• Proxy + target removed at the same time (→ ok)
• Target copied
• Proxy copied (→ new proxy to original target)
• Proxy + target copied at the same time (todo)
Security — SQL
• Generic ACP stored in acls table
• Precomputed Read ACLs needed for search
• Ordered list of identities having access, with blocking

["Management", "Supervisors", "-Temps", "bob"]
• Read ACLs are given an identifier
• Identities having access to which Read ACL is precomputed
• Maintained by database triggers
• Search matches using JOIN
Security — SQL
Security — SQL
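To make the ordered Read ACL concrete, here is a sketch of how such a list could be evaluated for a user. It assumes that entries are checked in order, that the first entry matching one of the user's identities decides, and that a leading "-" marks a blocked identity; these evaluation rules are an interpretation of the slide, not a statement of the actual implementation.

```python
def has_read_access(read_acl, identities):
    """Evaluate an ordered Read ACL; a leading '-' marks a blocked identity."""
    for entry in read_acl:
        blocked = entry.startswith("-")
        name = entry[1:] if blocked else entry
        if name in identities:
            return not blocked  # first matching entry decides
    return False


read_acl = ["Management", "Supervisors", "-Temps", "bob"]

print(has_read_access(read_acl, {"bob"}))                   # bob alone
print(has_read_access(read_acl, {"bob", "Temps"}))          # bob is also a temp
print(has_read_access(read_acl, {"Supervisors", "Temps"}))  # supervisor listed first
```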
Security — MongoDB
• Generic ACP stored in ecm:acp field
• Precomputed Read ACLs needed for search
• Simple set of identities having access

ecm:racl: ["Management", "Supervisors", "bob"]
• Semantic restrictions on blocking
• Maintained by framework
• Search matches if intersection

{"ecm:racl": {"$in": ["bob", "members", "Everyone"]}}
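The `$in` match above amounts to a set intersection between the document's `ecm:racl` array and the user's identities. The sketch below emulates that test in plain Python; `matches_in` is a hypothetical helper, not a MongoDB or Nuxeo function.

```python
def matches_in(field_values, candidates):
    """Emulate {"field": {"$in": candidates}} for an array-valued field."""
    return not set(field_values).isdisjoint(candidates)


docs = [
    {"ecm:id": "a", "ecm:racl": ["Management", "Supervisors", "bob"]},
    {"ecm:id": "b", "ecm:racl": ["Management"]},
    {"ecm:id": "c", "ecm:racl": ["Everyone"]},
]
user_identities = ["bob", "members", "Everyone"]
query = {"ecm:racl": {"$in": user_identities}}

visible = [
    d["ecm:id"] for d in docs
    if matches_in(d["ecm:racl"], query["ecm:racl"]["$in"])
]
print(visible)
```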
Search — SQL
• Translated from NXQL to SQL
• JOIN of all required star/list/complex properties tables
• Additional UNION + JOINs for proxies
• Additional JOIN for security
• Can have correlations (reuse same JOIN)
• Fulltext index(es) on fulltext.simpletext / fulltext.binarytext columns
Search — MongoDB
• Translated from NXQL to MongoDB syntax
• Proxies queried directly
• Security queried by set intersection
• One fulltext index for ecm:fulltextSimple / ecm:fulltextBinary fields
• Some limitations
Search — MongoDB Limitations
• Only one fulltext search per query, restrictions on position
• No generic boolean NOT, must be pushed down as negative operators
• Search is field/value based
• No multi-field operators (title = description, expirationDate > modificationDate)
• No multi-field arithmetic (amount + bonus < 1000)
• Subdocument correlation with $elemMatch is less generic than full JOINs
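Pushing NOT down can be sketched as a small tree rewrite: De Morgan's laws flip AND/OR, and each comparison is replaced by its negative counterpart. The tuple-based expression format and the function names below are assumptions for illustration. The sketch also treats negation as in two-valued logic; documents where the field is missing or null may behave differently in MongoDB.

```python
NEGATED = {"=": "<>", "<>": "=", "<": ">=", ">=": "<", ">": "<=", "<=": ">"}
MONGO_OP = {"<>": "$ne", "<": "$lt", "<=": "$lte", ">": "$gt", ">=": "$gte"}


def push_not(expr, negate=False):
    """Return an equivalent tree with every NOT pushed into the leaves."""
    op = expr[0]
    if op == "NOT":
        return push_not(expr[1], not negate)
    if op in ("AND", "OR"):
        if negate:  # De Morgan: NOT (a AND b) == (NOT a) OR (NOT b)
            op = "OR" if op == "AND" else "AND"
        return (op,) + tuple(push_not(e, negate) for e in expr[1:])
    _, field, value = expr
    return (NEGATED[op] if negate else op, field, value)


def to_mongo(expr):
    """Translate a NOT-free tree into a MongoDB-style query document."""
    op = expr[0]
    if op in ("AND", "OR"):
        return {"$" + op.lower(): [to_mongo(e) for e in expr[1:]]}
    _, field, value = expr
    return {field: value} if op == "=" else {field: {MONGO_OP[op]: value}}


where = ("NOT", ("OR", ("=", "dc:title", "foo"), (">", "ecm:majorVersion", 2)))
mongo_query = to_mongo(push_not(where))
print(mongo_query)
```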
Transactions — SQL
• Standard SQL database capabilities
• Atomic commit
• Two-phase commit (prepare/commit) also usable, although costly
• Rollback
• Transient data is data modified in the database but not yet committed
• Transient data is visible alongside committed data for retrieval and search
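The transient-data behavior can be seen with any SQL database. The sketch below uses `sqlite3` as a stand-in: an uncommitted update is found by a search inside the same transaction, and a rollback discards it. The table name and contents are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dublincore (id TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO dublincore VALUES ('doc1', 'My Document')")
conn.commit()

# Uncommitted (transient) change, visible to a search in the same transaction.
conn.execute("UPDATE dublincore SET title = 'Draft title' WHERE id = 'doc1'")
during = conn.execute(
    "SELECT title FROM dublincore WHERE title LIKE 'Draft%'"
).fetchall()

# Rollback discards the transient change; the committed value is back.
conn.rollback()
after = conn.execute("SELECT title FROM dublincore WHERE id = 'doc1'").fetchone()
print(during, after)
```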
Transactions — MongoDB
• No atomic commit beyond a single document
• Commit using a big batch of create/delete/update accumulated in-memory
• Not atomic, others can see partial state
• No transient space
• Emulate transient space in-memory, flush at commit time
• All accesses and searches must check the transient space as well as MongoDB
Transactions — MongoDB
• No rollback
• Rollback by dropping the in-memory transient space
• Operations involving several documents in relation
• Move, delete, copy, ancestors or recursion checks
• Using transient space + MongoDB for them is too complex
• Flush to MongoDB before doing them (commit)
• Must be able to be rolled back if needed (transaction compensation)
• Others can see state that's eventually invalid
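The emulated transient space can be sketched as a session object that buffers writes in memory, answers reads from the buffer first, flushes on commit and simply drops the buffer on rollback. The `Session` class and the dictionary standing in for MongoDB are assumptions for this example.

```python
class Session:
    """Buffers writes in memory and flushes them to the store at commit time."""

    _DELETED = object()

    def __init__(self, store):
        self.store = store    # dict standing in for the MongoDB collection
        self.transient = {}   # id -> pending document, or _DELETED

    def save(self, doc):
        self.transient[doc["ecm:id"]] = doc

    def delete(self, doc_id):
        self.transient[doc_id] = self._DELETED

    def get(self, doc_id):
        # The transient space takes precedence over what is already stored.
        if doc_id in self.transient:
            doc = self.transient[doc_id]
            return None if doc is self._DELETED else doc
        return self.store.get(doc_id)

    def commit(self):
        # Applied one document at a time: others may observe a partial state.
        for doc_id, doc in self.transient.items():
            if doc is self._DELETED:
                self.store.pop(doc_id, None)
            else:
                self.store[doc_id] = doc
        self.transient.clear()

    def rollback(self):
        # Nothing was written yet, so dropping the buffer is enough.
        self.transient.clear()


store = {}
session = Session(store)
session.save({"ecm:id": "a", "dc:title": "A"})
seen_before_commit = (session.get("a") is not None, "a" in store)
session.rollback()
session.save({"ecm:id": "b", "dc:title": "B"})
session.commit()
print(seen_before_commit, sorted(store))
```

Once a flush has happened in the middle of a transaction, dropping the buffer no longer suffices, which is why the deck mentions compensation for those cases.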
MongoDB — Restrictions
• Eventual consistency and no transactions
• Prevents strong checks
• Duplicate name in a folder
• Move creating cycles
• Remove target before proxy
• Create document in a deleted folder
• Prevents full consistency of hierarchical processing
• Read ACLs, quotas
• Needs background jobs that check consistency
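A background consistency job of the kind mentioned above could look for the violations that cannot be prevented up front. The sketch below checks two of them, duplicate names in a folder and parent cycles, over an in-memory list of documents; the function names and the `ecm:name` field are assumptions for illustration.

```python
from collections import defaultdict


def find_duplicate_names(docs):
    """Map (parent id, name) to the ids sharing it, when there is more than one."""
    seen = defaultdict(list)
    for d in docs:
        seen[(d["ecm:parentId"], d["ecm:name"])].append(d["ecm:id"])
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


def find_cycles(docs):
    """Return the ids of documents that are their own ancestor."""
    parent = {d["ecm:id"]: d["ecm:parentId"] for d in docs}
    in_cycle = set()
    for start in parent:
        path, current = set(), start
        while current is not None and current not in path:
            path.add(current)
            current = parent.get(current)
        if current == start:
            in_cycle.add(start)
    return in_cycle


docs = [
    {"ecm:id": "root", "ecm:parentId": None, "ecm:name": ""},
    {"ecm:id": "f1", "ecm:parentId": "root", "ecm:name": "folder"},
    {"ecm:id": "a", "ecm:parentId": "f1", "ecm:name": "report"},
    {"ecm:id": "b", "ecm:parentId": "f1", "ecm:name": "report"},
    {"ecm:id": "x", "ecm:parentId": "y", "ecm:name": "x"},
    {"ecm:id": "y", "ecm:parentId": "x", "ecm:name": "y"},
]

duplicates = find_duplicate_names(docs)
cycles = find_cycles(docs)
print(duplicates, sorted(cycles))
```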
MongoDB — Features
• Bulk operations
• Map-reduce for aggregations
• Quotas / count / folder content last modified
• Conditional updates
• Locks
• Prevent dirty writes
• GridFS to store binaries
• Sharding
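As an illustration of using conditional updates for locks, the sketch below applies a change only when the document currently matches a condition, which is the shape of an "update where" call. It runs against a plain dictionary, so atomicity is only assumed here (a real server would guarantee it for a single document); `conditional_update` and the `ecm:lockOwner` field are hypothetical names.

```python
def conditional_update(collection, doc_id, condition, changes):
    """Apply `changes` only if every field in `condition` currently matches.

    Returns True when the update was applied.
    """
    doc = collection.get(doc_id)
    if doc is None or any(doc.get(k) != v for k, v in condition.items()):
        return False
    doc.update(changes)
    return True


collection = {"doc1": {"ecm:id": "doc1", "ecm:lockOwner": None}}


def lock(doc_id, user):
    # Succeeds only if nobody holds the lock.
    return conditional_update(collection, doc_id,
                              {"ecm:lockOwner": None}, {"ecm:lockOwner": user})


def unlock(doc_id, user):
    # Succeeds only for the current lock owner.
    return conditional_update(collection, doc_id,
                              {"ecm:lockOwner": user}, {"ecm:lockOwner": None})


first = lock("doc1", "bob")
second = lock("doc1", "pete")
print(first, second, collection["doc1"]["ecm:lockOwner"])
```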
DBS — Future Work
Future Work
• DBS used for more services
• Directories / Vocabularies / User database
• Audit log
• DBS for other backends
• Elasticsearch
• Redis
• PostgreSQL / JSON
• Other...
Thanks!
We're Hiring!
