Building an Enterprise Knowledge Graph @Uber: Lessons from Reality
Joshua Shinavier, PhD
Knowledge Graph Conference
May 8th, 2019
...
The future
Half empty
Half full
The present
Knowledge @Uber
● Uber is an ideal proving ground for an enterprise knowledge graph (EKG)
● 200k managed data sets
● Billions and billions of trips served
○ Low thousands of new entities per second
○ Totally doable!
● Even more sensor data
○ Use cases for graph stream processing
● Genuine need for knowledge and real-time inference
EKG hierarchy of needs
Building an Enterprise Knowledge Graph @Uber: Lessons from Reality
● Real data is messy
● We are not all ontologists
● Good enough does not scale
● Beware of the hype cycle
● RDF is a hard sell
● Property Graphs are not enough
● Use and promote standards
● Invest in shared vocabulary
● Fit the tooling to the infrastructure
● Fit the data model to the data
● Budget for “other stuff”
● Collaborate early and often
Risk & Safety Knowledge Graph
This slide intentionally left blank to save entropy.
UBER KNOWLEDGE GRAPH
● Controlled vocabularies for all of Uber
○ Basic type aliases
○ Structured types for geospatial data, sensor data, money, etc.
○ Entities and relationships (User, Vehicle, Trip, etc.)
○ Metadata vocabularies
● Elevates domain-specific RPC and storage schemas to ontologies
● Tooling carries schemas between data representation languages
○ Protobuf, Thrift, Avro, RDF, PG, etc.
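The idea of tooling that carries one schema between representation languages can be sketched as follows. This is a minimal illustration under stated assumptions, not Uber's tooling: the `Field` and `RecordType` classes, the `to_avro` and `to_proto` helpers, and the `Trip` fields are all hypothetical names invented for the example. Only the Avro record-schema JSON layout and the proto3 message syntax are taken from those formats.

```python
import json
from dataclasses import dataclass
from typing import List


# Hypothetical neutral schema model (illustrative names, not from the talk).
@dataclass
class Field:
    name: str
    type: str  # one of: "string", "long", "double", "boolean"


@dataclass
class RecordType:
    name: str
    fields: List[Field]


# How each neutral primitive is spelled in the two target languages.
AVRO_TYPES = {"string": "string", "long": "long", "double": "double", "boolean": "boolean"}
PROTO_TYPES = {"string": "string", "long": "int64", "double": "double", "boolean": "bool"}


def to_avro(record: RecordType) -> str:
    """Render the record as an Avro record schema (JSON text)."""
    schema = {
        "type": "record",
        "name": record.name,
        "fields": [{"name": f.name, "type": AVRO_TYPES[f.type]} for f in record.fields],
    }
    return json.dumps(schema, indent=2)


def to_proto(record: RecordType) -> str:
    """Render the record as a proto3 message; field numbers follow declaration order."""
    lines = [f"message {record.name} {{"]
    for number, f in enumerate(record.fields, start=1):
        lines.append(f"  {PROTO_TYPES[f.type]} {f.name} = {number};")
    lines.append("}")
    return "\n".join(lines)


# Assumed example entity; the field names are made up for illustration.
trip = RecordType("Trip", [
    Field("trip_id", "string"),
    Field("rider_id", "string"),
    Field("fare_amount", "double"),
])

print(to_avro(trip))
print(to_proto(trip))
```

Defining the type once and generating each target keeps the vocabulary in a single place. A real translator would also need to handle nested types, optional fields, and stable field numbering, which this sketch leaves out.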
Data Standardization
● Hundreds of thousands of structured datasets at Uber
● Data protections and user trust
○ GDPR and other regulations, Uber’s own data policies
○ What kind of user data? Where is it?
○ Heroic numbers of manual annotations
■ Limited expressivity, limited guarantees
■ Inference is required
● Two birds, one stone: annotating datasets also standardizes and composes their schemas
○ Now we have a true global knowledge graph
○ Investigating efficient reasoning and “No ETL” solutions
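One way to read "inference is required" is that labels attached once to shared types can be propagated to every dataset whose schema is composed from those types, instead of annotating each dataset by hand. The sketch below assumes exactly that reading. The type names, label names, dataset names, and the `inferred_labels` function are hypothetical, and the traversal is a plain transitive closure rather than any particular reasoner.

```python
from typing import Dict, Optional, Set

# Hypothetical shared vocabulary: type name -> {field name: field type name}.
TYPES: Dict[str, Dict[str, str]] = {
    "User": {"email": "EmailAddress", "id": "Uuid"},
    "Trip": {"rider": "User", "pickup": "GeoPoint"},
    "GeoPoint": {"lat": "Double", "lng": "Double"},
}

# Manual annotations, made once on shared types (assumed labels).
ANNOTATIONS: Dict[str, Set[str]] = {
    "EmailAddress": {"personal_data"},
    "GeoPoint": {"location_data"},
}

# Hypothetical datasets and the record type each one stores.
DATASETS: Dict[str, str] = {
    "trips_raw": "Trip",
    "geofences": "GeoPoint",
    "user_ids": "Uuid",
}


def inferred_labels(type_name: str, seen: Optional[Set[str]] = None) -> Set[str]:
    """Collect the labels on a type and on every type reachable through its fields."""
    seen = set() if seen is None else seen
    if type_name in seen:  # guards against recursive type definitions
        return set()
    seen.add(type_name)
    labels = set(ANNOTATIONS.get(type_name, set()))
    for field_type in TYPES.get(type_name, {}).values():
        labels |= inferred_labels(field_type, seen)
    return labels


for dataset, record_type in DATASETS.items():
    print(dataset, sorted(inferred_labels(record_type)))
# trips_raw ['location_data', 'personal_data']
# geofences ['location_data']
# user_ids []
```

Here two annotations cover three datasets. The same closure answers "what kind of user data, and where is it?" for any new dataset built from the shared types.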
Metadata graph
● Common data model for RPC, storage, and KR at Uber
● In progress: alignment with the Property Graph Schema Working Group
● In progress: “Universal structure” of TinkerPop4
Algebraic Property Graphs
● Real data is messy
● We are not all ontologists
● Good enough does not scale
● Beware of the hype cycle
● RDF is a hard sell
● The Property Graph is not enough
● Use and promote standards
● Invest in shared vocabulary
● Fit the tooling to the infrastructure
● Fit the data model to the data
● Budget for “other stuff”
● Collaborate early and often
joshsh@uber.com
Thanks
fennec fox optimization algorithm for optimal solution
shallal2
 
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdfKit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Wonjun Hwang
 
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make .pptx
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make   .pptxWebinar - Top 5 Backup Mistakes MSPs and Businesses Make   .pptx
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make .pptx
MSP360
 
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptxReimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
John Moore
 
The Microsoft Excel Parts Presentation.pdf
The Microsoft Excel Parts Presentation.pdfThe Microsoft Excel Parts Presentation.pdf
The Microsoft Excel Parts Presentation.pdf
YvonneRoseEranista
 

Building an Enterprise Knowledge Graph @Uber: Lessons from Reality

  • 1. Building an Enterprise Knowledge Graph @Uber: Lessons from Reality. Joshua Shinavier, PhD. Knowledge Graph Conference, May 8th, 2019 ...
  • 2. The future Half empty Half full The present Knowledge @Uber
  • 3. Knowledge @Uber
       ● Uber is an ideal proving ground for an enterprise knowledge graph (EKG)
       ● 200k managed data sets
       ● Billions and billions of trips served
         ○ Low thousands of new entities per second
         ○ Totally doable!
       ● Even more sensor data
         ○ Use cases for graph stream processing
       ● Genuine need for knowledge and real-time inference
  • 6. ● Real data is messy
  • 7. ● Real data is messy ● We are not all ontologists
  • 8. ● Real data is messy ● We are not all ontologists ● Good enough does not scale
  • 9. ● Real data is messy ● We are not all ontologists ● Good enough does not scale ● Beware of the hype cycle
  • 10. ● Real data is messy ● We are not all ontologists ● Good enough does not scale ● Beware of the hype cycle ● RDF is a hard sell
  • 11. ● Real data is messy ● We are not all ontologists ● Good enough does not scale ● Beware of the hype cycle ● RDF is a hard sell ● Property Graphs are not enough
  • 12. ● Use and promote standards
  • 13. ● Use and promote standards ● Invest in shared vocabulary
  • 14. ● Use and promote standards ● Invest in shared vocabulary ● Fit the tooling to the infrastructure
  • 15. ● Use and promote standards ● Invest in shared vocabulary ● Fit the tooling to the infrastructure ● Fit the data model to the data
  • 16. ● Use and promote standards ● Invest in shared vocabulary ● Fit the tooling to the infrastructure ● Fit the data model to the data ● Budget for “other stuff”
  • 17. ● Use and promote standards ● Invest in shared vocabulary ● Fit the tooling to the infrastructure ● Fit the data model to the data ● Budget for “other stuff” ● Collaborate early and often
  • 18. Risk & Safety Knowledge Graph This slide intentionally left blank to save entropy. UBER KNOWLEDGE GRAPH
  • 19. Data Standardization
       ● Controlled vocabularies for all of Uber
         ○ Basic type aliases
         ○ Structured types for geospatial data, sensor data, money, etc. etc.
         ○ Entities and relationships (User, Vehicle, Trip, etc.)
         ○ Metadata vocabularies
       ● Elevates domain-specific RPC and storage schemas to ontologies
       ● Tooling carries schemas between data representation languages
         ○ Protobuf, Thrift, Avro, RDF, PG, etc.
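The idea of tooling that carries one schema between representation languages can be illustrated with a minimal sketch. This is not Uber's tooling: the `Field` and `Entity` classes, the type-mapping tables, and the `Trip` fields are all hypothetical, and only two targets (an Avro record schema and a Protobuf message) are rendered.

```python
from dataclasses import dataclass

# Hypothetical mapping from neutral primitive types to each target language.
AVRO_TYPES = {"string": "string", "int64": "long", "double": "double"}
PROTO_TYPES = {"string": "string", "int64": "int64", "double": "double"}


@dataclass(frozen=True)
class Field:
    name: str
    type: str  # one of the neutral primitive type names above


@dataclass(frozen=True)
class Entity:
    name: str
    fields: tuple  # tuple of Field, in declaration order


def to_avro(entity):
    """Render the entity as an Avro record schema (a JSON-compatible dict)."""
    return {
        "type": "record",
        "name": entity.name,
        "fields": [
            {"name": f.name, "type": AVRO_TYPES[f.type]} for f in entity.fields
        ],
    }


def to_proto(entity):
    """Render the entity as a Protobuf message definition (text)."""
    lines = [f"message {entity.name} {{"]
    # Field numbers are assigned by position; real tooling would need to
    # keep them stable across schema versions.
    for number, f in enumerate(entity.fields, start=1):
        lines.append(f"  {PROTO_TYPES[f.type]} {f.name} = {number};")
    lines.append("}")
    return "\n".join(lines)


# Illustrative entity; the field names are assumptions, not Uber's schema.
trip = Entity(
    "Trip",
    (
        Field("id", "string"),
        Field("distance_meters", "int64"),
        Field("fare", "double"),
    ),
)

print(to_avro(trip))
print(to_proto(trip))
```

Keeping a single neutral definition and generating each target from it is one way to avoid hand-maintaining parallel schemas that drift apart.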
  • 20. Metadata graph
       ● Hundreds of thousands of structured datasets at Uber
       ● Data protections and user trust
         ○ GDPR and other regulations, Uber’s own data policies
         ○ What kind of user data? Where is it?
         ○ Heroic numbers of manual annotations
           ■ Limited expressivity, limited guarantees
           ■ Inference is required
       ● Two birds: in annotating datasets, standardize and compose schemas
         ○ Now we have a true global knowledge graph
         ○ Investigating efficient reasoning and “No ETL” solutions
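One simple form of inference over dataset annotations can be sketched as follows. The slide does not say how inference is done at Uber; this example assumes a hypothetical lineage map (`derived_from`) and a handful of manual labels, and infers that a derived dataset may carry every kind of user data found in its upstream sources.

```python
from collections import deque


def propagate_labels(derived_from, manual_labels):
    """Return labels per dataset, inheriting labels from all upstream datasets.

    derived_from: dict mapping a dataset to the datasets it is built from.
    manual_labels: dict mapping a dataset to its hand-assigned labels.
    """
    # Invert the lineage map so we can walk from sources to derived datasets.
    downstream = {}
    for target, sources in derived_from.items():
        for source in sources:
            downstream.setdefault(source, []).append(target)

    labels = {name: set(tags) for name, tags in manual_labels.items()}
    queue = deque(manual_labels)
    while queue:
        current = queue.popleft()
        for child in downstream.get(current, []):
            child_labels = labels.setdefault(child, set())
            # Revisit a child only when it gains new labels, so the loop
            # terminates even if the lineage graph contains cycles.
            if not labels[current] <= child_labels:
                child_labels |= labels[current]
                queue.append(child)
    return labels


# Hypothetical datasets and labels, for illustration only.
derived_from = {
    "trips_enriched": ["trips_raw", "riders"],
    "weekly_report": ["trips_enriched"],
}
manual_labels = {"riders": {"user_email"}, "trips_raw": {"location"}}

inferred = propagate_labels(derived_from, manual_labels)
print(sorted(inferred["weekly_report"]))
```

This is a deliberately conservative over-approximation: a derived dataset that drops a sensitive column would still inherit its label, so the result is a starting point for review rather than a guarantee.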
  • 21. Algebraic Property Graphs
       ● Common data model for RPC, storage, and KR at Uber
       ● In progress: alignment with the Property Graph Schema Working Group
       ● In progress: “Universal structure” of TinkerPop4
  • 22. ● Real data is messy ● We are not all ontologists ● Good enough does not scale ● Beware of the hype cycle ● RDF is a hard sell ● The Property Graph is not enough ● Use and promote standards ● Invest in shared vocabulary ● Fit the tooling to the infrastructure ● Fit the data model to the data ● Budget for “other stuff” ● Collaborate early and often