TAS - Graph Database - June 2020

Graph databases store data in nodes and edges, representing objects and their relationships. The top graph database vendors include Neo4j, which is an open source native graph database platform. It offers high performance, scalability, and native graph storage and processing. Neo4j is commonly used for applications involving recommendations, master data management, and network analysis due to its ability to intuitively represent complex relationships.

Graph Database

- List of Top Vendors

Technology Assessment Services


June 2020
Understanding Graph Database

Graph Database
• Based upon the concept of a mathematical graph, a graph database contains a collection of nodes and edges
• A node represents an object, and an edge represents the connection or relationship between two objects
• Each node in a graph database is identified by a unique identifier and holds a set of key-value pairs (its properties)
• Additionally, each edge is identified by a unique identifier and records its starting and ending nodes, along with a set of properties
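
As a rough illustration of the node and edge structure described above, the following Python sketch models a tiny in-memory property graph. The names `Node`, `Edge`, `PropertyGraph`, `add_node`, `add_edge` and `neighbours` are assumptions made for this example and do not correspond to the API of any product in this assessment.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """An object in the graph: a unique id plus key-value properties."""
    node_id: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A relationship: a unique id, start and end node ids, a type and properties."""
    edge_id: str
    start: str   # id of the starting node
    end: str     # id of the ending node
    label: str   # relationship type, e.g. "KNOWS" (hypothetical)
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory container; a real graph database persists and indexes this."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: Dict[str, Edge] = {}

    def add_node(self, node_id: str, **properties: Any) -> Node:
        if node_id in self.nodes:
            raise ValueError(f"duplicate node id: {node_id}")
        node = Node(node_id, dict(properties))
        self.nodes[node_id] = node
        return node

    def add_edge(self, edge_id: str, start: str, end: str, label: str,
                 **properties: Any) -> Edge:
        if edge_id in self.edges:
            raise ValueError(f"duplicate edge id: {edge_id}")
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist before an edge is added")
        edge = Edge(edge_id, start, end, label, dict(properties))
        self.edges[edge_id] = edge
        return edge

    def neighbours(self, node_id: str, label: str) -> List[Node]:
        """Nodes reachable from node_id over outgoing edges of the given type."""
        return [self.nodes[e.end] for e in self.edges.values()
                if e.start == node_id and e.label == label]


graph = PropertyGraph()
graph.add_node("p1", name="Alice")
graph.add_node("p2", name="Bob")
graph.add_edge("e1", "p1", "p2", "KNOWS", since=2015)
friend_names = [n.properties["name"] for n in graph.neighbours("p1", "KNOWS")]
```

Here both facts (the `name` properties) and the relationship (`KNOWS`, with its own `since` property) are stored as first-class records, which is the point the bullets above make.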

Traditional databases store data to efficiently store facts, but relationships must be rebuilt with JOINs and other inexact techniques.

Graph databases store both the facts and the relationships between the facts, making certain types of analysis more intuitive.

[Figure: Traditional Database vs. Graph Database]

Graph vs Relational Database?

• Relational databases store highly structured data in tables with predetermined columns and rows, whereas graph databases can map multiple types of relational and complex data
• Thus, graph databases are not rigid in their organization and structure, whereas relational databases are
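
The contrast between rebuilding relationships with JOINs and storing them directly can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `people` and `friendships` tables, their columns and the function names are hypothetical, and real relational engines use indexes and query planners rather than the naive nested scans shown here.

```python
from typing import Dict, List, Tuple

# Relational view: facts live in tables, relationships are foreign-key rows.
people: List[Tuple[int, str]] = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
friendships: List[Tuple[int, int]] = [(1, 2), (1, 3), (2, 3)]  # (person_id, friend_id)


def friends_via_join(person_name: str) -> List[str]:
    """Emulates people JOIN friendships JOIN people by matching keys at query time."""
    result = []
    for pid, name in people:
        if name != person_name:
            continue
        for left, right in friendships:
            if left != pid:
                continue
            for fid, fname in people:
                if fid == right:
                    result.append(fname)
    return result


# Graph view: each node keeps direct references to its related nodes.
adjacency: Dict[str, List[str]] = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Carol"],
    "Carol": [],
}


def friends_via_traversal(person_name: str) -> List[str]:
    """Follows the stored relationships directly; no key-matching step is needed."""
    return list(adjacency[person_name])


def friends_of_friends(person_name: str) -> List[str]:
    """Two-hop traversal: one more lookup per hop instead of one more join."""
    found: List[str] = []
    for friend in adjacency[person_name]:
        for fof in adjacency[friend]:
            if fof != person_name and fof not in found:
                found.append(fof)
    return found
```

Both approaches return the same friends for a given person; the difference lies in where the relationship is materialized, at query time in the first case and at write time in the second.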

Graph Database | Technology Assessment Services | June 2020 Source: CBR online, Neo4j, © Capgemini 2020. All rights reserved | 2
Graph Query Languages
• A graph query language is a concrete mechanism for creating, manipulating and querying graph data in a graph database
• Graph query languages are the SQL equivalents for graph DBMSs
• GraphQL, despite its name, is an API query language, while Gremlin, Cypher, SPARQL and now GQL are query languages for graph databases

1 Gremlin
• Most common and widely used graph query language
• It is the query language of the Apache TinkerPop graph computing framework
• Gremlin is a functional, data-flow language that enables users to succinctly express complex traversals on (or queries of) their application's property graph
• Widely adopted and supported by nearly all graph databases supporting Property Graphs (PG)

2 Cypher
• Originally developed by Neo4j as a graph query language that allows users to store and retrieve data from the graph database
• Open source since 2015; the openCypher project provides an open language specification, technology compatibility kit, and reference implementation of the parser, planner, and runtime for Cypher
• openCypher has industry support, most prominently from SAP

3 SPARQL
• Originally developed by the W3C to query data stored in the Resource Description Framework (RDF) format for metadata
• SPARQL (SPARQL Protocol And RDF Query Language) is a W3C standard designed to meet the use cases identified by the RDF Data Access Working Group
• Even though it is also a protocol, for most use cases SPARQL's greatest value is as a query language for RDF graphs (another W3C standard)
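
To make the three languages above easier to compare, the sketch below phrases one question ("who does Alice know?") in each of them and then answers the same question in plain Python over a toy data set. The `Person` label, `KNOWS` relationship, `name` property and `ex:` prefix are hypothetical schema choices made for this example. The query strings are not executed against any database here; only the plain-Python function runs.

```python
from typing import Dict, List, Tuple

# The same question in three graph query languages (illustrative schema).
QUERIES: Dict[str, str] = {
    "gremlin": "g.V().has('Person', 'name', 'Alice').out('KNOWS').values('name')",
    "cypher": "MATCH (a:Person {name: 'Alice'})-[:KNOWS]->(f:Person) RETURN f.name",
    "sparql": (
        "PREFIX ex: <http://example.org/> "
        "SELECT ?friendName WHERE { "
        "?a ex:name 'Alice' . ?a ex:knows ?f . ?f ex:name ?friendName }"
    ),
}

# Toy data as (subject, predicate, object) triples, close to the RDF view.
TRIPLES: List[Tuple[str, str, str]] = [
    ("alice", "name", "Alice"),
    ("bob", "name", "Bob"),
    ("carol", "name", "Carol"),
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
]


def objects(subject: str, predicate: str) -> List[str]:
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate: str, obj: str) -> List[str]:
    """All subjects s such that (s, predicate, obj) is in the data."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def friend_names(person_name: str) -> List[str]:
    """Plain-Python equivalent of the three queries above."""
    names: List[str] = []
    for person in subjects("name", person_name):
        for friend in objects(person, "knows"):
            names.extend(objects(friend, "name"))
    return names
```

Gremlin reads as a step-by-step traversal, while Cypher and SPARQL describe a pattern to match; all three express the same two-step walk that `friend_names` spells out by hand.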

Use cases of Knowledge graphs

Knowledge graph technology powers Google's search engine: the original PageRank algorithm is based on a form of knowledge graph, as are later additions to its search technology.

Social networks rely on this form of information organization to keep track of networks of people and the connections between them, as well as every other data point they use to build a picture of their users, such as favorite artists and movies, events attended and geographical locations.

Streaming services use knowledge graph technology to organize information on their vast catalogs of content, drawing connections between movies and TV shows and the actors, directors or producers who put them together. This helps them to predict what customers might like to watch next, and fosters the "binge-watching" model of consumption.

Industrial firms use knowledge graphs to build accessible models of all of the data they generate and store, and use them for risk management, process monitoring and building "digital twins" – simulated versions of real-world systems which can be used for design, prototyping and training.
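
Since PageRank is mentioned among the use cases above, a minimal sketch of the commonly described power-iteration formulation may help show what a graph algorithm of this kind computes. This is an illustration under stated assumptions, not Google's production algorithm: the `pagerank` function, the damping factor of 0.85, the fixed iteration count and the four-page link graph are all choices made for this example, and every link target is assumed to appear as a key in `links`.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over a dict mapping each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                # A page splits its damped rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: C is linked to by A, B and D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
most_important = max(ranks, key=ranks.get)
```

In this toy graph page C collects links from three other pages and ends up with the highest score, while D, which nothing links to, keeps only the baseline share.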

List of Top 5 Graph Databases
Neo4j - Leading native graph database and graph platform

• Open source
• Neo4j is a native graph database platform that is built to store, query, analyze and manage highly connected data more efficiently than other databases
• Initial Release – 2007
• Current Release - 4.0.5, June 2020

Competitive Advantages
• Database combines everything needed for performance and trust in applications that bring data relationships to the fore
• Native graph storage, native graph processing, graph scalability, high availability, graph clustering, graphs in the cloud, graphs on Spark, built-in ETL, and integration support, plus Cypher, a powerful and expressive language for queries using vastly less code than SQL

Implementation Language
• Java, Scala

Server operating systems
• Linux, OS X, Solaris, Windows

Database Scalability
• Highly scalable both vertically and horizontally, without introducing data integrity or consistency issues, using its Causal Clustering architecture
• Also supports multi-clustering

Applications / Use cases
• Real-Time Recommendations
• Master Data Management
• Identity and Access Mgmt
• Network and IT Operations
• Fraud Detection
• Tax Evasion / AML
• Graph-Based Search
• Graph Analytics & Algorithms
• Graph-powered AI
• Smart Homes, IoT

Graph APIs and other access methods
• Bolt protocol
• Cypher query language
• Java API
• Neo4j-OGM
• Spring Data Neo4j
• TinkerPop 3

Major Customers
• 300 commercial customers and over 750 startups
• eBay, Walmart, Cisco, Citibank, ING, UBS, HP, CenturyLink, Telenor, TomTom, Telia, Comcast, The National Geographic Society, Airbus, Orange, AT&T, Verizon, DHS, US Army, Pitney Bowes, Vanguard, Microsoft, IBM, Thomson Reuters, Amadeus Travel, Caterpillar, Volvo

Supported programming languages
• .Net, Clojure, Elixir, Go, Groovy, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala

OrientDB - First Multi-Model Distributed DBMS with a True Graph Engine

• Open source, multi-model DBMS (Document, Graph, Key/Value)
• Multi-Model means a 2nd generation NoSQL able to manage complex domains with incredible performance
• OrientDB is the first Multi-Model Distributed DBMS with a True Graph Engine
• Initial Release – 2010
• Current Release - 3.1.0, June 2020

Competitive Advantages
• It is touted to be the fastest graph database, and OrientDB's query language is built on SQL
• Can be used as a pure Graph Database or as a Multi-Model database, avoiding the use of multiple DBMS products in the same application
• Supports the creation of schemas around graphs

Implementation Language
• Java

Server operating systems
• All OS with a Java JDK (>= JDK 6)

Scalability
• Supports a Multi-Master + Sharded architecture: all the servers are masters
• Manages relationships without using JOINs, using direct pointers instead. This gives constant performance when traversing relationships, no matter the database size

Applications / Use cases
• Fraud detection
• Fighting Crime
• Investigation, Fraud Detection and prevention
• Data Governance, Master Data Management
• Traffic Management

APIs and other access methods
• TinkerPop technology stack with Blueprints
• Gremlin, Pipes
• Java API
• RESTful HTTP/JSON API

Major Customers
• Verizon, KPMG, AT&T, Expedia, Dell, Comcast, JPMorgan Chase, Schneider Electric, Accenture, CenturyLink, Cisco, SAP, Informatica, Juniper Networks, United Nations, AXA Equitable, Warner Music, Sky, Kaiser Permanente, Pitney Bowes, Vodafone, Orange
• Several clients have moved from Neo4j to OrientDB

Supported programming languages
• .Net, C, C#, C++, Clojure, Java, JavaScript, JavaScript (Node.js), PHP, Python, Ruby, Scala

An independent benchmark study by IBM and the Tokyo Institute of Technology showed that OrientDB is 10x faster than Neo4j on graph operations across all the workloads

ArangoDB - Fast growing native multi-model NoSQL database

• Open source, native multi-model DBMS for graph, document, key/value and search
• All in one engine and accessible with one query language
• Designed to store data natively as key-value pairs, graphs and JSON documents that can be accessed with one declarative query language - AQL
• Initial Release – 2012
• Current Release - 3.6.0, January 2020

Competitive Advantages
• As a native multi-model database, it can be used as a full-blown document store, graph database, search engine or any combination of these technologies
• Strong Data Consistency and Simplified Performance Scaling
• Deployment is very easy with the ArangoDB Starter, and likewise on Kubernetes with the ArangoDB Operator

Implementation Language
• C++

Server operating systems
• Linux, OS X, Windows

Scalability
• Scales both vertically and horizontally
• If performance needs decrease, the backend system can easily be scaled down to save on hardware and operational requirements

Applications / Use cases
• Single View of everything
• Cybersecurity
• Simulations in manufacturing
• Identity & Access Mgmt
• Fraud detection
• Recommendation Engines
• Feature Engineering in ML & AI
• Network Mgmt & Surveillance

APIs and other access methods
• AQL
• Foxx Framework
• Graph API (Gremlin)
• GraphQL query language
• HTTP API
• Java & SpringData
• JSON style queries
• VelocyPack/VelocyStream

Major Customers
• Cisco, Barclays, Refinitiv, Siemens Mentor, Kabbage, Liaison, Douglas, MakeMyTrip, Kaseware, Demonware, Brainhub, Oxford University, IC Manage, Actify

Supported programming languages
• C#, C++, Clojure, Elixir, Go, Java, JavaScript (Node.js), PHP, Python, R, Rust

Gartner Peer Insights recognizes ArangoDB as one of the highest rated operational databases
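
The multi-model idea described for ArangoDB, one store serving key/value, document and graph access, can be sketched in plain Python. This is a toy illustration only, not ArangoDB code and not AQL: the `_key`, `_from` and `_to` attribute names mirror the ones ArangoDB uses for documents and edges as far as this example assumes, while the `users` and `follows` collections and the functions `get`, `find_by` and `outbound` are hypothetical.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]

# Vertex documents, addressable by key (key/value access).
users: Dict[str, Document] = {
    "alice": {"_key": "alice", "name": "Alice", "city": "Paris"},
    "bob": {"_key": "bob", "name": "Bob", "city": "Pune"},
    "carol": {"_key": "carol", "name": "Carol", "city": "Paris"},
}

# Edge documents: ordinary documents that also name their two endpoints.
follows: List[Document] = [
    {"_from": "users/alice", "_to": "users/bob"},
    {"_from": "users/bob", "_to": "users/carol"},
]


def get(key: str) -> Document:
    """Key/value access: fetch one document by its key."""
    return users[key]


def find_by(attribute: str, value: Any) -> List[Document]:
    """Document access: filter documents on any attribute."""
    return [doc for doc in users.values() if doc.get(attribute) == value]


def outbound(start_id: str, max_depth: int) -> List[str]:
    """Graph access: ids reachable over outgoing edges within max_depth hops."""
    frontier = [start_id]
    visited: List[str] = []
    for _ in range(max_depth):
        next_frontier: List[str] = []
        for vertex in frontier:
            for edge in follows:
                target = edge["_to"]
                if (edge["_from"] == vertex and target != start_id
                        and target not in visited):
                    visited.append(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return visited


paris_names = [doc["name"] for doc in find_by("city", "Paris")]
reachable = outbound("users/alice", 2)
```

The same records answer all three kinds of question, which is what lets a multi-model database replace a combination of separate key/value, document and graph products.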

Microsoft Azure Cosmos DB - Native support for NoSQL choices

• Globally distributed, horizontally scalable, multi-model database service
• Azure Cosmos DB provides native support for NoSQL choices
• Initial Release – 2014
• Current Release - NA

Competitive Advantages
• Offers multiple well-defined consistency models
• Guarantees single-digit-millisecond latencies at the 99th percentile, and guarantees high availability with multi-homing capabilities and low latencies anywhere in the world

Implementation Language
• C++

Server operating systems
• Hosted

Scalability
• Indexing, scaling, and geo-replication are handled automatically in the Azure cloud, without any knob-twiddling on the user's end

Applications / Use cases
• Management of customer-generated data, such as blog posts, ratings and comments
• Store catalogs and manage event data
• Supports Microsoft Store and Xbox Live
• Internet of Things

APIs and other access methods
• DocumentDB API
• Graph API (Gremlin)
• MongoDB API
• RESTful HTTP API
• Table API

Major Customers
• ABB, Coca-Cola, Citrix, Caltex, Symantec, Liberty Mutual, ServiceLink, Zeiss, Diply, Archive360, Allscripts, Johnson Controls, Quest, Swedavia Airports, New Zealand Trade & Enterprise, BMI, Siemens Healthineers, ExxonMobil, Aveva, Skype, Rolls-Royce, Kohler, Albertsons-Safeway, SitePro, Bentley, Kognitiv Spark, Cincinnati Children's Hospital Medical Center, Finastra

Supported programming languages
• .Net, C#, Java, JavaScript, JavaScript (Node.js), Python
• MongoDB client drivers written for various programming languages

Amazon Neptune - Fully-managed graph database service

• Fast, reliable graph database built for the cloud
• Supports the popular graph models Property Graph and W3C's RDF, and their respective query languages Apache TinkerPop Gremlin and SPARQL
• Initial Release – 2017
• Current Release - NA

Competitive Advantages
• Fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets
• The core of Amazon Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with milliseconds latency

Implementation Language
• Java, Scala

Server operating systems
• Hosted

Scalability
• Indexing, scaling, and geo-replication are handled automatically in the AWS cloud, without any knob-twiddling on the user's end

Applications / Use cases
• Fraud detection
• Recommendation engines
• Social networking
• Regulatory compliance
• Knowledge graphs
• Supply chain transparency
• Network/IT Operations - including identity and access management, detection of malicious file paths

APIs and other access methods
• RDF 1.1 / SPARQL 1.1
• TinkerPop Gremlin 3.3

Major Customers
• Siemens, AstraZeneca, Samsung, Pearson, Intuit, Amazon Alexa, Thomson Reuters, FINRA, IgnitionOne, Blackfynn, PaySense, LifeOmic

Supported programming languages
• C#, Go, Java, JavaScript, PHP, Python, Ruby, Scala

Other Popular graph databases

[Vendor logo]
• Enterprise Knowledge Graph platform and graph DBMS with high availability, high performance reasoning, and virtualization
• Scalable, secure, and standards-based
• Virtual data Connectors to all major SQL servers, Cassandra, MongoDB and more to easily access data silos
• NLP pipeline, BITES, lets users incorporate unstructured data in addition to SQL and NoSQL data into the knowledge graph
• BI/SQL Server which translates the knowledge graph back into SQL; supported platforms include Tableau, PowerBI, Cognos, and more
• Implementation Language: Java
• Customers: Bosch, Dow Jones, Elsevier, Ericsson, Morgan Stanley, NASA, NIH, Nokia, Salesforce, Siemens, Springer, Raytheon

[Vendor logo] Grakn
• Distributed, hyper-relational database for managing complex data that serves as a knowledge base for cognitive/AI systems
• It stores data in a way that allows machines to understand the meaning of information in the complete context of their relationships
• Consequently, Grakn allows computers to process complex information more intelligently with less human intervention
• Graql is a declarative, knowledge-oriented graph query language that uses machine reasoning for retrieving explicitly stored and implicitly derived knowledge from Grakn
• Implementation Language: Java
• Initial Release – 2016
• Current Release – 1.8.0, June 2020

[Vendor logo] (Ontotext)
• Enterprise RDF and graph database with efficient reasoning, cluster and external index synchronization support
• SPARQL is used as query language
• High-performance semantic repository created by Ontotext
• Implemented in Java and packaged as a Storage and Inference Layer (SAIL) for the RDF4J framework
• Loading, reasoning and query evaluation proceed fast even against huge ontologies and knowledge bases
• MongoDB integration for large-scale metadata management
• Most utilized semantic triplestore for mission-critical enterprise deployments
• Implementation Language: Java
• Initial Release – 2000
• Current Release – 8.8, January 2019

About Capgemini
A global leader in consulting, technology services and digital transformation,
Capgemini is at the forefront of innovation to address the entire breadth of clients’
opportunities in the evolving world of cloud, digital and platforms. Building on its
strong 50-year heritage and deep industry-specific expertise, Capgemini enables
organizations to realize their business ambitions through an array of services from
strategy to operations. Capgemini is driven by the conviction that the business
value of technology comes from and through people. It is a multicultural company
of almost 220,000 team members in more than 40 countries. The Group reported
2019 global revenues of EUR 14.1 billion.

Learn more about us at

www.capgemini.com

This presentation contains information that may be privileged or confidential and is the property of the Capgemini Group.
Copyright © 2020 Capgemini. All rights reserved.
