Graph Databases Josh Adell <josh.adell@gmail.com> 20110719
Who am I?
    Software developer: PHP, JavaScript, SQL
    http://www.dunnwell.com
    Fan of using the right tool for the job
The Problem
The Solution?
    > -- Given "Keanu Reeves" find a connection to "Kevin Bacon"
    > SELECT ??? FROM cast WHERE ???
    +---------------------------------------------------------------------+
    | actor_name                 | movie_title                            |
    +============================+========================================+
    | Jennifer Connelly          | Higher Learning                        |
    +----------------------------+----------------------------------------+
    | Laurence Fishburne         | Mystic River                           |
    +----------------------------+----------------------------------------+
    | Laurence Fishburne         | Higher Learning                        |
    +----------------------------+----------------------------------------+
    | Kevin Bacon                | Mystic River                           |
    +----------------------------+----------------------------------------+
    | Keanu Reeves               | The Matrix                             |
    +----------------------------+----------------------------------------+
    | Laurence Fishburne         | The Matrix                             |
    +----------------------------+----------------------------------------+
Find Every Actor at Each Degree
    > -- First degree
    > SELECT actor_name FROM cast WHERE movie_title IN
        (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')

    > -- Second degree
    > SELECT actor_name FROM cast WHERE movie_title IN
        (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
          (SELECT actor_name FROM cast WHERE movie_title IN
            (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')))

    > -- Third degree
    > SELECT actor_name FROM cast WHERE movie_title IN
        (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
          (SELECT actor_name FROM cast WHERE movie_title IN
            (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
              (SELECT actor_name FROM cast WHERE movie_title IN
                (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')))))
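To make the cost of each extra degree concrete, here is a minimal Python sketch that generates the nested query for any degree and runs it against an in-memory SQLite copy of the cast table. The helper names `degree_query` and `actors_within` are made up for this illustration, and the table name is quoted only as a precaution because CAST is also an SQL keyword.

```python
import sqlite3

# Rows taken from the cast table on the previous slide.
CAST_ROWS = [
    ("Jennifer Connelly", "Higher Learning"),
    ("Laurence Fishburne", "Mystic River"),
    ("Laurence Fishburne", "Higher Learning"),
    ("Kevin Bacon", "Mystic River"),
    ("Keanu Reeves", "The Matrix"),
    ("Laurence Fishburne", "The Matrix"),
]


def degree_query(degree):
    """Build the nested IN query for the given degree (1, 2, 3, ...)."""
    actors = "SELECT ?"  # innermost set: just the starting actor
    for _ in range(degree):
        # Each degree adds two more levels of nesting: actors -> movies -> actors.
        movies = f'SELECT DISTINCT movie_title FROM "cast" WHERE actor_name IN ({actors})'
        actors = f'SELECT DISTINCT actor_name FROM "cast" WHERE movie_title IN ({movies})'
    return actors


def actors_within(conn, start, degree):
    """Return every actor reachable from `start` within `degree` shared movies."""
    rows = conn.execute(degree_query(degree), (start,)).fetchall()
    return {name for (name,) in rows}


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "cast" (actor_name TEXT, movie_title TEXT)')
conn.executemany('INSERT INTO "cast" VALUES (?, ?)', CAST_ROWS)

first = actors_within(conn, "Kevin Bacon", 1)
second = actors_within(conn, "Kevin Bacon", 2)
```

With this data, Keanu Reeves is absent from the first-degree set and appears in the second-degree set, so the search can stop there. The query text still grows with every degree, which is the point the slide is making.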
The Truth
    Relational databases aren't very good with relationships
RDBs Use Set Math
The Real Problem
    Finding relationships across multiple degrees of separation
    ...and across multiple data types
    ...and where you don't even know there is a relationship
The Real Solution
Computer Science Definition
    A graph is an ordered pair G = (V, E), where V is a set of vertices and E is a set of edges, which are pairs of vertices.
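The definition translates almost directly into code. The sketch below is one possible representation, not the only one: vertices are a Python set, and each edge is a `frozenset` of two vertices so that the pair is unordered. The `neighbors` helper is an illustrative name, and the two edges stand for "appeared in a film together" based on the cast table shown earlier.

```python
# V: a set of vertices.
V = {"Keanu Reeves", "Laurence Fishburne", "Kevin Bacon"}

# E: a set of edges, each an unordered pair of vertices.
E = {
    frozenset({"Keanu Reeves", "Laurence Fishburne"}),  # The Matrix
    frozenset({"Laurence Fishburne", "Kevin Bacon"}),   # Mystic River
}

# G is the ordered pair (V, E).
G = (V, E)


def neighbors(graph, vertex):
    """Return every vertex that shares an edge with `vertex`."""
    _, edges = graph
    return {other for edge in edges if vertex in edge for other in edge if other != vertex}
```

Here `neighbors(G, "Laurence Fishburne")` yields both other actors, while Keanu Reeves and Kevin Bacon are connected only through him.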
Some Graph DB Vocabulary
    Node: vertex
    Relationship: edge
    Property: meta-datum attached to a node or relationship
    Path: an ordered list of nodes and relationships
    Index: node or relationship lookup table
Relationships are First-Class Citizens
    Have a type
    Have properties
    Have a direction
    Domain semantics
    Traversable in any direction
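As a rough illustration of what "first-class" means, the following Python sketch models a relationship as its own object with a type, properties, and a direction, while still letting you follow it from either end. The class and method names are assumptions made for this example and are not the Neo4j or Neo4jPHP API; the rating value is made up.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node            # direction: the relationship points from start...
    end: Node              # ...to end
    type: str              # domain semantics live in the type, e.g. "RATED"
    properties: dict = field(default_factory=dict)

    def other(self, node):
        """Follow the relationship from either end, ignoring its direction."""
        if node is self.start:
            return self.end
        if node is self.end:
            return self.start
        raise ValueError("node is not part of this relationship")


customer = Node({"name": "Josh"})
item = Node({"item_number": "Q32-ESM"})
rated = Relationship(customer, item, "RATED", {"rating": 5})  # hypothetical rating
```

The rating belongs to neither the customer nor the item; it is data on the relationship itself, which a foreign key in a relational table cannot carry without an extra join table.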
Graph Examples
Relational Databases are Graphs!
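The mapping behind this claim is mechanical: make each record a node and each foreign key a relationship. The Python sketch below applies that rule to two tiny, made-up tables; the table names, columns, and the `to_graph` helper are all hypothetical.

```python
# Hypothetical relational rows: each order references a customer by foreign key.
customers = [{"id": 1, "name": "Josh"}]
orders = [{"id": 10, "customer_id": 1}, {"id": 11, "customer_id": 1}]


def to_graph(tables, foreign_keys):
    """Turn rows into nodes and foreign keys into relationships.

    tables:       {table_name: [row, ...]}, every row has an "id" column
    foreign_keys: {(table_name, column): referenced_table_name}
    """
    nodes = {}
    relationships = []
    for table, rows in tables.items():
        for row in rows:
            nodes[(table, row["id"])] = row  # one node per record
    for (table, column), target in foreign_keys.items():
        for row in tables[table]:
            # One relationship per foreign key value, named after the column.
            relationships.append(((table, row["id"]), column, (target, row[column])))
    return nodes, relationships


nodes, relationships = to_graph(
    {"customers": customers, "orders": orders},
    {("orders", "customer_id"): "customers"},
)
```

Three records become three nodes, and the two foreign key values become two relationships pointing from each order to its customer.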
New Solution to the Bacon Problem
    $keanu = $actorIndex->find('name', 'Keanu Reeves');
    $kevin = $actorIndex->find('name', 'Kevin Bacon');
    $path = $keanu->findPathTo($kevin);
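What a call like `findPathTo` has to do can be sketched without a database. The Python below treats actors and movies as nodes, links them when an actor is in a movie, and runs a breadth-first search. This is only an in-memory illustration under those assumptions; `build_graph` and `find_path` are made-up names, and a real graph database may use a different algorithm.

```python
from collections import deque

# Rows taken from the cast table shown earlier.
CAST_ROWS = [
    ("Jennifer Connelly", "Higher Learning"),
    ("Laurence Fishburne", "Mystic River"),
    ("Laurence Fishburne", "Higher Learning"),
    ("Kevin Bacon", "Mystic River"),
    ("Keanu Reeves", "The Matrix"),
    ("Laurence Fishburne", "The Matrix"),
]


def build_graph(rows):
    """Actors and movies are both nodes; each cast row links one of each."""
    graph = {}
    for actor, movie in rows:
        graph.setdefault(("actor", actor), set()).add(("movie", movie))
        graph.setdefault(("movie", movie), set()).add(("actor", actor))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns a shortest path as a list of nodes, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


graph = build_graph(CAST_ROWS)
path = find_path(graph, ("actor", "Keanu Reeves"), ("actor", "Kevin Bacon"))
```

The resulting path runs Keanu Reeves, The Matrix, Laurence Fishburne, Mystic River, Kevin Bacon. Unlike the nested SQL, the search code does not change as the number of degrees grows.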
Some Graph Use Cases
    Social networking
    Manufacturing
    Map directions
    Fraud detection
    Multi-tenancy
Modelling a Domain with Graphs
    Graphs are "whiteboard-friendly"
    Nouns become nodes
    Verbs become relationships
    Properties are adjectives and adverbs
Audience Participation!
Neo Technologies
    http://neo4j.org
    Embedded in Java applications
    Standalone server via REST
    Plugins: spatial, lucene, rdf
    http://github.com/jadell/Neo4jPHP
Using the REST client
    $client = new Client(new Transport());

    // Create the nodes
    $customer = new Node($client);
    $customer->setProperty('name', 'Josh')->save();

    $store = new Node($client);
    $store->setProperty('name', 'Home Despot')
          ->setProperty('location', 'Durham, NC')->save();

    $order = new Node($client);
    $order->save();

    $item = new Node($client);
    $item->setProperty('item_number', 'Q32-ESM')->save();

    // Connect the nodes with typed, directed relationships
    $order->relateTo($item, 'CONTAINS')->save();
    $customer->relateTo($order, 'BOUGHT')->save();
    $store->relateTo($order, 'SOLD')->save();

    // Index the customer so it can be looked up later
    $customerIndex = new Index($client, Index::TypeNode, 'customers');
    $customerIndex->add($customer, 'name', $customer->getProperty('name'));
    $customerIndex->add($customer, 'rating', 'A++');
Graph Mining
    Paths
    Traversals
    Ad-hoc Queries
Path Finding
    Find any connection from node A to node B
    Limit by relationship types and/or direction
    Path finding algorithms: all, simple, shortest, Dijkstra

    // Look up the two known end points
    $customer = $customerIndex->findOne('name', 'Josh');
    $item = $itemIndex->findOne('item_number', 'Q32-ESM');

    // Find one path between them, at most two relationships long
    $path = $item->findPathsTo($customer)
                 ->setMaxDepth(2)
                 ->getSinglePath();

    foreach ($path as $node) {
        echo $node->getId() . "\n";
    }
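The two restrictions on this slide, a maximum depth and a limit on relationship types, can be illustrated with a small in-memory search. This Python sketch is not how Neo4j implements path finding; it is a plain breadth-first search over the retail relationships created on the REST client slide, with node names standing in for node objects and `shortest_path` as a hypothetical helper.

```python
from collections import deque

# (start, type, end) triples matching the relationships created earlier.
RELATIONSHIPS = [
    ("order", "CONTAINS", "item"),
    ("customer", "BOUGHT", "order"),
    ("store", "SOLD", "order"),
]


def shortest_path(relationships, start, goal, max_depth, types=None):
    """Shortest path from start to goal, following relationships in either direction.

    max_depth limits the number of relationships in the path.
    types, if given, is the set of relationship types that may be followed.
    """
    adjacency = {}
    for source, rel_type, target in relationships:
        if types is not None and rel_type not in types:
            continue
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, []).append(source)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 >= max_depth:
            continue  # depth limit reached, do not extend this path
        for neighbor in adjacency.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = shortest_path(RELATIONSHIPS, "item", "customer", max_depth=2)
```

With a depth of 2 the item reaches the customer through the order. Lowering the depth to 1, or allowing only `CONTAINS` relationships, leaves no path at all.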
Traversal
    Complex/Custom path finding
    Base next decision on previous path

    $traversal = new Traversal($client);
    $traversal
        ->setOrder(Traversal::OrderDepthFirst)
        ->setUniqueness(Traversal::UniquenessNodeGlobal)
        // Prune: returning true stops the traversal along this path
        ->setPruneEvaluator('javascript', '(function traverse(pos) {
            if (pos.length() == 1 && pos.lastRelationship.getType() == "CONTAINS") {
                return false;
            } else if (pos.length() == 2 && pos.lastRelationship.getType() == "BOUGHT") {
                return false;
            }
            return true;
        })(position)')
        // Return filter: only customer nodes end up in the results
        ->setReturnFilter('javascript',
            "return position.endNode().getProperty('type') == 'Customer';");

    $customers = $traversal->getResults($item, Traversal::ReturnTypeNode);
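The split between a prune evaluator ("should we continue down this path?") and a return filter ("should we return the entity at this position?") can be sketched in plain Python. The code below is an in-memory approximation, not the Neo4j traversal framework: `traverse`, `prune`, and `include` are illustrative names, node names stand in for node objects, and this sketch never prunes at the start node.

```python
# Undirected adjacency for the retail graph: node -> [(relationship type, neighbor)].
ADJACENCY = {
    "item": [("CONTAINS", "order")],
    "order": [("CONTAINS", "item"), ("BOUGHT", "customer"), ("SOLD", "store")],
    "customer": [("BOUGHT", "order")],
    "store": [("SOLD", "order")],
}
NODE_TYPES = {"item": "Item", "order": "Order", "customer": "Customer", "store": "Store"}


def traverse(adjacency, start, prune, include):
    """Depth-first traversal in which each node is visited at most once."""
    results = []
    visited = {start}
    stack = [[(None, start)]]  # a path is a list of (relationship type, node)
    while stack:
        path = stack.pop()
        if include(path):
            results.append(path[-1][1])
        if prune(path):
            continue  # returned or not, do not go any further from here
        for rel_type, neighbor in adjacency.get(path[-1][1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                stack.append(path + [(rel_type, neighbor)])
    return results


def prune(path):
    """Keep going only along CONTAINS at depth 1 and BOUGHT at depth 2."""
    length, last_type = len(path) - 1, path[-1][0]
    if length == 0:
        return False
    if length == 1 and last_type == "CONTAINS":
        return False
    if length == 2 and last_type == "BOUGHT":
        return False
    return True


def include(path):
    """Return only nodes whose type is Customer."""
    return NODE_TYPES[path[-1][1]] == "Customer"


customers = traverse(ADJACENCY, "item", prune, include)
```

Starting from the item, the traversal reaches the order, abandons the branch to the store, and returns only the customer. The two decisions are independent: a node can be returned and still be traversed past, or skipped and still be traversed past.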
Gremlin
    Uses mathematical notation approach
    Complex traversal behaviors, including backtracking
    https://github.com/tinkerpop/gremlin/wiki

    m = [:]
    g.v(1).out('likes').in('likes').out('likes').groupCount(m)
    m.sort{a,b -> a.value <=> b.value}
Cypher
    "What to find" vs. "How to find"

    $query = 'START item=(1)
              MATCH (item)<-[:CONTAINS]-(order)<-[:BOUGHT]-(customer)
              RETURN customer';
    $cypher = new Cypher\Query($client, $query);
    $customers = $cypher->getResultSet();
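For contrast with the declarative query, here is roughly what the "how to find" version looks like when written by hand. This Python sketch walks the same pattern, `(item)<-[:CONTAINS]-(order)<-[:BOUGHT]-(customer)`, over an in-memory list of relationships; the helper names are made up and node names stand in for node objects.

```python
# (start, type, end) triples matching the relationships created on the REST client slide.
RELATIONSHIPS = [
    ("order", "CONTAINS", "item"),
    ("customer", "BOUGHT", "order"),
    ("store", "SOLD", "order"),
]


def incoming(relationships, node, rel_type):
    """Nodes that point at `node` through a relationship of the given type."""
    return [source for source, kind, target in relationships
            if target == node and kind == rel_type]


def customers_who_bought(relationships, item):
    """Step 1: orders that contain the item. Step 2: customers who bought those orders."""
    return [customer
            for order in incoming(relationships, item, "CONTAINS")
            for customer in incoming(relationships, order, "BOUGHT")]


customers = customers_who_bought(RELATIONSHIPS, "item")
```

The Cypher query states only the shape to match. The Python version has to spell out each hop and its direction, which is the difference the slide title points at.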
Cypher Syntax
    START item = (1)
    START item = (1,2,3)
    START item = (items, 'name:Q32*')
    START item = (1), customer = (2,3)

    MATCH (item)<--(order)
    MATCH (order)-->(item)
    MATCH (order)-[r]->(item)
    MATCH ()--(item)
    MATCH (supplier)-[:SUPPLIES]->(item)<-[:CONTAINS]-(order),
          (customer)-[:RATED]->(item)

    WHERE customer.name = 'Josh' and s.coupon = 'freewidget'

    RETURN item, order
    RETURN customer, item, r.rating
    RETURN r~TYPE
    RETURN COUNT(*)
    RETURN AVG(r.rating)

    ORDER BY customer.name DESC
    LIMIT 3 SKIP 2
Cypher - All Together Now
    // Find the top 10 `widget` ratings by customers who bought AND rated
    // `widgets`, and the supplier
    START item = (items, 'name:widget')
    MATCH (item)<--(order)<--(customer)-[r:RATED]->(item)<--(supplier)
    RETURN customer, r.rating, supplier
    ORDER BY r.rating DESC
    LIMIT 10
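To show what this query asks for, the Python sketch below evaluates the same idea by hand over a tiny, entirely made-up data set. Unlike the Cypher on the slide, it names the relationship types explicitly (`CONTAINS`, `BOUGHT`, `SUPPLIES`), which is an assumption of this example; `top_ratings` is a hypothetical helper.

```python
# Hypothetical data: (start, type, end, properties).
RELATIONSHIPS = [
    ("order1", "CONTAINS", "widget", {}),
    ("order2", "CONTAINS", "widget", {}),
    ("alice", "BOUGHT", "order1", {}),
    ("bob", "BOUGHT", "order2", {}),
    ("alice", "RATED", "widget", {"rating": 4}),
    ("bob", "RATED", "widget", {"rating": 5}),
    ("carol", "RATED", "widget", {"rating": 3}),  # rated but never bought
    ("acme", "SUPPLIES", "widget", {}),
]


def top_ratings(relationships, item, limit=10):
    """Ratings of `item` by customers who also bought it, highest first."""
    orders = {start for start, kind, end, _ in relationships
              if kind == "CONTAINS" and end == item}
    buyers = {start for start, kind, end, _ in relationships
              if kind == "BOUGHT" and end in orders}
    suppliers = sorted(start for start, kind, end, _ in relationships
                       if kind == "SUPPLIES" and end == item)
    rows = []
    for start, kind, end, props in relationships:
        if kind == "RATED" and end == item and start in buyers:
            for supplier in suppliers:
                rows.append((start, props["rating"], supplier))
    rows.sort(key=lambda row: row[1], reverse=True)  # ORDER BY r.rating DESC
    return rows[:limit]                              # LIMIT


rows = top_ratings(RELATIONSHIPS, "widget")
```

Carol's rating is excluded because no order links her to the widget, which mirrors how the single `MATCH` pattern requires the same customer to appear on both the bought and the rated side.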
Tools
    Neoclipse
    Webadmin
Are RDBs Useful At All?
    Aggregation
    Ordered data
    Truly tabular data
    Few or clearly defined relationships
Questions?
Resources
    http://neo4j.org
    http://docs.neo4j.org
    http://www.youtube.com/watch?v=UodTzseLh04
        Emil Eifrem (Neo Tech. CEO) webinar
        Check out around the 54 minute mark
    http://github.com/jadell/Neo4jPHP
    http://joshadell.com
    [email_address]
    @josh_adell
    Google+, Facebook, LinkedIn


Editor's Notes

  • #3: * graph db usage poll
  • #4: * Six degrees game * Relational databases can't easily answer certain types of questions
  • #5: * first pass using a relational database * cast table: actor_name, movie_title * hard to visualize the solution * In order to do this, you need to do multiple passes or joins
  • #6: * Each degree adds a join * Increases complexity * Decreases performance * Stop when the actor you're looking for is in the list
  • #7: * this problem highlights the ugly truth about RDBs * they weren't designed to handle these types of problems. * RDB relationships join data, but are not data in themselves
  • #8: * Gather everything in the set that matches these criteria, then tell me if this thing is in the set * 1 set, no problem * 2nd set no problem * 3rd set not related to 1st * 4th not related to 2nd * 5th related to 1st and 4th * etc. * Relationships are only available between overlapping sets
  • #9: * disjoint sets
  • #10: * Graphs * Not X-Y * Computer Science definition of graphs
  • #11: * graph theory
  • #12: * Nodes can have arbitrary properties * Relationships can have arbitrary properties * Paths are found using traversal algorithms * Indexes help find starting points
  • #13: * This is how graph dbs solve the problems that RDBs can't
  • #14: * Tree data-structures * Networks * Maps * vehicles on streets == packets through network
  • #15: * Make each record a node * Make every foreign key a relationship * RDB indexes are usually stored in a tree structure * Trees are graphs * Why not use RDBs? * The trouble with RDBs is how they are stored in memory and queried   * Require a translation step from memory blocks to graph structure * Relationships not first-class citizens * Many problem domains map poorly to rows/tables
  • #16: * Actors are nodes * Movies are nodes * Relationship: Actor is IN a movie * pseudo-code shortened for brevity * Compare to degree selection join queries
  • #17: * Social networking - friends of friends of friends of friends * Assembly/Manufacturing - 1 widget contains 3 gadgets each contain 2 gizmos * Map directions - starting at my house find a route to the office that goes past the pub * Multi-tenancy - root node per tenant * all queries start at root * No overlap between graphs = no accidental data spillage * Fraud: track transactions back to origination * Pretty much anything that can be drawn on a whiteboard
  • #18: * Example: retail system * Customer makes Order * Store sells Order * Order contains Items * Supplier supplied Items * Customer rates Items * Did this customer rank supplier X highly? * Which suppliers sell the highest rated items? * Does item A get rated higher when ordered with Item B? * All can be answered with RDBs as well * Not as elegant * Not as performant
  • #19: * Recreate Google+
  • #20: * billions of nodes and relationships in a single instance * cluster replication * transactions * native bindings for Ruby, Python, and any language that can run on the JVM * Licensing * Neo4jPHP - Josh's REST client, not affiliated with Neo Technologies
  • #21: * Index can be saved separately * Or it is saved on `add` * Note that indexes don't have to be on real properties or values
  • #22: * This is where the power of graph dbs comes from * Paths - find any relationship chain between A and B * Traversal - filter out paths that don't meet criteria * Queries - Here is what I want, find it however you can
  • #23: * Paths deal with two known nodes * start and end point * This is the Kevin Bacon example, but with multiple datatypes  * Path can be treated as an array of nodes or relationships * findPathsTo() returns a PathFinder which can have further restrictions placed on it
  • #24: * Written in JavaScript * plugins provide other languages: Groovy, Python * Anything that runs on JVM * Path object, check apidocs * inline edit/update/delete * explicit prune evaluator of maxDepth = 1 unless overridden * built in prune: none * built in return: all or all-but-start * Prune: should we continue down this path? Return: Should we return the entity at this position? * You can return things and still continue traversing * Pros: expressive, powerful, complex search behaviors, in-line edit/update * Cons: complex to write, complex to understand (query languages make this better)
  • #25: * Not very familiar with it * Just mentioning it's out there
  • #26: * Cypher is "what to find" * describe the "shape" of the thing you're looking for * Very white-board friendly * Pros: easy to understand, query looks like domain model * Cons: not as powerful, not fully featured (YET) * result set is an array of arrays
  • #27: * Three parts ** Where to start ** Shape to find   ** possibly qualifiers ** What to return
  • #28: * If there could be more than one relationship type, could further constrain by ratings 
  • #29: * Webadmin built into neo4j server
  • #30: * RDBs are really good at data aggregation * Set math, duh * Have to traverse the whole graph in order to do aggregation * Truly tabular means not a lot of relationships between the data types