My Cypher Query Takes Too Long
What Can I Do ?
Véronique Gendner
7 Nov. 2024
Véronique Gendner – e-tissage.net – Nov. 2024
Cypher : graph DB query language
Cypher is a declarative language: you describe what you want to get, not how to get it.
The database engine therefore has to figure out by itself how to get what you want.
Pattern: (:Movie)-[:ACTED_IN]-(:Person)
[Figure: a :Movie node m (title: The Matrix) connected by an ACTED_IN relationship (role: Neo) to a :Person node p (name: Keanu Reeves, born: 1964)]
How do we get the movies Keanu Reeves acted in?
MATCH (m:Movie)-[:ACTED_IN]-(p:Person { name : "Keanu Reeves" } )
RETURN m.title
See also: An Introduction to graph query language Cypher, by the author, Feb. 2024
Searching in a graph is hard
Searching requires traversing the graph.
Determining the best way to traverse a general graph is not straightforward:
– Where to start?
– How to move forward without missing anything while minimizing processing time (graphs have cycles, which can lead to redundant work)?
– How to trade off processing time, memory, and disk space usage?
Pete Souza, plaque on President Obama’s desk
What does Cypher query processing time depend on?
Cypher query processing time depends on many things.
Many choices need to be made, and the best choices also depend on what the graph looks like.
+ Cypher is evolving fast*!
This presentation gives guidelines on the different aspects that Cypher query processing time depends on, starting with the key principles of Cypher execution.
Agenda:
• Cypher query processing & Cypher query planner
How to read the logical plan output by EXPLAIN / PROFILE?
Operators, Estimated costs (rows, db hits, …)
• Examples & Hints
• Other levers on processing time
• Resources
* Cypher Performance Improvements in Neo4j 5.26 LTS, Christoffer Bergman, October 2024
Cypher query processing
Cypher query string: MATCH (n:Label {prop:val}) … RETURN …
-> parsing / rewriting: MATCH (n:Label) WHERE n.prop=val … RETURN …
-> query planner (also called query optimizer)
-> Logical plan (also called execution plan) = output of EXPLAIN / PROFILE
-> Physical plan: lives only during the execution, is not otherwise materialized
-> Query execution: execution of the operators with the kernel API
Image credit: binary file by Abd Majd from Noun Project (CC BY 3.0)
Query planner
Determines the most efficient way to execute queries, given the current state of the database*, i.e. based on counts of certain db items.
*This means that testing on a toy db, in the hope that isolating the problem will make things clearer, can be misleading…
Matrix reloaded, 2003. The Architect
Minimizing costs
Due to the nature of graph structures, any given Cypher query is likely to have several candidate execution plans, each solving the query in a different way.
The planner determines which plan has the lowest estimated execution cost.
What the planner takes into account:
• # of rows
• # of database hits (DB hits)*
• Avoiding certain planner operators (highly memory-costly)
• Favoring other planner operators (index use)
* The exact # of DB hits is only known after execution, but DB hits are indirectly taken into account in the cost model, which also avoids or favors some operators.
The statistics of the DB used by the query planner
The statistics of the DB used by the query planner include:
– nodes
• total #
• # by label
– relationships
• # by type
• # by type & starting with Label
• # by type & ending with Label
– index
• # of indexed values
• estimate of # of unique values
– constraints
• label, property and type
CALL db.stats.retrieve("GRAPH COUNTS")
Examples from the https://sandbox.neo4j.com Movie db
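Each kind of count listed above can also be asked for directly. The following is a minimal sketch, assuming the Movie sample graph (labels :Person and :Movie, relationship type :ACTED_IN); simple counts of this shape are typically answered from the count store rather than by scanning the graph, which PROFILE can confirm on your own database.

```cypher
// Total number of nodes
MATCH (n) RETURN count(n) AS nodes;

// Number of nodes by label
MATCH (p:Person) RETURN count(p) AS persons;

// Number of relationships by type
MATCH ()-[r:ACTED_IN]->() RETURN count(r) AS actedIn;

// Number of relationships by type, starting from a label
MATCH (:Person)-[r:ACTED_IN]->() RETURN count(r) AS actedInFromPerson;

// Number of relationships by type, ending at a label
MATCH ()-[r:ACTED_IN]->(:Movie) RETURN count(r) AS actedInToMovie;
```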
Rows (1)
https://sandbox.neo4j.com Movie db
Query planner: 4 important phases
Adapted from Understanding how Neo4j interprets and executes Cypher is key to debugging slow-running statements, Feb 2024, Adam Cowley
• Find anchor nodes = where to start (in a more complex db / query, there can be several anchor nodes)
• Expand = follow relationships in the pattern
• Filter by label, property value, …
• Aggregate
Rows are passed from one phase to the next.
Rows (2)
[Figure: rows passed from operator to operator]
DB hits
A database hit (DB hit) is an abstract unit of storage engine work, such as retrieving or updating data.
Actions that trigger one or more database hits:
– Create actions
• Create a node.
• Create a relationship.
• Create a new node label.
• Create a new relationship type.
• Create a new ID for property keys with the same name.
– Delete actions
• Delete a node.
• Delete a relationship.
– Update actions
• Set one or more labels on a node.
• Remove one or more labels from a node.
– Node-specific actions
• Get a node by its ID.
• Get the degree of a node.
• Determine whether a node is dense.
• Determine whether a label is set on a node.
• Get the labels of a node.
• Get a property of a node.
• Get an existing node label.
• Get the name of a label by its ID, or its ID by its name.
– Relationship-specific actions
• Get a relationship by its ID.
• Get a property of a relationship.
• Get an existing relationship type.
• Get a relationship type name by its ID, or its ID by its name.
– General actions
• Get the name of a property key by its ID, or its ID by the key name.
• Find a node or relationship through an index seek or index scan.
• Find a path in a variable-length expand.
• Find a shortest path.
• Ask the count store for a value.
– Schema actions
• Add an index.
• Drop an index.
• Get the reference of an index.
• Create a constraint.
• Drop a constraint.
• Call a procedure.
• Call a user-defined function.
Reference: Database hits
EXPLAIN / PROFILE
EXPLAIN displays the execution plan but does not run the query.
It always returns an empty result and makes no changes to the database.
Displayed costs (# of rows) are estimates.
PROFILE runs your query and keeps track of actual costs: how many rows pass through each operator and how much each operator needs to interact with the storage layer (DB hits).
EXPLAIN -> estimates
PROFILE -> actual counts
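To compare the two, the same query can be prefixed with either keyword. This is a minimal sketch that reuses the Keanu Reeves query from the introduction and assumes the Movie sample graph.

```cypher
// Show the plan with estimated rows, without running the query
EXPLAIN
MATCH (m:Movie)-[:ACTED_IN]-(p:Person {name: "Keanu Reeves"})
RETURN m.title;

// Run the query and show actual rows and DB hits for each operator
PROFILE
MATCH (m:Movie)-[:ACTED_IN]-(p:Person {name: "Keanu Reeves"})
RETURN m.title;
```

A large gap between the estimated rows and the actual rows of an operator is a useful place to start investigating.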
How to read the execution plan output by PROFILE (EXPLAIN)?
• Text logical plan: Cypher shell
• Image logical plan: Neo4j Desktop
Read from the bottom up: from the leaf operators (bottom) to the root operator (top).
https://workspace.neo4j.io/
Query planner operators
[Figure: example plan in which rows pass through the operators NodeIndexSeek, CacheProperties, Filter, Filter, Expand(All), Expand(All), OrderedAggregation]
Lazy / Eager Operators
Lazy operators pipe their output rows to their parent operators as soon as
they are produced. In other words, a child operator may not be fully
exhausted before the parent operator starts consuming the input rows
produced by the child.
Eager operators such as those used for aggregation (max, min…) and sorting,
need to aggregate all their rows before they can produce output and send
rows to the parent operator. For this reason, eager operators can be very
memory consuming.
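The difference can be observed with PROFILE. The queries below are a minimal sketch assuming the Movie sample graph and its born property; the exact operators shown depend on the Neo4j version and on existing indexes.

```cypher
// Lazy: rows stream through the plan, and LIMIT can stop the work early
PROFILE
MATCH (p:Person)
RETURN p.name AS name
LIMIT 5;

// Eager: sorting typically needs every row before the first one is returned
PROFILE
MATCH (p:Person)
RETURN p.name AS name, p.born AS born
ORDER BY born;

// Eager: an aggregation such as max() also consumes all of its input first
PROFILE
MATCH (p:Person)
RETURN max(p.born) AS latestBirthYear;
```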
Walk through the list of query planner operators
Cypher query planner operators: Cypher query planner Operators.xlsx
Examples & Hints
Gandalf, The Lord of the Rings
Different syntax same plan
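As an illustration, assuming the Movie sample graph, the two queries below express the same constraint in two ways. Since the rewriting step turns the property map into a WHERE predicate, EXPLAIN would be expected to show the same plan for both.

```cypher
// Property map inside the node pattern
EXPLAIN
MATCH (p:Person {name: "Keanu Reeves"})
RETURN p.born;

// Same constraint written as a WHERE clause
EXPLAIN
MATCH (p:Person)
WHERE p.name = "Keanu Reeves"
RETURN p.born;
```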
Add variable names to make EXPLAIN / PROFILE output easier to read
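A minimal sketch of the idea, assuming the Movie sample graph and its :DIRECTED relationship type; the variable names keanu, movie and director are only illustrative.

```cypher
// Anonymous pattern elements show up under generated names in the plan
EXPLAIN
MATCH (:Person {name: "Keanu Reeves"})-[:ACTED_IN]->(:Movie)<-[:DIRECTED]-(d:Person)
RETURN d.name;

// Named elements appear as keanu, movie and director in the operator details
EXPLAIN
MATCH (keanu:Person {name: "Keanu Reeves"})-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(director:Person)
RETURN director.name;
```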
Specify label(s) to avoid the AllNodesScan operator
If there is no index: as many rows as there are nodes in the db!
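The effect is easy to see with PROFILE. The following is a minimal sketch assuming the Movie sample graph.

```cypher
// No label: the planner has to start from every node in the database
PROFILE
MATCH (n {name: "Keanu Reeves"})
RETURN n;

// With a label: the scan is limited to :Person nodes,
// or replaced by an index seek if :Person(name) is indexed
PROFILE
MATCH (n:Person {name: "Keanu Reeves"})
RETURN n;
```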
Collect & Filter data as early as possible
Reduce the number of rows
• Specify labels to avoid the AllNodesScan operator
• Use Indexes
• Collect & Filter data as early as possible
• Move ORDER BY and LIMIT up, as early as possible
• …
Adapted from Tuning Cypher, Andrew Bowman at NODES 2019
Very insightful, but some parts now outdated
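Moving ORDER BY and LIMIT up can be sketched as follows, assuming the Movie sample graph and its released property. The two queries return the same rows only if each of the five most recent movies has at least one actor, so treat this as an illustration of the principle rather than a drop-in rewrite.

```cypher
// Late: every movie is expanded to its actors before sorting and limiting
PROFILE
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
RETURN m.title AS title, m.released AS released, collect(a.name) AS actors
ORDER BY released DESC
LIMIT 5;

// Early: keep the 5 most recent movies first, then expand from those 5 rows only
PROFILE
MATCH (m:Movie)
WITH m
ORDER BY m.released DESC
LIMIT 5
MATCH (m)<-[:ACTED_IN]-(a:Person)
RETURN m.title AS title, m.released AS released, collect(a.name) AS actors
ORDER BY released DESC;
```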
The cost of dense nodes
With 40k enrolments, the corresponding :Course node becomes a dense node
-> expanding from :Course along the :FOR_COURSE relationship is very costly
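One way to spot such nodes is to list the highest degrees along a relationship type. This is a sketch under assumptions: the :Course label and :FOR_COURSE type come from the example above, the pattern is left undirected because the direction is not stated here, and the COUNT { } subquery requires Neo4j 5.

```cypher
// Degree of each :Course node along :FOR_COURSE, highest first
MATCH (c:Course)
RETURN c, COUNT { (c)-[:FOR_COURSE]-() } AS enrolments
ORDER BY enrolments DESC
LIMIT 10;
```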
How to avoid dense nodes
Understanding how Neo4j interprets and executes Cypher is key to debugging slow-running statements , Feb 2024, Adam Cowley
Update a user’s progress in a lesson:
[Figure: two execution plans compared, one with anchor -> expand, the other with anchor -> expand -> expand -> … -> expand]
Properties
Although there have been improvements in the most recent versions of Neo4j, property accesses are costly. When the # of DB hits is high, you can:
• Aggregate by the node itself rather than by a unique property
• Delay property access until nodes are distinct
• If a property is used several times, project it: this is more efficient since you do not need to hit the graph each time
See also Top 10 Cypher Tuning Tips & Tricks, Michael Hunger, GraphConnect 2022
Only a few property accesses per row
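These tips can be sketched as follows, assuming the Movie sample graph. The gain is not guaranteed, since recent planners may already cache properties on their own, so compare DB hits with PROFILE.

```cypher
// Groups by a property: m.title is read for every (movie, actor) row
PROFILE
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
RETURN m.title AS title, count(a) AS castSize;

// Groups by the node itself, then reads the property once per distinct movie
PROFILE
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
WITH m, count(a) AS castSize
RETURN m.title AS title, castSize;

// Projects a property that is used several times
PROFILE
MATCH (p:Person)
WITH p, p.born AS born
WHERE born >= 1960 AND born < 1970
RETURN p.name AS name, born;
```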
Use indexes
Whenever possible, leverage node indexes
• for starting nodes (INDEX HINT)
• as a workaround for costly property reads
• …
The impact of indexes on query performance
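A minimal sketch of creating an index and hinting at it, assuming the Movie sample graph; the index name person_name is illustrative.

```cypher
// Create an index on :Person(name)
CREATE INDEX person_name IF NOT EXISTS
FOR (p:Person) ON (p.name);

// Ask the planner to start from that index
PROFILE
MATCH (p:Person {name: "Keanu Reeves"})-[:ACTED_IN]->(m:Movie)
USING INDEX p:Person(name)
RETURN m.title;
```

The planner usually picks a suitable index by itself; a hint is mostly worth trying when PROFILE shows that it started somewhere else.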
Relationships
• Specify relationship direction and :TYPE
-[*]-(:Person)
becomes:
MATCH (:Person {name:"Gil"})
<-[:mother|father*]-(:Person)
• Set an upper limit on variable-length patterns
<-[:mother|father*0..12]-(:Person)
[Figure: paths fanning out hop after hop: 1 hop, 2 hops, 3 hops, 4 hops, 5 hops, …]
Top 10 Cypher Tuning Tips & Tricks, Michael Hunger, GraphConnect 2022 slides
More to look into…
At the Cypher writing level, more leads to look into:
• Subqueries
GraphAcademy, Intermediate Cypher Queries course, Reducing Memory,6-subqueries
Cypher Manual, Subqueries
• Pattern comprehension
Cypher Manual, Pattern comprehension
• Quantified Path Pattern (QPP)
Advanced Pathfinding, Simple Queries, and Efficient Execution,
Bastien Louërat & Finbar Good, NODES 2023
How to make your queries 1000x faster with quantified path patterns (QPP),
P. Halftermeyer, Sept. 2023
Speed and Precision: Mastering Graph Traversal With QPP, Pierre Halftermeyer, NODES 2024
Cypher Manual, Quantified Path Pattern
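To give a feel for these three constructs, here is a minimal sketch. The first two queries assume the Movie sample graph; the third reuses the Gil genealogy pattern with its :mother and :father types. The subquery uses the importing WITH form (recent versions also offer a CALL (p) { … } form), and quantified path patterns require a recent Neo4j 5 release.

```cypher
// Subquery: the aggregation runs once per person, inside its own scope
MATCH (p:Person)
CALL {
  WITH p
  MATCH (p)-[:ACTED_IN]->(m:Movie)
  RETURN count(m) AS movies
}
RETURN p.name AS name, movies;

// Pattern comprehension: collect matches into a list without adding rows
MATCH (p:Person {name: "Keanu Reeves"})
RETURN [(p)-[:ACTED_IN]->(m:Movie) | m.title] AS titles;

// Quantified path pattern: 1 to 12 hops of :mother or :father relationships
MATCH (:Person {name: "Gil"}) (()<-[:mother|father]-(:Person)){1,12} (p:Person)
RETURN DISTINCT p.name AS name;
```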
Other levers on processing time
Several caches
• Parsing cache
• Logical plan cache
• Executable query cache
• Physical cache
• Properties cache
[Figure: the query processing pipeline (Cypher query string -> parsing / rewriting -> query planner -> Logical plan -> Physical plan -> Query execution), annotated with these caches]
Db config and hardware features
server.memory.heap.initial_size=2G
server.memory.heap.max_size=4G
server.memory.pagecache.size=2G
Performance - Operations Manual
R2-D2, Star Wars, 1977
Modelling
• Label / Node / Relationship / Property choice
Elevate a property to a label to reduce property access?
• Watch out for hyper-connected nodes
• Read vs write time:
Sometimes the most efficient strategy for massive imports or updates is not the same as for read queries.
=> Consider using temporary representations
Ex: using a temporary label across several existing labels, in order to add an index
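Both ideas can be sketched in Cypher. Everything named here is hypothetical: the :Recent and :Tmp labels, the id property, the tmp_id index name, and the threshold on released (a property of the Movie sample graph).

```cypher
// Elevate a property value to a label
MATCH (m:Movie)
WHERE m.released >= 2000
SET m:Recent;

// Queries can then filter on the label without reading m.released
MATCH (m:Recent)
RETURN m.title;

// Temporary label shared by several existing labels, so that one index covers them
MATCH (n)
WHERE n:Person OR n:Movie
SET n:Tmp;

CREATE INDEX tmp_id IF NOT EXISTS
FOR (n:Tmp) ON (n.id);

// Once the import or update is done, remove the temporary representation
DROP INDEX tmp_id IF EXISTS;

MATCH (n:Tmp)
REMOVE n:Tmp;
```

The label has to be kept in sync with the property when data changes, which is part of the read vs write trade-off mentioned above.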
Resources
Cypher query planner
• Slow Cypher Statements and How to Fix Them, Feb 2024, Adam Cowley
• Execution plans and query tuning; Execution plan operators
• Ninja Call about Cypher, Bastien Louërat, July 2024
Performance & query tuning
• Cypher Performance Improvements in Neo4j 5.26 LTS, Christoffer Bergman, October 2024
• Neo4j Live: Lessons learned from Real-World Graph App Development, Dave Aitel, Aug 2023
• Top 10 Cypher Tuning Tips & Tricks, Michael Hunger, GraphConnect 2022 slides
• Tuning Cypher, Andrew Bowman at NODES 2019 (very insightful, but some parts now outdated)
• GraphAcademy Intermediate Cypher Queries course
Communities
• Neo4j discord https://discord.gg/UnsvEs8u
• Neo4j community forum https://community.neo4j.com
• GraphGeeks https://www.graphgeeks.org/
Slides and replay of this presentation: https://www.e-tissage.net/cypher-query-takes-too-long
With many thanks to everyone who spent time explaining things, to help me connect the dots!
Examples from the https://sandbox.neo4j.com Movie db
Ad

More Related Content

Similar to My cypher query takes too long, what can I do.with links.pdf (20)

Demystify LDAP and OIDC Providing Security to Your App on Kubernetes
Demystify LDAP and OIDC Providing Security to Your App on KubernetesDemystify LDAP and OIDC Providing Security to Your App on Kubernetes
Demystify LDAP and OIDC Providing Security to Your App on Kubernetes
VMware Tanzu
 
How Shutl Delivers Even Faster Using Neo4J
How Shutl Delivers Even Faster Using Neo4JHow Shutl Delivers Even Faster Using Neo4J
How Shutl Delivers Even Faster Using Neo4J
C4Media
 
My cypher query takes too long, what can I do.with comments.pdf
My cypher query takes too long, what can I do.with comments.pdfMy cypher query takes too long, what can I do.with comments.pdf
My cypher query takes too long, what can I do.with comments.pdf
Véronique Gendner
 
Verndale - Sitecore User Group Los Angeles Presentation
Verndale - Sitecore User Group Los Angeles PresentationVerndale - Sitecore User Group Los Angeles Presentation
Verndale - Sitecore User Group Los Angeles Presentation
David Brown
 
Velociraptor - SANS Summit 2019
Velociraptor - SANS Summit 2019Velociraptor - SANS Summit 2019
Velociraptor - SANS Summit 2019
Velocidex Enterprises
 
StackEngine Problem Space Demo
StackEngine Problem Space DemoStackEngine Problem Space Demo
StackEngine Problem Space Demo
Boyd Hemphill
 
Rspec and Capybara Intro Tutorial at RailsConf 2013
Rspec and Capybara Intro Tutorial at RailsConf 2013Rspec and Capybara Intro Tutorial at RailsConf 2013
Rspec and Capybara Intro Tutorial at RailsConf 2013
Brian Sam-Bodden
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Lucidworks
 
Knowledge-Based Analysis and Design (KBAD): An Approach to Rapid Systems Engi...
Knowledge-Based Analysis and Design (KBAD): An Approach to Rapid Systems Engi...Knowledge-Based Analysis and Design (KBAD): An Approach to Rapid Systems Engi...
Knowledge-Based Analysis and Design (KBAD): An Approach to Rapid Systems Engi...
Elizabeth Steiner
 
GraphConnect EU 2017 - Performance Improvements in Neo4j 3.2
GraphConnect EU 2017 - Performance Improvements in Neo4j 3.2GraphConnect EU 2017 - Performance Improvements in Neo4j 3.2
GraphConnect EU 2017 - Performance Improvements in Neo4j 3.2
Craig Taverner
 
Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep Features
Turi, Inc.
 
Dense Retrieval with Apache Solr Neural Search.pdf
Dense Retrieval with Apache Solr Neural Search.pdfDense Retrieval with Apache Solr Neural Search.pdf
Dense Retrieval with Apache Solr Neural Search.pdf
Sease
 
GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?
Neo4j
 
An Algorithm for Keyword Search on an Execution Path
An Algorithm for Keyword Search on an Execution PathAn Algorithm for Keyword Search on an Execution Path
An Algorithm for Keyword Search on an Execution Path
Kamiya Toshihiro
 
Neo4j Graph DB & LLM.graphs & genAI introduction & cheatsheet.with links.pdf
Neo4j Graph DB & LLM.graphs & genAI introduction & cheatsheet.with links.pdfNeo4j Graph DB & LLM.graphs & genAI introduction & cheatsheet.with links.pdf
Neo4j Graph DB & LLM.graphs & genAI introduction & cheatsheet.with links.pdf
Véronique Gendner
 
Drupalcon cph
Drupalcon cphDrupalcon cph
Drupalcon cph
cyberswat
 
Provenance witha purpose
Provenance witha purposeProvenance witha purpose
Provenance witha purpose
Khalid Belhajjame
 
Untangling - fall2017 - week 9
Untangling - fall2017 - week 9Untangling - fall2017 - week 9
Untangling - fall2017 - week 9
Derek Jacoby
 
Lessons Learnt From Working With Rails
Lessons Learnt From Working With RailsLessons Learnt From Working With Rails
Lessons Learnt From Working With Rails
martinbtt
 
Machine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy CrossMachine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy Cross
Andrew Flatters
 
Demystify LDAP and OIDC Providing Security to Your App on Kubernetes
Demystify LDAP and OIDC Providing Security to Your App on KubernetesDemystify LDAP and OIDC Providing Security to Your App on Kubernetes
Demystify LDAP and OIDC Providing Security to Your App on Kubernetes
VMware Tanzu
 
How Shutl Delivers Even Faster Using Neo4J
How Shutl Delivers Even Faster Using Neo4JHow Shutl Delivers Even Faster Using Neo4J
How Shutl Delivers Even Faster Using Neo4J
C4Media
 
My cypher query takes too long, what can I do.with comments.pdf
My cypher query takes too long, what can I do.with comments.pdfMy cypher query takes too long, what can I do.with comments.pdf
My cypher query takes too long, what can I do.with comments.pdf
Véronique Gendner
 
Verndale - Sitecore User Group Los Angeles Presentation
Verndale - Sitecore User Group Los Angeles PresentationVerndale - Sitecore User Group Los Angeles Presentation
Verndale - Sitecore User Group Los Angeles Presentation
David Brown
 
StackEngine Problem Space Demo
StackEngine Problem Space DemoStackEngine Problem Space Demo
StackEngine Problem Space Demo
Boyd Hemphill
 
Rspec and Capybara Intro Tutorial at RailsConf 2013
Rspec and Capybara Intro Tutorial at RailsConf 2013Rspec and Capybara Intro Tutorial at RailsConf 2013
Rspec and Capybara Intro Tutorial at RailsConf 2013
Brian Sam-Bodden
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Lucidworks
 
Knowledge-Based Analysis and Design (KBAD): An Approach to Rapid Systems Engi...
Knowledge-Based Analysis and Design (KBAD): An Approach to Rapid Systems Engi...Knowledge-Based Analysis and Design (KBAD): An Approach to Rapid Systems Engi...
Knowledge-Based Analysis and Design (KBAD): An Approach to Rapid Systems Engi...
Elizabeth Steiner
 
GraphConnect EU 2017 - Performance Improvements in Neo4j 3.2
GraphConnect EU 2017 - Performance Improvements in Neo4j 3.2GraphConnect EU 2017 - Performance Improvements in Neo4j 3.2
GraphConnect EU 2017 - Performance Improvements in Neo4j 3.2
Craig Taverner
 
Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep Features
Turi, Inc.
 
Dense Retrieval with Apache Solr Neural Search.pdf
Dense Retrieval with Apache Solr Neural Search.pdfDense Retrieval with Apache Solr Neural Search.pdf
Dense Retrieval with Apache Solr Neural Search.pdf
Sease
 
GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?
Neo4j
 
An Algorithm for Keyword Search on an Execution Path
An Algorithm for Keyword Search on an Execution PathAn Algorithm for Keyword Search on an Execution Path
An Algorithm for Keyword Search on an Execution Path
Kamiya Toshihiro
 
Neo4j Graph DB & LLM.graphs & genAI introduction & cheatsheet.with links.pdf
Neo4j Graph DB & LLM.graphs & genAI introduction & cheatsheet.with links.pdfNeo4j Graph DB & LLM.graphs & genAI introduction & cheatsheet.with links.pdf
Neo4j Graph DB & LLM.graphs & genAI introduction & cheatsheet.with links.pdf
Véronique Gendner
 
Drupalcon cph
Drupalcon cphDrupalcon cph
Drupalcon cph
cyberswat
 
Untangling - fall2017 - week 9
Untangling - fall2017 - week 9Untangling - fall2017 - week 9
Untangling - fall2017 - week 9
Derek Jacoby
 
Lessons Learnt From Working With Rails
Lessons Learnt From Working With RailsLessons Learnt From Working With Rails
Lessons Learnt From Working With Rails
martinbtt
 
Machine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy CrossMachine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy Cross
Andrew Flatters
 

Recently uploaded (20)

DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Ad

My cypher query takes too long, what can I do.with links.pdf

  • 1. My Cypher Query Takes Too Long What Can I Do ? Véronique Gendner 7 Nov. 2024
  • 2. Véronique Gendner – e-tissage.net – Nov. 2024 2 Cypher : graph DB query language Cypher is a declarative language: you describe what you want to get, not how to get it. Therefore, the database engine has to figure by itself, how to get what you want. ( :Movie)-[:ACTED_IN]-( :Person) :Movie :Person title : The Matrix role : Neo ACTED_IN m p name : Keanu Reeves born : 1964 MATCH (m:Movie)-[:ACTED_IN]-(p:Person { name : "Keanu Reeves" } ) RETURN m.title How to get the movie Keanu Reeves acted in ? An Introduction to graph query language Cypher by author, Feb. 2024
  • 3. Searching in a graph is hard
      Searching requires traversing the graph. Determining the best way to traverse a general graph is not straightforward:
      – Where to start?
      – How to move forward so as not to miss anything while minimizing processing time (graphs have cycles that can lead to redundancy)?
      – How to trade off processing time, memory and disk space usage?
      Image: plaque on President Obama's desk, photo by Pete Souza
  • 4. What does Cypher query processing time depend on?
      Cypher query processing time depends on many things. Many choices need to be made, and the best choices also depend on what the graph looks like. In addition, Cypher is evolving fast*.
      This presentation offers guidelines on the different aspects that Cypher query processing time depends on, starting with the key principles of Cypher execution.
      Agenda:
      • Cypher query processing & Cypher query planner: how to read the logical plan output by EXPLAIN / PROFILE; operators, estimated costs (rows, db hits, …)
      • Examples & Hints
      • Other levers on processing time
      • Resources
      * Cypher Performance Improvements in Neo4j 5.26 LTS, Christoffer Bergman, October 2024
  • 5. Cypher query processing
      Cypher query string: MATCH (n:Label {prop:val}) … RETURN …
      -> parsing / rewriting: MATCH (n:Label) WHERE n.prop=val … RETURN …
      -> query planner (also called query optimizer)
      -> Logical plan (also called execution plan) = output of EXPLAIN / PROFILE
      -> Physical plan: lives only during the execution, is not otherwise materialized
      -> Query execution: execution of the operators with the kernel API
      Icon: binary file by Abd Majd from Noun Project (CC BY 3.0)
  • 6. Query planner
      Determines the most efficient way to execute queries, given the current state of the database*, i.e. based on counts of certain db items.
      * This means that testing on a toy db, in the hope that isolating the problem will make things clearer, can be misleading.
      Image: The Architect, The Matrix Reloaded, 2003
  • 7. Minimizing costs
      Due to the nature of graph structures, for any given Cypher query there are likely to be several candidate execution plans, each solving the query in a different way. The planner determines which plan has the lowest estimated execution cost.
      What the planner takes into account:
      • # of rows
      • # of database hits (DB hits)*
      • Avoiding certain planner operators (highly memory-costly)
      • Favoring other planner operators (index use)
      * The exact # of DB hits is only known after execution, but DB hits are indirectly taken into account in the cost model, which also avoids or favors some operators.
  • 8. The statistics of the DB used by the query planner
      The statistics of the DB used by the query planner include:
      – nodes: total #, # by label
      – relationships: # by type, # by type & starting with Label, # by type & ending with Label
      – index: # of indexed values, estimate of # of unique values
      – constraints: Label, property and type
      CALL db.stats.retrieve("GRAPH COUNTS")
      Examples from the https://sandbox.neo4j.com Movie db
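      The counts listed on this slide are the same kind of figures that simple counting queries return. A minimal sketch, assuming the sandbox Movie db with its :Person label and :ACTED_IN relationship type; queries of this shape are typically answered from the count store rather than by scanning the graph, which PROFILE should confirm on your own version:

      ```cypher
      // Total number of nodes
      MATCH (n) RETURN count(n) AS nodes;

      // Number of nodes by label
      MATCH (p:Person) RETURN count(p) AS persons;

      // Number of relationships by type
      MATCH ()-[r:ACTED_IN]->() RETURN count(r) AS actedIn;

      // Number of relationships by type & starting label
      MATCH (:Person)-[r:ACTED_IN]->() RETURN count(r) AS actedInFromPerson;
      ```

      Comparing these numbers between a toy db and the production db gives a quick idea of why the planner may choose different plans on each.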
  • 9. Rows (1)
      Example on the https://sandbox.neo4j.com Movie db
  • 10. Query planner: 4 important phases
      • Find anchor nodes = where to start. In a more complex db / query there can be several anchor nodes.
      • Filter by label, property value, …
      • Aggregate
      • Expand = follow relations in pattern
      Adapted from Understanding how Neo4j interprets and executes Cypher is key to debugging slow-running statements, Feb 2024, Adam Cowley
  • 11. Rows (2)
      [Figure: rows passed from operator to operator]
  • 12. DB hits
      A database hit (DB hit) is an abstract unit of storage engine work, such as retrieving or updating data.
      Actions that trigger one or more database hits:
      – Create actions: create a node; create a relationship; create a new node label; create a new relationship type; create a new ID for property keys with the same name.
      – Delete actions: delete a node; delete a relationship.
      – Update actions: set one or more labels on a node; remove one or more labels from a node.
      – Node-specific actions: get a node by its ID; get the degree of a node; determine whether a node is dense; determine whether a label is set on a node; get the labels of a node; get a property of a node; get an existing node label; get the name of a label by its ID, or its ID by its name.
      – Relationship-specific actions: get a relationship by its ID; get a property of a relationship; get an existing relationship type; get a relationship type name by its ID, or its ID by its name.
      – General actions: get the name of a property key by its ID, or its ID by the key name; find a node or relationship through an index seek or index scan; find a path in a variable-length expand; find a shortest path; ask the count store for a value.
      – Schema actions: add an index; drop an index; get the reference of an index; create a constraint; drop a constraint.
      – Call a procedure; call a user-defined function.
      Source: Cypher Manual, Database hits
  • 13. EXPLAIN / PROFILE
      EXPLAIN displays the execution plan but does not run the query. It always returns an empty result and makes no changes to the database. The displayed costs (# of rows, # of db hits) are estimates.
      PROFILE runs your query and keeps track of actual costs: how many rows pass through each operator and how much each operator needs to interact with the storage layer (db hits).
      In short: EXPLAIN gives estimates, PROFILE gives actual counts.
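      To make the difference concrete, the same query can be prefixed with either keyword. A minimal sketch, assuming the sandbox Movie db (the :Person, :Movie and :ACTED_IN names come from that dataset):

      ```cypher
      // EXPLAIN: shows the plan with estimated rows; the query is not executed
      EXPLAIN
      MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: "Keanu Reeves"})
      RETURN m.title;

      // PROFILE: executes the query and reports actual rows and db hits per operator
      PROFILE
      MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: "Keanu Reeves"})
      RETURN m.title;
      ```

      Since PROFILE really runs the statement, it is safer to start with EXPLAIN on write queries or on queries suspected of running for a very long time.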
  • 14. How to read the execution plan output by PROFILE (EXPLAIN)?
      Read from the bottom up: the leaf operator is at the bottom, the root operator at the top.
      The logical plan is displayed as text in Cypher shell and as an image in Neo4j Desktop and https://workspace.neo4j.io/
  • 15. Query planner operators
      Operators shown in the example plan: NodeIndexSeek, CacheProperties, Filter, Filter, Expand(All), Expand(All), OrderedAggregation
  • 16. Lazy / Eager operators
      Lazy operators pipe their output rows to their parent operators as soon as they are produced. In other words, a child operator may not be fully exhausted before the parent operator starts consuming the rows it produces.
      Eager operators, such as those used for aggregation (max, min, …) and sorting, need to collect all their input rows before they can produce output and send rows to the parent operator. For this reason, eager operators can consume a lot of memory.
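      The two behaviours can be observed by profiling a streaming query and an aggregating one. A minimal sketch, assuming the sandbox Movie db; the exact operator names shown in the plan depend on the Neo4j version, so treat the comments as what to look for rather than guaranteed output:

      ```cypher
      // Lazy pipeline: rows stream up to LIMIT, which can stop the scan early
      PROFILE
      MATCH (p:Person)
      RETURN p.name
      LIMIT 10;

      // Eager steps: the aggregation and the sort must consume all input rows
      // before the first result row can be produced
      PROFILE
      MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
      RETURN p.name AS name, count(m) AS movies
      ORDER BY movies DESC
      LIMIT 10;
      ```

      In the second plan, the row counts below the aggregation show how much data had to be held before anything could be returned.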
  • 17. Cypher query planner operators
      Walk through the list of query planner operators (file: Cypher query planner Operators.xlsx)
  • 18. Examples & Hints
      Image: Gandalf, The Lord of the Rings
  • 19. Different syntax, same plan
  • 20. Add variable names to make EXPLAIN / PROFILE output easier to read
  • 21. Specify label(s) to avoid the AllNodesScan operator
      If there is no index, AllNodesScan produces as many rows as there are nodes in the db!
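      The effect of a missing label is easy to see by profiling both variants of the same lookup. A minimal sketch, assuming the sandbox Movie db; the operators named in the comments are the ones usually chosen in these situations, not guaranteed output:

      ```cypher
      // No label: the planner has nothing to narrow the search with,
      // so the plan usually starts with AllNodesScan (one row per node in the db)
      PROFILE
      MATCH (n {name: "Keanu Reeves"})
      RETURN n;

      // With a label: the plan can start with NodeByLabelScan,
      // or with an index seek if an index exists on :Person(name)
      PROFILE
      MATCH (p:Person {name: "Keanu Reeves"})
      RETURN p;
      ```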
  • 22. Collect & filter data as early as possible
  • 23. Reduce the number of rows
      • Specify Labels to avoid the AllNodesScan operator
      • Use indexes
      • Collect & filter data as early as possible
      • Move ORDER BY and LIMIT up, as early as possible
      • …
      Adapted from Tuning Cypher, Andrew Bowman at NODES 2019 (very insightful, but some parts now outdated)
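      Moving ORDER BY and LIMIT up can be sketched as follows, assuming the sandbox Movie db and its released property on :Movie nodes. Both queries return the cast of the 5 most recent movies, but the second one expands relationships for 5 movies only:

      ```cypher
      // Late LIMIT: every movie is expanded and aggregated, then 5 rows are kept
      MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
      WITH m, collect(p.name) AS cast
      RETURN m.title, cast
      ORDER BY m.released DESC
      LIMIT 5;

      // Early LIMIT: keep 5 movies first, then expand only from those
      MATCH (m:Movie)
      WITH m ORDER BY m.released DESC LIMIT 5
      MATCH (m)<-[:ACTED_IN]-(p:Person)
      RETURN m.title, collect(p.name) AS cast;
      ```

      The two forms differ for movies without any actor (the first drops them before the LIMIT is applied), so check that the rewrite still matches what you want.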
  • 24. The cost of dense nodes
      With 40k enrolments, the corresponding :Course node becomes a dense node -> expanding from :Course along the :FOR_COURSE relation is very costly.
  • 25. How to avoid dense nodes
      Example: update a user's progress in a lesson.
      [Figure: plans compared as sequences of anchor and expand steps]
      Understanding how Neo4j interprets and executes Cypher is key to debugging slow-running statements, Feb 2024, Adam Cowley
  • 26. Properties
      Although there have been improvements in the most recent versions of Neo4j, property accesses are costly.
      When the # of db hits is high, you can:
      • Aggregate by the node itself rather than by a unique property
      • Delay property access until nodes are distinct
      • If a property is used several times, project it: this is more efficient since you do not need to hit the graph each time
      Goal: only a few property accesses across one row.
      See also Top 10 Cypher Tuning Tips & Tricks, Michael Hunger, GraphConnect 2022
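      Aggregating by the node and delaying the property read can be sketched as follows, assuming the sandbox Movie db. The two queries give the same result only if name is unique per :Person, which is an assumption of this example:

      ```cypher
      // Property read on every (person, movie) row, before the aggregation
      PROFILE
      MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
      RETURN p.name AS name, count(m) AS movies;

      // Aggregate by the node itself, then read the property once per distinct person
      PROFILE
      MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
      WITH p, count(m) AS movies
      RETURN p.name AS name, movies;
      ```

      Comparing the db hits of the two profiles shows whether the rewrite pays off on a given version, since recent planners may already cache or defer some property reads.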
  • 27. Use indexes
      Whenever possible, leverage node indexes:
      • for starting nodes (INDEX HINT)
      • as a workaround for costly property reads
      • …
      See: The impact of indexes on query performances
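      Creating an index and hinting at it for the starting node can be sketched as follows. The index name person_name is a hypothetical name chosen for this example, and the labels come from the sandbox Movie db:

      ```cypher
      // Create a range index on :Person(name) if it does not exist yet
      CREATE INDEX person_name IF NOT EXISTS
      FOR (p:Person) ON (p.name);

      // Ask the planner to start from that index
      PROFILE
      MATCH (p:Person {name: "Keanu Reeves"})-[:ACTED_IN]->(m:Movie)
      USING INDEX p:Person(name)
      RETURN m.title;
      ```

      The planner usually picks a suitable index on its own, so the hint is mostly useful when PROFILE shows that it started from a less selective anchor.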
  • 28. Relationships
      -[*]-(:Person)
      MATCH (:Person {name:"Gil"})<-[:mother|father*]-(:Person)
      • Specify relation direction and :TYPE
      • Set an upper limit on variable length patterns: <-[:mother|father*0..12]-(:Person)
      [Figure: expansion at 1 hop, 2 hops, 3 hops, 4 hops, 5 hops, …]
      Top 10 Cypher Tuning Tips & Tricks, Michael Hunger, GraphConnect 2022 slides
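      The three variants on this slide can be compared side by side. A minimal sketch, assuming a genealogy graph with :Person nodes and mother / father relationship types as in the slide's example; the variable name ancestor is introduced for illustration, and the bound starts at 1 here (instead of the slide's 0) so that the start node itself is not returned:

      ```cypher
      // Unconstrained: any type, any direction, any depth (avoid on real data)
      // MATCH (:Person {name: "Gil"})-[*]-(other:Person) RETURN other;

      // Direction and types specified, but still unbounded depth
      EXPLAIN
      MATCH (:Person {name: "Gil"})<-[:mother|father*]-(ancestor:Person)
      RETURN DISTINCT ancestor.name;

      // Direction, types and an upper bound on the number of hops
      PROFILE
      MATCH (:Person {name: "Gil"})<-[:mother|father*1..12]-(ancestor:Person)
      RETURN DISTINCT ancestor.name;
      ```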
  • 29. More to look into…
      At the Cypher writing level, more leads you should look into:
      • Subqueries
        – GraphAcademy, Intermediate Cypher Queries course, Reducing Memory, 6-subqueries
        – Cypher Manual, Subqueries
      • Pattern comprehension
        – Cypher Manual, Pattern comprehension
      • Quantified Path Pattern (QPP)
        – Advanced Pathfinding, Simple Queries, and Efficient Execution, Bastien Louërat & Finbar Good, NODES 2023
        – How to make your queries 1000x faster with quantified path patterns (QPP), Pierre Halftermeyer, Sept. 2023
        – Speed and Precision: Mastering Graph Traversal With QPP, Pierre Halftermeyer, NODES 2024
        – Cypher Manual, Quantified Path Pattern
  • 30. Other levers on processing time
  • 31. Several caches
      Caches along the query processing pipeline: Parsing cache, Logical plan cache, Executable query cache, Physical cache, Properties cache
      Pipeline: Cypher query string (MATCH (n:Label {prop:val}) … RETURN …) -> parsing / rewriting (MATCH (n:Label) WHERE n.prop=val … RETURN …) -> query planner -> Logical plan -> Physical plan -> Query execution
      Icon: binary file by Abd Majd from Noun Project (CC BY 3.0)
  • 32. Db config and hardware features
      server.memory.heap.initial_size=2G
      server.memory.heap.max_size=4G
      server.memory.pagecache.size=2G
      See: Performance - Operations Manual
      Image: R2-D2, Star Wars, 1977
  • 33. Modelling
      • Label / Node / Relationship / property choice? E.g. elevating a property to a Label to reduce property access
      • Watch out for hyper-connected nodes
      • Read vs write time: sometimes the most efficient strategy for massive imports or updates is not the same as for read queries.
        => Consider using temporary representations. Ex: using a temporary Label across several existing labels, to add an index
  • 34. Resources
      Cypher query planner
      • Slow Cypher Statements and How to Fix Them, Feb 2024, Adam Cowley
      • Execution plans and query tuning; Execution plan operators
      • Ninja Call about Cypher, Bastien Louërat, July 2024
      Performances & Query tuning
      • Cypher Performance Improvements in Neo4j 5.26 LTS, Christoffer Bergman, October 2024
      • Neo4j Live: Lessons learned from Real-World Graph App Development, Dave Aitel, Aug 2023
      • Top 10 Cypher Tuning Tips & Tricks, Michael Hunger, GraphConnect 2022 slides
      • Tuning Cypher, Andrew Bowman at NODES 2019 (very insightful, but some parts now outdated)
      • GraphAcademy Intermediate Cypher Queries course
      Communities
      • Neo4j discord https://discord.gg/UnsvEs8u
      • Neo4j community forum https://community.neo4j.com
      • GraphGeeks https://www.graphgeeks.org/
      Legend: Courses & Tutorials, Videos, Article, Technical documentation
      Slides and replay of this presentation: https://www.e-tissage.net/cypher-query-takes-too-long
      With many thanks to everyone who spent time explaining things, to help me connect the dots!
      Examples from the https://sandbox.neo4j.com Movie db