White Paper:

Multi-Model Identifies Fraud

At Scale

By Arthur Keen (Senior Solution Architect, ArangoDB)

May 2020
Table of Contents
The Significance of Fraud and Graphs
Why Multi-Model for Fraud Detection?
Converting from Relational Source to Multi-model Graph
Fraud Questions
Detect Fraud Rings From a Suspicious Account
Detect All Fraud Rings
Find Orphan Accounts
Find Most Influential Customers and Accounts
What are the top 3 most influential accounts?
Finding Money Laundering Patterns
Detecting Fraud At Scale
Conclusion
Hands-on with Fraud Detection & Anti Money Laundering
Appendix A: Queries

The Significance of Fraud and Graphs
Fraud is an enormous and ever-growing problem that affects all industries and
government services. Global fraud results in over $3.7 trillion in losses
annually, and businesses lose on average 5% of their income to fraud every year.
In 2018, businesses incurred $3.13 in remediation costs for each dollar of
fraud [1], covering chargebacks, fees, interest, and labor.

Traditional fraud detection views data through a straw, focusing on discrete data
points including specific accounts, individuals, devices or IP addresses. However,
today’s sophisticated fraudsters escape detection by forming fraud rings
composed of stolen and synthetic identities and circuitous back channels.

To uncover fraud rings, it is essential to look beyond individual data points in
individual data sources to a broader view of the connection patterns that exist
across multiple data modalities¹. Multiple disparate data sources store
individual activities and relationships, and these need to be analysed in
concert to detect complex fraudulent behavior.

ArangoDB’s native multi-model approach is ideal for tackling this challenge,
because it supports graphs, documents, key-value stores, and relational models.
This provides streamlined, flexible, and agile harmonization of the relevant
multi-modal² user activity data, provides the performance and scale to detect
complex fraud patterns, and serves results in the different data models needed
by stakeholders.

¹ Multimodal data: our experience of the world is multimodal. We see objects,
hear sounds, feel textures, smell odors, and taste flavors. Modality refers to
the way in which something happens or is experienced, and a research problem is
characterized as multimodal when it includes multiple such modalities.
² This juxtaposition of multi-model and multi-modal is deliberate; they are
orthogonal terms.

Figure 1: Identify fraud patterns in the network of transactions and relationships.

Why Multi-Model for Fraud Detection?

ArangoDB’s multi-model graph allows you to easily fuse together disparate data
and identify complex fraudulent patterns of connections, such as fraud rings,
using the ArangoDB Query Language (AQL).

The identification of fraud ring patterns requires very deep (multi-hop)
traversals across the graph. The query for detecting a fraud ring can be written
in six lines of AQL code that is easy to write and maintain, and ArangoDB can
execute these queries with sub-second response times.

With multi-model, you do not have to convert the entire dataset to a graph to do
this. Use graphs where they are needed for analytics. Multi-model graphs allow
you to combine documents, joins, and graphs to solve this problem.

Converting from Relational Source to Multi-model Graph

The source of data for fraud detection would likely be a relational database,
for example, the schema depicted in Figure 3, which describes the foreign
key relationships among the Bank, Branch, Customer, Account, and
Transaction Tables.

Figure 3: Relational Source Schema

How do we convert this to a graph in ArangoDB? Because ArangoDB is a
multi-model database, the tables can be ingested as-is, directly into
ArangoDB as collections, so the Bank table becomes the Bank collection and
so on. Then you can choose whether to convert all or part of it to a graph
model based on the requirements for fraud analytics.

Since we need to do deep link/traversal analytics on the account transactions
and the customers, it makes sense to add graph edges in this area of the
graph. In this transformation it makes sense to use the Transaction
collection as an edge and to materialize the CustomerAccount foreign key as
the AccountHolder edge.

Figure 4: Multi-Model Schema: Documents, Joins, Graph

We use the convention of converting foreign key relationships to edges that
are directed from the dependent to the independent entity. Resolution
entities (also known as join tables) in the relational model can be used as
edges in a graph model, as we have done with Transaction.
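
As a rough illustration of this convention, the sketch below turns relational rows into edge documents. It is plain Python over in-memory dictionaries and does not call any ArangoDB API. The sample rows and the column names `from_account` and `to_account` are assumptions made up for illustration; `_from` and `_to` are the attribute names ArangoDB uses on edge documents, and `customer_id` follows the attribute used in the appendix queries.

```python
# Hypothetical relational rows; sample values and some column names are assumptions.
accounts = [
    {"id": 1, "customer_id": 10, "account_type": "checking"},
    {"id": 2, "customer_id": 11, "account_type": "savings"},
]
transactions = [
    {"id": 100, "from_account": 1, "to_account": 2, "amount": 250.0},
]


def account_holder_edges(account_rows):
    """Materialize the account -> customer foreign key as edges.

    Each edge points from the dependent entity (account) to the
    independent entity (customer), following the convention in the text.
    """
    return [
        {"_from": f"account/{row['id']}", "_to": f"customer/{row['customer_id']}"}
        for row in account_rows
    ]


def transaction_edges(transaction_rows):
    """Use each row of the join table directly as an edge between two accounts."""
    return [
        {
            "_key": str(row["id"]),
            "_from": f"account/{row['from_account']}",
            "_to": f"account/{row['to_account']}",
            "amount": row["amount"],
        }
        for row in transaction_rows
    ]


holder_edges = account_holder_edges(accounts)
tx_edges = transaction_edges(transactions)
```

The remaining tables (Bank, Branch) would stay as ordinary document collections in this scheme, since only the account and customer area needs traversals.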

Fraud Questions
We will describe how to use ArangoDB to answer various questions:

● Are there potential fraud rings connected to a suspicious account?

● Are there any potential fraud rings in my data?
● Are there orphan accounts (those not transacting)?
● Who are the most influential customers/accounts in transactions?
● Are there any money laundering patterns?

The following section shows how these questions can be answered in
ArangoDB on synthetically generated transaction data. The queries are
examples for detecting the patterns on this synthetic data set, meant to
inspire practitioners to develop real-world fraud detection capabilities on
ArangoDB with real data.

Detect Fraud Rings From a Suspicious Account

Fraud rings consist of very long loops of transactions and relationships
among individuals that are used by fraudsters to evade detection. These
long loops are also used in sophisticated cyber crime, where the perpetrators
create long paths of logins across multiple systems to avoid detection. The
reason these long paths are difficult to detect and understand is that they
require deep multi-hop traversals into the graph of transactions and
relationships among the individuals collaborating in the fraud.

In conventional systems, these multi-hop queries require a high number of
joins, which can take a substantial amount of time and consume a large
amount of computing resources. ArangoDB’s graph model supports high
performance multi-hop queries, where for example 10-hop queries on large
datasets can take less than 10 milliseconds depending on the topology of the
graph. For this example, the query finds long loops of transactions starting
from a suspicious account and looping back to the suspicious account over 5
to 10 transaction hops.

Figure 5 depicts the fraud ring detection query written in the ArangoDB
Query Language (AQL) being developed and executed in the ArangoDB
administrative panel. Note that this sophisticated query is expressed in 6
lines of AQL code and that the compact representation is easy to
understand and maintain. The query results are displayed as a
circuit in the graph visualization and are also available as JSON, so they can
be processed by applications calling this query. Note also that the query is
parameterized by the suspicious account and the number of loops to return.

Figure 5: Finding Fraud Ring(s) from a Suspicious Account
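
To make the traversal logic concrete, here is a minimal Python sketch of the same idea: follow outgoing transactions from one account and keep the paths that return to it within 5 to 10 hops. It works on an in-memory edge list rather than on ArangoDB, and the function name `find_rings` and the sample accounts are hypothetical. One simplifying assumption is labeled in the code: intermediate accounts are kept distinct, which is stricter than the edge-based uniqueness a graph traversal may apply.

```python
from collections import defaultdict


def find_rings(edges, start, min_hops=5, max_hops=10, limit=None):
    """Return paths of min_hops..max_hops edges that leave `start` and return to it.

    `edges` is an iterable of (from_account, to_account) pairs. A path is not
    extended once it arrives back at `start`, similar in spirit to PRUNE.
    """
    outgoing = defaultdict(list)
    for src, dst in edges:
        outgoing[src].append(dst)

    rings = []

    def walk(path):
        if limit is not None and len(rings) >= limit:
            return
        hops = len(path) - 1
        if hops >= max_hops:
            return
        for nxt in outgoing[path[-1]]:
            if nxt == start:
                if hops + 1 >= min_hops:
                    rings.append(path + [nxt])
                    if limit is not None and len(rings) >= limit:
                        return
                continue  # never extend a path past the start account
            if nxt in path:
                continue  # assumption: intermediate accounts are distinct
            walk(path + [nxt])

    walk([start])
    return rings


# Hypothetical data: one 5-hop ring through "a" and one 2-hop loop that is too short.
edges = [
    ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a"),
    ("a", "x"), ("x", "a"),
]
rings = find_rings(edges, "a")
```

With the default bounds only the 5-hop ring is returned; the `limit` argument plays the role of the number-of-loops parameter.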

Detect All Fraud Rings

In the previous example, we detected fraud rings connected to a suspicious
account. What if we did not yet have a list of suspicious accounts to analyze
and wanted to analyze our graph to detect all of the fraud ring patterns in it?

This is easily accomplished in AQL by adding an outer loop to the fraud ring
detector for suspicious accounts. This sophisticated query is written in only 6
lines of AQL!
The query for finding all fraud loops is depicted in Figure 6.

Figure 6: Find all fraud rings
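
The outer-loop idea can be sketched in Python as follows: run the ring search once per account. This is an in-memory illustration with hypothetical names, not ArangoDB code. It also adds one step that is an assumption on my part and not something the paper describes: because the same ring is discovered once from each of its members, the sketch rotates every ring to start at its smallest account so that duplicates collapse.

```python
def all_rings(edges, min_hops=5, max_hops=10):
    """Search for rings starting from every account (the outer loop)."""
    outgoing = {}
    for src, dst in edges:
        outgoing.setdefault(src, []).append(dst)

    found = {}
    for start in sorted(outgoing):  # outer loop over every account with outgoing edges
        stack = [[start]]
        while stack:
            path = stack.pop()
            if len(path) - 1 >= max_hops:
                continue
            for nxt in outgoing.get(path[-1], []):
                if nxt == start:
                    # Closing the loop adds one edge, so the ring has len(path) edges.
                    if len(path) >= min_hops:
                        i = path.index(min(path))
                        key = tuple(path[i:] + path[:i])  # canonical rotation
                        found.setdefault(key, list(key) + [key[0]])
                elif nxt not in path:
                    stack.append(path + [nxt])
    return list(found.values())


# Hypothetical data: a 5-hop ring and a short 2-hop loop.
edges = [
    ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a"),
    ("f", "g"), ("g", "f"),
]
long_rings = all_rings(edges)
```

Without the rotation step, the 5-hop ring above would be reported five times, once per starting account.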

Find Orphan Accounts

There are many patterns for finding suspicious accounts that may require
further investigation. Most of these patterns are essentially finding
anomalous behavior to flag accounts.

One pattern is the orphan account, where an account is set up to participate
in very specific fraud transaction patterns, but otherwise does not interact in
a ‘normal’ way with other accounts and may be used very infrequently.

Figure 7 depicts a query that finds orphan accounts and reports each account
together with its owner.

Figure 7: Find Suspicious “Orphan” Accounts
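
The orphan check boils down to a set difference: collect every account that appears at either end of a transaction, then report the accounts that are not in that set. The Python sketch below shows this on hypothetical in-memory data; the attribute names mirror those in the appendix query, and the sample values are made up.

```python
def find_orphan_accounts(accounts, transactions):
    """Return accounts that never appear as sender or receiver of a transaction."""
    used = {t["_from"] for t in transactions} | {t["_to"] for t in transactions}
    orphans = [a for a in accounts if a["_id"] not in used]
    # Same ordering as the appendix query: by account type, then customer.
    return sorted(orphans, key=lambda a: (a["account_type"], a["customer_id"]))


# Hypothetical sample data.
accounts = [
    {"_id": "account/1", "account_type": "checking", "customer_id": 10},
    {"_id": "account/2", "account_type": "savings", "customer_id": 11},
    {"_id": "account/3", "account_type": "checking", "customer_id": 12},
]
transactions = [
    {"_from": "account/1", "_to": "account/2"},
]
orphans = find_orphan_accounts(accounts, transactions)
```

This sketch only finds accounts with no transactions at all; flagging accounts with very few transactions would need a count per account and a threshold chosen for the data at hand.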

Find Most Influential Customers and Accounts

We can also use standard graph algorithms such as PageRank to find deeply
coordinated activity by looking for the most influential customers and
accounts.

The PageRank algorithm scores how important or influential a vertex is
relative to the rest of the network. In ArangoDB, this is accomplished by
executing the PageRank algorithm on the graph via the Pregel interface and
then visualizing the results.

Figure 8 depicts a visualization of several clusters of
customer/account/transaction activity, where the size of each vertex is scaled
in proportion to the PageRank computed for that vertex. This visualization
provides visual cues to the relative dominance of customers and accounts in
the network.

Figure 8: Find most influential accounts and customers
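
For readers who want to see what a PageRank score represents, the following is a textbook power-iteration sketch in plain Python. It is not ArangoDB's Pregel implementation, whose details may differ; the damping factor of 0.85, the fixed iteration count, and the sample graph are assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by vertices without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical graph: three accounts pay into "hub", which pays one of them back.
edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "a")]
ranks = pagerank(edges)
```

In this toy graph the account that receives funds from many others ends up with the highest score, which is the kind of signal the vertex sizes in the visualization convey.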

What are the top 3 most influential accounts?

Top 3 or top 10 queries are often used to focus attention. In this example,
we use an AQL query to find the top 3 most influential accounts. This query
essentially reads the PageRank value stored by ArangoDB’s PageRank
algorithm, sorts the results in descending order, and limits the output
to three. The query and the results of execution are depicted in Figure 9.

Figure 9: Query for listing top 3 most influential accounts
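
The sort-descending-and-limit step can be sketched in a few lines of Python. The attribute name `pagerank` and the scores below are hypothetical; the sketch simply assumes each account document already carries its stored score.

```python
def top_influential(documents, n=3, score_attribute="pagerank"):
    """Return the n documents with the highest stored score, highest first."""
    ranked = sorted(documents, key=lambda doc: doc[score_attribute], reverse=True)
    return ranked[:n]


# Hypothetical account documents with made-up scores.
accounts = [
    {"_id": "account/1", "pagerank": 0.22},
    {"_id": "account/2", "pagerank": 0.05},
    {"_id": "account/3", "pagerank": 0.31},
    {"_id": "account/4", "pagerank": 0.20},
]
top3 = top_influential(accounts)
```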

Finding Money Laundering Patterns

ArangoDB can also be used to find more specific patterns, for example, in
money laundering. In money laundering there is a funds
disaggregation/aggregation pattern, where many small transactions (below
some known triggering threshold) are used to split up a large sum of money,
followed by multiple transaction hops across accounts to further avoid
detection, ultimately followed by a number of transactions that aggregate
the funds back to an account.

This fan-out/fan-in pattern can easily be detected using AQL. The query and
results are depicted in Figure 10.

Figure 10: Finding Money Laundering Patterns
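
A simplified version of the fan-out/fan-in check can be sketched in Python as shown below. It is deliberately narrower than the pattern described above: it considers only a single layer of intermediary accounts between source and destination, and it treats the $10,000 reporting threshold mentioned in the appendix as a parameter. The function name, the `min_branches` parameter, and the sample data are hypothetical.

```python
from collections import defaultdict


def fan_out_fan_in(transactions, threshold=10_000, min_branches=3):
    """Find (source, sink, intermediaries) where funds fan out and back in.

    Only transfers below `threshold` are considered, and only one layer of
    intermediary accounts is examined (a simplifying assumption).
    """
    receivers = defaultdict(set)  # source account -> accounts it paid
    senders = defaultdict(set)    # sink account -> accounts that paid it
    for tx in transactions:
        if tx["amount"] < threshold:
            receivers[tx["_from"]].add(tx["_to"])
            senders[tx["_to"]].add(tx["_from"])

    matches = []
    for source in sorted(receivers):
        for sink in sorted(senders):
            if sink == source:
                continue
            middle = (receivers[source] & senders[sink]) - {source, sink}
            if len(middle) >= min_branches:
                matches.append((source, sink, sorted(middle)))
    return matches


# Hypothetical data: "src" splits funds over three accounts that all pay "dst".
transactions = [
    {"_from": "src", "_to": "m1", "amount": 9000},
    {"_from": "src", "_to": "m2", "amount": 9000},
    {"_from": "src", "_to": "m3", "amount": 9000},
    {"_from": "m1", "_to": "dst", "amount": 8900},
    {"_from": "m2", "_to": "dst", "amount": 8900},
    {"_from": "m3", "_to": "dst", "amount": 8900},
    {"_from": "big", "_to": "dst", "amount": 50000},
]
patterns = fan_out_fan_in(transactions)
```

Extending the sketch to several intermediate hops would mean replacing the set intersection with a bounded traversal between source and sink.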

Detecting Fraud At Scale
Real-world financial transactions generate billions of data points and
relationships, which will rapidly overrun the capabilities of a single server.
Providing fraud-detection performance at scale requires the underlying data
systems to be able to scale out data across multiple nodes in a distributed
cluster and to be able to efficiently distribute computation in parallel across
the cluster.

On a distributed database cluster, the limiting factor is network performance,
because network access is two orders of magnitude slower than memory access,
and in a distributed cluster there will be data and communication
traffic between nodes. For example, the performance of
detecting a fraud ring would be negatively impacted if many of the edges
being traversed caused computation to hop back and forth between servers.
Better network performance obviously improves overall performance.
However, there are also data distribution and query optimizations that can
greatly reduce the amount of inter-node communication needed to execute
queries, and therefore improve distributed performance.

Optimizing the layout of data on the cluster can reduce the inter-node
communication needed to perform queries. ArangoDB uses the SmartGraph
feature to optimize graph distribution across a cluster, SmartJoins to
ensure that joins do not cross servers, and SatelliteCollections to replicate
metadata across servers so that lookups occur local to each server.

Figure 11: Bad distribution of graph data causes network hops during query execution

The SmartGraph feature of ArangoDB allows us to handle this problem in a
smarter way. In fraud detection we might know from the past that
fraudsters use banks in certain countries or regions to launder their money.
We can use this domain knowledge as a sharding key for our graph data and
allocate all financial transactions performed in this region to DB server 1,
and distribute other transactions across other DB servers. By using this
approach we can place all data that needs to be grouped together on the same
machine, and use the query engines on each DB server to execute our
queries in parallel.

Figure 12: Optimized data distribution with ArangoDB SmartGraphs
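
The effect of choosing a good sharding key can be illustrated with a small Python sketch that counts how many transaction edges cross server boundaries under two placements. This is a conceptual model only and does not reproduce how ArangoDB assigns shards internally; the `region` values, server names, and function names are hypothetical.

```python
def place_round_robin(account_ids, servers):
    """Naive placement: spread accounts over servers without domain knowledge."""
    ordered = sorted(account_ids)
    return {acct: servers[i % len(servers)] for i, acct in enumerate(ordered)}


def place_by_region(account_regions, region_to_server):
    """Placement keyed on a sharding attribute (here a hypothetical region)."""
    return {acct: region_to_server[region] for acct, region in account_regions.items()}


def cross_server_edges(edges, placement):
    """Count transactions whose two accounts live on different servers."""
    return sum(1 for src, dst in edges if placement[src] != placement[dst])


# Hypothetical data: transactions mostly stay within one region.
account_regions = {
    "acct1": "region_a", "acct2": "region_a",
    "acct3": "region_b", "acct4": "region_b",
}
edges = [("acct1", "acct2"), ("acct2", "acct1"), ("acct3", "acct4")]

naive = place_round_robin(account_regions, ["db1", "db2"])
by_region = place_by_region(account_regions, {"region_a": "db1", "region_b": "db2"})

naive_hops = cross_server_edges(edges, naive)
region_hops = cross_server_edges(edges, by_region)
```

In this toy example every edge crosses servers under the naive placement and none does when accounts are grouped by region, which is the kind of difference Figures 11 and 12 contrast.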

Conclusion
This paper points the way to using ArangoDB as part of a fraud detection
solution. We encourage you to experiment with our sample data and
sample queries, to learn how to apply ArangoDB to fraud via experimentation
by adding to or modifying the data and queries, and to be inspired and
empowered to apply your knowledge of fraud to use ArangoDB on your own data to
detect fraudulent activity. To get started easily, you can follow the interactive
demo provided on our cloud service ArangoDB Oasis and described below.

Hands-on with Fraud Detection & Anti Money Laundering

Testing ArangoDB and its capabilities for detecting fraud and money
laundering is very simple. Many of the use cases shown in this white paper
are part of an interactive demo available for free on ArangoDB’s cloud
service Oasis. No credit card is needed for a 14-day free trial deployment, and
the examples can be installed with just one click. A detailed guide is provided
so that everyone can follow along easily.

Just sign up for ArangoDB Oasis and follow the few steps below:

1. Create a Deployment (a 2-minute video tutorial is available)
2. Install the Fraud Detection Example in Oasis (Project -> Deployment
Tab -> View your deployment -> Examples Tab, or just click “view
Deployment” directly after initiating the deployment creation)
3. After the example is ready (~1 minute), follow the Fraud Detection
guide provided to run real queries against the demo data you just
installed

This White Paper was written by Arthur Keen. For any questions about solving
Fraud Detection cases with ArangoDB, feel free to reach out to
[email protected]

Appendix A: Queries
/*
Find all suspicious long loops of transactions
Show the graph and JSON results
Scroll to bottom of graph results and click "GraphViewer" to see results in Graph Viewer
*/

WITH transaction, account

FOR suspicious_account IN account
  FOR acct, tx, path IN 5..10 OUTBOUND suspicious_account._id GRAPH 'fraud-detection'
    PRUNE tx._to == suspicious_account._id
    FILTER tx._to == suspicious_account._id
    RETURN path

/*
Find number of Curious loops from a suspicious Account
Hints:
Try suspiciousAccountID = account/10000032
Rerun the query for different number of loops detected
Show the graph and JSON results
Scroll to bottom of graph results and click "GraphViewer" to see results in Graph Viewer
*/

WITH account, transaction

LET suspicious_account = DOCUMENT(@suspiciousAccountID)
FOR acct, tx, path IN 5..10 OUTBOUND suspicious_account._id GRAPH 'fraud-detection'
  PRUNE tx._to == suspicious_account._id
  FILTER tx._to == suspicious_account._id
  LIMIT @numberOfLoopsReturned
  RETURN path

/*
Find Orphan Account
An orphan account is an account with few or no transactions.
These may be set up in advance of money laundering operations.
This query finds accounts with no transactions
*/

LET usedResources = UNION_DISTINCT(
  (FOR relationship IN transaction RETURN relationship._from),
  (FOR relationship IN transaction RETURN relationship._to))
FOR resource IN account
  FILTER resource._id NOT IN usedResources
  SORT resource.account_type, resource.customer_id
  RETURN {
    "customerName": DOCUMENT(CONCAT("customer/", resource.customer_id)).Name,
    "customerID": resource.customer_id,
    "accountID": resource._id,
    "type": resource.account_type
  }

/*
Anti Money Laundering Pattern Detection
Find transaction patterns that contain a disaggregation and re-aggregation of funds pattern
This pattern is characterized by transactions that dis-aggregate funds from a source account to
multiple accounts in amounts that are below a reporting threshold, i.e., below $10,000
followed by a series of small transactions into 1 or more accounts, followed by re-aggregation
of the small transactions into a destination account.
Show the graph and JSON results
Scroll to bottom of graph results and click "GraphViewer" to see results in Graph Viewer
*/

WITH account, transaction

LET accountOutDegree = (FOR transaction IN transaction
  COLLECT accountOut = transaction._from WITH COUNT INTO outDegree
  RETURN {account : accountOut, outDegree : outDegree})
LET accountInDegree = (FOR transaction IN transaction
  COLLECT accountIn = transaction._to WITH COUNT INTO inDegree
  RETURN {account : accountIn, inDegree : inDegree})
LET accountDegree = (FOR inRecord IN accountInDegree
  FOR outRecord IN accountOutDegree
    FILTER inRecord.account == outRecord.account
    RETURN MERGE(inRecord, outRecord))
LET maxAccount = (FOR maxDegree IN accountOutDegree
  FILTER maxDegree.outDegree == MAX(accountOutDegree[*].outDegree)
  RETURN maxDegree)[0]
FOR account, transaction IN 1..4 OUTBOUND maxAccount.account transaction
  RETURN transaction
