Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
Presented by Patrick Di Loreto
R&D Engineering Lead
14th June 2015
Site: https://developer.williamhill.com/
Blog: http://patricknoir.blogspot.com
Twitter: https://twitter.com/patricknoir
Introduction
•  WH Labs
•  Omnia – Data Management Platform
–  Omnia Chronos – A distributed Integration Middleware with Akka and Kafka
–  Omnia Fates – The long-term memory with Apache Cassandra
–  Omnia NeoCortex – Real time and Machine Learning using Apache Spark
–  Omnia Hermes – Serving layer with Akka CQRS
–  Omnia Infrastructure – Mesos, Marathon and Docker
We're Hiring
https://careers.williamhill.com
•  WH Apple Watch App
•  Interactive Scoreboard
•  Virtual Reality Horse Race (Oculus Rift)
Omnia Platform
Reactive Distributed Data Platform
Omnia – Data Management Platform
Based on a Lambda Architecture
Respecting Reactive Principles
•  Chronos – Data Source
•  Fates – Batch Layer
•  NeoCortex – Speed Layer
•  Hermes – Serving Layer
Omnia & Lambda Architecture
[Diagram: Chronos (Data Source), NeoCortex (Speed Layer), Fates (Batch Layer), Hermes (Serving Layer)]
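As a rough illustration of how the layers relate in a Lambda Architecture in general (this is not Omnia code), the serving layer can answer a query by combining a precomputed batch view with the increments the speed layer has accumulated since the last batch run. The function name, the keys and the counts below are hypothetical.

```python
from collections import Counter


def merge_views(batch_view, realtime_view):
    """Combine a batch view with speed-layer increments (both map key -> count)."""
    merged = Counter(batch_view)
    merged.update(realtime_view)  # adds the counts key by key
    return dict(merged)


# Hypothetical number of bets placed per customer.
batch_view = {"customer-1": 10, "customer-2": 4}     # computed by the batch layer
realtime_view = {"customer-2": 1, "customer-3": 2}   # accumulated by the speed layer

serving_view = merge_views(batch_view, realtime_view)
print(serving_view)
```

The merge only works this simply because counts can be added; views that are not additive would need their own merge rule.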
  	
  	
  	
  	
  	
  
	
  	
  	
  	
  	
  
Omnia Principles
http://www.reactivemanifesto.org/
  
•  Scalable
•  Fault Tolerant
•  Highly Available
Omnia Chronos – Data Source
Omnia Chronos
Chronos is in charge of collecting data from
different sources and organising it
into a stream of observable events.
Observable [ ]
•  Social media
•  Facebook
•  Twitter
•  Affiliates
•  Page viewing
•  Articles read, following and followers, bets etc.
•  Sports related
•  Tweets
•  News
•  Gaming
•  Web Analytics
•  Activities within our applications
[Diagram labels: Internal, Product Centric, External, Customer Centric]
  
{
  "type" : "bet",
  "version" : "1.0",
  "time" : "2015-06-03 08:00:31",
  "acquisitionTime" : "...",
  "source" : "WHBetSystem",
  "payload" : { … any valid json }
}
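A minimal sketch of how a consumer might parse the incident envelope shown above. The field names come from the slide; the `Incident` class, the `parse_incident` function, the time format (inferred from the sample `time` value), and the sample `acquisitionTime` and `payload` values are assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict

# Assumption: both timestamps use the format of the sample "time" value.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass(frozen=True)
class Incident:
    type: str
    version: str
    time: datetime
    acquisition_time: datetime
    source: str
    payload: Dict[str, Any]  # any valid JSON object


def parse_incident(raw: str) -> Incident:
    """Turn one JSON envelope into a typed Incident."""
    doc = json.loads(raw)
    return Incident(
        type=doc["type"],
        version=doc["version"],
        time=datetime.strptime(doc["time"], TIME_FORMAT),
        acquisition_time=datetime.strptime(doc["acquisitionTime"], TIME_FORMAT),
        source=doc["source"],
        payload=doc["payload"],
    )


# Hypothetical message: acquisitionTime and payload are made up.
raw = """{
  "type": "bet",
  "version": "1.0",
  "time": "2015-06-03 08:00:31",
  "acquisitionTime": "2015-06-03 08:00:32",
  "source": "WHBetSystem",
  "payload": {"stake": 5.0}
}"""

incident = parse_incident(raw)
print(incident.type, incident.source, incident.time.isoformat())
```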
  
Omnia Chronos
In Chronos you define streams that collect data, then convert and
persist it into a stream of Observable[Incident].
[Diagram: Chronos with Stream 1, Stream 2 and Stream 3]
Omnia Chronos - Clustering
[Diagram: Chronos 1, Chronos 2, Chronos 3; Twitter]
Omnia Chronos
•  Each stream is an actor which supervises its children:
–  Adapter Actor
–  Converter Actor
–  Persistence Manager Actor
•  Stream Actors are referentially transparent through the use of
Akka Cluster: we have extended Akka Cluster to migrate the
Stream Actors based on resource KPIs
•  Data is persisted in Kafka for durability
•  Chronos is built on top of Akka, ScalaRx and Play Framework;
a migration to Akka Streams is planned
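The stream structure described above can be sketched in plain Python. This is not Akka code and not the Chronos implementation: the `Stream` class stands in for the parent actor, three callables stand in for the Adapter, Converter and Persistence Manager children, a list stands in for a Kafka topic, and the "skip the bad message and carry on" supervision decision is an assumption.

```python
import json


class Stream:
    """Plain-Python stand-in for a stream actor and its three children."""

    def __init__(self, adapter, converter, persist):
        self.adapter = adapter      # yields raw messages from a source
        self.converter = converter  # raw message -> incident dict
        self.persist = persist      # stores an incident
        self.failures = []

    def run(self):
        for raw in self.adapter():
            try:
                self.persist(self.converter(raw))
            except Exception as error:
                # Assumed supervision decision: record the failure,
                # skip the bad message and keep the stream alive.
                self.failures.append((raw, repr(error)))


def bet_adapter():
    # Hypothetical raw feed; the second message is malformed on purpose.
    yield '{"customer": "c1", "stake": 5}'
    yield "not json"
    yield '{"customer": "c2", "stake": 7}'


def bet_converter(raw):
    return {"type": "bet", "source": "WHBetSystem", "payload": json.loads(raw)}


log = []  # in-memory stand-in for a Kafka topic
stream = Stream(bet_adapter, bet_converter, log.append)
stream.run()
print(len(log), "incidents persisted,", len(stream.failures), "failure(s)")
```

Keeping the failure handling in the parent rather than in each stage mirrors the idea that a stream supervises its children.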
Omnia Fates
Fates represents the long-term memory of Omnia. It is in charge of organising all the incidents recorded by Chronos into
timelines, and of creating new information as views by using machine learning, logical reasoning and time series analysis.
•  A timeline represents the history of a specific entity: the sequence of incidents it performed over time. Timelines
are organised by category. An example is the customer timeline, which might contain all the bets
placed, deposit and withdrawal activities, tweets etc. performed by a specific customer.
A timeline category is not limited to customers; it can be anything, for example a sport event such as a football match or a
competition.
•  Views are the result of jobs that process data from:
–  Timelines
–  Other Views
Timelines are created from timeline streams: each timeline stream reads data from a Chronos stream and
feeds the right timeline.
[Diagram: Chronos, Fates]
Omnia Fates – Cassandra
•  Fates persists timelines of incidents.
•  Column family name: <TimelineCategory>_tl
•  Key definition: ( (entityId, date), timestamp )
•  The partition key (entityId, date) is a strong hash key, which keeps the Cassandra cluster well balanced
•  Composite key: incidents are ordered by timestamp under a specific entity within a day
(date = yyyy-MM-dd)
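To make the key layout concrete, here is a small in-memory sketch of the same idea: rows are grouped by the partition key (entityId, date) and kept ordered by timestamp inside each partition. It does not talk to Cassandra; `TimelineStore` and its methods are hypothetical names, and only the column family naming and the key shape come from the slide.

```python
from collections import defaultdict
from datetime import datetime


def column_family(category):
    """Column family name: <TimelineCategory>_tl."""
    return f"{category}_tl"


def timeline_key(entity_id, timestamp):
    """Return (partition key, clustering key) = ((entityId, date), timestamp)."""
    return (entity_id, timestamp.strftime("%Y-%m-%d")), timestamp


class TimelineStore:
    """In-memory stand-in: one list of rows per partition, per column family."""

    def __init__(self):
        self.tables = defaultdict(lambda: defaultdict(list))

    def append(self, category, entity_id, timestamp, incident):
        partition, clustering = timeline_key(entity_id, timestamp)
        rows = self.tables[column_family(category)][partition]
        rows.append((clustering, incident))
        rows.sort(key=lambda row: row[0])  # keep rows ordered by timestamp

    def day(self, category, entity_id, date):
        """All incidents of one entity for one day, oldest first."""
        rows = self.tables[column_family(category)][(entity_id, date)]
        return [incident for _, incident in rows]


store = TimelineStore()
store.append("customer", "c1", datetime(2015, 6, 3, 8, 0, 31), "bet")
store.append("customer", "c1", datetime(2015, 6, 3, 7, 15, 0), "deposit")
store.append("customer", "c1", datetime(2015, 6, 4, 9, 0, 0), "withdrawal")
print(store.day("customer", "c1", "2015-06-03"))
```

Including the date in the partition key bounds the size of each partition to one day of activity per entity, at the cost of one read per day when querying a longer period.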
Omnia Fates
•  We build views with jobs able to perform:
–  Logical reasoning: deduction, induction, abduction
–  Timeline analysis: trends, cycles, seasonality
–  Other ML: classification, clustering, predictions
•  Jobs are performed on top of NeoCortex
Omnia Neo Cortex
Omnia Neo Cortex
•  NeoCortex is a library developed on top of Apache Spark that gives
developers an easy way to write microservices on top of Omnia.
•  In NeoCortex we use the distributed nature of Spark to perform fast, real-time data
processing, and we hide from the developer the problems of connecting to
the source system (Chronos) and to the publishing layer
•  Typeclass definitions for: Timeline, View, ChronosStream etc.
•  Typeclass definitions for algebraic structures:
–  Monoids, Rings, Groups, providing advanced functions for moving averages,
ARX, ARMA etc.
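As an illustration of why a monoid is a convenient building block for functions such as moving averages, the sketch below defines an average as a (total, count) pair with an associative `combine` and an identity element. It is plain Python, not the NeoCortex typeclasses; the names `Average`, `EMPTY` and `moving_average` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Average:
    """Monoid element: a running (total, count) pair."""
    total: float = 0.0
    count: int = 0

    def combine(self, other):
        # Associative: (a + b) + c gives the same result as a + (b + c).
        return Average(self.total + other.total, self.count + other.count)

    @property
    def value(self):
        return self.total / self.count if self.count else 0.0


EMPTY = Average()  # identity element: x.combine(EMPTY) == x


def moving_average(values, window):
    """Average of each sliding window, built only from combine and EMPTY."""
    result = []
    for end in range(window, len(values) + 1):
        acc = EMPTY
        for v in values[end - window:end]:
            acc = acc.combine(Average(v, 1))
        result.append(acc.value)
    return result


stakes = [2.0, 4.0, 6.0, 8.0]
print(moving_average(stakes, window=2))
```

Because `combine` is associative, partial aggregates computed on different partitions can be merged in any grouping, which is what makes this shape a good fit for distributed processing.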
Omnia Neo Cortex - Parallelism
[Diagram: chronos stream, Driver, Executors 1 to 4, Stage 1 (map), Stage 2 (reduceByKey), Hermes (Serving Layer), Fates (timelines, views)]
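The two stages named in the diagram can be emulated in plain Python to show what each one does. This is not Spark code: in Spark the corresponding RDD operations are `map` and `reduceByKey`, while `map_stage`, `reduce_by_key` and the sample incidents below are hypothetical stand-ins.

```python
from collections import defaultdict
from functools import reduce


def map_stage(partition, fn):
    """Stage 1: apply fn to every record of one partition, independently."""
    return [fn(record) for record in partition]


def reduce_by_key(partitions, fn):
    """Stage 2: group pairs by key across partitions (the shuffle), then reduce."""
    grouped = defaultdict(list)
    for partition in partitions:
        for key, value in partition:
            grouped[key].append(value)
    return {key: reduce(fn, values) for key, values in grouped.items()}


# Hypothetical bet incidents, already split into two partitions.
partitions = [
    [{"customer": "c1", "stake": 5.0}, {"customer": "c2", "stake": 1.0}],
    [{"customer": "c1", "stake": 2.5}],
]

mapped = [map_stage(p, lambda i: (i["customer"], i["stake"])) for p in partitions]
totals = reduce_by_key(mapped, lambda a, b: a + b)
print(totals)
```

The map stage needs no data from other partitions, whereas the reduce stage has to bring all values of a key together, which is why the two fall into separate stages.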
  
Omnia Hermes
Hermes
Hermes is the layer where data is presented for consumption, both B2B and B2C. Microservices,
notifications and data as an API are the key aspects of its design.
•  Scalable and simple full-duplex communication for the web
•  Expresses the correlation between the entities of the model
•  Inspired by Falcor (Netflix) and GraphQL (Facebook)
Hermes
[Diagram: Hermes with a Distributed Cache; a Hermes Node contains a Local Cache, Subscription Manager, Client Manager, Authentication Handler and Dispatcher; protocols HTTP, WS and TCP; clients: Browser (Hermes JS) and WH Apps]
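One possible reading of how a subscription manager and a local cache could cooperate is sketched below: clients subscribe to a path in the data model, receive the cached value immediately, and are notified on every later update. This is an assumption about the design, not the Hermes protocol; the class, its methods and the path `customers/c1/totalStaked` are hypothetical.

```python
from collections import defaultdict


class SubscriptionManager:
    """Keeps the latest value per path and notifies subscribers on change."""

    def __init__(self):
        self.cache = {}                       # path -> latest value (local cache)
        self.subscribers = defaultdict(list)  # path -> list of callbacks

    def subscribe(self, path, callback):
        self.subscribers[path].append(callback)
        if path in self.cache:
            callback(path, self.cache[path])  # send the current value at once

    def publish(self, path, value):
        self.cache[path] = value
        for callback in self.subscribers[path]:
            callback(path, value)


received = []
manager = SubscriptionManager()
manager.publish("customers/c1/totalStaked", 7.5)
manager.subscribe("customers/c1/totalStaked", lambda p, v: received.append((p, v)))
manager.publish("customers/c1/totalStaked", 12.5)
print(received)
```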
  
Omnia Infrastructure – Mesos/Marathon/Docker
Omnia Infrastructure
[Diagram: Omnia on Docker, on Marathon, on Mesos, across five Nodes]
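A sketch of what deploying one Omnia component through this stack could look like: a Marathon application definition that runs a Docker image on Mesos, built here as a Python dictionary and printed as JSON. The fields used (`id`, `instances`, `cpus`, `mem`, `container.docker.image`) are standard Marathon app definition fields; the app id, the image name and the resource figures are hypothetical.

```python
import json


def marathon_app(app_id, image, instances, cpus, mem):
    """Build a minimal Marathon app definition for a Docker container."""
    return {
        "id": app_id,
        "instances": instances,  # number of copies Marathon keeps running
        "cpus": cpus,            # CPU share per instance
        "mem": mem,              # memory per instance, in MB
        "container": {
            "type": "DOCKER",
            "docker": {"image": image},
        },
    }


# Hypothetical values for illustration only.
app = marathon_app(
    "/omnia/chronos",
    "example/omnia-chronos:latest",
    instances=3,
    cpus=0.5,
    mem=512,
)
print(json.dumps(app, indent=2))
```

A definition like this is normally submitted to Marathon's REST API, which then asks Mesos for resources on the cluster nodes and restarts instances that fail.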
  
Use Omnia on Omnia
[Diagram: Mesos, Marathon and Docker (Application Repository); three Docker containers each running an Omnia App; JMX from each app into a Health Stream; Chronos, NeoCortex (Speed Layer), Fates (Batch Layer)]
Thank you
Q&A	
  
