SCALABLE EVENT TRACKING
by Øistein Sørensen - Schibsted Payment
WHAT IS AN EVENT?
EVENTS
[Diagram labels: SPiD Core, Events, UDP Logger, file_logger, aws_sqs, Amazon SQS]
EVENTS
[Diagram labels: Amazon SQS queues, EC2, DataPiper, Auto Scaling, Mixpanel, Redshift]
UDP LOGGER
DATAPIPER
AMAZON EC2 DEPLOYMENT
EC2 DEPLOYMENT
[Diagram labels: Auto Scaling, EC2 instances]
EC2 DEPLOYMENT
[Diagram labels: Auto Scaling, Auto Scaling Group, Launch Config, EC2 (Ubuntu 12.04 LTS, m1.medium), bash < User Data]
EC2 DEPLOYMENT
[Diagram labels: Public S3 Bucket, EC2 (Ubuntu 12.04 LTS, m1.medium), Puppet, S3cmd, S3cmd config, bash < User Data]
EC2 DEPLOYMENT
[Diagram labels: Private S3 Bucket, EC2 (Ubuntu 12.04 LTS, m1.medium), Node.js, npm modules, Puppet config, DataPiper, Upstart and logrotate configs, bash < User Data]
EC2 DEPLOYMENT
[Diagram labels: EC2 (Ubuntu 12.04 LTS, m1.medium), DataPiper, mixpanel, redshift, SQS]
SOFTWARE DEPLOYMENT
SOFTWARE DEPLOYMENT
[Diagram labels: Upload, Private S3 Bucket]
SOFTWARE DEPLOYMENT
[Diagram labels: Auto Scaling, 2]
SOFTWARE DEPLOYMENT
[Diagram labels: Auto Scaling, EC2 (Ubuntu 12.04 LTS, m1.medium), DataPiper, mixpanel, redshift]
SOFTWARE DEPLOYMENT
[Diagram labels: Auto Scaling, 1]
SOFTWARE DEPLOYMENT
[Diagram labels: EC2 (Ubuntu 12.04 LTS, m1.medium), DataPiper, mixpanel, redshift]
QUESTIONS?

Editor's Notes

  • #3: An event is something that happens inside the SPiD Core, e.g. signup, login, logout, email verification or purchase. We trigger events to get insight into user behavior and to measure the conversion of our processes. We need events to be able to improve our software.
  • #4: Events are sent from the SPiD Core to our UDP logger. The UDP logger inserts all events into an SQS queue.
  • #5: DataPiper retrieves events from the main SQS queue. Events are filtered and inserted into Redshift, Mixpanel and other SQS queues.
  • #6: The UDP logger is written in Node.js for performance. An admin interface monitors the server so that we can detect issues at an early stage. Why UDP? We use UDP to avoid adding latency inside the SPiD Core: the core can send as many events as it likes without waiting on the logger. Packet loss is not an issue as long as the UDP server is located in the same network as the core application.
  • #7: The DataPiper is also written in Node.js. We use Amazon CloudWatch to monitor DataPiper performance: incoming messages, messages in queue, messages in flight, etc. Based on these numbers we can fine-tune the DataPiper. DataPiper flow: (1) retrieve data from the main SQS queue; (2) filter the data; (3) insert the data into Mixpanel, Redshift or another SQS queue.
  • #8: Deployment of our servers on the Amazon cloud platform. How do we do it, and what do we need to think about? Scalability: we need to make sure our queues never pile up. Uptime: our queues need to be up at all times; luckily Amazon provides that with SQS. Our DataPiper also needs to be up all the time to keep our data flowing.
  • #9: Auto Scaling is the key to this solution. It makes sure we always have the desired number of servers running, and it lets us scale up when traffic increases and scale down again afterwards.
  • #10: What we call Auto Scaling is actually an Auto Scaling Group, which provides the desired number of EC2 instances based on your scaling policy. An Auto Scaling Group is tied to a Launch Configuration, which defines: AMI (the predefined server image); instance type (hardware type); storage (size and type); security group (the firewall around this group of servers); user data (a bash script run at launch, used to automate installation).
  • #11: When the Auto Scaling Group fires up a new server, it happens like this: the AMI is booted with the desired instance type, storage and security group. The user data script then installs the S3cmd config from the public S3 bucket, the S3cmd tools, and Puppet via npm.
  • #12: Once the first step is done and the server is able to connect to the private S3 bucket, the user data script downloads the standalone (node-less) Puppet config from the private S3 bucket. It then runs the Puppet client (node-less, no master server needed), which: installs the required packages (Node.js, npm, etc.); prepares the software install by creating directories and setting ownership; installs DataPiper (software, config, upstart, logrotate); starts the DataPiper service.
  • #13: No SSH login. No manual labor. Everything is automated. Look, no hands :)
  • #14: How do we deploy new versions of our software? Software deployment can be a tedious process. We’re working hard to simplify it and to minimize the risk of downtime due to deployment.
  • #15: This is how it’s done: the deployment master prepares a new release of our software and a new config file, and uploads both to our private S3 bucket. Before proceeding, wait a few minutes and enjoy a good cup of coffee: there can be a replication delay inside the S3 platform.
  • #16: Start the deployment of new instances: increase the number of desired instances by the number of new instances you want to deploy with the new software version. Going one at a time is a good way to be sure everything works smoothly.
  • #17: Auto Scaling fires up new instances with our new software and config files. This usually takes a couple of minutes.
  • #18: When the new instances are up, decrease the number of desired instances back to the original number. Auto Scaling will destroy the old instances and you’re good to go with your new version.
  • #20: How do we deploy new versions of our software?
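
Notes #4 and #6 describe the SPiD Core sending events to the logger over UDP so that the core never waits on the logger. A minimal sketch of that fire-and-forget pattern is shown here in Python for brevity (the talk's logger is written in Node.js). The JSON payload layout, the function names and the event fields are assumptions made for illustration; the talk does not show the actual wire format.

```python
import json
import socket


def encode_event(name, properties):
    """Serialize one event as UTF-8 JSON (hypothetical payload layout)."""
    return json.dumps({"event": name, "properties": properties}).encode("utf-8")


def send_event(sock, address, name, properties):
    """Send one event as a single UDP datagram and return the bytes sent.

    sendto() returns as soon as the datagram is handed to the OS. There is
    no acknowledgement from the logger, so the sender does not block on it
    and a lost datagram goes unnoticed.
    """
    return sock.sendto(encode_event(name, properties), address)


def demo():
    """Send one event to a local stand-in logger and return what it received."""
    # Stand-in for the UDP logger: a socket bound to an OS-chosen local port.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        send_event(sender, receiver.getsockname(), "Login", {"user_id": 42})
        data, _ = receiver.recvfrom(65535)
        return json.loads(data.decode("utf-8"))
    finally:
        sender.close()
        receiver.close()
```

Calling `demo()` sends a single `Login` event over the loopback interface and returns the decoded payload. The trade-off is the one note #6 names: the sender gets no delivery guarantee, which is acceptable only while sender and logger share a network.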
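
The DataPiper flow in notes #5 and #7 (retrieve, filter, insert into Mixpanel, Redshift or another SQS queue) can be sketched as a small routing function. This is an illustration in Python, not the DataPiper code: the destination names, the filter rules and the event fields are all hypothetical, and the actual calls to SQS, Mixpanel and Redshift are left out.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]

# Hypothetical filter rules: destination name -> predicate deciding whether
# an event should be forwarded there. The real rules are not given in the talk.
ROUTES: Dict[str, Callable[[Event], bool]] = {
    "redshift": lambda e: True,
    "mixpanel": lambda e: e.get("event") in {"Signup", "Login", "Purchase"},
    "sqs:purchases": lambda e: e.get("event") == "Purchase",
}


def route(event: Event) -> List[str]:
    """Return the destinations whose filter accepts this event."""
    return [dest for dest, accepts in ROUTES.items() if accepts(event)]


def pipe(batch: List[Event]) -> Dict[str, List[Event]]:
    """Group a batch of events (e.g. one read from the main queue) by destination."""
    outbox: Dict[str, List[Event]] = {dest: [] for dest in ROUTES}
    for event in batch:
        for dest in route(event):
            outbox[dest].append(event)
    return outbox
```

With these example rules, `pipe([{"event": "Purchase"}, {"event": "Logout"}])` places both events in the `redshift` list and only the purchase in the `mixpanel` and `sqs:purchases` lists. Keeping the rules in one table makes it easy to add a destination without touching the loop.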
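
The deployment procedure in notes #15 to #18 (upload a release, raise the desired instance count, wait, lower it again) can be illustrated with a small in-memory simulation. It is a sketch under stated assumptions and does not call any AWS API: the `Group` class is hypothetical, new instances are assumed to pick up whatever release is current in the private S3 bucket when they boot, and scale-in is assumed to remove the oldest instances first.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Group:
    """Toy model of an Auto Scaling Group (not an AWS client)."""
    current_release: str  # release a freshly booted instance would install
    instances: List[str] = field(default_factory=list)  # releases, oldest first

    def set_desired(self, desired: int) -> None:
        # Scale out: new instances boot with the current release.
        while len(self.instances) < desired:
            self.instances.append(self.current_release)
        # Scale in: assumption that the oldest instances are removed first.
        while len(self.instances) > desired:
            self.instances.pop(0)


def rolling_deploy(group: Group, new_release: str, step: int = 1) -> List[str]:
    """Replace all instances by raising and lowering the desired count."""
    if step < 1:
        raise ValueError("step must be at least 1")
    original = len(group.instances)
    group.current_release = new_release      # note #15: upload the new release
    while any(r != new_release for r in group.instances):
        group.set_desired(original + step)   # note #16: add new instances
        group.set_desired(original)          # note #18: back to the original count
    return group.instances
```

For a group of two instances on release `v1`, `rolling_deploy(group, "v2")` takes two rounds with `step=1`, matching the "one at a time" advice in note #16, and leaves two instances on `v2`.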