Implementing and Visualizing Click-Stream Data with MongoDB

Jan 22, 2013 - New York MongoDB User Group

Cameron Sim - LearnVest.com
Agenda	

•  About LearnVest	

•  HL Application Architecture	

•  Data Capture	

•  Event Packaging	

•  MongoDB Data Warehousing	

•  Loading & Visualization	

•  Finishing up
LearnVest Inc.
www.learnvest.com

Mission Statement
Aiming to make Financial Planning as accessible as having a gym membership

Company
•  Founded in 2008 by Alexa Von Tobel, CEO
•  50+ People and growing rapidly
•  Based in NYC

Key Products
•  Account Aggregation and Management (Bank, Credit, Loan, Investment, Mort…)
•  Original and Syndicated Newsletter Co…
•  Financial Planning (tiered product offering)

Platforms
•  Web
•  iPhone

Stack
•  Operational: Wordpress, Backbone.js, Node.js; Java Spring 3, Redis, Memcached, …
•  Analytics: MongoDB 2.2.0 (3-node replica-set); Java 6, Spring 3
LearnVest.com	

      Web
LearnVest.com	

     IPhone
High Level Architecture

[Diagram: Production and Analytics tiers; components labelled Delivery, Services, Loaders, Dashbo…; links labelled HTTPS and pyMongo]
Data Collection

Capture Everything
•  User-driven events over web and mobile
•  System-level exceptions
•  Everything else

Temporary Data
•  ‘…ok’ with approximate data
•  Operational Databases are the system of record

Aggregate events as they come in
•  Remove the overhead of basic metrics (counts, sums) on core events
•  Group by user unique id and increment counts per event, over time-dimensions (day, week-ending, month, year)
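
The four time-dimensions named above can be derived from a single event date. The following is a minimal sketch rather than the code used at LearnVest: the function name `time_buckets`, the key formats, and the choice of Sunday as the week-ending day are all assumptions, since the slides do not specify them.

```python
from datetime import date, timedelta


def time_buckets(d: date) -> dict:
    """Return the day, week-ending, month and year keys for one event date."""
    # Assumption: weeks end on Sunday; the slides do not say which day is used.
    days_to_sunday = 6 - d.weekday()  # Monday is 0, Sunday is 6
    week_ending = d + timedelta(days=days_to_sunday)
    return {
        "day": d.strftime("%Y-%m-%d"),
        "week": week_ending.strftime("%Y-%m-%d"),
        "month": d.strftime("%Y-%m"),
        "year": d.strftime("%Y"),
    }
```

Each event would then increment one counter per key, so a single login touches four aggregate documents.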
Data Capture

iOS

- (void)sendAnalyticEventType:(NSString *)eventType
                       object:(NSString *)object
                         name:(NSString *)name
                         page:(NSString *)page
                       source:(NSString *)source
{
    // Top-level request parameters; eventData is nested inside them
    NSMutableDictionary *params = [NSMutableDictionary dictionary];
    NSMutableDictionary *eventData = [NSMutableDictionary dictionary];

    // Only set the keys whose values were supplied
    if (eventType != nil) [params setObject:eventType forKey:@"eventType"];
    if (object != nil) [eventData setObject:object forKey:@"object"];
    if (name != nil) [eventData setObject:name forKey:@"name"];
    if (page != nil) [eventData setObject:page forKey:@"page"];
    if (source != nil) [eventData setObject:source forKey:@"source"];
    [params setObject:eventData forKey:@"eventData"];

    [[LVNetworkEngine sharedManager] analytics_send:params];
}
Data Capture

WEB (JavaScript)

function internalTrackPageView() {
  var cookie = {
    userContext: jQuery.cookie('UserContextCookie')
  };
  var trackEvent = {
    eventType: "pageView",
    eventData: {
      page: window.location.pathname + window.location.search
    }
  };
  // AJAX
  jQuery.ajax({
    url: "/api/track",
    type: "POST",
    dataType: "json",
    data: JSON.stringify(trackEvent),
    // Set Request Headers
    beforeSend: function (xhr, settings) {
      xhr.setRequestHeader('Accept', 'application/json');
      xhr.setRequestHeader('User-Context', cookie.userContext);
      if (settings.type === 'PUT' || settings.type === 'POST') {
        xhr.setRequestHeader('Content-Type', 'application/json');
      }
    }
  });
}
Bus Event Packaging

•  Spring 3 RESTful service layer, controller methods define the eventCode via the @Tracking annotation

•  Custom Interceptor class extends HandlerInterceptorAdapter and implements …Handle() (for each event) to invoke calls via Spring @Async to an EventPublisher

•  EventPublisher publishes to a common event bus queue with multiple subscribers, one of which packages the eventPayload Map<String, Object> object and forwards it to the Analytics Rest…
Bus Event Packaging

…ing RestController Methods

Interface

@RequestMapping(value = "/user/login", method = RequestMethod.POST,
                headers = "Accept=application/json")
public Map<String, Object> userLogin(@RequestBody Map<String, Object> event,
                                     HttpServletRequest request);

Concrete/Impl Class

@Override
@Tracking("user.login")
public Map<String, Object> userLogin(@RequestBody Map<String, Object> event,
                                     HttpServletRequest request) {

    //Implementation

    return event;
}
Bus Event Packaging

Custom Interceptor class extends HandlerInterceptorAdapter

protected void handleTracking(String trackingCode, Map<String, Object> modelMap,
                              HttpServletRequest request) {

    Map<String, Object> responseModel = new HashMap<String, Object>();

    // remove non-serializables & copy over data from modelMap

    try {
        this.eventPublisher.publish(trackingCode, responseModel, request);
    } catch (Exception e) {
        // Tracking failures are logged and never propagate to the caller
        log.error("Error tracking event '" + trackingCode + "' : "
                  + ExceptionUtils.getStackTrace(e));
    }
}
Bus Event Packaging

Custom Interceptor class extends HandlerInterceptorAdapter

public void publish(String eventCode, Map<String, Object> eventData,
                    HttpServletRequest request) {

    Map<String, Object> payload = new HashMap<String, Object>();
    String eventId = UUID.randomUUID().toString();
    Map<String, String> requestMap = HttpRequestUtils.getRequestHeaders(request);

    //Normalize message
    payload.put("eventType", eventData.get("eventType"));
    payload.put("eventData", eventData.get("eventData"));
    payload.put("version", eventData.get("version"));
    payload.put("eventId", eventId);
    payload.put("eventTime", new Date());
    payload.put("request", requestMap);
    // ...
    //Send to the Analytics Service for MongoDB persistence
}

public void sendPost(EventPayload payload) {
    HttpEntity request = new HttpEntity(payload.getEventPayload(), headers);
    Map m = restTemplate.postForObject(endpoint, request, java.util.Map.class);
}
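
The normalization step can be illustrated outside of Spring. This is a rough sketch under stated assumptions, not the EventPublisher itself: the function name `build_payload` is hypothetical, the default version of "1.0" is assumed from the serialized examples in the deck, and eventTime is written as epoch milliseconds to match those examples.

```python
import uuid
from datetime import datetime, timezone


def build_payload(event_code, event_data, request_headers):
    """Wrap a raw event in the common envelope sent to the analytics service."""
    return {
        "eventCode": event_code,
        "eventType": event_data.get("eventType"),
        "eventData": event_data.get("eventData"),
        # Assumption: fall back to "1.0" when the caller sends no version.
        "version": event_data.get("version", "1.0"),
        "eventId": str(uuid.uuid4()),
        # Epoch milliseconds, taken at the moment the event is packaged.
        "eventTime": int(datetime.now(timezone.utc).timestamp() * 1000),
        "request": dict(request_headers),
    }
```

Every subscriber on the bus then sees the same envelope regardless of whether the event came from web or mobile.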
Bus Event Packaging

Serialized JSON (User Action)

{
  "eventCode" : "user.login",
  "eventType" : "login",
  "version"   : "1.0",
  "eventTime" : "1358603157746",
  "eventData" : {
      "" : "",
      "" : "",
      "" : ""
  },
  "request" : {
      "call-source"     : "WEB",
      "user-context"    : "00002b4f1150249206ac2b692e48ddb3",
      "user.agent"      : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
      "cookie"          : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
      "content-length"  : "204",
      "accept-encoding" : "gzip,deflate,sdch"
  }
}
Bus Event Packaging

Serialized JSON (Generic Event)

{
  "eventCode" : "generic.ui",
  "eventType" : "pageView",
  "version"   : "1.0",
  "eventTime" : "1358603157746",
  "eventData" : {
      "page"    : "/learnvest/moneycenter/inbox",
      "section" : "transactions",
      "name"    : "view transactions",
      "object"  : "page"
  },
  "request" : {
      "call-source"     : "WEB",
      "user-context"    : "00002b4f1150249206ac2b692e48ddb3",
      "user.agent"      : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
      "cookie"          : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
      "content-length"  : "204",
      "accept-encoding" : "gzip,deflate,sdch"
  }
}
MongoDB Data Warehousing

MongoDB Information
•  2.2.0
•  3-node replica-set
•  1x Large (primary), 2x Medium (secondary) AWS Amazon-Linux machines
•  … with single 500GB EBS volumes mounted to /opt/data

MongoDB Config File
dbpath = /opt/data/mongodb/data
rest = true
replSet = voyager

Volumes
•  ~… events daily on web, ~600K on mobile
•  ~…GB per day at start, slowed to ~1GB per day
•  Currently at 78GB (collecting since August 2012)

Future Scaling Strategy
•  …p 2nd Replica-Set
•  …d replica-sets to n at 60% / 250GB per EBS volume
•  Shard key probably based on sequential mix of email_address & additional string
MongoDB Data Warehousing

…MOBILE

Persist all events, bucketed by source, event code and time:-
•  WEB/MOBILE
•  user.login
•  time (day, week-ending, month, year)

Insert into collection e_web / e_mobile

Upsert into:-
•  e_web_user_login_day
•  e_web_user_login_week
•  e_web_user_login_month
•  e_web_user_login_year

Predictable model for scaling and measuring business growth
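
The collection names follow a regular pattern of source, event code and time dimension. The helper below is a small sketch of that naming rule as it appears in the deck; the function name `collection_names` and the exact rule (lower-case the source, replace dots in the event code with underscores) are inferred from the listed names, not taken from the original code.

```python
def collection_names(source: str, event_code: str) -> dict:
    """Derive the raw and aggregate collection names for one event."""
    base = "e_" + source.lower()          # e.g. "e_web" or "e_mobile"
    event = event_code.replace(".", "_")  # "user.login" -> "user_login"
    names = {"raw": base}
    for dimension in ("day", "week", "month", "year"):
        names[dimension] = f"{base}_{event}_{dimension}"
    return names
```

For example, `collection_names("WEB", "user.login")["day"]` gives `e_web_user_login_day`, so adding a new event code needs no schema work.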
MongoDB Data Warehousing

//Increment the count by one on every matching update
DBObject newDocument = new BasicDBObject().append("$inc",
                     new BasicDBObject().append("count", 1));

//Update day dimension (upsert = true, multi = false)
collection_day.update(new BasicDBObject().append("user-context", userContext)
               .append("eventType", eventType)
               .append("date", sdf_day.format(d)), newDocument, true, false);

//Update week dimension
collection_week.update(new BasicDBObject().append("user-context", userContext)
               .append("eventType", eventType)
               .append("date", sdf_day.format(w)), newDocument, true, false);

//Update month dimension
collection_month.update(new BasicDBObject().append("user-context", userContext)
               .append("eventType", eventType)
               .append("date", sdf_month.format(d)), newDocument, true, false);

//Update year dimension
collection_year.update(new BasicDBObject().append("user-context", userContext)
               .append("eventType", eventType)
               .append("date", sdf_year.format(d)), newDocument, true, false);
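
The effect of an upsert combined with `$inc` can be shown without a database. The class below is an in-memory stand-in written for illustration only; `CounterCollection` and its methods are hypothetical names and are not part of any MongoDB driver. With PyMongo 3 or later, the comparable call would be `collection.update_one(query, {"$inc": {"count": 1}}, upsert=True)`.

```python
from collections import Counter


class CounterCollection:
    """In-memory stand-in for one aggregate collection (not a MongoDB API)."""

    def __init__(self):
        self._counts = Counter()

    def increment(self, user_context, event_type, date_key):
        # Mirrors upsert + $inc: a missing key starts at 0, then gains 1.
        key = (user_context, event_type, date_key)
        self._counts[key] += 1
        return self._counts[key]

    def find(self, date_key):
        # Return one document per (user, event type) for the given date.
        return [
            {"user-context": u, "eventType": e, "date": d, "count": c}
            for (u, e, d), c in self._counts.items()
            if d == date_key
        ]
```

The first event for a user on a given date creates the document; every later event only bumps its count, which keeps the aggregate collections small.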
MongoDB Data Warehousing

…account_addManual_week
e_web_account_addManual_year
e_web_user_login_day
e_web_user_login_week
e_web_user_login_month
e_web_user_login_year
e_mobile_generic_ui_day
e_mobile_generic_ui_month
e_mobile_generic_ui_week
e_mobile_generic_ui_year

db.e_web_user_login_day.find()
{ "_id" : ObjectId("50e4b9871b36921910222c42"), "count" : 5, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
{ "_id" : ObjectId("50cd6cfcb9a80a2b4ee21422"), "count" : 7, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
{ "_id" : ObjectId("50cd6e51b9a80a2b4ee21427"), "count" : 2, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
{ "_id" : ObjectId("50e4b9871b36921910222c42"), "count" : 3, "date" : "01/03", "user-context" : "50e49a561b36921910222c33" }
MongoDB Data Warehousing

… "accept-charset" : "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
  "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
  "content-length" : "255",
  "accept-encoding" : "gzip,deflate,sdch" },
  "eventType" : "flick",
  "eventData" : { "object" : "…on", "name" : "split transaction button", "page" : "#inbox/79876/", "section" : "…saction_river_details" } }
MongoDB Data Warehousing

Indexing Strategy

•  Indexes on core collections (e_web and e_mobile) come in under 3GB, against 7.5GB of memory on the Large instance and 3.75GB on the Medium instances

•  … datetime in two fields and compound index on date with other fields like eventType and unique id (user-context)

•  Heavy insertion rates, much lower read rates, so the fewer indexes the better
MongoDB Data Warehousing

Indexing Strategy

db.e_web.getIndexes()
[
    {
        "v" : 1,
        "key" : { "request.user-context" : 1, "created_date" : 1 },
        "ns" : "moneycenter.e_web",
        "name" : "request.user-context_1_created_date_1"
    },
    {
        "v" : 1,
        "key" : { "eventData.name" : 1, "created_date" : 1 },
        "ns" : "moneycenter.e_web",
        "name" : "eventData.name_1_created_date_1"
    }
]
Loading & Visualization

Objective
•  Show historic and intraday stats on core use cases (logins, conversions)
•  Show user funnel rates on conversion pages
•  Show general usability - how do users really use the Web and iOS platforms?

Non-Functionals
•  Intraday doesn’t need to be “real-time”, polling is good enough for now
•  Overnight batch job for historic must scale horizontally

General Implementation Strategy
•  Do all heavy lifting & object manipulation, UI should just display graph or table
•  Modularize the service to be able to regenerate any graphs/tables without a full load
Loading & Visualization

Java Batch Service

Java Mongo library to query key collections and return user counts and sum of events

DBCursor webUserLogins = c.find(
    new BasicDBObject("date", sdf.format(new Date())));

private HashMap<String, Object> getSumAndCount(DBCursor cursor){
    HashMap<String, Object> m = new HashMap<String, Object>();

    int sum=0;    // total events across all users
    int count=0;  // number of user documents seen
    DBObject obj;
    while(cursor.hasNext()){
        obj=(DBObject)cursor.next();
        count++;
        sum=sum+(Integer)obj.get("count");
    }

    m.put("sum", sum);
    m.put("count", count);
    m.put("average", sdf.format(new Float(sum)/count));

    return m;
}
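
The same roll-up is easy to check against sample documents. The sketch below operates on plain dictionaries rather than a cursor; the function name `get_sum_and_count`, the rounding to two decimals, and the zero average for an empty input are assumptions made for the example.

```python
def get_sum_and_count(docs):
    """Return total events, number of user documents, and events per user."""
    total = 0
    users = 0
    for doc in docs:
        users += 1
        total += int(doc["count"])
    # Assumption: report 0.0 instead of failing when no documents match.
    average = round(total / users, 2) if users else 0.0
    return {"sum": total, "count": users, "average": average}
```

Here "count" is the number of per-user documents for the day, while "sum" adds up the per-user event counters those documents hold.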
Loading & Visualization

Java Batch Service

Use Aggregation Framework where required on core collections (e_web) and externa…

//create aggregation objects
DBObject project = new BasicDBObject("$project",
    new BasicDBObject("day_value", fields) );
DBObject day_value = new BasicDBObject( "day_value", "$day_value");
DBObject groupFields = new BasicDBObject( "_id", day_value);

//create the fields to group by, in this case "number"
groupFields.put("number", new BasicDBObject( "$sum", 1));

//create the group
DBObject group = new BasicDBObject("$group", groupFields);

//execute
AggregationOutput output = mycollection.aggregate( project, group );

for(DBObject obj : output.results()){
    // ...
}
Loading & Visualization

Java Batch Service

MongoDB Command Line example on aggregation over a time period, e.g. month

db.e_web.aggregate( [
    { $match : { created_date : { $gt : ISODate("2012-10-25T00:00:00") } } },
    { $project : {
        day_value : { day : { $dayOfMonth : "$created_date" },
                      month : { $month : "$created_date" } } } },
    { $group : {
        _id : { day_value : "$day_value" },
        number : { $sum : 1 } } },
    { $sort : { "_id.day_value" : -1 } }
] )
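
To make the pipeline stages concrete, the same match, project, group and sort steps can be mirrored over a list of in-memory events. This is an illustrative sketch only; the function name `events_per_day` and the tuple-based result format are assumptions and do not reflect the batch service's actual output.

```python
from collections import Counter
from datetime import datetime


def events_per_day(events, since):
    """Count events per (month, day) created after `since`, newest first."""
    counts = Counter()
    for event in events:
        created = event["created_date"]
        if created > since:                              # $match
            counts[(created.month, created.day)] += 1    # $project + $group
    return sorted(counts.items(), reverse=True)          # $sort descending
```

Running the grouping inside MongoDB avoids shipping every raw event to the batch service, which is the point of using the aggregation framework on e_web.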
Loading & Visualization

Java Batch Service

Persisting events into graph and table collections

db.homeGraphs.find()

{ "_id" : ObjectId("50f57b5c1d4e714b581674e2"), "accounts_natural" : 54,
  "accounts_total" : 54, "date" : ISODate("2011-02-06T05:00:00Z"), "linked_rate" : ….96,
  "premium_rate" : 0, "str_date" : "2011,01,06", "upgrade_rate" : 0,
  "users_avg_linked" : 3.43, "users_linked" : 7 }
{ "_id" : ObjectId("50f57b5c1d4e714b581674e3"), "accounts_natural" : 144,
  "accounts_total" : 144, "date" : ISODate("2011-02-07T05:00:00Z"), "linked_rate" : ….11,
  "premium_rate" : 0, "str_date" : "2011,01,07", "upgrade_rate" : 0,
  "users_avg_linked" : 4, "users_linked" : 16 }
{ "_id" : ObjectId("50f57b5c1d4e714b581674e4"), "accounts_natural" : 119,
  "accounts_total" : 119, "date" : ISODate("2011-02-08T05:00:00Z"), "linked_rate" : ….13,
  "premium_rate" : 0, "str_date" : "2011,01,08", "upgrade_rate" : 0,
  "users_avg_linked" : 4.5, "users_linked" : 18 }
Loading & Visualization

…day numbers

    try:
        # pymongo.Connection was the client class at the time; PyMongo 3+ uses MongoClient
        conn = pymongo.Connection('localhost', 27017)
        db = conn['lvanalytics']
        cursor = db.accountmetrics.find(
            {"date" : {"$gte" : dt_from, "$lte" : dt_to}}).sort("date")
        return buildMetricsDict(cursor)
    except Exception as e:
        logger.error(e.message)

Return the graph object (as a list or a dict of lists) to the view that called the method

    pagedata={}
    pagedata['accountsGraph']=mongodb_home.getHomeChart()

    return render_to_response('home.html',{'pagedata': pagedata},
        context_instance=RequestContext(request))
Loading & Visualization

Django and HighCharts

Populate the series.. (JavaScript with Django templating)

seriesOptions[0] = {
    id: 'naturalAccounts',
    name: "Natural Accounts",
    data: [
        {% for a in pagedata.metrics.accounts_natural %}
            {% if not forloop.first %}, {% endif %}
            [Date.UTC({{a.0}}),{{a.1}}]
        {% endfor %}
    ],
    tooltip: {
        valueDecimals: 2
    }
};
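
The template passes each point's date straight into JavaScript's `Date.UTC`, whose month argument is zero-based; that is why the stored str_date for 6 February 2011 reads 2011,01,06. A possible way to prepare such pairs on the Python side is sketched below. The function names `to_date_utc_args` and `build_series` are hypothetical, and the zero-padded format simply mirrors the str_date values shown in the deck.

```python
from datetime import date


def to_date_utc_args(d: date) -> str:
    """Format a date as 'year,month,day' arguments for JavaScript Date.UTC."""
    # Date.UTC counts months from 0, so February becomes 01.
    return f"{d.year},{d.month - 1:02d},{d.day:02d}"


def build_series(rows):
    """Turn (date, value) rows into [date-args, value] pairs for the template."""
    return [[to_date_utc_args(d), value] for d, value in rows]
```

Doing this conversion in the view keeps the template free of date arithmetic, in line with the strategy of leaving the UI to display only.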
Loading & Visualization

Django and HighCharts

And Create the Charts and Tables...
Lessons Learned

•  Date Time managed as two fields, Datetime and Date

•  Aggregating and upserting documents as events are received works for us

•  Real-time Map-Reduce in pyMongo - too slow, don’t do this.

•  Django-nonrel - unstable, use Django and configure MongoDB as a datastore only

•  Memcached on Django is good enough (at the moment) - use django-celery with rabbitmq to pre-cache all data after data loading

•  HighCharts is buggy - considering D3 & other libraries

•  Don’t need to retrieve data directly from MongoDB to Django, perhaps provide all data via a service layer (at the expense of ever-additional features in pyMongo)
Next Steps

•  A/B testing framework, experiments and variances

•  Unauthenticated / Authenticated user tracking

•  Provide data async over service layer

•  Segmentation with graphical libraries like D3 & Cross-Filter (http://square.github.com/crossfilter/)

•  Saving Query Criteria, expanding out BI tools for internal users

•  MongoDB Connector, Hadoop and Hive (maybe Tableau and other tools)

•  Storm / Kafka for real-time analytics processing

•  Shard the Replica-Set, looking into Gizzard as the middleware
Hrishi Dixit
Chief Technology Officer
hrishi@learnvest.com

Kevin Connelly
Director of Engineering
kevin@learnvest.com

Will Larche
Lead iOS Developer
will@learnvest.com

Cameron Sim
Director of Analytics Tech
cameron@learnvest.com

Jeremy Brennan
Director of UI/UX Technology
jeremy@learnvest.com

your name here
New Awesome Develope…
you@learnvest.com

HIR…

Implementing and Visualizing Clickstream data with MongoDB

  • 1. Implementing and Visualizing Click-Stream Data with MongoDB. Jan 22, 2013, New York MongoDB User Group. Cameron Sim, LearnVest.com
  • 2. Agenda: About LearnVest; High-Level Application Architecture; Data Capture; Event Packaging; MongoDB Data Warehousing; Loading & Visualization; Finishing up
  • 3. LearnVest Inc. (www.learnvest.com)
      Mission statement: aiming to make financial planning as accessible as having a gym membership.
      Company: founded in 2008 by Alexa Von Tobel, CEO; 50+ people and growing rapidly; based in NYC.
      Key products: account aggregation and management (bank, credit, loan, investment, mortgage); original and syndicated newsletter co…; financial planning (tiered product offering).
      Platforms: web and iPhone.
      Stack (operational): WordPress, Backbone.js, Node.js, Java Spring 3, Redis, Memcached, …
      Stack (analytics): MongoDB 2.2.0 (3-node replica set), Java 6, Spring 3.
  • 5. LearnVest.com: iPhone
  • 6. High-Level Architecture (diagram). Production side: Delivery, Services. Analytics side: Services, Loaders, Dashboards. Links labeled HTTPS and pyMongo.
  • 7. Collection: Capture Everything
      Capture everything: user-driven events over web and mobile; system-level exceptions; everything else.
      Temporary data: be 'ok' with approximate data; operational databases are the system of record.
      Aggregate events as they come in: remove the overhead of basic metrics (counts, sums) on core events; group by user unique id and increment counts per event, over time dimensions (day, week-ending, month, year).
  • 8. Data Capture: iOS
      - (void)sendAnalyticEventType:(NSString *)eventType object:(NSString *)object name:(NSString *)name page:(NSString *)page source:(NSString *)source;
      NSMutableDictionary *eventData = [NSMutableDictionary dictionary];
      if (eventType != nil) [params setObject:eventType forKey:@"eventType"];
      if (object != nil) [eventData setObject:object forKey:@"object"];
      if (name != nil) [eventData setObject:name forKey:@"name"];
      if (page != nil) [eventData setObject:page forKey:@"page"];
      if (source != nil) [eventData setObject:source forKey:@"source"];
      if (eventData != nil) [params setObject:eventData forKey:@"eventData"];
      [[LVNetworkEngine sharedManager] analytics_send:params];
  • 9. Data Capture: Web (JavaScript)
      function internalTrackPageView() {
        var cookie = { userContext: jQuery.cookie('UserContextCookie') };
        var trackEvent = {
          eventType: "pageView",
          eventData: { page: window.location.pathname + window.location.search }
        };
        // AJAX
        jQuery.ajax({
          url: "/api/track",
          type: "POST",
          dataType: "json",
          data: JSON.stringify(trackEvent),
          // Set request headers
          beforeSend: function (xhr, settings) {
            xhr.setRequestHeader('Accept', 'application/json');
            xhr.setRequestHeader('User-Context', cookie.userContext);
            if (settings.type === 'PUT' || settings.type === 'POST') {
              xhr.setRequestHeader('Content-Type', 'application/json');
            }
          }
        });
      }
  • 10. Bus Event Packaging
      Spring 3 RESTful service layer: controller methods define the eventCode via the @tracking annotation.
      A custom interceptor class extends HandlerInterceptorAdapter and implements …Handle() (for each event) to invoke calls via Spring @Async to an EventPublisher.
      The EventPublisher publishes to a common event bus queue with multiple subscribers, one of which packages the eventPayload Map<String, Object> object and forwards it to the Analytics REST service.
  • 11. Bus Event Packaging: Spring RestController Methods
      Interface:
      @RequestMapping(value = "/user/login", method = RequestMethod.POST, headers = "Accept=application/json")
      public Map<String, Object> userLogin(@RequestBody Map<String, Object> event, HttpServletRequest request);
      Concrete/Impl class:
      @Override
      @tracking("user.login")
      public Map<String, Object> userLogin(@RequestBody Map<String, Object> event, HttpServletRequest request) {
          // Implementation
          return event;
      }
  • 12. Bus Event Packaging: custom interceptor class extends HandlerInterceptorAdapter
      protected void handleTracking(String trackingCode, Map<String, Object> modelMap, HttpServletRequest request) {
          Map<String, Object> responseModel = new HashMap<String, Object>();
          // remove non-serializables & copy over data from modelMap
          try {
              this.eventPublisher.publish(trackingCode, responseModel, request);
          } catch (Exception e) {
              log.error("Error tracking event '" + trackingCode + "' : " + ExceptionUtils.getStackTrace(e));
          }
      }
  • 13. Bus Event Packaging: custom interceptor class extends HandlerInterceptorAdapter
      public void publish(String eventCode, Map<String, Object> eventData, HttpServletRequest request) {
          Map<String, Object> payload = new HashMap<String, Object>();
          String eventId = UUID.randomUUID().toString();
          Map<String, String> requestMap = HttpRequestUtils.getRequestHeaders(request);
          // Normalize message
          payload.put("eventType", eventData.get("eventType"));
          payload.put("eventData", eventData.get("eventData"));
          payload.put("version", eventData.get("version"));
          payload.put("eventId", eventId);
          payload.put("eventTime", new Date());
          payload.put("request", requestMap);
          . . .
      // Send to the Analytics Service for MongoDB persistence
      public void sendPost(EventPayload payload) {
          HttpEntity request = new HttpEntity(payload.getEventPayload(), headers);
          Map m = restTemplate.postForObject(endpoint, request, java.util.Map.class);
  • 14. Bus Event Packaging: Serialized JSON (User Action)
      {
        "eventCode" : "user.login",
        "eventType" : "login",
        "version" : "1.0",
        "eventTime" : "1358603157746",
        "eventData" : { "" : "", "" : "", "" : "" },
        "request" : {
          "call-source" : "WEB",
          "user-context" : "00002b4f1150249206ac2b692e48ddb3",
          "user.agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
          "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
          "content-length" : "204",
          "accept-encoding" : "gzip,deflate,sdch"
        }
      }
  • 15. Bus Event Packaging: Serialized JSON (Generic Event)
      {
        "eventCode" : "generic.ui",
        "eventType" : "pageView",
        "version" : "1.0",
        "eventTime" : "1358603157746",
        "eventData" : {
          "page" : "/learnvest/moneycenter/inbox",
          "section" : "transactions",
          "name" : "view transactions",
          "object" : "page"
        },
        "request" : {
          "call-source" : "WEB",
          "user-context" : "00002b4f1150249206ac2b692e48ddb3",
          "user.agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
          "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
          "content-length" : "204",
          "accept-encoding" : "gzip,deflate,sdch"
        }
      }
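The envelope shared by the two serialized events can be sketched in Python as a rough illustration only (the deck's own publisher is Java). The function name `build_payload`, the `now_ms` parameter and the default version "1.0" are assumptions made for this example; the field names come from the serialized JSON on the slides.

```python
import json
import time
import uuid


def build_payload(event_code, event, request_headers, now_ms=None):
    """Wrap a raw tracking event in the envelope shown on the slides.

    Hypothetical helper: name, signature and defaults are assumptions.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # epoch milliseconds, as in "eventTime"
    return {
        "eventCode": event_code,
        "eventType": event.get("eventType"),
        "version": event.get("version", "1.0"),   # assumed default
        "eventId": str(uuid.uuid4()),
        "eventTime": str(now_ms),                  # slides serialize it as a string
        "eventData": event.get("eventData", {}),
        "request": dict(request_headers),
    }


payload = build_payload(
    "generic.ui",
    {"eventType": "pageView", "eventData": {"page": "/learnvest/moneycenter/inbox"}},
    {"call-source": "WEB", "user-context": "00002b4f1150249206ac2b692e48ddb3"},
    now_ms=1358603157746,
)
print(json.dumps(payload, indent=2))
```

Passing `now_ms` explicitly keeps the output reproducible in tests; in normal use it would be left to default to the current time.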
  • 16. MongoDB Data Warehousing
      MongoDB information: version 2.2.0; 3-node replica set; 1x Large (primary) and 2x Medium (secondary) AWS Amazon Linux machines, each with a single 500GB EBS volume mounted to /opt/data.
      MongoDB config file: dbpath = /opt/data/mongodb/data; rest = true; replSet = voyager
      Volumes: … events daily on web, ~600K on mobile; …GB per day at start, slowed to ~1GB per day; currently at 78GB (collecting since August 2012).
      Future scaling strategy: set up a 2nd replica set; shard replica sets to n at 60% / 250GB per EBS volume; shard key probably based on a sequential mix of email_address & an additional string.
  • 17. MongoDB Data Warehousing: WEB / MOBILE
      Persist all events, bucketed by source, event code and time:
        source: WEB/MOBILE
        event code: user.login
        time (day, week-ending, month, year)
      Insert into collection e_web / e_mobile.
      Upsert into: e_web_user_login_day, e_web_user_login_week, e_web_user_login_month, e_web_user_login_year.
      Predictable model for scaling and measuring business growth.
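As a minimal sketch of the bucketing scheme, the following Python derives the four counter collection names and their date keys for one event. The helper names are hypothetical. The "%m/%d" day format is inferred from the sample documents in the deck (date "01/02"); the month and year formats and the choice of Sunday as the week-ending day are assumptions.

```python
from datetime import date, timedelta


def week_ending(day):
    """Return the Sunday closing the week that contains `day` (assumed convention)."""
    return day + timedelta(days=6 - day.weekday())  # Monday is weekday 0


def bucket_targets(source, event_code, day):
    """Map one event to {collection name: date key} for each time dimension."""
    # e.g. source "WEB" and event code "user.login" give prefix "e_web_user_login"
    prefix = "e_{}_{}".format(source.lower(), event_code.replace(".", "_"))
    return {
        prefix + "_day": day.strftime("%m/%d"),
        prefix + "_week": week_ending(day).strftime("%m/%d"),
        prefix + "_month": day.strftime("%Y/%m"),   # assumed format
        prefix + "_year": day.strftime("%Y"),       # assumed format
    }


targets = bucket_targets("WEB", "user.login", date(2013, 1, 22))
for collection, key in targets.items():
    print(collection, key)
```

Because the names are derived purely from source and event code, adding a new event type needs no schema change, which is what makes the model predictable to scale.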
  • 18. MongoDB Data Warehousing
      DBObject newDocument = new BasicDBObject().append("$inc", new BasicDBObject().append("count", 1));
      // Update day dimension
      collection_day.update(new BasicDBObject().append("user-context", userContext)
          .append("eventType", eventType)
          .append("date", sdf_day.format(d)), newDocument, true, false);
      // Update week dimension
      collection_week.update(new BasicDBObject().append("user-context", userContext)
          .append("eventType", eventType)
          .append("date", sdf_day.format(w)), newDocument, true, false);
      // Update month dimension
      collection_month.update(new BasicDBObject().append("user-context", userContext)
          .append("eventType", eventType)
          .append("date", sdf_month.format(d)), newDocument, true, false);
      // Update year dimension
      collection_year.update(new BasicDBObject().append("user-context", userContext)
          .append("eventType", eventType)
          .append("date", sdf_year.format(d)), newDocument, true, false);
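In the Java driver call above, the two trailing booleans are `upsert` (true) and `multi` (false): a matching counter document is incremented, and one is created if none matches. The Python below is an in-memory stand-in for that behavior, not a MongoDB client; `CounterCollection` and `record_event` are hypothetical names and the date keys are example values.

```python
class CounterCollection:
    """In-memory stand-in for one counter collection (e.g. e_web_user_login_day)."""

    def __init__(self):
        self.docs = {}

    def upsert_inc(self, user_context, event_type, date_key, amount=1):
        """Mimic an upsert with {"$inc": {"count": amount}}."""
        key = (user_context, event_type, date_key)
        # Upsert semantics: start from zero when nothing matches the query,
        # then apply the increment.
        self.docs[key] = self.docs.get(key, 0) + amount
        return self.docs[key]


def record_event(collections, user_context, event_type, date_keys):
    """Apply one incoming event to every time dimension."""
    for dimension, date_key in date_keys.items():
        collections[dimension].upsert_inc(user_context, event_type, date_key)


collections = {d: CounterCollection() for d in ("day", "week", "month", "year")}
# Example keys only; the real formats come from the service's date formatters.
keys = {"day": "01/22", "week": "01/27", "month": "2013/01", "year": "2013"}
user = "c4ca4238a0b923820dcc509a6f75849b"
for _ in range(3):
    record_event(collections, user, "login", keys)
print(collections["day"].docs)
```

Each event costs four small writes, but reads for basic counts become single-document lookups instead of scans over the raw event collections.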
  • 19. MongoDB Data Warehousing
      Collections (list cut off on the slide): e_web_account_addManual_week, e_web_account_addManual_year, e_web_user_login_day, e_web_user_login_week, e_web_user_login_month, e_web_user_login_year, e_mobile_generic_ui_day, e_mobile_generic_ui_month, e_mobile_generic_ui_week, e_mobile_generic_ui_year
      db.e_web_user_login_day.find()
      { "_id" : ObjectId("50e4b9871b36921910222c42"), "count" : 5, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
      { "_id" : ObjectId("50cd6cfcb9a80a2b4ee21422"), "count" : 7, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
      { "_id" : ObjectId("50cd6e51b9a80a2b4ee21427"), "count" : 2, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
      { "_id" : ObjectId("50e4b9871b36921910222c42"), "count" : 3, "date" : "01/03", "user-context" : "50e49a561b36921910222c33" }
  • 20. MongoDB Data Warehousing (tail of a stored event document; the start is cut off on the slide)
      … "accept-charset" : "ISO-8859-1,utf-8;q=0.7,*;q=0.3", "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8", "content-length" : "255", "accept-encoding" : "gzip,deflate,sdch" }, "eventType" : "flick", "eventData" : { "object" : "…on", "name" : "split transaction button", "page" : "#inbox/79876/", "section" : "transaction_river_details" } }
  • 21. MongoDB Data Warehousing: Indexing Strategy
      Indexes on core collections (e_web and e_mobile) come in under 3GB on the 7.5GB Large instance and 3.75GB Medium instances.
      Keep datetime in two fields, and use a compound index on date with other fields like eventType and the user unique id (user-context).
      Heavy insertion rates, much lower read rates, so the fewer indexes the better.
  • 22. MongoDB Data Warehousing: Indexing Strategy
      db.e_web.getIndexes()
      [
        { "v" : 1, "key" : { "request.user-context" : 1, "created_date" : 1 }, "ns" : "moneycenter.e_web", "name" : "request.user-context_1_created_date_1" },
        { "v" : 1, "key" : { "eventData.name" : 1, "created_date" : 1 }, "ns" : "moneycenter.e_web", "name" : "eventData.name_1_created_date_1" }
      ]
  • 23. Loading & Visualization
      Objective: show historic and intraday stats on core use cases (logins, conversions); show user funnel rates on conversion pages; show general usability: how do users really use the web and iOS platforms?
      Non-functionals: intraday doesn't need to be "real-time", polling is good enough for now; the overnight batch job for historic data must scale horizontally.
      General implementation strategy: do all heavy lifting & object manipulation up front, the UI should just display a graph or table; modularize the service to be able to regenerate any graphs/tables without a full load.
  • 24. Loading & Visualization: Java Batch Service
      Use the Java Mongo library to query key collections and return user counts and sum of events.
      DBCursor webUserLogins = c.find(new BasicDBObject("date", sdf.format(new Date())));
      private HashMap<String, Object> getSumAndCount(DBCursor cursor) {
          HashMap<String, Object> m = new HashMap<String, Object>();
          int sum = 0;
          int count = 0;
          DBObject obj;
          while (cursor.hasNext()) {
              obj = (DBObject) cursor.next();
              count++;
              sum = sum + (Integer) obj.get("count");
          }
          m.put("sum", sum);
          m.put("count", count);
          m.put("average", sdf.format(new Float(sum) / count));
          return m;
      }
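The same roll-up can be sketched in Python over plain dictionaries shaped like the counter documents shown earlier in the deck. This is an illustration under assumptions, not the batch service itself: `sum_and_count` is a hypothetical name, and the guard against an empty result set is an addition for the example.

```python
def sum_and_count(docs):
    """Return the number of users, total events and average events per user.

    Each document is one user's counter for the period, so the number of
    documents is the user count and the sum of "count" is the event total.
    """
    users = 0
    total = 0
    for doc in docs:
        users += 1
        total += int(doc["count"])
    # Assumed behavior: report 0.0 rather than dividing by zero on an empty period.
    average = round(total / users, 2) if users else 0.0
    return {"sum": total, "count": users, "average": average}


# Counts taken from the sample e_web_user_login_day documents dated 01/02.
day_docs = [{"count": 5}, {"count": 7}, {"count": 2}]
stats = sum_and_count(day_docs)
print(stats)
```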
  • 25. Loading & Visualization: Java Batch Service
      Use the Aggregation Framework where required on core collections (e_web) and externa…
      // Create aggregation objects
      DBObject project = new BasicDBObject("$project", new BasicDBObject("day_value", fields));
      DBObject day_value = new BasicDBObject("day_value", "$day_value");
      DBObject groupFields = new BasicDBObject("_id", day_value);
      // Create the fields to group by, in this case "number"
      groupFields.put("number", new BasicDBObject("$sum", 1));
      // Create the group
      DBObject group = new BasicDBObject("$group", groupFields);
      // Execute
      AggregationOutput output = mycollection.aggregate(project, group);
      for (DBObject obj : output.results()) {
  • 26. Loading & Visualization: Java Batch Service
      MongoDB command-line example of aggregation over a time period, e.g. month:
      db.e_web.aggregate([
        { $match : { created_date : { $gt : ISODate("2012-10-25T00:00:00Z") } } },
        { $project : { day_value : { day : { $dayOfMonth : "$created_date" }, month : { $month : "$created_date" } } } },
        { $group : { _id : { day_value : "$day_value" }, number : { $sum : 1 } } },
        { $sort : { "_id.day_value" : -1 } }
      ])
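To make the stages concrete, here is a pure-Python equivalent of the match, project, group and sort steps, run over an in-memory list rather than MongoDB. The function name `events_per_day` and the sample events are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def events_per_day(events, since):
    """Count events per (day of month, month) created after `since`."""
    counts = Counter(
        (e["created_date"].day, e["created_date"].month)   # $project: day and month
        for e in events
        if e["created_date"] > since                       # $match: created_date > since
    )
    # $group has already happened inside Counter; sort descending on the group key.
    return [
        {"_id": {"day": day, "month": month}, "number": number}
        for (day, month), number in sorted(counts.items(), reverse=True)
    ]


# Hypothetical sample events.
sample = [
    {"created_date": datetime(2012, 10, 24, 9, 0)},
    {"created_date": datetime(2012, 10, 26, 9, 0)},
    {"created_date": datetime(2012, 10, 26, 17, 30)},
    {"created_date": datetime(2012, 11, 1, 8, 15)},
]
result = events_per_day(sample, datetime(2012, 10, 25))
for row in result:
    print(row)
```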
  • 27. Loading & Visualization: Java Batch Service
      Persisting events into graph and table collections:
      db.homeGraphs.find()
      { "_id" : ObjectId("50f57b5c1d4e714b581674e2"), "accounts_natural" : 54, "accounts_total" : 54, "date" : ISODate("2011-02-06T05:00:00Z"), "linked_rate" : ….96, "premium_rate" : 0, "str_date" : "2011,01,06", "upgrade_rate" : 0, "users_avg_linked" : 3.43, "users_linked" : 7 }
      { "_id" : ObjectId("50f57b5c1d4e714b581674e3"), "accounts_natural" : 144, "accounts_total" : 144, "date" : ISODate("2011-02-07T05:00:00Z"), "linked_rate" : ….11, "premium_rate" : 0, "str_date" : "2011,01,07", "upgrade_rate" : 0, "users_avg_linked" : 4, "users_linked" : 16 }
      { "_id" : ObjectId("50f57b5c1d4e714b581674e4"), "accounts_natural" : 119, "accounts_total" : 119, "date" : ISODate("2011-02-08T05:00:00Z"), "linked_rate" : ….13, "premium_rate" : 0, "str_date" : "2011,01,08", "upgrade_rate" : 0, "users_avg_linked" : 4.5, "users_linked" : 18 }
  • 28. Loading & Visualization
      # …day numbers
      try:
          conn = pymongo.Connection('localhost', 27017)
          db = conn['lvanalytics']
          cursor = db.accountmetrics.find({"date": {"$gte": dt_from, "$lte": dt_to}}).sort("date")
          return buildMetricsDict(cursor)
      except Exception as e:
          logger.error(e.message)
      # Return the graph object (as a list or a dict of lists) to the view that called the method
      pagedata = {}
      pagedata['accountsGraph'] = mongodb_home.getHomeChart()
      return render_to_response('home.html', {'pagedata': pagedata}, context_instance=RequestContext(request))
  • 29. Loading & Visualization: Django and HighCharts
      Populate the series (JavaScript with Django templating):
      seriesOptions[0] = {
        id: 'naturalAccounts',
        name: "Natural Accounts",
        data: [
          {% for a in pagedata.metrics.accounts_natural %}
            {% if not forloop.first %},{% endif %}
            [Date.UTC({{a.0}}),{{a.1}}]
          {% endfor %}
        ],
        tooltip: { valueDecimals: 2 }
      };
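JavaScript's `Date.UTC(year, month, day)` takes a zero-based month, which would explain why the sample `homeGraphs` document dated 2011-02-06 carries `str_date` "2011,01,06". A minimal sketch of preparing such pairs on the Python side follows; `date_utc_args` and `series_points` are hypothetical helpers, not functions from the talk.

```python
from datetime import datetime


def date_utc_args(d):
    """Format a date as the argument list for JavaScript's Date.UTC.

    The month is shifted down by one because Date.UTC counts months from 0.
    Zero padding mirrors the str_date values in the sample documents.
    """
    return "{:04d},{:02d},{:02d}".format(d.year, d.month - 1, d.day)


def series_points(rows, field):
    """Build (Date.UTC args, value) pairs, the shape the template reads as a.0 and a.1."""
    return [(date_utc_args(row["date"]), row[field]) for row in rows]


# Values copied from the sample homeGraphs documents.
rows = [
    {"date": datetime(2011, 2, 6, 5, 0), "accounts_natural": 54},
    {"date": datetime(2011, 2, 7, 5, 0), "accounts_natural": 144},
]
points = series_points(rows, "accounts_natural")
print(points)
```

Doing this conversion in the loader keeps the template free of date arithmetic, in line with the deck's rule that the UI should only display data.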
  • 30. Loading & Visualization: Django and HighCharts, and create the charts and tables...
  • 31. Loading & Visualization: Django and HighCharts, and create the charts and tables...
  • 32. Lessons Learned
      • Date and time are managed as two fields, Datetime and Date
      • Aggregating and upserting documents as events are received works for us
      • Real-time Map-Reduce in pyMongo: too slow, don't do this
      • Django-noRel: unstable, use Django and configure MongoDB as a datastore only
      • Memcached on Django is good enough (at the moment); use django-celery with RabbitMQ to pre-cache all data after data loading
      • HighCharts is buggy; considering D3 & other libraries
      • Don't need to retrieve data directly from MongoDB to Django; perhaps provide all data via a service layer (at the expense of ever-additional features in pyMongo)
  • 33. Next Steps
      • A/B testing framework, experiments and variances
      • Unauthenticated / authenticated user tracking
      • Provide data async over a service layer
      • Segmentation with graphical libraries like D3 & Crossfilter (http://square.github.com/crossfilter/)
      • Saving query criteria, expanding out BI tools for internal users
      • MongoDB Connector, Hadoop and Hive (maybe Tableau and other tools)
      • Storm / Kafka for real-time analytics processing
      • Shard the replica set, looking into Gizzard as the middleware
  • 34. Team: Hrishi Dixit, Chief Technology Officer; Kevin Connelly, Director of Engineering; Will Larche, Lead iOS Developer; Cameron Sim, Director of Analytics Tech; Jeremy Brennan, Director of UI/UX Technology; "your name here", New Awesome Developer (hiring)