I presented these slides at the 2011 Game Developers Conference.
Social game and entertainment company IMVU built a real-time lightweight networked messaging back-end suitable for chat and social gaming. Here's how we did it!
2. Presentation Overview
- Describe the problem
  - Low-latency game messaging and state distribution
- Survey available solutions
  - Quick mention of also-rans
- Dive into implementation
  - Erlang!
- Discuss gotchas
- Speculate about the future
5. What Do We Want?
- Any-to-any messaging with ad-hoc structure
  - Chat; Events; Input/Control
- Lightweight (in-RAM) state maintenance
  - Scores; Dice; Equipment
6. New Building Blocks
- Queues provide a sane view of distributed state for developers building games
- Two kinds of messaging:
  - Events (edge triggered, “messages”)
  - State (level triggered, “updates”)
- Integrated into a bigger system
7. From Long-poll to Real-time
(Architecture diagram. Labels, in extraction order: Caching, Web Servers, Load Balancers, Databases, Caching, Long Poll, Load Balancers, Game Servers, Connection Gateways, Message Queues; callout: “Today’s Talk”.)
9. Performance Requirements
- Simultaneous user count:
  - 80,000 when we started
  - 150,000 today
  - 1,000,000 design goal
- Real-time performance (the main driving requirement)
  - Lower than 100 ms end-to-end through the system
- Queue creates and join/leaves (kill a lot of contenders)
  - >500,000 creates/day when started
  - >20,000,000 creates/day design goal
10. Also-rans: Existing Wheels
- AMQP, JMS: Qpid, Rabbit, ZeroMQ, BEA, IBM etc.
  - Poor user and authentication model
  - Expensive queues
- IRC
  - Spanning tree; netsplits; no state
- XMPP / Jabber
  - Protocol doesn’t scale in federation
- Gtalk, AIM, MSN Msgr, Yahoo Msgr
  - If only we could buy one of these!
11. Our Wheel is Rounder!
- Inspired by the 1,000,000-user mochiweb app
  - http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1
- A purpose-built general system
- Written in Erlang
14. The Journey of a Message
(Diagram. Component labels: Gateway for User, Gateway, Queue Node, Queue Process. Example message in queue /room/123: Mount: chat; Data: Hello, World! Step labels: Validation; Find node for /room/123; Find queue /room/123; List of subscribers; Forward message.)
15. Anatomy of a Queue
- Queue Name: /room/123
- Mount
  - Type: message
  - Name: chat
  - User A: I win.
  - User B: OMG Pwnies!
  - User A: Take that!
  - …
- Subscriber List
  - User A @ Gateway C
  - User B @ Gateway B
- Mount
  - Type: state
  - Name: scores
  - User A: 3220
  - User B: 1200
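The queue layout on this slide can be sketched as a small data structure. This is only an illustration in Python, not the Erlang implementation: the class names, the field names, and the bounded chat history (with its limit of 50) are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MessageMount:
    """Edge-triggered mount: every publish is a separate event."""
    name: str
    # Assumption: a short bounded history is kept; the limit is illustrative.
    history: deque = field(default_factory=lambda: deque(maxlen=50))

    def publish(self, sender: str, data: str) -> None:
        self.history.append((sender, data))


@dataclass
class StateMount:
    """Level-triggered mount: only the latest value per key matters."""
    name: str
    values: dict = field(default_factory=dict)

    def update(self, key: str, value) -> None:
        self.values[key] = value


@dataclass
class Queue:
    name: str
    mounts: dict = field(default_factory=dict)       # mount name -> mount
    subscribers: dict = field(default_factory=dict)  # user -> gateway


room = Queue("/room/123")
room.mounts["chat"] = MessageMount("chat")
room.mounts["scores"] = StateMount("scores")
room.subscribers.update({"User A": "Gateway C", "User B": "Gateway B"})

room.mounts["chat"].publish("User A", "I win.")
room.mounts["chat"].publish("User B", "OMG Pwnies!")
room.mounts["scores"].update("User A", 3220)
room.mounts["scores"].update("User B", 1200)
room.mounts["scores"].update("User A", 3300)  # overwrites; no backlog of old scores
```

The point of the two mount types is visible in the last line: a state update replaces the previous value, while each chat message remains a distinct event.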
16. A Single Machine Isn’t Enough
- 1,000,000 users, 1 machine?
- 25 GB/s memory bus
- 40 GB memory (40 kB/user)
- Touched twice per message
- one message per is 3,400 ms
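The arithmetic behind this slide can be reproduced as a quick check. The sketch below assumes the last bullet means one message for every user, so that all 40 GB of per-user state is touched twice over a 25 GB/s memory bus; that reading is an interpretation, not something the slide states.

```python
def sweep_ms(memory_bytes: int, touches: int, bus_bytes_per_s: float) -> float:
    """Time to move `memory_bytes` over the memory bus `touches` times, in ms."""
    return memory_bytes * touches / bus_bytes_per_s * 1000.0


BUS_BYTES_PER_S = 25e9  # 25 GB/s memory bus
TOUCHES = 2             # state is touched twice per message

decimal_ms = sweep_ms(40 * 10**9, TOUCHES, BUS_BYTES_PER_S)  # 40 GB as 40e9 bytes
binary_ms = sweep_ms(40 * 2**30, TOUCHES, BUS_BYTES_PER_S)   # 40 GB as 40 GiB

print(f"decimal: {decimal_ms:.0f} ms, binary: {binary_ms:.0f} ms")
```

Under these assumptions the result is about 3,200 ms with decimal gigabytes and about 3,436 ms with binary gigabytes. Either way it is more than 30 times the 100 ms end-to-end budget, which is the argument for spreading queues over many machines.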
18. Consistent Hashing
- The Gateway maps queue name -> node
- This is done using a fixed hash function
- A prefix of the output bits of the hash function is used as a look-up into a table, with a minimum of 8 buckets per node
- Load differential is 8:9 or better (down to 15:16)
- Updating the map of buckets -> nodes is managed centrally
(Diagram: Hash(“/room/123”) = 0xaf5… looked up in a table of buckets owned by Node A, Node B, Node C, Node D, Node E, Node F.)
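A minimal sketch of the look-up described on this slide, in Python rather than Erlang. The slides do not name the hash function, so SHA-1 is an assumption here, as are the node names, the 32-bit truncation, and the round-robin initial assignment of buckets to nodes.

```python
import hashlib
from collections import Counter

HASH_BITS = 32  # only the first 32 bits of the digest are used in this sketch


def bucket_for(queue_name: str, prefix_bits: int) -> int:
    """Use the top `prefix_bits` bits of a fixed hash as the bucket index."""
    digest = hashlib.sha1(queue_name.encode("utf-8")).digest()
    value = int.from_bytes(digest[:4], "big")
    return value >> (HASH_BITS - prefix_bits)


def build_table(nodes, prefix_bits):
    """Assign 2**prefix_bits buckets to nodes round-robin (illustrative only)."""
    return [nodes[i % len(nodes)] for i in range(1 << prefix_bits)]


def node_for(queue_name, table, prefix_bits):
    """Map a queue name to the node that owns its bucket."""
    return table[bucket_for(queue_name, prefix_bits)]


NODES = ["node_a", "node_b", "node_c", "node_d", "node_e", "node_f"]
PREFIX_BITS = 6  # 64 buckets, so each of the 6 nodes holds at least 8
TABLE = build_table(NODES, PREFIX_BITS)

print(node_for("/room/123", TABLE, PREFIX_BITS))
print(Counter(TABLE))
```

With 64 buckets over six nodes, each node owns 10 or 11 buckets. Because every gateway uses the same hash function and the same table, any gateway resolves a given queue name to the same node without asking a central service per message.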
19. Consistent Hash Table Update
- Minimizes amount of traffic moved
- If nodes have more than 8 buckets, steal 1/N of all buckets from those with the most and assign to new target
- If not, split each bucket, then steal 1/N of all buckets and assign to new target
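The update rule above might look roughly like the following Python sketch. It reads "nodes have more than 8 buckets" as "every existing node owns more than 8 buckets", takes N as the node count after the new node joins, and picks which bucket to steal arbitrarily (the highest index owned by the donor). All three choices are assumptions for illustration.

```python
from collections import Counter


def split_buckets(table):
    """Double the table: old bucket b becomes buckets 2b and 2b+1, same owner.

    This matches using one more bit of the hash prefix as the index.
    """
    return [owner for owner in table for _ in (0, 1)]


def add_node(table, new_node, min_buckets=8):
    """Return a new bucket table in which `new_node` owns about 1/N of the buckets."""
    counts = Counter(table)
    if min(counts.values()) > min_buckets:
        table = list(table)
    else:
        table = split_buckets(table)
        counts = Counter(table)
    n_nodes = len(counts) + 1           # N includes the new node
    to_steal = len(table) // n_nodes
    for _ in range(to_steal):
        # Always take from whichever existing node currently owns the most.
        donor = max(counts, key=lambda node: (counts[node], node))
        idx = max(i for i, owner in enumerate(table) if owner == donor)
        table[idx] = new_node
        counts[donor] -= 1
    return table


old = ["a"] * 8 + ["b"] * 8   # two nodes at the 8-bucket minimum
new = add_node(old, "c")      # forces a split to 32 buckets, then steals 10
print(Counter(new))
```

Only the stolen buckets change owner, so queues in every other bucket stay on the node they were already on. That is the sense in which the update minimizes the traffic moved.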
20. Erlang
- Developed in ‘80s by Ericsson for phone switches
  - Reliability, scalability, and communications
- Prolog-based functional syntax (no braces!)
  - 25% the code of equivalent C++
- Parallel Communicating Processes
  - Erlang processes much cheaper than C++ threads
- (Almost) No Mutable Data
  - No data race conditions
  - Each process separately garbage collected
28. Monitoring
- Example counters:
  - Number of connected users
  - Number of queues
  - Messages routed per second
  - Round trip time for routed messages
    - Distributed clock work-around!
  - Disconnects and other error events
30. Section: Problem Cases
- User goes silent
- Second user connection
- Node crashes
- Gateway crashes
- Reliable messages
- Firewalls
- Build and test
31. User Goes Silent
- Some TCP connections will stop (bad WiFi, firewalls, etc.)
- We use a ping message
- Both ends separately detect ping failure
  - This means one end detects it before the other
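As a rough illustration of the ping approach, each end can keep its own record of when it last heard from the peer and declare the connection dead after a timeout. The class name, the 10-second timeout, and passing the clock in explicitly are assumptions for this Python sketch, not details from the talk.

```python
class PingMonitor:
    """Tracks when the peer was last heard from; one instance per end."""

    def __init__(self, timeout_s: float, now: float):
        self.timeout_s = timeout_s
        self.last_seen = now

    def on_ping(self, now: float) -> None:
        """Record that a ping (or any traffic) arrived from the peer."""
        self.last_seen = now

    def is_silent(self, now: float) -> bool:
        """True once nothing has arrived for longer than the timeout."""
        return now - self.last_seen > self.timeout_s


# Each end runs its own monitor, so they can disagree for a while.
client_view = PingMonitor(timeout_s=10, now=0)
gateway_view = PingMonitor(timeout_s=10, now=0)

gateway_view.on_ping(now=4)  # the gateway still hears the client at t=4
# ...but nothing from the gateway reaches the client after t=0.

print(client_view.is_silent(now=12), gateway_view.is_silent(now=12))
```

At t=12 the client has already given up while the gateway still considers the connection alive, which is the asymmetry the last bullet warns about.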
32. Second User Connection
- Currently connected user makes a new connection
- To another gateway because of load balancing
- A user-specific queue arbitrates
- Queues are serialized: there is always a winner
33. Node Crashes
- State is ephemeral: it’s lost when machine is lost
- A user “management queue” contains all subscription state
- If the home queue node dies, the user is logged out
- If a queue the user is subscribed to dies, the user is auto-unsubscribed (client has to deal)
34. Gateway Crashes
- When a gateway crashes, client will reconnect
- History allows us to avoid re-sending for quick reconnects
- The application above the queue API doesn’t notice
- Erlang message send does not report error
  - Monitor nodes to remove stale listeners
35. Reliable Messages
- “If the user isn’t logged in, deliver the next log-in.”
- Hidden at application server API level, stored in database
- Return “not logged in”
  - Signal to store message in database
- Hook logged-in call-out
  - Re-check the logged in state after storing to database (avoids a race)
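The store-then-re-check sequence on this slide can be sketched as follows. This is a single-threaded Python illustration with in-memory stand-ins for the database and the message queue; the class and method names are hypothetical, and a real system would need the store and the re-check to be safe under concurrency.

```python
class ReliableSender:
    """Deliver now if the user is logged in, otherwise at the next log-in."""

    def __init__(self):
        self.online = set()   # users currently logged in
        self.stored = {}      # stand-in for the database: user -> [messages]
        self.delivered = []   # stand-in for delivery through the message queue

    def _try_deliver(self, user, message) -> bool:
        if user not in self.online:
            return False      # the "not logged in" result
        self.delivered.append((user, message))
        return True

    def _flush(self, user) -> None:
        for message in self.stored.pop(user, []):
            self.delivered.append((user, message))

    def send(self, user, message) -> None:
        if self._try_deliver(user, message):
            return
        self.stored.setdefault(user, []).append(message)
        # Re-check after storing: the user may have logged in between the
        # failed delivery and the store, in which case the login hook
        # already ran and would never see this message.
        if user in self.online:
            self._flush(user)

    def on_login(self, user) -> None:
        """Logged-in call-out hook: deliver anything stored for this user."""
        self.online.add(user)
        self._flush(user)


sender = ReliableSender()
sender.send("bob", "hi")     # bob is offline, so the message is stored
sender.on_login("bob")       # the login hook delivers the stored message
sender.send("bob", "again")  # delivered immediately
print(sender.delivered)
```

Without the re-check in `send`, a message stored just after the login hook fired would sit in the database until the following log-in.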
36. Firewalls
- HTTP long-poll has one main strength:
  - It works if your browser works
- Message Queue uses a different protocol
  - We still use ports 80 (“HTTP”) and 443 (“HTTPS”)
  - This makes us horrible people
- We try a configured proxy with CONNECT
- We reach >99% of existing customers
- Future improvement: HTTP Upgrade/101
37. Build and Test
- Continuous Integration and Continuous Deployment
  - Had to build our own systems
- Erlang In-place Code Upgrades
  - Too heavy, designed for “6 month” upgrade cycles
  - Use fail-over instead (similar to Apache graceful)
- Load testing at scale
- “Dark launch” to existing users
38. Section: Future
- Replication
  - Similar to fail-over
- Limits of Scalability (?)
  - M x N (Gateways x Queues) stops at some point
- Open Source
  - We would like to open-source what we can
  - Protobuf for PHP and Erlang?
  - IMQ core? (not surrounding application server)
39. Q&A
- Survey
  - If you found this helpful, please circle “Excellent”
  - If this sucked, don’t circle “Excellent”
- Questions?
- @[email protected] is a great place to work, and we’re hiring!