Message Queuing on a Large Scale: IMVU's stateful real-time message queue for chat and games

Large-scale Messaging at IMVU
Jon Watte
Technical Director, IMVU Inc
@jwatte

Presentation Overview
- Describe the problem
  - Low-latency game messaging and state distribution
- Survey available solutions
  - Quick mention of also-rans
- Dive into implementation
  - Erlang!
- Discuss gotchas
- Speculate about the future

Context
[Architecture diagram. Labels: HTTP, Load Balancers, Web Servers, Caching, Databases; Long Poll, HTTP, Load Balancers, Game Servers, Caching.]

What Do We Want?
- Any-to-any messaging with ad-hoc structure
  - Chat; Events; Input/Control
- Lightweight (in-RAM) state maintenance
  - Scores; Dice; Equipment

New Building Blocks
- Queues provide a sane view of distributed state for developers building games
- Two kinds of messaging:
  - Events (edge triggered, “messages”)
  - State (level triggered, “updates”)
- Integrated into a bigger system

From Long-poll to Real-time
[Architecture diagram. Labels: Load Balancers, Web Servers, Caching, Databases; Long Poll, Load Balancers, Game Servers, Caching; Connection Gateways and Message Queues (marked “Today’s Talk”).]

Functions
- Client
  - Connect
  - Listen message/state/user
  - Send message/state
- Game Server (HTTP)
  - Create/delete queue/mount
  - Join/remove user
  - Send message/state
- Queue
  - Validation users/requests
  - Notification

Performance Requirements
- Simultaneous user count:
  - 80,000 when we started
  - 150,000 today
  - 1,000,000 design goal
- Real-time performance (the main driving requirement)
  - Lower than 100 ms end-to-end through the system
- Queue creates and join/leaves (kill a lot of contenders)
  - >500,000 creates/day when started
  - >20,000,000 creates/day design goal

Also-rans: Existing Wheels
- AMQP, JMS: Qpid, Rabbit, ZeroMQ, BEA, IBM etc
  - Poor user and authentication model
  - Expensive queues
- IRC
  - Spanning Tree; Netsplits; no state
- XMPP / Jabber
  - Protocol doesn’t scale in federation
- Gtalk, AIM, MSN Msgr, Yahoo Msgr
  - If only we could buy one of these!

Our Wheel is Rounder!
- Inspired by the 1,000,000-user mochiweb app
  - http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1
- A purpose-built general system
- Written in Erlang

Section: Implementation
- Journey of a message
- Anatomy of a queue
- Scaling across machines
- Erlang

The Journey of a Message
[Diagram, reconstructed as a sequence from its labels:]
1. Message in Queue: /room/123, Mount: chat, Data: Hello, World!
2. Gateway: Validation; Find node for /room/123
3. Queue Node: Find queue /room/123
4. Queue Process: List of subscribers
5. Gateway for User: Forward message

Anatomy of a Queue
Queue Name: /room/123
- Mount (Type: message, Name: chat)
  - User A: I win.
  - User B: OMG Pwnies!
  - User A: Take that!
  - …
- Mount (Type: state, Name: scores)
  - User A: 3220
  - User B: 1200
- Subscriber List
  - User A @ Gateway C
  - User B @ Gateway B
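The queue layout above can be sketched as a small data structure. This is an illustrative Python model (the real system is written in Erlang); the class name, method names, and the `history_size` cap are assumptions, not taken from the slides. It only shows the difference between a message mount (every event is delivered) and a state mount (the latest value per key is kept), plus fan-out to each subscriber's gateway.

```python
from collections import deque


class Queue:
    """One named queue with a subscriber list and typed mounts (illustrative only)."""

    def __init__(self, name, history_size=3):
        self.name = name
        self.subscribers = {}   # user -> gateway the user is connected through
        self.messages = {}      # mount name -> recent messages (message mounts)
        self.state = {}         # mount name -> {key: value} (state mounts)
        self.history_size = history_size  # hypothetical cap on retained messages

    def subscribe(self, user, gateway):
        self.subscribers[user] = gateway

    def send_message(self, mount, user, text):
        """Message mount: every event is delivered; a short tail is retained."""
        log = self.messages.setdefault(mount, deque(maxlen=self.history_size))
        log.append((user, text))
        return self._fan_out((mount, user, text))

    def set_state(self, mount, key, value):
        """State mount: only the latest value per key is kept."""
        self.state.setdefault(mount, {})[key] = value
        return self._fan_out((mount, key, value))

    def _fan_out(self, payload):
        # One delivery per subscriber, addressed to that subscriber's gateway.
        return [(gateway, user, payload)
                for user, gateway in self.subscribers.items()]


room = Queue("/room/123")
room.subscribe("User A", "Gateway C")
room.subscribe("User B", "Gateway B")
deliveries = room.send_message("chat", "User A", "I win.")
room.set_state("scores", "User A", 3220)
room.set_state("scores", "User B", 1200)
```

Here `deliveries` holds one entry per subscriber, each tagged with the gateway that should forward it.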

A Single Machine Isn’t Enough
- 1,000,000 users, 1 machine?
- 25 GB/s memory bus
- 40 GB memory (40 kB/user)
- Touched twice per message
- One message per user is 3,400 ms
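The arithmetic behind that figure can be checked with a few lines of Python. This is a back-of-the-envelope sketch using only the numbers on the slide; reading "40 GB" as binary gigabytes is an assumption that happens to land near the quoted 3,400 ms, while decimal units give about 3,200 ms.

```python
users = 1_000_000
bus_bytes_per_s = 25e9        # 25 GB/s memory bus
touches_per_message = 2       # each user's state is touched twice per message

# Decimal reading: 40 kB per user, 40 GB in total.
decimal_total = users * 40_000
decimal_seconds = decimal_total * touches_per_message / bus_bytes_per_s

# Binary reading (assumption): 40 GiB in total.
binary_total = 40 * 2**30
binary_seconds = binary_total * touches_per_message / bus_bytes_per_s

print(f"decimal units: {decimal_seconds:.2f} s")  # about 3.2 s
print(f"binary units:  {binary_seconds:.2f} s")   # about 3.44 s
```

Either way, one message to every user costs seconds of memory traffic on a single machine, far above the 100 ms end-to-end target.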

Scale Across Machines
[Diagram: the Internet connects to several machines, each running a Gateway and Queues; Consistent Hashing maps queues to machines.]

Consistent Hashing
- The Gateway maps queue name -> node
- This is done using a fixed hash function
- A prefix of the output bits of the hash function is used as a look-up into a table, with a minimum of 8 buckets per node
- Load differential is 8:9 or better (down to 15:16)
- Updating the map of buckets -> nodes is managed centrally
[Diagram: Hash(“/room/123”) = 0xaf5… selects a bucket; buckets are assigned to Node A, Node B, Node C, Node D, Node E, Node F.]
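A minimal Python sketch of the lookup described above. The slides only say "a fixed hash function"; SHA-1, the 64-bucket table size, and the function names here are assumptions chosen for illustration. The top bits of the hash select a bucket, and a table maps each bucket to a node.

```python
import hashlib

PREFIX_BITS = 6  # hypothetical: 64 buckets, so 6 nodes each get at least 8


def bucket_for(queue_name, prefix_bits=PREFIX_BITS):
    """Use a prefix of the hash output bits as the bucket index."""
    digest = hashlib.sha1(queue_name.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    return value >> (len(digest) * 8 - prefix_bits)


def build_table(nodes, prefix_bits=PREFIX_BITS):
    """Assign buckets to nodes round-robin (a simple initial layout)."""
    return [nodes[i % len(nodes)] for i in range(1 << prefix_bits)]


def node_for(queue_name, table):
    """Map queue name -> bucket -> node, as a gateway would."""
    prefix_bits = len(table).bit_length() - 1  # table length is a power of two
    return table[bucket_for(queue_name, prefix_bits)]


nodes = ["Node A", "Node B", "Node C", "Node D", "Node E", "Node F"]
table = build_table(nodes)
owner = node_for("/room/123", table)
```

Because every gateway uses the same hash function and the same table, they all agree on which node owns `/room/123` without talking to each other; only the table itself has to be distributed.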

Consistent Hash Table Update
- Minimizes amount of traffic moved
- If nodes have more than 8 buckets, steal 1/N of all buckets from those with the most and assign to new target
- If not, split each bucket, then steal 1/N of all buckets and assign to new target
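The update rule can be sketched in Python as follows. This is one reading of the slide, not the actual implementation: it assumes N counts the nodes after the new one is added, that "more than 8 buckets" means the new node's share would be at least 8, and that splitting a bucket gives both halves to the original owner so the split itself moves no traffic. The function names are hypothetical.

```python
from collections import Counter

MIN_BUCKETS_PER_NODE = 8


def split(table):
    """Double the table: each bucket becomes two, both kept by the same owner."""
    return [owner for owner in table for _ in range(2)]


def add_node(table, new_node):
    """Return a new bucket -> node table that includes new_node."""
    if new_node in table:
        raise ValueError("node is already in the table")
    table = list(table)
    n = len(set(table)) + 1                 # node count after the addition
    if len(table) // n < MIN_BUCKETS_PER_NODE:
        table = split(table)                # too few buckets to share out
    counts = Counter(table)
    for _ in range(len(table) // n):        # steal 1/N of all buckets
        # Always take from the node that currently owns the most buckets.
        donor = max(counts, key=lambda node: (counts[node], node))
        table[table.index(donor)] = new_node
        counts[donor] -= 1
    return table


old = ["Node A"] * 8 + ["Node B"] * 8
new = add_node(old, "Node C")
```

In this example the 16-bucket table is split to 32 buckets, and 10 of them move to Node C, leaving 11, 11, and 10 buckets per node. Only queues in the stolen buckets change owner.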

Erlang
- Developed in the ’80s by Ericsson for phone switches
  - Reliability, scalability, and communications
- Prolog-based functional syntax (no braces!)
  - 25% of the code of equivalent C++
- Parallel Communicating Processes
  - Erlang processes much cheaper than C++ threads
- (Almost) No Mutable Data
  - No data race conditions
  - Each process separately garbage collected

Example Erlang Process

    % spawn process (counter/1 must be exported from my_module)
    MyCounter = spawn(my_module, counter, [0]).

    % increment counter
    MyCounter ! {add, 1}.

    % get value
    MyCounter ! {get, self()},
    receive
        {value, MyCounter, Value} -> Value
    end.

    % stop process
    MyCounter ! stop.

    % in my_module: the process body
    counter(stop) -> stopped;
    counter(Value) ->
        NextValue = receive
            {get, Pid} ->
                Pid ! {value, self(), Value},
                Value;
            {add, Delta} -> Value + Delta;
            stop -> stop;
            _ -> Value
        end,
        counter(NextValue).  % tail recursion

Section: Details
- Load Management
- Marshalling
- RPC / Call-outs
- Hot Adds and Fail-over
- The Boss!
- Monitoring

Load Management
[Diagram: the Internet connects through two HAProxy instances to several machines, each running a Gateway and Queues; Consistent Hashing maps queues to machines.]

Marshalling

    message MsgG2cResult {
        required uint32 op_id = 1;
        required uint32 status = 2;
        optional string error_message = 3;
    }
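To show what this message looks like on the wire, here is a hand-rolled Python sketch of the standard Protocol Buffers encoding for `MsgG2cResult`. In practice generated protobuf code would do this; the function names are hypothetical and the sample values are made up for illustration.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_msg_g2c_result(op_id, status, error_message=None):
    """Serialize MsgG2cResult: tag = (field_number << 3) | wire_type."""
    out = bytearray()
    out += encode_varint((1 << 3) | 0) + encode_varint(op_id)    # uint32 op_id = 1
    out += encode_varint((2 << 3) | 0) + encode_varint(status)   # uint32 status = 2
    if error_message is not None:                                # optional string = 3
        data = error_message.encode("utf-8")
        out += encode_varint((3 << 3) | 2) + encode_varint(len(data)) + data
    return bytes(out)


ok = encode_msg_g2c_result(op_id=300, status=0)
failed = encode_msg_g2c_result(op_id=1, status=2, error_message="no")
print(ok.hex())      # 08ac021000
print(failed.hex())  # 080110021a026e6f
```

A successful result is five bytes here, which is one reason a compact binary format suits a high-volume message path.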

RPC
[Diagram. Labels: Web Server (PHP), HTTP + JSON, admin, Gateway, Message Queue (Erlang).]

Call-outs
[Diagram. Labels: Message Queue and Gateway (Erlang), HTTP + JSON, Web Server (PHP); call-out kinds: Mount, Credentials, Rules.]

Management
[Diagram: The Boss alongside several machines, each running a Gateway and Queues; Consistent Hashing maps queues to machines.]

Monitoring
- Example counters:
  - Number of connected users
  - Number of queues
  - Messages routed per second
  - Round trip time for routed messages
    - Distributed clock work-around!
  - Disconnects and other error events

Section: Problem Cases
- User goes silent
- Second user connection
- Node crashes
- Gateway crashes
- Reliable messages
- Firewalls
- Build and test

User Goes Silent
- Some TCP connections will stop (bad WiFi, firewalls, etc)
- We use a ping message
- Both ends separately detect ping failure
  - This means one end detects it before the other
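The ping scheme above can be sketched as a small per-connection monitor that each end runs independently. This is an illustrative Python model; the class name, the 10-second interval, and the 30-second timeout are assumptions, not values from the slides. Time is passed in explicitly so the behaviour is easy to follow.

```python
class PingMonitor:
    """Tracks one connection from one end's point of view (illustrative)."""

    def __init__(self, now, interval_s=10.0, timeout_s=30.0):
        self.interval_s = interval_s   # hypothetical: how often to send a ping
        self.timeout_s = timeout_s     # hypothetical: silence before giving up
        self.last_heard = now
        self.last_ping_sent = now

    def on_receive(self, now):
        """Call whenever any data, including a ping, arrives from the peer."""
        self.last_heard = now

    def poll(self, now):
        """Return 'dead', 'ping', or 'idle' for the current time."""
        if now - self.last_heard >= self.timeout_s:
            return "dead"
        if now - self.last_ping_sent >= self.interval_s:
            self.last_ping_sent = now
            return "ping"
        return "idle"


# Each end keeps its own monitor, so they reach 'dead' at different times.
server_side = PingMonitor(now=0.0)
client_side = PingMonitor(now=0.0)
server_side.on_receive(now=5.0)    # last thing the server heard
client_side.on_receive(now=12.0)   # last thing the client heard
server_view = server_side.poll(now=36.0)
client_view = client_side.poll(now=36.0)
```

At time 36 the server side has already declared the connection dead while the client side is still sending pings, which is the window the slide warns about.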

Second User Connection
- Currently connected user makes a new connection
- To another gateway because of load balancing
- A user-specific queue arbitrates
- Queues are serialized: there is always a winner

Node Crashes
- State is ephemeral: it’s lost when machine is lost
- A user “management queue” contains all subscription state
- If the home queue node dies, the user is logged out
- If a queue the user is subscribed to dies, the user is auto-unsubscribed (client has to deal)

Gateway Crashes
- When a gateway crashes, client will reconnect
- History allows us to avoid re-sending for quick reconnects
- The application above the queue API doesn’t notice
- Erlang message send does not report error
  - Monitor nodes to remove stale listeners

Reliable Messages
- “If the user isn’t logged in, deliver at the next log-in.”
- Hidden at application server API level, stored in database
  - Return “not logged in”
  - Signal to store message in database
  - Hook logged-in call-out
  - Re-check the logged in state after storing to database (avoids a race)
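The store-then-re-check sequence can be sketched in Python. This is an illustrative model of the pattern only: the class and callback names are hypothetical, and an in-memory dict stands in for the database. The point is the final re-check, which covers a user who logs in between the failed send and the store.

```python
import threading


class ReliableSender:
    """Application-server-side wrapper around a best-effort queue send."""

    def __init__(self, queue_send, is_logged_in):
        self.queue_send = queue_send      # returns False for "not logged in"
        self.is_logged_in = is_logged_in
        self.pending = {}                 # stand-in for the database table
        self.lock = threading.Lock()

    def send(self, user, message):
        """Return True if delivered immediately, False if it had to be stored."""
        if self.queue_send(user, message):
            return True
        with self.lock:
            self.pending.setdefault(user, []).append(message)
        # Re-check after storing: the user may have logged in meanwhile,
        # in which case the logged-in hook ran before the message was stored.
        if self.is_logged_in(user):
            self.on_login(user)
        return False

    def on_login(self, user):
        """Logged-in call-out hook: flush stored messages, keep any that fail."""
        with self.lock:
            messages = self.pending.pop(user, [])
        undelivered = [m for m in messages if not self.queue_send(user, m)]
        if undelivered:
            with self.lock:
                self.pending[user] = undelivered + self.pending.get(user, [])
        return len(messages) - len(undelivered)


online = set()
outbox = []


def queue_send(user, message):
    if user not in online:
        return False          # the queue reports "not logged in"
    outbox.append((user, message))
    return True


sender = ReliableSender(queue_send, lambda user: user in online)
sender.send("alice", "gift received")   # stored, alice is offline
online.add("alice")
sender.on_login("alice")                # delivered at the next log-in
```

Without the re-check, a message stored just after the log-in hook fired would sit in the database until the following log-in.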

Firewalls
- HTTP long-poll has one main strength:
  - It works if your browser works
- Message Queue uses a different protocol
  - We still use ports 80 (“HTTP”) and 443 (“HTTPS”)
  - This makes us horrible people
  - We try a configured proxy with CONNECT
- We reach >99% of existing customers
- Future improvement: HTTP Upgrade/101

Build and Test
- Continuous Integration and Continuous Deployment
  - Had to build our own systems
- Erlang In-place Code Upgrades
  - Too heavy, designed for “6 month” upgrade cycles
  - Use fail-over instead (similar to Apache graceful)
- Load testing at scale
- “Dark launch” to existing users

Section: Future
- Replication
  - Similar to fail-over
- Limits of Scalability (?)
  - M x N (Gateways x Queues) stops at some point
- Open Source
  - We would like to open-source what we can
  - Protobuf for PHP and Erlang?
  - IMQ core? (not surrounding application server)

Q&A
- Survey
  - If you found this helpful, please circle “Excellent”
  - If this sucked, don’t circle “Excellent”
- Questions?
- @jwatte
- jwatte@imvu.com
- IMVU is a great place to work, and we’re hiring!
