World Of Tanks Experience Of Using Kafka
Levon Avakyan / WoT Server
Reliability /
l_avakyan@wargaming.net
1
Table Of Contents 2
• World Of Tanks Server
• BigWorld Technology
• Cluster of clusters
• World of Tanks and Big Data
• Apache Kafka
• How does it work?
• Best practices
• World of Tanks Server and Kafka
• Implementation
• Difficulties
3
World Of Tanks
Server
BigWorld Technology 4
BigWorld Technology is BigWorld's middleware for implementing Massively Multiplayer Online Games.
• Scalability, reliability, efficiency
• Occam's Razor
• Improve the worst case
• Client/server bandwidth is valuable
• Keep information together; Avoid two-way calls
• Avoid bottlenecks; Make the system distributed
• Where possible, do communication in batches
5
CLUSTER COMPONENTS
Switch fabric
LoginApp LoginApp BaseApp BaseApp BaseApp
Switch fabric
DBApp DBApp
Switch fabric
ServiceApp CellApp CellApp CellApp
BaseAppMgr DBAppMgr CellAppMgr
DATABASE
INTERNET
6
WORLD OF TANKS: RU
~70 servers
Amsterdam
Moscow
Novosibirsk
Krasnoyarsk
Frankfurt
40 servers
250+ servers
80+ servers
~70 servers
7
• Heat maps (player positions, damage, detection, etc.)
• Battle results (by arena, by player, by vehicle)
• Matchmaker data
Approximately 20K RPS
World of Tanks and Big Data
8
• Data Warehouse
• Player Relationship Management Platform
• Wargaming Rating Management System
• Ranked Battles Leaderboards
• Strongholds Service
World of Tanks Data Consumers
9
Apache Kafka
Distributed streaming platform
Apache Kafka 10
Concept:
• Kafka is run as a cluster on one or more servers.
• The Kafka cluster stores streams of records in categories
called topics.
• Each record consists of a key, a value, and a timestamp.
Capabilities:
• It lets you publish and subscribe to streams of records.
• It lets you store streams of records in a fault-tolerant way.
• It lets you process streams of records as they occur.
Guarantees
• Messages sent by a producer to a particular topic partition will be
appended in the order they are sent
• A consumer instance sees records in the order they are stored in
the log.
• For a topic with replication factor N, we will tolerate up to N-1
server failures without losing any records committed to the log.
Apache Kafka® is a distributed streaming platform.
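The replication guarantee above (replication factor N tolerates up to N-1 server failures) can be checked with a small simulation. This is only a sketch: the broker names and the `record_survives` helper are hypothetical, and it models a committed record as readable while at least one broker holding a replica is still up.

```python
from itertools import combinations


def record_survives(replica_brokers, failed_brokers):
    """A committed record is still readable while any replica's broker is up."""
    return bool(set(replica_brokers) - set(failed_brokers))


replicas = ["broker-1", "broker-2", "broker-3"]  # replication factor N = 3

# Every way of losing up to N - 1 = 2 brokers leaves at least one copy.
survives_up_to_two = all(
    record_survives(replicas, failed)
    for k in range(3)
    for failed in combinations(replicas, k)
)

# Losing all N brokers loses the record.
survives_all_three = record_survives(replicas, replicas)
```

With N = 3, every combination of zero, one or two failed brokers still leaves a copy; only the loss of all three does not.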
Apache Kafka 11
• Topics
• Producers
• Consumers
• Broker
Apache Kafka 12
A topic is a category or feed name to which records are published
Topics and Logs
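As a rough illustration of topics and logs, the sketch below models a topic as a fixed set of append-only partition logs held in memory. `ToyTopic` and `Record` are hypothetical names, not part of any Kafka client, and the CRC32 partitioning is an assumption chosen to keep the example deterministic (the Java client's default partitioner uses murmur2).

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Record:
    key: bytes
    value: bytes
    timestamp: float


@dataclass
class ToyTopic:
    """An in-memory stand-in for a topic: a fixed set of append-only partition logs."""
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        self.partitions = [[] for _ in range(self.num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Assumption: CRC32 keeps the example deterministic; real clients
        # use their own hash function.
        return zlib.crc32(key) % self.num_partitions

    def append(self, record: Record) -> tuple:
        """Append a record and return (partition, offset)."""
        p = self.partition_for(record.key)
        log = self.partitions[p]
        log.append(record)
        return p, len(log) - 1


topic = ToyTopic(num_partitions=3)
placements = [
    topic.append(Record(b"arena-42", f"event-{i}".encode(), float(i)))
    for i in range(4)
]
```

Records that share a key land in the same partition and receive increasing offsets, which is what makes per-partition ordering possible.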
Apache Kafka 13
Consumers label themselves with a consumer group name, and each record published to a topic is
delivered to one consumer instance within each subscribing consumer group.
Consumer Groups
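The delivery rule for consumer groups can be sketched without a broker. The example assumes a simple round-robin assignment of partitions to group members, which is one possible strategy and not necessarily the one used in production; the group and member names are made up for illustration.

```python
from collections import defaultdict


def assign_partitions(partitions, members):
    """Round-robin partitions over the members of one consumer group."""
    assignment = defaultdict(list)
    ordered = sorted(members)
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return dict(assignment)


def deliver(record_partition, groups, partitions):
    """Return, per group, the single member that receives a record from record_partition."""
    receivers = {}
    for group, members in groups.items():
        for member, owned in assign_partitions(partitions, members).items():
            if record_partition in owned:
                receivers[group] = member
    return receivers


# Hypothetical groups: two independent subscribers to the same topic.
groups = {
    "data-warehouse": ["dwh-1", "dwh-2"],
    "leaderboards": ["lb-1"],
}
receivers = deliver(record_partition=2, groups=groups, partitions=[0, 1, 2, 3])
```

Each subscribing group gets the record exactly once, while members inside a group split the partitions between them.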
Apache Kafka 14
• Electing a controller
• Cluster membership
Zookeeper
Apache Kafka 15
Replication
Apache Kafka 16
• Replication factor >= 3
• min.insync.replicas = 2
• unclean.leader.election.enable = false
• enable.auto.commit = false
• acks = all (if you want all data safe)
• retries = MAX_INT
• Monitor
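One way to keep such a checklist from drifting is to verify settings in code before deployment. The sketch below is an assumption-laden illustration: it maps the slide's items to the Kafka setting names `min.insync.replicas`, `unclean.leader.election.enable`, `enable.auto.commit`, `acks` and `retries`, uses `replication.factor` as a plain label for the topic's replication factor, and the `check` helper is hypothetical.

```python
INT32_MAX = 2**31 - 1  # what the slide calls MAX_INT

# Each rule: (setting name, predicate, human-readable expectation).
# "replication.factor" is only a label here; the value is chosen when
# the topic is created.
RULES = [
    ("replication.factor", lambda v: v >= 3, ">= 3"),
    ("min.insync.replicas", lambda v: v == 2, "2"),
    ("unclean.leader.election.enable", lambda v: v is False, "false"),
    ("enable.auto.commit", lambda v: v is False, "false"),
    ("acks", lambda v: v == "all", "all"),
    ("retries", lambda v: v == INT32_MAX, "MAX_INT"),
]


def check(settings):
    """Return messages for settings that are missing or off the checklist."""
    problems = []
    for name, ok, expected in RULES:
        if name not in settings:
            problems.append(f"{name}: not set (expected {expected})")
        elif not ok(settings[name]):
            problems.append(f"{name}: {settings[name]!r} (expected {expected})")
    return problems


good = {
    "replication.factor": 3,
    "min.insync.replicas": 2,
    "unclean.leader.election.enable": False,
    "enable.auto.commit": False,
    "acks": "all",
    "retries": INT32_MAX,
}
risky = dict(good, acks=1, retries=0)
```

Here `check(good)` returns an empty list, while `check(risky)` reports the `acks` and `retries` entries.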
17
World of Tanks Server
and Apache Kafka
Marriage
Producer Implementation 18
• kafka-python (https://github.com/dpkp/kafka-python)
• librdkafka (https://github.com/edenhill/librdkafka)
Technology
kafka-python | librdkafka
Python | C++
A lot of bugs | More stable
Memory leaks with disabled GC | Thread safe
Producer Implementation 19
Zero iteration
Producer Implementation 20
• Hardcoded schema
• Schema in message
• All traffic through Center
Difficulties
Message Schema 21
Schema Registry provides a serving layer for your metadata. It provides a RESTful interface for storing
and retrieving Avro schemas. It stores a versioned history of all schemas, provides multiple
compatibility settings and allows evolution of schemas according to the configured compatibility
setting. It provides serializers that plug into Kafka clients that handle schema storage and retrieval for
Kafka messages that are sent in the Avro format.
Schema Registry
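The idea of a versioned schema history with a compatibility setting can be sketched in memory. This is not the Schema Registry's API: `ToySchemaRegistry`, `is_backward_compatible` and the `BattleResult` schema are hypothetical, and the compatibility rule is deliberately simplified (it ignores Avro type promotion, for example).

```python
class IncompatibleSchema(Exception):
    pass


def is_backward_compatible(new_schema, old_schema):
    """True if a reader using new_schema can decode data written with old_schema.

    Simplified rule: a field that is new must carry a default, and a field
    kept from the old schema must keep its type.
    """
    old_fields = {f["name"]: f for f in old_schema["fields"]}
    for f in new_schema["fields"]:
        old = old_fields.get(f["name"])
        if old is None:
            if "default" not in f:
                return False
        elif old["type"] != f["type"]:
            return False
    return True


class ToySchemaRegistry:
    def __init__(self):
        self._versions = {}  # subject -> list of schemas; index + 1 is the version

    def register(self, subject, schema):
        """Store a new version, rejecting it if it breaks backward compatibility."""
        history = self._versions.setdefault(subject, [])
        if history and not is_backward_compatible(schema, history[-1]):
            raise IncompatibleSchema(subject)
        history.append(schema)
        return len(history)

    def get(self, subject, version):
        return self._versions[subject][version - 1]


v1 = {"type": "record", "name": "BattleResult",
      "fields": [{"name": "arena_id", "type": "long"}]}
v2 = {"type": "record", "name": "BattleResult",
      "fields": [{"name": "arena_id", "type": "long"},
                 {"name": "vehicle", "type": "string", "default": ""}]}
v3_bad = {"type": "record", "name": "BattleResult",
          "fields": [{"name": "arena_id", "type": "long"},
                     {"name": "damage", "type": "int"}]}  # new field, no default

registry = ToySchemaRegistry()
first = registry.register("battle-results", v1)
second = registry.register("battle-results", v2)
```

Registering `v3_bad` after `v2` raises `IncompatibleSchema`, because a reader on the new schema would have no value for `damage` in older messages.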
Producer Implementation 22
First iteration
Producer Implementation 23
• Rapidly growing queues (memory drain) – slow producer
• Cannot send messages to Kafka
• Losing messages
Difficulties
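The slides list these problems without detailing a fix at this point. One common mitigation for unbounded queue growth, shown here purely as an assumption and not as the team's solution, is to cap the send buffer and count what is dropped so that the loss is at least visible in monitoring. `BoundedBuffer` is a hypothetical helper.

```python
from collections import deque


class BoundedBuffer:
    """A send buffer with a hard cap that counts what it has to drop."""

    def __init__(self, capacity):
        self._items = deque()
        self.capacity = capacity
        self.dropped = 0

    def put(self, message):
        """Queue a message; if the buffer is full, drop the oldest one and count it."""
        if len(self._items) >= self.capacity:
            self._items.popleft()
            self.dropped += 1
        self._items.append(message)

    def drain(self, max_batch):
        """Hand up to max_batch messages to the sender, oldest first."""
        batch = []
        while self._items and len(batch) < max_batch:
            batch.append(self._items.popleft())
        return batch

    def __len__(self):
        return len(self._items)


buffer = BoundedBuffer(capacity=3)
for i in range(5):
    buffer.put(f"msg-{i}")
batch = buffer.drain(max_batch=2)
```

The trade-off is explicit: memory stays bounded when the producer cannot keep up, at the price of measurable message loss.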
24
WORLD OF TANKS: RU
~70 servers
Amsterdam
Moscow
Novosibirsk
Krasnoyarsk
Frankfurt
40 servers
250+ servers
80+ servers
~70 servers
Infrastructure improvements 25
• Local Kafka cluster in every Data Center
• One Schema Registry
• Replication to the central Kafka cluster using MirrorMaker
• Configure topics in accordance with best practices
• Improve monitoring
Apache Kafka
Producer Implementation 26
Second iteration
Plans 27
• Backup data in message queues
• Migrate to librdkafka
• Schema versioning
Conclusion 28
• Apache Kafka is a powerful tool for transferring, storing and processing
data
• No solution works out of the box
• Analyze, do experiments, improve your solution continuously
Thank you
Levon Avakyan (l_avakyan@wargaming.net)
29
Vishnu Singh Chundawat
 

World of Tanks Experience of Using Kafka

  • 1. World Of Tanks Experience Of Using Kafka. Levon Avakyan / WoT Server Reliability / l_avakyan@wargaming.net
  • 2. Table Of Contents • World Of Tanks Server • BigWorld Technology • Cluster of clusters • World of Tanks and Big Data • Apache Kafka • How it works? • Best practices • World of Tanks Server and Kafka • Implementation • Difficulties
  • 4. BigWorld Technology: BigWorld Technology is BigWorld's middleware for implementing Massively Multiplayer Online Games. • Scalability, reliability, efficiency • Occam's Razor • Improve the worst case • Client/server bandwidth is valuable • Keep information together; Avoid two-way calls • Avoid bottlenecks; Make the system distributed • Where possible, do communication in batches
  • 5. Cluster components (diagram): INTERNET, switch fabric, LoginApp, BaseApp, DBApp, ServiceApp, CellApp, BaseAppMgr, DBAppMgr, CellAppMgr, DATABASE
  • 6. World of Tanks: RU (map of data centers): Amsterdam, Moscow, Novosibirsk, Krasnoyarsk, Frankfurt. Server counts shown on the map: ~70, 40, 250+, 80+, ~70
  • 7. World of Tanks and Big Data: • Heat maps (player positions, damage, detection, etc.) • Battle results (by arena, by player, by vehicle) • Matchmaker data. Approximately 20K RPS
  • 8. World of Tanks Data Consumers: • Data Warehouse • Player Relationship Management Platform • Wargaming Rating Management System • Ranked Battles Leaderboards • Strongholds Service
  • 10. Apache Kafka: Apache Kafka® is a distributed streaming platform. Concept: • Kafka is run as a cluster on one or more servers. • The Kafka cluster stores streams of records in categories called topics. • Each record consists of a key, a value, and a timestamp. Capabilities: • It lets you publish and subscribe to streams of records. • It lets you store streams of records in a fault-tolerant way. • It lets you process streams of records as they occur. Guarantees: • Messages sent by a producer to a particular topic partition will be appended in the order they are sent. • A consumer instance sees records in the order they are stored in the log. • For a topic with replication factor N, we will tolerate up to N-1 server failures without losing any records committed to the log.
  • 11. Apache Kafka: • Topics • Producers • Consumers • Broker
  • 12. Apache Kafka, Topics and Logs: A topic is a category or feed name to which records are published.
  • 13. Apache Kafka, Consumer Groups: Consumers label themselves with a consumer group name, and each record published to a topic is delivered to one consumer instance within each subscribing consumer group.
  • 14. Apache Kafka, ZooKeeper: • Electing a controller • Cluster membership
  • 16. Apache Kafka: • Replication factor >= 3 • min.insync.replicas = 2 • unclean.leader.election.enable = false • enable.auto.commit = false • acks = all (if you want all data safe) • retries = MAX_INT • Monitor
  • 17. World of Tanks Server and Apache Kafka Marriage
  • 18. Producer Implementation, Technology: • python-kafka (https://github.com/dpkp/kafka-python) • librdkafka (https://github.com/edenhill/librdkafka). Comparison: python-kafka: Python; a lot of bugs; memory leaks with disabled GC. librdkafka: C++; more stable; thread safe.
  • 20. Producer Implementation, Difficulties: • Hardcoded schema • Schema in message • All traffic through Center
  • 21. Message Schema, Schema Registry: Schema Registry provides a serving layer for your metadata. It provides a RESTful interface for storing and retrieving Avro schemas. It stores a versioned history of all schemas, provides multiple compatibility settings and allows evolution of schemas according to the configured compatibility setting. It provides serializers that plug into Kafka clients that handle schema storage and retrieval for Kafka messages that are sent in the Avro format.
  • 23. Producer Implementation, Difficulties: • Rapidly growing queues (memory drain) caused by a slow producer • Cannot send messages to Kafka • Losing messages
  • 24. World of Tanks: RU (map of data centers): Amsterdam, Moscow, Novosibirsk, Krasnoyarsk, Frankfurt. Server counts shown on the map: ~70, 40, 250+, 80+, ~70
  • 25. Infrastructure improvements, Apache Kafka: • Local Kafka cluster in every Data Center • One Schema Registry • Replication to the central Kafka cluster using MirrorMaker • Configure topics in accordance with best practices • Improve monitoring
  • 27. Plans: • Backup data in message queues • Migrate to librdkafka • Schema versioning
  • 28. Conclusion: • Apache Kafka is a powerful tool for transferring, storing and processing data • No solution works out of the box • Analyze, do experiments, improve your solution continuously
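
The durability settings from slide 16 can be written out as plain key/value pairs. The sketch below is only an illustration: the key names are the standard Kafka configuration names as far as we know them (the slide spells some of them differently, so mapping "Auto.offset.commit" to `enable.auto.commit` is an assumption), and the split into topic, producer and consumer dictionaries, along with the two helper functions, is hypothetical and not taken from the talk.

```python
# Values mirror slide 16; the grouping into three dicts is an assumption of this sketch.
REPLICATION_FACTOR = 3  # chosen when the topic is created

TOPIC_CONFIG = {
    "min.insync.replicas": "2",
    "unclean.leader.election.enable": "false",
}
PRODUCER_CONFIG = {
    "acks": "all",               # wait for all in-sync replicas
    "retries": str(2**31 - 1),   # MAX_INT for a signed 32-bit integer
}
CONSUMER_CONFIG = {
    "enable.auto.commit": "false",  # commit offsets explicitly after processing
}


def failures_without_data_loss(replication_factor: int) -> int:
    """Slide 10 guarantee: replication factor N tolerates N-1 broker failures."""
    return replication_factor - 1


def failures_while_writable(replication_factor: int, min_insync: int) -> int:
    """With acks=all, a write needs at least `min_insync` in-sync replicas."""
    return replication_factor - min_insync


min_isr = int(TOPIC_CONFIG["min.insync.replicas"])
print(failures_without_data_loss(REPLICATION_FACTOR))
print(failures_while_writable(REPLICATION_FACTOR, min_isr))
```

Under these values, committed records survive the loss of two brokers, while producers using acks = all keep writing only as long as at most one replica is out of sync.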

Editor's Notes

  • #5: Scalability, reliability, efficiency: The main goal is to build a scalable, reliable, and efficient system. • Occam's Razor: The simplest design that satisfies all requirements should be considered the best, or in the words of Einstein, "Make things as simple as possible, but no simpler". • Improve the worst case: In general (mainly when it comes to the client experience), the worst case should be improved over the average case. • Client/server bandwidth is valuable: The most important resource is the bandwidth between the client and server. After this, it is probably CPU, and then intra-server bandwidth. • Keep information together; Avoid two-way calls: Information (or data) that is often used together should be easily accessed together. • Avoid bottlenecks; Make the system distributed. • Where possible, do communication in batches: There is a fairly high overhead in sending a single packet. One big packet is better than ten small ones.
  • #9: PRMP: targeted offers for users.
  • #11: 4 APIs. Connector: for embedding into existing applications. Streams: for building stream processors.
  • #13: It is a distributed, immutable log. Offset. Retention period. Guarantees: write order, read order.
  • #14: Load balancing and fault tolerance for different consumer groups.
  • #15: Elections. Subscription. Offsets.
  • #16: Replication. Leader. A replica dropping out. Acks.
  • #21: Scalability, reliability, efficiency: The general goal is to produce a scalable, reliable, and efficient system. This should be done while keeping as much simplicity and flexibility as possible. • Occam's Razor: The simplest design that satisfies all requirements should be considered the best, or in the words of Einstein, "Make things as simple as possible, but no simpler". • Improve the worst case: In general (mainly when it comes to the client experience), the worst case should be improved over the average case. For example, it is not beneficial to have a blindingly fast and accurate situation when a client is not near a cell boundary if the experience is poor when he is near one. • Client/server bandwidth is valuable: The most important resource is the bandwidth between the client and server. After this, it is probably CPU, and then intra-server bandwidth. • Keep information together; Avoid two-way calls: Information (or data) that is often used together should be easily accessed together. For example, a large amount of the data processed together in the game is related to objects that are geometrically close. It makes sense then to use data partitioning based on locality. It is also expensive to have to request information from a separate server machine when it is necessary. This is true for a number of reasons including the extra hops and coordination required and (maybe even more importantly) the reduced likelihood of being able to batch requests together. • Avoid bottlenecks; Make the system distributed: The design should try to avoid single central points where things occur. This approach can cause performance bottlenecks and make the design non-scalable. It can also introduce a single point of failure, therefore raising fault tolerance issues. • Where possible, do communication in batches: There is a fairly high overhead in sending a single packet. That is, it is a lot more expensive to send ten individual packets than it is to send one packet that is ten times bigger.
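
The batching remark in the notes above can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, 28 bytes of fixed header per packet (a minimal 20-byte IPv4 header plus an 8-byte UDP header) and made-up payload sizes. It ignores MTU limits and per-packet CPU cost, so it understates the real overhead.

```python
# Assumption for illustration: 20-byte IPv4 header + 8-byte UDP header per packet.
HEADER_BYTES = 28


def bytes_on_wire(payload_bytes: int, packets: int) -> int:
    """Total bytes sent when `packets` packets each carry `payload_bytes` of payload."""
    return packets * (payload_bytes + HEADER_BYTES)


ten_small = bytes_on_wire(100, 10)  # ten packets of 100 bytes each
one_big = bytes_on_wire(1000, 1)    # one packet that is ten times bigger
print(ten_small, one_big)           # prints "1280 1028"
```

With these assumed numbers the ten small packets cost 252 more bytes than the single large one for the same payload.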