Pfm – PaaS for MMO
Takehiro Iyatomi
Introduction (of me)
- game developer from Kanagawa
- food fighter in Tsukuba vs Ran Ran, Yume-ya, Claret, etc...
Problem
- Developing and launching an online game is difficult (infrastructure cost, development difficulty).
- As a result, the online game ecosystem loses variety.
Project purpose
Provide an online game development environment which is:
- Coding: easy and fun
- Cost: cheap
- Robustness: available
- Scalability: available
Make the total cost of starting an online game very low, and make development easy and fun.
Related work: Project Darkstar (a.k.a. Sun Game Server)
Open source online game server framework written in Java by Sun Microsystems.
- Coding: developers need intermediate-level Java programming knowledge.
- Cost: high. Developers need to prepare their own infrastructure.
- Robustness, Scalability: supposed to be available, but never verified by an actual service.
Related work: BigWorld
Proprietary server framework by BigWorld Technology.
- Coding: developers only need to learn Python. Coding is easy, but it easily causes severe problems (it is easy to generate frequent database accesses unconsciously).
- Cost: high. Users need to prepare their own infrastructure.
- Robustness, Scalability: supposed to be available, but actually NOT (it has hung up many times in actual environments).
Problems of previous solutions
- Coding: programming is still difficult (intermediate-level programming knowledge is required, or there are big side effects).
- Cost: no consideration for decreasing infrastructure cost (no framework can handle multiple applications, so none of them can be a PaaS).
Lua
- lightweight, reflective, imperative and (possibly) functional programming language, mainly designed for embedding into C programs
- powerful reflection feature called metatable (looks like C++ operator overloading)
- flexible feature called environment table, which provides namespaces
- all data structures are described by one data type called table (a kind of associative array)
(see Wikipedia for more features ;-))
terms(1)
- servant (servant node): the node which does the actual job in a distributed computing system. The word 'worker' may be more familiar, but in these slides 'worker' means a worker thread running in each node, so 'servant' is used for the node.
- VM (luaVM): Lua bytecode interpreter.
- fiber: a thread (called a coroutine in Lua) which cooperatively executes one Lua function call. It suspends itself by explicit yielding.
- KVS: distributed key-value store. The focus here is on master-servant type key-value stores like kumofs (made in Tsukuba :P).
terms(2)
- Object: a KVS record with a Lua script binding, so that developers can access its data from their programs. An object behaves as a Lua table in scripts.
- Object ID: pfm generates KVS records by itself, so it needs to generate unique keys on its own. The object ID is such a self-assigned unique key.
- Object method: a Lua function object (functions are first-class objects in Lua) which is related to an object as an element of its table.
pfm
- Programming framework for the scripting language Lua on a distributed key-value store (KVS), written in C/C++.
- Developers can describe interactions between KVS records in Lua scripts, with automatic asynchronous RPC between luaVMs.
- Can host multiple applications even when only one node is available.
[Diagram: several fibers (cooperative threads) calling object methods through asynchronous RPC; legend: fiber suspended, fiber executed]
Each fiber is related to one object (the method owner).
    v = object:func(arg1, arg2, ..., argN)
is syntax sugar for
    v = object.func(object, arg1, arg2, ..., argN)
which is syntax sugar for
    v = object["func"](object, arg1, arg2, ..., argN)
- pfm modifies the [...] behavior: if the object is stored in a remote node, object["func"] is not an actual value but an RPC context.
- pfm modifies the (...) behavior: func, object, arg1, arg2, ..., argN are sent to the remote fiber, and the calling fiber yields execution to another fiber.
- When the RPC reply comes back, the calling fiber is resumed.
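The mechanism above can be sketched in plain Lua. This is only an illustration under assumptions, not pfm's actual implementation: `make_remote` and `fake_network` are hypothetical names, and the "network" is a local function. A metatable `__index` handler returns a callable RPC context, and calling it yields the current coroutine until a reply is passed back in.

```lua
-- Hypothetical sketch: make_remote and fake_network are not part of pfm.
local function make_remote(object_id)
  return setmetatable({}, {
    -- object["func"] finds no real value, so an RPC context is returned.
    __index = function(_, method)
      return function(self, ...)
        -- Hand the request to the scheduler and suspend this fiber.
        return coroutine.yield({ id = object_id, method = method, args = { ... } })
      end
    end,
  })
end

-- Stand-in for the remote node: computes a reply for a request.
local function fake_network(request)
  if request.method == "add" then
    return request.args[1] + request.args[2]
  end
  return nil
end

-- A fiber that performs a "remote" call as if it were a local one.
local result
local fiber = coroutine.create(function()
  local object = make_remote(42)
  result = object:add(1, 2)   -- yields here, resumes with the reply
end)

local _, request = coroutine.resume(fiber)   -- runs until the RPC is issued
local reply = fake_network(request)          -- the reply arrives later
coroutine.resume(fiber, reply)               -- the fiber continues with the value
print(result)
```

While the fiber is suspended, a real scheduler would be free to resume other fibers, which is what makes the call asynchronous even though the script reads sequentially.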
    function player:func(target)
      local n = target:foo()
      local r = target:bar(n)
      return r
    end
    function player:foo()
      return self.data
    end
    function player:bar(n)
      return self.item:baz(n)
    end
    function item:baz(n)
      return self.data + n
    end
Host 1: Fiber1 (player:func). Host 2: Fiber2 (target: foo, bar). Host 3: Fiber3 (item:baz).
1. Fiber1 calls target:foo() and yields; fiber2(foo) starts on Host 2.
2. fiber2(foo) returns self.data; Fiber1 resumes.
3. Fiber1 calls target:bar(n) and yields; fiber2(bar) starts (fiber2(foo) and fiber2(bar) may be different fibers).
4. fiber2(bar) calls self.item:baz(n) and yields; fiber3(baz) starts on Host 3.
5. fiber3(baz) returns self.data + n; Fiber2 resumes.
6. fiber2(bar) returns the result of self.item:baz(n); Fiber1 resumes.
7. Fiber1 assigns local r = target:bar(n) and returns r.
    function func()
      local v = global_variable
      return v
    end
[Diagram: global_variable is looked up in the environment table instead of the real global table]
- Each function can have its own table, called the environment table, which replaces the global table (global namespace). Switching the namespace lets the same name refer to a different body.
- Every application hosted by a pfm luaVM has its own environment table. When a fiber is created, pfm attaches the environment table of the application that dispatched the RPC call.
- Thus a pfm luaVM can support multiple applications with one node.
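A minimal sketch of per-application namespaces follows. It is an assumption about how such isolation can be done in plain Lua, not pfm's code: `load_in_env` is a hypothetical helper, using `setfenv` on Lua 5.1 and the `env` argument of `load` on Lua 5.2 and later.

```lua
-- Hypothetical helper: compile chunk_source so that its globals live in env.
local function load_in_env(chunk_source, env)
  if setfenv then                                      -- Lua 5.1 / LuaJIT
    local f = assert(loadstring(chunk_source))
    return setfenv(f, env)
  end
  return assert(load(chunk_source, "app", "t", env))   -- Lua 5.2 and later
end

-- The same script is loaded for two applications.
local source = "function func() local v = global_variable return v end"
local app_a = { global_variable = "value for application A" }
local app_b = { global_variable = "value for application B" }

load_in_env(source, app_a)()   -- defines func inside app_a only
load_in_env(source, app_b)()   -- defines func inside app_b only

print(app_a.func())
print(app_b.func())
```

Both applications define a function with the same name, but each copy reads `global_variable` from its own table, and neither definition leaks into the real global table.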
[Diagram: master-servant (worker) style distributed key-value store]
- client node to servant node: get/put
- servant node to servant node: replication
- master node: node membership, failure detection
[Diagram: pfm]
- client node to servant node: rpc
- servant node: rpc, object creation, object ID generation, replication
- master node: login/logout (from all servants), node membership, failure detection
Node role difference
KVS
- Client node: dispatch put/get
- Servant node: process put/get, replication
- Master node: node membership, failure detection
Pfm
- Client node: process/dispatch rpc
- Servant node: process/dispatch rpc, object creation, object ID generation, replication
- Master node: login/logout, node membership, failure detection
pfm
- Coding: easy and fun. Lua is familiar to game developers, and inter-VM RPC hides the difficulties of multi-threaded and distributed programming from them.
- Cost: cheap (pfm can run as a PaaS). Users do not need to prepare their own infrastructure.
- Robustness, Scalability: pfm is based on a KVS, so it is supposed to fail over and scale to some level.
Login/logout on pfm
- Usually a KVS is used as the backend of a network service, so all nodes are trusted.
- In pfm, however, the servant nodes of the KVS are also the frontend of the service, so many untrusted nodes connect to servant nodes.
- So each connected node needs to be authenticated.
1. The client node sends a login request with account and world ID to a servant node.
2. The servant node forwards it to the master node (duplicate login check).
3. If there is no error, the master node returns the object ID. On first login, a new object ID is assigned.
4. The actual authentication is performed on each servant node, for scaling.
5. Each servant node knows where the object exists from the object ID and consistent hashing, so it sends a load/create query to that node.
6. That node returns the load/create result (the loaded player object).
7. If load/create succeeds, the node which the client accessed first retains a copy of the object, and the client accesses pfm only through RPC requests to this object. An object that is related to a client node in this way is called a 'player object'.
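The lookup in step 5 can be sketched as a small consistent-hash ring. The slides do not say which hash function or ring layout pfm uses, so everything here is an assumption: `hash` is a simple polynomial string hash, and `build_ring`, `locate`, and the node names are hypothetical.

```lua
-- Simple polynomial string hash into [0, 2^32); a stand-in for a real hash.
local function hash(s)
  local h = 0
  for i = 1, #s do
    h = (h * 31 + s:byte(i)) % 4294967296
  end
  return h
end

-- Place several virtual points per node on the ring, sorted by hash.
local function build_ring(nodes, points_per_node)
  local ring = {}
  for _, node in ipairs(nodes) do
    for i = 1, points_per_node do
      ring[#ring + 1] = { hash = hash(node .. "#" .. i), node = node }
    end
  end
  table.sort(ring, function(a, b) return a.hash < b.hash end)
  return ring
end

-- The owner of an object is the first ring point at or after its hash.
local function locate(ring, object_id)
  local h = hash(object_id)
  for _, point in ipairs(ring) do
    if point.hash >= h then return point.node end
  end
  return ring[1].node   -- past the last point: wrap around
end

local ring = build_ring({ "servant-a", "servant-b", "servant-c" }, 4)
print(locate(ring, "001122334455000000000001"))
```

Because every servant builds the same ring from the same membership list, each one can compute the owner locally without asking the master node. Removing a node only moves the objects that node owned.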
Generate object ID
- The ID is based on the MAC address (6 bytes) + an auto-increment value of each node (6 bytes).
- During initialization, the generator loads the current auto-increment value from a file and writes a 'fault flag'.
- When finalized normally, the generator removes the 'fault flag' from the file.
- If the fault flag exists during initialization, the generator assumes that an abnormal shutdown may have happened, so it adds a big value (1M) to the auto-increment value.
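The scheme can be sketched as follows. This is a simplified illustration, not pfm's generator: the file is replaced by an in-memory table `store`, the ID is rendered as 24 hexadecimal characters, and `new_generator`, `next_id`, and `close` are hypothetical names.

```lua
local FAULT_SKIP = 1000000   -- the "big value (1M)" from the slide

-- mac: 12 hex characters; store: stand-in for the file { counter, fault }.
local function new_generator(mac, store)
  local counter = store.counter or 0
  if store.fault then
    -- The previous run never cleared the flag, so the saved counter may be
    -- stale. Skip ahead to avoid re-issuing IDs handed out before the crash.
    counter = counter + FAULT_SKIP
  end
  store.fault = true           -- set the flag while the generator is running

  local gen = {}
  function gen.next_id()
    counter = counter + 1
    -- 6-byte MAC address followed by a 6-byte counter, both in hex.
    return string.format("%s%012x", mac, counter)
  end
  function gen.close()
    store.counter = counter    -- persist the counter on a clean shutdown
    store.fault = nil          -- and clear the fault flag
  end
  return gen
end

local store = {}
local first = new_generator("001122334455", store)
local id_before_crash = first.next_id()
-- No close() here: this simulates an abnormal shutdown.
local second = new_generator("001122334455", store)
local id_after_crash = second.next_id()
second.close()
print(id_before_crash, id_after_crash)
```

The skip only protects uniqueness if fewer than 1M IDs were issued in the run that crashed; the sketch relies on that assumption.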
Naming convention
- In pfm, users can change the behavior of an RPC by choosing the name of the called function according to some conventions.
- This is similar to how ActiveRecord in RoR (Ruby on Rails) guesses function/variable names from record relationships.
Convention #1: _{function name}
e.g. object:_protected_routine()
An RPC whose procedure name starts with '_' can only be called from a trusted node. (Currently only client nodes are untrusted.)
- Host 1, Fiber1 (not trusted, e.g. client node) calls object:_protected_routine(): NG, because Host 1 is not trusted (a client node may be a cheater). An error is returned.
- Host 3, Fiber3 (trusted, e.g. servant node) calls object:_protected_routine(): OK, because Host 3 is a trusted node (servant nodes are prepared by the service provider, so they are trusted). The result is returned.
- Host 2, Fiber2: the servant node where the RPC target object exists.
Convention #2: notify_{function name}
e.g. object:notify_chat(msg)
An RPC whose procedure name starts with 'notify_' is automatically understood by the system as a call to the procedure named by removing 'notify_' from the original name, without waiting for the reply.
- Host 1, Fiber1 calls object:notify_chat(msg) and does not wait for the reply (execution continues).
- Host 2, Fiber2 runs object:chat(msg).
- The reply comes back, but it is ignored.
Convention #3: client_{function name}
e.g. object:client_open_ui(url)
An RPC whose procedure name starts with 'client_' is automatically understood by the system as a call to the client node procedure named by removing 'client_' from the original name.
- Host 1, Fiber1 calls object:client_open_ui(url).
- Host 2, Fiber2: if the target object is a player object (attached to a session), open_ui(url) is forwarded to the client.
- Host 3, Fiber3 (client node) runs open_ui(url), and its reply is forwarded back.
Convention #4: combination
Conventions can be combined.
e.g. notify_client_open_ui: calls the client procedure open_ui and does not wait for the reply.
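Conventions #1 to #4 can be read as a small parser over the procedure name. The sketch below is an assumption about how such parsing might look, not pfm's dispatcher: `parse_procname` and `is_allowed` are hypothetical, the prefixes are accepted in any order, and the leading '_' of a protected name is kept because the slides only mention removal for 'notify_' and 'client_'.

```lua
-- Split a procedure name into its convention flags and the remaining name.
local function parse_procname(procname)
  local call = { notify = false, client = false, protected = false }
  local rest = procname
  local changed = true
  while changed do
    changed = false
    if rest:sub(1, 7) == "notify_" then
      call.notify, rest, changed = true, rest:sub(8), true
    elseif rest:sub(1, 7) == "client_" then
      call.client, rest, changed = true, rest:sub(8), true
    end
  end
  call.protected = rest:sub(1, 1) == "_"   -- convention #1
  call.name = rest
  return call
end

-- Convention #1: protected procedures may only be called by trusted nodes.
local function is_allowed(call, caller_is_trusted)
  return caller_is_trusted or not call.protected
end

local call = parse_procname("notify_client_open_ui")
print(call.name, call.notify, call.client, call.protected)
```

A dispatcher could then use `call.notify` to decide whether the calling fiber yields for a reply, and `call.client` to decide whether the request is forwarded to the client session.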
Convention #5: user-defined convention
Users can define conventions like #1 to #3 by defining a Lua function which receives the target object and the function name, and returns a new function to be executed by the fiber.
- Host 1, Fiber1 calls object:funcname(...).
- Host 2, Fiber2: the checker returns a new function if some convention rule applies, or otherwise the original function.
Convention #5: user-defined convention
Example of a convention checker: if the function name starts with 'broadcast_', call the RPC named by removing 'broadcast_' from the original procedure name on every member variable of the target object.
    function hook_check_convention(procname, obj)
      -- plain-text search; string.find returns 1-based start and end positions
      local s, e = string.find(procname, "broadcast_", 1, true)
      if s == 1 then
        local funcbody = string.sub(procname, e + 1)  -- name without the prefix
        return function(object, ...)
          local k, v = pfm.next(object)
          while k do
            -- pfm.typeof(v) is assumed to be truthy only for pfm objects;
            -- v is passed as self, as v:func(...) would do
            if pfm.typeof(v) then v[funcbody](v, ...) end
            k, v = pfm.next(object, k)
          end
        end
      end
      return obj[procname]  -- no convention matched: return the original function
    end
Future of pfm
- robustness (data replication, failover)
- RPC over HTTP (for working cooperatively with web services)
- cooperation with Unity3D (a game development IDE)
Implementation plan: replication
[Diagram: a fiber (related to an object) updates the cache and refers to data; the cache is applied when fiber execution finishes]
Implement a fiber-local cache.
- Once a fiber reads or writes object data, the actual object data does not change; the fiber-local cache stores the current value of the object data.
- After the fiber finishes execution, the fiber-local cache updates the actual object values all at once.
- This may cause data update conflicts. But for online games, the order of some data changes can be ignored, so basically the data is updated from the cache as is. For data changes that need to update the master data immediately, a programming convention is planned, like the volatile keyword in C/C++.
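Since this is only a plan in the slides, the following is a speculative sketch of the idea rather than pfm's design. `with_fiber_cache` is a hypothetical helper that buffers writes in a per-fiber table and applies them in one step. As a simplification it reads through to the object instead of caching reads, and it cannot buffer an assignment of nil.

```lua
-- Wrap an object so that writes are buffered until commit() is called.
local function with_fiber_cache(object)
  local cache = {}
  local view = setmetatable({}, {
    __index = function(_, key)
      local cached = cache[key]
      if cached ~= nil then return cached end
      return object[key]                 -- not written yet: read through
    end,
    __newindex = function(_, key, value)
      cache[key] = value                 -- the real object is left untouched
    end,
  })
  local function commit()
    -- Called when the fiber finishes: apply all buffered writes at once.
    for key, value in pairs(cache) do
      object[key] = value
    end
  end
  return view, commit
end

local player = { hp = 10, name = "hero" }
local view, commit = with_fiber_cache(player)
view.hp = view.hp - 3        -- visible to this fiber only
print(view.hp, player.hp)    -- cached value, then the unchanged object value
commit()
print(player.hp)
```

If two fibers commit changes to the same field, the later commit wins. That is the update conflict the slide accepts for data whose ordering does not matter.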
terms(3)
- Message ID: for asynchronous RPC, pfm sends each RPC command with an ID that is incremented in round-robin fashion, to distinguish which reply belongs to which RPC. This ID is called the message ID.
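A minimal sketch of matching replies to requests by message ID is shown below. `new_rpc_table`, `send`, and `receive` are hypothetical names, and the ID range is an assumption, since the slides do not state the width of the message ID.

```lua
-- Track pending RPCs by message ID; IDs run 1..max_id and then wrap to 1.
local function new_rpc_table(max_id)
  local t = { next_id = 0, pending = {} }

  -- Register a reply handler and return the message ID to send with the RPC.
  function t.send(on_reply)
    t.next_id = t.next_id % max_id + 1
    t.pending[t.next_id] = on_reply
    return t.next_id
  end

  -- Deliver a reply to the handler registered under msgid, exactly once.
  function t.receive(msgid, value)
    local on_reply = t.pending[msgid]
    if not on_reply then return false end   -- unknown or already answered
    t.pending[msgid] = nil
    on_reply(value)
    return true
  end

  return t
end

local rpc = new_rpc_table(65535)
local got = {}
local id_foo = rpc.send(function(value) got.foo = value end)
local id_bar = rpc.send(function(value) got.bar = value end)
rpc.receive(id_bar, "reply to bar")   -- replies may arrive out of order
rpc.receive(id_foo, "reply to foo")
print(got.foo, got.bar)
```

In pfm the handler would presumably resume the suspended fiber. The sketch ignores the case where the counter wraps while an old ID is still pending.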
Implementation plan: RPC failover
[Diagram: servants A and B (holding player objects) send RPCs to servant1 (object) over the active connection for this RPC; servant2 (replica) is the stand-by connection for this RPC; servant1 replicates to servant2]
- Each RPC packet sent from servant A or B to servant1 is stored in the sender node's memory until servant1 replies.
- Once an RPC is processed, servant1 updates the 'last processed message ID from servant A or B', keyed by remote address.
- This value is also sent to servant2 (the replica host) with the replication packet.
Implementation plan: RPC failover
- After servant1 fails, servants A and B try to use servant2 as the primary servant node.
- Servants A and B then re-send to servant2 the packets that servant1 did not reply to.
- On receiving an RPC packet re-sent from servant A or B, servant2 compares the message ID in the re-sent packet with the last processed message ID it has recorded. If the re-sent packet's message ID is greater, the packet is processed; otherwise it is discarded.
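The duplicate check on servant2 can be sketched as follows. This illustrates the planned rule only: `new_replica`, `on_replication`, and `on_resend` are hypothetical names, and message ID wraparound is not handled.

```lua
-- State kept by the replica host (servant2 in the slides).
local function new_replica()
  local replica = { last_processed = {}, processed = {} }

  -- A replication packet from the primary carries the last processed ID.
  function replica.on_replication(sender, msgid)
    replica.last_processed[sender] = msgid
  end

  -- After failover, senders re-send every packet that got no reply.
  function replica.on_resend(sender, msgid, payload)
    local last = replica.last_processed[sender] or 0
    if msgid <= last then
      return false                       -- the old primary already processed it
    end
    replica.processed[#replica.processed + 1] = payload
    replica.last_processed[sender] = msgid
    return true
  end

  return replica
end

local servant2 = new_replica()
servant2.on_replication("servantA", 5)   -- servant1 processed up to ID 5
print(servant2.on_resend("servantA", 5, "already applied"))
print(servant2.on_resend("servantA", 6, "lost with servant1"))
```

The comparison is kept per sender address, so a low message ID from servant B is not confused with an already processed ID from servant A.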

Pfm technical-inside

  • 1. Pfm – PaaS for MMO Takehiro Iyatomi
  • 2. Introduction (of me) - game developper from kanagawa
  • 3. Introduction (of me) - game developper from kanagawa - food fighter in Tsukuba vs Ran Ran, Yume-ya, Claret, etc...
  • 4. problem - difficulty to develop/start up online game (infrastructure cost, development difficulty) - it loses variety of online games ecosystem
  • 5. Project purpose Provide online game developping environment which is: Coding - easy and fun Cost - cheap Robustness – available Scalability – available make total cost for starting online game so cheap and provide easy and fun developping
  • 6. Related works : project darkstar (a.k.a sun grid server) Open source online game server framework written in Java by sun micro systems Coding - developper need to learn middle level Java programming knowledge Cost – high. developper need to prepare their own infrastructure Robustness , Scalability – it suppose to be available but it never checked by actual services.
  • 7. Related works : BigWorld Proprietary server framework by BigWorld technology. Coding – developper only need to learn python to develop. easy coding but easily makes severe problem (because easily generate frequent access to database unconsiously) Cost – high. user need to prepare their own infrastructure Robustness , Scalability – it suppose to be available but actually NOT. (it has hanged up many times with actual environment.)
  • 8. Problem of previous solution - Coding : still programming difficulty exists (middle level programming knowledge or big side effect) - Cost : lack of the view of decreasing cost for infrastructure (no framework can handle multiple application, so none of these cannot be PaaS)
  • 9. lua - lightweight, reflective, imperative and (possibly) functional programming languages mainly designed for embedding to C program - powerful reflection feature called metatable (looks like C++ operator overload) - flexible feature called environment table to provide namespace - all data structure is described by one data type called table (a kind of associative array) (see wikipedia for more feature ;-))
  • 10. terms(1) - servant (servant node) : the node which do actuall job in distribute computing system. Usually the word 'worker' may more familiar, but in this slide, the word 'worker' used for the meaning of worker thread running in each node, so use 'servant' as same meaning. - VM(luaVM) : lua byte code interpreter - fiber : thread (in lua, called coroutine) which cooperatively executes 1 lua function call. It suspends itself by explicit yielding - KVS : distributed key value store. Especially focus on master-servant type key value store like kumo-fs (made in Tsukuba :P)
  • 11. terms(2) Object : KVS record which is given lua script binding so that developpers can access its data from their program. Object behave as lua table in script. Object ID : pfm generate KVS record by itself, so need to generate unique key by its own. Such a self-assigned unique key. Object method : lua function object(function is first class object in lua) which relate with object as its table element.
  • 12. pfm - Programming framework for script language lua on distributed key-value store(KVS) (written in C/C++). - Developper can describe interaction between records on KVS by lua scripting with automatic inter-luaVM asynchronous RPC. - Can handle multiple application on it even only 1 node available
  • 13. fiber fiber fiber fiber fiber fiber (Call object method) object fiber (cooperative thread) Fiber suspended Fiber executed Each Fiber relate with 1 object (method owner) Asynchronous RPC
  • 14. v = object:func(arg1, arg2, ..., argN) v = object.func(object,arg1, arg2, ..., argN) v = object:func(arg1, arg2, ..., argN) v = object:func(arg1, arg2, ..., argN) v = object[“func“] (object,arg1, arg2, ..., argN) v = object[“func“] (object,arg1, arg2, ..., argN) func, object,arg1, arg2, ..., argN is sent to remote fiber yields execution to another fiber v = object[“func“](object,arg1, arg2, ..., argN) Modify [...] behavior if object is stored in remote node object[“func”] is not actual value but rpc context in this case Modify (...) behavior Syntax sugar Syntax sugar RPC reply back and fiber resumed
  • 15. function player:func(target) local n = target:foo() local r = target:bar(n) return r end function player:foo() return self.data end function player:bar(n) return self.item.baz(n) end function item:baz(n) return self.data + n end Host 1 Host 2 Host 3 Fiber1 (player: func) Fiber3 (item: baz) Fiber2 (target: foo,bar) target:foo() Fiber1 resume fiber2(foo) start return self.data target:bar(n) fiber2(bar) start (fiber2(foo), fiber2(bar) may different) self.item.baz(n) fiber2(foo) start fiber3(baz) start return self.data+n return self.item.baz(n) local r = target:bar(n) return r Fiber2 resume
  • 16. function func() local v = global_variable return v end Real global table Environment table Each function can have own table variable which is called environment table that can replace Global table (global namespace) Switch namespace so that same name but different body Environment table All application which is hosted by pfm luaVM has its own environment table. And when each fiber created, pfm attach environment table whose owner application is dispatch this rpc call. Thus pfm luaVM can support multiple application with 1 node.
  • 17. master node servant node client node get/put replication node membership,failure detection Master-servant(worker) style Distribute Key-value store
  • 18. master node servant node client node rpc rpc, object creation , generate object ID , replication login/logout , node membership, failure detection Pfm (from all servant)
  • 19. Node role difference KVS Client node : dispatch put/get
  • 20. Servant node : process put/get, replication
  • 21. Master node : node membership, failure detection Pfm Client node : process/dispatch rpc
  • 22. Servant node : process/dispatch rpc, object creation, generate object ID , replication
  • 23. Master node : login/logout , node membership, failure detection
  • 24. pfm Coding – easy and fun. lua is familiar to game developper and inter-VM RPC hides every difficulty of multi thread / distribute computing system programming from them. Cost - cheap. (pfm can run as PaaS). User dont need to prepare their infrastructure. Robustness , Scalability – because it based on KVS. It supposes to failover, and scale to some level.
  • 25. Login/logout on pfm - usually KVS is used as backend of network service, so all node trusted. - but pfm. Servant node of KVS is also frontend of service, so many untrust node connects to servant node. - so need to authenticate each connected node.
  • 26. 1. Send login w/account, world ID
  • 27. 1. Send login w/account, world ID 2. Forward to master node (duplicate login check)
  • 28. 1. Send login w/account, world ID 3. If no error, master node returns object ID. If first login, newly assigned 2. Forward to master node (duplicate login check)
  • 29. 1. Send login w/account, world ID 3. If no error, master node returns object ID. If first login, newly assigned 2. Forward to master node (duplicate login check) 4. Actual authentication Performed each servant node For scaling
  • 30. 1. Send login w/account, world ID 2. Forward to master node (duplicate login check) 5. Each servant node knows where object Exists from object ID and consistent hash, Then request load/create query to the node. 3. If no error, master node Returns object ID. If first login, newly assigned 4. Actual authentication Performed each servant node For scaling
  • 31. 1. Send login w/account, world ID 2. Forward to master node (duplicate login check) 5. Each servant node knows where object Exists from object ID and consistent hash, Then request load/create query to the node. 3. If no error, master node Returns object ID. If first login, newly assigned 4. Actual authentication Performed each servant node For scaling 6. Return load/create result
  • 32. 1. Send login w/account, world ID 2. Forward to master node (duplicate login check) 5. Each servant node knows where object Exists from object ID and consistent hash, Then request load/create query to the node. 3. If no error, master node Returns object ID. If first login, newly assigned 4. Actual authentication Performed each servant node For scaling 6. Return load/create result Loaded player object 7. if load/create success. The node Which client access at first retains copy of the object. And client access to pfm Through rpc request to this object only. Such a object that relate with client node, Called 'player object'
  • 33. Generate object ID - it based on MAC address (6byte) + auto increment value of each node (6byte) - during initialization, generator load current auto increment value from file, and write 'fault flag' - when finalized normally, generator remove 'fault flag' from file. - if during initialization, fault flag is exist, generator thinks abnormal shutdown may happen, so add some big value (1M) to auto increment value.
  • 34. Name convention - in Pfm, user can change behavior of rpc by specifying function name to call with obeying some convention. - like a RoR(Ruby on Rails) activerecord guess the function/variable name according to record relationship.
  • 35. Convention #1: _{function name} e.g. object:_protected_routine() Host 1 Host 2 Host 3 Fiber1 (not trusted) e.g. client node Fiber3 (trusted) e.g. servant node Fiber2 Servant node which rpc target object is exist object:_procected_routine() NG : because Host 1 is Not trusted. (client node may cheater) OK : because Host 3 is trusted Node (servant node is prepared by service provider, so trusted) Return error Rpc call which procedure name starts with '_' only can call from trusted node. (Currently only client node is untrust.) Return result
  • 36. Convention #2: notify_{function name} e.g. object:notify_chat(msg) Host 1 Host 2 Fiber1 Fiber2 object: notify_ chat (msg) Rpc call which procedure name start with 'notify_' , autometically understood by System as trying to call rpc which name is removal of 'notify_' from original procedure name and does not wait reply. Call object: chat (msg) does not wait reply (execution continues) Reply is back, but it will ignore
  • 37. Convention #3: client_{function name} e.g. object:client_open_ui(url) Host 1 Host 2 Fiber1 Fiber2 object: client_ open_ui (msg) Rpc call which procedure name start with 'client_' , autometically understood by System as trying to call client node rpc which name is removal of 'client_' from original procedure name. Host 3 Fiber3( client node ) If target object is Player object (attach with Session), Forward open_ui (msg) To client Forward Reply of open_ui (msg)
  • 38. Convention #4: combination Conventions can be used with combination. eg) notify_ client_ open_ui Try to call client procedure open_ui and does not wait reply
  • 39. Convention #5: user-defined convention Host 1 Host 2 Fiber1 Fiber2 Call object : funcname (...) User define convention such as convention #1 - #3 by defining lua function Which receive target object and funcname and return new function which executed with fiber. Returns new function (if some convention rule enabled) or, Return original function
  • 40. Convention #5: user-defined convention - example of a convention checker: if the function name starts with 'broadcast_', call the rpc whose procedure name is the original name with 'broadcast_' removed, on every member variable of the target object.
      function hook_check_convention(procname, obj)
        -- plain-text search; string.find returns 1-based positions
        local s, e = string.find(procname, "broadcast_", 1, true)
        if s == 1 then
          -- the part of the name after the 'broadcast_' prefix
          local funcbody = string.sub(procname, e + 1)
          return function(object, ...)
            local k, v = pfm.next(object)
            while k do
              -- call the rpc on each member, passing the member as self
              if pfm.typeof(v) then v[funcbody](v, ...) end
              k, v = pfm.next(object, k)
            end
          end
        end
        -- no convention matched: return the original function
        return obj[procname]
      end
  • 41. Future of pfm - robustness (data replication, failover) - support rpc over http (for working cooperatively with web services) - support cooperation with Unity3D (game development IDE)
  • 42. Implementation plan: replication - implement a fiber-local cache. When a fiber reads or writes object data, the actual object data does not change; the fiber-local cache holds the current value instead. After the fiber finishes executing, the fiber-local cache updates the actual object values all at once. This may cause update conflicts, but for online games the order of some data changes can be ignored, so by default data is updated with the cached values as they are. For data changes that must update the master data immediately, we want to provide a programming convention similar to the volatile keyword in C/C++. [Diagram: fiber (related to an object) -> update cache / refer to data -> fiber execution finishes.]
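
The fiber-local cache planned above could look roughly like the following sketch. It is not pfm code: new_fiber_cache, the proxy table, and the commit function are hypothetical names, a plain Lua table stands in for a pfm object, and conflict handling and the volatile-like convention are left out.

```lua
-- Hypothetical sketch (not pfm API): a per-fiber view of an object that
-- buffers reads and writes and applies the writes when the fiber finishes.
-- Limitation of this sketch: writing nil to a field is not supported.
local function new_fiber_cache(object)
  local cache, dirty = {}, {}
  -- the proxy stays empty, so every access goes through the metamethods
  local proxy = setmetatable({}, {
    __index = function(_, key)
      -- first read copies the current value into the fiber-local cache
      if cache[key] == nil then cache[key] = object[key] end
      return cache[key]
    end,
    __newindex = function(_, key, value)
      -- writes touch only the cache; the actual object is unchanged
      cache[key] = value
      dirty[key] = true
    end,
  })
  -- called once when the fiber finishes: apply all writes at once
  local function commit()
    for key in pairs(dirty) do object[key] = cache[key] end
    cache, dirty = {}, {}
  end
  return proxy, commit
end
```

A fiber would work on the proxy (for example view.hp = view.hp - 30) and the object itself would only see the new value after commit() runs. Two fibers committing to the same field overwrite each other in commit order, which is the conflict the slide accepts for order-insensitive game data.
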
  • 43. terms(3) Message ID : to tell which reply belongs to which asynchronous RPC, pfm sends each RPC command with an ID that is incremented in round-robin fashion. This ID is called the Message ID.
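
As an illustration of how a Message ID can pair replies with asynchronous calls, here is a minimal sketch. The names new_rpc_table, call and reply, the callback style (pfm resumes a waiting fiber instead), and the wrap-around limit MAX_MSGID are assumptions made for this example, not details taken from pfm.

```lua
-- Hypothetical sketch (not pfm API): match asynchronous replies to calls
-- by a message ID that increments and wraps around.
local MAX_MSGID = 65535  -- assumed wrap-around limit, not from pfm

local function new_rpc_table()
  local rpc = { next_id = 1, waiting = {} }

  -- register a pending call and return the message ID to send with it
  function rpc:call(on_reply)
    local msgid = self.next_id
    self.next_id = (msgid % MAX_MSGID) + 1  -- 1, 2, ..., MAX_MSGID, 1, ...
    self.waiting[msgid] = on_reply
    return msgid
  end

  -- deliver a reply; returns false if no call is waiting for this ID
  function rpc:reply(msgid, result)
    local on_reply = self.waiting[msgid]
    if on_reply == nil then return false end
    self.waiting[msgid] = nil
    on_reply(result)
    return true
  end

  return rpc
end
```

Replies may arrive in any order; each one is routed to its own caller by the ID, and a reply for an ID that is no longer waiting is simply rejected.
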
  • 44. Implementation plan: RPC failover - each RPC packet sent from servant A or B to servant1 is kept in the sender node's memory until servant1 replies. Once an rpc is processed, servant1 updates its 'last processed message ID from servant A or B' record, keyed by remote address. This record is also sent to servant2 (the replica host) in a replication packet. [Diagram labels: RPC, RPC replication, object, Player object, active connection for this rpc, stand-by connection for this rpc.]
  • 45. Implementation plan: RPC failover - after a fault on the servant1 node, servants A and B try to use servant2 as the primary servant node, and re-send to servant2 the packets that servant1 has not replied to. On receiving an rpc packet re-sent from servant A or B, servant2 compares the message ID in the re-sent packet with the last processed message ID it has recorded. If the re-sent packet's message ID is greater, the packet is processed; otherwise it is discarded. [Diagram labels: RPC, RPC replication, resend RPC, object, Player object, active connection for this rpc, stand-by connection for this rpc.]
  • 46. Implementation plan: RPC failover - after the re-sending is finished, servants A and B send their rpc packets to servant2, and servant2 starts replication to the next node. [Diagram labels: RPC, RPC replication, object, Player object, active connection for this rpc, stand-by connection for this rpc.]
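
The re-send check in the failover plan can be sketched as below. This is an illustration under stated assumptions rather than the planned implementation: new_servant, receive, replicate and the packet fields from, msgid and payload are hypothetical, message IDs are assumed not to wrap around during the comparison, and replication of the object data itself is omitted.

```lua
-- Hypothetical sketch (not pfm API): a servant that remembers the last
-- processed message ID per sender and discards re-sent duplicates.
local function new_servant()
  local servant = { last_processed = {}, log = {} }

  -- returns true if the packet was processed, false if it was discarded
  function servant:receive(packet)
    local last = self.last_processed[packet.from] or 0
    if packet.msgid <= last then
      return false  -- already processed before the failover
    end
    self.log[#self.log + 1] = packet.payload  -- stand-in for running the rpc
    self.last_processed[packet.from] = packet.msgid
    return true
  end

  -- replication packet from the primary: record the processed ID only
  function servant:replicate(from, msgid)
    self.last_processed[from] = msgid
  end

  return servant
end
```

For example, if servant1 processed message 1 from servant A and replicated that fact to servant2 before failing, then when A re-sends messages 1 and 2 to servant2, message 1 is discarded and message 2 is processed.
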
  • 47. Pfm is currently powered by - lua 5.1.4 with the coco patch (provides 'true' yield) and a byte code portability patch by me - tokyocabinet 1.4.33 (fast DBM) - msgpack 0.4.1 (binary serialization format; this implementation is specialized for streaming serialization, where record boundaries are unknown) - libconhash (consistent hashing library which provides fast node search using a red-black tree) - SFMT (SIMD-oriented Fast Mersenne Twister) with a multi-thread patch by me
  • 48. Problems I want to solve - a single application seems able to use only about 1000 nodes at most - lua fibers use a lot of memory (36K+ per fiber); can this be reduced? - a better msgpack implementation for pfm's purposes - change the name pfm to something else LOL (how about yue? It means 'moon' in Chinese, like lua, and sounds like a moe character)

Editor's Notes

  • #13: Developer : a game developer who develops an online game using pfm.
  • #18: Servant node : in computer science, 'worker node' may be the more familiar term. But in these slides, 'worker' refers to an OS thread on each servant node, so I use 'servant' in the sense of 'worker'. Thus, a servant node is the kind of node in a distributed computing system that provides the actual service to clients.