Worldforge is a project that aims to develop online role-playing game servers under massively multi-user constraints: thousands of players may come and play in the very same world.
The requirements for such a development are significant. We must be able to manage a great number of network connections and react very quickly to incoming events. We also have to compute interactions between the characters and send the results back to all client programs.
So, the main technical goals for such a platform are:
Those two needs alone show that we must think about developing a distributed server from the outset.
The Erlang language has intrinsic features that make this distributed server much easier to develop than in most other languages. It also offers valuable capabilities for such a server, such as hot swapping of running code without service interruption. Lastly, Erlang was created at Ericsson for its telecom switches, whose requirements are similar to those of a Worldforge server: a high number of simultaneous connections and the ability to bring information to the right destination. At this point, the analogy with Worldforge STAGE is obvious.
The idea of using Erlang for such a development is therefore appealing, so I decided to experiment with it.
Through a series of short articles like this one, I propose to describe how such a development is designed and written.
I will describe this development using STAGE, one of the servers proposed by Worldforge, as the basis.
STAGE was originally designed by Bryce Harrington and was meant to be implemented in C++, although this is not an absolute requirement. We are going to write an Erlang implementation, judiciously called Erly-Stage. ;-) With this in mind, we will also want to:
Clearly, our intention here is to allow client programs to connect to either implementation transparently.
This article begins with a tour of the architecture of Erly-Stage.
STAGE is designed around several agents and is based on several toolboxes.
Thor is the agent that federates a set of application modules. It is responsible for launching and supervising each module. In Erly-Stage, Thor's code looks something like this:
start( ThorNode, MercuryNode ) ->
    ?INFO("[Thor] starting on node ~p ~n", [ThorNode]),
    %% Media server
    muse:start(),
    PegasusProcess = pegasus:start(ThorNode),
    %% TCP connection manager, given its port and the Pegasus process
    MercuryProcess = mercury:start(MercuryNode, ?PORT,
                                   PegasusProcess),
    %% Entity movement, then game world persistence
    shepherd:start(),
    echo:start(),
    io:format("~n[Thor] starting script finished ~n").
This is a classic Erlang construct, so I won't say much about it. The only point to note is that each agent can be launched on a different Erlang node. This is the first step toward distributing the application in Erly-Stage.
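To make the idea of launching an agent on a chosen node more concrete, here is a minimal sketch. The module name node_start_sketch, the function start_agent/2 and the whereami message are hypothetical and are not part of Erly-Stage; the sketch only relies on the standard spawn/4 call, and it assumes the module is loaded on the target node.

```erlang
-module(node_start_sketch).
-export([start_agent/2, agent_loop/1]).

%% Hypothetical helper: start an agent process called Name on Node.
%% spawn/4 takes the target node as its first argument, so the same
%% call works for the local node and for a remote one.
start_agent(Node, Name) ->
    spawn(Node, ?MODULE, agent_loop, [Name]).

%% The agent answers with the node it is actually running on.
agent_loop(Name) ->
    receive
        {whereami, From} ->
            From ! {Name, node()},
            agent_loop(Name);
        stop ->
            ok
    end.
```

Calling node_start_sketch:start_agent(node(), thor) starts the agent locally; passing the name of another connected node would start it there instead, without any other change to the code.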
Muse is the media server, which we will study in a future article, along with Shepherd and Echo, which are respectively responsible for entity movement and game world persistence (i.e. mass storage).
In this first article, we'll see how Mercury works.
Mercury is the agent that accepts and manages TCP connections. It can also delegate its tasks to another agent, as we will see later.
In Erlang, here is the main code of the TCP manager:
    { ok, SSocket } = gen_tcp:listen(
        ServerPort,
        [ list,
          { packet, 0 },
          { active, false },
          { nodelay, true },
          { keepalive, true },
          { reuseaddr, true } ] ),
    io:format("[Mercury] Listening socket is ~p on node ~p~n",
              [SSocket, node()]),
    server_accept( SSocket, PegasusAgent ).
Here Mercury declares that it accepts TCP connections on ServerPort. When a connection arrives, Mercury spawns a dedicated process to take care of the newly created socket, as we can see here:
server_accept( SSocket, PegasusAgent ) ->
    { ok, CSocket } = gen_tcp:accept( SSocket ),
    io:format("[Mercury] new connection : ~p~n", [ CSocket ]),
    Pid = spawn(node(), ?MODULE, connection_loop,
                [CSocket, PegasusAgent]),
    gen_tcp:controlling_process( CSocket, Pid ),
    %% And we loop to accept other connections...
    ?MODULE:server_accept( SSocket, PegasusAgent ).
Note that gen_tcp:accept is the blocking point: the process really waits there until a connection arrives. The other point to note is that the loop is written as a fully qualified call (?MODULE:server_accept), which allows hot code swapping; this requires server_accept/2 to be exported. This will be useful here for debugging purposes.
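The effect of the fully qualified call can be observed with a small standalone sketch. The module swap_sketch and its version/0 function are hypothetical and exist only for illustration; the mechanism shown (a fully qualified call enters the most recently loaded version of the module) is standard Erlang behaviour.

```erlang
-module(swap_sketch).
-export([start/0, loop/1, version/0]).

%% Change this value and reload the module to observe the swap.
version() -> 1.

start() ->
    spawn(?MODULE, loop, [0]).

loop(Count) ->
    receive
        {ping, From} ->
            %% Local call: answered by the version currently running.
            From ! {pong, version(), Count},
            %% Fully qualified call: continues in the newest loaded
            %% version of the module, keeping Count as state.
            ?MODULE:loop(Count + 1);
        stop ->
            ok
    end.
```

If version/0 is edited to return 2 and the module is recompiled from the shell with c(swap_sketch) while the process is running, the process keeps its counter. The first ping after the reload is still answered by the old code, because the process was already waiting inside it; the following ones are answered by the new code.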
At this point, we have a TCP server that accepts connections and spawns a managing process for each one. Each such process receives data and processes it:
connection_loop( CSocket, PegasusAgent ) ->
    case gen_tcp:recv( CSocket, 0 ) of
        { ok, Bs0 } ->
            %% ... processing of incoming data ...
            %% ... we loop...
            connection_loop( CSocket, PegasusAgent );
        { error, closed } ->
            %% ... end of connection
            ok
    end.
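The fragments above can be assembled into a small self-contained server for experimentation. This is only a sketch under several assumptions: the module name accept_sketch and the start/0 function are hypothetical, the server listens on a port chosen by the operating system (port 0) rather than on ?PORT, there is no Pegasus agent, and the "processing" simply echoes the data back to the client.

```erlang
-module(accept_sketch).
-export([start/0, accept_loop/1, connection_loop/1]).

%% Hypothetical entry point: a dedicated acceptor process opens the
%% listening socket and reports the port chosen by the OS.
start() ->
    Parent = self(),
    Acceptor = spawn(fun() ->
        {ok, LSock} = gen_tcp:listen(0, [list, {packet, 0},
                                         {active, false},
                                         {reuseaddr, true}]),
        {ok, P} = inet:port(LSock),
        Parent ! {listening, self(), P},
        accept_loop(LSock)
    end),
    receive
        {listening, Acceptor, Port} -> {ok, Port}
    after 5000 ->
        {error, timeout}
    end.

accept_loop(LSock) ->
    %% Blocks until a client connects.
    {ok, CSock} = gen_tcp:accept(LSock),
    Pid = spawn(?MODULE, connection_loop, [CSock]),
    %% Hand the socket over to the process that manages it.
    _ = gen_tcp:controlling_process(CSock, Pid),
    ?MODULE:accept_loop(LSock).

%% Stand-in for the real processing: echo every block back.
connection_loop(CSock) ->
    case gen_tcp:recv(CSock, 0) of
        {ok, Data} ->
            _ = gen_tcp:send(CSock, Data),
            connection_loop(CSock);
        {error, _Reason} ->
            gen_tcp:close(CSock)
    end.
```

After {ok, Port} = accept_sketch:start(), any TCP client connected to that port on the local machine should get back whatever it sends.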
Incoming data arrives in blocks of undefined size. We therefore have to write a dispatcher, whose role will be to format this data into lists of strings. Then we will study Atlas, the protocol in charge of exchanging data between clients and servers. All this will be the subject of the next article...
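As a preview of what such a dispatcher has to deal with, here is a minimal sketch of the buffering problem. It is not the Erly-Stage dispatcher: the module dispatch_sketch and the function feed/2 are hypothetical, and the sketch assumes, purely for illustration, that messages are separated by newline characters, which says nothing about how Atlas actually frames its data. Since the socket is opened with the list option, each block is an Erlang string.

```erlang
-module(dispatch_sketch).
-export([feed/2]).

%% feed(Buffer, Chunk) -> {CompleteLines, NewBuffer}
%% Buffer holds the unfinished line left over from earlier blocks;
%% Chunk is the block just returned by gen_tcp:recv/2.
feed(Buffer, Chunk) ->
    split_lines(Buffer ++ Chunk, [], []).

%% Walk the characters, building the current line in reverse order.
split_lines([], Current, Lines) ->
    %% No more input: what remains is an incomplete line.
    {lists:reverse(Lines), lists:reverse(Current)};
split_lines([$\n | Rest], Current, Lines) ->
    %% End of a line: store it and start a new one.
    split_lines(Rest, [], [lists:reverse(Current) | Lines]);
split_lines([Char | Rest], Current, Lines) ->
    split_lines(Rest, [Char | Current], Lines).
```

A connection process would keep NewBuffer in its loop state and pass it back to feed/2 with the next block, so a message cut in two by the network is reassembled before being handled.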
Thanks to Mickaël Rémond (a.k.a. Stormy) for proofreading this article :-)