NIMBY Rails devblog 2022-11

Multiplayer: the problem

The challenge of implementing multiplayer in NIMBY Rails is that the game state (the data for the player-built objects and the simulation running on top of them) is very large by video game standards, easily going into hundreds of MB, while at the same time the simulation is expected to run at a perfect 60fps. It is not possible for the server to send the full state on every frame of the simulation. Thus game clients must also run the simulation, and hopefully their local simulation will match that of the server.

This is accomplished first by having a correct and reliable way to relay player input to all clients, and second by having the simulation be fully deterministic. Up until 1.5 NIMBY Rails had partial implementations of both aspects, but neither was in good shape. For 1.6 I decided to tackle both sides of the problem (input relay and deterministic simulation) to try to “fix” multiplayer once and for all. “Fixing” multiplayer for NIMBY Rails means being limited exclusively by the bandwidth consumed by player edits and new player joins, and nothing else. It is reasonable to have an MP game slow down when a player does a very large edit or when it needs to send the current save snapshot to a new player, given that servers are self-hosted by the players themselves, with potentially limited PCs and home internet. But beyond that, in an ideal world, an MP game session would have a performance ceiling (the point where it can no longer sim at a given speed) similar to a SP game session.

Multiplayer in 1.6: the good

As I mentioned, the state of the game is too large to send on every frame, so ideally you send nothing, by making sure the simulation run by every client is identical. But something must be sent, since players aren’t just sitting watching the trains go round. Every edit done by a client needs to be relayed first to the server, which then, if it accepts it, will relay it to all the other clients.

Accepting changes from clients, merging them together into a coherent whole, and relaying them back was the core idea behind the concept of the database, the internal storage of player-owned objects used by the game logic, even in SP. This in-memory database has features like layering, versioning, diffing and commits, and it is also used to power other features, like the track editor undo. Only player objects are stored in the database (think tracks, buildings, the editable aspects of trains and lines, etc.). Simulation state like train positions or pax is never stored in the database.

The flow of this database system for MP starts with clients having a base layer which is meant to be a 1:1 read-only copy of the server database. When a client makes an edit, it does so in a layer on top of that read-only copy. Even a deletion is expressed as an annotation in the layer, never touching the base database. The client then sends the changes to the server, along with a private version number.
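As a rough illustration of the layering idea (the types and names below are hypothetical, not the game's actual code), an edit layer can be modeled as a map from object IDs to either new object data or a deletion tombstone, leaving the base replica untouched:

```cpp
// Hypothetical sketch of the layered database idea; not the game's real code.
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

using ObjectId   = std::uint64_t;
using ObjectData = std::string;  // stand-in for a serialized track, building, etc.

// Read-only 1:1 replica of the server database.
using BaseLayer = std::unordered_map<ObjectId, ObjectData>;

// A local edit layer: a value means an insert/update, nullopt is a deletion
// tombstone. The base replica itself is never modified.
struct EditLayer {
    std::uint64_t private_version = 0;
    std::unordered_map<ObjectId, std::optional<ObjectData>> changes;
};

// What the local client sees: its own layer wins over the base replica.
std::optional<ObjectData> lookup(const BaseLayer& base, const EditLayer& layer,
                                 ObjectId id) {
    if (auto it = layer.changes.find(id); it != layer.changes.end())
        return it->second;  // locally edited or deleted
    if (auto it = base.find(id); it != base.end())
        return it->second;  // untouched server copy
    return std::nullopt;    // object does not exist
}
```

In this picture, the layer together with its private version number is what gets serialized and sent to the server.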

The server constantly receives changes from the clients and tries to merge them into its own database. When the merge succeeds, it produces another set of changes of its own, which is then relayed back to the clients. Clients receive these changes and apply them directly into the base database, rather than their own private layer, along with the corresponding version number, and discard the layers older than that version.
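A minimal sketch of that round trip, again with hypothetical names, conflict handling omitted and the version bookkeeping simplified: the server folds an accepted client change set into its authoritative copy, bumps its version, and relays the result; the client writes relayed changes into its base replica and drops layers the server has already absorbed.

```cpp
// Hypothetical sketch of the merge/relay round trip; conflict checking and
// serialization are omitted, and the version bookkeeping is simplified.
#include <cstdint>
#include <deque>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

using ObjectId   = std::uint64_t;
using ObjectData = std::string;
using ChangeSet  = std::unordered_map<ObjectId, std::optional<ObjectData>>;

struct ServerDb {
    std::unordered_map<ObjectId, ObjectData> objects;
    std::uint64_t version = 0;

    // Merge an accepted client change set and return the change set to relay
    // to every client, stamped with the new server version.
    std::pair<std::uint64_t, ChangeSet> merge(const ChangeSet& client_changes) {
        for (const auto& [id, data] : client_changes) {
            if (data) objects[id] = *data;  // insert or update
            else      objects.erase(id);    // deletion tombstone
        }
        return {++version, client_changes};
    }
};

struct ClientDb {
    std::unordered_map<ObjectId, ObjectData> base;          // read-only server replica
    std::deque<std::pair<std::uint64_t, ChangeSet>> layers; // local edits, by version

    // Apply relayed server changes straight into the base replica, then drop
    // local layers older than the acknowledged version.
    void apply_server_changes(std::uint64_t version, const ChangeSet& changes) {
        for (const auto& [id, data] : changes) {
            if (data) base[id] = *data;
            else      base.erase(id);
        }
        while (!layers.empty() && layers.front().first <= version)
            layers.pop_front();
    }
};
```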

One problem with this system is that, prior to 1.6, clients ran their simulation on top of the layers, rather than on top of the read-only database. This means that in essence every client simulation was liable to diverge from all the others. They didn’t even have the same input! As long as this was the case, any effort to make the simulation deterministic was doomed.

So in 1.6 the client simulation is run on top of the read-only replica of the server database. This stops clients from destroying their local simulation determinism on any and all edits, as has been the case since v1.1. In exchange, the client UI can and will show synchronization glitches when, for example, a track is moved with trains running on top: the track editor and map show the player their local layer data, but the simulation is running on a copy of the server data. These kinds of glitches are preferable to having a multiplayer game collapse due to client desynchronization.

Another problem remained: there was no coordination between the simulation and the server relaying edits to clients. This meant clients merged server changes ASAP, rather than in a coordinated way. Starting in 1.6, all server edits are sent along with both a simulation frame number and an edit sequence number. These numbers match the server frame and database version numbers at the moment it sent the changes. Clients now store server changes and wait to apply them at the exact simulation frame and in the correct edit sequence order.
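A hedged sketch of what the client-side bookkeeping could look like (illustrative only): incoming server change sets are queued under their target frame and applied just before that frame is simulated, sorted by edit sequence number.

```cpp
// Illustrative only: queue server change sets under their target frame and
// apply them in edit sequence order right before that frame is simulated.
#include <algorithm>
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

struct TaggedChanges {
    std::uint64_t frame;          // simulation frame the changes belong to
    std::uint64_t edit_sequence;  // ordering within that frame
    std::function<void()> apply;  // merges the changes into the base database
};

class PendingServerEdits {
public:
    void enqueue(TaggedChanges c) { pending_[c.frame].push_back(std::move(c)); }

    // Called once per simulation frame, before simulating it.
    void apply_for_frame(std::uint64_t frame) {
        auto it = pending_.find(frame);
        if (it == pending_.end()) return;
        auto& batch = it->second;
        // Apply in the server's edit sequence order, not in arrival order.
        std::sort(batch.begin(), batch.end(),
                  [](const TaggedChanges& a, const TaggedChanges& b) {
                      return a.edit_sequence < b.edit_sequence;
                  });
        for (auto& c : batch) c.apply();
        pending_.erase(it);
    }

private:
    std::map<std::uint64_t, std::vector<TaggedChanges>> pending_;
};
```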

With both of these changes, the multiplayer code is now much more correct than before. Clients keep a 1:1 copy of the server data, unmodifiable by any local player edit, and every piece of data sent by the server is annotated with the exact instructions to merge it at the right moment, so both the database and the simulation produce the same results as they did on the server.

Multiplayer in 1.6: the bad

The previous improvements were just the prerequisites for the ultimate goal: a fully deterministic simulation. This is the holy grail of any networked game whose state is larger than a few KB. A fully deterministic simulation is a piece of code which, given a set of inputs, always produces the exact same set of outputs, no matter how many times you run it. For NIMBY Rails the set of inputs is the database, and as explained in the previous section, its networked replication was now working well enough.

But there is a second set of state, also huge on its own, which is not part of the database: the simulation state. This is data like the pax lists inside trains and stations, and all the motion and destination data in a train. Basically all the data which is automatically produced by the game as it runs and which cannot be edited by the player.

This set of data is both part of the input and the output of the simulation. In a fully deterministic simulation this data only needs to be relayed once (on game load), and then it never needs to leave the client, because the generated outputs will always be the same for every client on every frame. But, of course, this wasn’t the case. There are bugs which cause the outputs to differ even when the inputs are identical, and the next task was to find and fix all of them.

Detecting simulation divergences is relatively easy (1.5 introduced a system for this), identifying which exact piece of data diverged is harder, and discovering which line of code is producing the divergence is very hard. I fixed a good dozen instances, and that reduced the amount of divergence quite a bit, with small and medium saves showing near-perfect determinism when not edited or only lightly edited, but in sufficiently large saves issues keep creeping up.
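One generic way to build such a detector (not necessarily what the game does) is to hash each entity's simulation state on both server and client and compare digests: a mismatch narrows the divergence down to an entity and a frame, though not yet to the line of code that caused it. A sketch, with made-up field names:

```cpp
// Generic divergence detection sketch: hash per-entity simulation state on
// both sides and compare digests. The TrainMotion fields here are made up.
#include <cstddef>
#include <cstdint>

struct TrainMotion {
    double position = 0.0;
    double speed    = 0.0;
    std::uint32_t segment = 0;
};

// FNV-1a over raw bytes; any stable hash works as long as both sides hash the
// exact same byte representation.
std::uint64_t fnv1a(const void* data, std::size_t len,
                    std::uint64_t h = 1469598103934665603ull) {
    const auto* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 1099511628211ull;
    }
    return h;
}

// Hash field by field to avoid padding bytes. Comparing such digests only
// makes sense if the simulation produces bit-identical floats on all machines,
// which is exactly what determinism demands.
std::uint64_t hash_motion(const TrainMotion& m) {
    std::uint64_t h = fnv1a(&m.position, sizeof m.position);
    h = fnv1a(&m.speed, sizeof m.speed, h);
    h = fnv1a(&m.segment, sizeof m.segment, h);
    return h;
}
```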

It was going to take me a long time to fix them all, and on top of that, it’s not like the game is finished and out of EA. New game logic will be introduced, and although I am always super careful to keep the logic deterministic, without specialized testing it could go wrong again. I gave up in the end.

Multiplayer in 1.6: the ugly

Giving up on determinism means the simulation state also needs to be relayed in some way, to live patch the client simulations that go wrong. Basically the server streams its simulation state. It is impossible to do this for all the simulation data at the normal simulation rate, so it is done at a slower rate. This is how multiplayer simulation worked between v1.1 and v1.4. v1.5 additionally introduced the concept of divergence detection, to avoid sending data which was already correct, but in practice it did not work very well at reducing the required bandwidth, since very often the data was not correct at all.

For 1.6 I am going back to fixed-rate state streaming. The server will aim to replace the entire client state at intervals which go from 2 seconds for basic train motion data to around 10 seconds for pax data. Basically the client runs a 60 fps simulation which receives a constant but slow (0.5 to 0.1 fps) stream of corrected data from the server.
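As a rough sketch of what fixed-rate streaming implies on the server side (purely illustrative, not the real code), each category of state can be walked with a round-robin cursor sized so that the whole set is re-sent once per refresh period:

```cpp
// Illustrative scheduling sketch: walk each state category with a round-robin
// cursor sized so the whole set is re-sent once per refresh period.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>

struct StreamCursor {
    std::size_t next = 0;

    // Returns the [begin, end) index range of entities to stream this frame.
    std::pair<std::size_t, std::size_t> slice(std::size_t total,
                                              std::uint64_t period_frames) {
        if (total == 0 || period_frames == 0) return {0, 0};
        std::size_t per_frame =
            (total + period_frames - 1) / period_frames;  // ceil division
        std::size_t begin = next;
        std::size_t end = std::min(begin + per_frame, total);
        next = (end >= total) ? 0 : end;  // wrap around for the next period
        return {begin, end};
    }
};

// Example periods at 60 fps: train motion every 2 s, pax data every 10 s.
constexpr std::uint64_t kMotionPeriodFrames = 2 * 60;
constexpr std::uint64_t kPaxPeriodFrames    = 10 * 60;
```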

This streaming puts a hard limit on how big a game can grow in MP, since the player hosting the session can run out of upload bandwidth for streaming changes and simulation state. For this reason I researched ways to compress the streamed data as much as possible.

I was already using the zstd compressor for many things, including compressing everything going down the network. While investigating ways to improve the compression ratio I discovered that zstd dictionary mode, which I knew about but wasn’t using, can be tricked into being a delta encoder. Basically, both the server and the client keep a copy of the last data sent by the server to use as a compression dictionary. When the server wants to update that data, it compresses the new data with zstd, but tells zstd to use the old data as the dictionary. This produces much smaller results, similar to delta compression. Since the client also has a copy of the old data, only the delta needs to be sent, and the client can recreate the new data using the old data and the newly received delta. After this is done, both the server and the client discard the old data, and the new data becomes the dictionary for the next delivery.
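This is roughly what the trick looks like with the plain zstd one-shot dictionary API (a simplified sketch; buffer management and error handling are reduced to the minimum):

```cpp
// Simplified sketch of using zstd dictionary mode as a delta encoder.
#include <zstd.h>

#include <stdexcept>
#include <string>

// Compress new_state using old_state (which the receiver also has) as the
// dictionary; the result is effectively a delta against old_state.
std::string delta_compress(const std::string& old_state,
                           const std::string& new_state) {
    std::string out(ZSTD_compressBound(new_state.size()), '\0');
    ZSTD_CCtx* cctx = ZSTD_createCCtx();
    size_t n = ZSTD_compress_usingDict(cctx, out.data(), out.size(),
                                       new_state.data(), new_state.size(),
                                       old_state.data(), old_state.size(),
                                       /*compressionLevel=*/3);
    ZSTD_freeCCtx(cctx);
    if (ZSTD_isError(n)) throw std::runtime_error(ZSTD_getErrorName(n));
    out.resize(n);
    return out;
}

// The receiver rebuilds the new data from the delta plus its own copy of the
// old data. new_size is assumed to travel with the delta (it could also be
// recovered with ZSTD_getFrameContentSize).
std::string delta_decompress(const std::string& old_state,
                             const std::string& delta, size_t new_size) {
    std::string out(new_size, '\0');
    ZSTD_DCtx* dctx = ZSTD_createDCtx();
    size_t n = ZSTD_decompress_usingDict(dctx, out.data(), out.size(),
                                         delta.data(), delta.size(),
                                         old_state.data(), old_state.size());
    ZSTD_freeDCtx(dctx);
    if (ZSTD_isError(n)) throw std::runtime_error(ZSTD_getErrorName(n));
    out.resize(n);
    return out;
}
```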

This system works very well with the kind of updates the server sends. The simulation state vector of a train's motion is around 1KB uncompressed, but it is often less than 100 bytes when delta encoded. It makes sense: a moving train's state is mostly unmodified from frame to frame, with the exception of its speed and position, for example. This could also be done with custom code, but doing it automatically with zstd saves me a lot of effort. And it also works very well with pax listings, which are quite a bit larger than 1KB and very dynamically structured.

In conclusion, I haven’t been able to reach the goals I set for myself for multiplayer improvements in 1.6. Reaching them would require an investment of time and effort which I believe is wrong at the current state of the game. But this does not mean multiplayer is not improved in 1.6. I think it is better than it has ever been, and it should support larger builds than before, once the expected early 1.6 beta issues are ironed out.

Improving tape tools to support multi track segments

The tape tools presented in October have been iterated on and improved. Back then they only worked within a single track segment. This limitation has been lifted and it is now possible to cover multiple segments, as long as they are connected to the first clicked segment:

They are also capable of handling parallel tracks, automatically editing them. In fact they must edit them without player input: if you split a parent track, the parented tracks must also be split to keep the geometry. For this reason the tape tools which involve splitting, like the track tape tool, will always automatically apply to parented tracks.
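To illustrate why the split has to propagate (with a made-up data model, not the game's): if a child track is defined as a lateral offset of its parent, cutting the parent only keeps the parallel geometry consistent if every child is cut at the matching spot as well.

```cpp
// Made-up data model, only to illustrate the split propagation rule.
#include <vector>

struct Track {
    int id = 0;
    int parent_id = -1;     // -1: top-level (parent) track
    double offset_m = 0.0;  // lateral offset from the parent
};

struct Split {
    int track_id;
    double t;  // split position as a parameter along the parent's curve
};

// Splitting a parent implies splitting every track parented to it at the
// matching position, otherwise the parallel geometry stops lining up.
std::vector<Split> plan_split(const std::vector<Track>& tracks, int parent_id,
                              double t) {
    std::vector<Split> plan{{parent_id, t}};
    for (const auto& track : tracks)
        if (track.parent_id == parent_id)
            plan.push_back({track.id, t});
    return plan;
}
```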

Building tape tool

A new tape tool to create attached buildings has been implemented:

This tool is non-splitting, since it just creates buildings, not touching the tracks.

Parallel track tape tool

Another new tape tool. I think this one will be quite popular:

This tool is also non-splitting. It creates new (single) tracks parallel to existing tracks, automatically setting up the parent-child relationship as extruded controls. It is smart enough to switch sides depending on the mouse position too.

Evolution of the station tape tool

I’ve decided I will try to replace the default station tool with a tape tool instead. To be able to do so it must have full feature parity with the existing station tool, at least for single and double track:

It’s mostly working, but getting the switches and signals right is very hard, since it must be capable of handling parallel tracks laid out in any direction. The video shows the easiest case, with a default double track, and it’s still not fully working.