NIMBY Rails devblog 2021-11
New multiplayer simulation state synchronization idea
NIMBY Rails multiplayer is based on state streaming, with two major data streams: the database and the simulation. The database is everything a player manually creates, edits and removes from the game. The server collects all these changes from the clients, merges them into a consistent state, which becomes the absolute truth, and then retransmits the changes to the clients. This system is working well and it is optimized.
Simulation is everything that happens without player input, mainly the pax simulation and the train simulation. Streaming the simulation state is similar to streaming video. Both the server and the clients run a full simulation of the game, but the server, being the sole source of truth, progressively streams its entire simulation state to all the clients. The clients then merge the simulation stream into their own local simulation, replacing the state objects as they arrive. The stream is throttled to a much lower rate than the full data rate produced by the simulation. For example, train state is streamed at 1/180th of the frame rate. This is because the amount of information to stream at full frame rate is too much even for a simple game. But even with this massive reduction it’s easy to reach the limits of the underlying Steam CDN used for multiplayer communication.
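How the 1/180 rate is scheduled is not described here. One possible scheme, sketched in Python purely as an assumption, is to stagger trains across frames so that each train is sent once every 180 frames and the load per frame stays roughly even. The integer train ids and the `trains_to_stream` helper are hypothetical.

```python
STREAM_INTERVAL = 180  # frames between two updates of the same train (the 1/180 figure)

def trains_to_stream(train_ids, frame):
    """Return the trains whose turn it is on this frame (hypothetical staggering by id)."""
    slot = frame % STREAM_INTERVAL
    return [tid for tid in train_ids if tid % STREAM_INTERVAL == slot]

# Over one full interval, every train is streamed exactly once.
train_ids = list(range(360))
sent_counts = {tid: 0 for tid in train_ids}
for frame in range(STREAM_INTERVAL):
    for tid in trains_to_stream(train_ids, frame):
        sent_counts[tid] += 1
```

With 360 trains this sends two train states per frame instead of 360.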
I’ve started testing a new idea: hash-based retransmission. The gist is that both the server and the clients run the full simulation as before, but the server only streams one hash per state object, rather than the full object. This massively cuts down on the bandwidth. The clients then receive the hashes, run the simulation frame, and calculate their own hashes for the state objects. When a client object hash does not match the server hash, the client asks the server to re-send that object in full in the next simulation stream update.
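A minimal Python sketch of the hash comparison described above, not the game's actual code. It assumes state objects are plain dictionaries hashed through a canonical JSON encoding. The function names and the choice of SHA-256 are illustrative assumptions.

```python
import hashlib
import json

def state_hash(state: dict) -> str:
    """Hash one state object via a canonical encoding (assumed: sorted-key JSON)."""
    encoded = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def server_hashes(server_state: dict) -> dict:
    """What the server streams: one hash per object instead of the full object."""
    return {oid: state_hash(obj) for oid, obj in server_state.items()}

def find_mismatches(client_state: dict, hashes: dict) -> list:
    """Object ids whose local hash differs from the server hash."""
    mismatched = []
    for oid, server_hash in hashes.items():
        local = client_state.get(oid)
        if local is None or state_hash(local) != server_hash:
            mismatched.append(oid)
    return sorted(mismatched)

def apply_retransmission(client_state: dict, server_state: dict, object_ids: list) -> None:
    """Replace desynced objects with the full server copy."""
    for oid in object_ids:
        client_state[oid] = dict(server_state[oid])

server = {"train-1": {"pos": 120, "speed": 30}, "train-2": {"pos": 40, "speed": 0}}
client = {"train-1": {"pos": 120, "speed": 30}, "train-2": {"pos": 41, "speed": 0}}

requests = find_mismatches(client, server_hashes(server))  # only train-2 differs
apply_retransmission(client, server, requests)
```

Only the objects listed in `requests` travel over the network in full. Everything else costs one hash per update.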
For this idea to be successful it requires the simulation to be as deterministic as possible. The basic hash streaming has already been tested (without the retransmission part), with the goal of testing how deterministic the simulation was. Plenty of sources of indeterminism were identified in some state objects and simulation logic, and those have been fixed. More work remains to check all the object classes streamed as part of the simulation, and then to implement the actual retransmission system. An interesting phenomenon of “rippling” has been observed: when one object is desynced, it tends to desync other objects (think a station pax list affecting the pax list of the trains waiting at the station, since the trains board pax from the station). With the current full state streaming this is a non-problem, but with retransmission it could bring down the whole idea. More work also remains to identify these situations and fix them (by making sure related objects are always hashed and streamed together, for example).
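The "hashed and streamed together" mitigation could look roughly like the Python sketch below. It assumes a station and the trains waiting at it form one group with a single hash, so a mismatch triggers retransmission of the whole group. The grouping, the names and the JSON encoding are all hypothetical.

```python
import hashlib
import json

def group_hash(states: dict, member_ids: list) -> str:
    """One hash covering every member of a group, in a stable order."""
    payload = [[oid, states[oid]] for oid in sorted(member_ids)]
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def desynced_groups(client: dict, server_group_hashes: dict, groups: dict) -> list:
    """Names of the groups whose local hash differs from the server hash."""
    return sorted(
        name for name, members in groups.items()
        if group_hash(client, members) != server_group_hashes[name]
    )

# Hypothetical grouping: train-1 is waiting at station-A and boards pax from it.
groups = {"station-A": ["station-A", "train-1"], "station-B": ["station-B"]}
server = {
    "station-A": {"pax": [1, 2, 3]},
    "train-1": {"pax": [7]},
    "station-B": {"pax": []},
}
client = {
    "station-A": {"pax": [1, 2]},  # desynced pax list
    "train-1": {"pax": [7]},       # still matches, but may ripple on the next boarding
    "station-B": {"pax": []},
}

hashes = {name: group_hash(server, members) for name, members in groups.items()}
stale = desynced_groups(client, hashes, groups)
resend = sorted(oid for name in stale for oid in groups[name])
```

Here the train is re-sent along with the station even though its own state still matches, which is the point: the pair is repaired as a unit before the desync can spread.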
Alert system for trains
One of the top requested features is now in the game: train alerts. These alerts trigger for all the existing train error states, and are carefully timed so they don’t trigger redundantly. The alerts and their timing are configurable by the player.
There’s an extra alert condition which is not an error: signal waits. Players have often asked for an alert when a train waits too long at a signal, since that often means some network or design error in the player build. Players can adjust the timing of these alerts on a per-signal basis.
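As an illustration only, a signal wait alert with a per-signal time limit that fires once per wait could be sketched in Python like this. The class names, the default limit and the message format are assumptions, not the game's data model.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_WAIT_LIMIT_S = 300.0  # hypothetical global default

@dataclass
class Signal:
    name: str
    wait_limit_s: Optional[float] = None  # per-signal override, None means use the default

@dataclass
class WaitingTrain:
    train: str
    signal: Signal
    waited_s: float = 0.0
    alerted: bool = False  # prevents the same wait from alerting repeatedly

def update_wait(wait: WaitingTrain, dt_s: float) -> Optional[str]:
    """Advance the wait timer and return an alert message the first time the limit is hit."""
    wait.waited_s += dt_s
    limit = wait.signal.wait_limit_s
    if limit is None:
        limit = DEFAULT_WAIT_LIMIT_S
    if not wait.alerted and wait.waited_s >= limit:
        wait.alerted = True
        return f"{wait.train} has waited {wait.waited_s:.0f}s at {wait.signal.name}"
    return None

wait = WaitingTrain("IC 101", Signal("S1", wait_limit_s=60.0))
alerts = [update_wait(wait, 30.0) for _ in range(4)]
```

In this run only the second update produces an alert. The later updates stay silent because the `alerted` flag is already set.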
Ongoing investigation on vertex buffer leak
Since the first release of the game, but increasingly so in the latest versions, a vertex buffer leak appears randomly for some users. It has been very hard to track down, since there are no fixed steps to reproduce other than “play the game for hours and do something indeterminate other than staring at it”. I first suspected the culprit was my map tile cache system, which keeps around vertex buffers to quickly draw recent tiles when scrolling around the map. But after a very intense review and a partial rewrite of this cache system, the problem persists. So it was time to look beyond my own code.
Part of the vector graphics drawing in the game is handled by a third party library. It’s mostly used for the extra track decorations and icons shown in the track editor mode, like the curve circles, node icons, envelopes, etc. This graphics library allocates vertex buffers in a “high water mark” fashion: when drawing a frame, it will try to use a previously allocated buffer from the previous frame. And if that’s not enough, it will allocate additional buffers. Repeat for the next frame. In this design the buffers are only deallocated when the game terminates. Thus, if the game has at one time required thousands of buffers, just for one frame, these will never be deallocated. It’s some kind of “soft leak”, since the buffers remain usable for the library.
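The “high water mark” behavior can be modeled with a small Python sketch. This is a toy model of the allocation pattern described above, not the library's code. The class name and the buffer size are made up.

```python
class HighWaterMarkPool:
    """Toy model: buffers are reused across frames and never released."""

    def __init__(self):
        self.buffers = []  # grows on demand, never shrinks
        self.used = 0      # buffers handed out in the current frame

    def begin_frame(self):
        # Everything allocated in earlier frames becomes reusable again.
        self.used = 0

    def acquire(self):
        # Allocate only when the current frame needs more than any earlier frame did.
        if self.used == len(self.buffers):
            self.buffers.append(bytearray(64))
        buf = self.buffers[self.used]
        self.used += 1
        return buf

pool = HighWaterMarkPool()
allocated_per_frame = []
for draws in [3, 1000, 2]:  # one unusually heavy frame in the middle
    pool.begin_frame()
    for _ in range(draws):
        pool.acquire()
    allocated_per_frame.append(len(pool.buffers))
```

After the heavy frame the pool stays at 1000 buffers even though the next frame only needs two, which is the “soft leak”.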
This design choice for allocation can be fine if the overall drawing complexity of the application has a clear boundary, but this is not the case for NIMBY Rails. NIMBY Rails sets no limit on how much the user can build, and combined with the way this library handles buffer allocation, this means it’s possible for the library to hit the limit for buffer handles. In a simple test with fewer than 5K tracks I was able to grow buffer handle allocation to 20% of that limit just by selecting multiple tracks in the editor. Most of those handles were allocated by the vector library, while the map tile cache behaved properly. I will try some ideas to improve the memory handling of this library, like switching to transient buffers, since it never persists buffer contents between frames.
No way signals
A new kind of signal has been added to the game: no way signals. This new signal does not allow trains to pass. By itself it’s not super useful, but it also supports tag filtering. This means it’s now possible to designate sections of track which are only usable by trains with a given set of tags (or missing a given set of tags).
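To make the effect on path finding concrete, here is a rough Python sketch under stated assumptions: the track network is a small directed graph, a signal guards one edge, and its filter stops either the trains carrying any of its tags or the trains carrying none of them. The `NoWaySignal` class, its fields and the breadth-first search are illustrative, not the game's implementation.

```python
from collections import deque
from dataclasses import dataclass

@dataclass(frozen=True)
class NoWaySignal:
    tags: frozenset
    applies_if_tagged: bool = True  # True: stops tagged trains, False: stops untagged trains

    def blocks(self, train_tags: frozenset) -> bool:
        has_any = bool(self.tags & train_tags)
        return has_any if self.applies_if_tagged else not has_any

def find_path(edges, signals, start, goal, train_tags):
    """Breadth-first search that skips edges guarded by a signal blocking this train."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in edges.get(node, []):
            signal = signals.get((node, nxt))
            if nxt in seen or (signal is not None and signal.blocks(train_tags)):
                continue
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

# Hypothetical network: A-B-D is the short route, A-C-E-D the detour.
edges = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "E": ["D"]}
# The A-B section is reserved for trains tagged "freight".
signals = {("A", "B"): NoWaySignal(frozenset({"freight"}), applies_if_tagged=False)}

freight_path = find_path(edges, signals, "A", "D", frozenset({"freight"}))
passenger_path = find_path(edges, signals, "A", "D", frozenset({"passenger"}))
```

The freight train takes the reserved short route, while the passenger train is routed around it.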
This is the first filtered signal which is able to modify train path finding. It’s still a fully static signal, so it’s not a huge change, but it also meant a review of how paths are cached and handled. Thanks to v1.3 private train pathfinding this was very easy to implement.
Accounting charts
Management games need charts, so NIMBY Rails now has charts. Well, one chart, but you can put everything in there if you want. When deciding how to implement this feature I looked at how other games usually separate their charts by concept or even by individual game object, but I didn’t like that. I believe charts are more useful when used to compare data rather than just to observe trends, so I wanted an implementation which allows both. It also poses no limit on what items you can compare. If the existing accounting feature is keeping track of it, you can chart it. All of them at the same time.
This posed some UI and UX problems. For the item selection, since it’s essentially the same as the existing accounting data, I made the accounting interface itself the data picker to build up the graph. Since it was going to be very common for the plotted data to share colors, I sneaked in a little homage to a classic solution of the 80s, when the displays in business computers were monochrome, by using pattern fills when a color is duplicated.
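The duplicated-color rule can be sketched in a few lines of Python. The pattern names and the `assign_fills` helper are hypothetical. The only idea taken from the text is that a series whose color is already in use gets a pattern fill instead of a plain one.

```python
PATTERNS = ["solid", "diagonal", "crosshatch", "dots"]  # hypothetical pattern names

def assign_fills(series_colors):
    """Give each (name, color) series a fill pattern, changing it when a color repeats."""
    times_seen = {}
    fills = []
    for name, color in series_colors:
        count = times_seen.get(color, 0)
        fills.append((name, color, PATTERNS[count % len(PATTERNS)]))
        times_seen[color] = count + 1
    return fills

fills = assign_fills([
    ("Line 1 revenue", "red"),
    ("Line 2 revenue", "blue"),
    ("Line 1 costs", "red"),  # same color as the first series
])
```

The first red series stays solid and the second one gets the diagonal pattern, so the two remain distinguishable.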
Accounting data export
Implementing charts made it clear it’s impossible to give support to every player idea about charts and data crunching, so rather than devoting an entire major version to twisty statistics features, I’m providing a full data export of the game accounting in TSV format (in v1.3.12). These files can be directly loaded in Excel or other spreadsheet applications, or into databases or statistical software if you are really serious about your numbers. May your stonks always go up.
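For scripting instead of a spreadsheet, a TSV file can be read with Python's standard `csv` module by setting the delimiter to a tab. The column names and the sample rows below are invented for illustration. The real export layout is not described here and may differ.

```python
import csv
import io

# Made-up sample standing in for an exported file; the real columns may differ.
SAMPLE_TSV = (
    "date\taccount\tamount\n"
    "2021-11-01\ttickets\t1200.50\n"
    "2021-11-01\tmaintenance\t-300.00\n"
    "2021-11-02\ttickets\t900.00\n"
)

def totals_by_account(tsv_file):
    """Sum the 'amount' column per 'account' (both column names are assumptions)."""
    totals = {}
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        account = row["account"]
        totals[account] = totals.get(account, 0.0) + float(row["amount"])
    return totals

totals = totals_by_account(io.StringIO(SAMPLE_TSV))
```

To read an actual export, pass a file opened with `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` sample.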