NIMBY Rails devblog 2022-05

Pax flow rework

As part of the multithreaded station processing introduced at the end of April, I changed the way station pax are processed, and in particular the meaning of “flow”, the system used to fairly divide the frame processing time between stations and trains. It is now based on equating each individual pax cluster moved to or from a train to one flow unit, rather than equating every pax cluster with the same destination to one flow unit. This means flow is now more regular and predictable, and is closer to the original idea (1 flow unit = 1 pax). Unfortunately the first iteration of this change had a major bug (pax processing “waits” until a certain amount of flow has accumulated, and that threshold was set to a huge number), which resulted in long train waits, but it was quickly fixed in the next build.

Controlled flow is important for 1.5, since pax processing time will be exponentially slower, so having finer, well-defined control will help with tuning the sim.

Extra building modding features

Some often-requested building modding features were added in May: default color, default sizes, default attach values and a second painting layer for decal support. None of this required a new schema, and mods can opt into these features just by declaring them in mod.txt.

Discarded development: inline station mode

It’s not often that I disclose a discarded feature, but this one took me some time and was abandoned after most of it was implemented. I implemented an inline mode for station creation, which allowed selecting any contiguous track pair and converting it into a station platform, as long as said track pair was compatible with all the platform rules. That last condition is what made me discard the feature. Since platforms make use of the entire track, the track segments had to be no longer than 800m. In practice this gave very poor UX for this editor mode. The player would need to carefully lay down the tracks to make sure that limit was not exceeded, without any feedback from the track editor, since track creation was still a separate mode from this new inline station mode. It often meant rebuilding some area with shorter tracks, which was just the same amount of work as using the current new station mode.

Clip multicollections and workshop collections

It is now possible to create multiple clipboard blueprint collections, and then share them in the Steam Workshop. Players can then browse the Workshop and subscribe to your collections, and they will appear as read-only collections in their track editor. Shared collections support modded tracks and buildings at the game level, with subscription prompts, and I see modders are also making use of the mod dependency system in the Workshop itself, which is cool.

This feature took a while to bring up, since clip collections are in essence mini-saves and as such they need to participate in the game model version system, so the game can evolve but still be able to load old collections. In the end I also had to introduce a retro-version check so old builds (only from 1.4.22 onward) don’t crash when presented with future collections.
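A minimal sketch of what such a check could look like, assuming each collection stores the model version it was written with. The names CollectionHeader, kBuildModelVersion and check_collection, and the version number itself, are hypothetical and only for illustration; this is not the game's actual code.

```cpp
#include <cstdint>

// Hypothetical: the newest model version this build knows how to load.
constexpr std::uint32_t kBuildModelVersion = 7;

// Hypothetical header stored at the start of a clip collection.
struct CollectionHeader {
    std::uint32_t model_version;
};

enum class LoadDecision { Load, UpgradeThenLoad, RejectTooNew };

// Decide what to do with a collection before touching its contents.
constexpr LoadDecision check_collection(const CollectionHeader& h) {
    if (h.model_version > kBuildModelVersion)
        return LoadDecision::RejectTooNew;    // written by a future build: refuse instead of crashing
    if (h.model_version < kBuildModelVersion)
        return LoadDecision::UpgradeThenLoad; // older data: migrate it forward, then load
    return LoadDecision::Load;
}
```

The point of the sketch is only the ordering: the version comparison happens before any of the payload is parsed, so an old build can decline a future collection cleanly.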

More multithreaded optimizations

Another month, another turbo geek rant on multithreading. Track validation was made multithreaded, to go along with stations, and the speedup was even larger than for stations. The reason is that half of track validation is versus other tracks, so this avoids loading any map data tile, making for a very speedy validation. These validation changes are not just a QoL improvement for game loading time, since they also apply while in game when pasting large clips. And for MP it’s always a very good thing to load as fast as possible so the initial fast forward sync is finished earlier.

Now for a couple of very technical items. These new low level features will be enabled in 1.5, but the code was developed and tested in 1.4, and is disabled in 1.4 builds. Since 1.4 is basically done I don’t want to introduce instability so late in its lifetime.

The basic way to distribute work in the game between threads is to use a task job system. Simplifying, each task represents one game object, and they all want to run at the same time. A thread pool takes the tasks and runs them in some random order. Mapping these tasks to game objects is not exactly easy, since the base game objects with AI (stations, trains and lines) are stored in a versioned database system, and iterating over a “live”, correct snapshot of said database produces a complex recursive chain of calls while the iteration dives down into the earlier, unmodified versions of the objects. So a first step before these AIs could be made into tasks was to walk this iteration, accumulate all the objects (just the ID or pointer, of course) into some flat container, and run the tasks based on that. Once this flat container is ready it is possible to run the station processing in any way desired: sequentially, one task per station, or as it is done now, one task per core, with each task consuming from the flat container using an atomic counter. This allows having just a few tasks with heavy private data, rather than many smaller but dumber tasks.
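The “one task per core, atomic counter” arrangement described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the game's code: Station, process_station and process_all are hypothetical names, and plain std::thread workers stand in for the game's task system.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a game object with AI.
struct Station {
    int processed = 0;
};

// Hypothetical per-object work.
inline void process_station(Station& s) {
    s.processed += 1;
}

// Runs process_station once per entry of the flat container.
// Each worker claims the next index from a shared atomic counter,
// so objects are never statically assigned to threads.
inline void process_all(std::vector<Station*>& flat, unsigned workers = 0) {
    if (workers == 0)
        workers = std::max(1u, std::thread::hardware_concurrency());

    std::atomic<std::size_t> next{0};
    auto worker = [&] {
        // Heavy private scratch data would be set up here, once per worker.
        for (;;) {
            std::size_t i = next.fetch_add(1, std::memory_order_relaxed);
            if (i >= flat.size())
                break;
            assert(flat[i] != nullptr);
            process_station(*flat[i]);
        }
    };

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w)
        pool.emplace_back(worker);
    for (auto& t : pool)
        t.join(); // join also makes the workers' writes visible to the caller
}
```

Because every index is handed out exactly once by fetch_add, a slow station only delays the worker that claimed it, while the other workers keep draining the container.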

Unfortunately building the flat container takes time. So the AI code now has another tool to iterate over the object database: parallel iterators. These iterators first wrap the complexities of iterating the multiversion database into a serial iterator type, then this type is wrapped in a parallel iterator system which uses an atomic counter to skip the iterator position to the one assigned by the counter. It is now possible to directly iterate over the database from more than one thread, with each thread consuming objects at its own pace (without any static assignment of objects to threads!), in a fully atomic and safe way. This has all the core work distribution advantages without doing any setup before the iteration is run.
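One way to picture the parallel iterator idea is the sketch below. It is an assumption-laden illustration rather than the game's implementation: ParallelCursor and parallel_sum are hypothetical names, and a std::list of ints stands in for the versioned database, since it also offers only serial (non-random-access) iteration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <list>
#include <thread>
#include <vector>

// Wraps a private copy of a serial iterator. Every call to next() claims a
// position from the shared counter and skips the private iterator up to it.
template <class SerialIt>
class ParallelCursor {
public:
    ParallelCursor(SerialIt begin, SerialIt end, std::atomic<std::size_t>& counter)
        : it_(begin), end_(end), counter_(counter) {}

    // Returns false once the sequence is exhausted.
    bool next() {
        std::size_t target = counter_.fetch_add(1, std::memory_order_relaxed);
        while (pos_ < target && it_ != end_) {
            ++it_;
            ++pos_;
        }
        return it_ != end_;
    }

    decltype(auto) operator*() const {
        assert(it_ != end_);
        return *it_;
    }

private:
    SerialIt it_, end_;
    std::size_t pos_ = 0;
    std::atomic<std::size_t>& counter_;
};

// Example use: sum all values, with each thread pulling items at its own pace.
inline long parallel_sum(const std::list<int>& db, unsigned threads) {
    std::atomic<std::size_t> counter{0};
    std::atomic<long> total{0};
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            ParallelCursor<std::list<int>::const_iterator> cur(db.begin(), db.end(), counter);
            long local = 0;
            while (cur.next())
                local += *cur;
            total.fetch_add(local, std::memory_order_relaxed);
        });
    }
    for (auto& th : pool)
        th.join();
    return total.load();
}
```

No container is built up front: each thread walks the same sequence with its own iterator and only stops to do work at the positions the counter assigned to it. The trade-off in this sketch is that every thread still pays for stepping over the positions it skips.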

While I enjoyed geeking out this hard on multithreading, I realized that I was missing one of the most basic multithreading optimizations made in games: run the logic asynchronously from the game UI/render. I never implemented this since the “correct” way of doing so involves double buffering the game state. Since the game state in NIMBY Rails can easily run into multiple GBs, I never considered doing this. I have a far future idea of using the versioned database system I mentioned earlier to create immutable snapshots of the database and then draw the UI from them, but right now this is not possible since all the simulation state is kept out of the database for a (big) performance boost.

Instead, I realized I could implement a lite version of asynchronous game logic: make the game logic asynchronous, running alone in its own thread, but when the main thread decides it is time to render the game, stop the game logic thread (at the end of its current simulation frame, with the main thread waiting for it), render the game with the assurance that nothing else is touching the game state, and when the render is done (and with it any changes the player made to the game state), resume the game logic thread. I tested this and it worked just fine, giving some sim speed boost when vsync was enabled, as expected. It will be more important in 1.5, since parts of the logic itself will become asynchronous with respect to the rest of the logic, for example with an evolved system for precalculating pax pathfinding over multiple sim frames.
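The pause-at-frame-end handshake could look something like the following sketch. It assumes a mutex and condition variable for the handshake, which the post does not specify, and LogicRunner, render_once and the frames counter are hypothetical stand-ins for the real game logic and state.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

class LogicRunner {
public:
    LogicRunner() : thread_([this] { run(); }) {}

    ~LogicRunner() {
        { std::lock_guard<std::mutex> lk(m_); quit_ = true; }
        cv_.notify_all();
        thread_.join();
    }

    // Main thread: blocks until the current sim frame has ended and the
    // logic thread is parked.
    void pause() {
        std::unique_lock<std::mutex> lk(m_);
        pause_requested_ = true;
        cv_.wait(lk, [this] { return paused_; });
    }

    // Main thread: lets the logic thread continue after rendering.
    void resume() {
        {
            std::lock_guard<std::mutex> lk(m_);
            assert(pause_requested_); // resume() must pair with a pause()
            pause_requested_ = false;
        }
        cv_.notify_all();
    }

    // Stand-in for the game state: only touch it between pause() and resume().
    long frames = 0;

private:
    void run() {
        std::unique_lock<std::mutex> lk(m_);
        while (!quit_) {
            if (pause_requested_) {
                paused_ = true;
                cv_.notify_all(); // tell the main thread we are parked
                cv_.wait(lk, [this] { return !pause_requested_ || quit_; });
                paused_ = false;
                continue;
            }
            lk.unlock();
            ++frames; // one whole simulation frame, never interrupted midway
            lk.lock();
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    bool pause_requested_ = false;
    bool paused_ = false;
    bool quit_ = false;
    std::thread thread_; // declared last so everything above exists before it starts
};

// Main-thread side of one rendered frame.
inline long render_once(LogicRunner& logic) {
    logic.pause();
    long seen = logic.frames; // "render": nothing else is touching the state
    logic.resume();
    return seen;
}
```

The state is never copied: the only cost is that the main thread waits for the sim frame in progress to finish, which is what makes this a lite alternative to double buffering.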

Fixed timetables and 1.5 expectations

During June I will try a few more multithreading ideas (1.5 needs every possible optimization I can think of), but my intention is to start private development of 1.5 features ASAP, with fixed timetables as the one and only user-facing feature planned for 1.5 at the moment.

I’ve read a lot of player suggestions on fixed timetables for trains during the past year, and most of them miss the fact that the game has a pax simulation too :) The most important feature of fixed timetables is not how much control they give you over train timing. It is how pax are going to be able to plan a trip given all the timetables in the network. And even for the simpler ideas it is an order of magnitude more difficult (both in programming and in processing time) compared to the current line system. Whatever shape the timetable system takes, it will be first and foremost in support of keeping the pax simulation capable of at least running at the current speed, if at all possible. Finding the right point between the current line system and “empty trains but line timing so advanced the player can run custom code as the train AI” will be the goal for June.