Adventures in map processing
As I explained last month, this time I wanted to rebuild the map on modest hardware, just an i3-12100 with 32GB RAM. The strategy was mainly about avoiding large memory allocations so it fits in the limited RAM, with every OSM shape already pre-sorted into buckets corresponding to the game’s highest level of detail tiles. It was looking doable, but in the end the limiting factor was not the RAM, it was the CPU. My OSM processing code scales linearly: it can take any number of cores and saturate them 100% of the time, and the i3 only has 4. Additionally I was starting to remember what a huge PITA processing some of the raster layers was even with an oversized server, so I went shopping again. I still wanted to avoid renting a bare metal server for tasks that take at most a few hours to a few days, so I looked into cloud instances, with little hope of finding something affordable in the size range I wanted. But I was wrong, and Hetzner again delivered, with 32 core / 128GB AMD Epyc instances in the range of around 0.5 euro/hour. On top of that they call them “dedicated” and promise top performance even if you hammer them hard, and I can say it’s true. For one of the builds I completely saturated 32 cores for 20h and it worked perfectly the entire time.
The agony that is reprojecting raster layers
Processing the various map layers went well in general, within the parameters and timings I predicted. The OSM layer in particular was very fast to process, with a combined time (initial index + actual processing) of less than 8h for the whole planet OSM file. But this is because I control 100% of the code involved and spent a lot of time optimizing it for exactly my purposes, so it’s understandable that general purpose tools would be slower.
For processing the raster layers I use a thin layer of my code over GDAL. GDAL is a lifesaver, with an incredibly huge number of image drivers, and it’s capable of reprojecting and rescaling just about anything you can imagine.
The second most important layer, the buildup layer, requires reprojection from Mollweide to the plain WGS84 coordinates used by the game (officially the game uses Web Mercator projection over these WGS84 coordinates but it’s actually more complex than that, since the projection is dynamically adjusted; but that’s not relevant for the dataset). My guess is that the researchers who made the buildup layer use Mollweide because it preserves areas, and one of the most important variables of their data is building density.
Unfortunately gdalwarp is not parallel. It does have some multithreading capability, but the computing path does not appear to use multiple threads. I reviewed my notes from version 1.3 and yep, it took me 40h to reproject after cutting down the resolution by 2. So I had to chunk it down into ~20000 independent files (~2h), and then run gdalwarp over them with GNU parallel (~8h). And I finally had a (huge) raster dataset ready for my own processing, which this time went down all the way to the max detail layer of the game, and it took 12h to process. I wasn’t expecting population processing to be the slowest this time, but it was worth it in the end, as you can see later.
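To make the chunking idea concrete, here is a minimal sketch of how the command lines could be generated. It is not the actual pipeline: the chunk size, the file names and the choice of EPSG:4326 with nearest neighbor resampling are assumptions made for illustration, and nothing is executed. It only builds one gdal_translate cut and one gdalwarp call per pixel window.

```python
from itertools import product

def chunk_windows(width, height, chunk):
    """Yield (xoff, yoff, xsize, ysize) pixel windows covering the raster."""
    for yoff, xoff in product(range(0, height, chunk), range(0, width, chunk)):
        yield xoff, yoff, min(chunk, width - xoff), min(chunk, height - yoff)

def chunk_commands(src, width, height, chunk):
    """Build one (cut, warp) command pair per window; nothing is run here."""
    cmds = []
    for i, (xoff, yoff, xs, ys) in enumerate(chunk_windows(width, height, chunk)):
        cut = f"chunk_{i:05d}.tif"   # hypothetical file names
        out = f"wgs84_{i:05d}.tif"
        cmds.append((
            # Cut a pixel window out of the source raster.
            ["gdal_translate", "-srcwin", str(xoff), str(yoff), str(xs), str(ys), src, cut],
            # Reproject the chunk; the source projection is read from the file.
            ["gdalwarp", "-t_srs", "EPSG:4326", "-r", "near", cut, out],
        ))
    return cmds
```

The warp commands are independent of each other, so they could be written one per line to a text file and handed to GNU parallel. A real run would probably also need some overlap between chunks to avoid gaps at the chunk borders after reprojection.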
More OSM cuts
I started this project wanting to include as much OSM data as possible, with only a couple of specific blacklists. But it turned out there’s just too much of it. The first thing I noticed is that the tag data is just too large to include whole, so I came up with a tag whitelist of just 53 tag names, and then with the top 250 values for each whitelisted tag. Despite this, out of the 395M non-building ways in OSM, the game features 376M, with just 5% not making the cut (never counting buildings here, of course). For nodes the blacklist had to be made much more severe. There are millions of shop= nodes, and I don’t think having the location of every Burger King™ in the world is adding much to the game. Or the location of 30M power utility poles. Still, the game now features 95M nodes, and all of them have a name, so regardless of how useful this data turns out to be for future POI gameplay, at least it makes the map more complete looking.
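As an illustration of the whitelist idea, here is a minimal sketch that counts tag values and keeps only the most frequent ones per whitelisted tag name. The function names are hypothetical, and dropping a tag entirely when its value is not in the top list is an assumption; the real code may handle that case differently.

```python
from collections import Counter, defaultdict

def build_value_whitelist(objects, tag_names, top_n):
    """Return {tag name: set of its top_n most frequent values}."""
    counts = defaultdict(Counter)
    for tags in objects:
        for key, value in tags.items():
            if key in tag_names:
                counts[key][value] += 1
    return {key: {v for v, _ in c.most_common(top_n)} for key, c in counts.items()}

def filter_tags(tags, whitelist):
    """Keep only whitelisted tag names whose value is among the kept values."""
    return {k: v for k, v in tags.items() if v in whitelist.get(k, ())}
```

With the numbers from the text, tag_names would hold the 53 whitelisted names and top_n would be 250.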
Despite this, the game will be slightly larger compared to 1.9, around 3GB more. I think it’s worth it given the extra detail in the OSM and the population layers.
Separate files per layer
1.10 splits the map data into multiple files, each holding one or two layers.
This will make it possible to update some layers independently of the others. Of course the huge one remains the OSM layer, but the smaller ones could be updated more often, or be the subject of experiments, like increasing the resolution of the population layer.
Megatexture for raster layers
Raster layers are now rendered using a “megatexture”-like system, with shader-defined texture filtering. All map textures are now organized in a few large texture arrays, rather than keeping a single (dynamic) texture per tile. The tile cache system directly handles the upload and discard of array layers. This is very similar to what is already done in other dynamic texture systems in the game, like train or track mods. But the new feature is the shader-defined filtering. When a tile draw is issued, the fragment shader has available the layer indices of the neighboring textures, and thus it is capable of filtering beyond the borders of the tile it’s rendering. This makes raster rendering seamless in 1.10:
In theory, if the data is correct, this shader sophistication is not required. But when sampling the source raster layer at close to, but not quite, its native resolution, as is the case for the DEM layer, it’s easy to introduce subpixel errors at the boundaries. Proper sampling would require storing sub-pixel values for the borders and a coverage %. But with this shader-based system I can seamlessly tile anything, no matter how cursed the source data is. My main worry is taxing the GPU, especially since this is not the only new shader-heavy feature in 1.10.
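The cross-tile filtering idea can be sketched on the CPU. The snippet below is not the game’s shader: it is a small Python model, assuming square tiles stored as layers of a texture array and a hypothetical neighbors table that maps each layer to the layers next to it. A texel read that falls outside the current tile is redirected to the neighboring layer, so bilinear filtering gives the same result on both sides of a tile border.

```python
import math

def fetch(layers, neighbors, layer, x, y, size):
    """Read texel (x, y) of a tile, following neighbor links when outside it."""
    dx = -1 if x < 0 else (1 if x >= size else 0)
    dy = -1 if y < 0 else (1 if y >= size else 0)
    if dx or dy:
        other = neighbors[layer].get((dx, dy))
        if other is None:
            # No neighbor available: fall back to clamping at the tile edge.
            x, y = min(max(x, 0), size - 1), min(max(y, 0), size - 1)
        else:
            layer, x, y = other, x - dx * size, y - dy * size
    return layers[layer][y][x]

def sample_bilinear(layers, neighbors, layer, u, v, size):
    """Bilinear filter at texel-space position (u, v); texel centers sit at +0.5."""
    fx, fy = u - 0.5, v - 0.5
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0
    top = (fetch(layers, neighbors, layer, x0, y0, size) * (1 - tx)
           + fetch(layers, neighbors, layer, x0 + 1, y0, size) * tx)
    bottom = (fetch(layers, neighbors, layer, x0, y0 + 1, size) * (1 - tx)
              + fetch(layers, neighbors, layer, x0 + 1, y0 + 1, size) * tx)
    return top * (1 - ty) + bottom * ty
```

Sampling the right edge of one tile and the left edge of its neighbor reads the same four texels, which is what removes the seam.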
Map layer: OpenStreetMap
I’ve already introduced some of the data-related decisions around the OSM layer, so let’s take a look at how it renders. Keep in mind I still haven’t implemented the new map style system or new styles, so colors and textures are just like 1.9.
A sometimes requested feature is now in 1.10:
The 1.10 OSM renderer now can interpret the layer= OSM tag, and properly sort OSM objects based on it, for the correct layered drawing. It also works for tracks:
Ways with a non-zero layer do not collide with ground tracks, and they display correctly under or above them. All non-ground tracks are always under or above all OSM ways, independent of their layer, so no new collision rules are introduced.
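Sorting by layer= can be illustrated with a few lines of Python. This is a minimal sketch with a hypothetical way representation (a dict with a "tags" entry); treating missing or malformed values as layer 0 is an assumption.

```python
def parse_layer(tags):
    """Return the integer value of the OSM layer= tag, defaulting to 0."""
    try:
        return int(tags.get("layer", "0"))
    except ValueError:
        return 0  # malformed values are treated as ground level (assumption)

def draw_order(ways):
    """Sort ways bottom to top; sorted() is stable, so ties keep input order."""
    return sorted(ways, key=lambda way: parse_layer(way["tags"]))
```

Drawing the ways in the returned order puts tunnels (negative layers) first and bridges (positive layers) last.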
Signed distance field line drawing
OSM line features are now rendered using signed distance fields, with rounded corners:
This should cut down on a lot of artifacts related to acute angles in OSM ways, and it just looks much better. It also gives analytical antialiasing for free (in the screenshot the inner part is aliased because it’s still being rendered using alpha textures to simulate borders; this will be fixed).
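For readers unfamiliar with the technique, here is a minimal CPU sketch of signed distance field line drawing. It is not the game’s shader, and the function names and the one pixel antialiasing band are assumptions. The distance to a polyline is the minimum distance to its segments, which naturally rounds corners and caps, and the antialiased coverage comes directly from that distance.

```python
import math

def segment_distance(px, py, ax, ay, bx, by):
    """Distance from point P to segment AB."""
    abx, aby = bx - ax, by - ay
    apx, apy = px - ax, py - ay
    length_sq = abx * abx + aby * aby
    # Project P onto AB and clamp to the segment; clamping is what rounds the ends.
    t = 0.0 if length_sq == 0 else max(0.0, min(1.0, (apx * abx + apy * aby) / length_sq))
    return math.hypot(apx - t * abx, apy - t * aby)

def line_coverage(px, py, points, half_width, aa=1.0):
    """Pixel coverage in [0, 1] for a polyline drawn with the given half width."""
    d = min(segment_distance(px, py, *a, *b) for a, b in zip(points, points[1:]))
    signed = d - half_width  # negative inside the line, positive outside
    # Linear ramp of width `aa` centered on the line edge.
    return max(0.0, min(1.0, 0.5 - signed / aa))
```

A pixel exactly on the edge of the line gets 50% coverage, and the same holds around a corner vertex, where the edge is a circular arc.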
Map layer: land cover
The land cover layer is what gives the map color at low levels of detail. I just updated this layer to a newer version and didn’t do much else. I would like to revisit this layer at some point, but I preferred to focus on other more gameplay relevant layers for now.
Map layer: DEM
The elevation layer is the same one implemented in 1.3, at the same resolution. Like the land cover layer, this is a cosmetic layer, so I preferred to not seek more detail for this layer (and it’s already kind of large for being just cosmetic). That being said, it looks better in 1.10, thanks to the tiling fix, and because the hillshading is now done in the shader:
The hillshading shader uses the same lighting formulas a 3D game would use, deriving normals from the elevation data and lighting them with an imaginary sun.
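As a rough illustration of that lighting model, here is a minimal sketch in Python rather than shader code. It assumes central differences for the slope and plain Lambert (diffuse) shading; the actual shader may use different formulas, and the function name and parameters are hypothetical.

```python
import math

def hillshade(dem, x, y, cell_size, sun_dir):
    """Lambert shading of DEM cell (x, y); sun_dir is a unit vector toward the sun."""
    # Central differences give the terrain slope along x and y.
    dzdx = (dem[y][x + 1] - dem[y][x - 1]) / (2 * cell_size)
    dzdy = (dem[y + 1][x] - dem[y - 1][x]) / (2 * cell_size)
    # The surface z = h(x, y) has the unnormalized normal (-dh/dx, -dh/dy, 1).
    nx, ny, nz = -dzdx, -dzdy, 1.0
    inv = 1.0 / math.sqrt(nx * nx + ny * ny + nz * nz)
    sx, sy, sz = sun_dir
    # Diffuse term: cosine of the angle between normal and sun, clamped at 0.
    return max(0.0, (nx * sx + ny * sy + nz * sz) * inv)
```

Flat terrain lit from straight above gets full brightness, while a slope facing away from the sun goes dark.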
Map layer: population and buildup
And the best for last, the population and buildup layers. These are gameplay relevant, and I wanted to make them more detailed. For the population layer I updated to the newest version, GHS-POP 2023A, and sampled it almost at 1:1 resolution. The combination of the newer dataset and 1:1 sampling means the station coverage values should be more correct. Which means, more pax, not less, fear not. The reason is that 1.3 undersampled this texture by around 50%, and GDAL does not offer a sum sampler, so I instead went with average, resulting in around half the population values compared to the dataset. This is now corrected because 1.10 just doesn’t undersample this layer.
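The effect of averaging instead of summing is easy to see with a tiny example. This is an illustrative one-dimensional sketch with made-up numbers, not the real resampling code: merging every two cells with a sum preserves the total population, while averaging halves it.

```python
def downsample(row, factor, mode):
    """Merge every `factor` cells of a population row with 'sum' or 'average'."""
    blocks = [row[i:i + factor] for i in range(0, len(row), factor)]
    if mode == "sum":
        return [sum(b) for b in blocks]
    return [sum(b) / len(b) for b in blocks]

row = [10, 30, 0, 20]                      # people per source cell, total 60
print(sum(downsample(row, 2, "sum")))      # 60
print(sum(downsample(row, 2, "average")))  # 30.0
```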
The big upgrade is the buildup layer. This is now an extract of GHS-BUILT-C 2023. This is an extraordinary dataset, with a native resolution of 10m, which gives, for every pixel, a categorization of residential vs non-residential usage, plus a building height estimation. It is basically a building layer in raster format. I really like it and I wanted to put as much as possible of it in the game, but figuring out the epic compression required for the full detail would take me too long (see the earlier agony on processing this layer) and 1.10 has been baking for a while, so I decided to cut it a bit shorter than I would have liked. In the end the game samples this layer at around 15m and it stores just one bit (built or not), without any of the finer categories.
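Storing one bit per pixel can be sketched as simple bit packing, 8 pixels per byte. This is only an illustration of the idea: the bit order and function names are assumptions, and the game’s actual file format and any compression applied on top are not described here.

```python
def pack_bits(flags):
    """Pack booleans into bytes, 8 pixels per byte, most significant bit first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, built in enumerate(flags):
        if built:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)

def is_built(packed, i):
    """Read back the built / not built flag of pixel i."""
    return bool(packed[i // 8] & (0x80 >> (i % 8)))
```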
On this medium zoom image you can see a lot of little dots. Each one corresponds to roughly a single building, since it’s using nearest neighbor sampling. 1.3 also offered something similar, but it was much more sparse and low resolution:
When zooming in more and on a more dense city you can make out the streets between the buildings:
Especially in cities with larger blocks:
I would like to experiment more with this layer, using it as some kind of “seed” for the procedural buildings project I started two summers ago, but for now it will just be this funky pink lo-fi building layer in the game.
Map style system
The last remaining piece of 1.10 is the map style system. Only a skeleton of it exists right now, the minimum required to copy the 1.9 map styles with very simple OSM tag matching. The idea of this system is to offer a matching + style block system, similar in concept to CSS. A modder will define style blocks (line width, color, textures) and then add one or more matching rules to each block, like “all highway=primary and highway=primary_link objects”. Multiple blocks can affect a single object, and a single block can affect multiple objects. Blocks can override each other too, if they appear later in the mod.txt file, or if they are loaded after some other block (so for example, modder style blocks can always override the built-in style blocks). Overrides make it possible to develop minimal mods that for example only change the display of one kind of highway, without having to include the whole base styles in the mod. For this reason it will be possible to enable more than one map mod at the same time.
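Since the system is not finished, the following is only a sketch of how the matching and override logic could behave. The rule format (a tag name plus a set of accepted values), the block layout and the function names are all hypothetical, and the real mod.txt syntax is not shown.

```python
def matches(rule, tags):
    """A rule is (key, set of accepted values); hypothetical rule format."""
    key, values = rule
    return tags.get(key) in values

def resolve_style(blocks, tags):
    """Merge every matching block in load order; later blocks override earlier ones."""
    style = {}
    for block in blocks:
        if any(matches(rule, tags) for rule in block["rules"]):
            style.update(block["style"])
    return style

# A built-in block followed by a minimal mod block that only changes the color.
builtin = {"rules": [("highway", {"primary", "primary_link"})],
           "style": {"width": 4, "color": "orange"}}
mod = {"rules": [("highway", {"primary"})], "style": {"color": "red"}}
print(resolve_style([builtin, mod], {"highway": "primary"}))
```

Because the mod block is loaded after the built-in one, it only needs to list the properties it changes; everything else falls through from the base style.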