EA Update #24 - Map Generator 1/3
This update replaces the map generator with a new version.
This update replaces the game's island map generator with a new more robust and expandable version.
I had two primary goals for the new level generator. Firstly, to bring it up to at least the same level as the old generator - but do this with a modular, extendable and configurable approach. And secondly, to fix the biggest problems with the old one - excessive unusable land, poor scaling with world sizes, bad coastal shapes and many minor issues.
There is a lot to talk about designing a complex system like this. I won't go into a huge essay about my software architecture here. There are many reasons why I'm doing the things I'm doing even if they oftentimes feel like over-engineering convoluted solutions to simple problems. But I will go over the most important parts.
Due to Steam post limit, I had to split this up into three parts. This post will focus on the overview and general approach, and the goal of the generator. The other two parts will go through specific generation steps:
Post #2 and
Post #3.
Iteration
The hardest part of making a complex parametrized system is testing and iterating its results. The faster I can preview my changes and the easier I can understand the impact of these changes, the faster I can work on it and by extension produce better results.
This is my best friend:
To accomplish fast iteration, the system has to be designed with this in mind from the start.
A map generator makes hundreds of decisions when building a map from high-level choices (for example, the number of biomes) to specific low-level choices (for example, number of trees per tile). Usually, these are encoded as direct and indirect parameters that I can quickly tweak to achieve the desired results.
The difficulty comes from having lots of parameters and wanting to understand their impact at the same time, which means previewing the difference they make to the map generation. And this is a difficult problem because any changes that happen after the relevant ones can obscure the results. Most parameters have a huge range of possible values and only a narrow band of value produce desirable (or even usable) results.
So I want to preview not just the final result, but every step of the way. To do that, I am splitting the generator's logic into discrete steps or "passes" so that I can look at the independent output(s) of any pass in the sequence.
It is also critically important that the result preview is visual. This may sound like an obvious criterion, but implementing clear and useful previews takes significant additional time and effort. It is thus a longterm strategy, which means it has all the pitfalls of additional code to implement and maintain.
Another important consideration is coding the generator, not just running it. The last thing I want is to break everything by making changes. Implementing separate passes will take more time but not nearly as much time as having to rewrite large parts of the code if I interconnect everything. If I mess up some pass, then I will likely only break its inputs and outputs to the previous and next passes. And realistically I am almost guaranteed to rewrite, remove, add or majorly change many of the passes because I simply don't know how exactly they will work.
And one final and often overlooked benefit is personal motivation. Actually seeing the system work after every change is a huge boost to my motivation. It's the difference between just making something work and investing that extra polishing time. In a large system like this, small details quickly compound.
Passes
I would like to describe every step the generator takes, because I personally find it all very interesting and I could talk about it for hours. Unfortunately, this will end up with a hundred pages long post (and about 20 times over the Steam post limit). So I will try to summarize the most important steps and parts. And even with me being "brief", it is still about 3 times over the post limit. So as I'm splitting the update into three posts - this one (broad overview) and two step-by-step overviews of the generation process:
Post #2 and
Post #3.
Scaling
One thing I barely considered for the old generator is scaling parameters with the generated world size. This is where the huge sand areas of the old generator in the large maps come from. And this is actually a much more difficult problem than it seems, which needs to be "baked into" logic from the start. Every variable that controls something size-dependent for the generator must decide how it behaves with different map sizes and I need to be able to quickly control it. In fact, there must actually be all the variables to control size-dependent features (for example, I cannot scale parts of a noise map without scaling everything). For example, I would want to specify that a beach is 2 tiles wide or 5% of map size, or 3% of map area, etc. Choosing an incorrect value or approach might work fine for the reference size, but may produce terrible result for different sizes.
So when specifying parameters, I make sure to specify (and preview) how they scale:
Generally, the options are "static", "linear" and "exponential" (with inverse variants). Static mean the same value for all sizes - e.g. number of spawn locations. Linear means it scales with the map size (width) - e.g. allowed water edge distance. And exponential means it scales with the map area - e.g. number of stone deposits.
And when testing the generator, I can run any combination to verify that they all work correctly:
Major changes
As mentioned, this update is not meant to majorly change the gameplay beyond (A) fixing and adjusting most problems with the old generator and (B) those changes that are best done now to avoid too much disruption later. So there are still a few bigger changes that do impact gameplay:
Rocks no longer spawn all over the place, but rather in discrete patches focused towards one area of the map (this preview is from the tiny map, because I cannot get a better screenshot from a large map to demonstrate the idea clearly):
Furthermore, rocks now come in 4 variants - Rocks, Coal Vein, Iron Ore Vein and Limestone Vein, with their respective deposits only spawning there:
Consequently, the Mines now need to be placed on the appropriate type of rock:
(That is one chonker of a tooltip... I haven't really come up with a way to shorten it either other than having 8 separate lines.)
Sand and Beach are now different tiles. Minable and usable Sand now spawn in discrete clumpy patches. Beaches spawn along the coast (sometimes in clumps) and they cannot be mined.
They look a bit empty (again, because I did not want to expand the scope of the update), but there is potential to add something here.
Overall, the island shape has somewhat changed. The generator attempts to "fit a hex shape" into the game area, with a roughly-expected distance to edges, but still some larger coastal variation (I explain the various constraints in the other posts). A more "extreme" example of an island that is basically a hex:
The coastline of the island is also much cleaner. There are no extended peninsulas or closed bays - the vast majority of the map should be accessible. But there can still be significant variation of the coastline. Again, a more "extreme" example with lots of "shapeness":
Most islands end up somewhere in between the two, leaning towards more varied coast.
I also added a new grass type for some visual variety and as a proof of concept that I can vary vegetation:
Which kind of lead to discussing the possibilities...
Potential
One of the main goals of the new level generator was to be able to extend it in linear time. In other words, add new stuff quickly. This means there are a lot of things I can now add to the map that I previously couldn't. Some are very easy, some are more difficult, some are impractical. For example, I won't be making multiple islands (impractical), but I can easily add more biomes (easy) or cool features like a river or two (difficult). The great thing is that I am not limited to any particular single algorithm that has to somehow produce all the results in bulk - the generator consists of many individual steps and I have a lot of freedom to add and modify those.
A particular direction I want is to specialize and utilize different parts of the map, so that the terrain shapes the village more than it currently does (i.e. not at all). For example, you might have a swamp in one end that is needed for medicinal plants or flood plains in another part, where crops grow best, and so on. Following this, I might be able to force some production chains to actually need transportation from their production location to their processing location rather than having everything clumped together.
Another big potential feature is allowing player customization for the generated map. For example, forest density, deposit amount, climate bias, etc. I haven't implemented any of this, but I have kept this in the back of my mind when designing systems. I basically comes down to having user values selecting or adjusting one or more generation parameters.
Finally, I could potentially add--if not full replayability with different climate zones and resources--then at least enough differences between maps and their layout to not all look the same.
So this is all something that I'm looking forward to (admittedly, I am also exhausted having spent so much time on the generator.) I won't even speculate on what I want to add next, because there are just so many possibilities. In fact, deciding what to add and--more importantly--what not to add is in itself a big design decision/direction.
Unity engine
I am currently using Unity engine for the game. You my have heard about the recent Unity trust debacle. I won't go into details here, but the main point is that I do not wish to continue using the engine. Unfortunately, MicroTown is many years into development and strongly tied with the engine's features. This makes porting to a different engine very difficult and time consuming. I don't know if I can realistically invest that much time into this. I spent a couple weeks investigating engines and experimenting to see what such a transition would require. For now, I am still undecided. But it's likely I'll have to stay with the current engine.
Future plans
I had to really stop myself from starting to add more features to the level generator. It has already turned out to be the largest "feature" that I have ever worked on. And the final touches and fixes again took way longer than I anticipated.
I wish I could have split this update into multiple parts over past year or so. But unfortunately this is one of those "all or nothing" features. I cannot implement half the generator or even 90% of the generator - it has to be all of it. So updates have been, shall we say, a bit slow (in addition to some other real-life stuff). So I tried to work on it 50/50 mixed with other features/updates, but that hasn't really been a good approach. I will definitely avoid splitting major workloads like that in the future.
So, hopefully, I will be making several "terrain updates" adding something to the generator and either adding or modifying something in related gameplay.
Full changelog
Changes
• Implemented new level generator algorithm
• Overall terrain shape is now properly based on a hexagonal shape, but there is more coastal variation and most landmass is slightly further from the edge of the map, but with more variation
• Terrain is now constructed in a more cell-like pattern, where "cells" correspond to biomes/regions, such as meadows or forests. Trees and other props will have way fewer unnatural "stretches" and "clumps". These "regions" will tend to be more uniform and connect more naturally.
• Islands overall shape will avoid generating with no coastline variation or excessive water or land
• Island now has several larger "climate" or "biome" sections that somewhat alter its layout and features - a part of the island has more forests, a part has more plains and a part has more rocks along with other slight vegetation biases.
• Deep Water to Shallow Water transitions are now smoother
• Water will not have excessive "patches" anymore
• Coastline is less jagged now and fewer extended unusable peninsulas or bays will spawn
• Add Switchgrass, a type of grass that spawns in Grassland
• Rename "Flower" to Marigold
• Add generic Flower concept
• Vegetation (Tree layout, types, tile variation, and Shrubbery/Marigold) layout are all now based on a separate noise patterns to avoid unnatural patterning
• Add new tile Beach that replaces Sand along the coastline, not suitable for quarrying
• Coastal tiles have much less Sand (that is, fewer Beach tiles) now
• Sand, Clay and Rocks now spawn in dedicated patches and cover much less of the terrain
• Patches of Rocks, Clay and Sand won't spawn next to similar patches anymore
• Rocks patches will spawn more in "rocky biome" and will generally include all the ores in that region (although they will still spawn randomly elsewhere)
• Clay patches now only spawn further inland
• Sand patches only spawn closer to coastline
• Several larger Beach locations now spawn randomly along the coast
• Rocks now have resource vein variants - Coal Vein, Iron Ore Vein and Limestone Vein (and their excavated variants - Excavated Coal Vein, Excavated Iron Ore Vein and Excavated Limestone Vein)
• Coal Mine, Iron Ore Mine, Lime Mine can now only be built on their resource tile variant, while Stone Mine can be built on plain Rocks
• Spawn point is now randomly offset from the middle of the map
• Spawn region is now clear of obstructing terrain
• Spawn region will always have some free-standing rocks around it
• Spawn region will only be set in "regular" climate avoiding inconvenient terrain for starting
• Patches with resources will avoid spawning in poorly-accessible unbuildable locations, like surrounded by water or unbuildable terrain
• Animal Habitats now spawn based on distance to coast rather than distance to map edge/center making for more uniform spawning and avoiding unreachable locations
• Animal Habitats now pre-spawn properly instead of appearing one at a time after the world starts
• All the internal properties of level generation now scale with the world size appropriate to their purpose. For example, larger worlds do not get larger beaches or excessive rock patches while smaller worlds still contain sufficient rock patches. Terrain features, like biomes stay proportionally-sized.
• World generation will no longer stall UI (some other operations will also have fewer UI stalling portions)
• Older saves with Mines will have their tiles converted to respective resource tile
• Older saves will have empty stone patches randomly converted to resource patches
• Add worker self-delivery max distance indicator "circle" when building or selecting buildings
• Add the worker self-delivery max distance value to several relevant tooltips
• Item deliveries during carrier shortage will allow self-deliveries, ignoring regular priorities and proportions, but not not stalling/failing to deliver
• HUD delivery info tooltip will also show self-delivery count
Fixes
• Selecting Observer would not show the empty icons above observable building until they had done at least some work
• Fix worker self-deliveries not working above dedicated carrier max delivery distance
• Fix workers delivering 2 or 3 items of their building's output to storage removing/replacing them from the building as worker
• HUD delivery info tooltip showing the wrong number of available (idle) carriers and some other slight inconsistencies due to self-deliveries
• Fix props attempting to transform while a worker is using them, for example a sapling rotting while being harvested by an arborist (and this invalid combo causing games saved at this moment to fail to load)
Balancing
• New terrain generation alters various gameplay elements
Rudy
Snowy Ash Games
EA Update #24 - Map Generator 2/3
This the second of three update posts about the new level generator. The introduction is at
Post #1 and the third/last part is at
Post #3.
I will now go into step-by-step process of the generation passes. For clarity, I will try to explain things in the way that makes most sense rather than sticking to exact technical pass layout. I'll try to group sections by generation "topic", although there is a lot of overlap.
Clusters
Previously, the generated terrain was based purely on a noise map (Perlin noise). This created decent-looking gradual layouts, but this did not let me actually change anything in absolute values. That is, I could adjust things relatively - for example, have sand at 5% threshold, but then it would be random amount of sand up to 5% on any size map, which might be terrible on huge maps. There was no way for me to say - I want
only
3 "patches" of sand, because there was no concept of a "patch" or "scale".
So the first and main terrain step is to produce discrete "clusters" (based on the map size). First, I generate a bunch of random (shapeless) points:
Then I tessellate these (basic Voronoi diagram) into individual (shaped) clusters:
Due to randomness, some of these often end up being too large or too small, so I run an extra smoothing/averaging pass (Lloyd's relaxation):
This produces a nicely distributed and organically-looking pattern (kind of like cells). Jumping way ahead, this is what will give the terrain the "clustered" look I'm going for (warning: now you will never unsee it):
Water
The next big thing that I generate is the separation between water and not-water by heightmap, which is a convoluted way of saying that I'm building a coastline. This is where a noise map like Perlin noise works really well, so that's what I'm using:
Now, I cannot directly apply this to clusters, because the edges need to be water and the middle needs to be the island mass - by design. Previously, I was using a squashed circle on top of the height map to "cut out" the rough shape, but this created a lot of water in some directions and too much land in others. So now I implemented a much more accurate hex shape with proper "distance to edge" calculations:
I can then overlay this shape over the actual heightmap, reducing or increasing the random value to force it below or above certain thresholds:
This makes sure that anything outside the "big" hex becomes water and anything inside the "small" hex becomes land. The in-between area is then where the noise actually affects the shape. With some visual experimentation in later passes, I can tweak the noise parameters to produce satisfactory results.
I can then apply the "clamped" height map values to the clusters themselves and decide which clusters become coastal shallow water and deep water:
Of course, due to the very nature of applying thresholds to randomness over an area, I sometimes end up with little inland pond and offshore islets. For gameplay purposes (and all the issues these cause), I am removing these when I find them:
And this is just the start of the "battle" against randomness "ruining" my islands. I'll talk about this more a little later.
Island shape
There is no perfect way to generate an island shape. Perhaps there are more suitable algorithms just for the island shape generation, but I also have all the other generator's parts to consider and they have to all be compatible. So instead of seeking some perfect algorithm, I'm just brute-forcing the generation a bunch of times until it doesn't get rejected. Some passes are there purely to reject the result and tell the generator to restart.
One such pass is for, broadly, "island shape". I generate a few "control points" in a rough approximate ellipse around the island and check if they are land or water:
If there's too much water, too much land or too much repetition, I reject the shape. Otherwise, it's probably interesting enough to keep. There is no science behind this - I was just looking at hundreds of islands until I came up with heuristics and parameters that struck a good balance between random and still somewhat hex-like shaped.
This is a good time to reflect on just how many maps I generate throughout the process:
I have set up my generator to support output export at every step, so that I can leave it running for many iterations and then check them for any systematic problems. This is not something I could do before, and this is an invaluable methodology for designing and implementing a system like this.
Climate
Among many experiments to make different regions of an island feel more varied, I settled on a "climate" concept, which flags clusters as belonging to a certain climate (to a certain amount) and other passes can use this data to alter their generation. So this doesn't do anything by itself, but it provides additional customization for later passes. It's kind of like biomes, but currently there isn't any biome-specific terrain, so I'm calling it "climate".
For start, I select a few "control points" in a circle around the map (I chose this shape for simplicity and because the maps aren't large enough to come up with complicated layouts). Each climate has a area it can influence:
Then several climates are assigned randomly, specifically 2 forest, 2 plains and 1 rocks (and the rest stay default):
These are just thematic names that describe my intentions. The climate regions then flag clusters within their influence with an appropriate climate value/weight for later use:
Again, jumping way ahead, slight variations is later passes would create noticeable differences:
Spawn
Previously, I placed the starting area or spawn in the middle of the island and manually cleared it from obstructions and non-buildable tiles. And while this works, it's implemented backwards. Instead, what I really want is to decide on the spawn location and then make sure I don't obstruct it with anything and place the relevant things around it.
So now I pick a random location on the island within a spawn selection area torus:
From there, I pick a valid (for example, not too close to water) closest cluster and construct a spawn area (this uses the same logic as regions, which I will describe later):
This creates one "main" point for the spawn and "grabs" nearby clusters to expand the area to a specified desired size. In addition, I place several "meadow rocks" areas, so I can spawn starting stone deposits around but not on top of the spawn area.
Finally, I create a small spawn exclusion zone, where many things can generate normally, but it will mostly remain free of obstruction:
While this may sound simple, it's extremely easy to get wrong and took me enough experimentation to make a working version. There was a good reason why I previously kept the spawn centered and it was to avoid the many issues of random placement of potentially-overlapping terrain features.
Regions
Most of the map is generated in two big parts - discrete features like rocks and clay patches and continuously semi-random ones like forests or flowers. Regions are for preset discrete locations, so they have to be generated first before everything else is "filled in".
It is very difficult to break down region generation into multiple passes, because they are basically all identical except their rulesets. And separating these rulesets is only useful for visualization (which would take huge amounts of work, so I didn't really do it). So what I end up is producing the region map in a single pass:
Generally speaking, regions are groups of neighboring clusters that form an area for some purpose. For example, there could be a 4 cluster region for coal rocks. Each type of region has a long list of hand-crafted rules and requirements, mostly derived from experimenting. Plus, most of these have some sort of rules about distances in relation to each other. So it makes more sense to just see the results for all at once.
For example, here is a ruleset for placing sand regions:
What this says is that it will place 2-3 region with 2 clusters semi-randomly; it won't place two regions close to each other and will place regions close to but not next to the coastline.
There are many more rules and nuances I could add, but it's best to keep it simple - more parameters just means more work adjusting them and debugging issues. And there are already 12 different region groups just to match the basic level generation goals.
There are some notable rules though worth mentioning. For instance, some regions are much more likely to generate in certain climates. Notably, all rocks-based minable regions are more likely to appear in the rocks climate:
Jumping ahead to final generation, this would allow the terrain to shape village direction and create a part of the map that would naturally focus on mining and industry:
This is not something I could have influenced like this before and is only possible because I can "direct" such discretely-selected regions towards certain results I'm looking for.
Another example is a beach region (in addition to regular thin noise-based beaches discussed later), which randomly generates along the coast in a few locations:
This adds variety to the map without covering half the map in sand, like it used to do before.
I should note that while this is how I primarily use the regions for now, they do not specifically just control tiles (grass, sand, rocks, whatever), rather they flag the clusters as belonging to that region. In fact, the number of tiles that later get converted to their "region designation" is more strict than "cover the whole region". So regions in that sense more of an abstract layer informing later generation what to select but not necessarily exactly how. More like "this is a good place for X".
Filler regions
The second type of regions are the "filler" regions that, as the name implies, fill out the rest of the clusters that aren't already designated for something. Because this doesn't require any discrete counts, this can be done with a noise map. I'm using several slightly different noise maps for different purposes, specifically tree cover, tree types, plant type and tile variation. It's also fairly simple to add new variants and then use them in later passes.
For example, here is (part of) the tree cover noise:
This assigns a "tree cover value" to each cluster (point):
Then a bias to increase or decrease the noise value is applied based on the climate it's in:
These are small adjustments that each climate provides, but they can create a significant variation for each "filler":
Now with the noise values decided, empty clusters can decide what sort of content they want. For example, tree cover decides where the forests will be located:
These noise maps will be further used in later passes for relevant selections. The above selection was more or less binary, but there are many more things that can be decided based on the full range of the random value.
And more!
I'll continue describing the remaining steps in the next, but this seems like a good time to reflect upon all the things that I am not doing. As I mentioned, it's important to set myself some limits on what I'm implementing, otherwise I will never finish this. I am certainly thinking about a lot of cool things that I could do.
As an example, I used the cluster layout and the Voronoi diagram edge connectivity to build a cluster "height map", spawn a "lake" region and then build a river running from the lake to the nearest coast:
Of course, this is far from something that I can translate into tiles and gameplay, so it's more of a proof of concept. But I thought to mention this just to give an idea of the possibilities.
Next post:
Post #3
EA Update #24 - Map Generator 3/3
This the third/last of three update posts about the new level generator. The introduction is at
Post #1 and the second partis at
Post #2.
At this point we have something resembling a map:
It's probably fair to say that the generation logic so far is already complex. But this is only perhaps third of the way to the final map implementation-wise. As more passes work with the data, they make smaller and more specialized changes. There aren't too many steps left, but they are generally an order of magnitude more complex in implementation than most preceding steps. This is also why I attempt to get as much top-level decision making done as soon as possible, so that the later passes work with smaller and more specific parts of the map without making any significant changes to the overall direction.
Tiles
The next big transition between the two generation parts is to actually create a tile layout on top of the clusters. This is quite literally how it works:
The main change is that the tiles actually correspond to the in-game tile hex layout (and all the math from here is in hex grid). It's also important to note that they are much denser than the underlying clusters, so all operation are way more fine-grained:
In other words, a single cluster is represented by a collection of tiles. This is how the cell-like structure of the clusters gets preserved to the final version.
The tiles that get selected (at least, the initial version) pretty much just match the cluster they are on:
While it was not really productive to design the main shape of the island using tiles are the basic "unit", it's important to start seeing individual tiles once the main shapes are determined, so going forward all changes will be done to tiles (directly and indirectly):
The downside of using purely random points to determine the clusters is that sometimes they are just too small or too large when they get converted to tiles. And there's no good way to handle this generally. These are basically statistical outliers, so after "placing" regions into tiles, I also either trim or expand them to a maximum or minimum number of allowed tiles:
The perfectionist in me detests this kind of bodging. So this may be a good time to reflect on this.
Randomness
Randomness is always battling with order. As a concrete example, to create interesting rocky patch shapes I have to allow the random values to be random enough that these shapes get generated. I cannot restrict it too much or I will end up generating the same blob over and over. The problem of course is that random generation has no concept of subjective values like "interesting" or "balanced" or "doesn't break pathfinding horribly". To put it another way, the more cool stuff I let in, the more bad stuff sneaks in. Consequently, the more freedom I allow, the more rules I have to write.
There are several ways to ensure that the generated result is compatible with my goals. Barring just hard-coding all the values, I broadly label these "threshold", "fix", "retry" and "accept". This might sound like nitpicking (and I certainly didn't foresee myself writing an essay on randomness philosophy). But given that I have to make dozens if not hundreds of decisions as to how I will handle each such value, there emerge several patterns.
"Threshold" means setting a strict limit to the possible output of random generation. It may be strong bias, it may be a preset range, it may be a hard limit that cannot statistically be outside the designed range. For example, the middle of the island is always solid land - the threshold for land simply exceeds the noise values that are clamped to the hex shape there. The downside is that such thresholds are often fairly restrictive and I may miss out on interesting outputs simply because I deny too many.
"Fix" is the consequence of letting a lot of randomness through. Subpar outputs and awesome outputs are different ends of the same spectrum. If I aim to preserve the good bits, I have to be prepared to handle the bad bits. Besides reading like a fortune cookie, this just means applying certain fixes to expected problems, like the above example of under/over-sized regions. There is a downside though - I have to actually implement it, test it and forever maintain and adjust it with the rest of the generator, which is all the more complex because of it. This is also an excellent reason to keep such fixes self-contained in separate passes and easily reviewable.
"Retry" means literally that - just run it again. There are two general cases when I would do this. Firstly, if something breaks or I cannot generate something. For example, I might want to place a sand patch 5 tiles away from all the clay patches, but I just happened to generate clay so uniformly that there's no place on the map to place sand now. Secondly, when I don't like the result. Like the earlier example with the island shape - I might hard-code some subjective rules, filters or conditions. As easy as this solution is, it's also dangerous to start rejecting too often rather than have better rules. It's also too easy to filter too much and not give randomness a chance to produce neat results. And with a complex generator like this, one must consider that it becomes computationally costly - I just cannot retry things hundreds of times, especially during later passes.
And finally "accept". Sometimes I just have to give up. The result will never be without fault. That's statistically impossible. The good news is that this is actually mitigated by the layered complexity of the generator as well as high error tolerance of the game itself. It's actually surprisingly difficult to generate a map that's unplayable. Even a tiny map that can barely fit anything and fails half the checks is perfectly playable (and I didn't even cherry-pick the example, I just ran the game once):
Shape cleanup
Similarly to clusters, tiles may end up with small ponds and islets because cluster connections do not always translate to tiles perfectly and it's occasionally possible to create "gaps" between clusters when represented as tiles:
These are really rare and they also don't really matter for other non-water clusters. For example, a gap between a forest in the final layout just makes it more varied, but doesn't cause any gameplay issues:
A bunch of logic depends on coast distance, so I calculate each tile's distance to water and land:
Previously, I often encountered islands with annoying unusable coastal extrusions (long peninsulas) that mostly ended up as sand anyway. Similarly, there often appeared long water "tunnels" into land. I still occasionally get these:
However, now I attempt to find "landmass" and "water mass" areas, which are basically tiles that are deep inland or far from coast. This identifies any regions that are only connected by narrow strips of land/water:
As these are undesirable, I reject these results and retry the whole generation. It's fine for there to be extrusions like this, but they have to be "reasonable" and ensuring "wide connectivity" is the approach I found best to preserve enough interesting but not broken shapes:
Another common issue is an undesirable proportion of land-to-water ratio. Some shapes end up being too large (basically a hex), others too small (boring blob in center). So I calculate how much water there is that's not close to the coast and reject shapes with bad proportions:
A similar common issue is the whole shape being "offset" from the center so that some coastlines are really far from the "world edge", making for huge undesirable water stretches. I reject these shapes as well:
Coastal cleanup
The height map random noise does a decent job of creating shallow and deep water around the island, but it is also prone to creating excessive shallow water. Since I cannot "fix" it during the initial noise pass (I would also affect how much land generates), I manually trim away shallow water too far from land (convert to deep water):
This basically makes deep water transitions more "aggressive". This is also mostly a result of visually confirming what sort of shallow water amounts and variations look goods.
Similarly, sometime the noise transitions are so "steep" that deep water ends up touching the coast, so I offset any deep water too close to the coastline (although I fine with there being only a little gap):
Finally, all this fiddling with water level may end up shallow offshore patches. I did consider leaving them in as small cool variations in the ocean, but they almost never actually look good because they are just extreme outliers of the height map noise. So I basically remove them:
Previous height map also controlled the beach elevation. That is, I spawned beaches based on the height map, which was based on the map size. This created really huge sand areas around the coastline, especially where the coastline wasn't regular. So far, I have completely removed any sand at the coastline except the specially designated beach regions. But I do actually want some narrow beach tiles to randomly appear. So I'm using some noise values to randomly change coastal tiles to sand but only with specific thresholds to avoid any excess:
Despeckle
A common issue with the resulting tile layout is that they sometimes have "sharp" edges - tiles that are almost completely surrounded by another tile type. This just doesn't look good in the final version, so I added a pass to trim these to match their surroundings:
With several iterations, this cleans up any unnatural looking extrusions.
A special case of this are coasts that may end up with micro-extrusions of single-tile peninsulas. I really don't like these in the final resulting tile layout, so I trim these:
Technically, this trimming happens before all the water coastline logic because I cannot modify water/land tiles once I have established the distances. (It was a "fun" bug to chase that caused the generator to crash every 1000th run or so.)
Tile variation
At this point I assign tile variation to the map based on a noise map:
Currently, this only affects grassland tiles (and technically forest tiles that may revert back to grassland). Compared to previous generation, these patches and transitions are a less messy shape and don't overlap contrasting variants as much. Similar to other noise maps, climate affects these for some extra variation between "biomes":
Previously, I used the same height noise map for these as all the other features and while it looked fine, it was also very obvious purely random and not following any sort of natural "land" layout.
This might seem like a fairly trivial decorative feature to discuss in such detail, but the map looks really ugly with all the grass tiles being the same variant, especially after seeing the varied background for so long. Seeing such a huge difference made by such a small variation makes me consider all the other places I can vary in minor ways but to great effect.
Fillers
Finally, I get to filling the contents of regions. This is also what I call the process and the objects - fillers. Specifically, I will spawn trees, shrubbery and deposits all with the same general methodology and much shared logic but different details. Similarly to region creation, this will involve fewer passes that instead process numerous entries but in different ways.
First of all, I need to determine the "depth" of the region tiles, i.e. how far from the region's edge they are:
This may sound like a small thing, but it's a very important step that has huge impact on the resulting visual layout. Without this information, I wouldn't be able to make any transitions between region sufficiently gradual (or sharp) to look decent. It's also important for spawning the right amount of fillers, like deposits. This is not something I could not have done previously, because the map had no concept of "region" - all locations where only as gradual as the noise map at that location.
Currently, I do 3 runs for the different groups. Starting with trees, which get spawned in the forests (regions):
The placement rules, which are basically "how many trees per tile in which tiles" comes from the ruleset I specify:
An earlier generated noise map for tree types (which was affected by the climate) is then used to decide what type of tree to place - spruce, maple to birch. This all ends up creating a nice mix of trees in semi-contained regions:
Very similarly to trees, I place minor vegetation - shrubs, grass and flowers. They also have their own ruleset and noise map, but things are just tweaked to accomplish what I want visually (which really is 90% of how I come up with the parameters):
And finally, I place deposits in the earlier established deposit/minable regions:
Most importantly, these are not just randomly spewed across the map with too many or too few tiles, but rather deliberately chosen to occupy certain areas and be of certain size (as per their region). This is one of the main problems I had with the old generator. Now, deposits follow specific rulesets, including actual deposit counts:
It is also important to note that deposits come with the "amount" value. While count means "how many per tiles" - 1 to 3, amount means how many "logical states" they get in gameplay, which translates to how many times they can be mined.
Even a simple algorithm for "filling" the deposits into a region to look nice and natural ends up being a fairly complex process. For example, I prefer to fill in the deeper region tiles before filling the outer tiles. I also want deeper tiles to have more minable amount than outer ones. But all of this is still randomized.
Output
At this point, we reach the end of the generation. The way I have structured the generator, I technically started implementing it with an input and an output pass, so I have always had "the output". This really is an arbitrary point, which is decided by what the game needs and when I stop. As I mentioned at the start, the primary goal was to get it back to working to the same level as the old generator plus fix the primary problem. At this time, that is roughly where it's at. Of course, it's easily extendable in all sort of ways now, so when I'm adding new features, I can now also consider making changes to the map generator.
I skipped describing much of minor logic and a lot of details. I also skipped all the things that didn't work - a lot of the above is the result of experimenting and making multiple versions for passes. I might mention some if and when I add some feature that is tied specifically to some part of the generation.
Combining all the outputs and data from the passes, I get my final "image":
Importantly, this is all not presented in the actual game, but rather only in the separate generator "project" visualized with a quick debug texture creation. And this also contains all the important data from the preceding passes, that does not appear in the final game in any form. For example, none of the cluster and region information, none of the noise maps or stuff like climates is part of the gameplay. Probably 90% of the data gets discarded. In fact, the only new data that I store compared to the previous version is the spawn location. But this data can be fed into the game now, which really "only" has to spawn the things that the output asks for - tiles and props plus whatever hard-coded spawn features. Notably, the game doesn't have to "fix" anything that the generator messed up - it's basically expected to be correct.
Final thoughts
I've been writing the main points to these posts along with implementing the generator. By the time I'm publishing the update, I do a full write-up from these key points. And, oh boy, were there a lot of those! By the end of it, I could hardly remember what I meant by half of the stuff I wrote down at the beginning. These posts also serve not just to discuss my progress publicly, but also to help me think through many design decisions by "talking them out". And there is still a lot to talk about, but I have to keep it relatively short (I say while having to split this up into 3 parts due to length).
The main conclusions and future plans can be found in the first part:
Post #1.