Performance
Now, I shall turn to performance: a few others and I weren’t really happy with the late game performance in Stellaris. So we tried various methods to make it better and, to be honest, the results are quite promising.
In general, there are three ways of making performance better: fix cumbersome logic, use caching, and multithread. It’s not so easy, though, because the latter two solutions have some unpleasant friends: caches like to hang out with Out Of Syncs and miscalculations, while multithreading likes both CTDs and OOSes. But we managed to make headway on various issues.
First and foremost, modifier updates were quite expensive. For instance, if you enabled an edict in your country, it would trigger a recalculation of the modifiers on every planet, fleet, ship and pop owned by that country. This adds up to a lot of recalculations, which could take a while, since updating modifiers is not exactly trivial - basically, the nice tooltips in pop details, which show where all of a pop’s effects come from, take a fair bit of processing power to produce.
There are two improvements on this front in 3.5. Firstly, we multithreaded the modifier updates. Before, it was all done in serial, which could lead to lengthy freezes and some daily ticks taking far longer than others. It was serial for good reason - there were a lot of interdependencies to untangle, and crashes to prevent. Still, threading it had a pretty big effect.
Additionally, I noticed that pops were getting modifiers added to them that were irrelevant to them, like megastructure cost modifiers. So an additional parameter was added to economic categories to specify where the modifiers in them were expected to go, which ended up saving a further 20% of the time spent recalculating modifiers.
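To make the idea concrete, here is a minimal sketch of that kind of filtering. It is not the game's actual code: the `Modifier` and `GameObject` classes, the `targets` field and the modifier names are all made up for illustration, and the real parameter lives on economic categories rather than on each modifier.

```python
from dataclasses import dataclass, field


# All names below are hypothetical; this only illustrates the filtering idea.
@dataclass(frozen=True)
class Modifier:
    key: str
    value: float
    targets: frozenset  # kinds of object this modifier is expected to reach


@dataclass
class GameObject:
    kind: str  # e.g. "pop", "planet", "country"
    modifiers: dict = field(default_factory=dict)


def apply_modifiers(obj, incoming):
    """Store only modifiers aimed at obj.kind; return how many were skipped."""
    skipped = 0
    for mod in incoming:
        if obj.kind not in mod.targets:
            skipped += 1  # irrelevant here, so never stored or recalculated
            continue
        obj.modifiers[mod.key] = obj.modifiers.get(mod.key, 0.0) + mod.value
    return skipped


# An imaginary edict carrying one pop modifier and one country-only modifier.
edict = [
    Modifier("example_pop_happiness", 0.05, frozenset({"pop"})),
    Modifier("example_megastructure_cost", -0.10, frozenset({"country"})),
]
pop = GameObject("pop")
skipped = apply_modifiers(pop, edict)
print(pop.modifiers, "skipped:", skipped)
```

The saving comes from the skipped branch: anything a pop can never use is dropped before it is stored, so later recalculations have less to walk through.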
Aside from this, a fairly big improvement came when we noticed that our multithreading system would occasionally bug out and have all threads waiting for each other for several milliseconds. This meant that multithreading relatively fast operations could actually end up being slower on average, and even where it was still worth it, it’d be slower than it needed to be. Luckily, our Tech Director was able to save the day and stop this madness, improving our player experience immeasurably.
This is just a taste of the things we did. Basically, a lot of time was spent analysing a deliberately absurd save game - it was set to maximum pop growth, tech progress, galaxy size and habitable planets, and then run without crises or fallen empires until 2730. The game chugged a fair bit (by which I mean, it took 11 minutes for a year), but it turned out the main culprit was strike craft movement. I was unable to replicate this on any normal save - it probably only occurs during battle, and there were presumably some extremely large battles going on - but anyway, once strike craft movement was multithreaded, the save actually became somewhat more reasonable to play.
From there, there were a lot of incremental fixes that were possible. For instance, the game spent quite a bit of time working out whether a system had colonies in it, several times a day - something which could be cached quite easily. (Well, so I thought. Two OOS fixes later, I was regretting this.) Another big offender was wars: a country didn’t have a cached set of wars it was fighting in, but would instead recalculate all of them whenever it needed to know anything about any one of them. Needless to say, this was quite inefficient, and fortunately quite cacheable. Then, whenever the game was working out which type of starbase a starbase was (e.g. “Bastion”) - including every few frames on the outliner - it would look through every different type of starbase and decide which one was most appropriate. The answer? Cache, cache, cache.
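As a rough illustration of the wars case, here is a small sketch of a per-country cache that is thrown away whenever a war starts or ends. Everything in it (the `WarRegistry` class, its methods, the country names) is invented for the example and is not how the game stores wars.

```python
class WarRegistry:
    """Hypothetical registry that caches which wars each country is in."""

    def __init__(self):
        self._wars = []        # list of (war name, set of participants)
        self._by_country = {}  # cache: country -> tuple of war names
        self.scans = 0         # counts full scans, just for the demo

    def _invalidate(self, countries):
        # Drop cached answers only for the countries that were affected.
        for country in countries:
            self._by_country.pop(country, None)

    def add_war(self, name, participants):
        members = set(participants)
        self._wars.append((name, members))
        self._invalidate(members)

    def end_war(self, name):
        for index, (war_name, members) in enumerate(self._wars):
            if war_name == name:
                del self._wars[index]
                self._invalidate(members)
                return

    def wars_of(self, country):
        cached = self._by_country.get(country)
        if cached is None:  # an empty tuple is a valid cached answer
            self.scans += 1
            cached = tuple(n for n, m in self._wars if country in m)
            self._by_country[country] = cached
        return cached


registry = WarRegistry()
registry.add_war("War A", ["Empire One", "Empire Two"])
first = registry.wars_of("Empire One")
second = registry.wars_of("Empire One")  # served from the cache
scans_after_repeat = registry.scans
registry.add_war("War B", ["Empire One", "Empire Three"])
after_new_war = registry.wars_of("Empire One")  # recomputed once
```

The risky part is the invalidation. If any code path changes who is in a war without clearing the cache, the cached answer quietly goes stale, which is presumably the sort of mistake that ends in an OOS.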
We also sped up the time it took for the AI to decide where to establish branch offices. Then we noticed that most of the time spent updating countries in serial was spent assessing the validity of their policies - something which we could thread. (Ok, it’s not so easy: we could thread the calculation, cache the result, and then implement the result in the next serial update, which would be basically straight afterwards). Finally, monthly ticks should be a bit quicker, since we applied similar logic to the calculations for which pops should assemble, grow and decline, and also removed about 50% of the job cache recalculations during auto migrations.
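The "thread the calculation, cache the result, apply it in serial" pattern looks roughly like the sketch below. The data layout and the `policy_is_valid` rule are invented stand-ins, not the real policy checks. Note also that in CPython threads will not speed up CPU-bound work like this, so the example shows only the shape of the approach.

```python
from concurrent.futures import ThreadPoolExecutor


def policy_is_valid(country, policy):
    """Hypothetical stand-in for an expensive, read-only validity check."""
    return policy["min_tech"] <= country["tech"]


def find_invalid_policies(country):
    # Read-only: returns a result instead of changing the country.
    return [p["name"] for p in country["policies"]
            if not policy_is_valid(country, p)]


def update_countries(countries, workers=4):
    # Phase 1 (parallel): compute and cache the results; nothing is modified.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cached = list(pool.map(find_invalid_policies, countries))
    # Phase 2 (serial): apply the cached results in a fixed order.
    for country, invalid in zip(countries, cached):
        country["policies"] = [p for p in country["policies"]
                               if p["name"] not in invalid]
    return cached


countries = [
    {"name": "A", "tech": 3, "policies": [
        {"name": "p1", "min_tech": 1},
        {"name": "p2", "min_tech": 5},
    ]},
    {"name": "B", "tech": 0, "policies": [
        {"name": "p1", "min_tech": 1},
    ]},
]
removed = update_countries(countries)
```

Keeping every write in the serial phase means the threads never touch shared state, and the results are applied in the same order every time, whichever thread happens to finish first.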
It wasn’t all plain sailing - aside from at least three OOSes and two crashes introduced and fixed with this work, we noticed after a while that the game wasn’t quite as fast as we were expecting. In fact, it was temporarily freezing every few days. This, it turned out, was because - thanks to some logic fixes with unintended consequences - the AI was thinking extremely hard about where to place its megastructures. So we needed to make some changes to how the more expensive checks are handled there.
How much impact will all this have? Well, I can’t promise any particular number, for various reasons. For one thing, most of these improvements were made in July, and it’s always possible we added new inefficiencies in the meantime (e.g. the megastructures thing). Also, particularly where the AI is concerned, a 3.4 save is not very comparable with a 3.5 save, since the AI will have behaved rather differently in creating that save (but on the other hand, you cannot load a 3.5 save in 3.4, because it will almost certainly crash). Besides that, a particular save may have a particular thing going on causing it to lag (e.g. the strike craft issue mentioned before, which had otherwise seemed a non-issue). And sometimes computers are just temperamental and allocate their resources to doing things other than playing the game. Nevertheless, last time I did tests based on 3.4 saves, it was more than 30% faster. In my latest late game campaign, it’s not precisely fast (the late game will never be as fast as the early game - there’s so little going on in the early game that performance is primarily determined by how long it takes to render frames), but it definitely feels less slow. But I look forward to seeing what people think, and whether people feel that more effort is needed.