After testing a couple of games in desmume for audio issues, there really does seem to be a weird crackling sound bug in most games.
It has nothing to do with your computer specs, though. It comes from the dual SPU sync mode in the audio options, which produces weird static in most games.
When I switched to the synchronous method (any of the three options is fine), the sound was perfect and exactly like the video posted above.
Edit: Tested some ROMs on both no$gba and the Japanese desmume port, and they both have the same issues unless you use the synchronous method for audio.
Yeah, method P gave me the best results, and that is what I used for the recording, which is why it at least sounds normal at points.
With the dual sync method it was completely bad.
You are doing deep, critical, scientific research on an issue that happens because your PC is slow shit and you're getting audio stuttering because it can't keep up with the sync. You would also witness this exact same issue if you were to try running another emulator your rig can't handle which has forced audio sync. Apparently my post was invisible as I showed clear proof that the sound is fine on my end. There's nothing wrong with Desmume's audio but apparently your brain is suffering from buffer underruns.
If you're done being retarded, the word of the day for you to google is Race Condition.
If they upgraded at least to SDL2, they could call SDL_QueueAudio (available since SDL 2.0.4) to write the audio data to the buffer whenever it's ready, rather than the current approach, which makes no effort to keep their internal buffer and the SDL buffer synchronised.
How other projects treat the audio buffer varies, but most of them have this concept of audio latency (typically between 50 and 200 ms): you fill the buffer ahead of time so there's always plenty of data left in it and you don't end up with buffer underruns like this.
And for the most part, projects fill the buffer to capacity whenever they see an opening. It doesn't matter if they just simulated a 5 ms step; if there's a 30 ms gap in the buffer, by jove you better believe it's going to fill all 30 ms with data.
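To make the "fill whatever gap is there" idea concrete, here is a rough sketch in C++. It is not desmume's code or any real project's. OutputQueue, latency_to_samples and fill_to_target are names I made up for illustration, and the queue is a plain std::deque standing in for the device buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for the audio device's buffer (mono, 16-bit).
struct OutputQueue {
    std::deque<std::int16_t> samples;  // samples waiting to be played
    std::size_t underruns = 0;         // samples the device wanted but did not get

    // Device side: consume `count` samples and count any shortfall.
    void consume(std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            if (samples.empty()) {
                ++underruns;
            } else {
                samples.pop_front();
            }
        }
    }
};

// Convert a latency target in milliseconds into a sample count.
std::size_t latency_to_samples(std::size_t sample_rate, std::size_t latency_ms) {
    assert(sample_rate > 0);
    return sample_rate * latency_ms / 1000;
}

// Top the queue back up to the target, no matter how long the last
// emulation step was. `generate` is any callable returning the next sample.
// Returns how many samples were added.
template <typename Generator>
std::size_t fill_to_target(OutputQueue& queue, std::size_t target_samples,
                           Generator generate) {
    if (queue.samples.size() >= target_samples) {
        return 0;
    }
    const std::size_t missing = target_samples - queue.samples.size();
    for (std::size_t i = 0; i < missing; ++i) {
        queue.samples.push_back(generate());
    }
    return missing;
}
```

With a 100 ms target at 44100 Hz the queue holds 4410 samples. If the device drains 30 ms worth (1323 samples) between two calls, the next fill_to_target call generates exactly those 1323 samples again, however short the emulation step in between was.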
This is where the need for mixing comes into play. If you start playing a sound effect, you generally want it to play soon-ish rather than tacking it onto the end of the buffer where it plays much later, so you just mix it into the middle of the buffer.
For these, the reads and writes happen at an offset that's about 50-200 ms into the buffer, which has the obvious advantage that you'll be done mixing the audio by the time it's copied to the output stream.
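Mixing into the middle of a buffer boils down to adding samples at an offset and clamping the result. A minimal sketch, assuming 16-bit mono samples in a flat std::vector; mix_sample and mix_at_offset are hypothetical helpers, not taken from any emulator:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Add two 16-bit samples, clamping instead of wrapping around on overflow.
std::int16_t mix_sample(std::int16_t a, std::int16_t b) {
    const int sum = static_cast<int>(a) + static_cast<int>(b);
    return static_cast<std::int16_t>(std::max(-32768, std::min(32767, sum)));
}

// Mix `effect` into `buffer`, starting `offset` samples ahead of the start.
// Anything that would run past the end of the buffer is left out.
// Returns the number of samples actually mixed.
std::size_t mix_at_offset(std::vector<std::int16_t>& buffer, std::size_t offset,
                          const std::vector<std::int16_t>& effect) {
    assert(offset <= buffer.size());  // the offset must lie inside the buffer
    const std::size_t count = std::min(effect.size(), buffer.size() - offset);
    for (std::size_t i = 0; i < count; ++i) {
        buffer[offset + i] = mix_sample(buffer[offset + i], effect[i]);
    }
    return count;
}
```

At 44100 Hz an offset of 50 ms is 2205 samples, so the effect lands that far ahead of whatever is currently being copied out. A real mixer would also have to carry the cut-off tail over into the next buffer, which this sketch leaves out.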
What desmume does instead is fully generate the audio internally, then pass a fixed amount of data through to the sound module.
With a 44100 Hz sample rate, it does this at a rate of 3 samples per horizontal line drawn.
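The samples-per-line figure can be sanity-checked with some quick arithmetic. The sketch below assumes the commonly cited DS video timing of 263 scanlines per frame at about 59.8261 frames per second. Those two numbers are my assumption and don't come from desmume's source, and the function names are made up.

```cpp
#include <cassert>

// Assumed DS video timing (not taken from the desmume source).
constexpr double kScanlinesPerFrame = 263.0;
constexpr double kFramesPerSecond = 59.8261;

// How many audio samples fit into the time of one scanline.
double samples_per_scanline(double sample_rate) {
    assert(sample_rate > 0.0);
    return sample_rate / (kScanlinesPerFrame * kFramesPerSecond);
}

// Samples produced per second if exactly `per_line` samples
// were emitted on every single scanline.
double samples_per_second(double per_line) {
    return per_line * kScanlinesPerFrame * kFramesPerSecond;
}
```

If those timing numbers are right, samples_per_scanline(44100.0) comes out at roughly 2.8, and a flat 3 samples on every line would add up to about 47200 samples per second. So "3 per line" is best read as a rounded figure. How the code reconciles the difference is something to check in the source rather than take from this sketch.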
This has absolutely nothing to do with my PC's specs, and everything to do with the person who wrote the code not doing a good job.
Go open up sndsdl.cpp and look for yourself at what it does - it's just a bit over 200 lines.