After spending hours trying to figure out why the copy synthesis was so glitchy when there weren't similar problems with the synthetic vowels, I noticed that I had written two options for the routine - one using FFT to perform analysis, and the other using DFT.

I had been using the FFT version of the code, which I had forgotten was less stable than the DFT. So the problem was with the analysis code, not the synthesis.

With that out of the way, I copied the core rendering code into the DFT copy synthesis and was promptly underwhelmed by the result.

Clearly, I'm doing something wrong. The output lacks detail and sounds muffled.

My alternate synthesis method - which directly rendered sine waves at the harmonics - sounded far superior.
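For illustration only, here is a minimal sketch of what "directly rendering sine waves at the harmonics" can look like. It is written in Python rather than Lua, and it is not the actual rendering code: the function name `render_harmonics`, the amplitude list, and the sample rate are all hypothetical. It also restarts every harmonic at phase zero, whereas a real renderer would need to carry phase from frame to frame.

```python
import math

def render_harmonics(f0, amplitudes, sample_rate, num_samples):
    """Additive synthesis: sum one sine wave per harmonic of f0.

    amplitudes[0] is the level of the fundamental, amplitudes[1] the
    level of the second harmonic, and so on (hypothetical layout).
    """
    out = [0.0] * num_samples
    nyquist = sample_rate / 2.0
    for k, amp in enumerate(amplitudes, start=1):
        freq = k * f0
        if freq >= nyquist:
            break  # harmonics at or above Nyquist would alias, so stop here
        step = 2.0 * math.pi * freq / sample_rate  # phase increment per sample
        for n in range(num_samples):
            out[n] += amp * math.sin(step * n)
    return out

# Example values are assumptions chosen only to keep the demo small.
frame = render_harmonics(f0=110.0, amplitudes=[1.0, 0.5, 0.25],
                         sample_rate=8000, num_samples=256)
```

Because each harmonic is placed at exactly `k * f0`, nothing gets rounded to a frequency grid, which may be part of why this approach sounds cleaner.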

So I took that code, and used the IFFT to render it instead.

Unfortunately, the results were still lacking.

That's because small pitch wobbles would sometimes cause a harmonic to shift from one bin to the next, and the jump was audible. Bumping up the size of the FFT array took care of that, but increased the render time. Keep in mind, I'm currently using pure Lua, so I'd expect the render time to be less of an issue in compiled C.
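The bin-hopping effect can be shown with a little arithmetic. The sketch below (Python, with hypothetical helper names, and an assumed 44100 Hz sample rate and FFT sizes that are not taken from the actual code) rounds a harmonic's frequency to the nearest bin and reports the frequency that bin really represents.

```python
def nearest_bin(freq_hz, fft_size, sample_rate):
    """Index of the FFT bin whose center is closest to freq_hz."""
    return int(round(freq_hz * fft_size / sample_rate))

def quantized_freq(freq_hz, fft_size, sample_rate):
    """Frequency actually rendered if freq_hz is snapped to its nearest bin."""
    return nearest_bin(freq_hz, fft_size, sample_rate) * sample_rate / fft_size

SAMPLE_RATE = 44100  # assumed for the example

# Tenth harmonic of a pitch that wobbles from 105 Hz to 106 Hz.
before, after = 10 * 105.0, 10 * 106.0

# With a 1024-point FFT the bins are about 43 Hz apart, so the
# 10 Hz change is enough to push the harmonic across a bin boundary.
small_before = nearest_bin(before, 1024, SAMPLE_RATE)  # 24
small_after = nearest_bin(after, 1024, SAMPLE_RATE)    # 25

# With an 8192-point FFT the bins are about 5.4 Hz apart, so the
# snapped frequency stays within a few Hz of the true one.
large_error = abs(quantized_freq(before, 8192, SAMPLE_RATE) - before)
```

With the smaller array, the rendered harmonic jumps by a full bin width (roughly 43 Hz here) even though the true frequency moved by only 10 Hz. The larger array shrinks that step at the cost of a bigger transform per frame.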

My listening tests haven't been comprehensive, and there are still some audio artifacts that I need to address. But by and large, rendering the sine waves directly seems to give the best results.

I'm not ready to totally abandon the IFFT, though. I still want to experiment with it for generating aspiration, and see if I can get better results than plain filters.

