What is the most computationally efficient way to implement a biquad
cascade? I've written an ALSA application that generates the filter
coefficients and processes some audio. This works great on my Linux
machine, but when I run the same code on a less powerful embedded
system, it sounds horrible and takes almost all the CPU.
The Linux machine can handle around 20-30 biquads with no problem, but the
embedded system falls over with only 2.
Is there something I am doing wrong with the cascade? I basically have a
for loop which processes the audio buffer a certain number of times,
depending on how many biquads are used. Is there some way to process the
audio in parallel to save some processing time?
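
For reference, the structure described above looks roughly like the sketch below. It is a simplified illustration, not the actual application code: the names (biquad_t, cascade_process) are hypothetical, and it assumes float samples, coefficients normalised so that a0 is 1, and a transposed direct form II section.

```c
#include <stddef.h>

/* Hypothetical names; coefficients are assumed normalised so a0 == 1. */
typedef struct {
    float b0, b1, b2;  /* feed-forward coefficients */
    float a1, a2;      /* feedback coefficients */
    float z1, z2;      /* per-stage state, carried across buffers */
} biquad_t;

/* Runs the buffer through each stage in turn, in place:
 * one full pass over the audio per biquad (transposed direct form II). */
static void cascade_process(biquad_t *stages, size_t num_stages,
                            float *buf, size_t num_frames)
{
    for (size_t s = 0; s < num_stages; ++s) {
        biquad_t *q = &stages[s];
        for (size_t n = 0; n < num_frames; ++n) {
            float x = buf[n];
            float y = q->b0 * x + q->z1;
            q->z1 = q->b1 * x - q->a1 * y + q->z2;
            q->z2 = q->b2 * x - q->a2 * y;
            buf[n] = y;
        }
    }
}
```

Each sample costs five multiplies and four adds per stage, all in floating point, so the total work grows linearly with the number of biquads.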
Thanks in advance.