Opus and MP3 for speech recognition
I recently had a discussion about which is worse for telephony audio compression, Opus or MP3. I was under the impression that Opus is unconditionally better than MP3, but that doesn’t actually seem to be the case. At least at 32 kbps, MP3 is even better than Opus in terms of SI-SDR and, I believe, in recognition accuracy too. I believe this is due to the non-streaming nature of MP3.
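For reference, here is a minimal sketch of how SI-SDR (scale-invariant signal-to-distortion ratio) can be computed between the original audio and the decoded audio. The function name `si_sdr` and the `eps` parameter are my own choices for illustration, and the sketch assumes both signals are mono, at the same sample rate, and already time-aligned (codec delay has to be compensated before comparing).

```python
import math

def si_sdr(reference, estimate, eps=1e-12):
    """Scale-invariant SDR in dB between two aligned mono signals."""
    # Compare only the overlapping part; codecs may pad the output.
    n = min(len(reference), len(estimate))
    ref = [float(x) for x in reference[:n]]
    est = [float(x) for x in estimate[:n]]

    # Remove the DC offset from both signals.
    ref_mean = sum(ref) / n
    est_mean = sum(est) / n
    ref = [x - ref_mean for x in ref]
    est = [x - est_mean for x in est]

    # Project the estimate onto the reference to find the optimal gain.
    ref_energy = sum(r * r for r in ref)
    alpha = sum(e * r for e, r in zip(est, ref)) / (ref_energy + eps)

    # Target is the scaled reference; everything else counts as distortion.
    target_energy = alpha * alpha * ref_energy
    noise_energy = sum((e - alpha * r) ** 2 for e, r in zip(est, ref))
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))
```

Higher is better. Because of the projection step, rescaling the decoded signal does not change the score, which matters here since codecs do not always preserve gain exactly.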
The problem with MP3 is spectral masking, but it doesn’t happen at 32 kbps.
At the same time, 24 kbps is harmful.
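A rough sketch of how such a comparison could be set up: encode the same recording with both codecs at 24 and 32 kbps, then decode back to WAV for scoring. This only builds the ffmpeg command lines and does not run them. The helper names, the file naming scheme, and the 8 kHz mono output are my assumptions for a telephony setup, and it presumes an ffmpeg build that includes `libmp3lame` and `libopus`.

```python
# ffmpeg encoder names for the two codecs being compared.
ENCODERS = {"mp3": "libmp3lame", "opus": "libopus"}

def encode_cmd(src_wav, dst, codec, bitrate_kbps):
    """ffmpeg command to encode src_wav with the given codec and bitrate."""
    if codec not in ENCODERS:
        raise ValueError(f"unsupported codec: {codec}")
    return ["ffmpeg", "-y", "-i", src_wav,
            "-c:a", ENCODERS[codec], "-b:a", f"{bitrate_kbps}k", dst]

def decode_cmd(src, dst_wav, sample_rate):
    """ffmpeg command to decode back to mono WAV at the original rate."""
    # Resampling is explicit because the decoded rate may differ
    # from the rate of the source recording.
    return ["ffmpeg", "-y", "-i", src,
            "-ar", str(sample_rate), "-ac", "1", dst_wav]

def round_trip_plan(src_wav, sample_rate=8000, bitrates=(24, 32)):
    """List of (encode, decode) command pairs for every codec and bitrate."""
    stem = src_wav.rsplit(".", 1)[0]
    plan = []
    for codec in ENCODERS:
        for kbps in bitrates:
            coded = f"{stem}_{kbps}k.{codec}"
            decoded = f"{stem}_{codec}_{kbps}k.wav"
            plan.append((encode_cmd(src_wav, coded, codec, kbps),
                         decode_cmd(coded, decoded, sample_rate)))
    return plan
```

Each command can be passed to `subprocess.run(cmd, check=True)`. The decoded files can then be scored against the original with SI-SDR or fed to the recognizer to compare accuracy.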
Publications seem to confirm this. Relevant ones:
It looks like the major causes of speech corruption are not really related to MP3. Most likely they are background noise, frame drops, and bad denoisers that people apply to the audio to “improve” recognition.