Wav2Vec2.0 Test Results
We continue testing the most advanced ASR models; here we try the famous Wav2Vec2.0, an impressive work by Facebook. Here are the previous posts: Nvidia Nemo, Wav2Letter RASR. The ideas behind...
NVIDIA Nemo Test Results
In the Russian community, Nvidia Nemo has gained some popularity recently. Big companies like VK.COM and Yandex have announced that they use Nemo in their production systems. The Quartznet and Jasper architectures are mentioned here and...
Wav2Letter RASR Model Test Results
There are so many toolkits and model releases that some interesting things go unnoticed. Some time ago Facebook published a small paper, Rethinking Evaluation in ASR: Are Our Models Robust...
N-gram language model toolkits in 2020
N-gram language models are well understood and widely used. These days n-grams are not the best models for common machine learning tasks like translation or speech recognition; they have been...
On latency of speech recognition
There are many factors that affect the quality of a speech recognition system. One is word accuracy (word error rate), others are intent accuracy (something that big companies like Amazon...
Interspeech 2020 Thursday
Hooray, this year I made an effort to review all the Interspeech papers, something I had not managed for quite some years. Speaker recognition, emotion recognition, ASR for language learning, transformers...
Interspeech 2020 Wednesday
Wednesday is very promising, with many interesting papers, challenges and enlightenments. Multimodal learning is gaining more and more attention. Semi-supervised learning is everywhere. The important DNS suppression challenge and the wonderful Asteroid...