In the Russian community, Nvidia NeMo has gained some popularity recently. Big companies like VK.COM and Yandex have announced that they use NeMo for their production systems. QuartzNet and Jasper architectures are mentioned here and...
There are so many toolkits and model releases that some interesting things go unnoticed. Some time ago Facebook published a small paper, Rethinking Evaluation in ASR: Are Our Models Robust...
The ESPnet toolkit has seen some great recent developments, as described in Recent Developments on ESPnet Toolkit Boosted by Conformer. Results include great accuracy numbers, so I recently tried ESPnet on...
N-gram language models are well understood and widely used. These days n-grams are not the best models for common machine learning tasks like translation or speech recognition; they have been...
There are many factors that affect the quality of a speech recognition system. One is word accuracy (word error rate); others are intent accuracy (something that big companies like Amazon...
Hooray, this year I made an effort to review all the Interspeech papers, something I had not managed for quite some years. Speaker recognition, emotion recognition, ASR for language learning, transformers...
Wednesday is very promising, with many interesting papers, challenges and enlightenments. Multimodal learning is gaining more and more attention. Semi-supervised learning is everywhere. Important DNS suppression challenge and wonderful Asteroid...
Returning to Monday, I’d like to mention the wonderful keynote by Prof. Janet B. Pierrehumbert, The cognitive status of simple and complex models, which covered some very interesting details of...
Interspeech is overwhelming as usual. Thousands of papers and ideas, lives and thoughts. On one hand, I kind of like the online format, where you can participate in discussions sitting at...
There was some good news in the Kaldi world recently: the LINTO project released their French model 2.0. This model is trained on 7100 hours according to the documentation and looks a bit...