Things about ESPnet

The ESPnet toolkit has seen some great recent developments, as described in "Recent Developments on ESPnet Toolkit Boosted by Conformer". Results include great accuracy numbers, so I tried ESPnet recently on...

N-gram language model toolkits in 2020

N-gram language models are well understood and widely used. These days n-grams are not the best models for common machine learning tasks like translation or speech recognition, they have been...

On latency of speech recognition

There are many factors that affect the quality of a speech recognition system. One is word accuracy (word error rate), others are intent accuracy (something that big companies like Amazon...

Interspeech 2020 Thursday

Hooray, this year I made an effort to review all Interspeech papers, something I had not managed for quite some years. Speaker recognition, emotion recognition, ASR for language learning, transformers...

Interspeech 2020 Wednesday

Wednesday is very promising with many interesting papers, challenges and enlightenments. Multimodal learning is gaining more and more attention. Semi-supervised learning is everywhere. Important DNS suppression challenge and wonderful Asteroid...

Interspeech 2020 Tuesday

Returning to Monday, I’d like to mention a wonderful keynote by Prof. Janet B. Pierrehumbert, "The cognitive status of simple and complex models", which covered some very interesting details of...

Interspeech 2020 Monday

Interspeech is overwhelming as usual. Thousands of papers and ideas, lives and thoughts. On one hand I kind of like the online format when you can participate in discussions sitting at...

Vosk/Kaldi French model

Recently there was some good news in the Kaldi world: the LINTO project released their French model 2.0. According to the documentation, this model is trained on 7100 hours and looks a bit...

Status of Vosk in October 2020

When you work on things day to day you lose the overall picture very quickly. We’ve been actively training models and fixing things here and there and adding new platforms....

ML datasets are not relevant anymore

We started promoting data collection for open source speech recognition at the Voxforge project in 2007. It was a great time before the speech recognition revolution but even then...