Semi-supervised Learning and Frame Rate

April 19, 2021

With the development of neural network toolkits, it seems the technology has reached the point where a huge network can remember and recognize almost everything, as long as it was properly...

Wav2Vec2.0 Test Results

March 02, 2021

We continue testing the most advanced ASR models. Here we try the famous Wav2Vec2.0, an impressive work by Facebook. Here are the previous posts: Nvidia Nemo, Wav2Letter RASR. The ideas behind...

NVIDIA Nemo Test Results

February 13, 2021

In the Russian community, Nvidia NEMO has gained some popularity recently. Big companies like VK.COM and Yandex have announced that they use Nemo for their production systems. Quartznet and Jasper architectures are mentioned here and...

Wav2Letter RASR Model Test Results

January 30, 2021

There are so many toolkits and model releases that some interesting things are left unnoticed. Some time ago Facebook published a small paper, Rethinking Evaluation in ASR: Are Our Models Robust...

Things about Espnet

January 24, 2021

The Espnet toolkit has had some great recent developments, as described in Recent Developments on ESPnet Toolkit Boosted by Conformer. Results include great accuracy numbers, so I tried Espnet recently on...

N-gram language model toolkits in 2020

December 13, 2020

N-gram language models are well understood and widely used. These days n-grams are not the best models for common machine learning tasks like translation or speech recognition; they have been...

On latency of speech recognition

November 27, 2020

There are many factors that affect the quality of a speech recognition system. One is word accuracy (word error rate); others are intent accuracy (something that big companies like Amazon...

Interspeech 2020 Thursday

October 28, 2020

Hooray, this year I made an effort to review all the Interspeech papers, which has not happened for me for quite some years. Speaker recognition, emotion recognition, ASR for language learning, transformers...

Interspeech 2020 Wednesday

October 27, 2020

Wednesday is very promising, with many interesting papers, challenges and enlightenments. Multimodal learning is gaining more and more attention. Semi-supervised learning is everywhere. Important DNS suppression challenge and wonderful Asteroid...

Interspeech 2020 Tuesday

October 26, 2020

Returning to Monday, I’d like to mention a wonderful keynote by Prof. Janet B. Pierrehumbert, The cognitive status of simple and complex models, which covered some very interesting details of...