Updated 10.04.2023: * added 3 datasets: TV broadcasts, medicine (thanks to Alexandra Antonova), Russian LibriSpeech * added 2 models: vosk 0.42, updated bond005, funasr. We tested the available models for...
Whisper is very popular these days, so here are some more observations on it. Whisper has many cool properties, such as very good generic transcription accuracy and accurate punctuation, but Whisper...
_"The wind blows to the south and turns to the north; round and round it goes, ever returning on its course." - Ecclesiastes 1:6_ _"Hundred years day and night spins...
Whisper's popularity wave continues. Many projects are appearing for Whisper-based web services, Whisper on mobile, and so on. Some projects modify Whisper models and algorithms to improve speed, and it raises...
The Kaldi project recently released a pack of models trained on Gigaspeech. You can find them [here](http://kaldi-asr.org/models/m14). The models are good: not significantly better than our previous model, but not significantly worse...
Everyone is crazy about OpenAI Whisper. Trained on 680 thousand hours of multilingual data, it indeed sets a new stage in speech recognition. We tested it and some other recent...
While people have recently been arguing whether Google's model [is sentient](https://news.ycombinator.com/item?id=31721584), we must admit that another important property of living creatures has emerged in recent AI models: they started to have...
We haven't updated the news here for almost a year. Time goes fast and new things keep us busy. There is some news to discuss, but it is mostly worth a...
What I really like about speech recognition, and what keeps me excited about it, is the active ongoing development of speech recognition technology, which boosts both speech recognition results and,...
While dataset sizes grow beyond 10 thousand hours (Gigaspeech), the compute requirements for speech recognition research also grow. Any research, even simple architecture testing, gets harder and harder because...