We recently evaluated Russian open-source and proprietary TTS models. Here are the results: Engine Voice CER xRT GPU xRT CPU UTMOS Similarity Avg/Min Encodec FAD Silero v3_1 Aidar 0.7...
There are two extremes these days: one camp claims that LLMs have magical emergent abilities, the other claims that AI is overhyped and will end soon. The real situation is...
Whisper ASR is a great technology with many innovations. For example, multi-objective transcription/translation training, a huge 600k-hour training dataset, and long-context decoding were really revolutionary at the time...
Updated 01.06.2024: * added GigaAM, Whisper V3, GigaAM RNNT. The previous version is [here](https://alphacephei.com/nsh/2023/01/22/russian-models.html). We tested the available Russian speech recognition models on various datasets. There are quite a few interesting models, each with...
The recently published NaturalSpeech paper attracted some attention. While the ideas discussed there are fairly straightforward, it is nice to see a solid implementation from a reputable institution and great results. It...
There are many TTS engines around; here are some notes about them. ## Speed According to https://arxiv.org/pdf/2210.15975.pdf, the decoder takes most of the time in TTS, so decoder speed optimization is...
Speech technology is continuously disrupted by neural networks and generative AI approaches. A good example is the TTS area. In the last few years a hundred methods and models have...
The longer we study reality, the more unusual it seems to us. For example, according to current understanding, the brain has the following properties: 1. the brain is a highly parallel system, 1. information in the brain is transmitted by means of...
It is interesting that the longer we study reality, the more unusual it appears to us. For example, if we think about the brain, there are two important ideas we...
Following our [test of open Russian models](https://alphacephei.com/nsh/2023/01/22/russian-models.html), we tested popular speech recognition services on telephony recordings. Results as of September 2023: {:class="table table-bordered"} | Dataset | Vosk 0.52 | Yandex...