When you work on things day to day, you lose the overall picture very quickly. We’ve been actively training models, fixing things here and there, and adding new platforms....
We started promoting data collection for open source speech recognition at the Voxforge project in 2007. It was a great time, before the speech recognition revolution, but even then...
There are many open source German models already around; unfortunately, most of them are not perfectly trained. Here is a review of the current state and some information about new German...
Reading the recent Facebook paper on audio embeddings, "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations" by Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, and Michael Auli, I wonder how accurate...
I recently got into a discussion about which is worse for telephony audio compression: opus or mp3. I was under the impression that opus is unconditionally better than mp3, but it doesn’t seem the...
Many models and datasets have become available recently, so testing models against datasets is becoming more complicated and at the same time more fun. Recently the Kaldi Active Grammar project released some new models...
In 2019 AlphaCephei made quite good progress. We introduced a project called Vosk, which is meant to be a portable API for speech recognition for a variety of...
I noticed a big slowdown in RELU layer performance recently: the RELU operation can now take up to 10% of the total CPU time. This is with kernel 4.15....
A Highly Efficient Distributed Deep Learning System for Automatic Speech Recognition. Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David Kung, Michael Picheny. https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2700.pdf...
Spatial and Spectral Fingerprint in The Brain: Speaker Identification from Single Trial MEG Signals Oral; 1000–1020 Debadatta Dash (The University of Texas at Dallas), Paul Ferrari (University of Texas at...