Vosk/Kaldi French model

There is some good news in the Kaldi world: the LINTO project has released version 2.0 of its French model. According to the documentation, the model is trained on 7100 hours of speech, and it looks a bit better than the previously widely used model from Paul Guyot.

Neither model is perfect. The LM needs an update, possibly to RNNLM. The LINTO model also relies on text postprocessing to split off articles, which is not very nice for interoperability with commonly used LMs, and the graph LINTO creates is too large (6 Gb). Still, in general these models are pretty useful.

We fixed the language model issues by building a better graph and adding proper CARPA rescoring, and made the resulting model available for download for Vosk. Get it here:


You can also use this model with the Vosk server via Docker:

docker run -p 2700:2700 alphacep/kaldi-fr
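Once the container is running, a client can stream audio to it over a WebSocket. The following is a minimal sketch rather than an official client. It assumes the server listens on ws://localhost:2700 (the port published above), that it accepts the usual Vosk server messages (an optional config message with the sample rate, raw PCM chunks, then an eof message), that the input is a mono 16-bit PCM WAV file, and that the third-party websockets package is installed. The names read_chunks and transcribe are made up for this example.

```python
import asyncio
import json
import wave

# Assumption: the port published by the docker command above.
SERVER_URI = "ws://localhost:2700"


def read_chunks(wav_path, seconds=0.2):
    """Return (sample_rate, list of raw PCM chunks of about `seconds` each)."""
    with wave.open(wav_path, "rb") as wf:
        rate = wf.getframerate()
        frames_per_chunk = int(rate * seconds)
        chunks = []
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:
                break
            chunks.append(data)
    return rate, chunks


async def transcribe(wav_path, uri=SERVER_URI):
    """Stream a WAV file to the server and return the recognized text."""
    import websockets  # third-party: pip install websockets

    rate, chunks = read_chunks(wav_path)
    texts = []
    async with websockets.connect(uri) as ws:
        # Tell the server the sample rate of the audio (assumed to be supported).
        await ws.send(json.dumps({"config": {"sample_rate": rate}}))
        for chunk in chunks:
            await ws.send(chunk)
            message = json.loads(await ws.recv())
            # Finished utterances carry "text"; partial results are skipped.
            if message.get("text"):
                texts.append(message["text"])
        # Signal end of stream and collect the last utterance.
        await ws.send(json.dumps({"eof": 1}))
        message = json.loads(await ws.recv())
        if message.get("text"):
            texts.append(message["text"])
    return " ".join(texts)

# Usage, with the server running and a hypothetical file test.wav:
#   print(asyncio.run(transcribe("test.wav")))
```

Sending short chunks (0.2 seconds here) keeps latency low for live audio. For offline files a larger chunk size should work as well.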

Here are some results for the models. Please try them and comment on any issues you encounter.

Model                   CV Test WER (%)  Podcast WER (%)  Speed    Memory
Pguyot                  27.98            30.25            0.30xRT  500 Mb
Linto original          14.24            27.04            0.22xRT  7 Gb
Linto VOSK LM           16.25            24.36            0.23xRT  1.7 Gb
Deepspeech FR Polyglot  24.74            43.75            1.0xRT   700 Mb
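For readers who want to reproduce this kind of number on their own data, here is a small sketch of the two metrics in the table. It uses the standard definitions: WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and the speed in xRT is processing time divided by audio duration. The post does not describe the scoring setup or any text normalization, so treat this as an illustration, not as the exact procedure behind the table. The function names are made up for this example.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, 1):
        cur = [i]
        for j, h in enumerate(hyp_words, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def word_error_rate(pairs):
    """Corpus-level WER in percent over (reference, hypothesis) string pairs."""
    errors = 0
    ref_len = 0
    for reference, hypothesis in pairs:
        ref_words = reference.split()
        errors += edit_distance(ref_words, hypothesis.split())
        ref_len += len(ref_words)
    if ref_len == 0:
        raise ValueError("reference text is empty")
    return 100.0 * errors / ref_len


def real_time_factor(processing_seconds, audio_seconds):
    """Speed in xRT: values below 1.0 mean faster than real time."""
    return processing_seconds / audio_seconds


# One substitution (est -> et) and one deletion (le) over 6 reference words.
pairs = [("le chat est sur le tapis", "le chat et sur tapis")]
print(round(word_error_rate(pairs), 2))  # 33.33
print(real_time_factor(3.0, 10.0))       # 0.3
```

Under this definition, 0.30xRT means that 10 seconds of audio take about 3 seconds to decode.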