Vosk is a speech recognition toolkit. Its main strengths are:
- Supports 20+ languages and dialects - English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish, Uzbek, Korean. More to come.
- Works offline, even on lightweight devices - Raspberry Pi, Android, iOS
- Installs with a single command: pip3 install vosk
- Portable per-language models are only about 50 MB each, and much bigger server models are also available.
- Provides a streaming API for the best user experience (unlike popular speech-recognition Python packages)
- Allows quick reconfiguration of vocabulary for best accuracy.
- Supports speaker identification in addition to plain speech recognition.
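To illustrate the streaming API and the vocabulary reconfiguration mentioned above, here is a minimal sketch in Python. It assumes a Vosk model has already been downloaded and unpacked at a local path, and that the input is a mono 16-bit PCM WAV file. The helper names grammar_json and transcribe_streaming, the model_path argument, and the chunk size are illustrative choices, not part of Vosk itself.

```python
import json
import wave


def grammar_json(phrases):
    """Serialize phrases into the JSON list passed to the recognizer as a grammar."""
    # "[unk]" lets the recognizer flag speech outside the given phrases.
    return json.dumps(list(phrases) + ["[unk]"])


def transcribe_streaming(wav_path, model_path, phrases=None, chunk_frames=4000):
    """Yield ("partial" | "final", text) pairs while the file is read in chunks."""
    # Imported lazily so grammar_json can be used without vosk installed.
    from vosk import Model, KaldiRecognizer

    # Assumption: model_path points to an unpacked Vosk model directory.
    model = Model(model_path)
    with wave.open(wav_path, "rb") as wf:
        if phrases:
            # Restrict recognition to a small vocabulary for better accuracy.
            rec = KaldiRecognizer(model, wf.getframerate(), grammar_json(phrases))
        else:
            rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(chunk_frames)
            if not data:
                break
            if rec.AcceptWaveform(data):
                # The recognizer has finalized an utterance.
                yield "final", json.loads(rec.Result())["text"]
            else:
                # Intermediate hypothesis, available before the utterance ends.
                yield "partial", json.loads(rec.PartialResult())["partial"]
        # Flush whatever audio is still buffered.
        yield "final", json.loads(rec.FinalResult())["text"]
```

A caller could loop over transcribe_streaming("test.wav", "model", phrases=["yes", "no"]) and update the display on every partial result, which is what makes the streaming interface responsive. The file and directory names here are placeholders.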
See the following sections for more information:
If you have any questions, feel free to:
- Post an issue on GitHub
- Send us an e-mail at firstname.lastname@example.org
- Join our group dedicated to speech recognition on Telegram @speech_recognition
- Join our WeChat group. It is fairly large, so it is invitation-only: e-mail us with some information about yourself to request an invitation.