Written by Nickolay Shmyrev
Release of msu_ru_nsh_clunits
I don't want to be boring and turn this feed into a clone of SourceForge, but I've recently released a new Russian voice for Festival.
https://developer.berlios.de/projects/festlang/
The new release contains a lot of earlier updates that had only been distributed unofficially. The most important change is that the labels were regenerated automatically with SphinxTrain / MLLT models, which improved the performance a bit. I haven't debugged the current problems yet, so it's not clear what the next steps to improve the quality are, though it's fairly clear that better join algorithms and HMM-based cost functions will improve accuracy. I have also been meaning to look at the transcription algorithm for Russian that is used in academic Russian synthesizers.
I finally followed the industry mainstream and abandoned hand-made labels. It's certainly much easier to keep them automatic, because that brings more flexibility. I don't put much faith in manual work either, since it turned out to be error-prone as well. I hope that better algorithms such as Minimum Segmentation Error training will do the job of creating a perfect segmentation for the TTS database. I also want to think about processing algorithms that are robust to segmentation errors. Those make more sense in a situation where errors are present by design.
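To make "segmentation errors" a bit more concrete, here is a minimal sketch of one common way to compare two labelings of the same utterance: count how many boundaries in the automatic labels fall within a tolerance of the reference ones. The function name `boundary_agreement`, the 20 ms tolerance, and the sample boundary times are all made up for illustration; this is not the procedure used for this release and it has nothing to do with the SphinxTrain tooling itself.

```python
def boundary_agreement(reference_ms, automatic_ms, tolerance_ms=20):
    """Fraction of boundaries where the two labelings agree within tolerance_ms.

    Both arguments are lists of boundary times in milliseconds for the same
    phone sequence, so they must have the same length.
    """
    if len(reference_ms) != len(automatic_ms):
        raise ValueError("both labelings must contain the same number of boundaries")
    if not reference_ms:
        return 0.0
    hits = sum(
        1
        for ref, auto in zip(reference_ms, automatic_ms)
        if abs(ref - auto) <= tolerance_ms
    )
    return hits / len(reference_ms)


# Hypothetical boundary times for one short utterance.
hand_labels = [100, 250, 400, 620]
auto_labels = [110, 290, 410, 610]

# Only the second boundary differs by more than 20 ms.
print(boundary_agreement(hand_labels, auto_labels))  # 0.75
```

A number like this only says how often two labelings disagree, not which one is right, which is part of why hand labels are not an obvious gold standard.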
I shifted my daily schedule to US time again, which is not ideal. It went back to normal last week when I had to stay awake for a whole night and day. That turned out to be productive, but now it has shifted again. I hope I'll be able to get it back to normal soon.