Written by Nickolay Shmyrev
on September 16, 2009

Speech Recognition As Experimental Science

It's well known that there are two types of physics: theoretical and experimental. In school I always liked the latter: measuring the speed of a ball or a voltage, plotting graphs and so on. Unfortunately, in later days I was mostly doing math or programming. Only recently, when I started to spend a lot of time on speech recognition, did I find out why I like it so much: it's also an experimental science.

When you build a speech recognition system, your time is mostly spent on all these beautiful things: setting up training on the database, running the learning process, tracking the results. You are trying to understand nature and find its laws; you want to find the best feature set, the best phoneset, the right beams, and more and more. You have experimental material, and sometimes it turns out there are things you forgot to take into account. It is a really encouraging activity.

Of course, there are important drawbacks: issues like the proper design of experiments arise. Unfortunately, this is not widely described in the literature, but speech recognition experiments are just one example of experiments, so all the general issues apply to them. To list a few:

Reproducibility
Connection between theory and practice
Estimation of the results and their validity

For example, the last point is very important. Currently, when we run a test on a database, we just get a number. We rely on it without even estimating the deviation or the other important attributes of any scientific measurement. As a result we make unreliable decisions, as I did with the MLLT transform. I now think that we should be more careful about that.
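To make "estimating the deviation" concrete, here is a minimal sketch of one common way to put an interval around a word error rate: a percentile bootstrap over utterances. It is only an illustration, not the procedure used in any particular toolkit. The function names and the input format (per-utterance error counts and reference word counts, each utterance having at least one reference word) are assumptions made for this example.

```python
import random


def wer(errors, ref_words):
    """Word error rate: total errors divided by total reference words."""
    return sum(errors) / sum(ref_words)


def bootstrap_wer_interval(errors, ref_words, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for WER (hypothetical helper).

    errors[i]    -- substitutions + deletions + insertions in utterance i
    ref_words[i] -- number of reference words in utterance i (assumed > 0)
    Returns (point_estimate, lower_bound, upper_bound).
    """
    if not errors or len(errors) != len(ref_words):
        raise ValueError("need two non-empty lists of equal length")

    rng = random.Random(seed)  # fixed seed keeps the experiment reproducible
    n = len(errors)
    samples = []
    for _ in range(n_resamples):
        # Resample whole utterances with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        total_errors = sum(errors[i] for i in idx)
        total_words = sum(ref_words[i] for i in idx)
        samples.append(total_errors / total_words)

    samples.sort()
    low = samples[int((alpha / 2) * n_resamples)]
    high = samples[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return wer(errors, ref_words), low, high
```

If the intervals of two systems overlap heavily, a small difference in the single number from the test set is probably not a safe basis for a decision. Resampling utterances rather than individual words is a deliberate choice here, since errors within one utterance are usually not independent.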

So that's why I started with the aforementioned Wikipedia page, trying to find a good book on experiment design. Of course, it would also be nice to find appropriate software for managing the experiment workflow.
