Voicemail transcription with Pocketsphinx and Asterisk

This post is for admins who know that pocketsphinx exists and want to try it. It describes how to quickly set up voicemail transcription using pocketsphinx and Asterisk. The process is extremely simple; I promise it will not take more than 5 minutes.

We'll use an external shell command that Asterisk invokes when a voicemail arrives; the command transcribes any voicemails which aren't transcribed yet. We won't postprocess the result or clean it up. The goal is just to show how to do it quickly and how an Asterisk interface can be built.


So, let's start

1. Set up Asterisk. I hope it will go smoothly, it's really easy. Install the sample configuration with make samples; our demo is based on it.

2. Set up pocketsphinx. Download pocketsphinx and sphinxbase from the download page. You need at least pocketsphinx version 0.7; earlier versions will not work because some of the required features are only available in that release and later.

3. Check that pocketsphinx works. Just run pocketsphinx_continuous and say something when READY appears. The decoding result will appear before the next READY.
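
If you are not sure which version you ended up with, a small comparison helper can check the 0.7 minimum. This is only a sketch: the function name version_at_least is made up for illustration, it relies on GNU sort's -V option, and the usage line assumes your pocketsphinx install registered itself with pkg-config.

```shell
# Hypothetical helper: succeeds if version $1 is greater than or equal to version $2.
# Assumes GNU sort, whose -V option orders strings as version numbers.
version_at_least() {
    # The smaller of the two versions sorts first; it must be the required one.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

With that in place, something like version_at_least "$(pkg-config --modversion pocketsphinx)" 0.7 && echo ok should tell you whether the install is recent enough.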

000000000: hello world

4. Download the Communicator acoustic model for telephone speech and unpack it somewhere, for example into $prefix/var/lib/asterisk/communicator. The directory must contain files like mdef, variances, etc.

5. Edit $prefix/etc/asterisk/voicemail.conf and configure the external callback script (write out your real installation path in place of $prefix):
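
A quick sanity check of the unpacked model can save some head scratching later. The sketch below assumes the usual Sphinx acoustic model layout; the exact file list (mdef, means, variances, transition_matrices) is my assumption, so adjust it to whatever your model archive actually contains. The function name check_model_dir is hypothetical.

```shell
# Hypothetical helper: report which expected model files are missing from a directory.
# The file list is an assumption about the model layout, not something pocketsphinx mandates here.
check_model_dir() {
    local dir="$1"
    local missing=0
    local f
    for f in mdef means variances transition_matrices; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    # Returns 0 when every file was found, 1 otherwise
    return "$missing"
}
```

Run it as check_model_dir $prefix/var/lib/asterisk/communicator; no output and a zero exit status mean all the listed files are there.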

externnotify=$prefix/sbin/voicemail-notify.sh
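
Asterisk runs this program with the voicemail context, the mailbox number and the number of new messages as arguments, which is why the script in the next step builds its path from $1 and $2. As a rough sketch of that mapping (the helper name inbox_dir is made up, and the prefix is passed explicitly just to keep the example self-contained):

```shell
# Hypothetical helper: build the INBOX path for a mailbox.
# $1 = Asterisk installation prefix, $2 = voicemail context, $3 = mailbox number
inbox_dir() {
    printf '%s/var/spool/asterisk/voicemail/%s/%s/INBOX\n' "$1" "$2" "$3"
}
```

For the sample mailbox used later in this post, inbox_dir "$prefix" default 1234 gives the folder where the wav files land.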

6. Now let's create the script voicemail-notify.sh in the folder $prefix/sbin, where all the other Asterisk binaries reside. Copy and paste it from below, change its permissions to 755, and don't forget to set prefix to your Asterisk installation folder.

#!/bin/bash

# Asterisk calls this script with the voicemail context as $1
# and the mailbox number as $2.
prefix=<PUT YOUR ASTERISK PREFIX HERE>
voicemaildir="$prefix/var/spool/asterisk/voicemail/$1/$2/INBOX"

for audiofile in "$voicemaildir"/*.wav; do
    # Skip the unexpanded pattern when the INBOX has no .wav files
    [ -e "$audiofile" ] || continue
    # Replace only the trailing .wav, not a "wav" elsewhere in the path
    transcriptfile="${audiofile%.wav}.transcript"
    # For each message.wav we check if message.transcript exists
    if [ ! -f "$transcriptfile" ]; then
        # If not, we create it
        pocketsphinx_continuous -infile "$audiofile" \
                                -hmm "$prefix/var/lib/asterisk/communicator" \
                                -samprate 8000 2> /dev/null > "$transcriptfile"
        # Now we can do whatever we want with the new transcription
        # Send it by mail for example
        # mail $user < $transcriptfile
    fi
done
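
If you do mail the transcript, you may not want the utterance ID in front of each line. Assuming the transcript lines look like the sample output in step 3 (digits, a colon, a space, then the text), a one-line filter is enough. The function name strip_utterance_ids is hypothetical, and this is just a sketch, not real cleanup.

```shell
# Hypothetical filter: drop a leading "<digits>: " prefix from each line on stdin.
# Lines that do not start with such a prefix pass through unchanged.
strip_utterance_ids() {
    sed -E 's/^[0-9]+: //'
}
```

It reads stdin and writes stdout, so it can sit in a pipe, for example strip_utterance_ids < "$transcriptfile" | mail $user.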

7. Start Asterisk, or reload the configuration with "voicemail reload".

8. Dial extension 1234

*CLI> console dial 1234

and leave a voicemail.

9. Check that your voicemail was transcribed automatically and that the transcription was placed next to the wav file in the voicemail folder:

ls $prefix/var/spool/asterisk/voicemail/default/1234/INBOX/*.transcript
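
If nothing shows up, it helps to see which messages are still waiting for a transcript. The sketch below lists every wav file in a folder that has no matching .transcript file next to it; the function name untranscribed is made up for this example.

```shell
# Hypothetical helper: print each .wav file in directory $1 that has no .transcript sibling.
untranscribed() {
    local dir="$1"
    local wav
    for wav in "$dir"/*.wav; do
        # Skip the unexpanded pattern when the directory has no .wav files
        [ -e "$wav" ] || continue
        [ -f "${wav%.wav}.transcript" ] || printf '%s\n' "$wav"
    done
}
```

Point it at the INBOX folder, for example untranscribed $prefix/var/spool/asterisk/voicemail/default/1234/INBOX. Empty output means every message has a transcript file.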

You can also send the transcript by mail or do whatever you want with it. Easy, isn't it? Well, I didn't mention that you need a better language model and all the tricks to improve the transcription accuracy for your voicemails, but that's a separate story.

Update
The second part is here: http://nsh.nexiwave.com/2011/04/voicemail-transcription-with.html