Tuesday, 1 May 2012

Sphinx4 custom acoustic model files notes.

The whole process of creating a custom acoustic model is described here.
Read it thoroughly. If it is still unclear which files are required and where to get them, this note is for you.

The required directory structure is:

  1. etc
    1. your_db.dic - Phonetic dictionary
    2. your_db.phone - Phoneset file
    3. your_db.lm.DMP - Language model
    4. your_db.filler - List of fillers
    5. your_db_train.fileids - List of files for training
    6. your_db_train.transcription - Transcription for training
    7. your_db_test.fileids - List of files for testing
    8. your_db_test.transcription - Transcription for testing
  2. wav
    1. speaker_1
      1. file_1.wav - Recording of speech utterance
    2. speaker_2
      1. file_2.wav
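
Before training, it can help to check which of these files are still missing. The following is a minimal sketch, assuming the layout above sits under a single root directory; the helper names `missing_files` and `count_recordings` are made up for this note and are not part of Sphinx.

```python
from pathlib import Path

# File name patterns taken from the layout above; "{db}" stands for the database name.
REQUIRED_ETC_FILES = [
    "{db}.dic",
    "{db}.phone",
    "{db}.lm.DMP",
    "{db}.filler",
    "{db}_train.fileids",
    "{db}_train.transcription",
    "{db}_test.fileids",
    "{db}_test.transcription",
]


def missing_files(root, db_name):
    """Return the expected etc/ files that do not exist yet, as relative paths."""
    etc_dir = Path(root) / "etc"
    missing = []
    for pattern in REQUIRED_ETC_FILES:
        name = pattern.format(db=db_name)
        if not (etc_dir / name).is_file():
            missing.append("etc/" + name)
    return missing


def count_recordings(root):
    """Count .wav files stored as wav/<speaker>/<file>.wav (hypothetical helper)."""
    return len(list((Path(root) / "wav").glob("*/*.wav")))
```

Calling `missing_files(".", "your_db")` from the database root returns the list of files that still have to be created; an empty list means the etc directory is complete.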


The following files can be built by the lmtool web service:
  1. your_db.dic - Phonetic dictionary
  2. your_db.phone - Phoneset file
  3. your_db.filler - List of fillers
After you've got those files ready, you'll need the .DMP file:
  1. your_db.lm.DMP - Language model
It is generated from the .lm file with the sphinx_lm_convert program, which ships with the sphinxbase-0.7 archive. See this section for sphinxbase installation instructions. The first command below converts the .lm file to DMP format; the second performs the reverse conversion, from DMP back to the ARPA text format:

 sphinx_lm_convert -i model.lm -o model.dmp
 sphinx_lm_convert -i model.dmp -ifmt dmp -o model.lm -ofmt arpa
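
If you prefer to run the conversion from a script, the first command can be wrapped as in the sketch below. It assumes sphinx_lm_convert is on your PATH; the function names are invented for this note, and `-ofmt dmp` is passed explicitly on the assumption that it mirrors the `-ifmt dmp` value used above.

```python
import shutil
import subprocess


def lm_convert_command(lm_path, dmp_path):
    """Build the sphinx_lm_convert call that turns an ARPA .lm file into a DMP file."""
    # -ofmt dmp states the output format explicitly instead of relying on the file extension.
    return ["sphinx_lm_convert", "-i", lm_path, "-o", dmp_path, "-ofmt", "dmp"]


def convert_lm(lm_path, dmp_path):
    """Run the conversion; raises if the tool is missing or exits with an error."""
    if shutil.which("sphinx_lm_convert") is None:
        raise FileNotFoundError("sphinx_lm_convert not found on PATH; install sphinxbase first")
    subprocess.run(lm_convert_command(lm_path, dmp_path), check=True)
```

For the layout in this note the call would be `convert_lm("etc/your_db.lm", "etc/your_db.lm.DMP")`.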
Once that works, list all the audio files you want to use for training, together with their matching phrases, in the remaining files:
  1. your_db_train.fileids - List of files for training
  2. your_db_train.transcription - Transcription for training
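
Writing these two files by hand gets tedious, so here is a rough sketch that generates them from the wav directory. It rests on assumptions you should check against the training instructions: that a fileids entry is the path relative to wav/ without the .wav extension, and that a transcription line looks like `<s> phrase </s> (file_1)`. The function name and the `transcripts` mapping are hypothetical.

```python
from pathlib import Path


def write_training_lists(root, db_name, transcripts, subset="train"):
    """Write etc/<db>_<subset>.fileids and .transcription for the recordings in wav/.

    transcripts maps a file name without extension (e.g. "file_1") to its phrase.
    """
    root = Path(root)
    wav_dir = root / "wav"
    fileids, lines = [], []
    for wav in sorted(wav_dir.glob("*/*.wav")):
        utt_id = wav.stem
        if utt_id not in transcripts:
            continue  # skip recordings without a known phrase
        # Assumed fileids format: path relative to wav/, without the .wav extension.
        fileids.append(wav.relative_to(wav_dir).with_suffix("").as_posix())
        # Assumed transcription format: <s> phrase </s> (utterance id).
        lines.append(f"<s> {transcripts[utt_id]} </s> ({utt_id})")
    etc_dir = root / "etc"
    etc_dir.mkdir(exist_ok=True)
    # Both files list the utterances in the same order, one per line.
    (etc_dir / f"{db_name}_{subset}.fileids").write_text("\n".join(fileids) + "\n")
    (etc_dir / f"{db_name}_{subset}.transcription").write_text("\n".join(lines) + "\n")
    return fileids
```

Passing `subset="test"` writes the your_db_test files in the same way. The words in each phrase should match the entries in your_db.dic.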
