Extending Gentle Aligner
Week 7
Decoding using a customized language model
Here proto_dir is kaldi/egs/gentle/data, where gentle is the package cloned from git.
1. run initialize.py
python3 scripts/initialize.py proto_dir model_dir path_to_lexicon.txt
This script ensures that proto_dir/ and model_dir/ have the correct directory structure and files, and that lexicon.txt is available.
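As a rough illustration of the kind of check described above, the sketch below reports which of the three expected inputs are missing. The function name find_missing_paths is hypothetical and is not taken from initialize.py; the real script may verify more than this, for example specific files inside each directory.

```python
import os


def find_missing_paths(proto_dir, model_dir, lexicon_path):
    """Return the inputs that do not exist on disk (hypothetical helper)."""
    missing = []
    # Both directories must exist and actually be directories.
    for directory in (proto_dir, model_dir):
        if not os.path.isdir(directory):
            missing.append(directory)
    # The lexicon must be a regular file.
    if not os.path.isfile(lexicon_path):
        missing.append(lexicon_path)
    return missing
```

An empty list means all three inputs were found; otherwise the list names what still needs to be created before moving on to the next step.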
2. run generateLM.py
python3 generateLM.py path_to_utterance proto_dir kaldi_path
This script makes sure that the correct language inputs, such as lexicon.txt and nonsilence_phones.txt, are created, prepares L.fst and related files using
Kaldi, and provides these inputs to Gentle for generating the decoding graph (HCLG.fst).
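To make the relationship between these inputs concrete, here is a minimal sketch of deriving the contents of nonsilence_phones.txt from a Kaldi-style lexicon, where each line is a word followed by its phones. The function name and the set of silence phones (SIL, SPN, NSN) are assumptions for illustration; generateLM.py may build these files differently.

```python
def collect_nonsilence_phones(lexicon_lines, silence_phones=("SIL", "SPN", "NSN")):
    """Collect the sorted set of non-silence phones used in a lexicon.

    Each lexicon line is expected to look like: WORD PHONE1 PHONE2 ...
    The default silence phones are an assumption, not taken from the project.
    """
    phones = set()
    for line in lexicon_lines:
        fields = line.split()
        # Skip blank lines and entries that have no pronunciation.
        if len(fields) < 2:
            continue
        # Everything after the word itself is a phone.
        phones.update(fields[1:])
    return sorted(phones - set(silence_phones))
```

Writing the returned phones one per line gives a file in the shape Kaldi expects for nonsilence_phones.txt.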
3. run automateDecoding.py
python3 scripts/automateDecoding.py kaldi_path audio_file utterance_path proto_dir
This script creates speech features for the audio input and generates GMM-based lattices using the customized decoding graph.
The best path through these lattices, i.e. the best word sequence for the utterance, is then extracted and time-aligned
using Kaldi's nbest-to-ctm, giving us the phone-level and word-level time alignments for the given utterance. This script also
produces JSON inputs for alignment visualization.
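The last step, turning CTM output into JSON, can be sketched as follows. A CTM line has the form "utterance-id channel start duration token", optionally followed by a confidence value. The function names and the JSON field names here are hypothetical; the schema that automateDecoding.py actually writes may differ.

```python
import json


def ctm_to_records(ctm_lines):
    """Parse CTM lines into dicts with start and end times in seconds."""
    records = []
    for line in ctm_lines:
        fields = line.split()
        # A valid CTM line has at least five fields; skip anything shorter.
        if len(fields) < 5:
            continue
        utt_id, _channel, start, duration, token = fields[:5]
        start = float(start)
        records.append({
            "utterance": utt_id,
            "token": token,  # a word or a phone, depending on the CTM
            "start": round(start, 3),
            "end": round(start + float(duration), 3),
        })
    return records


def ctm_to_json(ctm_lines):
    """Serialize the parsed records as a JSON string."""
    return json.dumps(ctm_to_records(ctm_lines), indent=2)
```

The same function works for both word-level and phone-level CTM files, since the two differ only in what the token column holds.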
Sample output from automateDecoding.py looks like this:
Tools: Kaldi, Python, C, Bash Scripting
Link to GSoC Project Repository