Extending Gentle Aligner
Week 4
Automate Decoding Process
-
Data preparation:
The utt2spk, spk2utt and wav.scp files are generated by the script ‘decoding.py’. Find the code here.
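As a rough illustration of what such a data-preparation script might do, the sketch below builds the three files from a list of wav paths. It is not the author's ‘decoding.py’: the function names, the rule that the utterance ID is the file name without its extension, and the default of treating every utterance as its own speaker are all assumptions made for this example. The file layouts follow the usual Kaldi conventions (one "utterance path" pair per line in wav.scp, one "utterance speaker" pair per line in utt2spk, and one speaker followed by its utterances per line in spk2utt, all sorted).

```python
import os
from collections import defaultdict


def build_kaldi_data_files(wav_paths, speaker_of=None):
    """Return the text of wav.scp, utt2spk and spk2utt for the given wav files.

    speaker_of is a hypothetical callback mapping an utterance ID to a speaker ID.
    """
    if speaker_of is None:
        # Assumption: no speaker labels are known, so each utterance is its own speaker.
        speaker_of = lambda utt_id: utt_id

    # Assumption: the utterance ID is the file name without its extension.
    utt_to_path = {}
    for path in wav_paths:
        utt_id = os.path.splitext(os.path.basename(path))[0]
        utt_to_path[utt_id] = path

    # Kaldi expects these files to be sorted by their first column.
    utt_ids = sorted(utt_to_path)
    spk_to_utts = defaultdict(list)
    for utt_id in utt_ids:
        spk_to_utts[speaker_of(utt_id)].append(utt_id)

    wav_scp = "".join(f"{u} {utt_to_path[u]}\n" for u in utt_ids)
    utt2spk = "".join(f"{u} {speaker_of(u)}\n" for u in utt_ids)
    spk2utt = "".join(
        f"{s} {' '.join(spk_to_utts[s])}\n" for s in sorted(spk_to_utts)
    )
    return {"wav.scp": wav_scp, "utt2spk": utt2spk, "spk2utt": spk2utt}


def write_kaldi_data_files(wav_paths, data_dir, speaker_of=None):
    """Write the three files into data_dir, creating the directory if needed."""
    os.makedirs(data_dir, exist_ok=True)
    files = build_kaldi_data_files(wav_paths, speaker_of)
    for name, text in files.items():
        with open(os.path.join(data_dir, name), "w", encoding="utf-8") as handle:
            handle.write(text)
```

Calling write_kaldi_data_files(paths, "data") would place the files where the decoding commands in this post look for them (data/wav.scp and data/spk2utt).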
-
Decoding steps:
compute-mfcc:

    ./src/featbin/compute-mfcc-feats --config=voxforge_ru/conf/mfcc.conf scp,p:data/wav.scp ark,scp:data/feats.ark,data/feats.scp

compute-cmvn:

    ./src/featbin/compute-cmvn-stats --spk2utt=ark:data/spk2utt scp:data/feats.scp ark,scp:data/cmvn.ark,data/cmvn.scp

gmm-latgen-faster:

    gmm-latgen-faster$thread_string --max-active=$max_active --beam=$beam \
      --lattice-beam=$lattice_beam --acoustic-scale=$acwt --allow-partial=true \
      --word-symbol-table=$graphdir/words.txt $decode_extra_opts \
      $model $graphdir/HCLG.fst "$feats" "ark:|gzip -c > $dir/lat.JOB.gz" \
      "ark,t:$dir/words.JOB" "ark,t:$dir/alignments.JOB" || exit 1;
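One way to automate the first two steps above from Python is to assemble the command lines as argument lists and hand them to subprocess. The sketch below is only an illustration under assumptions: the function names and the kaldi_root, data_dir and mfcc_conf parameters are hypothetical, while the binaries, flags and file names are the ones shown in the commands above. The gmm-latgen-faster step is left out because its options come from shell variables that are not given here.

```python
import subprocess


def feature_commands(kaldi_root=".", data_dir="data",
                     mfcc_conf="voxforge_ru/conf/mfcc.conf"):
    """Return argv lists for the MFCC and CMVN steps, in the order they run."""
    mfcc = [
        f"{kaldi_root}/src/featbin/compute-mfcc-feats",
        f"--config={mfcc_conf}",
        f"scp,p:{data_dir}/wav.scp",
        f"ark,scp:{data_dir}/feats.ark,{data_dir}/feats.scp",
    ]
    cmvn = [
        f"{kaldi_root}/src/featbin/compute-cmvn-stats",
        f"--spk2utt=ark:{data_dir}/spk2utt",
        f"scp:{data_dir}/feats.scp",
        f"ark,scp:{data_dir}/cmvn.ark,{data_dir}/cmvn.scp",
    ]
    return [mfcc, cmvn]


def run_feature_extraction(**kwargs):
    """Run both steps; requires the Kaldi binaries to exist at the given paths."""
    for argv in feature_commands(**kwargs):
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a failed MFCC step prevents the CMVN step from running.
        subprocess.run(argv, check=True)
```

Passing each argument as its own list element avoids shell quoting issues, and keeping command construction separate from execution makes the commands easy to inspect before anything is run.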
-
Alignment generation:
Phoneme Alignment:

    ./src/bin/ali-to-phones --ctm-output exp/tri2a/final.mdl ark:exp/tri2a/decode/alignments.1 1.ctm

Word-level Alignment:

    ./steps/get_ctm.sh ./transcript ./data/lang ./exp/tri2a/decode/
-
Preparing data for visualization:
[Link to the code for converting to JSON format](https://github.com/shreya2111/gentle-labs/tree/master/v1/scripts)
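For readers who do not want to open the repository, here is a minimal sketch of what a CTM-to-JSON conversion could look like. It is not the linked script: the function name and the JSON layout (a mapping from utterance ID to a list of label/start/end entries) are assumptions chosen for illustration. It relies only on the standard CTM column order of utterance ID, channel, start time, duration and label, and ignores any further columns such as a confidence score.

```python
import json


def ctm_to_json(ctm_lines):
    """Group CTM entries by utterance and return them as a JSON string."""
    utterances = {}
    for line in ctm_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        utt_id, _channel, start, duration, label = fields[:5]
        start = float(start)
        end = start + float(duration)
        # Hypothetical output layout: one {label, start, end} entry per CTM line.
        utterances.setdefault(utt_id, []).append(
            {"label": label, "start": round(start, 3), "end": round(end, 3)}
        )
    return json.dumps(utterances, indent=2, sort_keys=True)
```

For a phone-level CTM written by ali-to-phones the label column holds whatever phone symbols or IDs appear in the file; this sketch passes them through unchanged.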
Tools: Kaldi, Python, C, Bash Scripting
Link to GSoC Project Repository