GSoC

Extending Gentle Aligner

Week 4

Automate Decoding Process

Data preparation:

utt2spk, spk2utt and wav.scp are generated using the script ‘decoding.py’ Find the code here

Decoding steps:

  compute-mfcc
  ./src/featbin/compute-mfcc-feats --config=voxforge_ru/conf/mfcc.conf scp,p: data/wav.scp ark,scp: data/feats.ark,data/feats.scp
    
  compute-cmvn
  ./src/featbin/compute-cmvn-stats --spk2utt=ark:data/spk2utt scp:data/feats.scp ark,scp:data/cmvn.ark,data/cmvn.scp

  gmm-latgen-faster
  gmm-latgen-faster$thread_string --max-active=$max_active --beam=$beam --lattice-beam=$lattice_beam --acoustic-scale=$acwt 
  --allow-partial=true --word-symbol-table=$graphdir/words.txt $decode_extra_opts $model $graphdir/HCLG.fst "$feats" "ark:|gzip -c > 
  $dir/lat.JOB.gz" "ark,t:$dir/words.JOB" "ark,t:$dir/alignments.JOB" || exit 1;

Alignment generation:

  Phoneme Alignment:
  ./src/bin/ali-to-phones --ctm-output exp/tri2a/final.mdl ark: exp/tri2a/decode/alignments.1 1.ctm

  Word-level Alignment:
  ./steps/get_ctm.sh ./transcript ./data/lang ./exp/tri2a/decode/

Preparing data for visualization:

  [link to code for converting to json format](https://github.com/shreya2111/gentle-labs/tree/master/v1/scripts)

Next: Building a Russian language model - Reverse engineering HCLG.fst Week 5
Prev: Visualization using Javascript Week 3
main page

Tools: Kaldi, Python, C, Bash Scripting

Link to GSoC Project Repository

GSoC 2019 | Red Hen Lab

Google Summer of Code 2019 | Red Hen Lab | Robert Ochshorn | Shreya

Extending Gentle Aligner

Week 4

Automate Decoding Process