Extending Gentle Aligner
Week 12
Summary
This is the last blog post of my GSoC 2019 journey!
Overview:
- Trained a Russian automatic speech recognition (ASR) model, built a customized Russian language model, and used it to produce time alignments and visualizations.
- Built a Singularity container with my work on time alignment of Russian data
- Documented the work through blog posts and a Git repository
- The Singularity image for the project can be found at gentle-singularity
- You can find the built Singularity image on Google Drive (link to google drive)
- The idea was to make Gentle produce time alignments in languages other than English.
- My work generates automated time alignments in Russian using Gentle’s language model generation and decoding graph compilation code. I made extensive use of Kaldi’s C++ libraries to generate the initial language files, decode the audio input, and produce time-marked conversation (CTM) files (link to code).
- Visualization: I also built a visualization that clearly represents the time alignments, driven by the JSON alignment data that my code generates (link to viz).
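To make the CTM-to-JSON step concrete, here is a minimal sketch of how word timings in a CTM file could be turned into JSON for a visualization. It assumes the common Kaldi CTM layout (`<utt-id> <channel> <start> <duration> <word> [confidence]`); the `WordAlignment` class, the `parse_ctm` helper, the JSON field names, and the sample data are all made up for illustration and are not taken from the project repository.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class WordAlignment:
    utterance: str
    word: str
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio


def parse_ctm(text):
    """Parse CTM lines: <utt-id> <channel> <start> <duration> <word> [confidence]."""
    words = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank, malformed, and ";;" comment lines.
        if len(fields) < 5 or fields[0].startswith(";;"):
            continue
        utt, _channel, start, duration, word = fields[:5]
        start = float(start)
        end = start + float(duration)
        words.append(WordAlignment(utt, word, round(start, 3), round(end, 3)))
    return words


# Hypothetical sample data, not real decoder output.
sample_ctm = """\
utt1 1 0.00 0.42 привет
utt1 1 0.42 0.35 мир
"""

alignments = parse_ctm(sample_ctm)
alignment_json = json.dumps(
    [asdict(w) for w in alignments], ensure_ascii=False, indent=2
)
print(alignment_json)
```

Storing an explicit `end` time rather than a duration is just a convenience for drawing word spans on a timeline; the actual JSON schema used by the project may differ.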
What’s Done?
- Building a sentence-specific language model for Russian
- Automated decoding using the customized language model
- Extension to a customized German language model [Ongoing]
- Built Singularity images on an AWS machine
- Proper documentation on GitHub and the Jekyll blog
To Be Done Next!
- Getting automated time alignments for German using the customized language model
- Evaluating the results of the customized language model against the whole language model (trained on the entire dataset)
References:
- Gentle
- Russian ASR
- German ASR
- Blog for building language model
- kaldi-tutorial
- Little Distributed Red Hen Lab
By customized language model I mean a language model built from just the one utterance in that language that needs to be time-aligned with its audio data.
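As a rough illustration of what "built from just one utterance" means, the sketch below derives a tiny bigram model from a single transcript: each word may only be followed by words that actually follow it in that utterance. This is a simplified stand-in, not Gentle's or Kaldi's actual code; the `build_utterance_bigrams` helper, the `<s>`/`</s>` boundary tokens, and the uniform probabilities are assumptions made for the example.

```python
from collections import defaultdict


def build_utterance_bigrams(transcript):
    """Return {previous_word: {next_word: probability}} for one utterance."""
    # Assumed sentence-boundary markers; real toolkits may use other symbols.
    tokens = ["<s>"] + transcript.lower().split() + ["</s>"]

    # Collect the set of words observed after each word.
    successors = defaultdict(set)
    for prev, nxt in zip(tokens, tokens[1:]):
        successors[prev].add(nxt)

    # Assumption: spread probability uniformly over the observed successors.
    return {
        prev: {word: 1.0 / len(nexts) for word in sorted(nexts)}
        for prev, nexts in successors.items()
    }


# Hypothetical one-utterance transcript.
bigrams = build_utterance_bigrams("я иду домой я иду")
for prev, nexts in bigrams.items():
    print(prev, nexts)
```

Because the model only allows word sequences seen in the transcript, the decoder's search space stays very small, which is what makes such a model suitable for alignment rather than open-ended recognition.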
Tools: Kaldi, Python, C, Bash Scripting
Link to GSoC Project Repository