kaldi-grammar-compiler is a minimal tool that transforms Regulus Lite fixed grammars into compiled Finite State Transducers (FSTs). This makes them readable as language models (G.fst) in Kaldi, so that they can be used as part of an Automatic Speech Recognition (ASR) system. Our aim is to provide a straightforward tool for adding grammar-based language models to the Kaldi Speech Technology Toolkit.
The tool that we present is built from a collection of programs written in Bash, Perl and Python using standard libraries and the C++ OpenFST library.
To run this tool, you must first install the latest version of Kaldi:
$ git clone https://github.com/kaldi-asr/kaldi.git .
You will then need to clone (or fork) the kaldi-grammar-compiler repository onto your machine:
$ git clone https://github.com/lormaechea/kaldi-grammar-compiler.git .
Once you have installed the required components and cloned the kaldi-grammar-compiler repository, edit (if necessary) the path.sh script found in the main directory, then export the Kaldi path by running it with the source command:
$ source path.sh
The grammar is generated by the genGrammar.sh script, which takes a source-format grammar file as input and transforms it into a G.fst binary file. Three options must be specified:
Option | Description |
---|---|
-g, --grammar | The input Regulus Lite grammar file. |
-s, --stage | The stage of the compilation process to run (1, 2 or 3). |
-d, --draw_graph | Whether to also produce a visual .png representation of the resulting G.fst graph (yes/no). Note: this is strongly discouraged for very large grammars, as it can exhaust the available RAM. |
The execution is as follows:
$ bash genGrammar.sh --grammar=<input_file> --stage=<stage> --draw_graph=<yes/no>
Two Regulus Lite grammars are made available inside workingExamples/ (medico.rl and homeautomation.rl).
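If you prefer to drive the compiler from a Python script, the documented call can be assembled as in the sketch below. The helper name build_command is hypothetical (it is not part of the repository), the stage value is only an example, and the sketch assumes it is run from the repository root in an environment where path.sh has already been sourced.

```python
import subprocess


def build_command(grammar: str, stage: int, draw_graph: bool = False) -> list[str]:
    """Assemble the genGrammar.sh call with the three documented options.

    Hypothetical helper: it only mirrors the command line shown in this README.
    """
    return [
        "bash",
        "genGrammar.sh",
        f"--grammar={grammar}",
        f"--stage={stage}",
        f"--draw_graph={'yes' if draw_graph else 'no'}",
    ]


if __name__ == "__main__":
    # Example values only; pick the grammar and stage you actually need.
    cmd = build_command("workingExamples/medico.rl", stage=1)
    print(" ".join(cmd))
    # Uncomment to really run the script (requires Kaldi to be installed):
    # subprocess.run(cmd, check=True)
```

Returning the command as a list rather than a single string avoids shell quoting problems if the grammar path contains spaces.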
Before the grammars can be compiled, they must first be prepared and normalized. To this end, the main script calls prepareGrammar.pl, which will:
- First take as input a source grammar written in the Regulus Lite formalism and split the main grammar (a set of phrases representing a specific domain of discourse) from the sub-grammars (word classes represented by non-terminal symbols).
- Then normalize the data so that it can be properly converted into FSTs in the next phase.
This produces the following output files (assuming that medico.rl is the <INPUT_FILE>):
- medico_main.txt → Main grammar.
- medico_sub.txt → Sub-grammars.
- medico_main_norm.txt → Normalized main grammar.
- medico_sub_norm.txt → Normalized sub-grammars.
- newWords.txt → A file that adds to a pre-existing lexicon the non-terminal symbols and Out-Of-Vocabulary (OOV) forms that are found in the input grammar but missing from the lexicon. It assigns an identifier to each unit.
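To make the role of newWords.txt more concrete, here is a minimal sketch of how missing symbols could be given identifiers. It is not the code of prepareGrammar.pl: the function name, the `$body_part` notation for a non-terminal, and the numbering scheme (continuing after the highest identifier already in the lexicon) are all assumptions made for illustration.

```python
def new_word_entries(lexicon: dict[str, int], grammar_symbols: list[str]) -> dict[str, int]:
    """Return identifiers for the grammar symbols that the lexicon lacks.

    Assumption: new identifiers continue after the highest existing one.
    """
    next_id = max(lexicon.values(), default=-1) + 1
    added: dict[str, int] = {}
    for symbol in grammar_symbols:
        # Skip symbols already known, and do not number a symbol twice.
        if symbol not in lexicon and symbol not in added:
            added[symbol] = next_id
            next_id += 1
    return added


if __name__ == "__main__":
    # Toy lexicon and grammar symbols, invented for this example.
    lexicon = {"<eps>": 0, "the": 1, "pain": 2}
    symbols = ["the", "$body_part", "migraine", "migraine"]
    for symbol, ident in new_word_entries(lexicon, symbols).items():
        print(symbol, ident)
```

Only the symbols absent from the lexicon are returned, so the existing entries keep their identifiers.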
Once the normalized files have been produced, we can move on to the compilation process. The genGrammar.sh script will transform the input files into binary Finite State Transducers (FSTs). To do so, it will follow 3 steps:
Step | Description |
---|---|
--stage=1 | First, it compiles the main grammar. |
--stage=2 | It then compiles every sub-grammar word class. |
--stage=3 | Finally, it replaces the non-terminal symbols found in the main grammar with their corresponding terminals (found in the sub-grammars). |
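The effect of the third stage can be illustrated at the level of plain strings. The sketch below is not how genGrammar.sh works internally (the tool operates on compiled FSTs); it only shows what replacing non-terminals with their terminals means. The function name, the `$room` notation and the toy phrases are hypothetical.

```python
from itertools import product


def expand(phrase: str, sub_grammars: dict[str, list[str]]) -> list[str]:
    """List every sentence obtained by substituting each non-terminal.

    Tokens that are not keys of sub_grammars are kept as they are.
    """
    slots = [sub_grammars.get(token, [token]) for token in phrase.split()]
    return [" ".join(choice) for choice in product(*slots)]


if __name__ == "__main__":
    # Toy main-grammar phrase and word class, invented for this example.
    sub_grammars = {"$room": ["kitchen", "bathroom"]}
    for sentence in expand("switch on the light in the $room", sub_grammars):
        print(sentence)
```

A main-grammar phrase with one non-terminal that has two terminals therefore yields two accepted sentences; with several non-terminals the counts multiply, which is why large grammars are kept in compact FST form rather than enumerated.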
Once the pipeline has finished, a folder containing the resulting grammar, G.fst, is created. It can now be used as a language model in your Kaldi ASR experiments!
Licensed under the Apache 2.0 License.