Audiobook alignment for North American Indigenous languages
The concept is a web application with a series of processing stages that ultimately produces a time-aligned audiobook, i.e. a package of:
- SMIL file describing time alignments
- TEI file describing text
- Audio file (WAV or MP3)
This package can be loaded using the read-along JavaScript component.
Optionally, a book can be generated as a standalone HTML page or as an ePub file.
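As a rough illustration of what the time alignments in the SMIL file carry, the sketch below reads `<par>` elements that pair a text fragment with an audio clip. It assumes the layout used by EPUB Media Overlays (`<text src="...#id"/>` alongside `<audio clipBegin="..." clipEnd="..."/>`) and clip times written as plain seconds; the sample document, file names, and the `read_alignments` helper are hypothetical, and the files this project actually emits may differ.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"

# Hypothetical SMIL content, for illustration only.
EXAMPLE_SMIL = """<smil xmlns="http://www.w3.org/ns/SMIL" version="3.0">
  <body>
    <par id="par1">
      <text src="book.xml#w1"/>
      <audio src="book.wav" clipBegin="0.50" clipEnd="0.92"/>
    </par>
    <par id="par2">
      <text src="book.xml#w2"/>
      <audio src="book.wav" clipBegin="0.92" clipEnd="1.40"/>
    </par>
  </body>
</smil>"""


def read_alignments(smil_text):
    """Return one (fragment id, start seconds, end seconds) tuple per <par>."""
    root = ET.fromstring(smil_text)
    alignments = []
    for par in root.iter(f"{{{SMIL_NS}}}par"):
        text = par.find(f"{{{SMIL_NS}}}text")
        audio = par.find(f"{{{SMIL_NS}}}audio")
        if text is None or audio is None:
            continue  # skip entries that lack either half of the pairing
        # The fragment id is the part of the text reference after '#'.
        fragment = text.get("src").split("#", 1)[-1]
        # Assumption: clip times are plain seconds, not SMIL clock values.
        start = float(audio.get("clipBegin"))
        end = float(audio.get("clipEnd"))
        alignments.append((fragment, start, end))
    return alignments


if __name__ == "__main__":
    for fragment, start, end in read_alignments(EXAMPLE_SMIL):
        print(f"{fragment}: {start:.2f}s to {end:.2f}s")
```

A player can use such a list to highlight each text fragment while its audio clip is playing.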
The processing stages are:
- (optional) Pre-segment inputs, consisting of:
  - Single audio file
  - Text with page markings (assume paragraph breaks = pages)
- Input pages: each page consists of
  - Image file
  - Audio file
  - Text
- Run alignment
- View output and download components
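The pre-segmentation rule "paragraph breaks = pages" could be sketched as follows. This is a minimal sketch, not the project's implementation: it assumes a paragraph break is one or more blank lines, and `split_pages` is a hypothetical helper name.

```python
import re


def split_pages(text):
    """Split text into pages, treating each blank-line gap as a page break."""
    # One or more blank (or whitespace-only) lines count as a single break.
    pages = re.split(r"\n\s*\n", text.strip())
    return [page.strip() for page in pages if page.strip()]


if __name__ == "__main__":
    sample = "Page one.\nStill page one.\n\nPage two.\n\n\nPage three.\n"
    for number, page in enumerate(split_pages(sample), start=1):
        print(number, repr(page))
```

Single line breaks stay inside a page, so text that is hard-wrapped within a paragraph is not split.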
- MVP app:
  - Single page (image, audio, text)
  - Select language (crl or atj for now)
  - Run alignment and launch read-along app with output
To start the web app from a source checkout:
pip install -e .
python
>>> from readalongs.app import app
>>> app.run()
To run alignment from the command line:
pip install -e .
readalongs_align --output-xhtml XMLFILE WAVFILE OUTPUTNAME
readalongs_create_epub OUTPUTNAME.smil OUTPUTNAME.epub
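The two commands above can be chained from a script. The sketch below only reuses the command names and the `--output-xhtml` flag exactly as shown above; `build_commands` and `run_pipeline` are hypothetical helper names, and it assumes both commands are on the PATH after `pip install -e .`.

```python
import subprocess


def build_commands(xml_file, wav_file, output_name):
    """Return the align and ePub commands as argument lists."""
    align = ["readalongs_align", "--output-xhtml", xml_file, wav_file, output_name]
    # The ePub step reads the SMIL file written by the alignment step.
    epub = ["readalongs_create_epub", f"{output_name}.smil", f"{output_name}.epub"]
    return [align, epub]


def run_pipeline(xml_file, wav_file, output_name):
    """Run alignment, then ePub creation, stopping at the first failure."""
    for command in build_commands(xml_file, wav_file, output_name):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)
```

Passing argument lists rather than a shell string avoids quoting problems when file names contain spaces.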