Software

I created and maintain the following online and offline software, most of which is related to my research.

Models

Allosaurus: Allosaurus is a pretrained universal phone recognizer for more than 2000 languages. It contains several acoustic models we published:
- Universal Model described in our ICASSP 2020 paper
- Compositional Phonetics Model described in our Interspeech 2021 paper

Online Applications

I have a website containing several applications related to low-resource speech processing. The available tools are as follows:

Corpus Collection

You can create a Kaldi-like corpus from a single text file or a single audio file. These applications were used when we participated in the LoReHLT evaluation. Some of their features are summarized in our Interspeech 2020 demo paper.

Recording Application: You can upload a text file from which you want to create a corpus. The application generates an interface for recording audio for each sentence.
Transcription Application: You can upload one or more audio files from which you want to create a corpus. The application generates an interface for listening to each audio clip and transcribing it.
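
As a rough illustration of what a Kaldi-like corpus looks like on disk, the sketch below writes the three index files that a Kaldi data directory conventionally contains (text, wav.scp and utt2spk) from a list of sentences. It is not the code behind the applications above: the function name write_kaldi_dir, the utterance ID scheme, the default speaker label and the audio path layout are all assumptions made for this example.

```python
import tempfile
from pathlib import Path


def write_kaldi_dir(sentences, out_dir, speaker="spk1", audio_dir="audio"):
    """Write Kaldi-style text, wav.scp and utt2spk files for a list of sentences.

    Hypothetical helper: the ID scheme and audio paths are illustrative only.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Zero-padded IDs prefixed with the speaker keep all files sorted by utterance ID.
    utt_ids = [f"{speaker}-utt{i:04d}" for i in range(1, len(sentences) + 1)]

    with open(out / "text", "w", encoding="utf-8") as text_f, \
         open(out / "wav.scp", "w", encoding="utf-8") as scp_f, \
         open(out / "utt2spk", "w", encoding="utf-8") as u2s_f:
        for utt_id, sentence in zip(utt_ids, sentences):
            text_f.write(f"{utt_id} {sentence}\n")             # utterance ID -> transcript
            scp_f.write(f"{utt_id} {audio_dir}/{utt_id}.wav\n")  # utterance ID -> audio path
            u2s_f.write(f"{utt_id} {speaker}\n")               # utterance ID -> speaker
    return utt_ids


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ids = write_kaldi_dir(["hello world", "good morning"], tmp)
        print(ids)
        print((Path(tmp) / "text").read_text(encoding="utf-8"))
```

In the recording scenario, each sentence from the uploaded text file would get one utterance ID, and the recorded clip for that sentence would be saved at the path listed in wav.scp.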

Speech Recognition

Online Allosaurus: This is an older version of the Allosaurus model. You can upload an audio file to test its recognition online. A command-line interface (CUI) is also available for querying the online model.
Inventory Customization: This is an online tool for creating a phone inventory to customize the Allosaurus model.
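
To give an idea of what a custom phone inventory can do, here is a minimal sketch that restricts per-frame phone scores to an allowed inventory before picking the best phone. It is a conceptual illustration only and does not reproduce the online tool or the Allosaurus code: the function decode_with_inventory, the toy phone set and the scores are all made up for this example.

```python
import math


def decode_with_inventory(frame_scores, universal_phones, inventory):
    """Pick the best-scoring phone per frame, considering only phones in `inventory`.

    Hypothetical helper: `frame_scores` is a list of score lists, one per frame,
    aligned with `universal_phones`.
    """
    allowed = set(inventory)
    missing = allowed - set(universal_phones)
    if missing:
        raise ValueError(f"phones not in the universal set: {sorted(missing)}")

    decoded = []
    for scores in frame_scores:
        # Phones outside the inventory get -inf so they can never win the argmax.
        masked = [s if p in allowed else -math.inf
                  for p, s in zip(universal_phones, scores)]
        best = max(range(len(masked)), key=masked.__getitem__)
        decoded.append(universal_phones[best])
    return decoded


if __name__ == "__main__":
    phones = ["a", "b", "p", "t"]
    scores = [[0.2, 0.7, 0.05, 0.05], [0.2, 0.1, 0.6, 0.1]]
    print(decode_with_inventory(scores, phones, phones))           # full inventory
    print(decode_with_inventory(scores, phones, ["a", "p", "t"]))  # language without "b"
```

With the full inventory the first frame is decoded as "b", while removing "b" from the inventory makes the same frame fall back to the next best allowed phone.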

Speech Synthesis

Low-resource TTS: This is a demo for testing HMM-based parametric synthesis models for many low-resource languages. The models were trained on the Wilderness corpus by Alan W. Black.

Datasets

UCLA Phonetics Corpus: A phone-annotated dataset covering 97 low-resource languages. It is described in our ICASSP 2021 paper, "Multilingual Phonetic Dataset for Low Resource Speech Recognition".

Others

kaldi-cmake: Automatically creates CMakeLists.txt for the Kaldi project.
pytensor: A toy NumPy-based deep learning framework.