
Software

I created and maintain the following online and offline software, most of which is related to my research.

Models

  • Allosaurus: Allosaurus is a pretrained universal phone recognizer for more than 2000 languages. It contains several acoustic models we published:

Online Applications

I have a website hosting several applications for low-resource speech processing. The available tools are as follows:

Corpus Collection

You can create a Kaldi-like corpus dataset from a single text file or a single audio file. These applications were used when we participated in the LoReHLT evaluation. Some of their features are summarized in our Interspeech 2020 demo paper.

  • Recording Application: You can upload a text file from which you want to create a corpus. It will generate an interface for you to record audio for each sentence.

  • Transcription Application: You can upload one or more audio files from which you want to create a corpus. It will generate an interface for you to listen to each audio clip and transcribe it.
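
For readers unfamiliar with the format, a Kaldi-style data directory is usually a set of plain-text index files keyed by utterance ID: `text` (transcripts), `wav.scp` (audio paths), and `utt2spk` (speaker mapping). The sketch below is only an illustration of that layout, not the code behind these applications. The function name `write_kaldi_data_dir`, the default speaker ID, and the `wav/` path layout are all assumptions made for the example.

```python
from pathlib import Path


def write_kaldi_data_dir(sentences, out_dir, speaker="spk1", wav_dir="wav"):
    """Write Kaldi-style `text`, `wav.scp`, and `utt2spk` files.

    Hypothetical helper: one utterance per input sentence, all assigned
    to a single assumed speaker. Returns the generated utterance IDs.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Prefix utterance IDs with the speaker ID and zero-pad the index so
    # that the files come out sorted, as Kaldi tooling expects.
    utt_ids = [f"{speaker}-utt{i:04d}" for i in range(1, len(sentences) + 1)]

    with open(out / "text", "w", encoding="utf-8") as text_f, \
         open(out / "wav.scp", "w", encoding="utf-8") as scp_f, \
         open(out / "utt2spk", "w", encoding="utf-8") as u2s_f:
        for utt_id, sentence in zip(utt_ids, sentences):
            text_f.write(f"{utt_id} {sentence.strip()}\n")      # transcript
            scp_f.write(f"{utt_id} {wav_dir}/{utt_id}.wav\n")  # assumed audio path
            u2s_f.write(f"{utt_id} {speaker}\n")                # speaker mapping
    return utt_ids
```

Calling `write_kaldi_data_dir(lines, "data/train")` on the lines of an uploaded text file would give one entry per sentence in each of the three files. The recordings themselves would still need to be saved under the paths listed in `wav.scp`.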

Speech Recognition

  • Online Allosaurus: This is an old version of the Allosaurus model. You can upload an audio file to test its recognition online. A CUI is also available for querying the online model.
  • Inventory Customization: This is an online tool for creating a phone inventory to customize the Allosaurus model.

Speech Synthesis

Datasets

Others

  • kaldi-cmake: Automatically creates CMakeLists.txt for a Kaldi project.
  • pytensor: A toy NumPy-based deep learning framework.