-
Project Introduction
A person's voice is fundamental to their identity. For individuals
who lose the ability to speak due to disability, as the late
Stephen Hawking did, Augmentative and Alternative Communication (AAC)
technology provides a vital bridge to the world. For widely spoken
languages like English or Mandarin, finding and training suitable
text-to-speech (TTS) models is no issue; for under-resourced languages,
however, such models can be difficult to come by, if they exist
at all. This is where our project comes in.
-
Our Sponsor
Dr. Benjamin Tucker is a professor at Northern Arizona University who specializes in Natural Language Processing (NLP) and Speech Synthesis.
He has been researching techniques for creating TTS models for low-resource languages such as South African English (SAE),
Afrikaans, and isiXhosa.
-
Problem Description
We want to enable and advance Dr. Tucker's research into creating
TTS models for under-resourced languages. Prior to the start of our
project, he had a working system based on Nvidia's implementation
of Tacotron2 (a TTS system). That implementation has been
unmaintained since 2020 and has since stopped working. Our goal
is to bring new life to this project by providing Dr. Tucker with
a refreshed and future-proof solution.
-
Solution Overview
Our solution has two main components:
a TTS model training pipeline and a web application.
The former will provide a simple framework that Dr. Tucker can
leverage to train his own TTS models without having to worry about
the intricacies of the underlying TTS system. The latter will host
the trained models and allow users to input text and generate
speech with them.
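As a rough sketch of how the two components fit together, the pipeline produces a trained model artifact and the web application wraps it behind a per-request synthesis call. All names below (TrainedModel, handle_request) are hypothetical illustrations, not the project's actual API, and the model here is a stub rather than a real TTS system:

```python
from dataclasses import dataclass


@dataclass
class TrainedModel:
    """Placeholder for a model artifact produced by the training pipeline."""
    language: str

    def synthesize(self, text: str) -> bytes:
        # A real TTS model would return waveform audio; this stub returns
        # labeled bytes so the request/response flow is visible.
        return f"[{self.language} audio for: {text}]".encode("utf-8")


def handle_request(model: TrainedModel, text: str) -> bytes:
    """What the web application does per request: validate the user's
    input text, run the hosted model, and return audio."""
    if not text.strip():
        raise ValueError("empty input text")
    return model.synthesize(text)


# Example: a user submits text to a hosted isiXhosa model.
model = TrainedModel(language="isiXhosa")
audio = handle_request(model, "Molo")
```

The point of the separation is that the pipeline can swap in new or retrained models without changing the web application, which only depends on the synthesis interface.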