Abstract
Everyone needs their voice, and speech has a pivotal function in modern society. A detailed, working model of the voice
would contribute to the human atlas and would find profound applications in fields such as speech technology, medical
research, pedagogy, linguistics and the arts. But the physics are very intricate: we make the sounds of speech, song and
emotions using multiple mechanisms; and these are under exquisite control, through muscle activation patterns acquired
from years of training. Physically, voice involves complex interactions between laminar and turbulent airflow; vibrating,
deforming, colliding elastic solids; and sound waves resonating in a contorting duct. So far, these mechanisms have had
to be studied one at a time, using disparate tools and often gross approximations, for each of the subproblems. Now,
advances in computing techniques suggest the possibility of simulating the entire voice organ, including its biomechanics and
aeroacoustics, in a unified numerical domain. This major computational challenge would bring research and education much
closer to reality. In the EUNISON project, we seek to build a new voice simulator that is based on physical first principles to
an unprecedented degree. From given inputs, representing topology or muscle activations or phonemes, it will render the 3-D
physics of the voice, including of course its acoustic output. This will give important insights into how the voice works, and
how it fails. The goal is not a speech synthesis system, but rather a voice simulation engine, with many applications; given the
right controls and enough computer time, it could be made to speak in any language, or sing in any style. The model will be
operable on-line, as a reference and a platform for others to exploit in further studies. The long-term prospects include more
natural speech synthesis, improved clinical procedures, greater public awareness of voice, better voice pedagogy and new
forms of cultural expression.