Professor of Information Engineering
Academic Division: Information Engineering
Research group: Machine Intelligence
Telephone: +44 1223 3 32733
Professor Gales’ research aims to make speech systems simple and intuitive to use while achieving high levels of accuracy and naturalness. His primary research in this area is on automatic speech recognition, which converts an audio waveform into text, and speech synthesis, which converts text into an audio waveform. Both areas have underlying unsolved challenges.
Though the deployment of speech recognition systems is becoming increasingly common, the domains in which they operate are quite limited, for example spoken search and broadcast news transcription. To broaden the range of applications, it is necessary to develop techniques that handle the diversity of spoken communication and the broad range of environments in which these systems are required to operate.
Speech synthesis systems have been deployed for many years. Systems are now able to deliver clear, understandable speech, but they lack the ability to convey the full range of expressions found in human speech. To achieve human levels of information transfer by speech, Professor Gales is investigating expression-rich, controllable synthesis.
A fundamental aspect of both tasks is the need to add and exploit structure in the modeling of speech. For example, by explicitly factoring a synthesis model into speaker characteristics, sentence pronunciation and sentence expression, it is possible to control exactly how a sentence is uttered: both its expression, such as happy or angry, and the speaker's voice.
- Fellow of Emmanuel College
- Member of the Cambridge Language Sciences Initiative