ENGINEERING TRIPOS PART IIB – 2012/2013
Module 4F11 - Speech and Language Processing
Leader: Prof PC Woodland (pcw@eng)
Timing: Lent Term
Prerequisites: 3F1 and 3F3 useful
Structure: 14 lectures + 2 examples classes
Assessment:
Material / Format / Timing / Marks
Lecture Syllabus / Written exam (1.5 hours) / Start of Easter Term / 100%
AIMS
The module aims to introduce major techniques for recognising and synthesising speech signals, and for statistical machine translation.
LECTURE SYLLABUS (Prof. P C Woodland and Dr. W J Byrne)
Lecture 1: Overview/Introduction
- Speech production mechanisms, types of speech sound, source-filter model, applications of speech and text processing
Lectures 2-3: Acoustic Analysis
- FFT-based methods, Mel scale, cepstral analysis, all-pole filter models, calculation of linear prediction (LP) coefficients. Formant and voicing analysis. Front-end analysis for speech recognition.
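As a small illustration of the Mel scale mentioned above, the sketch below converts between frequency in Hz and mels. The syllabus does not say which Mel formula the lectures use; the one here (2595 log10(1 + f/700)) is a widely used approximation and is an assumption, as are the function names `hz_to_mel` and `mel_to_hz`.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Map a frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: map mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


if __name__ == "__main__":
    # The scale is roughly linear at low frequencies and
    # compresses high frequencies logarithmically.
    for f in (100.0, 1000.0, 4000.0, 8000.0):
        print(f, round(hz_to_mel(f), 1))
```

With this formula 1000 Hz maps to approximately 1000 mels, which is the usual anchor point of the scale.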
Lectures 4-5: ASR Introduction and Isolated Word Recognition
- Statistical speech recognition, task complexity. Hidden Markov models. Continuous density HMM parameter estimation, Baum-Welch algorithm, Viterbi algorithm, Gaussian mixture models for HMMs.
Lecture 6: Sub-word Acoustic Models
- Large vocabulary speech recognition, limitations of word models, context-dependent phones, parameter tying.
Lecture 7: Language Models
- Perplexity, N-gram language models, discounting, interpolation.
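As a brief illustration of two of the terms above, the sketch below computes perplexity from the per-word probabilities a language model assigns to a test text, and shows linear interpolation of a bigram estimate with a unigram estimate. The function names and the fixed interpolation weight `lam` are assumptions made for this example; the lectures may define and estimate these quantities differently.

```python
import math
from typing import Sequence


def perplexity(word_probs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean log probability per word."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    mean_log_prob = sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(-mean_log_prob)


def interpolate(p_bigram: float, p_unigram: float, lam: float = 0.8) -> float:
    """Linear interpolation: lam * P(w | prev) + (1 - lam) * P(w)."""
    return lam * p_bigram + (1.0 - lam) * p_unigram


if __name__ == "__main__":
    # A model that gives every word probability 1/4 has perplexity 4,
    # i.e. it is as uncertain as a uniform choice among 4 words.
    print(perplexity([0.25] * 10))
```

Interpolation keeps the probability of an unseen bigram above zero as long as the unigram estimate is non-zero, which keeps the perplexity finite.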
Lecture 8: ASR Search Issues
- Continuous speech recognition. Pruning. Integrating context-dependent HMMs and N-gram language models.
Lectures 9-10: Weighted Finite State Transducers for Speech and Language Processing
- Efficient realization of probabilistic models for sequence processing. Transduction, composition, determinization, minimum-cost search. WFSTs in ASR search and other language processing applications.
Lecture 11: Introduction to Statistical Machine Translation
- Statistical pattern processing approaches to translation. Automatic evaluation of translation quality.
Lecture 12: SMT - Alignment
- Parallel text as training data. Models of word and phrase alignment in translation. Model estimation procedures.
Lecture 13: SMT - Translation
- Phrase-based translation systems. Implementation via WFSTs.
Lecture 14: Text-to-Speech Synthesis
- Introduction to TTS. Data-driven synthesis and Hidden Markov Model approaches to TTS.
OBJECTIVES
On completion of the module students should:
- Understand techniques for speech analysis, recognition and synthesis;
- Understand techniques for statistical machine translation;
- Be able to apply these techniques in order to build simple speech recognition, synthesis and machine translation systems;
- Be aware of the current state of the art in speech recognition and synthesis and in machine translation technology.
REFERENCES
Please see the Booklist for Group F Courses for references for this module.
Last updated: June 2012
teaching-office@eng.cam.ac.uk