Audio-Visual Speech Processing
Bradford Books (Publisher)
Published on 1 August 2006
Book
Hardback
328 pages
978-0-262-22078-1 (ISBN)
Description
In recent years, researchers have begun to question the unimodal paradigm of speech processing and to explore the multimodal model. When we speak, both the visible motions of the face and the audible speech acoustics are shaped by the behavior of the vocal tract. Much work in the field now examines both auditory and visual aspects of speech processing, and "speechreading" is considered a psychological process of interest beyond its direct application in hearing loss and deafness. This book assembles a broad collection of the latest work on audio-visual (AV) speech processing by humans and machines. The book first treats the two main questions about human audio-visual performance: how auditory and visual signals combine to access the mental lexicon, and where in the brain this process takes place. The contributions show that AV perception is able to recover properties that are carried by neither modality alone. The book then turns to the production and perception of multimodal speech, and the coordination of structures within and across the two modalities. Finally, the book presents some of the latest developments in speech processing by computers, particularly in AV speech recognition and synthesis. Work in computer-generated facial animation now goes beyond the traditional application areas of animation and games to address the challenge of applying the metaphor of face-to-face conversation to human-computer interfaces.
More details
Language
English
Place of publication
Massachusetts
United States
Publishing group
MIT Press Ltd
Target group
Professional and scholarly
Dimensions
Height: 229 mm
Width: 152 mm
ISBN-13
978-0-262-22078-1 (9780262220781)
Copyright in bibliographic data is held by Nielsen Book Services Limited or its licensors: all rights reserved.
Persons
Eric Vatikiotis-Bateson is Professor of Linguistics and Director of the Cognitive Systems Program at the University of British Columbia.
Gérard Bailly is CNRS Research Director and leads the Talking Machines team of the Institut de la Communication Parlée (Institute of Speech Communication), Grenoble.
Pascal Perrier is Professor of Signal Processing at the Institut National Polytechnique de Grenoble and leads the Speech Production team of the Institut de la Communication Parlée, Grenoble.