Computers Taught To Lip-Read Many Languages
Scientists have created lip-reading computers that can distinguish between different languages. Computers that can read lips are already in development, but this is the first time they have been "taught" to recognise various languages.
The discovery in the United Kingdom could have practical uses for deaf people, for law enforcement agencies, and in noisy environments.
Led by Stephen Cox and Jake Newman at the University of East Anglia's School of Computing Sciences, the groundbreaking research was presented at a major conference in Taiwan recently.
The technology was developed by statistical modelling of the lip motions made by a group of 23 bilingual and trilingual speakers.
The system identified, with very high accuracy, which language an individual speaker was using. The languages included English, French, German, Arabic, Mandarin, Cantonese, Italian, Polish and Russian.
Professor Cox said: "This is an exciting advance in automatic lip-reading technology and the first scientific confirmation of something we already intuitively suspected, that when people speak different languages, they use different mouth shapes in different sequences.
"For example, we found frequent 'lip-rounding' among French speakers and more prominent tongue movements among Arabic speakers," he added.
The research is part of a wider project on automatic lip-reading. The next step will be to make the system more robust to an individual's physiology and his or her way of speaking.
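To illustrate how "different mouth shapes in different sequences" could be turned into a language decision, here is a minimal sketch in Python. It is not the researchers' method, which the article describes only as statistical modelling of lip motions. The sketch assumes each video has already been reduced to a sequence of symbolic mouth-shape labels; the labels ("round", "spread", "open", "closed"), the language names "lang_a" and "lang_b", and the toy training data are all hypothetical. Each language gets a simple first-order transition model, and an unseen sequence is assigned to the language whose model scores it highest.

```python
import math
from collections import defaultdict


def train_bigram(sequences, alphabet):
    """Estimate add-one smoothed log-probabilities of shape-to-shape transitions."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    model = {}
    for prev in alphabet:
        # Add-one smoothing: every transition gets a pseudo-count of 1.
        total = sum(counts[prev].values()) + len(alphabet)
        model[prev] = {
            cur: math.log((counts[prev][cur] + 1) / total) for cur in alphabet
        }
    return model


def log_likelihood(model, seq):
    """Sum the log-probabilities of each consecutive pair of mouth shapes."""
    return sum(model[prev][cur] for prev, cur in zip(seq, seq[1:]))


def identify_language(models, seq):
    """Return the language whose transition model best explains the sequence."""
    return max(models, key=lambda lang: log_likelihood(models[lang], seq))


# Hypothetical mouth-shape labels and toy training data (assumptions, not real data).
ALPHABET = ["round", "spread", "open", "closed"]
TRAINING = {
    "lang_a": [
        ["round", "open", "round", "closed", "round", "open"],
        ["round", "round", "open", "round", "closed"],
    ],
    "lang_b": [
        ["spread", "open", "spread", "closed", "spread", "open"],
        ["spread", "spread", "open", "closed", "spread"],
    ],
}
MODELS = {lang: train_bigram(seqs, ALPHABET) for lang, seqs in TRAINING.items()}

if __name__ == "__main__":
    print(identify_language(MODELS, ["round", "open", "round", "closed"]))
```

A real system would work from video features rather than hand-written labels and would need far more data per speaker; the sketch only shows why differing sequences of mouth shapes are enough, in principle, to separate languages.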
Professor Cox is director of the Speech, Language and Virtual Humans Laboratory at the University of East Anglia (UEA) in eastern England. The laboratory covers several closely linked areas involving speech, language and music.
"For many years, we have done fundamental research into speech and language processing algorithms, for example speech recognition in noise, formulaic language modelling, language processing for speech synthesis, and development of applications of speech processing," Cox added.
Music processing systems developed in conjunction with US colleagues at the University of Illinois won the genre classification, artist identification and classical composer identification tasks at the 2007 MIREX (Music Information Retrieval Evaluation eXchange) competition. The algorithms behind these systems have been patented and are being commercialised with venture-capital funds as FindTunes.
Automatic lip-reading presents a number of demanding scientific challenges. The current project addresses several key questions, including:
* what is the relationship between facial gesture and perceived speech?
* how is that relationship affected by the language of the speaker and the context of the discourse?
* what is the effect of language, the pose of the speaker and the context of the discourse on the recognition accuracy?
The joint research project brings together expertise from the Centre for Vision, Speech and Signal Processing at the University of Surrey, the School of Computing Sciences at the UEA, and the Home Office Scientific Development Branch.
The project will build on the state of the art in computer vision and speech recognition to investigate and evaluate automated lip-reading from video.
The goal is to develop tools and techniques that allow automatic, language-independent lip-reading of subjects from video streams. The project will also seek to quantify both human and automatic lip-reading ability.