Visteon Automotive Systems has taken a big lead in a technology certain to be a headline event at this month's Society of Automotive Engineers exposition in Detroit - mobile multimedia, the field of tying together voice and function in a vehicle's cabin.
The Ford Motor Co. subsidiary has created a system for the Jaguar S-Type that allows the driver or passenger to change radio stations, adjust the volume, select a CD and choose a track, or make a telephone call, all by voice command.
A similar voice-activation system will be available in the new Lincoln LS, which shares a platform with the Jaguar S-Type. The two cars go on sale this spring.
'We believe this will be a must for consumers in the 21st century, and we are getting ready,' said Marcos Oliveira, general manager of Visteon electronics systems.
Five major equipment producers - Delphi, Visteon, Bosch, Mannesmann/Philips and Denso - are set to be the major players in the exploding new field of mobile multimedia.
Built for English-speaking markets, Visteon's voice-activation system had to be taught not just to speak and understand, but also to differentiate a Southern drawl from a Scottish brogue.
Thousands - yes, thousands - of people were employed to read from scripts to teach the system to recognize regional accents from English-speaking countries, as well as English spoken with a German, French, Italian or Indian accent.
For example, Jaguar specifically requested that the system recognize the voice patterns of residents of Britain who were born or brought up on the Indian subcontinent. The reason? Research shows that this group favors BMW, a target for the Jaguar S-Type.
How the system works
The key to voice activation is an audio system that accepts speech input as a sequence of phonemes - the basic sounds of human language. Most languages have 50 phonemes or fewer, despite the variations involved in regional and local accents.
Stored digitally, phonemes can be strung together according to patterns known as Hidden Markov Models, which can account for accents. Each individual word is represented by Hidden Markov Models, and a database of these models is built into the system's computer.
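The idea of scoring a string of phonemes against word models can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not Visteon's implementation: the phoneme spellings, the three-word lexicon, the probabilities and the function names are all hypothetical. Each word is modeled as a left-to-right chain with one state per phoneme, and an accent is handled by letting a state accept more than one sound.

```python
import math

FLOOR = 0.01  # score given to any sound a state does not expect (illustrative)

# Hypothetical lexicon: each word is a chain of states, one per phoneme.
# The second state of "radio" also accepts "ae" to mimic an accent variant.
# The numbers are illustrative scores, not normalized probabilities.
WORD_MODELS = {
    "radio": [{"r": 0.95}, {"ey": 0.70, "ae": 0.25}, {"d": 0.95},
              {"iy": 0.95}, {"ow": 0.95}],
    "phone": [{"f": 0.95}, {"ow": 0.95}, {"n": 0.95}],
    "volume": [{"v": 0.95}, {"aa": 0.95}, {"l": 0.95},
               {"y": 0.95}, {"uw": 0.95}, {"m": 0.95}],
}


def log_likelihood(states, observed, stay=0.4):
    """Forward recursion for a left-to-right model that must end in its last state."""
    if not observed:
        return float("-inf")
    # Every path starts in the first state.
    alpha = [0.0] * len(states)
    alpha[0] = states[0].get(observed[0], FLOOR)
    for sound in observed[1:]:
        previous = alpha
        alpha = []
        for j, state in enumerate(states):
            # A state is reached by staying put or by advancing from its predecessor.
            reach = previous[j] * stay
            if j > 0:
                reach += previous[j - 1] * (1.0 - stay)
            alpha.append(reach * state.get(sound, FLOOR))
    return math.log(alpha[-1]) if alpha[-1] > 0 else float("-inf")


def recognize(observed):
    """Return the lexicon word whose model scores the observed sounds highest."""
    return max(WORD_MODELS, key=lambda word: log_likelihood(WORD_MODELS[word], observed))
```

Under these assumed numbers, calling `recognize(["r", "ae", "d", "iy", "ow"])` still selects "radio" even though the second sound differs from the standard pronunciation, which is the sense in which such a model can absorb an accent. A production recognizer would work from acoustic features rather than ready-made phoneme labels.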
The system is programmed to recognize a sequence of Hidden Markov Models making up an acceptable command and to pass it on to the activating chip.
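The step of accepting only recognized word sequences that form a valid command can be sketched as a simple table lookup. This is a hedged illustration in Python, not the S-Type's actual command set: the phrases, action names and the `<number>` slot are hypothetical, chosen only to echo the radio, CD and telephone functions the article mentions.

```python
# Hypothetical command table: a word pattern maps to an action code.
# "<number>" marks a slot that must be filled by a spoken number.
COMMANDS = {
    ("radio", "volume", "up"): "RADIO_VOLUME_UP",
    ("radio", "volume", "down"): "RADIO_VOLUME_DOWN",
    ("cd", "track", "<number>"): "CD_SELECT_TRACK",
    ("phone", "dial", "<number>"): "PHONE_DIAL",
}

NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def parse_command(words):
    """Return (action, argument) for an acceptable command, or None otherwise."""
    for pattern, action in COMMANDS.items():
        if len(pattern) != len(words):
            continue
        argument = None
        for expected, word in zip(pattern, words):
            if expected == "<number>":
                if word not in NUMBERS:
                    break
                argument = NUMBERS[word]
            elif expected != word:
                break
        else:
            # Every word matched the pattern, so the command is acceptable.
            return action, argument
    return None
```

In this sketch `parse_command(["cd", "track", "three"])` yields the action code and the track number, while an unlisted phrase such as `["open", "sunroof"]` returns `None` and would never reach the activating chip. Keeping the table small is one way to picture why a deliberately limited vocabulary is easier to recognize reliably.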
Building up the Hidden Markov Models database was a lengthy process, according to Hakan Kostepen, voice activation manager at Visteon Automotive Systems.
Each person spent 15 to 20 minutes reading from a script so that the recordings would span the variations in the 13 regional accents of the United Kingdom and the six U.S. accents that Jaguar and Visteon deemed necessary to cover 95 percent of the English-speaking world.
'We asked people to speak naturally, as if they were having a phone conversation. We also now have a unique collection of phonemes of "ers" and "ums" and throats being cleared,' Kostepen said.
Two years of collaboration with Jaguar followed, with a Visteon team of nine engineers split between the carmaker's engineering center at Whitley, England, and the S-Type assembly plant at Castle Bromwich, England.
As for the car's synthesized responses, executives considered 15 different patterns, rejecting some as 'too serious' and others as 'not manly enough,' before settling on the firm but friendly solution that Jaguar customers are now waiting to enjoy.
Kostepen believes that voice activation has virtually unlimited possibilities. The system could be made to understand virtually every language. It could also translate traffic information into the driver's native language when a vehicle crosses an international border, and it could recognize individual voice patterns as a security measure.
'With complementary domestic infrastructures in place, there is no technical reason why the driver won't be able to tell the car to phone home and instruct the stove to turn itself on at a specific temperature,' he said.
But dramatic as this development may be, the conversational car is just a beginning. Voice activation is not an end in itself, but rather an enabling technology that will lead to greater things.
'The more you talk about the technology of the future, the more you find there is a growing need for the vehicle occupants to communicate with the outside world,' explained Oliveira. 'Some of the enablers, such as voice activation, are elements that will basically maintain very safe driving conditions.'
Future multimedia systems will allow the driver to send and receive e-mail and cruise the Internet while children play video games in the back seat.
Automotive voice activation has already progressed well beyond the system in the Jaguar, which is deliberately simple. The Jaguar system has a vocabulary of 105 words. Give the car an instruction, and a male voice repeats it, carries it out and confirms the action.
Although the system is capable of handling 3,000 words, it was limited to avoid overloading the customer.
'At this stage, too many words would mean it would be less intuitive,' said Kostepen.
Long before the S-Type's introduction at last year's Birmingham motor show, other carmakers were clamoring for information on voice activation, said Keith Hayton, manager of audio systems engineering at Visteon's technical center in Dunton, England. He said voice-activation systems will become increasingly common in the near future.
'All the major automakers in Europe, the United States and Japan have expressed interest in the technology and its applications,' he said.
'There is an especially big demand from Germany, and expansion into the seven main European languages other than English will clearly be required.'
Visteon says its research indicates that automakers would be willing to pay as much as $300 per vehicle for the technology. It would be up to the automaker to decide whether to pass the cost directly to the consumer, integrate it into a broader telematics package, or regard it as an added-value item.
Visteon did not agree on a retail price with Jaguar when developing the system for the S-Type. Hayton said Visteon had been prepared to go out on a limb to achieve a high-profile launch.
'We don't have list price mutuality on S-Type. With Jaguar, we said we would take a hit if necessary, because we wanted the business. Jaguar likes suppliers to take a risk with them, and maybe even lose money with them, too,' he said. 'Systems and component companies will survive only if they are prepared to do this more and more.'
'Take me to your leader'
Volume brands will not be far behind in adopting the technology, Hayton predicts.
'You can already specify satellite navigation in some B- and C-class models, and theoretically you could have voice technology for not much more outlay. I see cars in the Fiesta/Corsa/Polo (subcompact) segment with voice technology within three or four years,' he said.
These high-volume models may well take things further than the S-Type.
Second-generation systems will have a 30,000-word capacity and cover an array of functions around the car, including adjusting seats and mirrors, opening the fuel door and controlling windows and sunroofs or any other action performed by a finger on a switch.
But the system's biggest breakthrough will come in parallel with the growth of satellite navigation. Hayton said it will make satellite navigation far more acceptable to the motorist.
'Typing in the destination address is one of the most painful things about (satellite navigation) today,' he said. 'Our goal is to expand the vocabulary of the system so the user can talk to it. It's what people have been waiting for.'
The eventual choice for carmakers will not be whether to have voice activation, but rather how to use it, Hayton said. Different manufacturers will require different emphases within such a package and will make their own decisions on voice presentation and content.
'It will be our job to collaborate with the OEM to develop a personality for each model,' he said. 'One day, the driver will have what amounts to an intelligent conversation with his car.'