GigNews has a good article called Speech Interfaces for Games that gives a good basic overview of how speech recognition works and hints at some of the difficulties in writing good speech apps. A good read. Here's a snippet:

Scenario #1
Computer: "Captain, do you want to open hailing frequencies, 
open fire or activate the Jump Drive?"

User: "Fire all weapons on the nearest enemy frigate."

Computer: "Fire lasers, missiles or all weapons?"

User: "All weapons."

Computer: "Fire on a frigate, a battleship or a freighter?"

User: "I said the nearest enemy frigate!"

Computer: "There are three enemy frigates within range: the Saratoga, 
the Seville and the San Diego. Which one do you want to target?"

User: "The nearest, damn you!"

Computer: "I'm sorry, I did not understand. Please try again."

User: "Malefice! You foul silicon-based denizen of the tar
pits of Hell! A plague on you and your unnatural family!"

Ship is broad-sided by an enemy battle cruiser and destroyed.

Scenario #1 is an example of a fixed-initiative system, and a rather bad one. (Effective fixed-initiative systems are in common use in 411 directory assistance; they are still somewhat brittle and work without human intervention in a mere fraction of calls, but the phone company still saves fortunes thanks to them.)

Scenario #1's computer asks all the questions, handles very specific answers, and doesn't even listen to anything else. If the user volunteers information to try to speed up the interaction, the computer will ignore it until the "proper" time for this bit of data is reached. Only in the simplest of cases (and with the most docile of users) will a fixed-initiative dialogue achieve good results quickly and naturally.
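To make the failure mode concrete, here is a minimal sketch of a fixed-initiative loop in Python. It is not taken from the article: the slot names, keyword lists, and the function run_fixed_initiative are all hypothetical, and plain substring matching stands in for a real speech recognizer. The point is only that each user turn is checked against the current question, so anything volunteered early is thrown away.

```python
from typing import Dict, Iterable, List, Tuple

SORRY = "I'm sorry, I did not understand. Please try again."

# Hypothetical slots loosely modelled on Scenario #1: (slot name, prompt, keywords).
# The computer asks these in a fixed order and never deviates from it.
SLOTS: List[Tuple[str, str, List[str]]] = [
    ("action", "Open hailing frequencies, open fire or activate the Jump Drive?",
     ["hail", "fire", "jump"]),
    ("weapon", "Fire lasers, missiles or all weapons?",
     ["lasers", "missiles", "all weapons"]),
    ("target", "Fire on a frigate, a battleship or a freighter?",
     ["frigate", "battleship", "freighter"]),
]


def run_fixed_initiative(utterances: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Fill the slots strictly in order.

    Returns the filled slots and the list of things the computer said.
    Each utterance is only matched against the keywords of the slot
    currently being asked about; everything else in it is ignored.
    """
    replies = iter(utterances)
    filled: Dict[str, str] = {}
    transcript: List[str] = []
    for name, prompt, keywords in SLOTS:
        transcript.append(prompt)
        for reply in replies:
            text = reply.lower()
            match = next((k for k in keywords if k in text), None)
            if match is not None:
                filled[name] = match
                break
            transcript.append(SORRY)
        else:
            break  # the user stopped talking before this slot was filled
    return filled, transcript


if __name__ == "__main__":
    slots, said = run_fixed_initiative([
        "Fire all weapons on the nearest enemy frigate.",
        "All weapons.",
        "I said the nearest enemy frigate!",
    ])
    print(slots)
    print(len(said), "computer turns")
```

The first utterance already names the action, the weapon and the target, yet the loop still needs three separate turns to collect them, because "all weapons" and "frigate" are discarded while the action question is the one being asked.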

This blog copyright 2010 by plamere