To figure out how a word should be pronounced, FreeTTS first looks it up in an internal dictionary. If the word is not in the dictionary, a set of letter-to-sound rules is applied to guess the pronunciation. There's actually quite a bit of code in FreeTTS devoted to determining the proper pronunciation from spelling. Many people have wondered why, if we have such a set of letter-to-sound rules, we need a dictionary at all. Well, in fact, the FreeTTS dictionary (which consists of 60,000 or so words) contains just the exceptions to the rules. Spelling in English is so irregular that even with 1000 lines of Java code driving a letter-to-sound state machine with 13,000 states, there are still 60,000 spelling exceptions. This state of affairs was highlighted by Gerard Nolst Trenité in the poem The Chaos. Here's an excerpt:

Dearest creature in creation,
Studying English pronunciation,
        I will teach you in my verse
        Sounds like corpse, corps, horse and worse.
I will keep you, Susy, busy,
Make your head with heat grow dizzy;
        Tear in eye, your dress you'll tear;
        Queer, fair seer, hear my prayer.
Pray, console your loving poet,
Make my coat look new, dear, sew it! 
Strewn with stones like rowlock, gunwale,
        Islington, and Isle of Wight,
        Housewife, verdict and indict.
Don't you think so, reader, rather,
Saying lather, bather, father?       
        Finally, which rhymes with enough,
        Though, through, bough, cough, hough, sough, tough??
Hiccough has the sound of sup.
My advice is:  GIVE IT UP!   
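
The lookup order described before the poem (exceptions dictionary first, letter-to-sound rules as the fallback) can be sketched roughly as below. This is only an illustration of the idea, not FreeTTS code: the class name PronunciationSketch, its methods, the two dictionary entries, and their phone symbols are all made up for the example, and the "rules" here are a toy stand-in that emits one symbol per letter rather than anything like the real 13,000-state machine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these names come from FreeTTS.
final class PronunciationSketch {

    // Exceptions dictionary: words whose pronunciation the rules would get wrong.
    // The entries and phone symbols are illustrative assumptions.
    private final Map<String, List<String>> exceptions = new HashMap<>();

    PronunciationSketch() {
        exceptions.put("corps", Arrays.asList("k", "ao", "r"));
        exceptions.put("indict", Arrays.asList("ih", "n", "d", "ay", "t"));
    }

    // Dictionary first; fall back to the rules only when the word is not listed.
    List<String> pronounce(String word) {
        String key = word.toLowerCase(Locale.ROOT);
        List<String> phones = exceptions.get(key);
        if (phones != null) {
            return phones;
        }
        return letterToSound(key);
    }

    // Toy stand-in for letter-to-sound rules: one symbol per letter.
    static List<String> letterToSound(String word) {
        List<String> phones = new ArrayList<>();
        for (char c : word.toCharArray()) {
            if (Character.isLetter(c)) {
                phones.add(String.valueOf(c));
            }
        }
        return phones;
    }

    public static void main(String[] args) {
        PronunciationSketch sketch = new PronunciationSketch();
        System.out.println(sketch.pronounce("corps")); // found in the dictionary
        System.out.println(sketch.pronounce("horse")); // guessed by the toy rules
    }
}
```

The point of the split is that the dictionary only has to hold the words the rules get wrong, so better rules mean a smaller dictionary.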


In a slightly more generalized sense, having a set of rules and then storing only the exceptions is apparently how the brain processes lots of things, most especially grammar. So says Steven Pinker in _Words and Rules_. That's what your post reminded me of. The book's well worth a read, by the way.

Posted by Carson on July 14, 2004 at 12:50 PM EDT #

Carson: Thanks, I'll check out the Pinker book.

Posted by Paul on July 14, 2004 at 03:44 PM EDT #

[Trackback] Paul Lamere gives a glimpse into the complexities of doing real text-to-speech in English. The 60,000 word dictionary in FreeTTS is just for the exceptions that aren't handled correctly by the 13,000 states in the letter-to-sound state machine! But th...

Posted by Tim Danner on July 17, 2004 at 12:43 PM EDT #


This blog copyright 2010 by plamere