WTS701 Datasheet, PDF(12/77 Page) Winbond – WINBOND SINGLE-CHIP TEXT-TO-SPEECH PROCESSOR


WTS701

7.1 TEXT-TO-SPEECH MECHANISM

The text-to-speech component of the system consists of three principal blocks:

• Text normalization
• Word-to-phoneme conversion
• Phoneme mapping

7.1.1 Text Normalization

Text normalization translates incoming text into pronounceable words. It includes functions such as expanding abbreviations and translating numeric strings into spoken words, and it requires a certain amount of context processing to determine the correct spoken form.
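As an illustration only, the following Python sketch shows the kind of processing described above: spelling out a numeric string and using neighboring tokens to choose the spoken form of an abbreviation. The function names, the 0..999 range, and the "Dr." context rule are assumptions made for this example; they are not taken from the WTS701 firmware.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in 0..999 (range chosen only to keep the sketch short)."""
    if not 0 <= n <= 999:
        raise ValueError("sketch only handles 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    hundreds, rest = divmod(n, 100)
    head = ONES[hundreds] + " hundred"
    return head if rest == 0 else head + " " + number_to_words(rest)


def expand_dr(tokens, i):
    """Assumed context rule: 'Dr.' before a capitalized word is 'Doctor', else 'Drive'."""
    nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
    return "Doctor" if nxt[:1].isupper() else "Drive"


def normalize(text: str) -> str:
    """Replace short numeric tokens and the ambiguous 'Dr.' with spoken words."""
    tokens = text.split()
    out = []
    for i, tok in enumerate(tokens):
        if tok == "Dr.":
            out.append(expand_dr(tokens, i))
        elif re.fullmatch(r"\d{1,3}", tok):
            out.append(number_to_words(int(tok)))
        else:
            out.append(tok)
    return " ".join(out)
```

For example, `normalize("Dr. Smith lives at 42 Oak Dr.")` yields `"Doctor Smith lives at forty two Oak Drive"`: the same abbreviation receives two different spoken forms depending on its context.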

In addition, the WTS701 consults the abbreviation list stored in the device's internal memory and converts acronyms, abbreviations, and special characters (such as Instant Messaging icons or emoticons) into the appropriate text representation.

The default abbreviation list supported by the WTS701 is a general one and cannot be modified by the user to match the domain that the text comes from. However, the default list can be overridden by a user abbreviation list. This gives the developer, or even the end user, the flexibility to add abbreviations specific to the text and so customize the product to their preferences. Characters unique to Instant Messaging or Short Message Service (SMS) are supported through the same mechanism, by defining the icon, its ASCII/Unicode/Big5 text, and its replacement. The default abbreviation list is described in the release letter for the specific language.
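The override behavior can be pictured with a short Python sketch. It assumes a simple token-to-replacement table in which user entries take precedence over default entries; the table contents and function names are hypothetical, and the actual default list is the one given in the language release letter.

```python
# Hypothetical default entries; the real list is defined per language release.
DEFAULT_ABBREVIATIONS = {
    "BTW": "by the way",
    ":-)": "smiley",
    "Dr.": "Doctor",
}


def build_abbreviation_table(default, user):
    """Return a lookup table in which user entries override default entries.

    The default table itself is left untouched, mirroring a fixed default list.
    """
    merged = dict(default)
    merged.update(user)
    return merged


def expand(text, table):
    """Replace each whitespace-separated token that appears in the table."""
    return " ".join(table.get(tok, tok) for tok in text.split())


# A user list that redefines one default entry and adds a new one.
user_list = {"Dr.": "Drive", "LOL": "laughing out loud"}
table = build_abbreviation_table(DEFAULT_ABBREVIATIONS, user_list)
```

With this table, `expand("BTW LOL :-)", table)` gives `"by the way laughing out loud smiley"`, and `"Dr."` now expands to the user-supplied `"Drive"` while `DEFAULT_ABBREVIATIONS` still maps it to `"Doctor"`.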

7.1.2 Word-to-Phoneme Conversion

Once the data stream has been translated into pronounceable words, the system determines how to pronounce them. This function is highly language dependent. For a language such as English, the task cannot be reduced to a set of definitive rules, so it is accomplished by combining rule-based processing with exception processing.
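A minimal Python sketch of combining rules with exceptions follows. It assumes an exception dictionary that is consulted first and a tiny set of letter-to-sound rules applied greedily from left to right; the rule set, the ARPAbet-style phoneme symbols, and the function name are illustrative assumptions, not the device's actual tables.

```python
# Words whose pronunciation the rules would get wrong (assumed examples).
EXCEPTIONS = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
}

# A deliberately tiny rule set: two-letter rules are tried before single letters.
LETTER_RULES = {
    "sh": ["SH"], "ch": ["CH"], "th": ["TH"], "ee": ["IY"], "oo": ["UW"],
    "a": ["AE"], "b": ["B"], "d": ["D"], "e": ["EH"], "h": ["HH"],
    "i": ["IH"], "n": ["N"], "o": ["AA"], "p": ["P"], "s": ["S"], "t": ["T"],
}


def word_to_phonemes(word):
    """Exception lookup first, then greedy longest-match rule application."""
    word = word.lower()
    if word in EXCEPTIONS:
        return list(EXCEPTIONS[word])
    phonemes = []
    i = 0
    while i < len(word):
        for length in (2, 1):
            chunk = word[i:i + length]
            if len(chunk) == length and chunk in LETTER_RULES:
                phonemes.extend(LETTER_RULES[chunk])
                i += length
                break
        else:
            # No rule covers this letter in the toy rule set.
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phonemes
```

Here `word_to_phonemes("ship")` is resolved by the rules alone, whereas `word_to_phonemes("one")` bypasses them through the exception dictionary, since the letter rules would otherwise produce the wrong pronunciation.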

7.1.3 Phoneme Mapping

This algorithm maps phoneme strings into the MLS phonetic inventory. The task falls into two parts. First, the word must be split into sub-word units. The splits must be made at appropriate phonetic boundaries to achieve high-quality concatenation. Second, once a sub-word unit is determined, the inventory is searched for a match. Each sub-word unit has a left and a right context to match in addition to the phoneme string itself, and each match is assigned a matching weight according to how closely the phonetic context matches. If no suitable match is found in the inventory, the sub-word unit is split further in a tree-like manner until a match is found. The splitting tree is processed from left to right, and each time a successful match occurs, the address and duration of the match in the corpus are placed in a queue of phonetic parts to be played out through the audio interface.
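The search-and-split procedure can be sketched in Python as follows. Several simplifications are assumed here and are not taken from the datasheet: units are split at their midpoint rather than at phonetically chosen boundaries, any exact phoneme-string match counts as suitable, the weight is simply the number of matching context sides, and all inventory entries, addresses, and durations are invented for illustration.

```python
from collections import deque
from typing import NamedTuple


class Entry(NamedTuple):
    left: str      # phoneme preceding the unit in the recorded corpus ("#" = boundary)
    right: str     # phoneme following the unit in the recorded corpus
    address: int   # hypothetical corpus address
    duration: int  # hypothetical duration in milliseconds


# Toy inventory keyed by phoneme string (all values are made up).
INVENTORY = {
    ("HH", "AH"): [Entry("#", "L", 0x0100, 120)],
    ("L", "OW"): [Entry("AH", "#", 0x0200, 180), Entry("EH", "#", 0x0240, 170)],
    ("HH",): [Entry("#", "EH", 0x0300, 60)],
}


def weight(entry, left, right):
    """Assumed weighting: one point for each context side that matches."""
    return (entry.left == left) + (entry.right == right)


def map_unit(unit, left, right, queue):
    """Queue the best match for `unit`, splitting it further when none exists."""
    candidates = INVENTORY.get(tuple(unit))
    if candidates:
        best = max(candidates, key=lambda e: weight(e, left, right))
        queue.append((best.address, best.duration))
        return
    if len(unit) == 1:
        raise LookupError(f"no inventory entry for {unit[0]!r}")
    mid = len(unit) // 2
    head, tail = unit[:mid], unit[mid:]
    # Left branch first, so matches are queued in playback order.
    map_unit(head, left, tail[0], queue)
    map_unit(tail, head[-1], right, queue)


def map_word(phonemes):
    """Return the queue of (address, duration) parts for one word."""
    queue = deque()
    map_unit(list(phonemes), "#", "#", queue)
    return queue
```

For the phoneme string `["HH", "AH", "L", "OW"]`, the whole word is not in the toy inventory, so it is split once; both halves then match, and for the second half the entry whose left context is `"AH"` wins over the one recorded after `"EH"`.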
