Taiwanese Tone Sandhi and Text-to-Speech (TTS) Challenges

How the MTL Writing System Simplifies Sandhi

The MTL (Modern Taiwanese Language) writing system is designed to simplify tone sandhi. It's sandhi-aware, meaning the way a multi-syllable word is written already shows the changed tone of the preceding syllables.

Tone sandhi is when the pitch pattern (tone) of a syllable changes when it is followed by another syllable in continuous speech.

Rule Focus: You only need to determine the tone of the last syllable of a word or phrase: should it keep its original tone (citation tone), or should it change in the sentence?

General Tone Change Rules (Word Level)

In continuous speech, the basic rule is that all syllables in a word change tone, except under specific conditions:

Default Change: All words change tone, following the standard Taiwanese Tone Circle (a set pattern of tone shifts).

Default Exception (No Change): A word's final syllable does not change tone if it:

Is the last word of a sentence. Note: some exceptions may apply; see the Specific Lexical and Structural Exceptions section.

Comes immediately before a punctuation mark.

Example: In the sentence "Lie karm u pid? U, goar u cidky," the words pid, cidky, and U keep their original tones because they mark the end of a phrase or sentence. The other words (Lie, karm, goar) change tones.
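The default positional rule can be sketched in Python. This is a minimal illustration, not a full sandhi engine: the tokenizer regex, the punctuation set, and the function name default_sandhi_flags are assumptions made for this example, and none of the lexical or structural exceptions are applied.

```python
import re

# Assumed tokenizer: a run of non-space, non-punctuation characters is a word.
TOKEN_RE = re.compile(r"[^\s.,?!;:]+|[.,?!;:]")
PUNCTUATION = set(".,?!;:")


def default_sandhi_flags(sentence):
    """Return (word, changes_tone) pairs using only the default positional rule."""
    tokens = TOKEN_RE.findall(sentence)
    flags = []
    for i, token in enumerate(tokens):
        if token in PUNCTUATION:
            continue
        is_last = i == len(tokens) - 1
        before_punctuation = not is_last and tokens[i + 1] in PUNCTUATION
        # A word keeps its citation tone at the end of a sentence or before punctuation.
        flags.append((token, not (is_last or before_punctuation)))
    return flags


for word, changes in default_sandhi_flags("Lie karm u pid? U, goar u cidky."):
    print(word, "changes" if changes else "keeps citation tone")
```

For the example sentence, pid, U, and cidky are flagged as keeping their citation tone. Every other word, including both occurrences of u, is flagged as changing, because only the positional rule is applied here.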

Specific Lexical and Structural Exceptions

  1. Nouns (N) and Gerunds: Nouns (like sikoef "watermelon") and gerunds (verb forms used as nouns) generally DO NOT change tone.

    The Big Exception: Nouns DO change tone when they are used as adjectives or as measure words (denoting a unit).

    Example: In Taioaan-laang (Taiwanese person), the noun Taioaan (Taiwan) is acting as an adjective and changes tone.

  2. Pronouns (r): Pronouns (like goar "I", lie "you") DO change tone by default.

  3. Exception: A pronoun DOES NOT change tone if the speaker pauses to emphasize it (e.g., Goar kab y, emphasizing "Goar, y").
  4. Backquote Rule (e.g., aang`ee): Words containing a backquote follow a specific exception:

  5. The word immediately preceding the backquoted word does not change tone.
  6. The syllables after the backquote change to shorter and softer tones.
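The backquote rule lends itself to a simple mechanical check. The sketch below is only an illustration under assumptions: the function name backquote_effects and its return shape are invented for this example, the input is assumed to be already split into words, and the tone of the part before the backquote is not decided here.

```python
def backquote_effects(words):
    """Apply the backquote rule to a list of words.

    Returns (keep, reduced):
      keep    - indices of words that keep their citation tone
      reduced - index -> the part after the backquote, spoken shorter and softer
    """
    keep = set()
    reduced = {}
    for i, word in enumerate(words):
        if "`" not in word:
            continue
        _, tail = word.split("`", 1)
        reduced[i] = tail
        if i > 0:
            keep.add(i - 1)  # the word right before a backquoted word does not change
    return keep, reduced


keep, reduced = backquote_effects(["Citniar", "si", "oseg`ee"])
print(keep, reduced)
```

Here the word si (index 1) is marked as keeping its tone because it precedes oseg`ee, and ee is marked as the reduced part.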
  7. Words Ending in 'ar':

    When adding the suffix 'ar' (like a diminutive), the front syllable usually changes tone ("niaw" → "niau'ar").

  8. Exception: If the front syllable has a flat tone (Tone 7), it does not change tone (e.g., "te" → "te'ar").
  9. Words That Never Change Tone

    A few specific categories and words inherently resist tone change:

  10. Certain Demonstrative Adverbs/Pronouns: ciaf, hiaf, zef, hef.
    Note: In "Citniar si oseg`ee.", citniar does not change tone. But in "Cit'niar oseg`ee cyn suie.", where cit'niar is used as an adjective, it does change tone (following the Noun/Adjective Rule).

  11. Certain Interrogative Pronouns: tangsii, symmih, uixhøo.

  12. Certain Conjunctions: citmar, tvaf, pwntea.
  13. Contextual Exceptions (Based on Position)
  14. Before Specific Conjunctions: Words immediately before certain non-changing conjunctions do not change tone.
  15. Adjectives Before ee: Adjectives immediately preceding the functional word ee do not change tone.
  16. Functional Word ka: When ka is followed by a verb, the preceding word does not change tone. Note: when it is used as a verb, as in "Ciaw'ar e ka laang." and "Goar ka y korng.", it does change tone.
  17. Parallel Phrases: In certain parallel structures (Verb-Object, Verb-Object), the first word in the pair does not change tone (e.g., phaotee khaikarng).
      More examples:
      phaotee khaikarng
      kofng'oe sngroe
      iusafn oansuie
      kekef hoxho
      hongthor binzeeng
      pexlau kviafiux
      lily laklag
      
        

    The following do change tone, since the nouns are used as adjectives, e.g., tikhaf-mixsvoax, zekpeq-hviati.

    Consider the verb zhwlie in the following two sentences. In the first sentence it changes tone, but in the second it does not.

    1. MTL ti zengphvi-bwym hongbin, zhwlie kaq cyn suykhuix.

    2. Iong MTL laai zhwlie khaq ittix.

    Here is a list of words that never change tone

  18. Advantages of the MTL System for NLP

    MTL offers significant benefits for computational tools like TTS:

  19. Fixed Word Boundaries: Unlike character-based systems (like standard Chinese harnji), MTL explicitly writes most words as multi-syllable units. This makes it easier for computers to identify where one word ends and the next begins.
  20. Reduced Ambiguity: Because words are written as multi-syllable units (e.g., sikoef "watermelon"), a word's meaning and Part-of-Speech (POS) category (Noun, Verb, etc.) are largely fixed, which is a major advantage over single-syllable systems.
  21. The Biggest TTS Challenge

  22. The Problem: The tone change of a word like Taioaan depends entirely on its function in the sentence: Noun (no change) vs. Adjective (change).
  23. The Solution: The TTS system needs a robust Part-of-Speech (POS) Tagger to label every word in the input text as a Noun, Adjective, Verb, etc., before applying the tone rules.
    Data Need: To address this, a good training dataset must include words used with different meanings and functions (e.g., the same word used as a noun and then as a verb) to train the POS tagger.
  24. Existing Tools: The lack of established, high-quality Taiwanese POS tagging models and large, tagged datasets is currently the main bottleneck for building truly accurate Taiwanese TTS.
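To make the dependency on POS tags concrete, here is a minimal sketch of a rule function that runs after tagging. Everything in it is an assumption for illustration: the Token class, the phrase_final flag, and the tags "ADJ", "M", and "V" are hypothetical (only N and r appear in these notes), and the rule order is one possible reading of the rules above, not a complete implementation.

```python
from dataclasses import dataclass

# Words these notes list as never changing tone.
NEVER_CHANGE = {
    "ciaf", "hiaf", "zef", "hef",        # demonstratives
    "tangsii", "symmih", "uixhøo",       # interrogatives
    "citmar", "tvaf", "pwntea",          # listed as conjunctions
}


@dataclass
class Token:
    word: str
    pos: str                   # "N" and "r" as in the notes; "ADJ", "M", "V" are hypothetical
    phrase_final: bool = False  # last word of a sentence or right before punctuation


def changes_tone(token):
    """Decide whether a tagged word undergoes tone sandhi (simplified)."""
    if token.phrase_final:
        return False            # default positional exception
    if token.word.lower() in NEVER_CHANGE:
        return False            # lexical exception
    if token.pos == "N":
        return False            # nouns and gerunds keep their tone
    return True                 # pronouns, verbs, and nouns tagged as ADJ or M change


# The same spelling gets a different answer depending on the tag.
print(changes_tone(Token("Taioaan", "N")))    # noun use
print(changes_tone(Token("Taioaan", "ADJ")))  # adjective use, as in Taioaan-laang
```

The point of the sketch is that the rule itself is trivial once the tag is known; the hard part is producing the correct tag in the first place.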
    1. Accent and Tone Variation

      The MTL writing system is specifically based on the Southern Taiwanese accent. This choice has two main implications for its tone sandhi rules:

    2. Simplified Tone Shifts: MTL avoids some of the more complex tone changes found in other accents (e.g., curving tones to low falling tones, or short tones to long tones).

      Examples: MTL uses miphoe instead of mixphoe (from mii), and ciaqpng instead of ciaxpng (from ciah).
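For illustration, the Southern-accent tone shifts can be written as a lookup table. This is a sketch under assumptions: it is keyed by the traditional tone numbers (1, 2, 3, 4, 5, 7, 8) of the commonly taught tone circle rather than by MTL spelling, and reading mii as tone 5 and ciah as tone 8 is an interpretation of the examples above.

```python
# Southern-accent sandhi, keyed by traditional tone number (assumed numbering).
SOUTHERN_SANDHI = {
    1: 7,
    7: 3,
    3: 2,
    2: 1,
    5: 7,  # curving tone goes to the flat tone (mii -> mi), not to low falling
    4: 8,
    8: 4,  # short tone stays short (ciah -> ciaq), not lengthened
}


def sandhi_tone(citation_tone):
    """Return the sandhi tone number for a citation tone number."""
    return SOUTHERN_SANDHI[citation_tone]


# Walk the circle starting from the curving tone.
tone = 5
path = [tone]
for _ in range(5):
    tone = sandhi_tone(tone)
    path.append(tone)
print(path)
```

A table like this only covers the tone shift itself; whether a given word shifts at all is decided by the positional and lexical rules.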

    3. Colloquial vs. Literary Tones: Taiwanese often has distinct colloquial (spoken) and literary (formal/written) pronunciations for the same word (e.g., zap vs. sip). Crucially, the tone change rules remain consistent regardless of whether the speaker uses the colloquial or literary pronunciation.
    4. Example: The word sikoef (watermelon) is an explicit, two-syllable noun, which means it does not undergo further tone changes when spoken (based on the Noun Rule).

      Hofng teq zhoef hongzhoef.
      Hongzhoef ho hofng zhoef`khix.

    For TTS, this appears to be the biggest challenge. Using POST (part-of-speech tagging, i.e., labeling each word's lexical category) would help, but this is a huge task.
    What databases already exist? What models have been developed for tagging Taiwanese? Can we adapt an existing POST model for a jump start?
    Further, the same word can be a noun or a verb.

    A good dataset must include such words in their different meanings and functions for training purposes.
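One way to check whether a tagged dataset meets this need is to list the words that occur with more than one tag. The sketch below assumes a corpus represented as lists of (word, tag) pairs; the function name, the toy corpus, and the "ADJ" tag are hypothetical and stand in for real annotated data.

```python
from collections import defaultdict


def words_with_multiple_tags(tagged_sentences):
    """Map each word seen with more than one POS tag to its sorted tags."""
    tags_by_word = defaultdict(set)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            tags_by_word[word.lower()].add(tag)
    return {
        word: sorted(tags)
        for word, tags in tags_by_word.items()
        if len(tags) > 1
    }


# Toy corpus with made-up annotations, for illustration only.
toy_corpus = [
    [("Taioaan", "N")],
    [("Taioaan", "ADJ"), ("laang", "N")],
]
print(words_with_multiple_tags(toy_corpus))
```

Words that never show up in this report are seen with only one function, so a tagger trained on the data has no evidence for their other uses.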

    Modern Approaches