Compare engines

Precomposed characters

Check if the following two words give different output in a speech synthesis engine:

  1. "finish" (normal text)
  2. "ﬁnish" (with the precomposed ﬁ ligature, U+FB01)

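They should sound the same if the engine applies Unicode compatibility normalisation (NFKC or NFKD), which maps the ligature to the two letters f and i. Canonical normalisation (NFC or NFD) alone leaves the ligature unchanged. I did a quick check of the TTS engines and voices that I have access to (☒: different, ☑: same):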

Here is a list of pre-composed Latin characters.
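
The ligature check described above can be reproduced outside a speech engine. The following is a minimal sketch using Python's standard unicodedata module. The helper name normalise_for_tts is hypothetical and only illustrates a preprocessing step that an engine, or a wrapper around one, might apply before synthesis.

```python
import unicodedata

plain = "finish"
ligature = "\ufb01nish"  # U+FB01 LATIN SMALL LIGATURE FI followed by "nish"


def normalise_for_tts(text: str) -> str:
    """Hypothetical preprocessing step: fold compatibility characters
    such as ligatures into their plain-letter equivalents."""
    return unicodedata.normalize("NFKC", text)


# The raw strings differ: one has six code points, the other five.
print(plain == ligature)

# Canonical normalisation (NFC) does not touch the ligature.
print(unicodedata.normalize("NFC", ligature) == plain)

# Compatibility normalisation (NFKC) expands it to "f" + "i".
print(normalise_for_tts(ligature) == plain)
```

An engine that pronounces the two test words differently is probably passing the text through without a step like this.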


API


Need a way to identify abbreviations and the context they are in. For example, Ala can be either Alabama or alanine, depending on what is being talked about.

See English for more information about abbreviations.
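
One possible shape for such an API is sketched below. It is not an existing interface: the SENSES table, the cue words, and the function expand_abbreviation are all assumptions made for illustration. It picks the expansion whose cue words overlap most with the surrounding text.

```python
# Hypothetical sense inventory: abbreviation -> expansion -> cue words.
# The cue words are illustrative only, not taken from any real lexicon.
SENSES = {
    "Ala": {
        "Alabama": {"state", "county", "governor", "city", "usa"},
        "alanine": {"amino", "acid", "protein", "residue", "peptide"},
    },
}


def expand_abbreviation(abbr, context):
    """Return the expansion whose cue words best match the context,
    or None if the abbreviation is unknown or no cue word is present."""
    senses = SENSES.get(abbr)
    if not senses:
        return None
    # Lower-case the context and strip surrounding punctuation from each word.
    words = {w.strip(".,;:()").lower() for w in context.split()}
    best, best_score = None, 0
    for expansion, cues in senses.items():
        score = len(words & cues)
        if score > best_score:
            best, best_score = expansion, score
    return best


print(expand_abbreviation("Ala", "The protein has an Ala residue at position 12."))
print(expand_abbreviation("Ala", "Montgomery, Ala. is the state capital."))
```

Returning None when no cue matches lets the caller fall back to spelling the abbreviation out or reading it as a word, rather than guessing.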