From Message

Roger Sperberg

2007-03-26 04:55:20

Khmer Unicode

I'm able to enter and edit Khmer text in Notepad, MS Word and OpenOffice. What I'm looking for, though, is a lightweight Unicode-capable programmer's text editor.

SuperEdi looks very good.

But there aren't any monospaced Khmer Unicode fonts that I know of. (I'm not really sure which font was providing the Khmer characters when I tried SuperEdi, experimenting with every monospaced font.)

A simple word like "knyom" or "kñom," which is the word for "I," consists of four glyphs. You enter this with your keyboard by typing "x," "j," "J," and "M." The second keystroke indicates that the next letter should not appear beside the first but below it, and the last keystroke inserts a two-part vowel, one part of which appears above the first letter and the other part below the subscripted "ñ" character. In other words, the word is only one letter wide.
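For readers who want to see what is stored when this word is typed, here is a minimal Python sketch. The code point sequence is an assumption (the usual Unicode spelling of the word, not something given in this post): it has five code points, so the last keystroke described above presumably emits two of them. `count_stacks` is a hypothetical helper and only a rough heuristic, not the Unicode grapheme cluster algorithm.

```python
import unicodedata

# Assumed Unicode spelling of the Khmer word for "I" (not given in the post):
# KHA, COENG (subscript marker), NYO, VOWEL SIGN U, NIKAHIT.
KNYOM = "\u1781\u17d2\u1789\u17bb\u17c6"
COENG = "\u17d2"

def describe(text):
    """Return (code point, name, general category) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
            for ch in text]

def count_stacks(text):
    """Rough heuristic (hypothetical helper): count letters that start a stack.

    A letter starts a new stack unless it directly follows COENG, in which
    case it is drawn as a subscript. Marks (category M*) never start a stack.
    """
    stacks = 0
    previous = ""
    for ch in text:
        is_letter = unicodedata.category(ch).startswith("L")
        if is_letter and previous != COENG:
            stacks += 1
        previous = ch
    return stacks

for row in describe(KNYOM):
    print(row)
print(len(KNYOM), "code points,", count_stacks(KNYOM), "stack")
```

Under these assumptions the word is five code points but a single stack, which is why it should occupy one letter's width.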

In SuperEdi, some stacking and combining occurred, but the word appeared four letters wide in the required monospace font.

Pressing the spacebar also entered a visible space, though Khmer should not display spaces between words. With the Khmer Unicode keyboard, the spacebar should generate a zero-width space (ZWSP, U+200B), which allows for line breaks.

I hope that when you add support for proportional-width fonts, you are also able to handle the ZWSP character: displaying it when white space is shown and collapsing it when white space is made invisible.
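As a rough illustration of the ZWSP behaviour asked for here, a minimal Python sketch follows. The function names and the middle-dot marker are hypothetical choices for this example, not anything SuperEdi provides.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

def break_opportunities(text):
    """Split text into the segments between zero-width spaces.

    Each boundary between segments is a place where a line may break.
    """
    return [segment for segment in text.split(ZWSP) if segment]

def render_whitespace(text, show_invisibles, marker="\u00b7"):
    """Hypothetical display rule for ZWSP.

    When white space is shown, draw a visible marker in its place;
    when white space is hidden, let it take up no width at all.
    """
    return text.replace(ZWSP, marker if show_invisibles else "")

sample = "ab" + ZWSP + "cd"
print(break_opportunities(sample))
print(render_whitespace(sample, show_invisibles=True))
print(render_whitespace(sample, show_invisibles=False))
```

Python does not treat U+200B as white space (`str.isspace()` is false for it), so the sketch splits on the character explicitly rather than calling `split()` with no argument.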


Roger Sperberg

Wolfgang Loch

2007-03-26 23:48:44

Re: Khmer Unicode

Thanks for the note. I am aware that SuperEdi does not properly support all scripts, especially those in which not every character maps to a single glyph.

SuperEdi works fine with Western scripts, including Latin, Cyrillic and Greek characters. Even Korean, Chinese and Japanese characters work OK if you allow them to use double the width of Latin characters. However, some scripts, such as Arabic and Hebrew, can't be properly represented with monospace fonts.
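The double-width rule for Korean, Chinese and Japanese can be sketched in Python with the standard `unicodedata` module. This is only an illustration of the idea under a simple one-or-two-cells assumption, not SuperEdi's actual code; `cell_width` and `line_width` are hypothetical names.

```python
import unicodedata

def cell_width(ch):
    """Cells used by one character: 2 for East Asian Wide/Fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def line_width(text):
    """Total width of a line in monospace cells under the rule above."""
    return sum(cell_width(ch) for ch in text)

print(line_width("abc"))                 # three Latin letters
print(line_width("\u65e5\u672c\u8a9e"))  # three CJK ideographs
print(line_width("\ud55c\uae00"))        # two Hangul syllables
```

A model like this has no notion of marks that stack above or below another letter, which is where scripts such as Khmer fall outside it.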

And there are languages that don't represent every Unicode codepoint with a single glyph. This is true for Tamil and obviously for Khmer. I doubt that it is even possible to represent these scripts with a monospace font. For example, how do you select the first character if it represents the upper half of the glyph?
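A small Tamil case makes the point concrete. The syllable "ko" is stored as a consonant followed by one vowel-sign code point (U+0BCA), yet that vowel sign is drawn with one part on each side of the consonant. Its canonical decomposition shows the two parts. The sketch below uses only Python's standard `unicodedata` module; the helper name is hypothetical.

```python
import unicodedata

TAMIL_KO = "\u0b95\u0bca"  # TAMIL LETTER KA + TAMIL VOWEL SIGN O

def decomposed_names(text):
    """Names of the code points after canonical decomposition (NFD)."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", text)]

print(len(TAMIL_KO), "code points as stored")
for name in decomposed_names(TAMIL_KO):
    print(name)
```

Two stored code points decompose to three, and the middle one is rendered before the consonant, so neither count lines up with fixed character cells.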

I may at some point add support for variable-width fonts and complex scripts to SuperEdi. Unfortunately, I don't speak Tamil or Khmer. Can you point me to a resource that (visually) explains the Khmer script and how it is represented in Unicode?

