Khmer is a Unicode block containing characters for writing the Khmer, or Cambodian, language.

The Khmer alphabet or Khmer script is an abugida: each consonant character carries an inherent vowel, and other vowels are written as marks attached to the consonant. It's used to write the Khmer language (the official language of Cambodia). Apart from that, the script is used to write Pali in the Buddhist liturgy of Cambodia and Thailand.

The origins of Khmer go back to the Pallava script, from which it was adapted. Pallava, which was used in southern India and South East Asia during the 5th and 6th centuries AD, is a variant of the Grantha alphabet, itself descended from the Brahmi script. I know, this chain seems complicated, but doesn't all linguistics? Anyway, the oldest Khmer inscription was found at Angkor Borei District in Takéo Province, south of Phnom Penh, and it dates back to 611 AD.

As for the modern Khmer script, it differs a lot from the earlier forms seen in the inscriptions of the Angkor ruins. The Thai (0E00–0E7F) and Lao (0E80–0EFF) scripts descend from an older form of the Khmer script.

Khmer is written from left to right. Words within a sentence or phrase are usually run together with no spaces between them. Consonant clusters within a word are “stacked”, with the second (and occasionally third) consonant written in reduced form under the main consonant. Originally there were 35 consonant characters, but modern Khmer uses only 33. Each character represents a consonant sound together with an inherent vowel – either â or ô.
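
To make the stacking concrete, here is a minimal Python sketch of how a stacked consonant appears in encoded text. It assumes the usual Unicode encoding, in which the subscript form is not a separate character but the ordinary consonant preceded by KHMER SIGN COENG (U+17D2). The helper names `is_consonant` and `stacked_pairs` are made up for this illustration, and a three-consonant stack would simply be reported as two overlapping pairs.

```python
import unicodedata

COENG = "\u17d2"  # KHMER SIGN COENG: marks the following consonant as subscript


def is_consonant(ch):
    # Khmer consonant letters occupy U+1780..U+17A2 (35 code points).
    return "\u1780" <= ch <= "\u17a2"


def stacked_pairs(word):
    """Return (base, subscript) consonant pairs that are joined by COENG."""
    pairs = []
    for i in range(1, len(word) - 1):
        if word[i] == COENG and is_consonant(word[i - 1]) and is_consonant(word[i + 1]):
            pairs.append((word[i - 1], word[i + 1]))
    return pairs


# The word "Khmer": KHA + COENG + MO + vowel sign AE + RO
khmer = "\u1781\u17d2\u1798\u17c2\u179a"

for ch in khmer:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print(len(stacked_pairs(khmer)), "stacked pair(s) found")
```

The five code points are stored in a flat sequence; it is the rendering engine and font that draw the second consonant underneath the first.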

You might remember that Khmer is an abugida. That's why vowel sounds are more commonly represented as dependent vowels – additional marks accompanying a consonant character, and indicating what vowel sound is to be pronounced after that consonant (or consonant cluster). Most dependent vowels have two different pronunciations, depending in most cases on the inherent vowel of the consonant to which they are added. In some positions, a consonant written with no dependent vowel is taken to be followed by the sound of its inherent vowel.
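
As a small illustration of dependent vowels in encoded text, the Python sketch below picks them out of a string. It relies on an assumption about naming: that the dependent vowels in this block carry Unicode names beginning with "KHMER VOWEL SIGN". The function name `dependent_vowels` is hypothetical.

```python
import unicodedata


def dependent_vowels(text):
    """Return the characters whose Unicode name marks them as Khmer vowel signs."""
    return [
        ch for ch in text
        if unicodedata.name(ch, "").startswith("KHMER VOWEL SIGN")
    ]


syllable = "\u1780\u17b6"  # consonant KA followed by vowel sign AA
signs = dependent_vowels(syllable)
print([f"U+{ord(ch):04X}" for ch in signs])
```

A consonant on its own yields an empty list here, which matches the case where the inherent vowel is pronounced and nothing extra is written.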

Needless to say, there are also a number of diacritics used to indicate further modifications in pronunciation. The script also includes its own numerals and punctuation marks.
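
For the numerals, a short Python sketch shows that Khmer digits behave as ordinary decimal digits in software. It assumes the digits zero to nine sit at U+17E0 to U+17E9 and uses only the standard library; the variable names are illustrative.

```python
import unicodedata

khmer_number = "\u17e1\u17e2\u17e3"  # Khmer digits one, two, three

# Each Khmer digit has a decimal value recorded in the Unicode database.
values = [unicodedata.digit(ch) for ch in khmer_number]

# int() accepts any Unicode decimal digits, not only ASCII ones.
as_int = int(khmer_number)

print(values, as_int)
```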

Properties

Range 1780–17FF
Characters 128
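
The two properties above can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up for the example, and the figure 128 counts code points in the range, which is not necessarily the number of code points that have a character assigned.

```python
KHMER_START = 0x1780
KHMER_END = 0x17FF  # inclusive upper bound of the block


def in_khmer_block(ch):
    """True if the single character ch falls inside the Khmer block."""
    return KHMER_START <= ord(ch) <= KHMER_END


# Size of an inclusive range is end - start + 1.
block_size = KHMER_END - KHMER_START + 1

print(block_size)
print(in_khmer_block("\u1780"), in_khmer_block("a"))
```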

List of Characters

Representation in Unicode
