A space is the blank gap between characters that separates words in a text.

The earliest scripts were pictographic or ideographic. Each symbol stood for a word, so there was no need to separate them. Once alphabets were introduced, reading an unbroken run of text became inconvenient, and special symbols appeared whose purpose was to divide words. The space, now considered standard, was not invented overnight. In Latin and Greek scripts it has been in use for about a thousand years. Cyrillic script took longer to adopt it: spaces came into use there only in the 17th century. In Arabic they appeared later still, in the 20th century.

Besides this special symbol, word boundaries can be indicated in other ways. One is to use special letter forms at the beginning or end of a word. In the Arabic alphabet, several letters have four different written forms (initial, medial, final, and isolated). Although Arabic is now written with spaces, the letters still keep their different forms. Another alternative is a line drawn above the letters: the words themselves are written without spaces, and the line is interrupted between them. In some writing systems the units being separated are not words but phrases, sentences, or syllables. The true space is used in almost all modern writing systems. In Thai, however, spaces separate only sentences.

Unicode defines several kinds of space. One example is the non-breaking space (U+00A0). Several more space characters are located in the General Punctuation block.
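To see which space characters Unicode defines, the character database can be queried directly. The sketch below is only an illustration: it assumes that "space" means the general category Zs (space separator), and the function name `space_separators` is made up for this example. The exact list depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# General Punctuation block range, U+2000..U+206F.
GENERAL_PUNCTUATION = range(0x2000, 0x2070)

def space_separators():
    """Return (code point, name) pairs for every character in category Zs."""
    found = []
    # Scanning the Basic Multilingual Plane is assumed to be enough here.
    for cp in range(0x10000):
        ch = chr(cp)
        if unicodedata.category(ch) == "Zs":
            found.append((cp, unicodedata.name(ch, "<unnamed>")))
    return found

for cp, name in space_separators():
    # Mark the characters that live in the General Punctuation block.
    note = "  (General Punctuation)" if cp in GENERAL_PUNCTUATION else ""
    print(f"U+{cp:04X}  {name}{note}")
```

The ordinary space (U+0020) and the non-breaking space (U+00A0) appear at the top of the output, followed by the fixed-width spaces such as EN SPACE and EM SPACE.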

More symbols for word separation:

· Interpunct. Latin. Used until around AD 600-800.

𐎟 Ugaritic cuneiform.

𐏐 Persian cuneiform.

𒑰 Assyrian cuneiform.



𐤟 Phoenician.
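The word dividers listed above are ordinary Unicode characters, so they can be inspected and used for splitting text like any other character. The following is a rough sketch: the helper names `describe` and `split_words` are invented for this example, and the sample inscription is purely illustrative.

```python
import unicodedata

# The word dividers from the list above, written as escapes so the
# code stays readable even without fonts for the ancient scripts.
WORD_DIVIDERS = [
    "\u00B7",      # interpunct (middle dot)
    "\U0001039F",  # Ugaritic
    "\U000103D0",  # Old Persian
    "\U00012470",  # Old Assyrian
    "\U0001091F",  # Phoenician
]

def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def split_words(text, divider="\u00B7"):
    """Split text on a word divider instead of a space."""
    return [word for word in text.split(divider) if word]

for ch in WORD_DIVIDERS:
    print(describe(ch))

# An inscription-style line separated by interpuncts.
print(split_words("ARMA\u00B7VIRVMQVE\u00B7CANO"))
```

Splitting on the interpunct recovers the same words that a modern reader would expect to find separated by spaces.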


The symbol “Space” is included in the “ASCII punctuation and symbols” subblock of the “Basic Latin” block and was approved as part of Unicode version 1.1 in 1993.

Unicode Name: Space
Unicode Number: U+0020
CSS Code: \0020
Unicode Block: Basic Latin
Unicode Subblock: ASCII punctuation and symbols
Unicode Version: 1.1 (1993)
Alt Code: Alt 32
Version: 1.1
Block: Basic Latin
Type of paired mirror bracket (bidi): None
Composition Exclusion: No
Case change: 0020
Simple case change: 0020
Grapheme_Base: +
age: 1.1
scripts: Common
Encoding | hex | dec (bytes) | dec | binary
UTF-8 | 20 | 32 | 32 | 00100000
UTF-16BE | 00 20 | 0 32 | 32 | 00000000 00100000
UTF-16LE | 20 00 | 32 0 | 8192 | 00100000 00000000
UTF-32BE | 00 00 00 20 | 0 0 0 32 | 32 | 00000000 00000000 00000000 00100000
UTF-32LE | 20 00 00 00 | 32 0 0 0 | 536870912 | 00100000 00000000 00000000 00000000
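
The encoding values above can be reproduced with a few lines of Python. This is a minimal sketch, assuming that the single "dec" column is the encoded byte sequence read as one big-endian integer, which is why the little-endian rows give 8192 and 536870912 rather than 32. The function name `encoding_rows` is hypothetical.

```python
def encoding_rows(ch=" "):
    """Return (codec, hex bytes, decimal bytes, integer value, binary) per encoding."""
    rows = []
    for codec in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
        data = ch.encode(codec)
        rows.append((
            codec,
            " ".join(f"{b:02x}" for b in data),   # each byte in hex
            " ".join(str(b) for b in data),       # each byte in decimal
            int.from_bytes(data, "big"),          # whole sequence as one number
            " ".join(f"{b:08b}" for b in data),   # each byte in binary
        ))
    return rows

for row in encoding_rows():
    print(" | ".join(str(field) for field in row))
```

Passing a different character to `encoding_rows` produces the same kind of table for that character.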