Orality, Literacy, and the Written Arabic Language

My Question

On Oct 21, 2001, I went to a talk given by William O. Beeman that discussed why there are a lot of Muslims who don't like the U.S. In the Q&A session, he made an off-hand remark that the Arab world had been at the forefront of everything -- art, literature, science, medicine -- until around the 1600s, when suddenly Europe pulled ahead. He seemed puzzled as to why that happened.

I had finished reading The Printing Press as an Agent of Change by Elizabeth Eisenstein six weeks before, and the book blew my mind. It showed how the printing press led to an explosion of scientific advances and enabled a ferment of philosophical and religious thought -- in the 1600s.

I was stunned. "Is printing in Arabic that much harder than printing Latin characters? Is the divide between Islam and the West because of the printing press?"

Since then, I have been (slowly) reading and trying to come up with an answer to that question. If you know of good books or papers, please email me with the citation.

Technological difficulties

It turns out that yes, printing in Arabic is that much harder than printing Latin characters. This is due to multiple character shapes, diacritics, ligatures, and a sloping baseline. Tom Milo, who develops software for Arabic typography, has some fine articles on just what makes it so hard; there are additional articles by Paul Lunde and Eildert Mulder. Here is a summary of the issues:
  • Multiple glyph shapes: While Latin-derived alphabets have up to two letterforms (capital and lowercase), dependent in part on where in the word or sentence the letter is, Arabic has up to four different versions of each letter. There are special forms for the beginning, middle, and end of a word, plus a form for standing alone (as in the index of a book).
  • Slope of baseline: Tom Milo says that there is a slope to calligraphic Arabic, where all words start at one y-height and end at a different y-height. This means that the slope of the word changes depending upon how long it is. For some styles of Arabic, it's not very large, but for some (like Nasta'liq), it is more obvious.
  • Lots of ligatures: Written Arabic is a fundamentally "cursive" script, while Latin is a fundamentally "block" script. Imagine trying to typeset cursive English, and you get the beginning of a sense of how hard it would be -- but it is far worse in Arabic. In cursive Latin, the ligatures don't usually change the shape of the character a whole lot. For example, the "a" and "t" in cursive "at" look basically the same as the "a" and "t" separately. Lowercase "o" and "y" cause a few changes in shape, but they are minimal. However, in Arabic, the letter shapes change a lot when they connect.
    • The kind of ligatures is also different. In Latin, you really only have left-right connections between the letters. In Arabic, you can have right-to-left and top-to-bottom connections.
    • Note: Complex ligatures in "block" writing systems might have been more common once upon a time. At the Holy Image, Hallowed Ground: Icons from Sinai exhibition, I did see quite a number of interesting combinations of Greek letters, like in this fragment from Saint John Chrysostom with Liturgical Scroll, circa 1200 CE. The chi-rho (X + P) monogram is apparently particularly common for that time period.
  • Consonant dots: Over time, some Arabic characters became ambiguous, and dots were introduced to disambiguate them. (The character for "sh" is the character for "s" with three dots above it, for example.) These dots are not optional, and they need to go in different places depending upon the overall shape of the word (e.g. if there is another character where the dots "want" to go) and the shape of the character (i.e. if there is a ligature).
    • This is a little bit similar to how Latin scripts' accents are in different positions for capital letters and lower-case letters (e.g. Ä and ä), but there are a lot more types of consonant dots in Arabic script than accents in most Latin scripts, and accent placement is far more predictable. It has only been very recently that some very talented people working very hard have figured out how to write the placement rules precisely enough for computers to be able to place the dots correctly.
  • Vowel points: Arabic was originally written without vowels. From what I understand, this isn't as much of a loss as you might think because the vowels (generally) only change what part of speech the word is, not the fundamental meaning of the word.
    • It is similar to how the last letter is different in "biked" and "bikes" for the past tense and present tense of "to bike". You're still talking about a machine with two wheels. In English, vowels are much more important than they are in Arabic: you change from talking about transportation to cooking if you change from "biked" to "baked".
    Like consonant dots, vowel points "move around" based on the shape of the word and character.
  • Line justification: Perhaps not surprisingly, given the lack of need to rely on horizontal spaces to denote word boundaries, line justification also does not depend upon horizontal spaces. Instead, Kashida spacing is used: horizontal lines in the middle of words are stretched in well-defined and well-prescribed ways. This is difficult to implement with mechanical means.
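To make the "up to four versions of each letter" point concrete, here is a minimal sketch in Python. It assumes that the legacy Arabic Presentation Forms-B block in Unicode (U+FE70 to U+FEFF) is a fair stand-in for the positional letterforms a typesetter would need; modern fonts shape letters on the fly rather than using these code points, so treat this as an illustration only. The function name contextual_forms is my own invention.

```python
import unicodedata
from collections import defaultdict

POSITIONS = {"isolated", "initial", "medial", "final"}

def contextual_forms():
    """Map each base Arabic letter to its positional presentation forms.

    Scans the Arabic Presentation Forms-B block and reads the
    compatibility decomposition of each code point, which looks like
    "<initial> 0628" for the initial form of the letter BEH.
    """
    forms = defaultdict(dict)
    for cp in range(0xFE70, 0xFF00):
        decomp = unicodedata.decomposition(chr(cp))
        if not decomp.startswith("<"):
            continue  # unassigned, or no compatibility mapping
        tag, *bases = decomp.split()
        position = tag.strip("<>")
        # Skip ligatures and spacing marks, which map to two code points.
        if position in POSITIONS and len(bases) == 1:
            base = chr(int(bases[0], 16))
            forms[base][position] = chr(cp)
    return forms

if __name__ == "__main__":
    for base, shapes in sorted(contextual_forms().items()):
        print(unicodedata.name(base), len(shapes), sorted(shapes))
```

Running it lists each letter with the number of positional shapes Unicode records for it. Letters such as BEH get all four; letters such as ALEF, which never connect to the following letter, get only an isolated and a final form.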

Side Note: Advantages of Arabic Script

After that long list of what makes Arabic hard to typeset, a person who is only familiar with Latin scripts could perhaps be excused for allowing the thought to pass across their brain that Arabic is not a very "good" writing system. Chase that thought away! As I have learned more about Arabic and about writing systems in general, I have been impressed at how particularly well-suited Arabic is to the technology of the middle ages -- much better than Latin, in my opinion.
  • Easily visible word boundaries: In Arabic, it is much easier to see where a word starts and stops than it was in Latin prior to spaces between words. (It wasn't until about 800 CE that Latin started using spaces between words.) The slope and different letterforms in Arabic help readers tell where word boundaries are.

    While I haven't seen anyone else mention it, it sure looks to me like the end of one word frequently goes under the beginning of the next word. It looks to me like where modern Latin scripts use horizontal separation, calligraphic Arabic used vertical separation: the end of one word is below the beginning of the next word.

    Note that in a time when writing surfaces (papyrus, parchment, or early paper) were very expensive, vertical separation was significantly cheaper than horizontal separation. You could pack more letters onto a page with vertical spacing than with horizontal spacing -- probably almost as many as with no spacing.

  • Data compression: By not (or only rarely) writing the vowels, not only is it faster to write Arabic than Latin script, but it takes up less space. (Note that as recently as twenty years ago, computer programmers also frequently left out vowels in variable names to save space.)

  • Very distinct Bouma shape: The Bouma shape -- the overall shape of a word -- is more distinguishable in Arabic than in Latin. This is particularly true in Nasta'liq. Kamal Mansour, who worked on an OpenType version of Nasta'liq, said at a talk I went to that people who learn to read Nasta'liq can have a very difficult time reading non-Nasta'liq Arabic script. He said that words in Nasta'liq seemed almost like ideograms to him once he got good at reading it. This probably makes it harder to learn, but faster to read.

    • The Bouma shape is very important: in 2003, there was a rumour circulating around the Internet that claimed that researchers at a British university had found that you could still read English text reasonably easily if the first and last letter were in the right place but every other letter was jumbled. While it appears that there was no research exactly like that mentioned, it is striking how easy it is to read the jumbled text.
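If you want to try the jumbled-text claim on yourself, here is a small Python sketch of the scrambling described in that rumour: keep the first and last letter of each word and shuffle everything in between. The function names and the seed parameter are my own, and the sketch naively treats punctuation as part of the word it is attached to.

```python
import random

def jumble_word(word, rng):
    """Shuffle the interior letters of a word, keeping first and last fixed."""
    if len(word) <= 3:
        return word  # nothing to shuffle in words of three letters or fewer
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def jumble_text(text, seed=0):
    """Jumble every whitespace-separated word in the text."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return " ".join(jumble_word(word, rng) for word in text.split())

if __name__ == "__main__":
    print(jumble_text("reading jumbled words remains surprisingly easy"))
```

Try it on a paragraph of your own. Long words with unusual interiors are noticeably harder to recover than short ones, which fits the idea that the overall shape of the word is doing a lot of the work.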

Arabic script is also pretty. Period. Check out this calligraphy, for example.

Impacts of armed conflicts on typography technologies

Tom Milo points out that Ibrahim Müteferrika did a fantastic job of typesetting Arabic in 1730 CE, but that all of the knowledge of how to typeset was lost when the Europeans overran the Ottoman Empire. All the printers worked for the government, and everybody working for the government was fired. (This is similar to the de-Baathification that the US carried out in Iraq.)

Another of the great centers of Arabic printing was Lebanon. Tom Milo points out that with the advent of mini- and micro-computers, one might have expected the Lebanese printing industry to have seized upon the new technology to come up with software solutions to the difficulty of typesetting Arabic. Alas, he notes, the Lebanese civil war (1975-1990 CE) was very effective at taking Lebanese expertise out of circulation.


The Arab culture places great value on recitation, much more so than on reading. A great honor -- the title of Hafiz -- is bestowed on people who have recited the entire Qur'an from memory. Note that you don't have to read it, you don't have to understand it (though I presume it's a good thing if you do), you have to recite it. You don't have to memorize/recite it all at once, but you do have to memorize all the pieces at least once.

This makes sense given that written Arabic -- particularly the Qur'an -- was ambiguous. (Vowel points and consonant dots weren't developed until after the Qur'an. Vowel points are still frequently left out.) In order to make absolutely sure that someone understood what they read, teachers needed to hear the student repeat it back to them. This extended even to book copying -- you couldn't just borrow a book from someone, copy it, and return it. The owner would have to teach the book to you, you would memorize it, and then the owner would certify in your written copy that you recited it correctly.

Because this tradition was very firmly embedded in the culture of learning, it was also apparently common to teach "between the lines". A book might say one thing, but what it really meant was something your teacher would have to explain to you. That was part of the certification process. The whole culture of learning assumed that you could ask questions of the person you got the book from.

Printing completely upset that one-to-one learning pattern, and was thus very disruptive. If anybody could read from a book by themselves, there was no telling what interpretation they might make! They might completely misunderstand it!

If there were typos (which were likely, given how difficult it was to typeset Arabic), that would make it even more difficult to understand the book. Thus when the Ottoman Sultan finally allowed printing of secular works in Arabic in 1727, it was with the restriction that each page would need to be proofread by four learned men appointed by the Sultan.

Note that this would increase the cost significantly -- costs which were already higher due to the technical demands of printing Arabic.

Aesthetic Issues

Islam forbade most forms of figurative art, so calligraphy held one of the highest positions (if not the highest) in the Arab artistic world. Unfortunately, the first documents printed in Arabic script (in Europe, by Europeans) were just awful. They had lots of typos and were just plain ugly.

If you are Christian, try to imagine how you would feel if the first printed Bible had been printed in Istanbul by Muslims, had zillions of typos, and looked like it had been written by third-graders in crayon. You'd probably be upset. It is no wonder that the Ottomans banned printing in Arabic for hundreds of years and banned printing the Qur'an for another hundred years after that.

Literacy in the Arab World

The literacy rate in the Arabic-speaking world has lagged other parts of the world. According to the 2007 World Factbook, Mexico has a higher literacy rate than Bahrain. This is probably not all due to the technical difficulty of printing Arabic (which in turn makes books more expensive and less available).

Country             Literacy Rate
Arabic World
  Bahrain           86.5%
  Saudi Arabia      78.8%
Non-Arabic World
  United States     99%

Part of the low literacy in the Arab world might be due to written Arabic being hard to read. This is due in part to vowels frequently being omitted. Abu-Rabia has done studies that show that reading accuracy and comprehension are both better in Arabic if vowel marks are included.

However, part might be that written Arabic is quite different from spoken Arabic. Classical (i.e. written) Arabic has changed little since the 7th century. Note that Arabic is a very holy language, unlike Latin.

  • The Angel Gabriel spoke to the Prophet Muhammad in Arabic and commanded him to recite in Arabic. Arabic and Islam are very fundamentally tied together.
  • Jesus spoke Aramaic. The New Testament was originally written in Greek, and later translated into Latin. Latin and Christianity are thus far more weakly connected. (Even so, note that the Catholic Church persisted in using Latin in the liturgy until the 1960s!)
To change Arabic, to modernize the language or to reform it in some other way, would probably be seen as messing with the Word of God.

Regardless, what common people speak in their homes moves on. While there is a defined standard, Modern Standard Arabic (MSA), it is a lingua franca. As Latin was for many years, it is a working language but a mother tongue to none. Colloquial Arabic is quite different from MSA.

Colloquial Arabics are also very different from each other. Arabic is spoken from Morocco to Afghanistan, from Niger to Uzbekistan. The different descendants of the Islamic liturgical tongue are as different from each other as the descendants of the Christian liturgical tongue (e.g. Spanish, French, and Italian) are. Regional variations that creep into the written language can also cause difficulties for people from a different region.

Turkish and Ottoman bans on Arabic printing

If you look at the language of those who banned printing in the Arabic script for 282 years -- the Ottoman elites -- the difference between written Arabic and the language of the street was extremely large. According to Geoffrey Lewis, in The Turkish Language Reform: A Catastrophic Success, the language of the Ottoman elite -- a blend of common Turkish, Persian, and Arabic -- was difficult for all but the elite to read.

Furthermore, the Arabic alphabet was particularly poorly suited to Turkish, which depended heavily on vowels and diphthongs, and had several consonants that were poorly represented in Arabic script.

Looking backwards and forwards

A number of features of Arabic script which conserve horizontal writing space make Arabic challenging to typeset with the horizontal, block-oriented press technology originally developed in Europe. The Arab reliance on recitation, the appreciation of fine calligraphy, and the difference between written and spoken Arabic all served to delay and impede the acceptance of print in the Arab World.

Recently, there have been great strides in typesetting software. Tom Milo at DecoType and Kamal Mansour at Monotype, along with their teams, have worked hard to develop calligraphic Arabic fonts, ones that understand the fundamental "DNA" of Arabic scripts. These should make it much easier to overcome the technical limitations of printing in Arabic script.

Below are some of my earlier thoughts and earlier sources.

The Calligraphic State

Beeman recommended The Calligraphic State by Messick to me. It has good stuff on communications technology in Islamic societies, but it is a bit buried underneath a lot of Upper Yemen history. It is also out of print.

From this book, I learned the following.

It is very important in Islam to hear the Qur'an. Students would learn Arabic writing merely as an aid to memorizing the spoken Qur'an. Partly this is because, according to the Qur'an, the Angel Gabriel dictated to Muhammad. Gabriel didn't hand Muhammad a book and say, "Here, read this."

As the Qur'an is a complete guide to behavior, law and religion are not terribly distinct, so even the legal system was fundamentally oral until very recently. For example, signed documents did not have legal weight by themselves until relatively recently. To use a written document in court, you had to have one or two witnesses (I forget the number) in court swear that the document was genuine.

Documents weren't just copied, they were memorized. This meant that the lineage of a document was important. You learned not just who composed the document originally, but who memorized it along the way: "This was what Fred said. Fred taught it to Jackie, who taught it to Frieda, who taught it to Bill, who taught it to Mildred, who taught it to Rasheed, who taught it to Edgar, who taught it to me." The document would get passed along this way for hundreds of years!

In a completely oral society, there is no other way to preserve information. In a society that has writing but not the printing press, there is danger in depending upon written copies. There were no smoke detectors, sprinklers, or fire departments back then: your only copy could easily be lost!

Web Sites

(I surfed a bunch of Web sites which I, alas, did not record. I'm not sure where exactly the following ideas came from.)

Because figurative art is discouraged in Islamic societies, calligraphy was one of the only forms of artistic expression. The beauty of written forms thus became much more important to the Arab world than to the European world.

Turkey switched over to a modified Latin alphabet when it was under the leadership of Atatürk. Atatürk also led a major literacy push. The alphabet switch combined with the literacy push raised the literacy rate from 9% in 1923 to 33% by 1938 and 85% today. How much of that gain was due to an easier alphabet and how much due to the emphasis on literacy? I don't know the answer to that.

Arabic-Speaking Friend

A friend of mine who grew up speaking Arabic once told me that he always had found it easier to read English than Arabic. (This despite Arabic being his home language and the language of the country he grew up in.) I asked if it was because vowels usually weren't written. He said no, that the structure of the language was such that usually the words weren't ambiguous. Instead, he said that written Arabic is archaic enough that it is very different from the spoken language. He told me that written Arabic is basically archaic Saudi dialect.

I don't know, but suspect that the language has ossified because it is considered holy. (Think about how Latin was used in church services until the 1960s, even though the Bible wasn't written in Latin!) After all, Arabic is the language of Muhammad, the language that people (even non-Arabic speakers) are supposed to recite prayers in.

If I had to do all my reading and writing in the English of 570 AD, that would certainly slow me down.

Arabic Typography by Huda Smitshuijzen AbiFarès

Arabic Typography by Huda Smitshuijzen AbiFarès gave me more background on the history and technology of Arabic printing.

The history section shows that printing clearly came very late to the Arab world:

    1450: Gutenberg
    1521: Hebrew press established in Ottoman Empire
    1537: first Qur'an printed (in Venice)
    1694: second Qur'an printed (in Hamburg)
    (throughout this period, there is a tiny bit of scholarly Arabic-language printing in Europe)
    1726: prohibition against printing in Arabic type lifted in Ottoman Empire for secular books only
    1798: Napoleon imported a printing press into Egypt
    1818: printing starts in Iran(?)
    1822: first press in Malta
    1826*: can't find the date, but it was around now that the prohibition against printing sacred materials in the Ottoman Empire was lifted
    1830: first press in Iraq
    1845: first press in Morocco
    1846: first presses in Palestine and Algeria
    1860: first press in Tunisia
    1877: first press in Yemen
    1881: first press in Sudan

Printing is difficult partly because of the orthography. While English has 52 letter forms (26 letters in upper-case and lower-case), Arabic has 28 letters, each of which can have four different forms depending upon whether it is at the beginning, middle, or end of the word or stands alone. (The script developed before the idea of putting spaces between words, so the different shapes help to see the word boundaries.) Then there are optional diacritics. For example, vowel marks are sometimes placed above or below the letters.
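The "optional diacritics" are easy to see in how computers store Arabic today. The Python sketch below assumes that Unicode's combining marks are a reasonable proxy for the optional vowel marks; the helper name strip_marks is mine. It removes every combining mark from a fully vocalized word and compares the lengths.

```python
import unicodedata

def strip_marks(text):
    """Drop every combining mark, keeping only the base letters."""
    return "".join(ch for ch in text if unicodedata.combining(ch) == 0)

# The word kataba ("he wrote"), written with three fatha vowel marks.
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"
bare = strip_marks(vocalized)

print(len(vocalized), len(bare))  # prints: 6 3
```

Note that the consonant dots do not come off this way: in Unicode a dotted letter is a single code point in its own right, which matches the point that the dots, unlike the vowel marks, are not optional.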

However, AbiFarès also mentions that Gutenberg used 300 letterforms to create the Gutenberg Bible. (Mostly these were ligatures, things like ae or ff.) So clearly, the complexity of typography can't account for all of why it took so long for printing to become widespread in the Arab world.

This book also talked about how writing is important in the Islamic creation story. It went something like this:

Allah stared fixedly at the hamza (a diacritic -- a little circle). As Allah stared, the hamza started to drip ink, which ran down and became the aleph (equivalent to the Latin "A"). From the aleph, all the other letters came, and thus everything in the universe.
(I can't find the page reference or the Qur'anic reference, so I'm sure I got it slightly wrong.)

Anyway, AbiFarès says that the act of writing is considered a spiritual act, so I can imagine a real reluctance to printing sacred works. I can imagine people uncertain if it could be Scripture if it wasn't in script.

The Alphabet Effect

The Alphabet Effect by Robert K. Logan had interesting history and some aggressive conclusions.

Logan shows how the Babylonians, Hebrews, Greeks, and Romans benefitted from the introduction of the alphabet and writing. Like Eisenstein, he talks about how the printing press led to massive technological advancements and social changes.

He also says that linear, alphabetic writing made the readers' thought processes fundamentally different from those of preliterate or non-alphabetic peoples. In particular, he thought that classification was somehow peculiar to alphabetic peoples.

That was harder for me to swallow. It seemed to me that all of the gains the alphabetic societies could claim could be attributed simply to literacy. In some cases, it might be that alphabets are easier to learn, thus increasing literacy.

Paper Before Print

I only had my hands on Paper Before Print: The History and Impact of Paper in the Islamic World by Jonathan Bloom for a few days, so I had to read quickly. In that first reading, I didn't see anything that really clarified my question. It did point out that paper was much cheaper and more robust than papyrus.

However, I was hoping to find that paper arrived in the Arab world just before Muhammad, and so could explain in part how Islam spread so rapidly. No such luck: the Arab world got paper around 700-800 AD (significantly after Muhammad). The West didn't get paper until about 1200 AD.

Jonathan Bloom also wrote a good article on paper before print. It is available on the Web and shorter than the book.

