Version of 2016-09-23



Grzegorz Jagodziński

Indo-European and Semitic languages


There was a time in scholarship when it was quite seriously thought that the first proto-language, the language from which all others originated, was Hebrew. A remnant of that view is the opinion that a particularly close genetic relation exists between the Indo-European (IE) and Semitic languages. Such a view can still be found in some works. Newer investigations suggest very strongly that it is not correct, and that the similarities demonstrated so far between the two language families result from contacts between them over thousands of years rather than from a common origin. Nevertheless, those similarities are curious, and the circumstances of their development are not clear in all respects.

Genetic relations

A number of modern and extinct languages are classified as Semitic. According to the newest classification, four subgroups are distinguished: North, East, West and South Semitic. The northern group comprises languages which were used in the 3rd and 2nd millennia BC: Palaeosyrian (including Eblaite and Mari), Amorite, and Ugaritic. The eastern group is represented by three languages that were in use one after another: Akkadian, Assyro-Babylonian (with Babylonian and Assyrian dialects) and Late Babylonian (used only in written form). Within the western group, Canaanite, Aramaic and Arabic subgroups are distinguished. The first comprises Old Canaanite, Hebrew (still in use in its modern form), Phoenician (with Punic), Ammonite, Moabite and Edomite; the second comprises Early Aramaic, Imperial Aramaic, Standard Literary Aramaic, Middle Aramaic, Western Late Aramaic, Eastern Late Aramaic (with Syriac) and Neo-Aramaic; the third comprises Pre-Islamic Arabian (with Lihyanite, Nabatean, Thamudic, Safaitic, Hasean), Pre-Classical Arabic, Classical Arabic, Middle Arabic and Modern Arabic (with Maltese). Finally, the southern group divides into South Arabian and Ethiopic. Within South Arabian, there are several epigraphic languages: Sabaic, Minaic, Qatabanic, Hadramitic, Awsanic, Himyaritic, as well as some modern languages: Mehri, Sheri (Śḥeri), Soqotri. Within Ethiopic there are two further subgroups. The North Ethiopic group contains extinct Geʽez and extant Tigre and Tigrinya, while the South Ethiopic group contains Amharic, Argobba, Gafat, Harari, and Gurage.

The Semitic languages are grouped first of all not with the Indo-European languages but with Hamitic. This name denotes a rather artificial group consisting of Egyptian (together with Coptic), the Berber languages (e.g. Kabyle, Tamazight, Shilha, or Tamashek = Tamahaq, the language of the Tuaregs), Chadic (e.g. Hausa), Cushitic (e.g. Somali, Danakil = Afar, Bedja = Beja = Bedawye, and also Oromo = Galla) and Omotic (e.g. Kaffa). Together with Semitic, all these languages form the Afro-Asiatic family (AA), previously also called Hamito-Semitic.

The unquestionable genetic relation of the Semitic languages to Egyptian and the other languages called Hamitic is the main reason for rejecting the thesis of a close Semitic-Indo-European genetic relation. Moreover, the lexical and grammatical features which bring the Indo-European and Semitic languages closer together are, as a rule, not present in other AA languages. Consequently, it is emphasized that even if some features are shared by the Semitic and IE languages, the Proto-Afro-Asiatic language (PAA) was not similar to Proto-Indo-European (PIE) at all.

Moreover, there exists the view that all four large language families of Africa, i.e. Afro-Asiatic, Nilo-Saharan (e.g. Songhai, Dinka, Masai, Shilluk, Kanuri), Niger-Kordofanian and Khoisan (languages of the Bushmen and the Hottentots), are genetically related. Special relations are said to exist between Nilo-Saharan and Niger-Kordofanian, which are united in one group called Zinj. Certain grammatical features of the Hamitic and also the Semitic languages can be found in the Zinj languages as well, e.g. in Ful (Fulani, Fulbe), which is usually counted among the West Atlantic group of the Niger-Congo languages (a subgroup of the Niger-Kordofanian languages), and which is termed Hamitic in some works.

There also exists a probably more plausible view that the Khoisan languages can be contrasted with all the other languages of the world, and that the Zinj languages have more in common with the languages of Australia and south-eastern Asia than with the AA languages. The similarities mentioned above are then explained as follows: there was once a period (ca. 20,000 BC) when climatic conditions forced the speakers of the languages of the four groups mentioned (plus the Shabo language, considered to be an isolate) to live in a refugium on a fairly small territory in north-eastern Africa. The common features of their languages then had a chance to come into being.

A. Starostin and many other investigators have produced very interesting analyses of language relations. They can be found on the website called the Tower of Babel, of which a newer, expanded version is also available. The authors of the Tower of Babel not only separate the AA languages from IE but also remove them from the Nostratic languages: AA is treated as a sister group not only of the Nostratic languages as a whole, but also of the Dene-Caucasian languages.

And so, if the Indo-European and Semitic languages had a common ancestor, it was only in the very distant past. The IE protolanguage surely existed ca. 4,000 BC. It is supposed that the Nostratic community must have existed 11,000–15,000 BP. At the same time, the common ancestor of, among others, the Indo-European and Semitic languages should have existed ca. 25,000 BP. It is no wonder that very few traces of that distant ancestor have survived until today, and most of the similarities between the two groups should be explained by parallel development and mutual interaction.

Similarities between the two language families

The two language families share a number of similarities in various areas of language. Interestingly, the share of common features seems to grow with time. In other words, the Semitic languages show more features in common with the IE family today than they did several thousand years ago.


Perhaps the best-known common feature of the Indo-European and Semitic languages is a more or less developed system of vowel alternations in inflexion and in word-formation, called apophony or ablaut. It should be emphasized that this phenomenon is rare in other language groups, or at least less pervasive.

Indo-European languages

We have two kinds of vowel alternations here. One of them is the qualitative alternation. In various forms of a given morpheme in various words, and even in various forms of a given word in its inflexion, the e or o vowels can be seen. In the Slavic languages the e : o alternation is present first of all in word-formation. It can be seen e.g. in the pair of Polish verbs wieźć : wozić ‘to be carrying, transporting : to transport usually’ < *vezti : voziti < IE *weǵh- : *woǵh-.

The Germanic languages have preserved the Indo-European qualitative alternation in the conjugation of strong verbs. The -e- degree is present in the Present Tense while the -o- degree is present in the Past Tense, e.g. in Eng. drive : drove from IE *dhreibh- : *dhroibh-. The same type of apophony can be seen in Greek in the Present and the Perfect: leípō ‘I leave’ : léloipa ‘I have left’. The latter form has the first consonant of the stem repeated; this is surely the result of an irregular reduction of a form with full duplication of the stem, i.e. *loiploipa, with intensive meaning. The origin of the qualitative alternation in Indo-European has not been explained, however, and the phenomenon must be very old.

The second type of apophony is the quantitative alternation. There are two types of it: the disappearance (or reduction) of the vowel and the morphological lengthening of the vowel. A short vowel (e or o) could disappear, when it lost stress in inflexion or word-formation.

An example of disappearance can be seen in Polish suchy : schnąć ‘dry : to become dry’ < Proto-Slavic *suxъ : *sъxnǫti < IE *sóusos : *sus-nóm-. It is worth noting, by the way, that the old alternation o : 0 (zero) had the shape ou : u (reduction of a diphthong), but in Polish it has changed into a new quantitative alternation u : 0 (disappearance of the vowel). The quantitative alternation also occurred in inflexion, e.g. in Greek páter! ‘father!’ : patrós ‘father’s (genitive)’ (e : 0).

Stop consonants, especially word-initially, prevented the complete reduction of the vowel, because the resulting consonant cluster would have been too hard to pronounce. As a result, reduced vowels developed. In particular IE language branches they became identical with normal short vowels or changed into -a-. In various Indo-European languages we can observe vacillation. E.g. Polish ziemia, Latin humus, and Greek khthōn < *dhǵhem-, *dhǵhom-, with disappearance of the vowel, can be contrasted with Hittite tekan < *dhĕǵhom-, which has the vowel preserved. It is possible, however, that the vowel in Hittite was inserted later to simplify the pronunciation.

Another kind of reduced vowel (marked with the letter ‘schwa’, ə) developed when a naturally long vowel was reduced. Such naturally long vowels originated from a group consisting of a vowel and a laryngeal consonant. In the later history of the IE languages that reduced vowel disappeared or developed into -a-, while in Indic it developed into -i-. E.g. the IE (phase A) word *paHtérs ‘father’ developed into *pəters in IE (phase B), therefore Greek patḗr but Sanskrit pitā, pitar-. Similarly IE A *dhougoHtérs > IE B *dhug(h)əters, therefore Greek thygátēr, Sanskrit duhitar- and Slavic *dъkti, *dъkter- > *dъtji, *dъtjer-, from which come Russian doch and Polish córka. Cf. also Latin faciō ‘I make’ : fēcī ‘I made’ < *dhə- : *dhē- (cf. Eng. do and Polish dziany, dziać ‘knitted, to knit’, dzieje ‘history’, literally ‘things which were made’, dziać się ‘to happen’).

IE vowel reduction is not only a phenomenon of the very distant past. For instance, other, secondary vowel reductions occurred in the Balto-Slavic languages. The reduced ĕ merged with i, and the reduced ŏ merged with u. In Common Slavic the so-called yers originated from those vowels. They were ultra-short i, u. In scientific works they are usually marked with the Cyrillic letters ь, ъ respectively. Similar reductions took place even in stressed syllables. This was perhaps because the stress shifted its position over time and caused reductions of particular syllables one by one. E.g. alongside pekǫ ‘I bake’ < *pékuōm the Imperative pьci developed (today it is secondarily piecz in Polish, after pieczesz ‘you (sg.) bake’) < *pikúis < *pekúis < *pekuŏis (old Optative).

The second type of quantitative change was the lengthening of short vowels in some morphological formations, e.g. those marking frequent actions. In this way the alternations e : ē, o : ō arose. In some grammatical forms in Sanskrit, the syllabic r, which had developed as a result of the reduction of the groups er, or, was also lengthened secondarily: r̥ : r̥̄.

The process of morphologic lengthening is still alive in Polish, in a specific way. And so, the short IE -o- still exists unchanged, while the lengthened -ō- is continued as -a-. Hence pomogę : pomagam ‘I will help : I (usually) help’ < *-mogh- : *-mōgh-, chodzić : chadzać ‘to walk : to walk usually’, pogorszyć : pogarszać ‘to make worse : to make worse usually’, przerobić : przerabiać ‘to remake : to remake usually’, and, in the spoken language, also wyłączyć : wyłączać ‘to switch off : to switch off usually’, irregularly pronounced [wyłanczać] instead of the literary form [wyłonczać].

The Semitic languages

It is interesting that in the Semitic languages we can find not only counterparts of almost all types of the IE ablaut, but also that the function of particular alternations seems to be similar in some cases. Qualitative alternations (originally in the shape a : i : u) and quantitative alternations (reduction and lengthening) are so frequent in this group of languages that only the consonantal skeleton of a word is considered to be the root (it consists of 3 consonants as a rule). Perhaps this feature of the Semitic languages made it possible to develop alphabetic writing, but that is quite a different story…

The Semitic verb has 2 or 3 tenses and a number (even a dozen or so) of themes which make up the so-called derived conjugations. Some of the forms are conjugated for number, person and gender as in the IE languages, i.e. by adding appropriate endings. Other forms have prefixes instead: elements which have a similar function to endings, but which are added at the beginning of the form (before the word-formation prefixes, if any). In some forms, both endings (usually marking gender) and prefixes (marking person) are present simultaneously. We can analyse the paradigm of the Akkadian verb kašādu ‘win’ in the Present Tense. Prefixes and endings are separated from the stem by hyphens:

  singular                               plural
  a-kašad     I am winning               ni-kašad     we are winning
  ta-kašad    you (sg. m) are winning    ta-kašad-u   you (pl. m) are winning
  ta-kašad-i  you (sg. f) are winning    ta-kašad-a   you (pl. f) are winning
  i-kašad     he is winning              i-kašad-u    they (m) are winning
  ta-kašad    she is winning             i-kašad-a    they (f) are winning
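
The way prefixes and endings combine in the paradigm above can be sketched in a few lines of Python. This is only an illustration: the table of affixes is copied from the paradigm, while the names `PRESENT_AFFIXES` and `conjugate_present` are hypothetical and chosen for this example.

```python
# (person, number, gender) -> (prefix, ending), copied from the paradigm above.
# The dictionary and function names are hypothetical, for illustration only.
PRESENT_AFFIXES = {
    ("1", "sg", None): ("a", ""),
    ("2", "sg", "m"): ("ta", ""),
    ("2", "sg", "f"): ("ta", "i"),
    ("3", "sg", "m"): ("i", ""),
    ("3", "sg", "f"): ("ta", ""),
    ("1", "pl", None): ("ni", ""),
    ("2", "pl", "m"): ("ta", "u"),
    ("2", "pl", "f"): ("ta", "a"),
    ("3", "pl", "m"): ("i", "u"),
    ("3", "pl", "f"): ("i", "a"),
}


def conjugate_present(stem, person, number, gender=None):
    """Attach the person prefix and the gender/number ending to the stem."""
    prefix, ending = PRESENT_AFFIXES[(person, number, gender)]
    return prefix + stem + ending


if __name__ == "__main__":
    for key in PRESENT_AFFIXES:
        print(key, conjugate_present("kašad", *key))
```

The prefix alone distinguishes person, while the ending alone distinguishes gender and number in the 2nd and 3rd persons, which is why the table needs both columns.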

The real number of forms of the Semitic verb is even greater because endings which mark the object of the action are frequently added. For instance in the 3rd person masculine singular together with ikašad ‘he is winning’ some other forms are possible: ikašadani ‘he is winning me’, ikašadaka ‘he is winning you (sg. m)’, ikašadaki ‘he is winning you (sg. f)’, ikašadšu ‘he is winning him’, ikašadši ‘he is winning her’, ikašadannaši ‘he is winning us’, ikašadkunuši ‘he is winning you (pl. m)’, ikašadkinaši ‘he is winning you (pl. f)’, ikašadaššunu ‘he is winning them (m)’, ikašadaššina ‘he is winning them (f)’.

Vocalization (the quality of the vowels in a given word) depends on whether the verb is transitive or intransitive. In the Perfect form of the Semitic verb, which often marks the Past Tense and which uses endings rather than prefixes, transitive and some intransitive verbs had the vowel -a- after each consonant of the root (this is still so in Arabic, which has changed relatively little). E.g. in the 3rd sg. m. (where there is no ending) Arabic has kataba ‘he wrote’, galasa ‘he sat’, faˁala ‘he did’ etc. In modern pronunciation galasa is ǯalasa, i.e. [jalasa], but the Egyptian dialect has preserved the original pronunciation g; more on Semitic pronunciation at the end of the article. Intransitive verbs have -i- or -u- after the second consonant: ḥazina ‘he was sad’, ḥasuna ‘he was beautiful’.

Alongside endings, the Imperfect also has prefixes which denote person. There is no vowel after the first consonant of the root, and -u follows the last one if there is no ending. The vowel of the prefix and the vowel after the second consonant are determined by the rules of the grammar. The respective forms of the above Arabic verbs are jaktubu ‘he is writing’ (note that j has the same value as in the International Phonetic Alphabet, i.e. that of English y), jaglisu ‘he is sitting’, jafˁalu ‘he is making’, jaḥzanu ‘he is sad’, jaḥsunu ‘he is beautiful’. In Arabic there also exists the Passive Voice. It is formed with vowel changes only: kutiba ‘it was written’, juktabu ‘it is written’.
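
The root-and-vowel patterns described for the 3rd person masculine singular can be sketched as follows. This is a minimal illustration, not a grammar of Arabic: the function names are hypothetical, the root is passed as a tuple of three consonants, and the vowel after the second consonant is supplied by the caller because, as stated above, it is determined by the rules of the grammar.

```python
# Minimal sketch of the 3rd sg. m. patterns quoted in the text.
# All function names are hypothetical; a root is a tuple of three consonants.

def perfect_3sg_m(root, theme_vowel="a"):
    """Perfect: a vowel after every root consonant, e.g. k-t-b -> kataba."""
    c1, c2, c3 = root
    return f"{c1}a{c2}{theme_vowel}{c3}a"


def imperfect_3sg_m(root, theme_vowel):
    """Imperfect: prefix ja-, no vowel after the first consonant, final -u."""
    c1, c2, c3 = root
    return f"ja{c1}{c2}{theme_vowel}{c3}u"


def passive_perfect_3sg_m(root):
    """Passive Perfect: same skeleton, vowels u-i, e.g. kutiba."""
    c1, c2, c3 = root
    return f"{c1}u{c2}i{c3}a"


def passive_imperfect_3sg_m(root):
    """Passive Imperfect: prefix ju-, vowel a after the second consonant."""
    c1, c2, c3 = root
    return f"ju{c1}{c2}a{c3}u"


if __name__ == "__main__":
    ktb = ("k", "t", "b")
    print(perfect_3sg_m(ktb), imperfect_3sg_m(ktb, "u"))
    print(passive_perfect_3sg_m(ktb), passive_imperfect_3sg_m(ktb))
```

Only the vowels and the prefix change between the four forms; the consonantal skeleton k-t-b stays fixed, which is the sense in which the consonants alone are treated as the root.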

The system of derived conjugations in Arabic is quite complex. They express various meanings which are only partially predictable, even if linked to the meaning of the basic verb. And so, the second theme of the verb, the intensive, expresses an act which is carried out with special intensity. It is typified by duplication of the second consonant of the root: qatala ‘he killed’ : qattala ‘he murdered (many), he made massacre’, juqattilu ‘he is making massacre’. It seems that forms of that kind could have developed much like the IE Perfect, by full duplication of the root followed by irregular reduction of the resulting form. The third theme is the conative, which is formed by lengthening the first vowel of the root, and which denotes an act aimed at a specified person, e.g. qātala ‘he fought (with somebody)’, juqātilu ‘he is fighting (with somebody)’. Although morphological lengthening, which is well known in IE, is present here, it is hard to find closer parallels. The fourth theme is the causative, with the prefix ˀ- < *š- (irregularly), which merges with the prefix in the Imperfect, e.g. ˀaqtala ‘he caused sb. to kill’, juqtilu ‘he causes sb. to kill’. Other themes express derived conjugations of the Middle Voice and are formed with the prefix n- or t- (which becomes an infix in one of the forms). Other Semitic languages have, for the most part, similar systems of tenses and themes.
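
The three derived themes just described can be written as templates over the root consonants. The sketch below is illustrative only: the template notation (digits 1, 2, 3 standing for the root consonants), the dictionary `THEMES` and the function `apply_template` are assumptions made for this example, and only the 3rd sg. m. forms quoted in the text are covered.

```python
# Digits 1-3 in a template stand for the root consonants; everything else is
# copied literally. The notation and all names are hypothetical.
THEMES = {
    # theme: (Perfect 3rd sg. m., Imperfect 3rd sg. m.)
    "intensive": ("1a22a3a", "ju1a22i3u"),  # second consonant doubled
    "conative": ("1ā2a3a", "ju1ā2i3u"),     # first vowel lengthened
    "causative": ("ˀa12a3a", "ju12i3u"),    # prefix ˀ-, fused with ju- in the Imperfect
}


def apply_template(root, template):
    """Fill the digit slots of a template with the consonants of the root."""
    return "".join(
        root[int(ch) - 1] if ch in ("1", "2", "3") else ch for ch in template
    )


if __name__ == "__main__":
    qtl = ("q", "t", "l")
    for name, (perfect, imperfect) in THEMES.items():
        print(name, apply_template(qtl, perfect), apply_template(qtl, imperfect))
```

Writing the themes as data rather than as separate functions makes it easy to see that each theme differs from the others only in doubling, lengthening or prefixation around an unchanged consonantal root.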

In the oldest known Semitic language, Akkadian, a different tense system existed. The Present Tense had prefixes like the Imperfect in other Semitic languages, but its vocalization was as in the Perfect (e.g. ikašad ‘he is winning’). The Preterite (Past) Tense also had prefixes, and it was formally identical with the Imperfect of other Semitic languages (which functions as Present and Future there), e.g. ikšud ‘he won’. With the particle l(u)- it also expressed a wish, and hence the future: likšud ‘may he win’. The vowels in the Preterite varied depending on the verb, e.g. imlik (malaku ‘to advise’), ilmad (lamadu ‘to learn’). Finally, the third tense, the Permansive, expressed an instant or repeated act. It had endings and resembled the Perfect Tense, though with a different vocalization, e.g. kašid ‘he wins’ (before an ending the -i- disappears: kašdaku ‘I win’). The Akkadian verb also had 4 main and 6 secondary themes. The Conative with the lengthened vowel is absent, but equivalents of the other Arabic themes can be found.

In the word-formation of substantives and adjectives in Semitic, many patterns of vowel change exist, while suffixes and prefixes are used quite seldom. The formation of the plural in some languages, especially in Arabic, is also highly irregular. Several examples:

In other Semitic languages the plural number is formed much more regularly but traces of irregular forms with vowel alternations remained, e.g. in Hebrew melek̲ ‘king’ – məlāk̲îm ‘kings’. This phenomenon has no parallels in IE.

Other phonetic peculiarities

In the Semitic languages we can find some features which are also present in IE but are hard to find in other Nostratic languages: Uralic, Altaic, Dravidian and Kartvelian. They could be evidence of a distant relation between the two families, but they could equally well point to later mutual influence.

Labialized phonemes

For the common IE period 3 labialized phonemes are usually reconstructed: kʷ, gʷ, gʷh. We can add the so-called laryngeal H3 (written Hʷ here), probably having the value of a labialized velar spirant. In some reconstructions labialized prevelars (or palatals) ḱʷ, ǵʷ, ǵʷh, and also dentals sʷ, tʷ, dʷ, dʷh are mentioned.

Some data suggest the existence of labialized velar and uvular phonemes in Proto-Semitic: kʷ, gʷ, qʷ, ḫʷ. At the same time there is hardly any evidence for similar reconstructions in other Nostratic languages. For South Semitic groups of the type kʷe, gʷe, qʷe, ḫʷe we can usually find the syllables ku, gu, qu, ḫu in other languages; analogously, gy in Greek dialects corresponds to common IE gʷe.

Guttural consonants

A characteristic feature of the Semitic languages is the presence of guttural consonants: pharyngeal ḥ, ʕ and laryngeal h, ʔ. These consonants influence the vowels in their neighbourhood. For instance, in Arabic there exists a rule that if the vowel a is present in the Perfect, then i or u appears in the Imperfect, e.g. galasa ‘he sat’ : jaglisu ‘he is sitting’, qatala ‘he killed’ : jaqtulu ‘he is killing’. But if the verb contains a pharyngeal consonant, a may remain unchanged in the Imperfect: faˁala ‘he made’ : jafˁalu ‘he is making’. Moreover, in modern Arabic the vowel a is usually similar to [æ] or even [e], but in the vicinity of a pharyngeal the pronunciation [a] is preserved. Similar (even if not identical) rules are also in force in other Semitic languages.

It is interesting that in many Semitic languages uvular ḫ, ġ (especially ġ) and pharyngeal ḥ, ʕ merge, and sometimes the uvular q is replaced with the glottal stop ʔ (as in the modern Egyptian dialect of Arabic) or with the velar k (as in some varieties of modern Hebrew). On the other hand, pharyngeal consonants (like laryngeal ones) have repeatedly shown a tendency to disappear from the system of the language in the course of history. The oldest known Semitic language, Akkadian, lacks them entirely except for the glottal stop ʔ in some instances, and the only traces of those consonants are changes in neighbouring vowels as well as the length of the preceding vowel. In Phoenician, too, guttural consonants disappear with time, and in the Neo-Punic dialect they are entirely absent. Similar examples of disappearance and lengthening can be found in other Semitic languages, e.g. Hebrew rōš (rôš) ‘head’ comes from *raˀš- (cf. Arabic raˀsun).

All the phenomena described above are thought to be characteristic of Proto-IE as well. For that language the consonants H1, H2 and H3 are reconstructed. They may have originated through the merger of uvular and pharyngeal consonants. Those consonants influenced the quality and the length of vowels: e, a, o developed from the groups H1e, H2e, H3e respectively, while ē, ā, ō developed from the groups eH1, eH2, eH3 before a consonant or word-finally. The ordinary a in other positions must have disappeared from the Proto-IE system very early; perhaps it merged with the old e, which resembles the development in Arabic. The example of Polish żywy, Latin vīvus ‘alive’ vs. English quick < IE *gʷīHʷ-o-, *gʷigʷ-jo- seems to attest a vacillation in the development of uvular consonants similar to the history of Semitic q in modern Hebrew and the Arabic dialects. A vacillation similar to that of Arabic ˁanzatun ‘goat’ compared with Akkadian azzatu, ḫazzatu can be found in IE: Polish koza < IE *koǵā but Sanskrit ajā́ < IE *oǵā.
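
The two vowel rules stated above (H1e, H2e, H3e giving e, a, o, and eH1, eH2, eH3 giving ē, ā, ō before a consonant or word-finally) can be sketched as string rewrites. This is a toy illustration under several assumptions: the function name `apply_laryngeals` is hypothetical, forms are written in the plain notation H1, H2, H3 used in the text, the set of letters counted as vowels is a simplification, and the input strings in the usage example are schematic shapes rather than attested reconstructions. Interactions between the two rules within one syllable are not modelled.

```python
import re

# Laryngeal index -> resulting vowel, as stated in the text.
SHORT = {"1": "e", "2": "a", "3": "o"}  # H1e, H2e, H3e
LONG = {"1": "ē", "2": "ā", "3": "ō"}   # eH1, eH2, eH3


def apply_laryngeals(form):
    """Apply the two vowel rules to a schematic form such as 'dheH1'."""
    # eH1/eH2/eH3 before a consonant or word-finally -> long vowel.
    # The negative lookahead (a simplified vowel list) blocks the rule
    # when a vowel follows the laryngeal.
    form = re.sub(r"eH([123])(?![aeiou])", lambda m: LONG[m.group(1)], form)
    # H1e/H2e/H3e -> short vowel with the quality set by the laryngeal.
    form = re.sub(r"H([123])e", lambda m: SHORT[m.group(1)], form)
    return form


if __name__ == "__main__":
    for schematic in ("H1es", "H2eg", "H3ed", "dheH1", "steH2t"):
        print(schematic, "->", apply_laryngeals(schematic))
```

The point of the sketch is only that the laryngeal determines the quality of the neighbouring e, and that a following laryngeal additionally lengthens it, which parallels the effect of the Semitic gutturals on adjacent vowels described earlier.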
