The first systematic attempt to study the implications of Jones's statement in detail was made by a Dane, Rasmus Rask, in an essay written in 1814, called An Investigation into the Origin of the Old Norse or Icelandic language. It was followed shortly afterwards by Franz Bopp's first major work, Concerning the Conjugation System of the Sanskrit Language in Comparison with those of the Greek, Latin, Persian and Germanic Languages. This was published in 1816, three years before a third scholar, Jacob Grimm, enlarged and further systematized Rask's statement in his German (i.e. Germanic) Grammar. By 1833, the techniques and data amassed by these three, supplemented by much extra information from contemporaries, led to the production of Bopp's comprehensive handbook, which was extremely popular and went through three editions: the Comparative Grammar of Sanskrit, Zend, Greek, Latin, Lithuanian, Gothic and German took nineteen years to prepare, and by its third edition incorporated Old Slavic, Celtic and Albanian.

The study began in an empirical way within its own field. However, it was not long before the comparative philologists shifted their focus of attention. From the comparative data which was being described, they began to deduce the features of a language which they assumed must have been in existence before the earliest records, and which would account for the similarities in the forms of, say, Sanskrit, Greek and Latin. The hypothesis that the languages were related produced the further hypothesis that they had a common source. At first it was thought that Latin and Greek had descended from Sanskrit (which had always held a patriarchal position in the eyes of European scholars); but further research indicated that all three languages were 'cognate' - that is, had a common ancestor, which, in this case, had not survived in any recorded form. Work thus began on determining this old language's characteristics.

Modern Romance languages, of course, showed a similar pattern. The similarities existing between certain words and forms in Italian, Spanish and French, for example, indicated clearly that they came from the same parent language. The fact that the three words for 'father' had so much in common (Italian padre, Spanish padre, French père) would be just one case in point among thousands. But one could then go further, and suggest that the word, as it stood in the original parent tongue, must have had a p in it, because this is a common factor in the three modern languages; similarly the presence of an r might be deduced; and if sufficient comparative work was done, the reason for the vowel discrepancy might become clear, and the parent language vowel determined. In such a way, one could arrive at an ancestor form pater - which in this case exists, in Latin. By studying a large number of such cases, dealing with more complex grammatical constructions as well as with letters, the totality of the Latin we know could be deduced, as it were, backwards.
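The first step of this reasoning, finding what the modern forms have in common, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it compares spellings rather than sounds, and the function names (lcs, shared_skeleton) are invented for the example. Folding a two-word comparison over the list yields letters that occur, in the same order, in every word; it is not a reconstruction procedure used by the philologists themselves.

```python
from functools import reduce


def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (dynamic programming)."""
    rows, cols = len(a) + 1, len(b) + 1
    # table[i][j] holds a longest common subsequence of a[:i] and b[:j]
    table = [[""] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + a[i - 1]
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1], key=len)
    return table[-1][-1]


def shared_skeleton(words):
    """Letters found, in order, in every word (pairwise fold of lcs)."""
    return reduce(lcs, words)


# The three Romance words for 'father' cited in the text.
cognates = {"Italian": "padre", "Spanish": "padre", "French": "père"}
skeleton = shared_skeleton(list(cognates.values()))
print(skeleton)  # the letters p, r and a final e are common to all three
```

The shared p and r are exactly the segments the text says can be attributed to the parent form; the vowel between them differs from language to language and needs the further comparative work described above.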

This reasoning, then, was applied to Latin, Greek and Sanskrit, which showed a very similar set of correspondences. Here the forms for 'father' were: Latin, pater, Greek, patēr, Sanskrit, pitar. The conclusion reached suggested that the parent language from which the three had derived would have had a word for 'father' of the form *pəter. The asterisk in front of this form, or any other in comparative philology, indicates that it is a reconstruction along these lines which is not attested in written records. (It is a different use of the asterisk from that usual in contemporary, synchronic linguistics, where it refers to an unacceptable usage.) In other words, a reconstruction is a kind of hypothesis based on the consideration of a multiplicity of examples taken from as many cognate languages as would seem to be relevant. (The theoretical status of these reconstructed forms has been a source of recent debate in historical linguistics.) Thus, in deducing the older word for 'brother' one might well wish to consider as part of the evidence Latin, frāter, Greek, phrātēr, Sanskrit, bhrātār, Old Church Slavonic, bratrǔ, Gothic, brōþar, Old Irish, brāthir, Old English, brōþor, and so on, from which one could arrive at a form *bhrāter. (Most large dictionaries provide 'etymological' information of this kind.)

A further eye-catching example of relationship is the present-tense indicative conjugation of the verb 'to be' in Latin, Sanskrit and Greek (the forms of the latter two languages have been transliterated for ease of comparison):

Latin    Sanskrit    Greek    PIE
sum      asmi        eimi     *esmi
es       asi         essi     *esi
est      asti        esti     *esti

The regularity of sets of phonetic correspondences existing in such listings was soon established beyond reasonable doubt, and provided the stimulus that led scholars to suggest lines of phonetic relationship between the languages. Thus, for example, the presence of an a vowel in the verb paradigm given above for Sanskrit corresponds systematically with an e vowel in the other two. The weight of evidence is therefore for an e vowel in the parent language (called Indo-European), the a having developed from this *e by some process of phonetic development. If this was so (and scholars assumed it was), the process could be described as 'Indo-European *e became Sanskrit a and Latin and Greek e'; or in a simpler shorthand: 'IE */e/ > Skt /a/, L. Gk. /e/.' Such formulas were worked out in great detail and became known as 'sound-laws'. This was originally a metaphorical use of the word 'law', because the formulas were only representing what were thought of as very strong tendencies for the sounds to behave in such ways; they were not originally considered as exceptionless laws at all - an attitude to be sharply reversed later by the neogrammarian school of comparative philologists, discussed below.
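A formula of this kind can be read as a rewrite rule applied to a reconstructed form. The Python sketch below is a hedged illustration only: the table SOUND_LAWS and the function apply_sound_law are invented for the example, the only rule encoded is the one quoted above (IE *e > Skt a, L. Gk. e), and the starred input forms are the conventional reconstructions assumed here for the verb 'to be'.

```python
# One entry per daughter language: how each Indo-European segment is reflected.
# Only the single law quoted in the text is encoded; all other segments are
# passed through unchanged, which is a deliberate simplification.
SOUND_LAWS = {
    "Sanskrit": {"e": "a"},
    "Latin": {"e": "e"},
    "Greek": {"e": "e"},
}


def apply_sound_law(proto_form: str, language: str) -> str:
    """Apply the language's rules to a reconstructed form such as '*esti'."""
    rules = SOUND_LAWS[language]
    segments = proto_form.lstrip("*")  # the asterisk only marks a reconstruction
    return "".join(rules.get(segment, segment) for segment in segments)


for proto in ("*esmi", "*esi", "*esti"):
    print(proto, "->", apply_sound_law(proto, "Sanskrit"))
```

Under these assumptions the rule turns *esti into asti for Sanskrit and leaves it as esti for Greek. Forms such as Latin est or Greek eimi would need further rules that the sketch does not attempt, which is in keeping with the early view of sound-laws as strong tendencies rather than complete accounts.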

The original parent language, then, was gradually reconstructed word by word, as far as the evidence of the written remains allowed, and is now called Proto-Indo-European (or PIE for short). It was assumed that it was being spoken before 3000 B.C. The languages which developed from it were thus called the Indo-European 'family' (according to the then popular model of linguistic description), and the relationship of one member language to another was described in kinship terms: Sanskrit, Greek and Latin were 'sister-languages'. Such a procedure accounted satisfactorily for the interrelationships of most European languages (Basque being an odd exception) by postulating various sub-families or groups of languages within the main family that had some kind of common structural core. From the practical point of view, the Indo-European family was given more attention by scholars than other families, partly because there was more written material available in the daughter-languages, and partly because the users of the languages within the Indo-European complex were politically and economically more important. This is, of course, by no means to say that a language of Indo-European stock is somehow intrinsically 'better' than any other: a culture may be more powerful than another, but this does not affect the status of the languages used, which are, linguistically speaking, of equal standing (cf. p. 70).

Thus, the major language 'families' of the world were suggested, and those of Europe given more detailed treatment: the Germanic family, the Romance, the Balto-Slavic, the Celtic, the Hellenic, and so on. The figure above shows one way, using the family tree model, of relating the languages of the Germanic group, of which English is a member. We should remember, of course, that arbitrarily to date and label language states with different names, as is usual in comparative work, is a distorting process. Language is continually in a state of flux; it is always changing. Latin did not suddenly become French overnight, nor Old English Middle English; and it is impossible to pin down the exact moment when any two dialects diverged to the point of unintelligibility, at which point we say they are separate languages (though there has been one movement in recent years which has tried to do just this - the technique variously called 'lexicostatistics' or 'glottochronology'). The names given to the different language states postulated in the comparative method are averages, approximations only, as are the dates. Transitional periods are always present between two arbitrarily determined states.

This was the genealogical method of classification. At the same time, attempts were being made to produce an alternative classification of languages, the typological, wherein each language would be placed according to its major structural characteristics. This procedure was first proposed in detail by A. von Schlegel in 1818. He suggested that there were three kinds of language that could be characterized in this way: at one extreme there were analytic (or isolating) languages, such as Chinese, which have no inflections; at the other extreme there were synthetic (or inflectional) languages, such as Greek or Sanskrit; and in between there were agglutinative (or affixing) languages, such as Turkish or Korean, which string verbal elements together in long sequences. Despite the fact that most languages fell between these points, Schlegel's theory had many adherents. It did provide a comprehensive standpoint at least, and showed that there were general structural tendencies in the ways languages indicated relationships. Its main advantage, too, was that it did not require a vast quantity of textual analysis before making its statements, and was therefore a more useful tool than the genealogical model for classifying languages which had no written records. The two techniques are not incompatible, of course; but for many reasons the genealogical approach remained more popular, and despite support by Sapir and other linguists, no satisfactory broad classification has yet been produced on typological principles.