Native American language families
Native American language families encompass a diverse array of languages spoken by Indigenous peoples across North America and beyond. Scholars propose that these languages evolved from a small number of ancestral languages, which diversified as migrants spread through the continent after their arrival from Asia via the land bridge known as Beringia. The complexity of these languages has led to various classification systems, primarily through typology and genetic relationships. Typological classification focuses on structural similarities, while genetic classification seeks historical connections among languages, tracing their evolution from common ancestors.
The primary language families identified include Eskimo-Aleut, Na-Dene, and Amerind, with the latter being the most extensive and contentious, comprising hundreds of languages. Over the years, linguistic research has faced debates regarding the number of families and their interrelationships, often influenced by differing methodologies among scholars. Despite the challenges, there is a consensus that American Indian languages reflect significant cultural diversity and historical migrations, with ongoing studies revealing potential connections to Old World languages. Understanding these language families is crucial for appreciating the rich cultural heritage and identities of Indigenous peoples in the Americas.
Tribes affected: Pantribal
Significance: A language family’s existence indicates that its member languages have descended from a common, ancient source; that fact helps scholars reconstruct the origins and kinship of tribes
Anthropologists believe that humans first reached North America via a land bridge called Beringia that intermittently connected Alaska and Siberia between twenty thousand and five thousand years ago. They came in a series of migrations, some separated by thousands of years, and (the theory holds) each migrating group spoke a single language. As a group slowly spread through North America and perhaps into Central and South America, it fragmented into subgroups that settled different areas along the way. Many subgroups lost contact with one another. The original language the group spoke changed, because all languages evolve, and it changed at different rates and in different manners among the subgroups as each developed a distinct culture.
![Map of the Amerind language. By Spesh531 (File:Primary Human Language Families Map.png) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/99109932-94907.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)

Soon subgroups spoke mutually unintelligible versions of the ancestral tongue; in other words, each had its own language. So disparate had the descendant languages become that when Europeans arrived on the American continents in the late fifteenth and early sixteenth centuries, they encountered what seemed to them a bewildering variety of languages radically unlike their own.
Typology and Genetic Classifications
Yet despite the apparent diversity, underlying relationships exist among the languages. There are basically two ways to describe a linguistic relationship. The first, called typology, classifies languages based on structural similarities. Soon after American linguistics began, scholars noted that most Indian languages are polysynthetic (or incorporative), a type that combines major grammatical features into single words. In this sense, New World languages seemed distinct from all other languages then known. Typology, however, does not necessarily prove historical kinship. For example, according to typological criteria, English is more like Japanese than it is like German, to which English has a known historical connection.
The second method, genetic classification, hunts for these historical connections. Historical and comparative linguists analyze languages to discover features that can only have been inherited from the same source. When they find similar pronunciations, words and affixes, and grammatical features among two or more languages that cannot be explained by coincidence or by borrowing, these languages must share a family relationship—a genealogy—just as organisms descended from the same parent share physical traits. Linguists often use the metaphor of a tree to characterize the relationships: An ancestral language (also called a “proto” language) splits into branches, each branch into sub-branches, and sub-branches into separate languages. The term “family” refers collectively to the descendants of the ancestral language, which lends its name to the family. A grouping of multiple families is called a superfamily or phylum.
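The tree metaphor can be made concrete with a small data structure. The following is a minimal sketch, not a linguist's tool: the `Node` class and its methods are hypothetical names chosen for illustration, and the sample data is limited to the Na-Dene languages this article names (only four of the Athapaskan languages are listed).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a family tree: a family, a branch, or a single language."""
    name: str
    children: list["Node"] = field(default_factory=list)

    def languages(self) -> list[str]:
        """Return the names of all leaves (individual languages) under this node."""
        if not self.children:
            return [self.name]
        leaves: list[str] = []
        for child in self.children:
            leaves.extend(child.languages())
        return leaves

    def outline(self, depth: int = 0) -> list[str]:
        """Return an indented outline, one line per node, parents before children."""
        lines = ["  " * depth + self.name]
        for child in self.children:
            lines.extend(child.outline(depth + 1))
        return lines


# Illustrative subset only: the Na-Dene languages named in this article.
na_dene = Node("Na-Dene", [
    Node("Haida"),
    Node("Tlingit"),
    Node("Eyak"),
    Node("Athapaskan", [
        Node("Chipewyan"),
        Node("Beaver"),
        Node("Apache"),
        Node("Navajo"),
    ]),
])

if __name__ == "__main__":
    print("\n".join(na_dene.outline()))
    print(na_dene.languages())
```

In this representation a "family" is simply every leaf reachable from one ancestral node, which is why a dispute over classification amounts to a dispute over where the branches join.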
Even if the parent language no longer exists, its living offspring reveal much of its nature. By using modern evidence to reconstruct an ancient tongue’s sounds, words, and grammar, linguists offer potential evidence of humankind’s prehistoric character, evidence parallel to the ruins and middens studied by archaeologists and the skeletal remains studied by paleontologists. Since the early nineteenth century, reapplying linguistic methods developed during the study of the Indo-European languages, scholars have had notable success; many American Indian languages do indeed belong in families. Yet a number of topics—how many families, which languages belong in each, and what the families say about the original settlement of the Americas—have remained controversial from their beginnings.
History of Classifications
In A Guide to the World’s Languages (1987), Merritt Ruhlen lists 627 Indian and Eskimo languages in the Americas, many of which are extinct and known only from short word lists that European explorers compiled. Although their methods were often crude, these explorers were the first contributors to American linguistics. The first formal studies of individual North American languages appeared in the mid-seventeenth century: Roger Williams’ Narragansett phrase book in 1643 and John Eliot’s Natick grammar in 1666. As European colonists moved westward and more Indian languages became known, affinities among them led to speculations about their relationships. Thomas Jefferson, for example, wrote in 1789 that a common parentage might become apparent from a study of Indian vocabularies and suggested New World languages may have a kinship to Asian languages, an idea that scholars began exploring seriously in the late twentieth century.
Attempts to define the genetic relationship of American Indian languages began in the first half of the nineteenth century. The first comprehensive study came from Albert Gallatin in 1836 (revised and expanded in 1848). Gallatin, a former secretary of the treasury, distributed a questionnaire to Indian language experts nationwide, soliciting information on six hundred words and some grammatical features. Gallatin made his classification by systematically comparing the responses. He grouped all North American languages, except those of California, into thirty-two families.
Gallatin’s classification remained the standard until 1891, when separate studies by Daniel Brinton and John Wesley Powell appeared. Brinton, who included all the languages in both North and South America about which he could get information, perceived a fundamental unity behind them, although he separated them into about eighty families for each continent in The American Race. Powell, as director of the Bureau of American Ethnology and a founder of the American Anthropological Association, had access to much more information than Brinton did; he also had a staff of linguists to help him. His article in the bureau’s seventh annual report, however, treated only those languages north of Mexico. Based on comparisons of vocabulary, Powell and his staff distinguished fifty-eight language families and isolates (languages which do not show kinship to other languages). The report served as the basis for subsequent investigations in North American linguistics well into the twentieth century, while Brinton’s book did much the same for the languages of South America.
Twentieth-century American linguistics has been divided by a dispute over methods, a dispute that gradually arose between Columbia University anthropologist Franz Boas and several former students, principally Edward Sapir. Boas collected and analyzed information on a remarkable number of Indian languages, and early in his career he suggested that structural similarities among some languages bespoke a common origin. Later he changed his mind about the validity of genetic groupings and criticized the findings of his students. Those students, collecting and assessing languages on their own, especially in California, worked to classify them in ever larger families. In an influential 1929 Encyclopædia Britannica article, Edward Sapir tentatively proposed six families for all of North America and parts of Mexico and Central America because of similarities in vocabulary and grammar: Eskimo-Aleut, Algonquian-Mosan, Na-Dene, Penutian, Aztec-Tanoan, and Hokan-Siouan. Specialists in individual families denounced Sapir’s broad classifications, some claiming that the resemblances he cited were purely fanciful and others faulting him for not distinguishing adequately between coincidental similarities, borrowings, and true cognates when he compared vocabulary items. The controversy persisted through the rest of the century; traditionalist linguists, in the spirit of Boas, resisted large-scale classifications and argued with reductionists, who followed Sapir in proposing large families. The two sides were somewhat facetiously known as “splitters” and “lumpers.”
Traditionalist Classification
In their introduction to The Languages of Native America (1979), Lyle Campbell and Marianne Mithun, rejecting the simple vocabulary comparisons of reductionists, listed three criteria for genetic classifications that would satisfy the traditionalists. First, only purely linguistic evidence is admissible; the findings of cultural anthropologists or archaeologists, for example, are irrelevant. Second, only resemblances between languages that include both sound and meaning are to be considered. If two or more languages have only a similar sound structure (such as the same number and type of consonants) or only employ the same method for constructing words (such as the use of suffixes to turn verbs into nouns), the kinship, Campbell and Mithun argue, should be viewed with skepticism. Basically, in this view, linguists should look for as many cognates as possible. Cognates (from Latin, meaning “born together”) are words in different languages that have similar sounds and meanings because they derive from the same word in an ancestral language. For example, English yoke, Latin iugum, and German Joch are cognates deriving from the hypothetical Indo-European form jugo.
Third, comparisons of sounds, words, and grammatical features must not be conducted piecemeal; they must be accompanied by a hypothesis systematically explaining how changes took place. That is, linguists must discover laws of change from a parent language to its offspring languages. Only then will the relation between the offspring languages be proved. Additionally, they warn that not enough attention has been paid to “areal diffusion,” or the borrowing of words and (less often) grammatical features between groups living close to one another. Such borrowings prove only physical proximity, not common origins and kinship.
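The demand for systematic laws of change, as opposed to piecemeal resemblances, can be illustrated with a toy count of recurring correspondences. This is a rough sketch under stated assumptions: it compares spelling rather than pronunciation, it looks only at the first letter of each word, the function names and the threshold of three are invented for illustration, and the data is a small hand-picked set of well-known English and German cognates rather than American Indian material.

```python
from collections import Counter

# Hand-picked English-German cognate pairs (illustrative sample, lowercase spelling).
COGNATES = [
    ("ten", "zehn"), ("two", "zwei"), ("tongue", "zunge"), ("tooth", "zahn"),
    ("day", "tag"), ("deep", "tief"), ("door", "tür"), ("drink", "trinken"),
]


def initial_correspondences(pairs):
    """Count how often each (first letter in A, first letter in B) pairing occurs."""
    return Counter((a[0], b[0]) for a, b in pairs)


def recurrent(pairs, minimum=3):
    """Keep only pairings seen at least `minimum` times (an arbitrary cutoff)."""
    counts = initial_correspondences(pairs)
    return {pair: n for pair, n in counts.items() if n >= minimum}


if __name__ == "__main__":
    for (eng, ger), n in recurrent(COGNATES).items():
        print(f"English {eng}- : German {ger}-  ({n} pairs)")
```

A single look-alike pair such as yoke and Joch proves little by itself; it is the repetition of the same correspondence across many words that makes chance and borrowing unlikely explanations, which is the kind of regularity the third criterion asks linguists to state as a law.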
Applying these criteria and cautions, Campbell and Mithun list 62 language families and isolates for North America. Their classifications are pointedly conservative and uncontroversial, intended to summarize contemporary research and serve as a starting point for further work. They recognize that many of the languages they list as isolates and some of the major branches will eventually be proved to belong together, but they refuse to allow lumping based on comparisons of vocabulary alone. Still, they follow Sapir in some cases, notably the universally accepted Eskimo-Aleut and Na-Dene language families; however, they completely reject four of his six groupings.
Campbell and Mithun insist that the watchword for linguistics should be “demonstration,” not “lumping,” in order to give American Indian linguistics a scientific rigor. Yet their call for rigor and their criteria have placed traditionalists in something of a dilemma. Their 62 families for North America and the 117 families posited for South America by the traditionalist Čestmír Loukotka in 1968 amount to considerable linguistic diversity, far more than exists in Europe or Africa, both of which were settled long before the Americas. In general, anthropologists have found that cultural diversity increases with time. That a more recently settled region such as the Americas should show greater linguistic diversity than an older cultural area such as Africa flouts this principle. Furthermore, paleoanthropological evidence fails to support such great diversity, a fact which has made some linguists unhappy with the traditionalist approach.
Reductionist Classification
In 1987 Stanford University’s Joseph H. Greenberg published Language in the Americas, among the most controversial books about historical linguistics published in the twentieth century. In it he sweeps aside the traditionalists’ cautions, which he argues are largely specious. He claims that it is not necessary to reconstruct sound laws in order to show linguistic relationships. If two or more languages contain a sufficient number of cognates, then it is reasonable to assume that those languages descend from a common protolanguage. To ignore cognates because no sound laws exist to explain their varying forms, Greenberg argues, eliminates much valuable evidence.
Greenberg and Ruhlen, his former student, applied their system of “multilateral analysis” to hundreds of languages. For this method, they compiled lists of words for universal concepts and natural phenomena, such as pronouns, terms for family members, names for body parts, and names for water, because such words are seldom borrowed. Then they compared the words for a particular concept all at once, not language by language as traditionalists would have it. Together they discerned the etymologies (historical roots of modern words) of about five hundred words and found 107 grammatical features existing in more than one language. From this evidence, Greenberg concluded that all the languages in the Americas belong to one of three phyla: Eskimo-Aleut, Na-Dene, and Amerind.
Eskimo-Aleut includes ten languages and is spoken by about eighty-five thousand people living on the Aleutian Islands and in a belt of land that extends from western Alaska across the top of Canada to the coasts of Greenland. The Eskimo branch falls into two sub-branches, western (or Yupik) and eastern (or Inuit), which meet at Alaska’s Norton Sound. Because it has relatively little diversity, Eskimo-Aleut is thought to be the youngest of the three phyla.
Na-Dene contains three independent languages, Haida, Tlingit, and Eyak, which together have perhaps two thousand speakers, and a large branch, Athapaskan, which has thirty-two languages, most notably Chipewyan, Beaver, Apache, and Navajo. Navajo, with about 149,000 speakers, is the largest single Indian language in North America and the only one with a growing number of speakers. The Na-Dene phylum spreads from central Alaska as far as Hudson Bay in the east and south well into British Columbia. There are also small linguistic islands of Athapaskan in coastal Washington, Oregon, and Northern California and a large island that covers a substantial portion of New Mexico and Arizona.
There has been little controversy about Eskimo-Aleut and Na-Dene, but Amerind, by far the largest group with 583 languages, was immediately denounced by traditionalists, who rejected not only the phylum but also many of the branches and sub-branches in it because Greenberg does not distinguish typological similarities from genetic similarities. The large number of etymologies, however, has impressed some scholars. Most telling is the appearance of n in first-person pronouns and m in second-person pronouns in all Amerind subgroups, while i- is a common third-person marker; such widespread features for basic language concepts, Greenberg contends, can only point to a common ancestral language.
Greenberg and Ruhlen divide the Amerind phylum into six major stocks, two of which apply to North America. Northern Amerind contains Almosan-Keresiouan (sixty-nine languages), which in its sub-branches has such famous languages as Blackfoot, Cheyenne, Arapaho, Cree, Ojibwa, Shawnee, Massachusett, Tillamook, Crow, Dakota, Pawnee, Mohawk, and Cherokee; Penutian (sixty-eight languages), with Chinook, Nez Perce, Natchez, Choctaw, Alabama, and Yucatec; and Hokan (twenty-eight languages), with Pomo, Mojave, Yuma, and Washoe. Central Amerind includes Tanoan (forty-nine languages), with Kiowa and Taos; Uto-Aztecan (twenty-five languages), with Hopi, Paiute, Shoshone, Comanche, and Nahuatl (the Aztec language); and Oto-Manguean (seventeen languages). The remaining four major stocks, Chibchan-Paezan (forty-three languages), Andean (eighteen languages), Equatorial-Tucanoan (192 languages), and Ge-Pano-Carib (117 languages), occupy South America and the Caribbean islands. Quechua, an Andean language in Colombia, Ecuador, Peru, and Bolivia, has the largest number of speakers, about eight million.
Greenberg remarks that his broad approach to classification is a beginning, not an end in itself. Detailed reconstructions of languages and sound laws, the scrutiny which traditionalists demand, are still needed to work out the details in his proposal. Although he admits that some features of his groupings may need revising after such examinations, he remains confident that the overall plan is correct. He further proposes that the three American phyla show connections to Old World language groups. Eskimo-Aleut may belong in Eurasiatic, a postulated immense superfamily whose members include English, Turkic, and Japanese; Amerind may also be related to Eurasiatic, but much more distantly. Since Language in the Americas appeared, some Russian and American scholars have placed Na-Dene and Caucasian (languages of the Caucasus region) in Dene-Caucasian, with possible affiliation to Sino-Tibetan, a family that includes the Chinese languages. Ultimately, Greenberg suggests, all modern languages may descend from a single stock, which he calls Proto-Sapiens and others have called Proto-World and Proto-Human.
Nonlinguistic Evidence
Despite the debate among linguists, Greenberg’s Eskimo-Aleut, Na-Dene, and Amerind categories have found some support from other scientific disciplines. The findings all appear to substantiate the theory that American Indians and Eskimos crossed from Asia in at least three migrations that correspond to the three language phyla. The first, the ancestors of Amerind speakers, came no more recently than twelve thousand years ago and may correspond, in anthropological terms, to the Clovis culture. The Na-Dene migration began to arrive sometime between seven and ten thousand years ago and probably became the Paleo-Arctic culture. The Eskimo-Aleuts came last, about four to five thousand years ago, and may have been the Thule culture, although that identification is uncertain. The periods are so vague because the archaeological and linguistic evidence is difficult to date precisely.
Geneticists also have found that American Indians belong in three distinct groups. A team led by L. L. Cavalli-Sforza studied variations in Rh factor, a blood antigen, by population; Cavalli-Sforza claims that Greenberg’s language phyla accord with his genetic groups. Studies of variations in mitochondrial deoxyribonucleic acid (DNA) by Douglas C. Wallace also appear to support Greenberg. Finally, analyses of human teeth, immunoglobulin G, and blood serums in modern Indian populations have produced corroborating findings.
A majority of linguists reject, or at least are skeptical of, the multilateral analysis Greenberg and Ruhlen used to reach their conclusions. At the same time, most assume that large-scale relationships do exist among the more than six hundred known Indian languages, which language-by-language comparison and deduction of sound laws will eventually confirm. Thus, scientists largely agree that the Americas were populated by a small number of groups who traveled from Asia and whose languages slowly differentiated as the groups spread throughout the New World.
Bibliography
Bright, William, et al., eds. Linguistics in North America. Vol. 10 in Current Trends in Linguistics, edited by Thomas A. Sebeok. The Hague: Mouton, 1973. Essays devoted to the history of American linguistics, protolanguages, and the mutual influence of languages within regions present summary information on genetic and typological classifications.
Campbell, Lyle, and Marianne Mithun, eds. The Languages of Native America: Historical and Comparative Assessment. Austin: University of Texas Press, 1979. The editors propose sixty-two language families and isolates, based on rigorous and systematic classification methods, and contributors summarize research on seventeen of the families.
Greenberg, Joseph H. Language in the Americas. Stanford, Calif.: Stanford University Press, 1987. This controversial book classifies all languages in North and South America into three phyla based on correspondences in vocabulary and grammar.
Greenberg, Joseph H., and Merritt Ruhlen. “Linguistic Origins of Native Americans.” Scientific American 267 (November, 1992): 94-99. Summarizes the authors’ classification of American languages into three phyla, discusses their relation to Old World language families, and outlines corroborating evidence from genetics and anthropology.
Ruhlen, Merritt. Classification. Vol. 1 in A Guide to the World’s Languages. Stanford, Calif.: Stanford University Press, 1987. An illuminating chapter on classification methods helps make sense of the long-standing controversy over American Indian languages; another chapter presents major classification proposals for them and repeats Greenberg’s conclusions.