Thursday, July 12, 2012

Vindicating Greenberg. NOT. On the peopling of the Americas

The New York Times this morning has a piece on genetic evidence that the Americas were populated by a set of three migrations from Asia, an early one and two later ones. The first peopled most of the Americas and the others became the groups of people who speak Na-Dene and the Eskimo-Aleut languages. The evidence looks interesting but it's really incomplete — not much from most of North America, which is a serious gap.

Then, linguistics comes in:
The finding vindicates a proposal first made on linguistic grounds by Joseph Greenberg, the great classifier of the world’s languages. He asserted in 1987 that most languages spoken in North and South America were derived from the single mother tongue of the first settlers from Siberia, which he called Amerind. Two later waves, he surmised, brought speakers of Eskimo-Aleut and of Na-Dene, the language family spoken by the Apache and Navajo.
But many linguists who specialize in American languages derided Dr. Greenberg’s proposal, saying they saw no evidence for any single ancestral language like Amerind.
Well, not just specialists in American languages but historical and comparative linguists around the world.

The piece then concludes with this, a quote from Andres Ruiz-Linares of University College London, one of the investigators:
“Many linguists put down Greenberg as rubbish and don’t believe his publications,” Dr. Ruiz-Linares said. But he considers his study a substantial vindication of Dr. Greenberg. “It’s striking that we have this correspondence between the genetics and the linguistics,” he said.
So, actually, Joseph Greenberg, an important linguist and a founder of the modern field of linguistic typology, claimed that almost all languages of the Americas are genetically related (that is, descended from a common ancestral language), save for the Na-Dene and Eskimo-Aleut families. He proposed this using tragically, horribly flawed data and using methods that can most gently be described as extremely controversial and not accepted by any serious historical linguist that I know or know of. The basic method is 'Mass Comparison', where you just eyeball lots of data for surface resemblances. You can get a sense of some of the discussion in the journal Language 64.591-615 in 1988, where Lyle Campbell wrote a review article on Greenberg's book Language in the Americas. But there's a huge literature dedicated simply to cataloging the vast numbers of errors in Greenberg's data and some work showing that the errors consistently skew things in favor of his views.

As far as I know (and I don't have all the stuff at hand, but I know the work pretty well), linguists didn't particularly challenge claims about three waves of migration. If we think about basic geography, you might posit parts of this based on just the distribution of languages: Eskimo-Aleut languages are pretty close to Siberia and a lot of Na-Dene ones are not much farther away. Then there's the rest of the hemisphere and its hundreds of languages, which look like they belong to many different families. You don't really need linguistics or detailed linguistic evidence for that.

Linguists really focused on the supposed evidence for 'Amerind' as a so-called super-family including hundreds of languages that cannot be shown to be related using the classic tools of comparative linguistics, like the comparative method. Most of us who talked about this in public and in print specifically said that we weren't claiming that the languages weren't related, just that we don't have evidence showing it or methods that allow us to show it. There IS evidence emerging for connections between some languages of Siberia and the Na-Dene family (search this blog for Ed Vajda's name), and that could easily correlate with them having come in a distinct migration. But that's as much as we've got right now.

I can build a theory about how dry weather leads to plants dying that rests on gremlins killing plants because they're angry about the absence of rain, because they like to dance in the rain.  Whether gremlins play a role is independent of whether dry weather correlates with plants dying. (Watching things here, though, I'm pretty confident about the correlation.) Greenberg's observation about the distinctiveness of Eskimo-Aleut and Na-Dene vis-à-vis other languages of the western hemisphere is an interesting one, as is the claim of three waves of migration which he connected to that. And the claim of waves and these particular waves might be right — the genetic evidence sounds like it's consistent with it and the geography would fit neatly — but the purported linguistic arguments are all about gremlins.

My point is that it's best to see Greenberg's views about waves of migration as a CLAIM that may or may not be right. You really cannot say that it's supported by linguistic evidence, or that 'Amerind' is supported by linguistic evidence.

Quick update, 12:30: After posting this, I checked Facebook and saw some similar points made by notable linguists (you know who you are) and I take that as suggesting that the basic outline here is pretty clear for those of us in the field. Less happily, this story is up, claiming that the genetic evidence "comes close to settling an old question in linguistics". The above post gives arguments against that view, I hope.


Jonathon said...

Once again, it's depressing to hear about linguistics in the news. And on the flip side, finding that the languages are all related wouldn't prove that the people are related. Establishing correlations between archeology, ethnicity, and language is surprisingly difficult. They just don't line up as neatly as we'd like.

We discussed mass lexical comparison and the work of linguists like Greenberg in my historical linguistics class (which used Campbell's textbook). What I took away from it is that mass comparison is nothing more than a way to get the ball rolling by identifying some possible cognates that then need to be verified by traditional methods. And of course, mass comparison wouldn't accurately identify many real cognates, like five and cinq, so its usefulness is pretty limited. But just try explaining the many serious shortcomings of mass lexical comparison to a journalist.

Christian said...

Both amongst ourselves and in responses to the NY Times, it seems like many of us have been talking about the distinction between the comparative method and genetic evidence. I wonder if it would be good to discuss in more detail the types of conclusions that one reaches using the comparative method and the reliability of reconstructions. I can certainly think of plenty of laypeople who might wrongly assume that reconstruction is some sort of "magic" that gives you a definite answer.

Eugene said...

My understanding is that the comparative method is limited by time depth - beyond 10,000 years it says little or nothing. If North America was originally peopled between 12,000 and 15,000 years ago, historical linguistics can't help much with regard to the Amerind problem.

Also, the actual migration must have been fairly complicated, with perhaps many different languages and/or language families crossing the Bering land bridge over a fairly long period of time. The evidence on the ground today may not reflect all of that complexity.

Still, Greenberg eyeballed the data and arrived at more or less the right answer, at least according to the geneticists. That's a major intellectual accomplishment, and I don't think we should disparage it.

Jonathon said...

But as Mr. Verb notes, you can use an utterly flawed hypothesis and methodology and still come to the right answer. The fact that you arrived at the right answer doesn't necessarily validate your work.

Monica said...

I'm going to copy over some of the really informative comments made on my Facebook post about this (attributed, of course!). Here's the first, from Steve Anderson:

Yes, someone has to get out the message that linguists didn't reject the basic suggestions Greenberg made, but rather argued that he didn't have any scientific basis for making them. Greenberg was by and large lucky in his guesses (I have heard from Africanist colleagues that the African work is not much better than the "Amerind" stuff, but when it was done right, and put together with other results, it turned out to be largely correct), but that's not a reason to emulate his methods. Unfortunately, the NYT article doesn't seem to have a way to make comments, and letters to the editor have a way of finding a black hole.

Monica said...

From Anthony Aristar: Greenberg had a weird habit of getting it right with questionable methodology. The evidence is now piling up that the Athabascans formed a second wave of immigrants: consider Vajda's recent work. And it was always obvious that Eskimo-Aleut was yet another, as a glance at the linguistic map of Alaska shows.

Monica said...

From Jack Martin: I don't think it was luck. If I had to guess, I'd say Greenberg first reviewed the biological evidence available in his day (dentition, blood types, juvenile blondism, etc.), and then made his linguistic classifications fit the biological research. DNA studies were subsequently used to "vindicate" a classification that was based on biology in the first place. Was it Bill Poser who found that Greenberg's language notebooks for the Americas were arranged in much the same order as his final classification?

Monica said...

From Bill Poser: I did look at Greenberg's notebooks and observe that they were arranged in much the same order as his final classification but I'm not sure that I was the first to make that point. I wouldn't be surprised if Victor Golla deserves the credit.

Monica said...

More from Bill Poser: Another piece of evidence that Greenberg classified on the basis of factors other than linguistic data is that, as I think Lyle Campbell pointed out, Greenberg classified languages that do not exist or for which we have no data.

From David Costa: One thing that's not often pointed out about JG's subclassifications is that he essentially took EVERY subgrouping that had ever been posited by anyone as a given. That gives a big hint as to how he would have organized his initial work.

Monica said...

Steve Anderson again: Greenberg's basic claim was that "Amerind" was LINGUISTICALLY monophyletic, and the genetics-based argument still doesn't really even bear on that (unless you could show that the first wave of migrants consisted only of a tiny, linguistically uniform population, and good luck with that). Suppose, for instance, that a representative population drawn from modern Finland were to colonize a distant planet, and eventually spread out to cover most of its land area. You might then be able to show that most of the people of that planet had a common genetic signature --- one distinct, say, from that of colonists who arrived on a later expedition of people from today's Mongolia. Some of those original Finnish colonists may well have spoken Swedish, though, and others Finnish, so the corresponding linguistic claim wouldn't hold.

Monica said...

Last one, I promise. Well, last two, from Bill Poser: The other point that I view as absolutely devastating but that non-linguists, at least, seem not to get, is that Greenberg's very "success" (in his own view) in establishing that all languages are related, coupled with his complete lack of a methodology for subgrouping, prevented him from making ANY valid claims about intermediate nodes like Amerind. For his work to be of historical interest (that is, to archaeologists etc.) monogenesis is not important; what is important is the claim that things like Amerind and Eurasiatic are valid subgroups. But even if we accept his method of demonstrating affiliation as valid, in the absence of a subgrouping method, things like Amerind and Eurasiatic are merely artifacts of the route that he took toward his goal - there's no reason whatever to regard them as valid subgroups.

Incidentally, on Africa, let me suggest that everyone read the paper Africa's Linguistic Diversity by Bonny Sands in Language and Linguistics Compass 3.2.559-580 (2009). She concludes that at the present state of our knowledge there are at least 20 language families in Africa.