Tuesday, March 06, 2007

English as a "fourth branch of Germanic"

Sorry for the silence; been away and otherwise occupied for a few days. A nonlinguist colleague of Joe's passed him this story yesterday and the print version of it is now out. It's about the DNA of the population of the British Isles, with some geneticists claiming:
that both Britain and Ireland have been inhabited for thousands of years by a single people that have remained in the majority, with only minor additions from later invaders like Celts, Romans, Angles, Saxons, Vikings and Normans.
That's pretty cool, certainly, but I get a little nervous at hearing that Stephen Oppenheimer, the lead geneticist here, is arguing that these populations are basically Basque. I haven't read the background material this point is based on (yet), but there's no evidence mentioned that shows that connection: It's clear that the area was inhabited by pre-Indo-European populations, but was it this particular group? I'm curious because, as Joe reminds me, Theo Vennemann has written a ton in recent years about pre-IE substrates in European languages, including arguments that languages related to Basque originally stretched across Europe and that important streams of Semitic speakers came to Celtic- and Germanic-speaking areas very early on. Such stuff is fascinating maybe precisely because it's basically impossible to confirm or disconfirm in any really rigorous way. That kind of discussion drives 'traditional' historical linguists utterly mad and DNA support for that position would stir the pot in a big way.

But then things turn to language and it all gets decidedly weird:
Dr. Oppenheimer has relied on work by Peter Forster, a geneticist at Anglia Ruskin University, to argue that Celtic is a much more ancient language than supposed, and that Celtic speakers could have brought knowledge of agriculture to Ireland, where it first appeared. He also adopts Dr. Forster’s argument, based on a statistical analysis of vocabulary, that English is an ancient, fourth branch of the Germanic language tree, and was spoken in England before the Roman invasion.
OK, you're tempted to leave aside the notion of a language being 'more ancient' in this sense, and whether the Celts really brought agriculture a few millennia back and whether their language supplanted the original, Basque-related tongue. That's all so shrouded in the mist of prehistory that it would seem very hard to really shoot down. Alas, Joe Eska (a leading specialist in early Celtic) and Don Ringe (a leading Indo-Europeanist who has done very important work in computational approaches to historical linguistics) showed in considerable detail that Forster's approach to Celtic, and to historical linguistics generally, is profoundly and fundamentally flawed in virtually every regard. (See their discussion note, "Recent work in computational linguistic phylogeny", Language 2004, 80.569-582.) Here's a little slice of their conclusion:
We have shown that [Forster & Toth's] selection and analysis of data are full of errors, that their confusion about what kinds of evidence are valuable for research in linguistic phylogeny has compromised their project, and that their rejection of the principles of the comparative method is not only counterproductive, but also completely antithetical to historical linguistics as a science. Most importantly, they have not addressed the crucial computational problems involved in phylogenetic reconstruction from comparative data.
Leaving aside comparative linguistics, I tend to assume that geneticists control quantitative methods and am pretty stunned at what look like basic problems found by Eska and Ringe. Forster's web page lists among his publications a reply in Language, actually a brief letter to the editor. I have to agree with Eska & Ringe's counter-reply that "Forster’s response … fails to address any of our criticisms concerning their methodology." Ouch.

When we get to the history of English, we're playing on turf where we have some clearer data, linguistic and otherwise. Maybe we could get some expert opinions on English as "a fourth branch of Germanic"?

Update, 8:18 a.m.: I've just been informed that the German version of Scientific American, Spektrum der Wissenschaft, has run pieces on the Basque DNA thing, here, and on Basque as the pre-IE language of Europe, here.


Karen said...

Check Language Log for Sally Thomason's takedown - among other things she says:

Second, the idea that 150 years of careful research in Germanic languages can be overthrown by a statistical analysis of vocabulary (which is Forster's sole technique) makes no sense: it might be relevant if languages were all vocabulary and if Forster understood enough about language to construct a useful sample, but the linkage of English with West Germanic -- through its closest relation, Frisian, and then the also closely-related Dutch and Low German -- is absolutely solid. These languages, together with (High) German, share significant innovations in phonology and morphology as well as in the lexicon; it is those innovations that provide the evidence for the usually accepted -- not "assumed"! -- subgrouping of the Germanic branch of Indo-European.

I have no expertise whatsoever in genetics and I therefore have no comment on Dr. Oppenheimer's proposals in this highly technical and well-developed field of inquiry. It would be nice if geneticists like Forster (and reporters like Wade) would reciprocate -- if they would somehow manage to arrive at an understanding of the fact that historical linguistics is a highly technical and well-developed field of inquiry in which expert knowledge is needed to support hypotheses.

Stuart said...

There is a certain amount of looseness of terminology in Oppenheimer's book, but especially so in the coverage of the debates it has given rise to in the popular press. When Oppenheimer, or the journalists covering him, say "we're all Basques", what they mean is that the majority of the inhabitants of the British Isles can trace their ancestry to people who migrated here at the end of the Ice Age from the area we now know as the Basque country, whatever languages may or may not have been spoken by such persons at the time in that area.

The problem is an unawareness of how labels, in historical linguistics, are applied to a particular slice of the linguistic continuum in space and time, and dates and locations attached accordingly, thus leading to a looseness in the use of terms such as "Celtic" or "proto-Celtic".

But Oppenheimer's concern about when and from where "languages that come from that grouping that we choose to call Celtic" and "languages that come from that grouping that we choose to call Germanic" arrived in the British Isles is a valid one: his point seems to be that while the almost complete absence of Celtic loanwords in Old English would suggest substantial population replacement between c.400AD and c.600AD, he does not find support for this in the genetics, and so suggests that maybe dialects/languages that we would classify as Germanic rather than Celtic were spoken in eastern England far earlier than c.400AD.

He does acknowledge a modest "Anglian" migration c.450AD, mostly into East Anglia and the East Midlands/East Riding of Yorkshire, but asks a valid question: if all the substrate populations were speaking Late Brythonic Celtic dialects and/or Vulgar Latin when the newcomers arrived, why did that not result either in a hybrid language spoken by indigenous and arrivals alike (cf. Middle English from Old English and Norman French), or in the abandonment by the arrivals of the language they had brought with them (the Franks in Gaul, abandoning their Germanic dialect to speak the Vulgar Latin of the indigenous that later developed into French)? He is also puzzled by the fact that in cases elsewhere in history where substantial population replacement did take place (e.g. North America or Australia after European settlement), the replacement populations incorporated far more indigenous words into their language than the Anglo-Saxons seem to have done with Celtic or Latin, which seems particularly bizarre given that Roman Britain was far more densely populated than pre-Cook Australia.

Mr. Verb said...

Yes, these are emphatically and without doubt legitimate questions. And the role of language contact gets at some of the most fundamental questions about (to quote the title of a John McWhorter paper) "What happened to English?"

Still, we really don't know that much about what range of substrate effects we could reasonably expect in situations involving colonization of this sort. That is, I'm not sure how much weight we can put on the absence of more (securely identifiable) Celtic loanwords and such things.

I do continue to be bothered by the linguistics these folks are relying on. (See Joe's later post on the position of English within Germanic.)

dan said...

Personally I reckon that some form of English was probably spoken on the East Coast of Britain well before the Romans invaded. The Roman invasion may well have even led to the settlement of German speakers in areas like Iceni/Anglia where the Romans butchered rebellious tribes.