Wednesday, June 18, 2008

"Facts" in linguistics

Last week, The Stranded Preposition (whose stranding is hopefully syntactic and not flood-related) passed along this quote from Raven McDavid's classic 1948 American Speech article on r-fulness/r-lessness in the American South:
It is a tradition among some schools of scientific investigation not to insist on facts and examples, and to ignore them when they conflict with previously formulated theories.
I was so inspired by this that I added it to the footer on this blog (scroll to the bottom). The quote is a footnote to a passage noting that conventional descriptions of American English claim that Southern dialects are r-less — a stunning oversimplification. How could the noted scholars cited have written/thought such things? Plenty of descriptions were available that showed otherwise and it's hard to know any Southerners without knowing r-ful ones.

This morning, reader E.G. called my attention to this post from LINGUIST, a review written by Madalena Cruz-Ferreira of this book:
Engh, Jan. 2007. Norwegian Examples in International Linguistics Literature: An inventory of defective documentation. Munich.
The book does something remarkable and deeply disturbing: It documents mistakes in linguistic analyses of Norwegian. Think about that: A whole book that does nothing but catalog and analyze errors in data from one language. As it happens, the 346 flawed chunks of data (published by 139 different linguists) are all from syntax. Engh apparently found no errors in the lit on Norwegian phonetics, phonology or morphology.

Here's the sobering conclusion the reviewer offers:
In scientific research worthy of the name, public retractions with apologies are the honorable step when honest mistakes are made, which Engh assumes to be the case for all of his examples, and exposed, which Engh does beyond reasonable doubt. The painstaking demolition of these and "other results of lack of understanding" (p. 147) can be summarized in one question: do linguists want to deal with facts or fiction? The scope of misinformation about Norwegian raises alarm bells far beyond the language in question. If, despite widespread availability of resources, both data and analyses concerning Norwegian end up mangled in the way described in this book because of inexplicable reliance on dubious sources, one is more than justified to feel unsettled about what may have happened and what may go on happening in reports about languages whose users are either dead or otherwise less accessible for consultation than Norwegians.
It seems like time for the field to develop some better fact-checking during the refereeing process. 

Note: See the 5th comment on this post for key follow-up on the book. It seems to confirm Bill Idsardi's suspicion about the 'facts' in question.


Bill Idsardi said...

These discussions seem to presuppose that observation statements aren't theory-laden to begin with, an untenable position since early Kuhn and Feyerabend.

pc said...

Ugh, but somehow this isn't surprising. In terms of the syntactic errors I guess I'm curious what counts as "incorrect" by Engh's definition/standards: there's "incorrect" like "not part of the grammar according to any native speaker," but also "incorrect" like "arguably not part of the grammar you purport to be explaining, but arguably part of someone else's [perhaps very similar] grammar" (or the inverse). And also "incorrect" like just "mistaken." It seems like the first kind should be the most egregious, the last kind ought to be the most easily fixed, and the middle kind is the toughest gray area.

Even for English, which many (most?) English-studying syntacticians speak natively, it's really not so easy to figure out what's "correct" - some grammaticality judgments are backed up by experiments with native speaking subjects, but many aren't, and this is problematic (I am, fwiw, speaking as someone who has taken several syntax classes, but I don't currently undertake syntactic research [aside from a term paper or three]). When a class full of native English-speaking graduate students comes across an example on which a whole syntax paper is based and disagrees about the example's presumed un/grammaticality, there's a problem with that example's explanation pushing something forward in terms of theory.

But it sounds like maybe much of what Engh is talking about are just stupid mistakes, so I guess the question is always whether the data are replicable; stupid mistakes shouldn't lead to replicable data. But it's so easy to rely on someone else's descriptions and assume they're right (especially if you don't have native speakers freely roaming the halls of your U. waiting to give you grammatical validation)....

[I also agree with bill idsardi :)]

Mr. Verb said...

Oh yeah, Feyerabend would have had fun with this kind of stuff, I imagine.

Most of the stuff in Engh looks like what are clearly outright errors. It's easy to wonder if there's a theoretical (say, anti-generative) undercurrent in the book and/or review, but I didn't notice anything that made that clear.

Mr. Verb said...

Yeah, pc, the review strongly suggests that Engh is pretty clear about the various levels of error, including things like pragmatic issues, but it appears to be mostly the 'stupid mistakes'.

Peter Svenonius said...

I've read Engh's book. Engh does indeed find some actual errors which bear on the analyses that they are provided to support, but in the vast majority of cases the examples he cites fall into three categories:

1. Cases where normative grammar requires something other than what the linguist has reported that people say. Some of these are spelling errors, others have to do with prescriptive rules of Bokmål and Nynorsk writing.

2. Cases where Engh disagrees with the grammaticality judgment, though there are many people in Norway who would support it. In these cases Engh mocks or trivializes dialectal variants, calling them "slang," or saying that a usage "has a childish ring," or implying that it is a contamination from Swedish.

3. Cases where Engh has failed to understand the context in which the example sentence is presented, for example thinking that a sentence presented as narrow-scope negation was supposed to be interpreted with sentence-level negation, and suggesting the sentence-level negation is "correct," failing to notice that it would give the wrong meaning in the context of the discussion.

Basically, the book is a clumsy attempt at generativist-bashing, and one in which the author reveals his incompetence.

Mr. Verb said...

Thank you for clarifying, Peter. That's a real shame, obviously -- it sounds like the book is a waste of time. In this day and age,

Peter Svenonius said...

Full disclosure: I am the source of some of the purported errors Engh reports.

Mr. Verb said...

That was to be expected from the context, I think. At the least, people who are interested now have a sense of how to read the book. Thanks.