CHOMSKY VS. CORPUS LINGUISTICS
John Fry
San Jose State University

Noam Chomsky
• Born 1928 in Philadelphia; affiliated with MIT since 1955
• Chomsky’s mentalistic, generative approach to language
revolutionized linguistics and cognitive science in the 20th
century
Linguistics 115: Corpus Linguistics, Fall 2007, SJSU
The ‘Cognitive Revolution’ of the 1960s
• Empirical and statistical methods were popular in the 1950s,
dominating fields from psychology (behaviorism) to electrical
engineering (information theory)
• They faded in the 1960s under the ‘cognitive revolution’
• Seminal events:
– Chomsky’s Syntactic Structures (1957) and Aspects of the
Theory of Syntax (1965)
– Chomsky’s review of Skinner’s Verbal Behavior (1959)
– Chomsky & Miller’s critiques of statistical language models
– Minsky and Papert’s criticism of neural networks

Syntactic Structures (Chomsky 1957)
From now on I will consider a language to be a set (finite or
infinite) of sentences, each finite in length and constructed
out of a finite set of elements.
The fundamental aim in the linguistic analysis of a language
L is to separate the grammatical sequences which are the
sentences of L from the ungrammatical sequences which
are not sentences of L and to study the structure of the
grammatical sequences.
On what basis do we actually go about separating
grammatical sequences from ungrammatical sequences?
Syntactic Structures (Chomsky 1957)
First, it is obvious that the set of grammatical sentences
cannot be identified with any. . . finite and somewhat
accidental corpus of observed utterances. . .
Second, the notion “grammatical” cannot be identified
with “meaningful” or “significant” in any semantic sense.
Sentences (1) and (2) are equally nonsensical, but any
speaker of English will recognize that only the former is
grammatical.
(1) Colorless green ideas sleep furiously.
(2) Furiously sleep ideas green colorless.
Third, the notion “grammatical in English” cannot be
identified in any way with the notion “high order of
statistical approximation to English.” It is fair to assume
that neither sentence (1) nor (2) (nor indeed any part of
these sentences) has ever occurred in an English discourse.
Hence, in any statistical model for grammaticalness, these
sentences will be ruled out on identical grounds as equally
‘remote’ from English. Yet (1), though nonsensical, is
grammatical, while (2) is not.
To choose another example, in the context “I saw a
fragile—,” the words “whale” and “of” may have equal
(i.e., zero) frequency in the past linguistic experience of
a speaker who will immediately recognize that one of
these substitutions, but not the other, gives a grammatical
sentence.
Syntactic Structures (Chomsky 1957)
Evidently, one’s ability to produce and recognize
grammatical utterances is not based on notions of statistical
approximation and the like. . . I think that we are forced to
conclude that grammar is autonomous and independent of
meaning, and that probabilistic models give no particular
insight into some of the basic problems of syntactic
structure.

Chomsky vs. probabilities
• Chomsky’s arguments against probabilistic models of language
were very influential
• From the 1960s through the end of the 1980s, Chomsky’s
generative grammar framework was dominant
• Theoretical linguists eschewed corpus data, which was hard to
obtain and use anyway
• Today statistical methods are popular again, and Chomsky’s
arguments against probabilities are no longer accepted
Refuting Chomsky’s arguments
• Chomsky (1957:16) writes:
[I]n the context “I saw a fragile—,” the words “whale”
and “of” may have equal (i.e., zero) frequency in the past
linguistic experience of a speaker who will immediately
recognize that [only] one of these substitutions. . . [is]
grammatical. . . Evidently, one’s ability to produce and
recognize grammatical utterances is not based on notions
of statistical approximation and the like.
• Experiment: replace “grammatical” with “tall” in this argument
• Even if you have never encountered a 3-foot man or a 7-foot
man, you will immediately recognize the latter as a tall man
• Does this fact really prove that the concept of tallness cannot
be represented probabilistically?
• Chomsky (1957:15-16) writes:
(1) Colorless green ideas sleep furiously.
(2) Furiously sleep ideas green colorless.
. . . in any statistical model for grammaticalness, these
sentences will be ruled out on identical grounds as equally
‘remote’ from English.
• This claim is false: all modern statistical models of language
can assign probabilities to previously unseen utterances.
That’s how speech recognition works!
• E.g., Pereira’s (2000) statistical model of newspaper text
assigns (1) a probability 200,000 times greater than (2)
• Even a simple model based on word classes can do this
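The word-class point can be illustrated with a minimal Python sketch of a class-based bigram model. This is not Pereira's model, which learns its word classes from newspaper text; here every word is simply given one hand-assigned class. The six-sentence training corpus, the names `train` and `sentence_probability`, and the choice of add-one smoothing are all assumptions made for illustration. A sentence's probability is the product of smoothed class-to-class transition probabilities and word-given-class probabilities, so a word sequence never seen in training still receives a non-zero score.

```python
from collections import Counter

# Toy training data (an assumption for illustration, not the Brown corpus):
# word/class pairs with four coarse classes. No two adjacent words of
# sentence (1) or sentence (2) ever occur next to each other here.
TRAINING = [
    "little/jj green/jj biplane/nn struggled/vb northward/rb",
    "brilliant/jj white/jj flares/nn swayed/vb eerily/rb",
    "colorless/jj liquid/nn spilled/vb slowly/rb",
    "new/jj ideas/nn spread/vb quickly/rb",
    "tired/jj children/nn sleep/vb soundly/rb",
    "angry/jj crowds/nn shouted/vb furiously/rb",
]

START, END = "<s>", "</s>"


def train(tagged_sentences):
    """Count class transitions and word-in-class occurrences."""
    model = {
        "transitions": Counter(),   # (previous class, next class) -> count
        "contexts": Counter(),      # previous class -> count
        "emissions": Counter(),     # (word, class) -> count
        "class_totals": Counter(),  # class -> number of word tokens
        "word_class": {},           # word -> class (one class per word)
    }
    for line in tagged_sentences:
        pairs = [token.rsplit("/", 1) for token in line.split()]
        classes = [START] + [cls for _, cls in pairs] + [END]
        for prev, nxt in zip(classes, classes[1:]):
            model["transitions"][prev, nxt] += 1
            model["contexts"][prev] += 1
        for word, cls in pairs:
            model["emissions"][word, cls] += 1
            model["class_totals"][cls] += 1
            model["word_class"][word] = cls
    return model


def sentence_probability(model, sentence):
    """P(sentence) = product of P(class | previous class) * P(word | class)."""
    # Add-one smoothing: each context may be followed by any class or by END.
    outcomes = len(model["class_totals"]) + 1
    prob, prev = 1.0, START
    for word in sentence.lower().split():
        cls = model["word_class"][word]  # raises KeyError for unknown words
        prob *= (model["transitions"][prev, cls] + 1) / (model["contexts"][prev] + outcomes)
        prob *= model["emissions"][word, cls] / model["class_totals"][cls]
        prev = cls
    prob *= (model["transitions"][prev, END] + 1) / (model["contexts"][prev] + outcomes)
    return prob


if __name__ == "__main__":
    model = train(TRAINING)
    p1 = sentence_probability(model, "Colorless green ideas sleep furiously")
    p2 = sentence_probability(model, "Furiously sleep ideas green colorless")
    print("P(1) =", p1)
    print("P(2) =", p2)
    print("P(1) > P(2):", p1 > p2)
```

Because both orderings contain the same words, the word-given-class factors cancel. The difference comes entirely from the class transitions: jj jj nn vb rb is well attested in the toy corpus, while rb vb nn jj jj is supported almost entirely by smoothing.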
Brilliant white flares swayed eerily
# Find adjective-adjective-noun-verb-adverb sequences in the tagged Brown corpus
$ cd /corpora/brown/data
$ cat * | tr '\n' ' ' |
  egrep -o '\w+/jj \w+/jj \w+/nn\w* \w+/vb\w* \w+/rb'
excellent/jj foreign/jj bomb/nn takes/vbz only/rb
brilliant/jj white/jj flares/nns swayed/vbd eerily/rb
prevalent/jj mental/jj disturbance/nn affecting/vbg even/rb
little/jj green/jj biplane/nn struggled/vbd northward/rb
fake/jj therapeutic/jj devices/nns figure/vb prominently/rb
routine/jj vital/jj statistics/nns got/vbd nowhere/rb
appropriate/jj regional/jj office/nn listed/vbn below/rb
autonomic/jj hypothalamic/jj balance/nn occurring/vbg spontaneously/rb
long/jj fluorescent/jj tube/nn suspended/vbn directly/rb
little/jj sweet/jj potato/nn trilled/vbd neatly/rb
looking/jj young/jj officer/nn fell/vbd back/rb
# The reversed order (adverb verb noun adjective adjective) matches nothing
$ cat * | tr '\n' ' ' |
  egrep -o '\w+/rb \w+/vb\w* \w+/nn\w* \w+/jj \w+/jj'
$

Chomsky vs. corpus linguistics
• Chomsky emphasizes the creativity of language
• This creativity is manifested in recursive generative rules like
S → NP VP, VP → V NP (or more recently, “Merge”)
• Empiricists, on the other hand, tend to focus on common
patterns (e.g., collocations), and to emphasize the redundancy
and predictability of language
• Warren Weaver (1949):
[A]bout half of the letters or words we choose in writing
or speaking are under our free choice, and about half
(although we are not ordinarily aware of it) are really
controlled by the statistical structure of the language.
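The recursive rules cited above can be made concrete with a minimal Python sketch that enumerates the sentences licensed by a toy phrase-structure grammar. The rules S → NP VP and VP → V NP are the ones on the slide. The extra rule VP → V S, the four-word lexicon, and the names `GRAMMAR`, `expand`, and `sentences` are assumptions added so that the grammar is actually recursive. Bounding the number of nested S nodes shows how a finite rule set licenses more and more sentences as the bound is raised.

```python
from itertools import product

# Toy grammar. S -> NP VP and VP -> V NP come from the slide; the rule
# VP -> V S and the lexicon are assumptions that make the grammar recursive.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"], ["V", "S"]],
    "NP": [["ideas"], ["linguists"]],
    "V": [["see"], ["think"]],
}


def expand(symbol, depth):
    """Return all word lists derivable from `symbol` with at most `depth` nested S nodes."""
    if symbol not in GRAMMAR:  # a terminal word expands to itself
        return [[symbol]]
    if symbol == "S":
        if depth == 0:         # no embedding budget left
            return []
        depth -= 1
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each child, then combine every choice for every child.
        parts = [expand(child, depth) for child in rhs]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results


def sentences(depth):
    """All sentences with at most `depth` levels of S embedding, as strings."""
    return [" ".join(words) for words in expand("S", depth)]


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(depth, len(sentences(depth)))
```

Each extra level of embedding multiplies the number of licensed sentences, which is the sense in which a small set of recursive rules is "creative". A corpus of any fixed size can contain only a finite sample of that set.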
Autonomy of syntax
• Chomsky (1957) writes:
I think that we are forced to conclude that grammar is
autonomous and independent of meaning
• Corpus linguist John Sinclair (1991:108):
[I]t is folly to decouple lexis and syntax, or either of
those and semantics. The realization of meaning is far
more explicit than is suggested by abstract grammars.
The model of a highly generalized formal syntax, with
slots into which fall neat lists of words, is suitable only
in rare uses and specialized texts. By far the majority
of text is made of the occurrence of common words in
common patterns. Most everyday words do not have an
independent meaning, or meanings, but are components
of a rich repertoire of multi-word patterns that make up
text.

Chomsky on corpus linguistics: 2004 interview
Corpus linguistics doesn’t mean anything. . . [S]uppose
physics and chemistry decide that instead of relying on
experiments, what they’re going to do is take videotapes
of things happening in the world and they’ll collect
huge videotapes of everything that’s happening and from
that maybe they’ll come up with some generalizations or
insights. Well, you know, sciences don’t do this. But maybe
they’re wrong. Maybe the sciences should just collect lots
and lots of data and try to develop the results from them.
Well if someone wants to try that, fine. They’re not going
to get much support in the chemistry or physics or biology
department. . . We’ll judge it by the results that come out.
So if results come from study of massive data, rather like
videotaping what’s happening outside the window, fine—look
at the results. I don’t pay much attention to it. I
don’t see much in the way of results.
Chomsky on corpus linguistics: 2004 interview
My judgment, if you like, is that we learn more about
language by following the standard method of the sciences.
The standard method of the sciences is not to accumulate
huge masses of unanalyzed data and to try to draw some
generalization from them. The modern sciences, at least
since Galileo, have been strikingly different. What they
have sought to do was to construct refined experiments
which ask, which try to answer specific questions that
arise within a theoretical context as an approach to
understanding the world. . . If you want to understand
how bodies fall, Galileo would not have been interested
in videotapes of leaves falling and balls going around and
rocks rolling down mountains and so on and so forth. What
he was interested in is the highly refined abstract conception
of a ball rolling down a frictionless plane, which doesn’t
even exist in nature. . . To say that it’s more empirical to
just collect and observe data is completely wrong.
People who work seriously in this particular area do not rely
on corpus linguistics. They may begin by looking at facts
about frequency and shifts in frequency and so on, but if
they want to move on to some understanding of what’s
happening they will very quickly, and in fact do, shift to
the experimental framework. Where you design situations,
you enquire into how people will act in those situations.
You design them within a framework of theoretical inquiry
which has already suggested that these are likely to be
important questions and I want the answers to them. But
that’s not corpus linguistics. If you want to use hints from
data that you acquire by looking at large corpuses, fine.
That’s useful information for you, fine. I mean, Galileo
might have gotten some hints from looking at events that
were happening in the world. In fact, he did. He observed
the tides—that’s like corpus linguistics. You’re observing
the tides.
The sciences construct refined experiments
• For Chomsky, “to construct refined experiments” means to
ask a native speaker to judge the acceptability of artificially
constructed example sentences
• Examples of acceptability judgments from van Riemsdijk &
Williams (1986), Introduction to the Theory of Grammar
1. Who did Jo think said John saw him?
2. John I believe Sally said Bill believed Sue saw.
3. John wants very much for himself to win.
4. What did Sally whisper that she had secretly read?
5. The boys read Mary’s stories about each other.

Chomsky’s Minimalist Program
Is generative grammar scientific?
• Hawking (1998:9) on scientific theory:
[A] good theory. . . must accurately describe a large class
of observations on the basis of a model that contains
only a few arbitrary elements, and it must make definite
predictions about the results of future observations.
• Denis Bouchard on generative grammar:
[It] contains a very large amount of arbitrary elements
and makes no predictions.
“Exaptation and linguistic explanation,” Lingua 115:12, 2005