NL-KR Digest Volume 02 No. 37
NL-KR Digest (5/07/87 22:52:01) Volume 2 Number 37
Today's Topics:
request for references on task-specificity
Yet More Leakage
Re: Natural vs. artificial languages
Re: grammar checkers
----------------------------------------------------------------------
Date: Wed, 6 May 87 10:58:05 EDT
From: mccarthy@CCA.CCA.COM (Dennis McCarthy)
Subject: request for references on task-specificity
There seems to be a fundamental disagreement between linguistics and
cognitive science regarding task-specific cognitive mechanisms for
language, especially language acquisition. Linguists generally assume
the existence of such mechanisms. In cognitive science, recent proposals
for the architecture of cognition (e.g. ACT*, Soar) attempt to explain
language acquisition, comprehension, and production using general
mechanisms for cognition.
Does anyone know of a book or article that deals with this apparent
conflict?
------------------------------
Date: Wed, 6 May 87 09:08:52 EDT
Organization: The MITRE Corp., Washington, D.C.
From: camis..mitch@mitre.ARPA (Mitchell Sundt)
Subject: Yet More Leakage
Bruce Nevin replies:
BN: [MS's assertion that both formal and natural languages do have leaky
BN: grammars] misses the essential point of the extended quote from Harris: at
BN: any given time t_i some parts of the natural language grammar may be
BN: stated in two ways: one way in accord with the grammar at the earlier
BN: time t_h, and the other way in accord with the grammar at a later time
BN: t_j. Because descriptions of formal languages are matters of formal
BN: definition, no such ambiguity of description can occur in the grammar of
BN: a formal language.
And this misses my point. If you look at the programmer's reference manuals for
*ANY* *IMPLEMENTATION* of a programming language, there are always unsupported
features or enhancements to the *FORMAL DEFINITION* of that language, as set
out by the current standards committee document on that language.
This discrepancy indicates that the *FORMAL DEFINITION* is a past grammar
(analogous but not quite equivalent to that at t_h referred to above), and that
the IMPLEMENTATION of the programming language is the current state of the
grammar (analogous to the grammar at t_i), and that whenever a revised formal
definition of the language comes out of a standards committee, this document
will be analogous to the grammar at t_j.
Several issues are raised by this assumed mapping:
- The formal specification at t_h (and t_j) is never the union of all
currently available forms of the language (except at t=0, when only
-> one version exists). The *FORMAL DEFINITION* is therefore not
-> properly the grammar of the programming language referred to in
the Harris extract by BN (Otherwise, I would agree with BN).
- The grammar defined by the particular implementation of a language is
also not truly the t_i grammar referred to in the Harris extract, for
similar reasons. However, the t_i grammar is the union of all
grammars defined by the current implementations of the language. In
-> this way, the grammar of the programming language expands with each
-> new implementation of a compiler for that language (and *NOT* with each
new *HUMAN USER*).
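As a rough illustration of this mapping, the grammars can be modelled as sets
of language features. This is only a sketch: the feature names and the two
vendors below are hypothetical and stand for no real standard or compiler.
It shows that the t_i grammar (the union over implementations) can both exceed
the formal definition and leave parts of it unsupported in a given compiler.

```python
# Hypothetical feature sets; names are illustrative, not from any real standard.
standard_t_h = {"loops", "subroutines", "formatted_io"}

implementations = {
    "vendor_a": {"loops", "subroutines", "formatted_io", "recursion"},
    "vendor_b": {"loops", "subroutines", "bit_ops"},  # omits formatted_io
}


def current_grammar(impls):
    """Union of every implementation's features (the t_i grammar in the post)."""
    result = set()
    for features in impls.values():
        result |= features
    return result


def extensions(impls, standard):
    """Features some implementation accepts that the standard does not define."""
    return current_grammar(impls) - standard


def unsupported(impl_features, standard):
    """Standard features that one given implementation fails to support."""
    return standard - impl_features


t_i = current_grammar(implementations)
```

Here `extensions(implementations, standard_t_h)` gives the enhancements beyond
the formal definition, and `unsupported(implementations["vendor_b"], standard_t_h)`
gives the standard features one compiler lacks.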
Thus, under this revised definition of a programming language's grammar,
I argue that programming languages and natural languages exhibit the same
leaky characteristics. This leakage is modified for programming languages,
however, to support *HISTORICAL COMPATIBILITY*, which is a major intent
of the user community (as I asserted in my earlier submission, I believe
the changes in the INTENT of a user community are the driving force of change
in a language grammar). Thus:
- Any future grammar of a programming language will only extend the
grammar defined by the earlier standards (formal descriptions) of
that language. Thus, leakage vs extension of the grammar is only
apparent when one looks at the grammars of the implementations of the
language, and sees the extensions to the language that have failed
to gain acceptance and have since been dropped (of course, whether
this actually happens is an open question). If none are dropped,
-> the distinguishing feature of programming languages is that the
-> *INTENT* of the users for *HISTORICAL COMPATIBILITY* forces the
-> language to only extend its grammar and not prune it (as can and
-> does happen in natural language).
As to the agent of leakage, and the fact that programmers cannot (yet?)
alter or extend the language on their own (in analogy with BN's children
who never get a grammar right), clearly the development of a programming
language grammar is heavily influenced by the strict two-layer society of
the compiler writers and the users of a programming language.
------------
Mitch Sundt
sundt@mitre.arpa
------------------------------
From: Brian Holt <seismo!umix!apollo!brian>
Date: Tue, 5 May 87 13:37:27 EDT
Subject: Re: Natural vs. artificial languages
In NL-KR Digest, Volume 2 Number 33, sundt@mitre.arpa says:
> To say that programming languages are not "natural languages" because they
> do not describe "love" or other topics is also absurd. The user community
> does not intend them to describe such concepts, and therefore they have been
> dropped as unuseful!!! I'm sure the Eskimo language has no words which
> adequately describe the sweltering heat of a tropical jungle -- because they
> have no use for such words.
> (On the other hand, I understand that they have many (20?) words describing
> snow, because what type of snow it is is (was?) very important for their
> survival (and survival was a major intention in their use of language)).
Just a few quick points. No, Eskimo languages do not have 20 words for snow,
although they do have many different words which describe snow. An early
paper misinterpreted the suffix formation rules and counted many separate
words, rather than identifying them as simply modifiers to a few words (sort
of like snow, sleet, hail, slush, etc. being modified by powder, corn,
granular, wet, sticky, etc. in English).
More importantly, although the "eskimo" language may not have specific words
for a tropical jungle (although I do not know that for a fact), it certainly
can "adequately describe the sweltering heat" of one. That is one of the
properties that distinguishes natural, human languages from artificial, formal
languages: natural languages can be used to describe anything. The fact
that US English can describe tropical jungles when we have none in this
country simply serves to reinforce this fact.
I apologize for not providing references for these points. The latter
is well discussed by most introductory texts in modern linguistics (such
as Akmajian and Heny, _An Introduction to Linguistics_) and I can dig
up the citation for the former should anyone be interested.
In general, I am unsure as to whether those posting to this mailing list
are simply posting their intuitive opinions as native speakers of some
human language, or are drawing on any of the insights and conclusions
brought about by a multitude of studies and experiments in psycho-
and anthropological linguistics in the past two decades. Unfortunately,
it often appears that many writers are simply arguing from their
experience as language users, and not as language researchers.
=brian
Brian R. Holt
Apollo Computer, Inc.
apollo!brian@eddie.mit.EDU
------------------------------
Date: Sun, 3 May 1987 23:35 EDT
From: MINSKY%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU
Subject: Re: grammar checkers
[Excerpted from AIList]
I agree with Todd Ogasawara: one should not criticise to extremes. I
found RightWriter useful and suggestive. It was helpful in detecting
obnoxious passive constructions and excessively long sentences. In
final editing of "The Society of Mind" I used spelling checkers to
notify me of unfamiliar words, and I often replaced them by more
familiar ones. I also used RightWriter to establish a "gradient". The early
chapters are written at a "grade level" of about 8.6 and the book ends
up with grade levels more like 13.2 - using RightWriter's quaint
scale.
Naturally the program makes lots of errors, but they are instantly
obvious and easily ignored.
I imposed a "style gradient" upon "The Society of Mind" because I
wanted its beginning to be accessible to non-specialists. I
cheerfully assumed that any reader who gets to the end will by then
have become a specialist.
------------------------------
Date: Mon, 4 May 87 12:18 EST
From: "Linda G. Means" <MEANS%gmr.com@RELAY.CS.NET>
Subject: re: grammar checkers
[Excerpted from AIList]
Todd Ogasawara writes in AILIST Digest v.5 #108:
>I think that if these style checking tools are used in conjunction
>with the efforts of a good teacher of writing, then these style
>checkers are of great benefit. It is better that children learn a
>few rules of writing to start with than no rules at all. Of course,
>reading lots of good examples of writing and a good teacher are still
>necessary.
Sure, but the problem is the bogus rules that the child is likely
to infer from the output of the style-checking program, like never
write a sentence longer than x words, or don't use passive voice,
or try not to write sentences with multiple clauses.
>On another level... I happened to discuss my response above with one
>of my dissertation committee members. His reaction? He pulled out
>a recent thesis proposal filled with red pencil marks (mostly
>grammatical remarks) and said, "So what if the style checkers are
>superficial? Most mistakes are superficial. Better that the style
>checker should find these things than me."
Sounds like a rather irresponsible attitude to me, given the state
of the art of automatic style checkers. Your prof needs a graduate
student slave if he dislikes having to correct student grammar
errors. Let's consider separately the issues of grammar correction and
stylistic advice (the two worlds partially overlap, but remain distinct
in some areas).
1. Grammar. As your prof points out, lots of grammar errors are
superficial, but your commercial grammar checker will fail to find all
of them, correct perceived mistakes which really aren't, and give plenty
of bad advice. Those programs "know" less about grammar than the students
who use them. Any bona fide grammatical errors which can be found by the
commercially available software could also be found by the writer if he
were to proof his paper carefully. It grieves me to think of students
failing to proof their own papers because the computer can do it for them.
2. Style. The analysis of writing style is not a superficial task; it is,
in fact, a kind of expertise not found in many "literate" individuals.
In my experience, the best way to learn to write well is to scrutinize
your work in the company of a good writer who will think aloud with you
while helping you to rewrite sentences. I've successfully taught various
people to write that way. The second best method is a patient teacher's
red pen. In both cases, your prose is being evaluated by someone who is
trying to understand what you are trying to communicate in your writing.
You must understand that this is not the case with the computer. It
probably has no way of representing the discourse as a whole; all analysis
is performed at the sentence level with a heavy emphasis on syntax and
with no semantic theory of style. The result? Stylistic advice which
is so superficial as to be useless. Many years of research in the area of
computational stylistics have provided evidence that although some (few)
stylistic discriminators can be found through syntactic analysis, the
features which contribute to textual cohesion and to a given writer's
"stylistic fingerprint" cannot. Researchers are still stymied by the
problem of identifying stylistically significant features of a text.
Yet the program advocated by Carl Kadie feigns an understanding of the
effect that the prose will have on its reader; it generalizes from
syntactic structure to stylistic impact. Look at the summary generated
at the end of the text. The program equates active voice and short
sentences with "directness". I won't take the time here to argue
against the use of fuzzy adjectives like 'direct', 'expressive', 'fresh',
and so on to describe prose, since the use of such imprecise language
is a longstanding tradition in the arena of literary criticism. I can't
tell you exactly how to make your writing "direct", but I know that
directness cannot always be computed empirically, which is how your
machine computes it. A paragraph of non sequiturs probably shouldn't
be characterized as direct, even if all sentences are short and contain
only active verbs.
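The point can be sketched with a toy scorer. This is a hypothetical surface
heuristic, not RightWriter's actual algorithm: the passive-voice pattern, the
12-word threshold, and the function names are all assumptions made for
illustration. It counts a sentence as "direct" when it is short and contains
no crude be-verb-plus-participle match, which is roughly the kind of
computation criticized above.

```python
import re

# Crude passive detector: a form of "to be" followed by a word ending in
# -ed or -en. An assumption for illustration; it misses many passives and
# flags some non-passives.
PASSIVE = re.compile(
    r"\b(?:is|are|was|were|been|being|be)\s+\w+(?:ed|en)\b", re.IGNORECASE
)


def split_sentences(text):
    """Split on sentence-final punctuation; ignores abbreviations entirely."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def surface_directness(text, max_words=12):
    """Fraction of sentences that are short and show no passive match."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    direct = sum(
        1
        for s in sentences
        if len(s.split()) <= max_words and not PASSIVE.search(s)
    )
    return direct / len(sentences)


# Three unrelated short active sentences versus one coherent long passive one.
non_sequiturs = "The kettle sings. My uncle owns Tuesday. Rain prefers algebra."
coherent = (
    "The report was written by the committee after the results "
    "were reviewed by two outside readers."
)
```

Under these assumptions the non sequiturs score as fully "direct" while the
coherent sentence scores zero, since the measure sees only sentence length and
verb form, never the connection between sentences.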
- Linda Means
GM Research Laboratories
means%gmr.com@relay.cs.net
------------------------------
End of NL-KR Digest
*******************