NL-KR Digest (5/28/87 17:53:25) Volume 2 Number 45
Today's Topics:
In layman's terms.
Re: English grammar (was Re: In layman's terms.)
BNFs and NL (Was: In layman's terms.)
Re: Chart parsers and PROLOG
----------------------------------------------------------------------
Date: Fri, 22 May 87 18:50 EDT
From: mnetor!utzoo!utgpu!water!watmath!erhoogerbeet@seismo.css.gov
Subject: In layman's terms.
Hello. I am an undergrad going into computer science, and have long been
interested in AI. But AI has always been "above" us lowly first- and
second-year students, and we have been relegated to the quicksorts and
recursive n-factorial problems. I think this would be a good place to have
articles that introduce the layman to AI, to keep our interest.
Recently, we worked on a simple compiler for numerical expressions using
pwap on CMS, which is similar to yacc on Unix. We entered a modified
Backus-Naur Form grammar and it spewed forth Pascal source code for an
expression interpreter.
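To give the flavor, the sort of grammar such a tool accepts looks roughly
like this (an illustrative sketch only; the notation pwap actually uses may
differ):

    <expr>   ::= <term> | <expr> "+" <term> | <expr> "-" <term>
    <term>   ::= <factor> | <term> "*" <factor> | <term> "/" <factor>
    <factor> ::= <number> | "(" <expr> ")"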
Also, I have played many an adventure game and have always
been intrigued by the way it parses and interprets commands given to it in
simple English.
Is there a Backus-Naur Form for the English language itself or is this too
complicated? If not, how is it that we can understand a huge variety of
different sentence forms and still recognize that some are syntactically
incorrect? Basically, what I am asking is: is it possible to do syntactic
checking as if "compiling" a sentence with rules set down in some BNF?
As I understand it so far, natural language processing would have at least
two levels (syntactic, semantic), and the syntactic checking level would
be the basis of the other.
"I bed he not on." is not syntactically or semantically correct.
"Colourless green ideas sleep furiously." syntactically but not semantically
correct.
To be semantically correct, a sentence must be syntactically correct.
Semantic checking would involve some connotation checking using trees and
associations, but there is probably no clear-cut way of doing this.
So I ask about syntactic checking.
Some compilers have a front-end/back-end structure. Could syntactic checking
be the front-end part of such a compiler? Theoretically, a syntax checker
could be given a certain BNF for a language and be plugged into a back-end
semantic checker.
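(To make the idea concrete, here is a toy sketch in Prolog's DCG notation,
which is essentially executable BNF; the grammar and vocabulary are made up,
and phrase/2 plays the role of the pluggable front-end syntax checker.)

    s  --> np, vp.              % a sentence is an NP followed by a VP
    np --> det, n.
    vp --> v, np.
    vp --> v.
    det --> [the] ; [a].
    n   --> [dog] ; [cat] ; [idea].
    v   --> [chased] ; [sleeps].

    % ?- phrase(s, [the, dog, chased, a, cat]).   succeeds
    % ?- phrase(s, [dog, the, chased]).           fails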
But there must be someone out there on the net who can set me straight and
do it in layman's terms.
From an interested undergrad:
erhoogerbeet@watmath.uucp
ehoogerbeets@wateuler.uucp
Edwin (Deepthot) Hoogerbeets

"`The Guide says there is an art to flying,' said Ford, `or at least a
knack. The knack lies in learning how to throw yourself at the ground and
miss.' He smiled weakly."
------------------------------
Date: Wed, 27 May 87 11:34 EDT
From: Randy Gordon <randyg@iscuva.ISCS.COM>
Subject: Re: English grammar (was Re: In layman's terms.)
There are two major "camps" of natural language techniques.
The "Old School" is the syntactic/transformational grammar types, which
comprise the majority of folks actually producing natural language parsers.
Harris's Intellect, for example. Some translation work is done by them:
Tomita has a nice book on English/Oriental translation, and McCord's LMT
does English/German translation, for example. Personally, I find their
efforts brilliant, but as futile as the dying stages of the "ether" theory.
The "Young Turks" are the descendents of Schank's Conceptual Dependency
theory, Sowa, Schank, etc. These generally treat semantics as secondary
to syntactics. Shanks company, Cognitive Systems, several years ago, produced
a system for a Belgian Bank that could translate six languages well enough
to act as an interface to an EFT system. The nice thing about concept based
parsing is that you don't have to fully map the sentence to understand it.
This is very useful in unrestricted input systems.
There is a small group that believes you ought to pick the input words off
menus. It works, and with modifications works better than ANY of the other
methods, but frankly, I don't consider it natural language processing.
A lot of the concept-based parsing techniques have fascinating consequences
for expert systems and database representation work, and may very well
represent the next "hot button" in AI.
Randy Gordon "Tao ku tse fun pee"
------------------------------
Date: Wed, 27 May 87 14:47 EDT
From: srt@locus.ucla.edu
Subject: Re: English grammar (was Re: In layman's terms.)
In article <529@iscuva.ISCS.COM> randyg@iscuva.UUCP (Randy Gordon) writes:
>
>The "Young Turks" are the descendents of Schank's Conceptual Dependency
>theory, Sowa, Schank, etc. These generally treat semantics as secondary
>to syntactics.
>
I think you probably mean "syntactics as secondary to semantics".
The important distinction is that the "Old School" believes that there
exist semantics-free language principles (Universal Grammar, the Language
Acquisition Device, and so on) and as a consequence studies language separate
from meaning (i.e., semantics).
The "Young Turks" study language as a (natural) representation for internal
concepts (Schank's CD is more a theory of concept representation than of
linguistics). As a consequence, the research focus in this group is on
the processes involved in turning language into concepts and vice versa.
It has turned out that "semantics" (in the sense of word meanings and
concept manipulation) is much more important to correctly translating
from language to concepts than syntactic information is.
Unlike the "Old School" that tries to study syntax in isolation, the "Young
Turks" recognize the importance of syntax and use syntactic information
whenever appropriate. The apparent bias arises because of the actual
relative importance of semantics vs. syntax, not because of any research
methodology.
Scott R. Turner
UCLA Computer Science "But If I Graduate,
Domain: srt@ucla.cs.edu I'll Have to Work for a Living"
UUCP: ...!{cepu,ihnp4,trwspp,ucbvax}!ucla-cs!srt
------------------------------
Date: Wed, 27 May 87 18:24 EDT
From: paul robinson wilson <code@sphinx.uchicago.edu>
Subject: Re: English grammar (was Re: In layman's terms.)
In article <529@iscuva.ISCS.COM> randyg@iscuva.UUCP (Randy Gordon) writes:
>
>There are two major "camps" of natural language techniques.
>
>The "Old School" is the syntactic/transformational grammer types, which
>[...]
>The "Young Turks" are the descendents of Schank's Conceptual Dependency
>theory, Sowa, Schank, etc. These generally treat semantics as secondary
>to syntactics...
Uh, I think it's the other way around (Schankians treat semantics as
primary), but it's a very important point.
The posting that started this exchange suggested intro articles for
beginners. I don't think this is an appropriate place for that, since
any good articles are not going to be really short (translation: they'll
cost the net a lot of money).
However, it seems like an excellent idea to post references to good intro
material, both for beginners and for teachers of beginners.
So how about it? Any references to good intro articles on NLP, especially
ones that clearly discuss both the syntax-first and integrated approaches?
(Also, is James Allen's new book good?)
|=-=-=-=- "COMPUTER TALKS WITH THE DEAD: New microchip receives -=-=-=-=|
| brainwaves from beyond the grave" -- Weekly World News, 5/26/87 |
| Paul R. Wilson EECS Dept.(M/C 154) UIC Box 4348 Chicago, IL 60680 |
| ARPA: uicbert!wilson@uxc.cso.uiuc.edu UUCP: ihnp4!uicbert!wilson |
|=-=-=-=-=-=-=- if no answer try: ihnp4!gargoyle!sphinx!code -=-=-=-=-=-=-=|
------------------------------
Date: Wed, 27 May 87 23:19 EDT
From: Brian Hughes <hughes@endor.harvard.edu>
Subject: Re: English grammar (was Re: In layman's terms.)
In article <1116@houdi.UUCP> marty1@houdi.UUCP (M.BRILLIANT) writes:
(summarized)
>In article <13263@watmath.UUCP>, erhoogerbeet@watmath.UUCP writes:
>> ...
>> Is there a Backus-Naur Form for the English language itself or is this too
>> complicated? ... Basically, what I am asking is: is it possible to do syntactic
>> checking as if "compiling" a sentence with rules set down in some BNF?
Natural language is not context-free (though some people disagree on
this). BNF formalisms cannot deal with context-sensitive languages.
>About 30 years ago when I was at MIT doing graduate study in EE, my
>wife was talking with a guy named Chomsky who wanted to do machine
>translation. The effort resulted in new approaches to English grammar,
>but not in machine translation.
While in a strict sense this is true, Chomsky's transformational
grammar seems to be almost universally accepted as the basis upon which to
build models that deal with the syntax of natural languages. This is true
for computerized models as well as pure abstract models.
>We read, write, talk, listen, understand and misunderstand each other
>in this chaos because our brains are not structured like computers.
While I agree that my brain is not structured like an IBM/PC
(at least IBM hasn't asked me for royalties :-)) or any existing silicon
processor, I don't agree with the implication that a computer cannot,
a priori, understand natural language. We just haven't built intelligent
enough systems yet (hardware & software architectures). I realize that
Marty Brilliant may not have meant the implication I read; just wanted
to make my position clear.
>> As I understand it so far, natural language processing would have at least
>> two levels (syntactic, semantic), and the syntactic checking level would
>> be the basis of the other.
>
>True.
But that's not the end of the matter. You also have to deal with the
higher levels of language organization, such as discourse (e.g., a
conversation). An utterance may refer back to an entity introduced in a
previous utterance in "shorthand" - by a pronoun or a short referring
expression. To understand that shorthand, we somehow are able to
(unconsciously) retrieve the full referent. At an even higher level, a
discourse may go from one concept to a sub-concept, engage in temporary
diversions, and jump from one thing to another, yet we can understand it.
People are starting to explicate the rules of discourse grammar, but lots
remains to be done.
>Seriously, I would like to know what (in plain English, if possible) is
>going on in formal English grammar and natural language parsing.
For a nice, powerful method of parsing English into a deep
structure representation, check out ATNs (Augmented Transition Networks).
Bill Woods developed this method in the late 60's. You can implement a
simple ATN in a couple of pages of LISP (see Jon Amsterdam's article in
AI Expert a few months ago, also online in AI Expert's BBS). For a
full ATN writeup, including code, see The Lunar Sciences Natural Language
Information System: Final Report by W.A. Woods, R.M. Kaplan, and
B. Nash-Webber, BBN Report #2378, BBN, 50 Moulton St., Cambridge, MA.
Also see Madeline Bates' article "The Theory and Practice of Augmented
Transition Networks" in Natural Language Communications with Computers,
Bolc, L., ed., Springer-Verlag, 1978.
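To give a feel for the mechanism, here is a minimal transition-network
recognizer in Prolog (a bare sketch, not Woods's full ATN: a real ATN also
puts registers and tests on the arcs; the network and the four-word lexicon
are made up):

    % arc(State, Label, NextState): Label is cat(C) for a lexical
    % category, or net(N) for a recursive call to sub-network N.
    arc(s0,  net(np),  s1).    arc(s1,  net(vp),  s2).
    arc(np0, cat(det), np1).   arc(np1, cat(n),   np2).
    arc(vp0, cat(v),   vp1).   arc(vp1, net(np),  vp2).
    start(s, s0).  start(np, np0).  start(vp, vp0).
    final(s, s2).  final(np, np2).  final(vp, vp1).  final(vp, vp2).

    word(det, the).  word(n, dog).  word(n, cat).  word(v, chased).

    % traverse(Net, Words0, Words): recognize a Net constituent at the
    % front of Words0, leaving the unconsumed suffix in Words.
    traverse(Net, W0, W) :- start(Net, S), walk(Net, S, W0, W).

    walk(Net, S, W, W) :- final(Net, S).
    walk(Net, S, [Word|W0], W) :-
        arc(S, cat(C), S1), word(C, Word), walk(Net, S1, W0, W).
    walk(Net, S, W0, W) :-
        arc(S, net(Sub), S1), traverse(Sub, W0, W1), walk(Net, S1, W1, W).

    % ?- traverse(s, [the, dog, chased, the, cat], []).   succeeds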
For a lot of different views of language understanding, all the
way from syntax through discourse, see Readings in Natural Language
Processing, Grosz, B., Jones, K. S., and Webber, B., eds., 1986,
Morgan Kaufmann. After reading this, read a transcript of a naturally
occurring informal conversation to see how far we have yet to go.
Please excuse the length of this - I just finished 2 courses
in this stuff with Bill Woods and Barbara Grosz, & it's still buzzing
in my head.
--------------------------------------------------------------------
Disclaimer: Correct statements mainly due to profs. Woods & Grosz,
stupidities all my own.
------------------------------
Date: Thu, 28 May 87 05:41 EDT
From: Jeffrey Goldberg <goldberg@su-russell.ARPA>
Subject: BNFs and NL (Was: In layman's terms.)
In article <13263@watmath.UUCP> erhoogerbeet@watmath.UUCP (Edwin (Deepthot)) writes:
>Is there a Backus-Naur Form for the English language itself or is this too
>complicated? If not, how is it that we can understand a huge variety of
>different sentence forms and still recognize that some are syntactically
>incorrect? Basically, what I am asking is: is it possible to do syntactic
>checking as if "compiling" a sentence with rules set down in some BNF?
>erhoogerbeet@watmath.uucp
There are a couple of things to distinguish here. It is possible that
there is a recursive grammar for a natural language (let's say English)
without there being a BNF. There are two questions here:
(1) Is it possible in general to write a BNF grammar for a
natural language?
(2) Is it appropriate to write a BNF grammar for a natural
language?
First, a small confession. I am a linguist and not an AI researcher. I
have never taken a computer science course in my life. I believe, however,
that a BNF grammar is a form of Context-Free Phrase Structure Grammar
(CF-PSG; I will explain this later). If I am wrong, a portion of what I
have to say will be meaningless and I will be royally flamed. I don't have
the references here right now, but I am willing to go out on a limb.
Let me take question (2) first. Is it appropriate to
parse/describe/characterize the grammar of a natural language with a BNF
grammar? Well, it seems that you have been exposed to only this
grammar formalism, so you have found yourself captivated by this notion of
defining a language in these terms. But there are other ways of
characterizing (formal) languages. If you use any of the Unix editors (ed,
ex, vi) or Unix utilities such as grep or sed, you will be familiar with
another set of pattern-matching tools. The pattern search done with these
programs is basically checking whether some string is in the language
defined by the grammar. Now while you could recognize (almost) any language
that grep can recognize using a BNF grammar, it would be very awkward to
have to write a BNF grammar when a simple grep-type pattern would do the
job. The language for writing grep-type expressions is tailored to a
particular class of tasks. It provides easy ways of talking about a range
of characters, etc. The (largely) more powerful BN form lacks such
convenient devices.
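For instance, the grep character class [0-9] names ten alternatives in five
characters, while the equivalent BNF must spell every one of them out:

    <digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"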
So how does this apply to natural language syntax? Well, in many natural
languages verbs agree with their subjects. This could be coded up in a BNF
grammar, but it would take a substantial multiplication of rules. Now, you
might not consider this a problem, but you must realize that the linguist
would like her system for writing rules to express certain general facts
about natural languages. One such fact is (3).
(3) Agreement systems happen in whole bunches of languages.
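Concretely (a made-up English fragment): a pure BNF must split every
category by number, so rules multiply,

    <s>     ::= <np-sg> <vp-sg> | <np-pl> <vp-pl>
    <np-sg> ::= <det> <n-sg>
    <np-pl> ::= <det> <n-pl>

and they split again for every further agreement feature (person, gender,
case), while a formalism with features can state the agreement fact once.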
This shows one way in which BNF grammars are not tailored to natural
language. Another would be the area of word order. Languages vary in how
strict their word order is. BNF grammars are useful for very strict word
order languages (like English), but worthless for Makua or Warlpiri. Yet
if one were to modify the formalism one used for writing grammars so that a
specific rule doesn't (as in BNF) tell you both about constituency (what
elements can make up what kind of phrase) and linear precedence (what order
those elements must come in) one would have a system that was much more
appropriate for describing natural language. Linear precedence rules would
be kept separate from constituency rules, as in the sketch below. I could
go on and list hundreds of ways in which using BNF grammars would be
inappropriate for natural language analysis. But I will stop with this
question and go on to why it is impossible.
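Schematically, such a factored rule pair might look like this (essentially
the ID/LP rule format of GPSG; the category names are only illustrative):

    S -> NP, VP     (immediate dominance: an S consists of an NP and a VP)
    NP < VP         (linear precedence: an NP precedes a sister VP)

For a free word order language, one simply omits the LP statement.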
If I am correct in assuming that BNF grammars are a subset of
CF-PSGs, then I can point out that there are a number of natural
language constructs that cannot be defined via BNF grammars no
matter how clumsily one were willing to do it. CF-PSGs can only
describe a certain class of languages: the Context-Free
Languages (CFLs). The (formal) language that has as "words" 'a', 'b', and
'c' and as sentences only those strings that consist of some number of 'a's
followed by the same number of 'b's followed by the same number of 'c's
is not a CFL. So this is the language containing the strings "", "abc",
"aabbcc", "aaabbbccc", and so on. I dare you to write a BNF grammar for
such a language!
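For contrast, the two-symbol version (some number of 'a's followed by the
same number of 'b's) is context-free:

    <s> ::= "" | "a" <s> "b"

but no such trick works once the 'c's must match as well. Stepping outside
BNF makes it easy again; here is a sketch in Prolog DCG notation (the
predicate names are made up), which escapes context-freeness by threading
a count through the rules:

    abc --> as(N), bs(N), cs(N).
    as(0)    --> [].
    as(s(N)) --> [a], as(N).
    bs(0)    --> [].
    bs(s(N)) --> [b], bs(N).
    cs(0)    --> [].
    cs(s(N)) --> [c], cs(N).

    % ?- phrase(abc, [a,a,b,b,c,c]).   succeeds
    % ?- phrase(abc, [a,b,b,c]).       fails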
Anyway, I don't have the papers or the exact references right here, but
there were two papers published in the journal "Linguistics & Philosophy"
in 1985 which showed that some natural languages have constructs that are
provably non-context-free. One was by Christopher Culy on reduplication in
Bambara. The other was by Stuart Shieber on embedded serial verbs in Swiss
German. I refer you to these articles for the proofs, which are quite
straightforward.
To answer your question: it is not possible in general to write a BNF
grammar for an arbitrary natural language. But linguists do believe that
there is some form of grammar (specific to language; we don't expect to
find it in a computer science textbook) that is fully appropriate for
describing natural language. Such a form will force each grammar to have
the properties that are universal to language. It will not allow us to
write grammars for unnatural languages. We are still working on discovering
this form.
--
Jeff Goldberg
goldberg@russell.stanford.edu, ...!ucbvax!russell.stanford.edu!goldberg
[The Bulgarians can assassinate him. The cipher key won't get to Pest.]
(The preceding two sentences are to annoy security organizations.)
------------------------------
Date: Sun, 24 May 87 22:37 EDT
From: Jay Kim <nosc!humu!uhccux!jkim@sdcsvax.ucsd.edu>
Subject: Re: Chart parsers and PROLOG
In article <548@uvm-gen.UUCP>, emerson@uvm-gen.UUCP (Tom Emerson) writes:
> I would like to hear from others who have been working in PROLOG for natural
> language processing. I am currently working on implementing an efficient
> parser in PROLOG, and have found the active chart formalism to be the most
> effective for efficient parsing. Has anyone else done work such as this? I
> know that Hideki Hirakawa of ICOT has implemented a chart parser in
> concurrent PROLOG (see ICOT TR-008). For those who would like a
> description of the chart formalism, I would be glad to put a short
> description on the net.
> Does anyone know of a more efficient formalism than the active chart? All
> and any information would be appreciated.
> Tom Emerson
> University of Vermont
I am working on a parser in SPITBOL and another in LISP. I want to know what
you mean by the active chart. It might help to see if the one I am working on
is more efficient than that.
------------------------------
Date: Mon, 25 May 87 21:19 EDT
From: Randy Gordon <uunet!iscuva!randyg@seismo.css.gov>
Subject: Re: Chart parsers and PROLOG
I can't seem to get a reply out, so...
Actually there are a number of formalisms, depending on the language. I use
(unofficially) McCord's Logical Form (Slot Grammar) stuff in Prolog, plus
an occasional Conceptual Dependency type analyzer. Efficiency depends mostly
on what you are trying to do and with what language.
To the fellow who asked about chart parsing:
If you want a good explanation of chart parsing, look at chapter 9 of
Artificial Intelligence: Tools, Techniques, and Applications, by O'Shea and
Eisenstadt, Harper and Row, 1984.
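Meanwhile, to give the flavor in under a page of Prolog, below is a minimal
bottom-up sketch (mine, not the book's; the grammar and lexicon are made up,
and a serious implementation would add an explicit agenda and build parse
trees rather than just recognize). An edge(From, To, Cat, ToFind) spans
positions From..To of the input, building a Cat and still needing the
categories in ToFind; an edge with ToFind = [] is passive (complete), and
the "fundamental rule" extends an active edge with an adjacent passive one.

    :- dynamic edge/4.

    rule(s,  [np, vp]).                      % grammar: LHS, RHS
    rule(np, [det, n]).
    rule(vp, [v, np]).
    word(det, the).  word(n, dog).  word(n, cat).  word(v, chased).

    parse(Words) :-
        retractall(edge(_, _, _, _)),
        length(Words, N),
        scan(Words, 0),
        edge(0, N, s, []).                   % an S spans the whole input

    scan([], _).
    scan([W|Ws], I) :-
        J is I + 1,
        forall(word(C, W), add_edge(I, J, C, [])),
        scan(Ws, J).

    add_edge(I, J, C, ToFind) :-             % already in the chart: no-op
        edge(I, J, C, ToFind), !.
    add_edge(I, J, C, []) :- !,              % new passive edge
        assertz(edge(I, J, C, [])),
        % bottom-up prediction: start an active edge for each rule on C
        forall(rule(L, [C|Rest]), add_edge(I, J, L, Rest)),
        % fundamental rule: extend active edges that were waiting for a C
        forall(edge(H, I, L, [C|Rest]), add_edge(H, J, L, Rest)).
    add_edge(I, J, C, [Next|Rest]) :-        % new active edge
        assertz(edge(I, J, C, [Next|Rest])),
        % fundamental rule: absorb a passive Next edge already at J
        forall(edge(J, K, Next, []), add_edge(I, K, C, Rest)).

    % ?- parse([the, dog, chased, the, cat]).   succeeds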
------------------------------
End of NL-KR Digest
*******************