
NL-KR Digest Volume 02 No. 12


NL-KR Digest             (3/09/87 12:54:36)            Volume 2 Number 12 

Today's Topics:
Rule-based knowledge representation
source of data
Re: A Real Linguistics Question ? ( A restatement )
New-Word Inquiry on NL-KR

----------------------------------------------------------------------

Date: Sat, 28 Feb 87 05:16:33 PST
From: AJ Stonewall Peterson <andrewjp%drizzle%uoregon.csnet@RELAY.CS.NET>
Subject: Rule-based knowledge representation

I've been considering possible ways to represent a very large
vocabulary relatively compactly for use in a generalized English
language parser, probably implemented on a "supermicro" computer
(there's even a Compaq-sized UNIX 4.2 box out now), and I wonder
whether there has been any investigation done on the efficiency and/or
feasibility of one possible scheme that's reared its ugly head.

My thought is to create a small kernel vocabulary of nouns and verbs
with extensive definitions based on subsets and functional attributes
of the words, then to build a much larger vocabulary (I'm hoping
40,000 words will be feasible) by defining words in terms of the
kernel words in a compact rule-based system. For example, the verb
"run" could be defined as the kernel verb "move" with rule statements
attached to stipulate that it be movement on foot above a certain
speed. In this way, I expect that a very complete generalized parser
could be kept small enough to reside entirely in the RAM of many
forthcoming supermicros, speeding the interpretation of English
sentences into coded requests for action and making it useful to a
much wider audience.
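[A minimal sketch of the scheme being proposed, in Python; all names and
features here are invented for illustration and are not from any existing
parser. The point is just that a derived entry costs only a pointer to a
kernel word plus a few restricting rules:]

```python
# Hypothetical sketch: a small kernel lexicon plus rule-based derived
# entries. All names and features are illustrative, not from a real system.

KERNEL = {
    "move": {"category": "verb"},
    "animal": {"category": "noun"},
}

# A derived word references a kernel word and adds restricting rules,
# so each entry costs only a pointer plus a few feature constraints.
DERIVED = {
    "run":  {"kernel": "move", "rules": {"means": "on-foot", "speed": "fast"}},
    "walk": {"kernel": "move", "rules": {"means": "on-foot", "speed": "slow"}},
}

def lookup(word):
    """Resolve a word to its kernel category plus accumulated rules."""
    if word in KERNEL:
        return {"category": KERNEL[word]["category"], "rules": {}}
    entry = DERIVED[word]
    base = lookup(entry["kernel"])      # recurse down to the kernel word
    base["rules"].update(entry["rules"])
    return base

print(lookup("run"))
# e.g. {'category': 'verb', 'rules': {'means': 'on-foot', 'speed': 'fast'}}
```

[Storage stays compact because the extensive definitions live only on the
kernel words; the 40,000 derived entries are each a reference plus rules.]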

Any thoughts on this idea would be appreciated. I'm learning my AI
from the top down, reading dissertations and going back to fill in
gaps in my knowledge, so let me know if there's something I'm missing.
Thanks.

Eric Swanson

"Better Living through Rule-Based Systems"
============================================================
Eric Swanson | (503) 484-2790 or (503) 484-4184
c/o | P.O. Box 30098
andrewjp@drizzle.uucp | Eugene, OR 97403
============================================================

------------------------------

Date: Wed, 4 Mar 87 13:23:08 EST
From: Bruce Nevin <bnevin@cch.bbn.com>
Subject: source of data

Mark Edwards questions whether `a sentence is always the proper datum
for doing GB, GPSG' etc. He says that `Chomsky would argue that . . .
sentences that seem grammatical [only] in a certain context are
really syntactically ungrammatical but pragmatically correct. Or
something on that line of thought.'

Lots of work suggests that discourses specified as to sublanguage
(subject-matter domain) are the proper domain for grammatical research,
not sentences. Consider, for instance, that you cannot take any two
perfectly sensible assertions and conjoin them with `and', thus:

Not all of the nonverbal signals of body language are concerned
with external choices, of course, and looking at GNP from this
perspective, we see it not so much as a stream of goods, but as
a flow of buying, of expenditure, of demand.

(The first part is from _Deciphering the Senses_, Rivlin & Gravelle, p.
98; the second is from _Economics Explained_, Heilbroner & Thurow, p. 71.)

And this one is really not too bad. Try the following from the same
source (p. 70 and p. 182, respectively):

The tuna may be using this information about the angle of the
sun as a prime means of navigation and over-the-counter
brand-name aspirins sell for up to three times the cost of
nonbrand versions of the identical product.

To sequences like this, one can only say `huh?!' It rapidly gets much
worse when you use sentences other than the topic sentence from paragraphs in the
two sources and take less care than I have to avoid deictics,
referentials, and such.

Note that you can't get out of the problem by postulating discourse
constraints that are beyond the scope of syntax: you can have the
identical nonsequitur under any conjunction or in a relative clause.

One may also have sentences that are absolutely unacceptable in a
sublanguage grammar (`The cells were washed in polypeptides' in a
cellular biology sublanguage) but are subject to no such judgement
outside that domain (there might be a vat of polypeptides someplace).
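[The sublanguage point can be sketched as a selectional-restriction check;
the word classes and function below are invented for illustration and are
not taken from the LSP system or any sublanguage grammar:]

```python
# Illustrative sketch of a sublanguage selectional restriction: within a
# cellular-biology sublanguage, "wash ... in X" requires X to be a wash
# medium, so "washed in polypeptides" is rejected, even though general
# English imposes no such constraint (there might be a vat of them someplace).

CELL_BIO_CLASSES = {
    "buffer": "medium",
    "saline": "medium",
    "polypeptides": "substance",   # not a licensed wash medium
}

def acceptable_in_sublanguage(verb, obj, medium):
    """Accept 'verb obj in medium' only if the sublanguage licenses it."""
    if verb == "wash":
        return CELL_BIO_CLASSES.get(medium) == "medium"
    return True   # no restriction recorded for other verbs

print(acceptable_in_sublanguage("wash", "cells", "saline"))        # True
print(acceptable_in_sublanguage("wash", "cells", "polypeptides"))  # False
```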

On sublanguages, try the book _Sublanguage_ edited by Kittredge &
Lehrberger, and the study of the sublanguage of immunology by Harris et
al. (_The Form of Information in Science: A Test Case in Immunology_,
forthcoming from D. Reidel in Boston Studies in Phil. of Sci.) Look at
work by Sager, Grishman, and others for an earlier approach based on the
LSP system at NYU.

Bruce Nevin
bn@cch.bbn.com

(This is my own personal communication, and in no way expresses or
implies anything about the opinions of my employer, its clients, etc.)

------------------------------

From: edwards@uwmacc.UUCP (mark edwards)
Subject: Re: A Real Linguistics Question ? ( A restatement )
Date: 26 Feb 87 19:23:31 GMT
Organization: UW-Madison Academic Computer Center
Keywords: Source of Linguistic Data

>In article <259@su-russell.ARPA> goldberg@su-russell.UUCP (Jeffrey Goldberg) writes:
:In article <1111@uwmacc.UUCP> I write:
: I am thinking about doing a paper on a topic that I think is one of
:the Fundamental Problems of Linguistics. Namely, is a sentence always
: the proper datum for doing GB, GPSG or other roughly related research.
>
>But, somehow, I gather that you are really asking about units
>larger than the sentence. I will get to that below.
>
>> Let me say outright that I am not a particular fan of GB or that line
>
>GPSG is not derivative of GB in any sense. But GPSG and GB have
>common origins. Both are theories of "Generative Syntax"

What I really meant to say is that both are children of Transformational
Grammar, and that the philosophy at the roots of both GB and GPSG
still causes problems (?)

: Chomsky would argue that a sentence is the proper place for doing work
: in Linguistics (syntax ?). He would also say that sentences that seem
: grammatical in a certain context are really syntactically ungrammatical,
: but pragmatically correct. Or something on that line of thought.
:
>
>I am not sure that I understand what you are asking here.
> [much deleted]
>
>But to get back to your original question, there are linguists who
>look at things larger than the sentence. It is not clear that there
>are linguistically definable units at those larger levels, thus
>making the sentence the largest unit that one can really try to say
>things about. But just because there aren't larger units doesn't
>mean that there is nothing for linguists to discover at what is
>called "the discourse level". It is a useful thing to look at, and
>I wish you well with it.
>

I am specifically interested in why some linguists think a sentence
can realistically be used as grounds for research, and of course why
some think it cannot be. Gazdar (1979) says that when we are asked
whether some sentence is grammatical, we naturally try to find some
context where this sentence is grammatical. The key word here being
context. So if we try to bring some context to bear anyway on a
single sentence, what really is the difference if we supply the context
in previous sentences?

It might be that this is true on some levels, but makes no difference
in what we are actually trying to do at the moment (like binding anaphors
or pronominals). Can we actually segment Linguistics into its various
components (phonology, syntax, semantics, pragmatics)? Or is language
all semantics, or all pragmatics? Is syntax just an artifact of the
other two? Or, of course not: syntax is this, and we really can say that
about a sentence.

What are the implications for AI if it turns out that a sentence cannot
be examined without other contextual sentences? (I think that AI has
already discovered that this is the case. For example, when talking
about psychology you must bring in the domain of psychology to parse
a given sentence. Problems arise when a sentence has meanings in
multiple domains.)

What are the implications for Chomsky if this is the case?


: What I am interested in, is any references or any thoughts (specific
: examples) on this topic.

" "

I hope some of this makes sense.

mark
--
edwards@unix.macc.wisc.edu
{allegra, ihnp4, seismo}!uwvax!uwmacc!edwards
UW-Madison, 1210 West Dayton St., Madison WI 53706

------------------------------

From: allegra!bwb
Date: Wed, 4 Mar 87 09:36:55 est
Subject: New-Word Inquiry on NL-KR

To: Larry Wasnick
From: Bruce Ballard

My apologies for the delayed response to your inquiry, but here is
a summary of how lexical, syntactic, and semantic information
is acquired from a user for an unknown word in the TELI system.
Details can be found in the Ballard & Stumberger paper in the
1986 ACL Proceedings and the Ballard paper in the 1986 COLING.

1. When a new token is seen, a menu appears proposing possible
spelling corrections or letting the user say this is a "new word".

2. If the user says it's new, the system lists open parts of speech
and the user picks one (a null response => return to previous menu).

3. If the word belongs to a part of speech directly associated with objects
of the domain at hand (e.g. noun, adjective, or proper noun, which are
treated as 1-place modifiers), a multiple-choice menu appears for the
user to indicate what object types it deals with (e.g. for "large"
the user might button PERSON and OFFICE but not PROJECT).

4. Now the system checks whether the part of speech of the new word
is the "head" of a phrase type (e.g. prep for PP, verb for VP,
adj for AdjP (as in "ADJACENT to")) and the user gets a chance to
supply any or all case frames for the word. As described in the
COLING-86 paper this can be done, at the user's option, by menus or
by typing phrases like "a person can work on a project". If new
words come up here, an appropriate recursion on the entire process
of acquisition (starting with the possible-spelling-error menu) is done.

5. At this point the system continues processing. When a parse is found,
and the semantic component comes alive, it will notice that there is
no known meaning for the new word in whatever context it is being
used (e.g. "large person" and "large office" will have separate defs).
It responds by suggesting some obvious meanings, in terms of semantics
already known to it, along with a menu option for the user to supply a
meaning which is an arbitrary boolean combination of "primitive" defs,
which again may be given by menu or in English. As described in the
ACL-86 paper, this is a rather sophisticated process, but it has been
thoroughly tested and works rather smoothly.
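[The five steps above might be sketched as a single driver routine; this is
a hypothetical reconstruction in Python, with invented function and menu
names that are not TELI's actual interfaces, and with semantic acquisition
(step 5) left as a stub since it runs later, at parse time:]

```python
# Hypothetical sketch of the acquisition flow described above.
# Menus are stubbed via an `ask` callback; names are invented, not TELI's.

OPEN_PARTS_OF_SPEECH = ["noun", "verb", "adjective", "proper noun", "preposition"]
PHRASE_HEADS = {"preposition": "PP", "verb": "VP", "adjective": "AdjP"}
MODIFIER_POS = {"noun", "adjective", "proper noun"}

def acquire(word, ask):
    """Walk an unknown token through the acquisition steps.
    `ask(prompt, choices)` stands in for the menu system."""
    # Step 1: offer spelling corrections or let the user say "new word".
    if ask("correction or new?", ["correction", "new word"]) == "correction":
        return None
    # Step 2: pick an open part of speech (None => back to previous menu).
    pos = ask("part of speech?", OPEN_PARTS_OF_SPEECH)
    if pos is None:
        return None
    entry = {"word": word, "pos": pos}
    # Step 3: 1-place modifiers record which object types they apply to.
    if pos in MODIFIER_POS:
        entry["object_types"] = ask("applies to?", ["PERSON", "OFFICE", "PROJECT"])
    # Step 4: phrase heads get case frames (new words seen here would
    # recurse on the entire process, starting from step 1).
    if pos in PHRASE_HEADS:
        entry["case_frames"] = ask("case frames?", None)
    # Step 5: semantics are acquired later, when a parse needs the meaning.
    return entry

# A canned "user" supplying one answer per menu, for illustration:
answers = iter(["new word", "adjective", ["PERSON", "OFFICE"], []])
entry = acquire("large", lambda prompt, choices: next(answers))
print(entry)
```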

Hope this information helps and again, sorry for the delay in my reply.

------------------------------

End of NL-KR Digest
*******************
