NL-KR Digest (9/12/88 21:11:42) Volume 5 Number 15
Today's Topics:
Re: open/closed classes
nl evaluation workshop
Data Wanted:
Re: GPSG parsers
Submissions: NL-KR@CS.ROCHESTER.EDU
Requests, policy: NL-KR-REQUEST@CS.ROCHESTER.EDU
----------------------------------------------------------------------
Date: Thu, 1 Sep 88 07:54 EDT
From: Bruce E. Nevin <bnevin@cch.bbn.com>
Subject: open/closed classes
There is some recent work by Leonard Talmy on the supposed cognitive
whys and wherefores of open vs closed classes. Sorry, I don't have a
reference handy.
The supposition that a speech recognizer has to be especially good at
hearing closed-class words misses an important point: the closed-class
words are unstressed and generally subject to reduction in phonemic--how
shall I say--extent. This is part of a general process, apparently in
all languages, of reducing the phonemic representation of words that
carry less information. They are reducible to the extent that they are
redundant. (Not much difficulty predicting the filler in the context
'He __ gone.' You only need enough phonemic content to distinguish the
words 'has, had, was' plus of course more obvious--and less reduced--
constructions incorporating these such as their negatives, `will have
gone', etc.)
Historically, closed-class morphology derives from open-class words that
have become more redundant and predictable, so that their reduced forms
become `frozen' in their now predictable contexts. An example is the
suffix -hood in `childhood', from an earlier form `had' meaning `state',
something like `child-state'. The suffix -ly in adverbs of manner
derives from the dative of a word for `form, body'. Ancestors of
Proto-Indo-European not having been reconstructed, we have no
confirmation that this is the origin of inflectional morphology such as
the preterit in descendant languages like English, but that is certainly
the most plausible assumption. Working on American Indian languages,
Shirley Silver dubbed this process `morphemization' almost 20 years ago.
So affixes (inherently closed-class morphology) appear to be derived by
reduction from once free-standing words. Similarly for closed-class
words. `Because' derives from `by cause'. OED cites 1305 `bi cause
whi'; whi or `why' is the instrumental of the wh- pronouns typified by
`what', reduced to `that' in the later `by cause that, because that'.
(Compare reduction of cause to zero in `for the cause why' --> `forwhy',
a common conjunction now obsolete, to which compare further `from the
place where' --> `from where'.) Zeroing of `why ~ that' in `because
why, because that' leaves `because' as a conjunction, a closed-class
word. (See Jespersen _Modern English Grammar on Historical Principles_
V 397 and Harris _A Grammar of English on Mathematical Principles_ 195
for further details.)
An example currently in progress in English is `going to' --> `gonna', a
reduction that takes place before verbs but not before nouns (*`I'm
gonna New York') precisely because `going to' can occur before the
whole class of verbs (and consequently carries less information and is
subject to reduction there) but cannot occur before every possible
noun. (Note that in e.g. `I'm going to authority' an indefinite noun,
one of exceptionally broad distribution, can be understood as having
been elided: `I'm going to someone of/in authority'. It is not
possible to reverse a reduction in this way to account for the broad
distribution of `going to' before verbs.) This appears to be on the way
to being a separate future tense morpheme in the closed-class set.
The above example of `forwhy' illustrates that closed-class words also
become obsolete and drop from the language. The class is closed with
respect to distribution, and conservative but not closed with respect to
change.
Bruce Nevin
bn@cch.bbn.com
<usual_disclaimer>
------------------------------
Date: Fri, 2 Sep 88 12:19 EDT
From: palmer@PRC.Unisys.COM
Subject: nl evaluation workshop
CALL FOR PARTICIPATION
Workshop on
Evaluation of Natural Language Processing Systems
Dec 8-9
Wayne Hotel, Wayne, PA (Philadelphia)
There has been much recent interest in the difficult problem of
evaluating natural language systems. With the exception of natural
language interfaces there are few working systems in existence, and
they tend to be concerned with very different tasks and use equally
different techniques. There has been little agreement in the field
about training sets and test sets, or about clearly defined subsets
of problems that constitute standards for different levels of
performance. Even those groups that have attempted a measure of
self-evaluation have often been reduced to discussing a system's
performance in isolation - comparing its current performance to its
previous performance rather than to another system. As this
technology begins to move slowly into the marketplace, the need for
useful evaluation techniques is becoming more and more obvious. The
speech community has made some recent progress toward developing new
methods of evaluation, and it is time that the natural language
community followed suit. This is much more easily said than done and
will require a concentrated effort on the part of the field.
There are certain premises that should underlie any discussion of
evaluation of natural language processing systems:
(1) It should be possible to discuss system evaluation in general
without having to state whether the purpose of the system is
"question-answering" or "text processing." Evaluating a system
requires the definition of an application task in terms of I/O
pairs which are equally applicable to question-answering, text
processing, or generation.
(2) There are two basic types of evaluation: a) "black box
evaluation," which measures system performance on a given task in
terms of well-defined I/O pairs; and b) "glass box evaluation,"
which examines the internal workings of the system. For example,
glass box performance evaluation for a system that is supposed to
perform semantic and pragmatic analysis should include the
examination of predicate-argument relations, referents, and
temporal and causal relations.
Given these premises, the workshop will be structured around the
following three sessions:
1) Defining "glass box evaluation" and "black box evaluation."
2) Defining criteria for "black box evaluation." _A Proposal for
establishing task oriented benchmarks for NLP Systems_
(Session Chair - Beth Sundheim)
3) Defining criteria for "glass box evaluation."
(Session Chair - Jerry Hobbs)
Several different types of systems will be discussed, including
question-answering systems, text processing systems, and generation
systems.
Researchers interested in participating are requested to submit a
short (250-500 word) description of their experience and interests,
and what they could contribute to the workshop. In particular, if
they have been involved in any evaluation efforts that they would
like to report on, they should include a short abstract (500-1000
words) as well. The number of participants at the workshop must be
restricted due to limited room size. The descriptions and abstracts
will be reviewed by the following committee: Martha Palmer (Unisys),
Mitch Marcus (University of Pennsylvania), Beth Sundheim (NOSC), Ed
Hovy (ISI), Tim Finin (Unisys), Lynn Bates (BBN). Submissions should
arrive at the address given below no later than October 1st.
Responses to all who submit abstracts or descriptions will be sent by
November 1st.
Martha Palmer
Unisys
Research & Development
PO Box 517
Paoli, PA 19301
palmer@prc.unisys.com
(215) 648-7228
------------------------------
Date: Mon, 5 Sep 88 16:36 EDT
From: Mark William Hopkins <markh@csd4.milw.wisc.edu>
Subject: Data Wanted:
I am in need of some English text, for setting up a data base. If you
have any to contribute, please e-mail it to me.
I asked Jerry Lewis to set up a telethon for this, but he said he was
busy :-)
------------------------------
Date: Mon, 12 Sep 88 08:02 EDT
From: COR_HVH%HNYKUN52.BITNET@CUNYVM.CUNY.EDU
Subject: GPSG parsers
Some time ago I asked for information on GPSG parsers (or parser-generators)
and promised to report any replies. Up to now, I have been notified of two
efforts in this area.
At the Technical University in Berlin a PROLOG system is being developed in
a machine translation context (Eurotra). It is able to parse and generate
sentences according to a small English or a medium German grammar.
At Boeing, work is being done on a LISP GPSG parser with the eventual
aim of automatic message processing. The system can parse English
sentences using a fairly large grammar and dictionary. Neither system
uses "pure" GPSG (if such a thing exists at all), the most important
difference being the absence of metarules.
I will ask both my contacts to write up their work in more detail and
submit the results to this list.
Hans van Halteren COR_HVH@HNYKUN52.BITNET
------------------------------
End of NL-KR Digest
*******************