
NL-KR Digest Volume 03 No. 57


NL-KR Digest             (12/04/87 20:08:14)            Volume 3 Number 57 

Today's Topics:
Knowledge-based bibliographies
Wanted: a module for natural language interface (in LISP)
Text Encoding Standard for the Humanities - Vassar Workshop report
DCG
Re: measures of "Englishness"
Re: Lip Movement and Mental Lexicons?
Re: Language Learning
Re: Language Learning (a Turing test)
Re: Language Learning (anecdotes)
Re: Language Learning (anecdotes)

----------------------------------------------------------------------

Date: Wed, 2 Dec 87 09:01 EST
From: Roland Zito-Wolf <RJZ@JASPER.Palladian.COM>
Subject: Knowledge-based bibliographies


I am looking for references regarding knowledge-bases and KB-based tools
for organizing a bibliographic database on AI. I want to be able to retrieve
references by various indices.

Specific issues I'd like to know about:
- friendly data entry
- searching through alternate paths (say, finding articles related
to a given article in some way: by author, topic, system name,
etc.)
- ability to "evolve" the structure of the KB with time
- what is a reasonable conceptual structure for reference databases, in
general?

I'll post a digest of responses to the list.

Roland J. Zito-Wolf
Palladian Software
4 Cambridge Center
Cambridge, Mass 02142
617-661-7171
RJZ%JASPER@LIVE-OAK.LCS.MIT.EDU

------------------------------

Date: Wed, 2 Dec 87 12:54 EST
From: David Naumann <naumann@umn-cs.cs.umn.edu>
Subject: Wanted: a module for natural language interface (in LISP)


Wanted: A module for a natural language interface (in LISP)

We are developing a tool for research on systems analyst behavior. The tool
requires a natural language front end. We would like to know if anybody has,
or knows of, any natural language interface module (in LISP) that would take a
question in English, validate it and produce a parse tree.

We prefer public domain software, but are also willing to pay for it if
necessary. Please note that we have a limited budget.

Thanks for your help.

J. David Naumann
Macedonio Alanis
University of Minnesota
Management Sciences Department
Management Information Systems Area

ARPA nauman@umn-cs
BITNET naumann@umnacvx
alanis@umnacvx

------------------------------

Date: Wed, 2 Dec 87 22:50 EST
From: Robert Amsler <amsler@flash.bellcore.com>
Subject: Text Encoding Standard for the Humanities - Vassar Workshop report

[The following is a summary prepared by Michael Sperberg-McQueen for
the HUMANIST mailing list of the first workshop on the preparation of
an encoding standard for text in the humanities held at Vassar
College last month. As an attendee and steering committee member, I
would be willing to answer further questions concerning this effort
for the IRLIST or NL-KR communities. The effort to develop a standard for
encoding texts in the humanities is just starting and anyone with
interest in this noble and ambitious goal should not feel the
slightest hesitancy about becoming a part of the effort. What is at
stake is nothing less than the creation, use and preservation of our
global electronic cultural heritage - R. Amsler, (amsler@flash.bellcore.com)]

Contributor: "Michael Sperberg-McQueen" <U18189@UICVM>

A followup on the current status of the ACH effort to formulate
guidelines for text encoding practices.

******************************************************************
*  NOTE: The following encoding conventions have been used to    *
*  represent French accents throughout this message:             *
*                                                                *
*  To Represent Accents  --  Pour la representation des accents  *
*      /   acute accent  -  accent aigu                          *
*      `   grave accent  -  accent grave                         *
*                                                                *
*  The accent codes are typed    Les codes pour les accents se   *
*  AFTER the letter, and are     trouvent APRES la lettre qu'ils *
*  used with both upper and      modifient, et s'utilisent avec  *
*  lower case letters.           les majuscules aussi bien que   *
*                                les minuscules.                 *
******************************************************************
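
The convention in the box above is mechanical enough to undo in code. What
follows is a minimal sketch, not part of the workshop report. It assumes that
an accent code only ever follows a vowel, which holds for the French in this
message but is not stated in the note, and the name decode_accents is
hypothetical.

```python
import re
import unicodedata

# Accent code -> Unicode combining mark (acute U+0301, grave U+0300).
COMBINING = {"/": "\u0301", "`": "\u0300"}

# Assumption: a code counts only when it directly follows a vowel,
# so strings such as "M/C 135" are left untouched.
ACCENTED = re.compile(r"([AEIOUaeiou])([/`])")


def decode_accents(text):
    """Replace letter + accent code with the accented letter, in either case."""
    marked = ACCENTED.sub(lambda m: m.group(1) + COMBINING[m.group(2)], text)
    # NFC folds each letter + combining mark into one precomposed character.
    return unicodedata.normalize("NFC", marked)


print(decode_accents("de/ja` repre/sente/es"))
```

An ordinary slash that happens to follow a vowel would be misread as an acute
accent, so the sketch only suits text known to follow the convention.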


On November 12 and 13, 1987, 31 representatives of professional
societies, universities, and text archives met to consider the
possibility of developing a set of guidelines for the encoding of texts
for literary, linguistic, and historical research. The meeting was
called by the Association for Computers and the Humanities and funded
by the National Endowment for the Humanities. The list of participants
is appended to this document.

The participants heartily endorsed the idea of developing encoding
guidelines. In order to guide such development, they agreed on
the following principles:


The Preparation of                  Re/daction des directives
Text Encoding Guidelines            pour le codage des textes

Poughkeepsie, New York
13 November 1987

1. The guidelines are intended      1. Le but des directives est de cre/er
   to provide a standard format        un format standard pour l'e/change
   for data interchange in             des donne/es utilise/es pour la
   humanities research.                recherche dans les humanite/s.

2. The guidelines are also          2. Les directives sugge/reront
   intended to suggest principles      e/galement des principes pour
   for the encoding of texts           l'enregistrement des textes
   in the same format.                 destine/s a` utiliser ce format.

3. The directives should            3. Les directives devraient

   a. define a recommended             a. de/finir une syntaxe recommande/e
      syntax for the format               pour exprimer le format,

   b. define a metalanguage            b. de/finir un me/ta-langage
      for the description                 de/crivant les syste`mes de
      of text-encoding schemes,           codage des textes,

   c. describe the new format          c. de/crire par le moyen de ce
      and representative                  me/talangage, aussi bien qu'en
      existing schemes both in            prose, le nouveau syste`me de
      that metalanguage and               codage aussi bien qu'un choix
      in prose.                           repre/sentatif de syste`mes
                                          de/ja` en vigueur.

4. The guidelines should            4. Les directives devraient proposer
   propose sets of coding              des syste`mes de codage utilisables
   conventions suited for              pour un large e/ventail
   various applications.               d'applications.

5. The guidelines should            5. Sera incluse dans les directives
   include a minimal set of            l'e/nonciation d'un syste`me de
   conventions for encoding            codage minimum, pour guider
   new texts in the format.            l'enregistrement de nouveaux textes
                                       conforme/ment au format propose/.

6. The guidelines are to be         6. Le travail d'e/laboration des
   drafted by committees on:           directives sera confie/ a` quatre
                                       comite/s centre/s sur les sujets
                                       suivants:

   a. text documentation               a. la documentation des textes,

   b. text representation              b. la repre/sentation des textes,

   c. text interpretation              c. l'analyse et l'interpre/tation
      and analysis                        des textes

   d. metalanguage definition          d. la de/finition du me/talangage et
      and description of                  son utilisation pour de/crire le
      existing and proposed               nouveau syste`me aussi bien que
      schemes                             ceux qui existent de/ja`.

   co-ordinated by a steering          Ce travail sera coordonne/ par un
   committee of representatives        comite/ d'organisation ou`
   of the principal                    sie`geront des repre/sentants des
   sponsoring organizations.           principales associations qui
                                       soutiennent cet effort.

7. Compatibility with existing      7. Dans la mesure du possible, le
   standards will be maintained        nouveau syste`me sera compatible
   as far as possible.                 avec les syste`mes de codage
                                       existants.

8. A number of large text           8. Des repre/sentants de plusieurs
   archives have agreed in             grandes archives de textes en form
   principle to support the            lisible par machine acceptent en
   guidelines in their function        principe d'utiliser les directives
   as an interchange format.           en tant que description des formats
   We encourage funding agencies       pour l'e/change de leurs donne/es.
   to support development of           Nous encourageons les organismes
   tools to facilitate this            qui fournissent des fonds pour la
   interchange.                        recherche de soutenir le
                                       de/veloppement de ce qui est
                                       ne/cessaire pour faciliter cela.

9. Conversion of existing           9. En convertissant des textes
   machine-readable texts to           lisibles par machine de/ja`
   the new format involves the         existants, on remplacera
   translation of their                automatiquement leur codage actuel
   conventions into the syntax         par ce qui est ne/cessaire pour les
   of the new format. No               rendre conformes au format nouveau.
   requirements will be made for       Nul n'exigera l'ajout
   the addition of information         d'informations qui ne sont pas
   not already coded in the            de/ja` repre/sente/es dans ces
   texts.                              textes.

(trad. P. A. Fortier)

******************

The further organization and drafting of the guidelines will be
supervised by a steering committee selected by the three sponsoring
organizations: ACH (the Association for Computers and the Humanities),
ACL (the Association for Computational Linguistics), and ALLC (the
Association for Literary and Linguistic Computing). Drafts of the
guidelines will be submitted for comment to an editorial committee with
representatives of all participating organizations (in addition to the
sponsors, thus far: the Modern Language Association, the Association
for Computing Machinery Special Interest Group for Information
Retrieval, and the Association of American Publishers; the following
groups have indicated interest informally but have not yet formally
pledged participation, in most cases pending a formal vote: the
Linguistic Society of America, the Association for Documentary Editing,
the American Philological Association). The American Anthropological
Association, plus several organizations within Europe, are now being
asked to consider participation.

The interchange format defined by the guidelines is expected to be
compatible with the Standard Generalized Markup Language defined
by ISO 8879, if that proves compatible with the needs of research. The
needs of specialized research interests will be addressed wherever it
proves possible to find interested groups or individuals to do the
necessary work and achieve the necessary consensus. Formation of
specific working groups will be announced later; in the meantime, those
interested in working on specific problems are invited to contact
either Dr. C. M. Sperberg-McQueen, Computer Center, University of
Illinois at Chicago (M/C 135), P.O. Box 6998, Chicago IL 60680 (on
Bitnet: U18189 at UICVM), or Prof. Nancy Ide, Dept. of Computer
Science, Vassar College, Poughkeepsie NY 12601 (on Bitnet: IDE at
VASSAR).

- N.I., C.M.S-McQ

------------------------------------------------------------------------------

List of Participants

NOTE: Association names are given following the names of their
representatives at this meeting.

Helen Aguera, National Endowment for the Humanities
Robert A. Amsler, Bell Communications Research
David T. Barnard, Department of Computing and Information Science,
    Queen's University, Ontario
Lou Burnard, Oxford Text Archive
Roy Byrd, IBM Research
Nicoletta Calzolari, Istituto di linguistica computazionale, Pisa
David Chestnutt (Assoc. for Documentary Editing, American Historical
    Assoc.), Department of History, University of South Carolina
Yaacov Choueka (Academy of the Hebrew Language), Department of
    Mathematics and Computer Science, Bar-Ilan University
Jacques Dendien, Institut National de la Langue Francaise
Paul A. Fortier, Department of Romance Languages, University of
    Manitoba
Thomas Hickey, OCLC Online Computer Library Center
Susan Hockey (Association for Literary and Linguistic Computing),
    Oxford University Computing Service
Nancy M. Ide (Association for Computers and the Humanities),
    Department of Computer Science, Vassar College
Stig Johansson, International Computer Archive of Modern English,
    University of Oslo
Randall Jones (Modern Language Association), Humanities Research
    Computing Center, Brigham Young University
Robert Kraft, Center for the Computer Analysis of Texts, University of
    Pennsylvania
Ian Lancashire, Center for Computing in the Humanities, University of
    Toronto
D. Terence Langendoen (Linguistic Society of America), Graduate
    Center, City University of New York
Charles (Jack) Meyers, National Endowment for the Humanities
Junichi Nakamura, Department of Electrical Engineering, Kyoto
    University
Wilhelm Ott, Universitaet Tuebingen
Eugenio Picchi, Istituto di linguistica computazionale, Pisa
Carol Risher (Association of American Publishers), Association of
    American Publishers, Inc.
Jane Rosenberg, National Endowment for the Humanities
Jean Schumacher, Centre de traitement e/lectronique de textes,
    Universite/ catholique de Louvain a` Louvain-la-neuve
J. Penny Small (American Philological Association), U.S. Center for
    the Lexicon Iconographicum Mythologiae Classicae, Rutgers
    University
C.M. Sperberg-McQueen, Computer Center, University of Illinois at
    Chicago
Paul Tombeur, Centre de traitement e/lectronique de textes,
    Universite/ catholique de Louvain a` Louvain-la-neuve, Belgium
Frank Tompa, New Oxford English Dictionary Project, University of
    Waterloo
Donald E. Walker (Association for Computational Linguistics), Bell
    Communications Research
Antonio Zampolli, Istituto di linguistica computazionale, Pisa, Italy

------------------------------

Date: Thu, 3 Dec 87 20:27 EST
From: ganguly@ATHENA.MIT.EDU
Subject: DCG

Hi!
Does someone have a Definite Clause Grammar parser written in
Edinburgh PROLOG that I may use as a user interface?
Thanks in advance,


Jaideep Ganguly

------------------------------

Date: Fri, 20 Nov 87 11:46 EST
From: Bruce Nevin <bnevin@cch.bbn.com>
Subject: Re: measures of "Englishness"

Re statistical measures of `Englishness':

A number of studies were made of admissible and inadmissible phoneme
sequences in English vocabulary in the '50s. One application was
provision of a list of unused potential English vocabulary for new trade
names. There may be something about this in Gleason's old textbook.
There are some examples illustrating the general method of generating
tables of next-successor phonemes or of next-successor morphemes in
words in Harris's _Methods in Structural Linguistics_ (1952).

In his 1968 book _Mathematical Structures of Language_, Harris
summarizes the results of a computer test of a hypothesis made earlier in his
`From phoneme to morpheme' paper (sorry, I don't have the reference--
_Language_ in the early '50s I think). The report of results of the
test appears in full in one of the TDAP papers from U. Penn. The
general observation is that the number of next successors drops as you
proceed along the phoneme sequence making up a morpheme, and rises again
when you get to morpheme boundary, reflecting the relative arbitrariness
of how the next morpheme may begin. Thus for the sentence `Dogs were
indisputably quicker', the number of next successors for each phoneme is
as follows (numbers under phonemes):

d  o  g  .  z  .  w  ^  r  .  i  n  .  d  i  s  .  p  y  u  w  t  .  ^  b  .  l  i  y  .
12 7  29    29    7  3  28    13 28    10 14 21    9  2  2  2  28    2  4     2  2  28

k  w  i  k  .  ^  r  .
12 8  10 28    3  29

The dots indicating morpheme boundaries suggested by the test were not
input to the test, and are included only to clarify results. The
boundary between the last two syllables of `indisputably' is the least
strongly indicated. Running the test in reverse order (next
predecessors, as it were) helps confirm or eliminate marginal cases.
And all results are subject to regularization by standard distributional
methods of linguistics.

I have altered the display on p. 25 of Harris's book (1) by using ^ for
schwa and (2) by estimating the numbers from his graph. I may not have
got the numbers just right but they are certainly good enough to make
the point.
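
The successor-count procedure described above can be sketched in a few lines
of code. This is a minimal illustration, not Harris's program. It assumes a
toy corpus of spelled words in place of phonemically transcribed utterances,
the function names and the corpus are hypothetical, and the boundary rule (a
count strictly higher than both of its neighbours) is just one simple choice.

```python
from collections import defaultdict


def successor_counts(corpus, utterance):
    """For each prefix of `utterance`, count the distinct symbols that
    follow the same prefix somewhere in `corpus`."""
    successors = defaultdict(set)
    for item in corpus:
        for i in range(len(item)):
            successors[item[:i]].add(item[i])
        successors[item].add("#")  # end of item counts as one successor
    return [len(successors.get(utterance[:n], ()))
            for n in range(1, len(utterance) + 1)]


def segment(corpus, utterance):
    """Cut `utterance` after every position whose successor count is a
    strict local peak (assumed boundary rule)."""
    counts = successor_counts(corpus, utterance)
    pieces, start = [], 0
    for i in range(1, len(counts) - 1):
        if counts[i] > counts[i - 1] and counts[i] > counts[i + 1]:
            pieces.append(utterance[start:i + 1])
            start = i + 1
    pieces.append(utterance[start:])
    return pieces


# Hypothetical toy corpus; letters stand in for phonemes.
CORPUS = ["reads", "reader", "reading", "readable", "real", "red", "rope"]
print(successor_counts(CORPUS, "readable"))  # [2, 2, 2, 4, 1, 1, 1, 1]
print(segment(CORPUS, "readable"))           # ['read', 'able']
```

The count rises to 4 after "read" because four different continuations are
attested there, then falls to 1 inside "able". That is the same rise at a
boundary that the table above shows for phonemes. Running the same functions
on reversed strings gives the predecessor counts mentioned above.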

Bruce Nevin
bn@cch.bbn.com

(Disclaimer: if you infer anything from this about the opinions of my
employer, its clients, etc, it's not by my intent, and you're on your own.)

------------------------------

Date: Wed, 25 Nov 87 17:20 EST
From: Steve Cassidy <steve@comp.vuw.ac.nz>
Subject: Re: Lip Movement and Mental Lexicons?

Date: Sun, 15 Nov 87 10:41 EST
From: Murray Watt <murrayw@utai.UUCP>
Subject: Re: Lip Movement and Mental Lexicons?


What does phonemic representation have to do with LEXICAL MEANING?
(Phonemic meaning is all the rage in current linguistic research,
but I think this is a different type of meaning.)
...
I have never SEEN any arguments that the phonemic representation
resides in the same location as lexical entries and I have never
heard of a letter based lexicon in the mind. Are you sure you're not
confusing dictionaries and the human mind? 8-)

Murray Watt


The current `best' theory of human word processing (going from printed word to
`lexical item') is based on making analogies with stored representations of
the words based on a letter mediated representation. That is, there are letters
in there somewhere but they may be grouped or organised in a way which is not
yet clear.

The current best theory of reading development suggests that it is heavily
tied in with spelling development and that the same `lexical entry' is used
for both, and that there is transference between the two skills. For competent
spelling, a letter-by-letter representation of the word is needed; sound-to-letter
rules don't work well enough. Similarly it would seem that a phonemic
representation is needed to pronounce (some) words.

So the mental lexicon should contain references to an orthographic
representation (probably close to letter strings) and a phonemic
representation.

I don't know what lexical meaning is. Do you?

The judgement as to 'best' theories above is my own.

Steve Cassidy                        ACSnet: steve@vuwcomp.nz
Victoria University, Private Bag,
Wellington, New Zealand              UUCP:   ...seismo!uunet!vuwcomp!steve

"If God had meant us to be perfect, He would have made us that way"
- Winston Niles Roomford III

------------------------------

Date: Wed, 25 Nov 87 08:45 EST
From: Richard Wexelblat <rutgers!philabs.philips.com!rlw>
Subject: Re: Language Learning

Readers of this group might be interested in looking up the Ph.D. dissertation
of Kathy Hirsh-Pasek (Univ. of Penna., 1980+-2?) who did an extensive study of
language learning in hearing children of deaf parents. As I recall, she
concluded that there was no statistically significant difference -- but I
don't really remember the parameters of the study. Perhaps someone with
access to _Dissertation_Abstracts_ will look up the specific reference.
--

--Dick Wexelblat {uunet|ihnp4|decvax}!philabs!rlw
rlw@philabs.philips.com

------------------------------

Date: Tue, 1 Dec 87 11:56 EST
From: Rick Wojcik <rwojcik@bcsaic.UUCP>
Subject: Re: Language Learning (a Turing test)


In article <2363@tut.cis.ohio-state.edu> paul@tut.cis.ohio-state.edu (Paul W. Placeway) writes:
>
>Actually, the "story" I was thinking of is similar, but with a big
>difference: I am told that Dr. Lehiste (whose native language is
>Estonian), when traveling in Germany, regularly fools native speakers
>into thinking that she is German, but from some other region. From
>what I have been told, this effect is true, even for extended
>conversations.
>
I am familiar with your examples, since I took my undergraduate and
graduate degrees in linguistics at OSU. Having studied Estonian with
Dr. Lehiste, a world-renowned acoustician and phonetician, I can well
believe that she fools native German speakers. She has pointed out some
very subtle differences between Estonian & German--for example, the fact
that word-initial vowels in German are always preceded by a glottal
stop, but that those in Estonian never are. I may be wrong, but I don't
think that she has native-like control over this aspect of German. When
did she learn German, anyway? That's an essential point here. (Don't
forget that the Estonia of her childhood had close ties to Germany.) It
is also worth noting that, despite her many years of residence in
America and her linguistic sophistication, she retains a noticeable
foreign accent in English. Her control of English is about as good as
it can get in adult language learners.

>The similarity of dialect does not always hold either. Elizabeth
>Zwicky does not speak the same regional dialect of SAE that I do, even
>though the two of us spent the majority of our lives growing up within
>10 miles of each other, in the same side of the same city. Our

You miss the point. I never said anything about the social and ethnic
factors that shape dialects. The Columbus neighborhood that you and she
grew up in contains a mixture of Northern and Midland dialects.
Elizabeth's dialect (Northern) and yours (Midland?) are recognizably
American.

===========
Rick Wojcik rwojcik@boeing.com

------------------------------

Date: Tue, 1 Dec 87 14:23 EST
From: goldfain@osiris.cso.uiuc.edu
Subject: Re: Language Learning (anecdotes)


I think the "crystallization hypothesis" in language acquisition is a
hypothesis which by its very nature will snag people into a debate. I think a
review of the overall nature of this hypothesis and debate is instructive as
to something which should be avoided whenever possible in science.

1) We have an observable phenomenon at a very high level of complexity:
It concerns fine distinctions in natural language behavior.
2) The observations of the phenomenon are not well pinned down: Researchers
mention something about "mastery" of the language, then sometimes back off
and make claims only about categorical phonetic perception, then
shift back to discussing scores on grammar tests among people who have been
in a culture for 10-20 years, immigrating at different times in their
lives, etc.

**************************************************************************
* I am not saying the phenomenon isn't real! There are observable and *
* interesting phenomena here. I am just qualifying that. *
**************************************************************************

3) The phenomenon *suggests* that *possibly* there is a physiological basis for
such trends and differences as are observed. To make a really concrete
claim, it *suggests* that perhaps some maturation process in the normal
human brain occurs at about mid teenage years.
4) There are lots of other mechanisms that are consistent with the observed
phenomena: a wide range of psychological "lower-level" factors have been
listed in the current debate in this note file.
5) If one really steps back and looks at this objectively, we can tell that
the "experiments" ("studies" is actually a better word) thus far performed
and currently underway will never help distinguish whether this phenomenon
has a physiological basis or merely a psychological basis, or a combination
of both (don't forget that possibility!)
6) There is a large set of anecdotal rumor floating around that is only going
to keep the issue cloudy. It may keep us from the wrong conclusion, but it
is not going to settle us down on whatever the correct answer is.

I think settling the matter will have to wait on tighter
experimentation (if it is ever judged that this issue is worth the experiments
it would take to settle it.) It will require a great deal of progress in
neurophysiology, or some volunteers for some outrageous psychology
experiments. (Find me 100 open-minded adults who will set aside all other
interests for at least 5 years of their lives ... )

In other words, I think the moral of this issue is that you cannot expect to
settle an issue that is several layers of abstraction below the level of your
observational apparatus. (In this case it might be more than "several".) In
a sense I'm saying: "Go back to the lab and let's look for other things we can
get a better grip on - this issue will have to wait until another day."


Mark Goldfain                   arpa:    goldfain@osiris.cso.uiuc.edu
                                US Mail: Mark Goldfain
        (A lowly student at)-->          Department of Computer Science
                                         University of Illinois at U-C
                                         1304 West Springfield Avenue
                                         Urbana, Illinois 61801

------------------------------

[Editor's Note: There is still a backlog of items on language learning which
will be posted next week.]

End of NL-KR Digest
*******************
