NL-KR Digest Volume 09 No. 62
NL-KR Digest (Thu Dec 3 15:54:23 1992) Volume 9 No. 62
Today's Topics:
Announcement: Catalogue of "ontological" concept-systems
Announcement: Loglan
CFP: 5th DARPA/SISTO Message Understanding Conference (MUC-5)
Submissions: nl-kr@cs.rpi.edu
Requests, policy: nl-kr-request@cs.rpi.edu
Back issues are available from host archive.cs.rpi.edu [128.213.3.18] in
the files nl-kr/Vxx/Nyy (i.e., nl-kr/V01/N01 for V1#1); mail requests will
not be promptly satisfied. Starting with V9, there is a subject index
in the file INDEX. If you can't reach `cs.rpi.edu' you may want
to use `turing.cs.rpi.edu' instead.
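As a rough illustration of the Vxx/Nyy naming scheme above, the following
Python sketch builds the archive path for a given volume and issue. The
function name `back_issue_path` is made up for this example, and the
two-digit zero padding is inferred from the V01/N01 sample rather than
stated anywhere as a rule.

```python
def back_issue_path(volume: int, number: int) -> str:
    """Return the archive path for one digest issue, e.g. nl-kr/V01/N01.

    Assumes two-digit zero padding, as suggested by the V1#1 example.
    """
    if volume < 1 or number < 1:
        raise ValueError("volume and number must be positive")
    return f"nl-kr/V{volume:02d}/N{number:02d}"


print(back_issue_path(1, 1))   # nl-kr/V01/N01
print(back_issue_path(9, 62))  # nl-kr/V09/N62
```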
BITNET subscribers: we now have a LISTSERVer for nl-kr.
You may send submissions to NL-KR@RPIECS
and any listserv-style administrative requests to LISTSERV@RPIECS.
-----------------------------------------------------------------
To: nl-kr@cs.rpi.edu
Date: Sat, 21 Nov 92 18:52:59 CST
From: fritz@rodin.wustl.edu (Fritz Lehmann)
Subject: Announcement: Catalogue of "ontological" concept-systems
[I started to write this, but before doing any work I'd
like to get references and information on concept-systems and
ontologies I don't know about. For some of the 84 systems
listed below, I know little or nothing and have no
references. I'll distribute a useful version with text once
I get some feedback from "you-all". Fritz Lehmann -11/20/92]
- --------------------------------------------------
CONCEPT-SYSTEMS CATALOGUE
Fritz Lehmann
145 Exeter, Irvine, CA 92715 USA (714)725-9057
[rarely accessed email: fritz@rodin.wustl.edu]
Version of: November 1992
This is to be an informal catalogue of existing concept
catalogues and hierarchies (including high level
"ontologies") for possible use in knowledge representation,
artificial intelligence, and database integration. Anybody
can contribute (and be acknowledged). Each concept system is
described (in a page or less) with some references and other
information. I hope to be inclusive, with emphasis on
machine-readable/usable concept (and relation) hierarchies.
Some people think there is ONE concept system
for the true structure of the world. Others, like me,
think pragmatic concerns (subjective or socially
agreed-upon) may dictate different structures. Most
"ontologies" have large areas of near-agreement on concepts
like time, space, individuals, properties, etc. Technical
thesauri deal with more specific subject areas like
accounting, subfields of medicine, or plumbing fixtures.
Philosophical concepts are necessary but controversial; some
concepts like "check-stub" are quite uncontroversial.
Formalized or not, two aspects of every system are: its
purely mathematical (order) structure, and the meanings of
its components. Notation or language is incidental to both.
The page ordering, for now, is vaguely chronological.
It must be emphasized that only rarely is a concept in one
system genuinely the same as a concept with the same name in
another system. Please let me know about ANY OTHER concept-
systems you know about, or at least give a reference.
- ------------------------------------------------
[I started this Nov. 20, 1992 for the "PEIRCE project"
(a cooperative international implementation of a Conceptual
Graphs inferential database processing system, initiated by
Gerard Ellis and Robert Levinson), with a mental list of 84
systems beginning with Aristotle's.]
ARISTOTLE'S CATEGORIES
LEIBNIZ' CHARACTERISTICA UNIVERSALIS
LODWYCK'S COMMON WRITING
DALGARNO'S ARS SIGNORUM
WILKINS' PHILOSOPHICAL LANGUAGE
LINNAEUS' BIOLOGICAL TAXONOMY
(ANONYMOUS) UNIVERSAL CHARACTER
CAVE BECK
KANT'S CATEGORIES
ROGET'S THESAURUS
PEIRCE'S CATEGORIES
BOLZANO
MEINONG
BRADLEY
HUSSERL'S ONTOLOGY
PRINCIPIA MATHEMATICA
WHITEHEAD'S PROCESS THEORY
BASIC ENGLISH
LESNIEWSKI'S MEREOLOGY
SEMANTOGRAPHY SYMBOLS
RICHENS/MASTERMAN/WILKS SEMANTIC PRIMITIVES
CECCATO'S CORRELATION NET PRIMITIVES
INGARDEN'S ARISTOTLE REVISION
LINCOS INTERPLANETARY LANGUAGE
R.M. MARTIN'S SEMIOTIC PRIMITIVES
COLON CLASSIFICATION - FACETED
DEEP CASE SYSTEMS
LOGLAN/LOJBAN PRIMITIVES
LAFFAL'S CONCEPT DICTIONARY
CONCEPTUAL DEPENDENCY THEORY
ACM COMPUTER SCIENCE CLASSIFICATION
PARKER-RHODES' INFERENTIAL SEMANTICS LATTICES
WIERZBICKA'S LINGUA MENTALIS
KAMP'S DISCOURSE REPRESENTATION STRUCTURES
HAYES' NAIVE PHYSICS
EXPLANATORY-COMBINATORY DICTIONARY (MEANING-TEXT)
MeSH - MEDICAL SUBJECT HEADINGS CATALOGUE
THE HOLOTHEME
AM/EURISKO MATH CATEGORIES
CONCEPTUAL GRAPHS PRIMITIVES
SITUATION SEMANTICS
SCHUBERTIAN ("ECO") SUBHIERARCHIES
QUALITATIVE PHYSICS PRIMITIVES
SMITH-MULLIGAN ONTOLOGY
SIMONS' PART SYSTEM
SMALLTALK DATA TYPE TREE
OBJECTIVE-C (NeXTSTEP) DATA TYPE TREE
RESEDA ONTOLOGY
GRAESSER'S MULTIPLE CONCEPT HIERARCHIES
LONGMAN DICTIONARY CODINGS (INCL. SLATOR)
LONGMAN'S THESAURUS
THE WORDTREE
PENMAN UPPER MODEL
SPARCK JONES/BOGURAEV DEEP CASE LIST
COOK ONTOLOGY
DIXON ONTOLOGY
VARIOUS WILLE CONCEPT LATTICES
RUSSIAN MERONOMY
SOMERS' CASE GRID
CHAFFIN'S RELATION HIERARCHY
CYC PROJECT
EPSTEIN AM-BASED GRAPH THEORY HIERARCHY
MARTY'S SEMIOTIC LATTICES
IRDS DATABASE CATEGORIES
VELARDI'S SEMANTIC LEXICON
ONTEK ONTOLOGY
LAKOFF'S CATEGORIES
HUHNS & STEPHENS RELATION FEATURES
WORDNET
EDR CONCEPT DICTIONARY
DOUDNA QUANTIFIER RHOMBIDODECAHEDRON
NIRENBURG'S DIONYSUS ONTOLOGY
SCHUBERT'S EPISODIC LOGIC CATEGORIES
UNITRAN-LCS
RELATIONAL LEXICON HIERARCHY
SUMM
SKUCE ONTOLOGY
PETRI ONTOLOGIES
TEPFENHART ONTOLOGY
PLINIUS CERAMICS ONTOLOGY
RANDELL & COHN'S SPATIOTEMPORAL LATTICES
HARTLEY'S TIME AND SPACE WORLD
ONTOLINGUA-KIF
DICK'S CASE-RELATION SYSTEM
SUGGESTED FORMAT: Brief description, Example, Formalized?,
Abstract hierarchy structure, Necessary/sufficient?,
References, Current authorities or enthusiasts, Machine-
readable text?, Machine-usable structure?, Source,
FTP site?, Implementations?
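
One possible way to hold an entry in the suggested format is a simple
record type. The Python sketch below is only an illustration: the class
name `CatalogueEntry`, the field names (paraphrased from the list above),
and the use of None for "not yet known" are all assumptions, not part of
the catalogue itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CatalogueEntry:
    """One concept-system page; fields paraphrase the suggested format."""
    name: str
    description: str = ""
    example: str = ""
    formalized: Optional[bool] = None            # None = not yet known
    hierarchy_structure: str = ""                # e.g. "tree", "lattice"
    necessary_sufficient: Optional[bool] = None
    references: List[str] = field(default_factory=list)
    authorities: List[str] = field(default_factory=list)
    machine_readable_text: Optional[bool] = None
    machine_usable_structure: Optional[bool] = None
    source: str = ""
    ftp_site: str = ""
    implementations: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """List the fields that still need information from contributors."""
        return [k for k, v in vars(self).items() if v in ("", None, [])]


# Only the name is filled in; everything else is reported as missing.
entry = CatalogueEntry(name="WORDNET")
print(entry.missing_fields())
```

A record like this makes it easy to see which systems still lack
references or other details.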
------------------------------
To: nl-kr@cs.rpi.edu
From: James Salsman <bovik@delta.eecs.nwu.edu>
Subject: Announcement: Loglan
Date: Wed, 2 Dec 92 10:51:40 CST
Reply-To: bovik@eecs.nwu.edu
[Poster's note: This file is a copy of the descriptive text contained
in a brochure which The Loglan Institute sends out in response to an
initial request for information, plus brief descriptions of some of
the materials available for purchase. For a printed copy of the bro-
chure or any other information, write to The Institute at the address
given here, or send CompuServe MAIL to Kirk Sattley 76010,1363.]
THE LOGLAN INSTITUTE, INC.
A Non-Profit Research Corporation
3009 Peters Way
San Diego, CA 92117
What Is Loglan?
Loglan is a speakable, human language originally designed to
serve as a test of the Sapir-Whorf hypothesis that the structure of
local human languages places local constraints on the development of
human thought, and hence, on human cultures. If this hypothesis is
correct, a language which "lifted" those constraints -- that is to
say, which reduced them to some formal minimum -- should in a certain
sense "release" the human mind from these ancient linguistic bonds
and, in any case, have notable effects on both individual thinking and
on the development of a global human culture.
Since its original development in the late 1950's and 1960's,
Loglan has acquired certain other properties that make it interesting
to computer science, principally (1) its total freedom from syntactic
ambiguity. This feature of the language, together with (2) its audio-
visual "isomorphism" (which means that the Loglan speechstream breaks
up automatically into fully punctuated strings of separate words) and
(3) its borrowing algorithm (by which the International Scientific
Vocabulary goes into Loglan virtually ad libitum) makes it an ideal
medium for three uses: (i) for international information storage and
retrieval, (ii) for machine-aided translation between natural lan-
guages, and (iii) for spontaneous interaction between computer-users
and their machines. Finally, Loglan is (4) culturally and politically
neutral in the sense that its basic predicate vocabulary has been
engineered to be maximally memorable to speakers of the eight most
widely spoken human languages: English, Chinese, Hindi, Russian,
Spanish, French, Japanese and German.
All these features taken together have suggested to many loglan-
ists that their adopted language is ideally suited to become a second
language for the world. For others, conducting a scientific test of
the Whorf hypothesis with Loglan has the highest priority. For still
others, its use at the human/machine interface is the most challenging
role for Loglan in the years ahead.
Going Public Again
Your inquiry reaches The Institute at a most interesting time.
Loglan is in the midst of "going public again". This is the third
and, we trust, final time. The first time we went public was in 1960,
when James Cooke Brown's article on "Loglan" was published in the
Scientific American for June of that year. (Reprints of this article
are still available.) The second time was in 1975, when two of our
books, Loglan 1, a grammar, and Loglan 4 & 5, a dictionary, were pub-
lished in paperback for the first time. The 15-year interval between
the 1st and 2nd "goings public" was mainly occupied by three activi-
ties: (a) the development of Loglan grammar on computers, (b) the
construction of its internationally-based lexicon, and (c) the prepa-
ration of the several earlier editions of the 1975 volumes. The
similar interval between the 2nd and 3rd "goings public" was mainly
occupied by engineering three final design features into the language.
One of these was the formal discovery and demonstration of the syntac-
tically unambiguous grammar mentioned above. This feature had long
been planned but had had to wait for the development of mathematical
tools powerful enough to install it; these became available in 1975.
Another engineering challenge was to build a set of decipherable word-
parts from which all the complex predicates of the language could be
recognizably constructed. Still a third engineering task was to build
its "borrowing algorithm", the procedure by which natural language
words, but especially the International Scientific Vocabulary, may now
be freely incorporated into Loglan. These last two features together
implement yet another long-planned function of the language, namely
that it should be capable of rapid, spontaneous, and yet continuously
intelligible growth.
In short, modern Loglan is now ready for its many uses. Here are
the publications and services which The Institute has prepared to let
you examine this extraordinary language and decide whether and how you
wish to use it.
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
[Poster's note: The following is a much abbreviated extract from four
pages of descriptions of materials available. I have chosen the ones
I thought most likely to interest an inquiring language-lover,
especially one who uses a personal computer.]
BOOKS
Loglan 1: A Logical Language
by James Cooke Brown, 4th Edition, 1989;
599pp. A general introduction and complete description of the Loglan
language. Has detailed explanation of the language's syntax and word-
construction, as well as pronunciation guides, historical notes,
specimen translations, word-lists, and a chapter on testing the Whorf
hypothesis. [$21.50]
Loglan 4 & 5: A Loglan-English/English-Loglan Dictionary,
collated by
JCB, 2nd Edition 1975; 510pp. [New edition in preparation, old one
still useful as word-source when checked against new Loglan 1.]
[Paperback $10.00; Hardback $15.00]
SOFTWARE [All available for both PC-DOS machines and Macintoshes]
MacTeach* 1: Forming Loglan Utterances,
MacTeach 2: Learning Loglan Words,
MacTeach 3: Learning Loglan Affixes,
by Robert A. McIvor, Evelyn R.
Anderson, and JCB, 1st Edition 1989. All of these use the "learning
ladder" technique developed at The Institute to teach both utterance
formation and vocabulary acquisition. The technique helps the learner
master long lists of items with minimum overlearning and error-making.
MacTeach 1 comes with an input file of about 400 utterances, covering
about 75% of the grammar. MacTeach 2 has an input list of more than
900 primitive words. MacTeach 3 has the complete set of combining
affixes used for forming complex predicates.
[$20.00 each, all three on one disk $50.00]
LIP*, The Loglan Interactive Parser*,
by RAM, Scott L. Burson, JCB,
and other workers on the Machine Grammar Project. LIP will produce a
parse-tree or sentence-diagram syntactic analysis of any grammatical
sentence that is submitted to it, as well as pointing out where an
ungrammatical sentence went wrong. LIP can also parse a text file,
either utterance-by-utterance or all at once, and allows individual
utterances to be modified interactively until they are correct. It is
thus a useful tool for an aspiring Loglan writer as well as a
practically indispensable one for a teacher or editor. [$50.00]
AUDIO RECORDINGS
Cassette 1: Readings from Loglan 1, Chaps 2-4,
Cassette 2: Readings from Loglan 1, Chaps 5,6.
On these two
cassettes, all the Loglan sentences in Chapters 2 through 6 of Loglan
1 are plainly pronounced by competent readers, so the learner will
learn to speak the entire grammatical range of Loglan utterance forms
correctly. [$10.00 for each cassette]
MEMBERSHIP
Ordinary membership is $50 per two-year period. Several classes of
membership at lower and higher dues are available. Members receive a
quarterly newsletter "Lognet" as well as a 40% discount on purchases
of all Institute materials.
_________________________
*`MacTeach', `LIP', and `The Loglan Interactive Parser' are trademarks
of The Loglan Institute.
------------------------------
To: nl-kr@cs.rpi.edu
Date: Tue, 1 Dec 92 16:12:07 -0800
From: sundheim@cod.nosc.mil (Beth M. Sundheim)
Subject: CFP: 5th DARPA/SISTO Message Understanding Conference (MUC-5)
* * * CALL FOR PARTICIPATION * * *
FIFTH MESSAGE UNDERSTANDING SYSTEM EVALUATION
AND MESSAGE UNDERSTANDING CONFERENCE (MUC-5)
1 MARCH - 27 AUGUST, 1993
Preparation: 1 March - 23 May
             29 May - 25 July
Evaluations: 24-28 May (dry run)
             26-30 July (formal run)
Conference:  25-27 August
Sponsored by:
Defense Advanced Research Projects Agency
Software and Intelligent Systems Technology Office
(DARPA/SISTO)
The Message Understanding Conferences have provided an ongoing
forum for assessing the state of the art and practice in text analysis
technology and for exchanging information on innovative computational
techniques. They have also encouraged experimentation in the context
of fully implemented systems that perform the realistic task of
extracting factual information from free text. The first two
conferences focused on short naval messages; the two most recent
conferences challenged the systems with longer and stylistically
varied terrorism news stories. The four conferences have seen the
application of a wide variety of approaches to the information
extraction task.
There is a growing appreciation of the potential utility of the
technologies. At the same time, performance constraints attributed to
inadequate computational methods are becoming serious issues for the
more highly developed systems. The Fifth Message Understanding
Conference (MUC-5) will continue the technology assessment cycle, with
new information extraction tasks in new domains. MUC-5 will also
continue the effort to define an insightful, objective set of
performance evaluation criteria.
DARPA sponsors the Message Understanding Conferences as part of
the TIPSTER Text program. Participation in MUC-5 is actively sought
from both new and veteran organizations. Veteran evaluation
participants will be able to measure their progress in designing
robust, end-to-end information extraction systems and to continue the
fruitful interchange of ideas about systems and evaluation. New
participants will also contribute to and benefit from such
interactions, while learning to manage the challenges posed by the
evaluation task. In this process, all organizations enjoy some
advantages and suffer from some disadvantages in the evaluation.
These differing circumstances are recognized by the evaluators and
should not deter organizations from participating.
The conference itself will consist primarily of presentations and
discussions of test results, system design, and innovative techniques.
Attendance at the conference is limited to evaluation participants and
to guests invited by DARPA. A conference proceedings, including all
test results, will be published.
Modest amounts of financial support will be made available to
selected participants in an effort to maximize the number of
participants and to attract the widest possible variety of technical
approaches and system architectures. This funding is intended only as
a supplement to other support. Both U.S. and non-U.S. participants
are eligible for this funding.
SCHEDULE:
3 January 1993      Deadline for applications that include funding
                    requests
15 January 1993     Final application deadline (no funding requests)
1 February 1993     Notification of acceptance and funding
1 March 1993        Release of system development corpus and
                    evaluation software
24-28 May 1993      Performance evaluation (dry run) on test corpus
26-30 July 1993     Performance evaluation (formal run) on new test
                    corpus
25-27 August 1993   Fifth Message Understanding Conference
DATA AND TASK DESCRIPTION:
Subject to successful completion of negotiations to obtain proper
permissions concerning the data, the data and task to be used for
MUC-5 will be the same as those already in use for the data extraction
portion of the DARPA/SISTO TIPSTER Text program. There are two
languages, English and Japanese, and two domains, joint ventures and
microelectronic chip fabrication. These form four separate corpora.
The texts are newswire articles selected to produce the desired mix of
relevant and nonrelevant texts, and they were blindly divided into
pools of development (training) and test data.
The task is to extract information about the nature and status of
activities in the domain, the entities involved, etc. Analysts have
been doing software-assisted manual generation of the "key" templates
against which the system-generated templates will be evaluated. The
template design is object oriented, and each slot in the template has
its own fill specifications for data type, valency, etc. The fill
specifications in each domain vary slightly between English and
Japanese, reflecting differences in language usage; however, the
general design of the template is the same for both languages.
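
To make the idea of per-slot fill specifications concrete, here is a
minimal Python sketch. It is not the TIPSTER template definition: the
class `SlotSpec`, its fields, and the slot names PARTNER and STATUS are
invented for illustration, and "valency" is modelled simply as a minimum
and maximum number of fills.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class SlotSpec:
    """Fill specification for one template slot (hypothetical fields)."""
    name: str
    data_type: type                  # expected Python type of each fill
    min_fills: int = 0               # valency lower bound
    max_fills: Optional[int] = 1     # valency upper bound; None = unbounded

    def check(self, fills: List[Any]) -> List[str]:
        """Return a list of problems found in the fills (empty if valid)."""
        problems = []
        if len(fills) < self.min_fills:
            problems.append(f"{self.name}: too few fills")
        if self.max_fills is not None and len(fills) > self.max_fills:
            problems.append(f"{self.name}: too many fills")
        for fill in fills:
            if not isinstance(fill, self.data_type):
                problems.append(f"{self.name}: {fill!r} has wrong type")
        return problems


# Invented slots for illustration only; not the actual template design.
partner = SlotSpec("PARTNER", str, min_fills=2, max_fills=None)
status = SlotSpec("STATUS", str, min_fills=1, max_fills=1)

print(partner.check(["Company A"]))   # reports too few fills
print(status.check(["announced"]))    # no problems reported
```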
An English and a Japanese sample text and corresponding template
in the joint ventures domain are available from the program chair
(address at end of this announcement). Please specify which
language(s) you are interested in. A microelectronics example may be
available shortly. The total amount of data that will be available in
March to support system development is expected to be between 200 and
1,000 templates and corresponding texts. This number will vary
according to the corpus and the data rights that are obtained. To
receive the data, participants will be required to acknowledge its
copyright status by signing agreements to safeguard the data and to
use it for research purposes only.
TEST PROTOCOL AND EVALUATION CRITERIA:
MUC-5 participants may elect to do either language or both
languages; they are limited to selecting just one domain.
Participants will have access to TIPSTER Government-Furnished
Information and shared resources such as the training texts and
templates, task documentation, gazetteers, and evaluation software.
TIPSTER data extraction contractors will be participating in MUC-5,
for which previously unseen test data will be used.
Each test set will consist of 100-300 texts, depending on
language and domain. A dry-run test will be conducted about three
months after the release of the training data; the formal test will be
conducted about two and one-half months after the dry run. Each test
will be carried out by the participants at their own sites in
accordance with a prepared test procedure and the results submitted to
NRaD for official scoring by domain analysts.
Systems will be evaluated using the criteria applied to the
TIPSTER Text data extraction systems. These criteria, which are still
under development, are likely to use the scoring categories (correct,
partially correct, incorrect, spurious, missing, and noncommittal) to
support not only the measures used for MUC-4 (recall, precision,
overgeneration, fallout, and F-measure) but also new measures
(probability of detection, probability of false alarm, and a measure
that combines them). MUC-5 participants will be able to familiarize
themselves with the evaluation criteria through usage of the
evaluation software, which will be released along with the training
data.
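
The MUC-4 style measures named above can be sketched in Python as
follows. This is an approximation under stated assumptions, not the
official scoring software: it assumes the convention usually described
for MUC-4, in which a partially correct fill earns half credit, and it
leaves out the noncommittal category, fallout (which needs a count of
possible incorrect fills), and the new MUC-5 measures, which the text
says were still under development. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Counts of fills per scoring category (noncommittal omitted)."""
    correct: int
    partial: int
    incorrect: int
    spurious: int
    missing: int

    @property
    def possible(self) -> int:
        # Fills the answer key expected the system to produce.
        return self.correct + self.partial + self.incorrect + self.missing

    @property
    def actual(self) -> int:
        # Fills the system actually produced.
        return self.correct + self.partial + self.incorrect + self.spurious


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def recall(t: Tally) -> float:
    return _ratio(t.correct + 0.5 * t.partial, t.possible)


def precision(t: Tally) -> float:
    return _ratio(t.correct + 0.5 * t.partial, t.actual)


def overgeneration(t: Tally) -> float:
    return _ratio(t.spurious, t.actual)


def f_measure(t: Tally, beta: float = 1.0) -> float:
    """Weighted harmonic combination of precision and recall."""
    p, r = precision(t), recall(t)
    return _ratio((beta**2 + 1) * p * r, beta**2 * p + r)


# Made-up counts, purely to exercise the functions.
tally = Tally(correct=5, partial=2, incorrect=1, spurious=4, missing=2)
print(recall(tally), precision(tally), overgeneration(tally), f_measure(tally))
```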
INSTRUCTIONS FOR RESPONDING TO THE CALL FOR PARTICIPATION:
Organizations within and outside the U.S. are invited to respond
to this call for participation. Minimal requirements include
development before the dry-run test of a system that can accept texts
without manual preprocessing, process them without human intervention,
and output templates in the expected format. Organizations should
plan on allocating at least three person-months of effort for
participation in the evaluation and conference; a substantially greater
level of effort is likely to be needed in order to achieve relatively
high performance. It is understood that organizations will vary with
respect to experience with information extraction, domain
expertise/engineering, resources, contractual demands/expectations,
etc. Recognition of such factors will be made in any analyses of the
results.
Organizations wishing to participate in the evaluation and
conference must respond by submitting a summary of their text analysis
approach and a system architecture description, not to exceed five
pages in total. The summary should include the strengths of
the approach and highlight its innovative aspects. Acceptance or
rejection of each application will be determined on the basis of a
technical assessment by the program committee. The body of the
application will serve as the basis for an article in the conference
proceedings. Participants will have the opportunity to make revisions
prior to publication.
The application must also include the following information:
1. Domain (choose only one)
   a. Joint ventures
   b. Microelectronics
2. Language (choose one or two)
   a. English
   b. Japanese
3. An estimate of the degree of coverage and/or length of time
   under development of existing software to be applied to the
   MUC-5 task in the selected language(s) and domain.
4. Primary point of contact for notification of
   acceptance/rejection of application. Please include name,
   surface and email addresses, and phone and fax numbers.
Those organizations wishing to request funding to supplement
their own resources must provide a second statement, not to exceed two
pages. This statement should include an estimate of the amount of
funding available from other sources to support participation in this
work and a specification of the amount of funding desired and the
minimal acceptable amount. In addition, it should describe any
software to be used for MUC-5 that the organization is willing to
deliver to NRaD and MUC participants for possible redistribution.
Please indicate clearly whether the organization is interested in
participating in MUC-5 even if no funding is available. Evaluators of
funding requests will not include any MUC system developers.
RESPONSES THAT INCLUDE FUNDING REQUESTS MUST BE SUBMITTED BY
JANUARY 3, 1993. THE DEADLINE FOR OTHER RESPONSES IS JANUARY 15,
1993. All participants are expected to have Internet access and to
be able to do electronic file transfer via anonymous FTP. All
responses should be submitted to the program chair via email to
sundheim@nosc.mil. If Internet access is currently unavailable,
responses may be sent via surface mail to Beth Sundheim, NCCOSC/NRaD,
Code 444, San Diego, CA 92152-5000, and if a quick reply to questions
is needed, the program chair may be reached by phone at 619/553-4145.
PROGRAM COMMITTEE:
Beth Sundheim, NCCOSC/NRaD, program chair
Sean Boisen, BBN Systems and Technologies
Lynn Carlson, U.S. Department of Defense
Nancy Chinchor, Science Applications International
Jim Cowie, New Mexico State University
Ralph Grishman, New York University
Jerry Hobbs, SRI International
Joe McCarthy, University of Massachusetts, Amherst
Mary Ellen Okurowski, U.S. Department of Defense
Boyan Onyshkevych, U.S. Department of Defense
Lisa Rau, General Electric R&D Center
Carl Weir, Paramax Systems Corporation
REFERENCE: _Proceedings_of_the_Fourth_Message_Understanding_Conference_
(MUC-4)_, Morgan Kaufmann, June, 1992. To order, call
(800)745-7323 (toll free in North America) or (415)578-9928
(direct), send fax to (415)578-0672 or email to
morgan@unix.sri.com. Please refer to ISBN 1-55860-273-9.
------------------------------
End of NL-KR Digest
*******************