IRList Digest Sunday, 7 September 1986 Volume 2 : Issue 39
Today's Topics:
Call for Papers - Call for contributions to ACM SIGIR Forum
Abstracts - Appearing in latest issue of ACM SIGIR Forum, Part 2
New addresses are ARPANET: fox%vt@csnet-relay.arpa BITNET: foxea@vtvax3.bitnet
CSNET: fox@vt UUCPNET: seismo!vtisr1!irlistrq
----------------------------------------------------------------------
Date: Sun, 7 Sep 86 11:48:53 edt
From: fox (Ed Fox)
Subject: call for papers for ACM SIGIR Forum, fall 1986
It is time to gather short articles, book reviews, abstracts,
announcements, etc. for the next Forum. I will be putting out
this issue, so send electronic versions (unless you say otherwise
they may appear in IRList too) or paper copies (done in camera
ready form, single spaced).
I look forward to receiving your materials in the next few weeks.
Many thanks, Ed Fox (co-editor for Forum).
------------------------------
Date: Wed, 23 Jul 1986 13:06 CST
From: Vijay V. Raghavan <RAGHAVAN@UREGINA1.bitnet>
Subject: SIGIR FORUM Abstracts [Part 2 - Ed]
[Note: Members of ACM SIGIR should have received the spring/summer
Forum, and can find these on pages 30-31. The rest will appear in
machine readable form also in later issues of IRList. - Ed]
ABSTRACTS
(Chosen by G. Salton or V. Raghavan from 1984 issues of journals
in the retrieval area)
10. TESTING OF A NATURAL LANGUAGE RETRIEVAL SYSTEM FOR A FULL
TEXT KNOWLEDGE BASE
Lionel M. Bernstein and Robert E. Williamson
Lister Hill National Center for Biomedical Communications,
National Library of Medicine, National Institutes of Health,
Bethesda, MD 20209
"A Navigator of Natural Language Organized Data" (ANNOD) is a
retrieval system which combines use of probabilistic,
linguistic, and empirical means to rank individual paragraphs
of full text for their similarity to natural language queries
proposed by users. ANNOD includes common word deletion,
word root isolation, query expansion by a thesaurus, and
application of a complex empirical matching (ranking)
algorithm. The Hepatitis Knowledge Base, the text of a
prototype information system, was the file used for testing
ANNOD. Responses to a series of users' unrestricted natural
language queries were evaluated by three testers.
Information needed to answer 85 to 95% of the queries was
located and displayed in the first few selected paragraphs.
ANNOD was successful in locating information in both the
classified (listed in the Table of Contents) and unclassified
portions of the text. Development of this retrieval system
resulted from the complementarity of and interaction between
computer science and medical domain expert knowledge.
Extension of these techniques to larger knowledge bases is
needed to clarify their proper role.
(JASIS, Vol. 35(4): 235-247; 1984)
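For readers who want a concrete picture of the pipeline the abstract describes (common word deletion, word root isolation, thesaurus expansion, paragraph ranking), here is a minimal sketch. All names, the toy thesaurus, and the overlap score are illustrative stand-ins; ANNOD's actual empirical ranking algorithm is not reproduced here.

```python
# Hypothetical sketch of an ANNOD-style pipeline; the stop list,
# thesaurus, stemmer, and score are illustrative, not from the paper.
import math
import re

STOP_WORDS = {"a", "an", "the", "of", "for", "is", "in", "and", "to"}
THESAURUS = {"liver": {"hepatic"}, "hepatic": {"liver"}}  # toy thesaurus

def stem(word):
    # Crude word root isolation: strip a few common suffixes.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

def terms(text):
    words = re.findall(r"[a-z]+", text.lower())
    kept = {stem(w) for w in words if w not in STOP_WORDS}
    # Query expansion: add thesaurus synonyms of each kept term.
    expanded = set(kept)
    for t in kept:
        expanded |= {stem(s) for s in THESAURUS.get(t, ())}
    return expanded

def rank_paragraphs(query, paragraphs):
    q = terms(query)
    scored = []
    for i, p in enumerate(paragraphs):
        overlap = q & terms(p)
        # Simple term overlap stands in for ANNOD's empirical formula.
        score = len(overlap) / math.sqrt(len(q) or 1)
        scored.append((score, i, p))
    return sorted(scored, reverse=True)
```

With a hepatitis-flavored toy corpus, a query for "liver inflammation" ranks a paragraph mentioning "hepatic inflammation" first because the thesaurus expansion bridges the vocabulary gap.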
11. A COMPARISON OF THE COSINE CORRELATION AND THE MODIFIED
PROBABILISTIC MODEL
W. Bruce Croft
Computer and Information Science Dept.
University of Massachusetts
Amherst, MA 01003
It has been pointed out that the comparison between the
performance of the cosine correlation and the modified
probabilistic model was incomplete. In particular, the term
weights used for the cosine correlation were term frequencies
within the document text. Salton has for some time used a
term weight known as 'tf.idf' in his retrieval experiments
with the cosine correlation. This weight consists of the
within document term frequency (sometimes normalized by the
maximum frequency) multiplied by the inverse document
frequency weight. Although the inverse document frequency
weight can be regarded as a product of the retrieval process,
it has also been used as part of the indexing process in that
the weight is assigned to the terms in the document
representatives. In this note, we shall present the results
of retrieval experiments with the cosine correlation and the
tf.idf weights. The comparison of these results to those
obtained with the modified probabilistic model leads to some
interesting conclusions about the cosine correlation.
(Information Technology, Vol. 3, No. 2: 113-115; April 1984)
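The weight the abstract describes can be sketched directly: within-document term frequency, normalized by the maximum frequency, multiplied by the inverse document frequency, then compared with the cosine correlation. The code below is a minimal illustration under those stated definitions (the idf form log(N/df) is one common choice, not necessarily the exact one used in the experiments).

```python
# Sketch of tf.idf weighting with the cosine correlation, following
# the description above; the idf form log(N/df) is an assumption.
import math
from collections import Counter

def tfidf_vectors(docs):
    N = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))  # document frequency of each term
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        max_tf = max(tf.values())
        # Within-document frequency, normalized by the maximum
        # frequency, multiplied by the inverse document frequency.
        vectors.append({t: (tf[t] / max_tf) * math.log(N / df[t])
                        for t in tf})
    return vectors

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0
```

Note that a term appearing in every document gets idf log(N/N) = 0 and so contributes nothing to the correlation, which is the intended effect of the idf factor.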
12. SCIENTIFIC INQUIRY: A MODEL FOR ONLINE SEARCHING
Stephen P. Harter
School of Library and Information Science, Indiana
University, Bloomington, IN 47405
Scientific inquiry is proposed as a philosophical and
behavioral model for online information retrieval. The
nature of scientific research concepts of variable,
hypothesis formulation and testing, operational definition,
validity, reliability, assumption, and the cyclical nature of
research are established. A case is made for the
inevitability of end-user searching. It is argued that the
model is of interest not only for its own sake and for the
intellectual parallels that can be established between two
apparently disparate human activities, but also as a useful
framework for discussion and analysis of the online search
process from an educational and evaluative viewpoint.
(JASIS, Vol. 35(2): 110-117; 1984)
13. A DRILL AND PRACTICE PROGRAM FOR ONLINE RETRIEVAL
Bert R. Boyce
School of Library and Information Science, Louisiana State
University, LA 70803
David Martin, Barbara Francis, and Mary Ellen Sievert
Department of Information Science, University of Missouri at
Columbia, 110 Stewart Hall, Columbia, MO 65211
DAPPOR, a drill and practice program for online retrieval,
provides reinforcement to students engaged in learning the
basic command protocols of the major vendors of bibliographic
databases. The DAPPOR evaluation program overcomes the
difficult problems of determining the correctness of a user
response in a highly flexible environment. The coding of
answer definitions and the process of recursive reduction
used by the evaluation program are described.
(JASIS, Vol. 35(2): 129-134; 1984)
14. TWO PARTITIONING TYPE CLUSTERING ALGORITHMS
Fazli Can and Esen A. Ozkarahan
Arizona State University, Tempe, AZ 85287
In this article, two partitioning type clustering algorithms
are presented. Both algorithms use the same method for
selecting cluster seeds; however, assignment of documents to
the seeds is different. The first algorithm uses a new
concept called the "cover coefficient" and is a single-pass
algorithm. The second one uses a conventional measure for
document assignment to the cluster seeds and is a multipass
algorithm. The concept of clustering, a model for seed
oriented partitioning, the new centroid generation approach,
and an illustration for both algorithms are also presented in
the article.
(JASIS, Vol. 35(5): 268-276; 1984)
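The seed-oriented partitioning both algorithms share can be sketched generically: pick seeds, then assign every other document to its most similar seed (in a single pass, for the first algorithm). The sketch below does not reproduce the paper's cover-coefficient seed selection; the seed list and similarity function are placeholders supplied by the caller.

```python
# Generic single-pass, seed-oriented partitioning. Seed selection and
# the similarity measure are placeholders, not the paper's method.
def assign_to_seeds(docs, seeds, similarity):
    # Each seed anchors its own cluster.
    clusters = {s: [s] for s in seeds}
    # One pass over the non-seed documents: each document joins the
    # cluster of its most similar seed.
    for d in range(len(docs)):
        if d in clusters:
            continue
        best = max(seeds, key=lambda s: similarity(docs[d], docs[s]))
        clusters[best].append(d)
    return clusters

def overlap(a, b):
    # Toy similarity: number of index terms two documents share.
    return len(a & b)
```

Representing documents as sets of index terms, two seeds split a four-document collection into the two expected groups in one pass.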
15. ARTIFICIAL INTELLIGENCE: UNDERLYING ASSUMPTIONS AND BASIC
OBJECTIVES
Nick Cercone
Computing Science Department, Simon Fraser University,
Burnaby, British Columbia, Canada V5A 1S6
Gordon McCalla
Department of Computational Science, University of
Saskatchewan, Saskatoon, Saskatchewan, Canada S7N 0W0
Artificial intelligence (AI) research has recently captured
media interest and it is fast becoming our newest "hot"
technology. AI is an interdisciplinary field which derives
from a multiplicity of roots. In this article we present our
perspectives on methodological assumptions underlying
research efforts in AI. We also discuss the goals (design
objectives) of AI across the spectrum of subareas it
comprises. We conclude by discussing why there is increased
interest in AI and whether current predictions of the future
importance of AI are well founded.
(JASIS, Vol. 35(5): 280-290; 1984)
16. NATURAL LANGUAGE PROCESSING
Ralph Grishman
Department of Computer Science, New York University, 251
Mercer Street, New York, NY 10012
Natural language processing has two primary roles to play in
the storage and retrieval of large bodies of information:
providing a friendly, easily-learned interface to information
retrieval systems, and automatically structuring texts so
that their information can be more easily processed and
retrieved. This article outlines the organization of a
natural language interface for data retrieval (a "question-
answering system") and some of the approaches being taken to
text structuring. It closes by describing a few of the
research issues in computational linguistics and a
possibility for using interactive natural language processing
for information acquisition.
(JASIS, Vol. 35(5): 291-296; 1984)
17. EXPERT SYSTEMS: A TUTORIAL
N. Shahla Yaghmai
School of Library and Information Science, University of
Wisconsin-Milwaukee, P.O. Box 413, Milwaukee, WI 53201
Jacqueline A. Maxin
Computer Services, The H.W. Wilson Company, Bronx, NY 10452
Expert systems are intelligent computer applications that use
data, a knowledge base, and a control mechanism to solve
problems of sufficient difficulty that significant human
expertise is necessary for their solution. Expert systems
use artificial intelligence problem-solving and knowledge-
representation techniques to combine human expert knowledge
about a problem area with human expert methods of
conceptualizing and reasoning about that problem area. As a
result, it is expected that such systems can reach a level of
performance comparable to that of a human expert in a
specialized problem area. The high-level knowledge base and
associated control mechanism of expert systems are in essence
a model of the expertise of the best practitioners of the
problem area in question and, hence, human users are provided
with expert opinions about problems in that area. Expert
systems do not pretend to give final or ultimate conclusions
to displace human decision making; they are intended for
consulting purposes only.
(JASIS, Vol. 35(5): 297-305; 1984)
18. APPROACHES TO MACHINE LEARNING
Pat Langley
The Robotics Institute, Carnegie-Mellon University,
Pittsburgh, PA 15213
Jaime G. Carbonell
Department of Computer Science, Carnegie-Mellon University,
Pittsburgh, PA 15213
The field of machine learning strives to develop methods and
techniques to automate the acquisition of new information,
new skills, and new ways of organizing existing information.
This article reviews the major approaches to machine learning
in symbolic domains, illustrated with occasional paradigmatic
examples.
(JASIS, Vol. 35(5): 306-316; 1984)
19. ARTIFICIAL INTELLIGENCE: A SELECTED BIBLIOGRAPHY
Compiled by Linda C. Smith
Graduate School of Library and Information Science,
University of Illinois at Urbana-Champaign, Urbana, IL 61801
The literature of artificial intelligence (AI) is scattered
over many books, journals, conference proceedings, and
technical reports. This selected annotated bibliography,
arranged by type of material, can serve as an introduction to
that literature.
(JASIS, Vol. 35(5): 317-319; 1984)
20. AUTOMATIC SEARCH TERM VARIANT GENERATION
K. Sparck Jones and J. I. Tait
Computer Laboratory, University of Cambridge
The paper describes research designed to improve automatic
pre-coordinate term indexing by applying powerful general-
purpose language analysis techniques to identify term sources
in requests, and to generate variant expressions of the
concepts involved for document text searching.
(Journal of Documentation, Vol. 40, No. 1, March 1984, pp.
50-66).
21. HIERARCHIC AGGLOMERATIVE CLUSTERING METHODS FOR AUTOMATIC
DOCUMENT CLASSIFICATION
Alan Griffiths, Lesley A. Robinson and Peter Willett
Department of Information Studies, University of Sheffield,
Western Bank, Sheffield S10 2TN, UK
This paper considers the classifications produced by
application of the single linkage, complete linkage, group
average and Ward clustering methods to the Keen and Cranfield
document test collections. Experiments were carried out to
study the structure of the hierarchies produced by the
different methods, the extent to which the methods distort
the input similarity matrices during the generation of a
classification, and the retrieval effectiveness obtainable in
cluster based retrieval. The results would suggest that the
single linkage method, which has been used extensively in
previous work on document clustering, is not the most
effective procedure of those tested, although it should be
emphasized that the experiments have used only small document
test collections.
(Journal of Documentation, Vol. 40, No. 3, September 1984, pp.
175-205).
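Of the four methods compared, single linkage is the easiest to sketch: starting from singleton clusters, repeatedly merge the two clusters whose closest members are most similar. The code below is a toy illustration over a small made-up similarity matrix; the paper's experiments, of course, used real document test collections and compared this against complete linkage, group average, and Ward's method.

```python
# Toy single-linkage agglomerative clustering over a document
# similarity matrix (values in the test are made up for illustration).
def single_linkage(sim, threshold):
    # Start with each document in its own cluster.
    clusters = [{i} for i in range(len(sim))]
    while True:
        best, pair = threshold, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: inter-cluster similarity is the
                # maximum similarity over all cross-cluster pairs.
                s = max(sim[i][j]
                        for i in clusters[a] for j in clusters[b])
                if s > best:
                    best, pair = s, (a, b)
        if pair is None:
            # No pair exceeds the threshold: stop merging.
            return clusters
        a, b = pair
        clusters[a] |= clusters.pop(b)
```

Because only the single closest pair links two clusters, the method is prone to the "chaining" that the paper's results implicate in its weaker retrieval effectiveness.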
22. PROBABILISTIC AUTOMATIC INDEXING BY LEARNING FROM HUMAN
INDEXERS
S. E. Robertson
Department of Information Science, City University,
Northampton Square, London EC1V 0HB
P. Harding
Inspec, Station House, Nightingale Road, Hitchin,
Hertfordshire SG5 1RJ
A probabilistic model previously used in relevance feedback
is adapted for use in automatic indexing of documents (in the
sense of imitating human indexers). The model fits with
previous work in this area (the 'adhesion coefficient'
method), in effect merely suggesting a different way of
arriving at the adhesion coefficients. Methods for the
application of the model are proposed. The independence
assumptions used in the model are interpreted, and the
possibility of a dependence model is discussed.
(Journal of Documentation, Vol. 40, No. 4, December 1984, pp.
264-270).
------------------------------
END OF IRList Digest
********************