IRList Digest Volume 3 Number 29

IRList Digest           Tuesday, 25 August 1987      Volume 3 : Issue 29 

Today's Topics:
Announcement - Abstracts from next ACM SIGIR Forum (part 1 of 4)

News addresses are ARPANET: fox%vtopus.cs.vt.edu@relay.cs.net
BITNET: foxea@vtvax3.bitnet CSNET: fox@vt UUCPNET: fox@vtopus.uucp

----------------------------------------------------------------------

Date: Mon, 10 Aug 87 15:17:43 CDT
From: nancy@usl-vb.usl.edu (Nancy )
Subject: Abstracts from next ACM SIGIR Forum - sent by Raghavan

ABSTRACTS (part 1 of 4)

(Chosen by G. Salton from recent issues of journals in the retrieval area).

1. FUZZY RELATIONAL DATABASES: REPRESENTATIONAL ISSUES AND REDUCTION USING
SIMILARITY MEASURES
Henri Prade and Claudette Testemale
Laboratoire Langages et Systemes Informatiques
Universite Paul Sabatier
118 Route de Narbonne
31062 Toulouse Cedex, France
Until now, the idea of a fuzzy database has been investigated along
different lines: Some authors have dealt with the imprecision of attri-
bute values by modeling, using fuzzy similarity relations, the extent to
which these values could be regarded as interchangeable. Others have used
possibility distributions for representing fuzzily known or incompletely
known attribute values. The first approach, which cannot accommodate
incomplete information, is restated in the framework of rough sets
extended to fuzzy relations. Besides, in the second one, similarity meas-
ures between attribute values can be introduced and computed; then a com-
parison of the two approaches is provided. The proposed similarity meas-
ure, based on a fuzzy Hausdorff distance, estimates the mismatch between
two possibility distributions. From storage and query-evaluation points
of view, it may be interesting to gather items having similar attribute
values. Thus the similarity measures previously considered can be used
for the reduction of the fuzzy database. When several items have suffi-
ciently similar values for each attribute in a relation, the reduction is
performed by taking for each attribute the union of these similar values.
The consequences of the reduction process on query evaluation are studied.
(JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, Vol. 38, No. 2,
pp. 118-126, 1987)
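
As a rough illustration of the reduction step described in this abstract, the sketch below represents each attribute value as a possibility distribution (a mapping from domain elements to degrees in [0, 1]) and merges items whose attributes are all sufficiently similar, taking the pointwise maximum as the fuzzy union. The similarity function is a simple stand-in (one minus the largest pointwise difference), not the fuzzy Hausdorff measure proposed in the paper. The names `similarity`, `fuzzy_union`, and `reduce_items`, the greedy merge order, and the threshold are assumptions made for this example, and all items are assumed to share the same attributes.

```python
from typing import Dict, List

Dist = Dict[str, float]   # possibility distribution: value -> degree in [0, 1]
Item = Dict[str, Dist]    # one stored item: attribute name -> distribution


def similarity(a: Dist, b: Dist) -> float:
    """Stand-in similarity: 1 minus the largest pointwise difference."""
    keys = set(a) | set(b)
    if not keys:
        return 1.0
    return 1.0 - max(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)


def fuzzy_union(a: Dist, b: Dist) -> Dist:
    """Fuzzy union as the pointwise maximum of the two distributions."""
    return {k: max(a.get(k, 0.0), b.get(k, 0.0)) for k in set(a) | set(b)}


def reduce_items(items: List[Item], threshold: float) -> List[Item]:
    """Greedily merge items whose attributes are all similar enough."""
    reduced: List[Item] = []
    for item in items:
        for kept in reduced:
            if all(similarity(item[attr], kept[attr]) >= threshold for attr in item):
                # Merge into the kept item by taking the union per attribute.
                for attr in item:
                    kept[attr] = fuzzy_union(item[attr], kept[attr])
                break
        else:
            # No sufficiently similar item found: keep a copy as a new entry.
            reduced.append({attr: dict(dist) for attr, dist in item.items()})
    return reduced
```

A merged item is less precise than either original, which is why the abstract goes on to study how reduction affects query evaluation.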

2. KNOWLEDGE-ASSISTED DOCUMENT RETRIEVAL: II. THE RETRIEVAL PROCESS
Gautam Biswas, James C. Bezdek, Viswanath Subramanian, and Marisol
Marques.
Department of Computer Science
University of South Carolina
Columbia, South Carolina 29208
This article presents our conceptual model of the retrieval process of
a document-retrieval system. The retrieval mechanism input is an unambi-
guous intermediate form of a user query generated by the language proces-
sor using the method described previously. Our retrieval mechanism uses a
two-step procedure. In the first step a list of documents pertinent to
the query is obtained from the document database, and then an evidence-
combination scheme is used to compute the degree of support between the
query and individual documents. The second step uses a ranking procedure
to obtain a final degree of support for each document chosen, as a func-
tion of individual degrees of support associated with one or more parts of
the query. The end result is a set of document citations presented to
the user in ranked order in response to the information request. Numerical
examples are given to illustrate various facets of the overall system,
which has been prototypically implemented in modular form to test
system response to changes in model parameters.
(JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, Vol. 38, No. 2,
pp. 97-110, 1987)
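
The two-step procedure in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the evidence-combination scheme or the ranking function, so the sketch substitutes the best matching term per query part for step one and the mean over parts for step two. The index layout and the names `part_support` and `retrieve` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Index = Dict[str, Dict[str, float]]   # doc id -> {term: degree of association}


def part_support(doc_terms: Dict[str, float], part: Set[str]) -> float:
    """Stand-in evidence combination: best matching term within one query part."""
    return max((doc_terms.get(t, 0.0) for t in part), default=0.0)


def retrieve(index: Index, query_parts: List[Set[str]]) -> List[Tuple[str, float]]:
    wanted: Set[str] = set().union(*query_parts)
    # Step 1: keep documents that mention any query term and compute the
    # degree of support between each query part and each document.
    candidates = {
        doc: [part_support(terms, part) for part in query_parts]
        for doc, terms in index.items()
        if wanted.intersection(terms)
    }
    # Step 2 (stand-in): the final degree of support is the mean over the
    # query parts; documents are returned in decreasing order of support.
    ranked = [(doc, sum(s) / len(s)) for doc, s in candidates.items()]
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))
```

Keeping the two steps in separate functions mirrors the modular form mentioned in the abstract, so a different combination or ranking rule can be swapped in without touching candidate selection.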

3. KNOWLEDGE-ASSISTED DOCUMENT RETRIEVAL: I. THE NATURAL-LANGUAGE INTERFACE
Gautam Biswas, James C. Bezdek, Marisol Marques, and Viswanath
Subramanian
Department of Computer Science
University of South Carolina
Columbia, South Carolina 29208
In this article we describe the conceptual model and processing of
(constrained) natural-language queries in information retrieval systems.
A language interface based on fuzzy set techniques is proposed to handle
the uncertainty inherent in natural-language semantics. The conceptual
model is developed and exemplified in the context of document retrieval.
Specifically, the user query is considered to be a triple,
q = (q_c, q_y, q_n), where q_c indicates the part of the query that deals
with concepts and operators that link these concepts, q_y identifies the
publication period the user is interested in, and q_n pertains to the
number of documents to be retrieved. We describe query
decomposition using an augmented transition network parser and the assign-
ment of functions and relations needed by each portion of the query to
represent uncertainties inherent in the natural language. The output of
the natural-language interface is then passed to a knowledge-based
retrieval mechanism that will be described in a companion article (Part
II).
(JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, Vol. 38, No. 2,
pp. 83-96, 1987)
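
A minimal sketch of the three-part query described in this abstract (concepts, publication period, number of documents), together with one possible fuzzy reading of the period component. The class name `Query`, the field names, the function `year_membership`, and the linear fall-off controlled by `slack` are all assumptions for illustration. The abstract does not specify the membership functions the authors assign.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Query:
    concepts: str            # concepts and the operators that link them
    years: Tuple[int, int]   # publication period of interest (inclusive)
    count: int               # number of documents to retrieve


def year_membership(year: int, years: Tuple[int, int], slack: int = 2) -> float:
    """Degree to which a year fits the period; falls off linearly outside it."""
    lo, hi = years
    if lo <= year <= hi:
        return 1.0
    distance = lo - year if year < lo else year - hi
    return max(0.0, 1.0 - distance / slack)
```

For example, a request for "recent papers" might map to a period whose edges are soft, so that a paper one year outside the stated range is still a partial match and not excluded outright.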

4. A NOTE ON WEIGHTED QUERIES IN INFORMATION RETRIEVAL SYSTEMS
Ronald R. Yager
Machine Intelligence Institute
Iona College
New Rochelle, New York 10801
Several authors have suggested the introduction of fuzzy set methodolo-
gies as a means for improving the performance of information-retrieval
systems [1-8]. In a recent survey [9] of information-retrieval technolo-
gies, Bartschi discusses the fuzzy set model among other models. A prob-
lem of considerable interest to designers of fuzzy set retrieval systems
concerns itself with the evaluation of the retrieval status function in
the situation in which the query terms or the search criteria have weights
indicating their importance to the requester. A number of approaches have
been suggested for this problem, but Bartschi [9] points out some of the
difficulties with each of these proposed methods. We suggest an alterna-
tive methodology for handling weighted queries in a fuzzy environment.
(JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, Vol. 38, No. 1,
pp. 23-24, 1987)

5. A GRAPHICAL DATABASE INTERFACE FOR CASUAL, NAIVE USERS
Clifford Burgess
Computer Science Department
University of Southern Mississippi
Hattiesburg, Mississippi 39403
and
Kathleen Swigger
Computer Science Department
North Texas State University
Denton, Texas 76203
This paper is concerned with some aspects of database interfaces for
casual, naive users. A ``casual user'' is defined as an individual who
wishes to execute queries once or twice a month, and a ``naive user'' is
someone who has little or no expertise in operating computers. The study
focuses on a specific group of casual, naive users, analyzes their needs
and proposes a solution. The proposed interface consists of a graphical
display of a model of a database and a natural language query language.
One of the unique properties of the database interface is that it allows
the user to see local item names within the context of a global structure.
The interface was then tested to determine whether it was acceptable to
the user population and to discover the level of graphical model that the
users would find most comfortable.
(INFORMATION PROCESSING & MANAGEMENT, Vol. 22, No. 6, pp. 511-521, 1986)

6. COMPRESSION OF INDEX TERM DICTIONARY IN AN INVERTED-FILE ORIENTED DATA-
BASE: SOME EFFECTIVE ALGORITHMS
Janusz L. Wisniewski
Applied Informatics Department
Nicholas Copernicus University
Grudziadzka 5/7, Torun, Poland
A new method of index term dictionary compression in an inverted-file-
oriented database is discussed. A technique of word coding that generates
short fixed-length codes obtained from the index terms themselves by
analysis of monogram and bigram statistical distributions is described.
Transformation of the index term dictionary into a code dictionary
preserves word-to-word discrimination with a rate of three synonyms per
1300 terms, at a compression ratio of up to 90% and at low cost in CPU
time. When applied in a computer network environment, it offers
substantial savings in communication channel utilization at negligible
response time degradation. Experimental data for the 26,113-term index
dictionary of the New York Times Info Bank, available via a computer
network, are presented.
(INFORMATION PROCESSING & MANAGEMENT, Vol. 22, No. 6, pp. 493-501, 1986)
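
To make the idea of fixed-length codes and "synonyms" (distinct terms that receive the same code) concrete, here is a small sketch. It is not the algorithm from the paper: the coding rule below uses only single-letter (monogram) frequencies and simply keeps each term's rarest letters, whereas the paper also draws on bigram statistics. The names `build_codes` and `count_synonyms`, the code length, and the padding character are assumptions.

```python
from collections import Counter
from typing import Dict, List


def build_codes(terms: List[str], length: int = 4) -> Dict[str, str]:
    """Map each term to a fixed-length code made of its rarest letters."""
    freq = Counter(ch for term in terms for ch in term)
    codes = {}
    for term in terms:
        # Rarest letters first; ties are broken alphabetically for determinism.
        rare = sorted(set(term), key=lambda ch: (freq[ch], ch))[:length]
        codes[term] = "".join(rare).ljust(length, "_")
    return codes


def count_synonyms(codes: Dict[str, str]) -> int:
    """Number of terms whose code collides with an already-used code."""
    return len(codes) - len(set(codes.values()))
```

Dividing `count_synonyms` by the dictionary size gives a collision rate comparable in kind to the "three synonyms per 1300 terms" figure quoted above, although this toy rule would not be expected to reach that rate.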

7. COMPUTER USE OF A MEDICAL DICTIONARY TO SELECT SEARCH WORDS
John O'Connor
Computer Science and Electrical Department
Packard Lab #19
Lehigh University
Bethlehem, Pennsylvania 18015
In a preceding experiment in text-searching retrieval for cancer ques-
tions, search words were humanly selected with the aid of a medical dic-
tionary and cancer textbooks. Recall results were (1) using only stems of
question words (humanly stemmed): 20%; (2) adding dictionary search words:
29%; (3) adding also textbook search words: 70%. For the experiment
reported here, computer procedures for using the medical dictionary to
select search words were developed. Recall results were (1) for question
stems (computer stemmed): 19%; (2) adding search words computer selected
from the dictionary: 24%. Thus the computer procedures compared to human
use of the dictionary were 50% successful. Human and computer false
retrieval rates were almost equal. Some hypotheses about computer selec-
tion of search words from textbooks are also described.
(INFORMATION PROCESSING & MANAGEMENT, Vol. 22, No. 6, pp. 477-486, 1986)

8. IMPLEMENTING AGGLOMERATIVE HIERARCHIC CLUSTERING ALGORITHMS FOR USE IN
DOCUMENT RETRIEVAL
Ellen M. Voorhees
Department of Computer Science
Cornell University
Ithaca, New York 14853
Searching hierarchically clustered document collections can be effec-
tive [6], but creating the cluster hierarchies is expensive, since there
are both many documents and many terms. However, the information in the
document-term matrix is sparse: Documents are usually indexed by rela-
tively few terms. This paper describes the implementations of three
agglomerative hierarchic clustering algorithms that exploit this sparsity
so that collections much larger than the algorithms' worst-case running
times would suggest can be clustered. The implementations described in
the paper have been used to cluster a collection of 12,000 documents.
(INFORMATION PROCESSING & MANAGEMENT, Vol. 22, No. 6, pp. 465-476, 1986)
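
The sparsity argument in this abstract can be illustrated with a short sketch: an inverted file is used so that similarities are computed only for document pairs that share at least one term, and those pairs then drive an agglomerative merge. Single-link clustering and Jaccard similarity are used here purely as examples; the abstract does not name the three algorithms or the similarity measure, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple


def sparse_similarities(docs: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Jaccard similarity for only those document pairs that share a term."""
    postings = defaultdict(set)            # inverted file: term -> doc ids
    for doc, terms in docs.items():
        for term in terms:
            postings[term].add(doc)
    pairs = set()
    for members in postings.values():
        pairs.update(combinations(sorted(members), 2))
    return {(a, b): len(docs[a] & docs[b]) / len(docs[a] | docs[b]) for a, b in pairs}


def single_link(docs: Dict[str, Set[str]]) -> List[Tuple[str, str, float]]:
    """Return the merges (cluster_a, cluster_b, similarity), most similar first."""
    parent = {doc: doc for doc in docs}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    merges = []
    sims = sparse_similarities(docs)
    for (a, b), sim in sorted(sims.items(), key=lambda kv: (-kv[1], kv[0])):
        ra, rb = find(a), find(b)
        if ra != rb:                       # the pair joins two different clusters
            parent[rb] = ra
            merges.append((ra, rb, sim))
    return merges
```

Documents that share no term with any other document never appear in a candidate pair, so they stay as singleton clusters at similarity zero without any work being spent on them.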

9. NATIONAL SCIENCE FOUNDATION SUPPORT FOR COMPUTER AND INFORMATION SCIENCE
AND ENGINEERING
Harold E. Bamford and Charles N. Brownstein
National Science Foundation
Washington, D. C. 20550
The National Science Foundation has supported research in the informa-
tion sciences for 25 years, initially through its Office of Scientific
Information, later through the Office of Science Information Service and
the Division of Science Information, and most recently through the Divi-
sion of Information Science and Technology. The Foundation has also sup-
ported research in computer science and engineering, most recently through
the Division of Computer Research and the Division of Computer and Infor-
mation Engineering. On May 1, 1986 all these elements were brought
together to form the Directorate for Computer and Information Science and
Engineering (CISE), one of the five research branches of the Foundation.
A more persuasive demonstration of the Foundation's commitment to this
dynamic new field of research would hardly have been possible.
(INFORMATION PROCESSING & MANAGEMENT, Vol. 22, No. 6, pp. 449-452, 1986)
[Note: continued in next 3 issues - Ed]

------------------------------

END OF IRList Digest
********************
