IRList Digest Sunday, 27 Oct 1985 Volume 1 : Issue 17
Today's Topics:
References - Abstracts of Talks at ACM SIGIR 85 in Montreal
----------------------------------------------------------------------
From: "V.J. Raghavan" <ihnp4!sask!regina!raghavan@UCB-VAX>
Date: Tue, 27 Aug 85 16:02:23 cst
Subject: Conf. '85
[This year's Montreal Conference on IR had many interesting papers.
A file, in "SMART" form, of the abstracts, was typed in at Univ.
of Regina and edited at Virginia Tech. Order info for proceedings is:
Proc. of the Eighth Annual Int. ACM SIGIR Conf. on R&D in Inf. Ret.,
ACM Order No. 606850 -- Ed]
.I 1
.T
Graphical Information Resources: Maps and Beyond
.A
MICHAEL LESK
.W
The rise of computer graphics offers a new challenge for information
retrieval: how to search and retrieve information which is partly or
wholly graphical. As an example, procedures for handling geographical
information, such as street maps and directories, are explained. With
this data, it is possible to find routes on maps, retrieve locations
and names of people or businesses, and draw maps. But a comparison
of these programs with programs for face processing or computer
typesetting makes clear how far we are from general purpose routines.
Today, successful graphics routines contain a great deal of local
domain knowledge. There is no analog of the simple keyword systems
that handle textual documents in any subject area. Just as
computational linguists have found that subject matter expertise is
necessary to do really sophisticated processing of English, it seems
also necessary for sophisticated processing of pictures; the difference
is that we don't know how to do unsophisticated processing of graphics.
.I 2
.T
Implications of Boolean Structure for Probabilistic Retrieval
.A
ABRAHAM BOOKSTEIN
.W
The purpose of an information retrieval system is to retrieve documents
in response to a request. However, there are many strategies as to how
this can be accomplished. Most popular are the Boolean systems attached
to the commercial on-line database systems. Less conspicuous are a number
of approaches being developed in university research laboratories, which
are the object of this paper.
This paper is divided into two parts: The first will be a more or less
nontechnical overview of research in IR over the last ten to fifteen
years. The second part, which is a bit more technical and a lot more
speculative, will cover some interesting questions that are now being
studied.
.I 3
.T
Generalized Vector Space Model In Information Retrieval
.A
S. K. M. WONG
WOJCIECH ZIARKO
PATRICK C. N. WONG
.W
In information retrieval, it is common to model index terms and documents
as vectors in a suitably defined vector space. The main difficulty with
this approach is that the explicit representation of term vectors is not
known a priori. For this reason, the vector space model adopted by
Salton for the SMART system treats the terms as a set of orthogonal
vectors. In such a model it is often necessary to adopt a separate,
corrective procedure to take into account the correlations between terms.
In this paper, we propose a systematic method (the generalized vector
space model) to compute term correlations directly from the automatic
indexing scheme. We also demonstrate how such correlations can be
included with minimal modification in the existing vector based
information retrieval systems. The preliminary experimental results
obtained from the new model are very encouraging.
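[The generalized similarity can be illustrated with a small sketch: term
correlations derived from co-occurrence in the term-document collection
fold into the usual inner product as d'Gq. This is an illustrative
formulation, not necessarily the authors' exact construction. -- Ed]

```python
# Sketch: fold term correlations into vector-space similarity.
# G = D^T D (term-term co-occurrence) generalizes the dot product
# d . q into d^T G q, so correlated terms reinforce each other.
# Illustrative only, not the paper's exact generalized model.

def term_correlations(docs):
    """docs: list of term-weight dicts. Returns {(t1, t2): weight}."""
    corr = {}
    for d in docs:
        for t1, w1 in d.items():
            for t2, w2 in d.items():
                corr[(t1, t2)] = corr.get((t1, t2), 0.0) + w1 * w2
    return corr

def generalized_sim(doc, query, corr):
    """Similarity d^T G q using the correlation matrix G."""
    return sum(dw * corr.get((t1, t2), 0.0) * qw
               for t1, dw in doc.items()
               for t2, qw in query.items())

docs = [{"ship": 1.0, "boat": 1.0}, {"ship": 1.0, "ocean": 1.0}]
corr = term_correlations(docs)
# "boat" and "ship" co-occur, so a query on "boat" now also
# rewards documents that only mention "ship".
print(generalized_sim({"ship": 1.0}, {"boat": 1.0}, corr))
```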
.I 4
.T
Handling Multiple Data Bases in Document Retrieval
.A
IAN A. MACLEOD
.W
There is no such thing as a standard document. Bibliographic information
comes in a wide variety of formats. Existing retrieval systems handle
different document styles either by creating an artificial document type
or by providing different and independent data bases. Neither approach
seems satisfactory. In this paper we describe a data model which we feel
is more appropriate for document representation and show that it can handle
the multiple document type problem quite naturally.
.I 5
.T
Theoretical Measures in P/Q Document Spaces
.A
ROBERT R. KORFHAGE
.W
Early work on the use of profiles to aid in the interpretation of queries
in an information retrieval system developed the concepts of retrieval
subspaces that are either ellipsoids or Cassini ovals. This report extends
the theoretical basis of these models, exploring other models and the
effect of different metrics on the conceptual retrieval subspace.
.I 6
.T
Composite Document Extended Retrieval - An Overview
.A
EDWARD A. FOX
.W
'Composite document' is the name given herein to items which possess
some complex internal structure and are made up of a variety of types
of data, including a significant amount of textual material. Examples
are: elaborate bibliographic records, office correspondence or reports,
electronic mail, digests from computer bulletin boards, entries in online
name servers for network users, books, and parts of some expert system
knowledge bases. Investigation of composite documents is being done in
connection with research in passage retrieval, database management, office
modeling of documents, natural language analysis, and information retrieval.
Preliminary results indicate that extending the vector space model leads to
better document representation, and that extending standard Boolean logic
query capabilities results in more effective searches. Further study with
a variety of specially prepared test collections will aid evaluation of the
several models and methods proposed.
.I 7
.T
Automatic Assignment of Soft Boolean Operators
.A
GERARD SALTON
ELLEN VOORHEES
.W
The conventional bibliographic retrieval systems are based on Boolean
query formulations and inverted file implementations. Such systems
provide rapid responses in answer to search queries but they are not
easy to use by uninitiated patrons. An extended Boolean retrieval
strategy has been devised in which the Boolean operators are treated
more or less strictly, depending on the setting of a special parameter,
known as the p-value. The extended system is much more forgiving than
the conventional system, and provides better retrieval effectiveness.
In this study various problems associated with the determination of
appropriate p-values are discussed, and suggestions are made for an
automatic assignment of p-values. Evaluation output is included to
illustrate the operations of the suggested procedures.
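[The behavior of the p-value can be sketched with the standard p-norm
formulas for soft Boolean operators; the paper's procedures for assigning
p-values automatically are not reproduced here. -- Ed]

```python
# Sketch of the p-norm "soft" Boolean operators. The p-value sets
# how strictly OR/AND behave: p=1 averages term weights like a
# vector model; as p grows, OR approaches max and AND approaches
# min, recovering conventional strict Boolean logic.

def p_or(weights, p):
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)

def p_and(weights, p):
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)

w = [0.9, 0.1]              # document weights for two query terms
print(p_or(w, 1))            # 0.5 -- behaves like an average
print(p_and(w, 1))           # 0.5 -- same at p=1
print(p_or(w, 100))          # close to max(w): strict OR
print(p_and(w, 100))         # close to min(w): strict AND
```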
.I 8
.T
Output Ranking Methodology for Document-Clustering-Based
Boolean Retrieval Systems
.A
TADEUSZ RADECKI
.W
During the last two decades or so several alternative approaches to
document clustering have been introduced, each of which constitutes
a basis for a number of various clustering methods. Unfortunately,
these methods are limited in many ways. In particular, most of the
available document clustering procedures have been designed to be
applicable in the cases where both documents and user queries are
represented by sets of weighted or unweighted index terms. However,
in commercial document retrieval systems queries are usually
characterized by Boolean expressions of variables corresponding to
the index terms of interest to the system's users. Therefore, it
would be desirable to extend the conventional approaches to document
clustering with the aim of implementing them in operational
environments. In developing these extended approaches, one of the
most important problems to be solved appears to be related to the
system's adaptation to individual users. More specifically, in the
course of such an adaptation the system should utilize all the
information available about the user's needs, and document and query
representations should be modified accordingly, so that the most
relevant documents could be retrieved. This paper presents an
account of the author's most recent research along these lines.
After providing a background for widely known approaches to
document clustering, rigorous reasoning for extending these
approaches to a conventional Boolean retrieval framework is given.
.I 9
.T
The New Oxford English Dictionary and its Potential Users:
Some Preliminary Comments
.A
JOHN STUBBS
.W
On May 15, 1984, the Oxford University Press formally announced its
intention to computerize the Oxford English Dictionary. The OED and
its Supplement, which comprise approximately 60 million words, will
be entered into a computer; a database design will be created for this
integrated version of the Dictionary; and, in addition to its printed
form, the Dictionary will be available in electronic form.
.I 10
.T
Processing Free-Text Input to Obtain a Database of Medical Information
.A
EMILE C. CHI
CAROL FRIEDMAN
NAOMI SAGER
MARGARET S. LYMAN
.W
The Linguistic String Project of New York University has developed computer
programs that convert the information in free-text documents of a technical
specialty into a structured form suitable for mapping into a relational
database. The processing is based upon the restrictions on the use of
language that are characteristic of the subject matter and the document type.
These restrictions are summarized in a "sublanguage grammar" that provides a
set of word classes and formulas corresponding to the objects and relations
of interest in the domain. The programs are independent of the particular
sublanguage grammar employed. The application to narrative patient records
will be described and the applicability of the methods to other domains
discussed.
.I 11
.T
An Approach to Multikey Sequencing in an Equiprobable Keyterm
Retrieval Situation
.A
B. G. T. LOWDEN
.W
We present a simple generalized technique for sequencing a multi-
attribute file, which can be used in a situation where the query
pattern is unknown and the term content equiprobable. The method
is based on constructing a short spanning path through the records
thereby minimizing the sum of the Hamming distances between them.
Retrieval performance is simulated over a range of query expressions
and results suggest a significant reduction in the number of block
accesses using this technique as compared with records randomly
distributed over the file space.
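[A greedy (not optimal) version of the short-spanning-path idea can be
sketched as follows: order the records so that neighbours differ in few
attribute bits. -- Ed]

```python
# Sketch: order records along a short (greedy, not optimal) spanning
# path so adjacent records differ in few attribute bits; similar
# records then land in the same blocks, which reduces block accesses
# for partial-match queries.

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def greedy_spanning_path(records):
    path = [records[0]]
    remaining = list(records[1:])
    while remaining:
        last = path[-1]
        nxt = min(remaining, key=lambda r: hamming(last, r))
        remaining.remove(nxt)
        path.append(nxt)
    return path

recs = ["1100", "0011", "1110", "0001"]
path = greedy_spanning_path(recs)
print(path)  # similar bit patterns become neighbours
print(sum(hamming(a, b) for a, b in zip(path, path[1:])))
```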
.I 12
.T
Optimization of Inverted Vector Searches
.A
CHRIS BUCKLEY
ALAN F. LEWIT
.W
A simple algorithm is presented for increasing the efficiency of
information retrieval searches which are implemented using inverted
files. This optimization algorithm employs knowledge about the
methods used for weighting document and query terms in order to
examine as few inverted lists as possible. An extension to the
basic algorithm allows greatly increased performance optimization
at a modest cost in retrieval effectiveness. Experimental runs are
made examining several different term weighting models and showing
the optimization possible with each.
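[The flavor of the optimization can be sketched as upper-bound pruning in
the spirit of the paper, not the authors' exact algorithm. The sketch
assumes document term weights lie in [0, 1], so each remaining list can
add at most its query weight to any score; already-seen documents may
still reorder, making this the approximate (faster) variant. -- Ed]

```python
# Sketch: process inverted lists in decreasing order of query-term
# weight and stop once the unseen lists cannot lift any unseen
# document into the top k.

import heapq

def ranked_search(inverted, query, k):
    """inverted: term -> {doc: weight}; query: term -> weight."""
    terms = sorted(query, key=query.get, reverse=True)
    scores = {}
    for i, t in enumerate(terms):
        for doc, w in inverted.get(t, {}).items():
            scores[doc] = scores.get(doc, 0.0) + query[t] * w
        # upper bound on what the remaining (lighter) lists can add
        remaining = sum(query[u] for u in terms[i + 1:])
        top = heapq.nlargest(k, scores.values())
        if len(top) == k and top[-1] >= remaining:
            break          # no unseen document can reach the top k
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])

inv = {"cat": {"d1": 0.9, "d2": 0.2},
       "dog": {"d2": 0.8},
       "fish": {"d3": 0.5}}
print(ranked_search(inv, {"cat": 1.0, "dog": 0.5, "fish": 0.1}, 2))
# the "fish" list is never read: d1 and d2 already dominate
```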
.I 13
.T
P-Trees: Storage Efficient Multiway Trees
.A
DAVID M. ARNOW
AARON M. TENENBAUM
CONNIE WU
.W
A new variation of high order multiway tree structures, the P-tree, is
presented. P-trees have average access costs that are significantly
better than those of B-trees and are no worse (and often better) in
storage utilization. Unlike compact B-trees, they can be maintained
dynamically, and unlike dense multiway trees and B-trees, their
associated insertion algorithm, which is also presented, is cheap and
involves (at most) a very localized rearrangement of keys.
.I 14
.T
Efficient Variants of Huffman Codes in High Level languages
.A
Y. CHOUEKA
S. T. KLEIN
T. PERL
.W
Although it is well-known that Huffman Codes are optimal for text
compression in a character per character encoding scheme, they are
seldom used in practical situations since they require a bit-by-bit
decoding algorithm, which has to be written in some assembly language
and will perform rather slowly. A number of methods and some slight
variants are here presented that avoid these difficulties. The decoding
algorithms efficiently process the encoded string on a byte-per-byte
basis, and can be programmed in any high level language. This is achieved
at the cost of reducing the net compression savings by a negligible amount,
and storing some tables in the internal memory. Experimental results are
presented.
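[The byte-per-byte idea can be sketched in a high level language:
precompute, for each decoder state (a position in the Huffman tree) and
each possible input byte, the symbols emitted and the next state, so
decoding is one table lookup per byte. The authors' exact table layouts
and variants differ. -- Ed]

```python
import heapq, itertools

def huffman_tree(freqs):
    """Returns the tree as nested tuples: leaf=symbol, node=(l, r)."""
    count = itertools.count()            # tie-breaker for the heap
    heap = [(f, next(count), s) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(count), (a, b)))
    return heap[0][2]

def codes(tree, prefix=""):
    if not isinstance(tree, tuple):      # leaf: a symbol
        return {tree: prefix}
    d = codes(tree[0], prefix + "0")
    d.update(codes(tree[1], prefix + "1"))
    return d

def build_byte_table(tree):
    """(state, byte) -> (symbols decoded, state after the byte)."""
    states = []
    def collect(n):
        if isinstance(n, tuple):
            states.append(n)
            collect(n[0]); collect(n[1])
    collect(tree)
    table = {}
    for state in states:
        for byte in range(256):
            out, cur = [], state
            for i in range(7, -1, -1):   # walk the byte's 8 bits
                cur = cur[(byte >> i) & 1]
                if not isinstance(cur, tuple):
                    out.append(cur)
                    cur = tree           # restart at the root
            table[(state, byte)] = (out, cur)
    return table

def decode(table, root, data, n_symbols):
    out, state = [], root
    for byte in data:
        symbols, state = table[(state, byte)]
        out.extend(symbols)
    return "".join(out[:n_symbols])      # drop padding artifacts

text = "abracadabra"
tree = huffman_tree({c: text.count(c) for c in set(text)})
code = codes(tree)
bits = "".join(code[c] for c in text)
bits += "0" * (-len(bits) % 8)           # pad to whole bytes
data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
table = build_byte_table(tree)
print(decode(table, tree, data, len(text)))  # abracadabra
```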
.I 15
.T
Optimization of a Hierarchical File Organization for Spelling Correction
.A
TETSURO ITO
CLEMENT T. YU
.W
The application of a hierarchically organized file to spelling check and
correction problems seems to be promising, since it can correct more than
the common typing mistakes. However, the speed of retrieving records
that exactly match the input is rather slow, and the time for computing
similarity values is not negligible. Here some techniques to overcome
these defects are presented along with the approximate estimation of the
improved file search time.
.I 16
.T
Designing an Information Retrieval Interface Based on User Characteristics
.A
CHRISTINE L. BORGMAN
DONALD O. CASE
CHARLES T. MEADOW
.W
With the increasing number of information retrieval systems and databases
available and the increasing demand from end users to use these systems,
a need exists for improved training mechanisms. This paper reports on a
project to develop an integrated online instruction and assistance system
to be used as a "front end" to the U.S. Department of Energy's RECON
retrieval system. The conceptual framework for the interface design will
be based on individual characteristics of its current and prospective
users, predominantly scientists conducting energy research. We will
build a prototype based on information gained from interviews with
scientists using the system (either directly or through a search
intermediary) and interviews with search intermediaries. In addition, we
will outline a separate interface tailored to search intermediary needs,
based on interviews and on known differences between end users and
intermediaries in information seeking behavior and conceptual views of
information systems. The paper reports on research in progress through
the initial interviews. The remaining parts of the project are outlined.
.I 17
.T
Different levels of expertise for an expert system in information retrieval
.A
B. DEFUDE
.W
The aim of this article is to describe the specifications of an Information
Retrieval Expert System (IRES). Expert systems are a recent approach that
seems well suited to building efficient and user-friendly retrieval
systems. The capabilities offered by classical Information Retrieval
Systems (IRS) are not adequate; this inadequacy stems from some important
restrictions:
- a limited indexing vocabulary, very often designed a priori;
- search terms that have to belong to this vocabulary, with only
some simple semantic relations (such as synonymy) usable during
the search.
Consequently, if the terms used in the query are not those used for
indexing, the system does not give a good answer.
.I 18
.T
One-Time Complete Indexing of Text: Theory and Practice
.A
RAYMOND J. D'AMORE
CLINTON P. MAH
.W
N-gram indexing provides a significant alternative to the keyword indexing
and full text scanning methods. N-grams are sets of overlapping word
fragments selected for indexing according to information-theoretic
considerations. They provide complete, fixed indexing of all words, numbers,
and special terms in text. The behavior of such an n-gram system can be
statistically modeled and validated over a wide range of text. The model
provides a descriptive and predictive tool for controlling precision and
recall during retrievals and scaling estimates of relevance to an adaptive,
reference noise distribution for a given collection. Partial inversion
and simple data compression are among the capabilities that make n-gram
systems performance-competitive with other approaches.
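[An illustrative reduction of n-gram indexing: index every word by its
overlapping fragments, so any term, number, or misspelled variant can be
matched without a curated keyword vocabulary. The paper's
information-theoretic n-gram selection is omitted. -- Ed]

```python
# Sketch: index words by overlapping trigrams and rank documents by
# how many of a query term's trigrams they share. Misspellings still
# overlap on most trigrams, so they match the intended documents.

def ngrams(word, n=3):
    padded = f" {word} "                 # mark word boundaries
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def build_index(docs):
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            for g in ngrams(word):
                index.setdefault(g, set()).add(doc_id)
    return index

def search(index, term):
    """Rank documents by shared n-gram count with the term."""
    hits = {}
    for g in ngrams(term.lower()):
        for doc_id in index.get(g, ()):
            hits[doc_id] = hits.get(doc_id, 0) + 1
    return sorted(hits, key=hits.get, reverse=True)

docs = {1: "retrieval of text", 2: "graphic displays"}
index = build_index(docs)
print(search(index, "retrieval"))   # [1]
print(search(index, "retreival"))   # misspelling still finds doc 1
```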
.I 19
.T
Experiments with cited titles for automatic document indexing and similarity
measure in a probabilistic context
.A
K. L. KWOK
.W
Most people will agree that the ultimate goal of an information retrieval
(IR) system would be to incorporate as many artificial intelligence
techniques as possible so that the system would be able to reply to queries
with specific answers (so-called fact retrieval). Current capabilities for
large-scale general-purpose retrieval, however, are still at the document
retrieval level, namely retrieving documents that most probably contain
answers to a query. This may not be a transient stage at all; it has been
pointed out that document retrieval may serve as a pre-processing stage
for the more detailed fact retrieval, even if the problems associated with
the latter are assumed solved.
.I 20
.T
A Learning Algorithm Applied to Document Redescription
.A
MICHAEL D. GORDON
.W
Relevance feedback in document retrieval usually involves an inquirer
revising a query to obtain better retrieval effectiveness [Rocchio] and
[Salton]. The revised query is adjusted to correspond to the descriptions
of known documents which the inquirer has found relevant. Several problems
exist with such methods, however, and will be described in this paper.
.I 21
.T
The Cluster Hypothesis Revisited
.A
ELLEN M. VOORHEES
.W
A new means of evaluating the cluster hypothesis is introduced and the
results of such an evaluation are presented for four collections. The
results of retrieval experiments comparing a sequential search, a cluster-
based search, and a search of the cluster collection in which individual
documents are scored against the query are also presented. These results
indicate that while the absolute performance of a search on a particular
collection is dependent on the pairwise similarity of the relevant documents,
the relative effectiveness of clustered retrieval versus sequential retrieval
is independent of this factor. However, retrieval of entire clusters in
response to a query usually results in a poorer performance than retrieval
of individual documents from clusters.
.I 22
.T
.A
C. T. YU
Y. T. WANG
C. H. CHEN
.W
It is well known [Salt, LaYu] that the retrieval of documents stored in
clusters allows queries to be answered efficiently. Furthermore,
clustered retrieval may sometimes gain in retrieval effectiveness over
relevance term weighting [Crof]. It has also been shown [VanR] that
clustering of terms, when used properly (e.g. in tree dependence model
[HaVa]) will gain significant improvement over retrieval processes based
on the independence of terms [RoSp, YuSa]. Although numerous clustering
algorithms [Salt, VanR, Ever] have been proposed and tested, the
following problems remain.
1) the clustering algorithms are extremely expensive to use, if
the number of objects (documents or terms) is large;
2) when the user environment changes, it is necessary to perform
reclustering;
3) most clustering algorithms cluster objects based on syntactic
relationships between the objects only. Information which
is extractable from previous users is ignored.
.I 23
.T
Concepts of the Cover Coefficient-Based Clustering Methodology
.A
FAZLI CAN
ESEN A. OZKARAHAN
.W
Document clustering has several unresolved problems. Among them are high
time and space complexity, difficulty of determining similarity thresholds,
order dependence, nonuniform document distribution in clusters, and
arbitrariness in determination of various cluster initiators. To overcome
these problems to some degree, the cover coefficient based clustering
methodology has been introduced. The concepts used in this methodology
have created certain new concepts, relationships, and measures such as the
effect of indexing on clustering. These new concepts are discussed. The
result of performance experiments that show the effectiveness of the
clustering methodology and the matching function are also included. In
these experiments, it has been also observed that the majority of the
documents obtained in a search are concentrated in a few clusters
containing a low percentage of documents of the database.
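[One common formulation of the cover coefficient can be sketched as
follows: it estimates how much document i is "covered" by document j
through their shared terms. The full clustering methodology built on it
is omitted. -- Ed]

```python
# Sketch: cover coefficient matrix C from a binary document-term
# matrix D. C[i][j] weights each shared term by how selective it is
# (terms appearing in few documents count more); each row of C sums
# to 1, and C[i][i] measures how distinct document i is.

def cover_coefficients(D):
    """D: binary document-term matrix (list of rows). Returns C."""
    n, m = len(D), len(D[0])
    row = [sum(D[i]) for i in range(n)]                # terms per doc
    col = [sum(D[i][k] for i in range(n)) for k in range(m)]
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = sum(D[i][k] * D[j][k] / col[k]
                    for k in range(m) if col[k])
            C[i][j] = s / row[i] if row[i] else 0.0
    return C

D = [[1, 1, 0],
     [1, 0, 1],
     [0, 0, 1]]
C = cover_coefficients(D)
print([round(x, 2) for x in C[0]])   # row sums to 1
```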
.I 24
.T
The LIVE Project - Retrieval Experiments Based on Evaluation Viewpoints
.A
P. BOLLMANN
F. JOCHUM
U. REINER
V. WEISSMANN
H. ZUSE
.W
The LIVE project (Leistungsbewertung von Information Retrieval VErfahren)
at the Technische Universitaet Berlin, West Germany, is concerned with the
evaluation of information retrieval systems. Two fields are mainly under
investigation. The first is the methodological foundations of retrieval
experiments, where results of the LIVE project fall into three areas:
measurement-theoretic criteria for the application of similarity and
evaluation measures in retrieval experiments have been considered and
developed; statistical principles of experimental design have been applied
to information retrieval; and control structures of factors in retrieval
tests, together with some aspects of statistical models for experimentation
in information retrieval, have been investigated. The second topic of the
LIVE project is the conduct of a retrieval experiment in cooperation with
FIZ4, an information service center for mathematics, physics, and energy
in Karlsruhe, West Germany. Among the many databases which FIZ4 offers,
the LIVE project used the physics database in its retrieval experiment.
.I 25
.T
On Retrieval Tests with an Inhomogeneous Query Collection
.A
FRIEDBERT JOCHUM
.W
Retrieval tests are often performed with an inhomogeneous query collection,
that is, one in which certain properties of the queries vary. This paper
investigates the conditions under which this inhomogeneity may lead to a
loss of validity in retrieval tests and should be controlled by an
experimental design. In particular, the interrelationship between the
experimental design of a retrieval test, the inhomogeneity of the query
collection used, and the measurement-theoretic properties of the evaluation
measure used will be discussed.
.I 26
.T
The Impact of VLSI Generalized Associative Memory Devices on the Intelligence
and Architecture of Computer Systems
.A
D. R. MCGREGOR
J. R. MALONE
M. HENNING
.W
An intelligent system must function at different time-scales:
- fast associative recall of previously encountered information
- more complete, but slower search for hypotheses, checks for
consistency and so on. We must not expect automatic systems
to do these in seconds.
This paper describes a novel database system which uses a new type of
associative memory (now being implemented in VLSI) to improve its speed
of response.
.I 27
.T
A Testbed for Information Retrieval Research:
The Utah Retrieval System Architecture
.A
LEE A. HOLLAAR
.W
The Utah Retrieval System Architecture provides an excellent testbed for
the development and testing of new algorithms or techniques for information
retrieval. URSA is a message-based structure capable of running on a
variety of system configurations, ranging from a single mainframe processor
to a system distributed across a number of dissimilar processors. It can
readily support a variety of specialized backend processors, such as
high-speed printers.
The architecture divides the components of a text retrieval system into
two classes: servers and clients. A triple of servers (index, search, and
document access) for each database provides the capabilities normally
associated with a retrieval system. Possible clients for these servers
include a window-based user interface, whose query language can be easily
modified, a connection to a mainframe host processor, or AI-based query
modification programs that wish to use the database.
Any module in the system can be replaced by a new module using a different
algorithm as long as the new module complies with the message formats for
that function. In fact, with some care this module switch can occur while
the system is running, without affecting the users. A monitor program
collects statistics on all system messages, giving information regarding
query complexity, processing time for each module, queuing times, and
bandwidths between every module.
This paper discusses the background of URSA and its structure, with
particular emphasis on the features that make it a good testbed for
information retrieval techniques.
.I 28
.T
An Integrated Hierarchical File Organization For Data Selection and Retrieval
.A
SAU-LAN LEE
.W
This paper presents an approach, the IHF approach, to organizing data with
a hierarchical structure into one single file while maintaining adequate
representation of the relationships among the data. The IHF approach
requires each data item to be stored in the basic form of a name-value
pair, and consists of assigning each data item an identifier which provides
not only the data item's identification but also an implicit identification
of the hierarchical structural relationships among groups of data items.
Data selection and retrieval operations are effectively implemented with
the concept of masking of relevant identifier bits, without resorting to
any linking devices. Application of the IHF technique to CAFS is examined.
.I 29
.T
RUBRIC: An Environment for Full Text Information Retrieval
.A
RICHARD M. TONG
VICTOR N. ASKMAN
JAMES F. CUNNINGHAM
CARL J. TOLLANDER
.W
This paper describes an ongoing research project that is concerned with
developing a computer-based aid for retrieval from unformatted full text
databases. Unlike other attempts to improve upon Boolean keyword retrieval
systems, our research concentrates on providing an easily used rule-based
language for expressing retrieval concepts. This language draws upon work
both in artificial intelligence and the theories of evidential reasoning to
allow the user to construct queries that have better precision and recall
than more traditional forms. The paper includes a discussion of our formal
model of retrieval, the rule language, and some comparative performance
results.
.I 30
.T
ANNOD - A Navigator of Natural-language Organized (Textual) Data
.A
ROBERT E. WILLIAMSON
.W
ANNOD is the name of a system developed at the National Library of Medicine
(NLM), which implements a set of linguistic and empirical techniques that
permit retrieval of natural language information in response to natural
language queries. The system is based on Dr. Gerard Salton's SMART document
retrieval system and is presently implemented on a mini-computer as part of
an Interactive Text Management System, ITEMS. Actual experience with
retrieval of information from NLM's Hepatitis Knowledge Base (HKB), an
encyclopedic, hierarchical, full-text file, is presented. The techniques
used in ANNOD include: automatic stemming of words, common word deletion,
thesaurus expansion, a complex empirical matching (ranking) algorithm
(similarity measure), and techniques expressly designed to permit rapid
response in a mini-computer environment. Preliminary testing demonstrates
high efficiency in identifying portions of a text which are relevant to users.
.I 31
.T
The User's Mental Model of an Information Retrieval System
.A
CHRISTINE L. BORGMAN
.W
An empirical study was performed to train naive subjects in the use of a
prototype Boolean logic-based information retrieval system on a bibliographic
database. Subjects were undergraduates with little or no prior computing
experience. Subjects trained with a conceptual model of the system performed
better than subjects trained with procedural instructions, but only on complex,
problem-solving tasks. Performance was equal in simple tasks. Differences in
patterns of interaction with the system (based on a stochastic process model)
showed parallel results. Most subjects were able to articulate some
description of the system operation, but few articulated a model similar to
the card catalog analogy provided in training. Eleven of 43 subjects were
unable to achieve minimal competency in system use. The failure rate was
equal between training conditions and genders: the only differences found
between those passing and failing the benchmark test were in academic major and
frequency of library use.
.I 32
.T
A Study of the Relationship Between User Profiles and User Queries
.A
MICHAEL A. SHEPHERD
.W
This research investigates the relationship between a user profile and a
user query by examining the degree of similarity of a user profile and a
query and the degree of similarity between the sets of documents retrieved
by the profile and the query, respectively. This relationship will form
the basis for the integration of a profile and a query into a single
enhanced query to retrieve a set of documents that reflects the similarity
between the sets of documents retrieved by the profile and by the query.
Such an integration will permit the development of an expert search
intermediary in which the user profile acts as a knowledge base in providing
a context for the interpretation of the query to the database.
.I 33
.T
A Conceptual Model and Experiments on How People
Classify and Retrieve Documents
.A
ALBERTO J. CANAS
FRANK SAFAYENI
DAVE W. CONRATH
.W
A better understanding of the way in which people classify
and search for information may be instrumental for designing more effective
automated systems. Cognitive psychologists have dealt with the problem of
categorization for many years (e.g. Rosch, 1978) but not within the context
of storage and retrieval. Their work has been based on the categorization
of simple stimuli (e.g. geometric shapes) and has not addressed the problem
of retrieval after categorization.
The purpose of this paper is twofold:
1) To present a conceptual model of the cognitive processes in the
storage and retrieval of documents.
2) To describe ongoing experiments and their potential implications
in the design of automated systems.
------------------------------
END OF IRList Digest
********************