IRList Digest Volume 2 Number 19
IRList Digest Tuesday, 8 Apr 1986 Volume 2 : Issue 19
Today's Topics:
Article - Research proposal:Software development for information structures
CSLI - Categories of Correspondence
----------------------------------------------------------------------
Date: Fri, 4 Apr 86 01:09:34 est
From: vtcs1::in% (Peter_Smit%ub-mts%umich-mts.mailnet@mit-multics.ARPA)
Subject: seventh possible PhD research project
[Note: This is a long message, but Peter would like feedback. I have
left this in a narrow column (though in future it is better if
submissions are done more normally), which I hope won't hurt
readability too much. The copy I edited had the 1st letter of most
lines omitted, so in case I erred in editing, you can try to figure
out the correct form yourself, since lines are as before. - Ed]
University of Michigan
Peter Smit 8603 2803
Urban, Technological and Environm. Planning
TITLE: Software development strategies for
new information structures.
BACKGROUND: Providing retrieval services of
existing literature sources for the public at
large, as is done at present by some on-line
systems and some laser disk services, is not enough
anymore. There are many such services already.
Moreover, retrieval by category or by a
combination of keywords is not good enough because,
in response to the general queries that non-
specialists are most likely to ask, it responds
with impractically large numbers of references.
While all of these may include that general
concept, usually only few provide good introductory
readings for a novice. Finally, the readings to
which retrieval services typically refer contain
parts that are redundant, side tracks irrelevant to
the user's purposes and specialized unexplained
terminology. People in a practical frame of mind
would like to know what all these articles and
books really have to say about their particular
problem, but they don't have the time to read even
a selection.
Rather, there may be some demand for a service that
provides short summaries of general as well as
specific topics, with menus of choices after each
summary to get to related or more specific summary
items. The advantage of this approach is, first,
that in practice, the use of menus is easier than
having to learn command codes or retrieval
languages which is often necessary in keyword
retrieval systems. Also, the specially re-written
summaries can be made short enough to fit on a
single screen display. Furthermore, general
overview items can be retrieved in response to
general queries, and more specific items in
response to queries containing more specialized
searching terms. However many times different
source texts repeat each other in mentioning some
basic facts, the users will have to read those
general introductory items only once because one
summary item will cover all those instances.
Moreover, items can be made to include menus or
references to other items that deal with similar
topics, provide evidence, mention exceptions, or
are otherwise related. Finally, just like that list
or menu of related items, a list can be provided
of all the books, articles and other source texts
that mention the topic described in the item. The
latter will help the reader to assess the strength
or truth value of the contents of the items.
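
A minimal sketch, in Python, of how such linked summary
items might be stored and shown. Everything named here is
an assumption made for illustration: SummaryItem, render,
follow, the 24-line screen limit and the sample entries do
not come from any existing system.

```python
from dataclasses import dataclass, field


@dataclass
class SummaryItem:
    """One screen-sized summary with links to related items and sources."""
    title: str
    text: str
    related: dict = field(default_factory=dict)  # menu label -> item key
    sources: list = field(default_factory=list)  # texts that mention the topic


def render(item, max_lines=24):
    """Return the screen text: summary, numbered menu, then source list."""
    lines = [item.title, "", item.text, ""]
    for number, label in enumerate(item.related, start=1):
        lines.append(f"{number}. {label}")
    if item.sources:
        lines.append("")
        lines.append("Sources: " + "; ".join(item.sources))
    screen = "\n".join(lines)
    # Assumed limit: an item must fit on a single 24-line display.
    if len(screen.splitlines()) > max_lines:
        raise ValueError("item does not fit on a single screen")
    return screen


def follow(items, key, choice):
    """Return the key of the item reached by picking menu entry `choice`."""
    labels = list(items[key].related)
    return items[key].related[labels[choice - 1]]


# Hypothetical sample file: one overview item and two specific items.
items = {
    "water": SummaryItem(
        title="Water quality management (overview)",
        text="General introduction, written once for all sources.",
        related={"Evidence": "water-evidence",
                 "Exceptions": "water-exceptions"},
        sources=["Hypothetical source A", "Hypothetical source B"],
    ),
    "water-evidence": SummaryItem("Evidence", "More specific summary."),
    "water-exceptions": SummaryItem("Exceptions", "More specific summary."),
}
```

Calling render(items["water"]) gives the overview text with
a two-entry menu and the source list, and
follow(items, "water", 2) returns the key of the
"Exceptions" item. Keeping the source list on the item
itself is one possible choice; it could equally be held in
a separate table.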
The disadvantage of the proposed linked, general
and specific summary items is that, in order to
make these, all literature has to be picked apart,
its elements summarized and compared to similarly
dissected source material. This kind of analysis
is different from abstracting because, of the
desired summaries, some go much more into depth
than typical abstracts do, and the summaries may be
segmented in unusual ways to match the divisions of
what is mentioned in other source texts as well.
Not only is this a lot of work, but it also
involves judgements that cannot be left to a
computer: are particular paragraphs of two articles
so similar as to be redundant, does one paragraph
support the other or generalize from the other, are
they different but related or do they directly
oppose each other, etc. In fact, not even all
people will be able to analyze and summarize all
texts. In a field like environmental care, there
are specialties like nature preservation, traffic
planning, housing, historic preservation, economic
development, utilities, water quality management,
emergency preparedness, recreation, etc. Probably
only people from the right specialty, or better
yet, a panel of those specialists, will find their
content analyses and summaries accepted by others
in the field. Making new information items for a
service as proposed above, then, will be a very
labor intensive endeavour. Ways should be sought
in which computers may assist in getting that work
done.
APPROACH: While the computer cannot interpret
texts well enough to formulate the new sets of
summary items and link these with existing items,
it can help in setting priorities for new texts to
process and it can provide administrative support
to the specialists that do the analysis,
reformulation and overview building. Computer
programs to assist in retrieval from the envisioned
integrated sets of text items should be no problem
in principle because several retrieval files with
that structure are operational already (e.g.
Bernstein and Williamson's ANNOD, files on the
English PRESTEL or the Dutch VIDITEL, etc.).
However, software for entering, editing and
connecting new items into a file of this kind may
not exist because most of these files are small and
custom made, and are more likely to have details
within their items updated or altered than to expand.
Research questions are, then, (A) Is there software
that does part of the job, such as:
1. Maintain a classification of subscribers as to
their area of specialty and the kind of
feedback or reward that has made them willing
to process a new text in the past.
2. Perform a word frequency analysis on newly
incoming titles in order to determine the area
of specialty of which they seem to be a part.
3. Administer the citation (or the entire new
source text if there is room) to several
subscribers in the proper area of specialty and
ask, when they sign on, if they would be
willing to process it. Keep track of who is
beginning to respond and how fast they seem to
go. Put out this request to even more
participants if the process is taking too long.
4. While helpful subscribers are retrieving items
to see how they relate to what the new text is
saying, let them work at a reduced rate.
Provide a rebate on later system usage, lottery
tickets, names of famous people who are waiting
for this information to be added, or whatever
incentive is appropriate and indicated by 1.
5. Provide word processing support and encourage
the use of standard layouts and formats that
are used throughout the file. Let it be a
matter of a few keystrokes to add the new
citation to lists of source material that are
shown on the bottom of other items.
6. Prod the analyst-editors for general summaries
that introduce a particular term or argument as
used in the new text, for title lines to
identify items in the menus of related items,
for connections to other texts even where the
author of the source text did not indicate
these in the new text or its bibliography, for
lists of synonyms, criteria to distinguish
between look-alike terms or for anything else
that may help in retrieval.
7. Facilitate debate when different specialists
disagree, for instance by consensus facilitation as
in the Delphi process and in computer conferencing.
Show them each other's analyses, perhaps in the
form of item-maps as well as the items
themselves and ask them to pick the preferred
formulation or pinpoint the areas of
disagreement. Collect particular third
opinions or hold a poll if necessary.
8. Enter items about which there is minimal
agreement in a provisional way with indication
of the nature and extent of uncertainty or
disagreement still pending.
9. Enter the items properly once sufficient
agreement has been reached, replace the revised
ones and delete the provisional ones.
10. Keep soliciting user feedback. Group comments
by the item or relation to which they pertain.
Call the matter to the attention of a
specialist in the proper area if the proportion
of users that leaves a comment is high.
11. Analyze usage frequencies and browsing patterns
so as to identify areas where additional
information would be most welcome and to spot
loops etc. where users are getting lost.
Propose clarifications even where users did not
have the awareness to leave comments.
12. Analyze specialists' summarizing quality by
tabulating the number of adjustments that have
to be made later to their work, because some
specialists are better at doing their thing
than at talking or editing about it.
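
The word frequency analysis of newly incoming titles might
be sketched as follows, in Python. This is an illustration
under stated assumptions, not an existing program: the
vocabularies in SPECIALTY_TERMS are invented, as are the
names word_counts and guess_specialty, and a real service
would presumably derive its term lists from texts that
specialists have already processed.

```python
import re
from collections import Counter

# Hypothetical vocabularies, one set of indicator words per specialty.
SPECIALTY_TERMS = {
    "traffic planning": {"traffic", "road", "transit", "parking"},
    "water quality management": {"water", "river", "effluent", "quality"},
    "housing": {"housing", "rent", "dwelling", "tenant"},
}


def word_counts(title):
    """Count the lower-cased words in a title."""
    return Counter(re.findall(r"[a-z]+", title.lower()))


def guess_specialty(title, vocabularies=SPECIALTY_TERMS):
    """Return the specialty whose terms occur most often, or None."""
    counts = word_counts(title)
    scores = {
        name: sum(counts[term] for term in terms)
        for name, terms in vocabularies.items()
    }
    best = max(scores, key=scores.get)
    # A title that matches no indicator word is left unclassified.
    return best if scores[best] > 0 else None
```

A title that matches no vocabulary is left unclassified
(None) rather than forced into an area, so that it can be
routed by hand.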
Not all of these parts have the same priority, but
if programs for any of them exist, it would be good
to make programs for the other parts compatible and
to enable all of them to work from the same data
formats.
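
As one illustration of part 11, loops in browsing patterns
could be spotted from logged sessions roughly as below, in
Python. The session format (a list of item keys in the
order visited), the 50 percent threshold and the names
revisited_items and loop_hotspots are all assumptions made
for this sketch.

```python
from collections import Counter


def revisited_items(path):
    """Return the items a user came back to within one session."""
    seen, repeated = set(), []
    for key in path:
        if key in seen and key not in repeated:
            repeated.append(key)
        seen.add(key)
    return repeated


def loop_hotspots(sessions, min_share=0.5):
    """Return items revisited in at least `min_share` of the sessions."""
    if not sessions:
        return []
    tally = Counter()
    for path in sessions:
        tally.update(revisited_items(path))
    return sorted(
        key for key, n in tally.items() if n / len(sessions) >= min_share
    )
```

Items reported by loop_hotspots are only candidates for
clarification; whether a revisit means the user was lost is
still a judgement for a specialist.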
A further question, if different software suppliers
can offer some of these parts, is, of course, (B)
what would be a good development strategy for
getting the missing parts written, in terms of
cost, duration and flexibility? What effort of
writing database reformatting programs can be
justified to accommodate less compatible programs?
If you know of (A) any appropriate software, even
if only for part of the job, or of (B) any
strategies for complex package development, I
should like to hear from you. Leave a message
before the end of April for Peter Smit on UB,
using the attached electronic address.
Thank you
------------------------------
Date: Fri, 4 Apr 86 01:08:43 est
From: EMMA@su-csli.ARPA
To: friends@su-csli.arpa
Subject: Calendar, April 3, No. 10 [Extract - Ed]
C S L I C A L E N D A R O F P U B L I C E V E N T S
April 3, 1986 Stanford Vol. 1, No. 10
Categories of Correspondence
Brian C. Smith (Briansmith.pa@xerox)
[April 3]
Photographs, sentences, balsa airplane models, images on computer
screens, Turing machine quadruples, architectural blueprints,
set-theoretic models of meaning and content, maps, parse trees in
linguistics, and so on and so forth, are all representations---
complex, structured objects that somehow stand for or correspond to
some other object or situation (or, if you prefer, are `taken by an
interpreter' to stand for or correspond to that represented
situation). It is important, in trying to make sense of
representation more generally, to identify the ways in which the
structure or composition of a representation can be used to signify or
indicate what it represents.
Strikingly, received theoretical practice has no vocabulary for
such relations. On the contrary, standard approaches generally fall
into one of two camps: those (like model-theory, abstract data types,
and category theory) that identify two objects when they are roughly
isomorphic, and those (like formal semantics) that take the
``designation'' relation---presumably a specific kind of
representation---to be strictly non-transitive. The latter view is
manifested, for example, in the strict hierarchies of meta-languages,
the notion of a ``use/mention'' confusion, etc. Unfortunately, the
first of these approaches is too coarse-grained for our purposes,
ignoring many representational details important for computation and
comprehension, while the latter is untenably rigid---far too strict to
cope with representational practice. A photographic copy of a
photograph of a sailboat, for example, can sometimes serve perfectly
well as a photo of the sailboat. Similarly, it would be pedantic to
insist, on the grounds of use/mention hygiene, that the visual
representation `12' on a computer screen `must not be taken to
represent a number,' but must rather be viewed as representing a data
structure that in turn represents a number. And yet there are clearly
times when the latter reading is to be preferred. In practice,
representational relations, from the simplest to the most complex, can
sometimes be composed, sometimes not. How does this all work?
Our approach starts very simply, identifying the structural
relations that obtain between two domains when objects of one are used
to correspond to objects of the other. For example, we call a
representation `iconic' when its objects, properties, and relations
correspond, respectively, to objects, properties, and relations in the
represented domain. Similarly, a representation is said to `absorb'
anything that represents itself. Thus the grammar rule `EXP ->
OP(EXP1,EXP2)', for a formal language of arithmetic, absorbs
left-to-right adjacency; model-theoretic accounts of truth typically
absorb negation; etc. A representation is said to `reify' any
property or relation that it represents with an object. Thus
first-order logic reifies the predicates in the semantic domain, since
they are represented by (instances of) objects---i.e., predicate
letters---in the representation. A representation is called `polar'
when it represents a presence by an absence, or vice versa, as for
example when the presence of a room key at the hotel desk is taken to
signify the client's absence. By developing and extending a typology
of this sort, we aim to categorize representation relations of a wide
variety, and to understand their composition and their use in inference
and computation.
------------------------------
END OF IRList Digest
********************