Copy Link
Add to Bookmark
Report

IRList Digest Volume 1 Number 04

eZine's profile picture
Published in 
IRList Digest
 · 1 year ago

IRList Digest           Monday, 26 Aug 1985      Volume 1 : Issue 4 

Today's Topics:
EMAIL - digest formats
Query - ACM SIGIR Officers
Abstracts - Panel at SIGIR 85 Conf. in Montreal
Query - map databases (question and summary)
- response about World Databank II

----------------------------------------------------------------------

Subj: From: David.Anderson@CMU-CS-K
Date: 17 Aug 1985 11:45-EDT
Subject: Re: IRList V 1 No. 2

It would be nice if you could change the format of the IRList digests
to match the "standard" ARPAnet digest layout, so that those of us with
digest reading software could read the digests more effectively.
Basically the lines of -------- aren't quite right -- if you need
details, let me know and I'll look them up.

--david
[Note: Thanks to David and some others that pointed this out! I believe
the digests are in correct form now. That is, 70 dashes at top, and
30 dashes separating messages, with a blank line before. - Ed]

------------------------------

From: Aviezri Fraenkel <fraenkel@wisdom>
Date: Sun, 18 Aug 85 09:53:43 -0200

Ed,
Who are the new SIGIR Officers? Could you pls send me a list of names
with their functions? Thanks very much.

Best wishes, Aviezri S. Fraenkel, Professor.

Bitnet: fraenkel@wisdom.bitnet
From Arpanet: fraenkel%wisdom.bitnet@wiscvm.arpa
From Csnet: fraenkel%wisdom.bitnet@csnet-relay.arpa

[Note: here is the requested information. Remember, ACM SIGIR needs
new members! The new officers are energetic and have good plans. - Ed

Chairman:
Professor Clement Yu Secretary:
Department of Elect. Eng. John Tolle, Ph.D.
and Computer Science OCLC, Inc.
Univ. of Illinois at Chicago 6565 Frantz Road
Chicago, Illinois 60680 Dublin, Ohio 43017-0702
(312) 996-2318 (614) 764-6000

Vice Chairman: Treasurer:
Professor Lee A. Hollaar Donna Williamson
Department of Computer Science National Library of Medicine
University of Utah 8600 Rockville Pike
3160 Merrill Engineering Bldg. Bethesda, Maryland, 20894
Salt Lake City UT 84112 301-496-1936
801-581-3203 donna@nlm-vax.arpa
hollaar@utah-20.arpa

------------------------------

From: Aviezri Fraenkel <fraenkel@wisdom>
Date: Sun, 18 Aug 85 09:30:23 -0200

Ed,
At the recent SIGIR Conference in Montreal, a panel on Theoretical and
Algorithmic Aspects of Information Storage and Retrieval took place. I thought
that the abstracts of the panel might be of general interest, so I forward
them to you hereby:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Panel on

THEORETICAL AND ALGORITHMIC ASPECTS OF INFORMATION STORAGE AND RETRIEVAL

Chairman: Aviezri S. Fraenkel

The main aim of the panel is to portray a large number of examples
illustrating how theoretical and algorithmic approaches can lead to a better
understanding of problems or to improved solutions to problems in Information
Storage and Retrieval.
Abstracts of the addresses follow below. Each address will probably
be limited to 10 minutes -- the mean free attention span at the end of a
strenuous day! -- followed by a 5 minute question and answer period.



SELECTION OF CLUSTERS
Clement Yu
The University of Illinois at Chicago
Department of Electrical Engineering and Computer Science
Illinois 60680, U.S.A.

The problem of deciding the clusters to be searched is considered.
If terms are distributed independently, a simple algorithm is given for
the problem. The algorithm is then modified to take into consideration
dependencies between terms. It is shown that the algorithm has low time
complexity and yields accurate estimation.



ON THE PHYSICAL ORGANIZATION OF CLUSTERED FILES

Vijay V. Raghavan
University of Regina
Department of Computer Science
Regina S4S 0A2, Canada

The process of classification involves placing a set of objects into
affinity classes or clusters, in such a way that the objects within a class are
more similar to each other than they are to objects outside the class. In
information retrieval, the documents comprising the bibliographic data base
may be classified in order to identify clusters of documents that exhibit
similar characteristics. In addition, there may also be a need to cluster
index terms used to represent documents or to obtain a clustering of user
queries.
Since the objects within a cluster are to be retrieved jointly, it is
desirable that they be kept in close proximity in physical storage. When the
clusters are pairwise disjoint, a simple scheme called the cluster-inverted
organization provides the answer.
More generally, when clusters are allowed to overlap or when multi-
level clusterings are employed, specialized algorithms are needed to ensure
that physical proximity of objects of a cluster is achieved while keeping
the redundancy in stored objects as small as possible.
Some recent results in this area are identified. In particular, it
is pointed out that for a single level, overlapping clustering, identifying
minimum redundacy organization is NP-complete. This result justifies the
development of a heuristic algorithm that is efficient, although not always
optimal. The results for multi-level clusterings deal with the issue of
having one arrangemet of the objects that would simultaneously achieve
consecutive retrieval for clusters in every level of a clustering.




THE USE OF FUZZY SUBSET THEORY IN INFORMATION RETRIEVAL

Donald H. Kraft
Department of Computer Science
Louisiana State University
Baton Rouge, Louisiana 70803, U.S.A.


Substantial work has been done on the application of fuzzy subset theory
to information retrieval. Boolean query processing has been generalized to
allow for weights to be attached to individual index term and to terms in a
query. However, problems still remain in terms of preservation of the Boolean
lattice structure. Interesting applications using weights to generate
the linguistic structure of queries have been considered. Moreover, work is
ongoing to develop retrieval performance measures taking ranked output into
consideration.




SOME RECENT ADVANCES IN STRING-MANIPULATION ALGORITHMS

Alberto Apostolico
Computer Science Department
Purdue University
West Lafayette, Indiana 47907, U.S.A.


The talk will briefly review some recent progress in (serial as well as
parallel) Algorithmics on Strings which seem of relevance to common
tasks of Information Storage and Retrieval.
One such task is the well known, and widely investigated, pattern matching
problem, i.e., the problem of finding the first (or all) occurrences of
an assigned string (the pattern) in a longer string (the text). Some
practically relevant variations of the pattern matching problem
have also been enhanced to some extent, although the main questions
posed by them are still open: examples are the so-called pattern matching
with don't care and the approximate pattern matching ,i.e.,
pattern matching within an assigned edit distance. Other advances
revolve around index structures built around the notion of Suffix Tree.
Such structures have been found suitable to such tasks as
multiple pattern matching in time linear in the sum of the lengths of the
patterns, computing the complete substring statistics with or without
overlaps, supporting some data compression techniques, building
inverted files associated with a set of text files, etc.
Finally, recent progress in problems related to common subsequences
of strings will be pin-pointed.



ROBUST DATA COMPRESSION

Aviezri S. Fraenkel
The Weizmann Institute of Science
Department of Applied Mathematics
Rehovot 76100, Israel

For speeding up data retrieval or data encryption, or for reducing
communication costs, it is often desirable to compress data. Many of
the standard encodings used for compression are very error-sensitive
in the sense that a single error may render the entire tail of the
encoded string undecipherable. A new class of robust universal and
uniquely decipherable encodings, based on exotic numeration systems,
will be reviewed briefly.



MEASUREMENT-THEORY AND EVALUATION MEASURES IN INFORMATION RETRIEVAL

Peter Bollmann, Technische Universitaet
Berlin

If we get measurement-values by the application of an evaluation measure
we want to make statements using these values. Such statements include
the comparison of these values, of differences, of percentages or
of arithmetic means. These comparisons are not meaningful for all
kinds of scales. For example the comparison of two arithmetic means is
meaningful for an interval scale but not for an ordinal scale.
Measurement-theory yields criteria which assumptions scales have to
fulfil and hence whether a statement is meaningful or not. In order to
have a foundation for measurement in information retrieval the basic
concepts of measurement-theory have to be interpreted in terms of
information retrieval. Examples will be given.

With best wishes, Aviezri S. Fraenkel.


Bitnet: fraenkel@wisdom.bitnet
From Arpanet: fraenkel%wisdom.bitnet@wiscvm.arpa
From Csnet: fraenkel%wisdom.bitnet@csnet-relay.arpa

------------------------------

From: thrapp@noscvax.UUCP (Gary R. Thrapp)
Subject: summary, map generation on micros
Date: 31 Jul 85 17:33:57 GMT

[Forwarded from WORKS Digest V.5 No. 31. - Ed]
A couple of weeks ago I submitted the following request:

****************************************************************

I am interested in obtaining software to generate geographic maps on
Sun workstations and other micros of various word sizes. I expect
this will involve a map database and software to access it. The
preferred features are to accept a map center and radius (or
latitude-longitude boundaries) and generate a display with shore
outlines or filled landforms with a lat-lon grid. If the software
used high level generic graphics calls for lines and polygons it
could help for portability.

A while ago I checked into getting a database called World Databank
II, however I was told it took 9 reel-to-reel tapes to hold it. I am
interested in something with perhaps less resolution but that would
fit on micros with limited storage.

I plan to post a summary of responses I receive by mail. Thank you.

Gary R. Thrapp
Naval Ocean Systems Center
San Diego, CA

MILNET/ARPANET: thrapp@nosc
UUCP: {ihnp4,akgua,decvax,dcdwest,ucbvax}!sdcsvax!noscvax!thrapp

****************************************************************

Thank you all for the responses I received. There are some
possibilities here given some work. However my interest was in a
medium resolution map database of a couple of megabytes or so with
landforms and high level source code to display it. I was
considering something simple that could be easily ported to various
micros. If someone knows of such software please send me mail.

For those who expressed interest in the CIA generated map database
and software, here is the information I have found:

World Databank II, 5 volume set ( > 400 megabytes, perhaps
much more) $660.
CAM Fortran software to access and display
the map database $400.

National Technical Information Service
5285 Fort Royal Road
Springfield, VA 22161

Products 703-487-4600
Sales 703-487-4650

Here is an edited summary of the responses that I received, minus
references to others that I don't have permission to post:

****************************************************************

For ibm-type pc's, you might consider routines from a company
called Golden Software. They have a series of programs, fairly cheap
that take x,y,z coordinates and plot 3-D and topographic maps.
Various rotations, translations, and slices are possible.

****************************************************************

How about World Database I? It's about 5 megabytes.

****************************************************************

I think you can get whatever map information you want, in up to 1/2
kilometer (or better) resolution, from the National Center for
Cartieographic Studies or something. Check in a book called
"Information USA," it's listed there. They can provide it on 9-track
tape in various formats.

If too much resolution is a problem, you can write some programs to
take, for instance, a costal outline and do vector sums until you
reduce it to something reasonable.

****************************************************************

I am writing software to display maps (color or black &
white) on Sun workstations. The software is in C, utilizing the
SunCGI low-level graphics package. CGI is an ANSI Computer Graphics
Interface standard currently under development. Also I am using the
the multiple tools/windows, subwindows and icon facilities (Pixwin)
of the Sun workstations.
As for the map information itself, I'm planning on using the
US Geological Survey (USGS) digital cartographic standards. There
are two parts to the standards: Digital Line Graphs (DLG), such as
rivers and roads; Digital Elevation Models (DEM), ground elevations.
The DEM is available also in a modified format (a tighter gridding of
elevation points) from the Defense Mapping Agency (DMA), and also
from USGS. I'm currently working with the DLG attributes only. The
USGS has completely mapped the USA with a 1:250,000 map scale and is
slowly digitizing on a 1:24,000 scale. Eventually I believe the
digital standards will apply to overseas maps as well, the DMA is
already using a modified DEM standard.
I plan to allow zooming between 1:250,000 and 1:24,000 scale
maps, panning, toggling of features, etc. The software will be
mostly mouse driven with an eye toward easing the usage by
non-typists. Various dynamic features, such as military units, will
be represented by icons (standard military symbols) which can be
manipulated by a mouse.
In general most of the software will be portable to other
systems, the only exception will be the Pixwin features of the Sun
workstations. I am using the ANSI CGI and the USGS DLG/DEM standards
with regards to portability to, and compatabilty with, other machines
and/or software.
As for the storage of data on micros, that will be a problem.
I found that maps are very detailed, dense pieces of information. I
suggest that the map data base be eventually stored on optically
encoded disks (giga-bytes).

****************************************************************

This won't help with a Sun, but it might give you someting to
look into: Software Concepts, Stamford CT (203-357-0522) sells a
low-priced graphics program for the IBM-PC that is supposed to
provide a 3-D world map. It costs $69.95 and fits on one disk. The
features seem impressive for this package - turning the globe N, E, S
or W, zoom, etc. It also does some distance calculating and plotting.
All this info from an articel in PC Week of a few months ago.

****************************************************************

We have the NCAR graphics package up and running on our Suns. One of
the routines is called SUPMAP, and its purpose is to draw maps in
various projections centered at various points. It may have all of
the capabilities you need. My only experience with it is running the
test program provided by NCAR to see if it worked as expected (it
did). The map database is about 1Mbyte. I don't know how hard it
would be to port the NCAR package to other micros. The thing is
written in Fortran 66, but is very machine dependent, and we had to
hack hard to get it to work on the Suns and the Vax.

------------------------------

From: rick@seismo.UUCP (Rick Adams)
Subject: Re: summary, map generation on micros
Date: 14 Aug 85 22:41:57 GMT

[Forwarded from WORKS Digest V.5 No. 31. - Ed]
Actually the World Databank II is really only about 20 Mbytes. It
comes on 5 tapes and looks like 400 Mbytes of data because, as
delivered, it has everything in fixed length records of ascii
characters. Once on your target machine, by writing binary floating
point numbers instead of the characters, you can get it down to a
reasonable size. You can reduce it even smaller by throwing out the
rivers and state/province boundaries.

---rick

------------------------------

END OF IRList Digest
********************

← previous
next →
loading
sending ...
New to Neperos ? Sign Up for free
download Neperos App from Google Play
install Neperos as PWA

Let's discover also

Recent Articles

Recent Comments

Neperos cookies
This website uses cookies to store your preferences and improve the service. Cookies authorization will allow me and / or my partners to process personal data such as browsing behaviour.

By pressing OK you agree to the Terms of Service and acknowledge the Privacy Policy

By pressing REJECT you will be able to continue to use Neperos (like read articles or write comments) but some important cookies will not be set. This may affect certain features and functions of the platform.
OK
REJECT