IRList Digest Volume 2 Number 60
IRList Digest Wednesday, 26 November 1986 Volume 2 : Issue 60
Today's Topics:
Email - Problems with the size of our distribution
- **URGENT** Request to members to set up re-distribution lists
Query - Address of Ellen Voorhees?
Abstracts - Reference, and Query on Bit String Use
- One Abstract on Bit Strings, Others on Associative Networks
- Bibliography on Bit String Use
News addresses are ARPANET: fox%vt@csnet-relay.arpa BITNET: foxea@vtvax3.bitnet
CSNET: fox@vt UUCPNET: seismo!vtisr1!irlistrq
----------------------------------------------------------------------
Date: Fri, 21 Nov 86 13:25:51 EST
From: seismo!rick (Rick Adams)
Subject: Re: ... my distribution list?
...
While we're on the subject of the distribution list, it is getting too
big to handle. We can not send to multiple recipients at the same host.
It is very expensive to do.
[Note: in case you did not know, Rick Adams at seismo has been letting
me send 1 message to their computer and then the distribution list is
expanded there to go out over various networks. So, we had best
follow his requests! - Ed]
At an absolute minimum, you need to remove the 50 or so bitnet addresses
and get them onto a mail forwarder on a bitnet site. The people
at BITNIC can help you and are encouraging this behavior. It is
making a big impact on the arpa/bitnet gateway.
I would appreciate it if you can get local forwarding on many of the
other sites as well. The general rule is that if there are more than
2 recipients on a host (bitnet counts as a host because it is a relay
machine) there should be a local forwarder. There should also
be one on csnet-relay, but they aren't very cooperative about those things
usually.
---rick
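One way to apply the "more than 2 recipients on a host" rule to a
distribution list is sketched below. This is only an illustration: it
assumes plain user@host addresses (bang paths and %-routed addresses are
not handled), and the function name and sample addresses are hypothetical.

```python
from collections import defaultdict


def hosts_needing_forwarder(addresses, limit=2):
    """Return {host: [addresses]} for hosts with more than `limit` recipients."""
    by_host = defaultdict(list)
    for addr in addresses:
        # Assumption: the host is whatever follows the last "@".
        host = addr.rsplit("@", 1)[-1].lower()
        by_host[host].append(addr)
    return {host: addrs for host, addrs in by_host.items() if len(addrs) > limit}


# Hypothetical recipients, not taken from the real distribution list.
recipients = [
    "reader1@host-a.example",
    "reader2@host-a.example",
    "reader3@HOST-A.example",
    "reader4@host-b.example",
]

for host, addrs in hosts_needing_forwarder(recipients).items():
    print(host, "should have a local forwarder for", len(addrs), "recipients")
```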
[Note: I have taken the following steps as a result of this message:
1) Bitnet recipients will receive mail directly from me from
foxea@vtvax3.bitnet. This will load our network for a while but
should give Bitnet recipients very fast service.
2) Remaining recipients will still be reached from seismo. However,
as you can see in the next message, I am encouraging your setting
up of re-distribution points.
Thanks for your cooperation! - Ed]
------------------------------
Date: Sat Nov 22 18:13 EST 1986
From: fox
Subject: Request for setting up distribution lists!
I am asking for cooperation of all IRList recipients to help with
setting up of distribution lists. Your computer system administrator
can set up an alias, such as
IRList-dist@site-address
so that I can just send to that and the 1 copy to your computer can
then be replicated to all local readers who are interested. There are
a number of aspects to the changeover.
First, there are a number of lists already, and some of you could be
added to one rather than receive a direct copy:
bboard.IRList@r20.utexas.edu
cmu-IRList@cmu-cs-pt.arpa
dist-irlist@louie.udel.edu
incoming-IRList@SU-CSLI.arpa
incoming-IRList@sumex-aim.arpa
ir-list@lsu.csnet
ir@bellcore.arpa
irlist-bboard@red.rutgers.edu
irlist-disty@MIT-Multics.arpa
irlist-inbox@mcc.arpa
irlist-incoming@TI-CSL.csnet
irlist-local@scrc-stony-brook.arpa
irlist-p@brl.arpa
irlist@nlm-vax.arpa
irlist@umass-cs.csnet
irlist@usc-cse.csnet
munnari!IR-List
palladian-irlist@live-oak.lcs.mit.edu
ucl-irlist@ucl-cs.arpa
Second, there are several MIT addresses (which could, perhaps, be tied in
with above MIT-Multics or other distributions):
media-lab.mit.edu
mit-hermes.arpa
mit-mc.arpa
OZ.AI.MIT.EDU@XX.LCS.MIT.EDU
xx.lcs.mit.edu
Each of these sites, at least, should have a re-distribution address.
Similarly there are several at CMU:
cad.cs.cmu.edu
CMU-CS-G.arpa
CMU-CS-K.arpa
sei.cmu.edu
Also, there are several lll sites:
lll-mfe.arpa
lll-tis-a.arpa
lll-tis-b.arpa
lll-tis.arpa
sav@LLL-MFE.arpa
sds.mfenet@lll-mfe.arpa
Further, there are several sites at Digital:
bartok.dec@decwrl.dec.com
closet.DEC@decwrl.dec.com
gvaic2.dec@decwrl.dec.com
newton.dec@decwrl.dec.com
sprite.DEC@decwrl.dec.com
whoaru.DEC@decwrl.dec.com
Finally, there are many sites with >1 recipient, where each person
should contact their computer system administrator, ask for a
distribution list to be set up, and ask to be added to that. Then, the
system administrator can tell me the new address and what old
addresses are handled by it:
allegra
apple.csnet
cornell.arpa
cs.dal.cdn@ubc.csnet
cs.ucl.ac.uk
gmr.csnet
gmu90x
hans@oslo-vax.arpa
harvard.arpa
hplabs.arpa
ibm-sj.arpa
ihnp4!hoqam
mitre-bedford.arpa
mitre.arpa
njit-eies.MAILNET
northeastern.csnet
nyu-csd2.arpa
nyu.arpa
smu.csnet
sri-ai.arpa
sri-nic.arpa
uchicago.csnet
unl.csnet
usc-isi.arpa
utah-20.arpa
Thanks for your cooperation! - Ed
------------------------------
Date: Wed, 19 Nov 86 00:41:09 est
From: kraft@LSU.CSNET
Subject: where is ellen voorhees
Ed, do you have a forwarding address for Ellen Voorhees? She seems to have
left Cornell, and no one there seems to have heard where she is. Thanks, Don
[Note: I have heard that she is working at a company in the Princeton
NJ area but would welcome further details myself too. - Ed]
------------------------------
Date: Tue, 28 Oct 86 15:14:51 -0100
From: Wyle <seismo!mcvax!ifi.ethz.chunet!wyle>
Subject: New reference and question on bit strings
Ed:
Here is an entry to the bibliography. I don't think it has
been published yet. Entries on bitstrings will follow.
I am looking for literature references related to bit strings and
signature records used in text indexing. Does anyone in IR digest
list know of a good place to start looking? Who are the key players
in "signature records" used in IR?
[Note: there was an article that surveyed related matters, that might
help fill out your bibliography further:
Faloutsos, Christos. Access Methods for Text. ACM Computing
Surveys, 17(1), March 1985.
Thanks for the references! - Ed]
%A M Domenig
%A P Shann
%T Towards a dedicated database system for dictionaries
%B Proceedings of the 11th International conference on Computational
Linguistics, August 25-29 1986
%C Bonn
%I IKP Universitaet Bonn
------------------------------
Date: Wed, 29 Oct 86 09:04:11 -0100
From: Wyle <seismo!mcvax!ifi.ethz.chunet!wyle>
Subject: Yet more new references
Here is a bit string entry and some associative networks
references of particular interest (using connectionism
to index text):
%A D R McGregor
%A J R Malone
%T The Fact Database System - a system using generic associative
networks
%J Research and Development in Information Technology
%V 1
%P 55-72
%D 1982
%A D R McGregor
%A J R Malone
%T The Fact System - a hardware-oriented approach
%B Database management systems: a technical comparison
%E P J King
%S Computer state of the art reports
%C Maidenhead
%I Pergamon Infotech
%D 1983
%P 99-112
%A S E Fahlman
%T A system for representing and using real-world knowledge
%C Cambridge Massachusetts
%I MIT Press
%D 1979
%A J R Quinlan
%T Induction over large databases
%R HPP 79-14
%S Heuristic Programming Project
%C Stanford California
%I Stanford University Press
%D 1979
%A R S Michalski
%T A theory and methodology of inductive learning
%J Artificial Intelligence
%V 19
%D 1982
%P 189-249
%A D R McGregor
%A J R Malone
%T Generic associative hardware, its impact on database systems
%B Proceedings of an IEEE Colloquium on Associative Methods and
Database engines
%D May 1982
%A M L Minsky
%A S Papert
%T Perceptrons: threshold function geometry
%C Cambridge Massachusetts
%I MIT Press
%D 1986
%A S A Feldman
%A D Ballard
%T Computing with connections
%R TR72 14727
%C Rochester, New York
%I Rochester Institute of Technology, Computer
Science Department
%D 1981
%A K C Mohan
%A P Willett
%T Nearest neighbor searching in serial files using text
signatures
%J Information Science and Technology (Netherlands)
%V 11
%N 1
%P 31-39
%D 1985
%X A nearest neighbor search procedure is described for use
with serial files of textual data. The procedure involves the
grouping of records into blocks, each of which is characterized
by a fixed length bit string. A comparable query bit string may
be matched against each of these bit strings, and an upper bound
calculation used to identify those blocks which need to be
inspected in detail if the document that is most similar to the
query is to be identified. Experiments with three small
collections of documents and queries are used to test the
efficiency of the approach.
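The block-level upper bound described in the abstract above can be
sketched roughly as follows. This is not the authors' procedure: it
assumes documents are sets of terms, similarity is the number of shared
terms, each term sets one hashed bit, and a block's bit string is the OR
of its documents' bits. All names and the 64-bit width are hypothetical.

```python
import zlib

WIDTH = 64  # bit string length; an arbitrary choice for this sketch


def term_bit(term):
    """Map a term to a single bit position (deterministic hash)."""
    return 1 << (zlib.crc32(term.encode("utf-8")) % WIDTH)


def block_signature(block):
    """OR together the bits of every term of every document in the block."""
    sig = 0
    for doc in block:
        for term in doc:
            sig |= term_bit(term)
    return sig


def upper_bound(query, sig):
    """No document in the block can share more query terms than this."""
    return sum(1 for term in query if sig & term_bit(term))


def nearest_neighbour(query, blocks):
    """Serial scan that skips blocks whose bound cannot beat the best so far."""
    signatures = [block_signature(b) for b in blocks]  # would be stored with the file
    best_doc, best_score, inspected = None, -1, 0
    for block, sig in zip(blocks, signatures):
        if upper_bound(query, sig) <= best_score:
            continue  # block cannot contain a better match
        inspected += 1
        for doc in block:
            score = len(query & doc)
            if score > best_score:
                best_doc, best_score = doc, score
    return best_doc, best_score, inspected


blocks = [
    [frozenset({"bit", "string", "search"}), frozenset({"serial", "file"})],
    [frozenset({"associative", "network"}), frozenset({"database", "engine"})],
]
print(nearest_neighbour(frozenset({"bit", "string", "file"}), blocks))
```

The bound is safe because any query term present in a document of the
block necessarily has its bit set in the block's bit string; hash
collisions can only make the bound looser, never too small.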
------------------------------
Date: Tue, 4 Nov 86 11:28:42 -0100
From: Wyle <seismo!mcvax!ifi.ethz.chunet!wyle>
Subject: Bit string bibliography references
As promised, here are bibliographic citations on bit string entries.
Our librarian can now send the references electronically (and
error-free), so there should henceforth be fewer errors.
...
%A A F Harding
%A M F Lynch
%A P Willett
%O Author's current address: Dept. of Information Studies,
Univ. of Sheffield, Sheffield, England.
%T Document retrieval using a serial bit string search
%J Inf-Process-Manage (GB)
%V 19
%N 1
%P 1-8
%D 1983
%K information-retrieval-systems
%K file-organisation
%K serial-bit-string-search
%K best-match-retrieval-system
%K serial-file-organisation
%X An experimental best match retrieval system is described based on the
serial file organisation. Documents and queries are characterised by
fixed length bit strings and the time-consuming character-by-character
term match is preceded by a bit string search to eliminate
large numbers of documents which cannot possibly satisfy the query.
Two methods, one fully automatic and one partially manual in
character, are described for the generation of such bit string
characterisations. Retrieval experiments with a large document test
collection show that the two-level search can increase substantially
the efficiency of serial searching while maintaining retrieval
effectiveness, and that a single-level search based only upon the bit
strings results in only a small decrease in effectiveness in some
cases.
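A minimal sketch of the two-level idea in the abstract above: a cheap bit
string comparison screens out documents before the slower term match. It
is simplified to a conjunctive query (every query term must occur) rather
than the best match ranking the paper describes, and the hashing scheme,
32-bit width, and function names are assumptions made for illustration.

```python
import zlib

WIDTH = 32  # bit string length; an arbitrary choice for this sketch


def signature(terms):
    """Fixed-length bit string: one hashed bit per term, ORed together."""
    sig = 0
    for term in terms:
        sig |= 1 << (zlib.crc32(term.encode("utf-8")) % WIDTH)
    return sig


def two_level_search(query_terms, documents):
    """Return (matching documents, number that passed the bit string screen)."""
    qsig = signature(query_terms)
    # In a real serial file these bit strings would be stored, not recomputed.
    doc_sigs = [signature(text.lower().split()) for text in documents]
    hits, candidates = [], 0
    for text, dsig in zip(documents, doc_sigs):
        # Level 1: a document missing any query bit cannot satisfy the query.
        if (dsig & qsig) != qsig:
            continue
        candidates += 1
        # Level 2: confirm with the actual term match (removes false drops).
        words = set(text.lower().split())
        if all(term in words for term in query_terms):
            hits.append(text)
    return hits, candidates


docs = [
    "document retrieval using a serial bit string search",
    "automatic merging of monographic data bases",
    "efficient bit string handling",
]
print(two_level_search(["bit", "string", "search"], docs))
```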
%A K D MacLaury
%O Author's current address: Res. Libraries Group Inc., Stanford, CA, USA.
%T Automatic merging of monographic data bases-use of fixed-length keys
derived from title strings.
%J J-Libr-Autom (USA)
%V 12
%N 2
%P 143-155
%D June 1979
%K library-mechanisation
%K monographic-data-bases
%K title-strings
%K bibliographic-files
%K optimized-character-position-key
%K Harrison-bit-string-key
%K fixed-length-keys
%K automatic-merging
%X To find duplicate records in machine-readable bibliographic files,
two different fixed-length keys were developed for finding matching
titles. Each had different characteristics and functions. An
optimized character position key was developed for comparing all
titles in the files and a Harrison bit string key, tolerant of
typographic errors and other small differences, was used for
comparing titles within small groups of records that were potential
matches.
%A E Mumprecht
%O Author's current address: IBM Corp., Armonk, NY, USA
%T Efficient bit string handling with standard processing units
%J IBM-Tech-Disclosure-Bull (USA)
%V 26
%N 10A
%P 4912-4914
%D March 1984
%K data-handling
%K semiconductor-storage
%K storage-management-and-garbage-collection
%K high-resolution-graphics
%K image-handling
%K data-processing
%K bit-string-handling
%K standard-processing-units
%K storage-reference-instructions
%K microprocessors.
%X The method described enhances the power of ordinary storage reference
instructions in standard processing units, e.g., microprocessors.
%A K Ramamohanarao
%A J W Lloyd
%A J A Thom
%O Author's current address: Dept of Computer Sci, Univ of
Melbourne, Parkville, Victoria,
Australia
%T Partial-match retrieval using hashing and descriptors
%J ACM-Trans-Database-Syst (USA)
%V 8
%N 4
%P 552-576
%D December 1983
%K database-management-systems
%K information-retrieval-systems
%K hashing
%K descriptors
%K partial-match-retrieval-scheme
%K addresses
%K mathematical-model
%X This paper studies a partial-match retrieval scheme based on hash
functions and descriptors. The emphasis is placed on showing how the
use of a descriptor file can improve the performance of the scheme.
Records in the file are given addresses according to hash functions
for each field in the record. Furthermore, each page of the file has
associated with it a descriptor, which is a fixed-length bit string,
determined by the records actually present in the page. Before a page
is accessed to see if it contains records in the answer to a query,
the descriptor for the page is checked. This check may show that no
relevant records are on the page and, hence, that the page does not
have to be accessed. The method is shown to have a very substantial
performance advantage over pure hashing schemes, when some fields in
the records have large key spaces. A mathematical model of the
scheme, plus an algorithm for optimizing performance, is given.
%A R Sacks-Davis
%A K Ramamohanarao
%O Author's current address: Dept of Computing, Royal Melbourne
Inst of Technol, Melbourne,
Victoria, Australia.
%T Partial-match retrieval based on superimposed coding
%B Proceedings of the 6th Australian Computer Science Conference,
Sydney, NSW, Australia, 10-12 Feb. 1983.
%J Aust. Comput. Sci. Commun. (Australia)
%V 5
%N 1
%P 166-176
%D February 1983
%K information-retrieval
%K record-retrieval
%K partial-match-retrieval
%K data-files
%K superimposed-coding
%K descriptor-file
%X This paper describes a method for partial-match retrieval on very
large data files. The method is based on superimposed coding
techniques. Associated with the data file is a descriptor file
containing bit strings which describe the records. In order to
retrieve records efficiently a two level descriptor file is proposed.
An analysis of this scheme is presented.
%A R Sacks-Davis
%A K Ramamohanarao
%O Author's current address: Dept of Computing, Royal Melbourne
Inst. of Technol, Melbourne,
Victoria, Australia
%T A two level superimposed coding scheme for partial match retrieval
%J Inf-Syst (GB)
%V 8
%N 4
%P 273-280
%D 1983
%K database-management-systems
%K DBMS
%K information-retrieval
%K two-level-superimposed-coding-scheme
%K partial-match-retrieval
%K very-large-data-files
%K descriptor-file, bit-strings
%K descriptor-file
%X The authors describe a method for partial-match retrieval on very
large data files. The method is based on superimposed coding
techniques. Associated with the data file is a descriptor file
containing bit strings which describe the records. In order to
retrieve records efficiently a two level descriptor file is proposed.
An analysis of this scheme is presented.
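The descriptor-file approach in the last few abstracts can be illustrated
with a small sketch of superimposed coding and a two-level descriptor
check. It is a simplification under stated assumptions, not the scheme
analysed in the papers: each (field, value) pair sets three hashed bits,
a record descriptor is the OR of its fields' bits, and a block descriptor
is simply the OR of its record descriptors. All names are hypothetical.

```python
import hashlib

WIDTH = 64          # descriptor length in bits (arbitrary for this sketch)
BITS_PER_VALUE = 3  # bits set per field value (arbitrary for this sketch)


def codeword(field, value):
    """Bits set by one (field index, value) pair."""
    word = 0
    for i in range(BITS_PER_VALUE):
        digest = hashlib.sha1(f"{i}:{field}:{value}".encode("utf-8")).digest()
        word |= 1 << (int.from_bytes(digest[:4], "big") % WIDTH)
    return word


def descriptor(record):
    """Superimpose (OR) the codewords of all specified fields."""
    desc = 0
    for field, value in enumerate(record):
        if value is not None:  # None marks an unspecified query field
            desc |= codeword(field, value)
    return desc


def build_descriptor_file(blocks):
    """Return [(block descriptor, [record descriptors])] for each block."""
    result = []
    for block in blocks:
        record_descs = [descriptor(r) for r in block]
        block_desc = 0
        for d in record_descs:
            block_desc |= d
        result.append((block_desc, record_descs))
    return result


def partial_match(query, blocks, descriptor_file):
    """Records agreeing with every specified field of `query`."""
    qdesc = descriptor(query)
    answers = []
    for block, (block_desc, record_descs) in zip(blocks, descriptor_file):
        if (block_desc & qdesc) != qdesc:
            continue  # level 1: the whole block is ruled out
        for record, rdesc in zip(block, record_descs):
            if (rdesc & qdesc) != qdesc:
                continue  # level 2: this record is ruled out
            # The descriptors can give false drops, so check the data itself.
            if all(q is None or q == v for q, v in zip(query, record)):
                answers.append(record)
    return answers


blocks = [
    [("Willett", "Sheffield", 1983), ("Lynch", "Sheffield", 1983)],
    [("MacLaury", "Stanford", 1979), ("Lloyd", "Melbourne", 1983)],
]
descriptor_file = build_descriptor_file(blocks)
print(partial_match((None, "Sheffield", 1983), blocks, descriptor_file))
```

A record that truly matches always passes both descriptor checks, since
its descriptor contains every bit of the query descriptor; the final
comparison against the record only filters out false drops.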
------------------------------
END OF IRList Digest
********************