IRList Digest Volume 1 Number 09
IRList Digest Tuesday, 17 Sep 1985 Volume 1 : Issue 9
Today's Topics:
EMAIL - Back issues, help needed
Query - Relationship between Videotex and IR
- Takers for offer of Tax Expertise Available for Expert System
Announcement - Seminar on Speech Recognition and NL Processing (BBN)
Article - The Inter-Network Database Server
----------------------------------------------------------------------
From: Rabjohns.Henr@XEROX
Date: 4 Sep 85 08:32:07 EDT (Wednesday)
...
The IRList digest sounds like a good one to me, would it be possible
to become a member of the dl and also could you send me any back issues of
the digest that you might have already generated.
Thanks in advance,
Douglas T. Rabjohns
[Note: Virginia Tech is a member of CSNET, connected through Phonenet.
We are polled twice daily for mail. We cannot FTP nor can other sites
directly access our computers over Internet. Would any ARPAnet sites
like to keep an archive of IRList messages so that others can FTP?
It would be much easier for those who missed issues to get them that
way. Also, at the present time, sending back issues repeatedly is
very expensive for my department. Thanks, Ed]
------------------------------
From: Tom Scott <scott@BGSU>
Date: Thu, 5 Sep 85 12:32:28 edt
Subject: The Relationship between Videotex and Information Retrieval
This month's "Hardcopy" (September 1985, p.18) has a half-page
article on the relationship between videotex and information retrieval
which may be of interest to the readers of IRList. The author, Pam
Jones, quotes Leslie Townsend of International Resource Development as
follows:
Videotex has been called a technology, when it's
really not so much a technology as it is an
information retrieval method .... The line between
what were software packages for information retrieval
and hardware for videotex systems are [sic]
essentially becoming negligible.
I'd like to hear more about this. Perhaps someone would
submit an article to IRList, detailing the relationship between
videotex and information retrieval. What exactly is videotex? What
are its theoretical basis, practical applications, and technology?
How do the theory, practice, and technology of videotex compare and
contrast to the theory, practice, and technology of information
retrieval?
Tom Scott CSNET: scott@bgsu
Dept. of Math. & Stat. ARPANET: scott%bgsu@csnet-relay
Bowling Green State Univ. UUCP: osu-eddie!bgsuvax!scott
Bowling Green OH 43403-0221 ATT: 419-372-2636
------------------------------
Date: Thu 29 Aug 85 13:26:29-CDT
From: Charles Petrie <CS.PETRIE@UTEXAS-20.ARPA>
Subject: Tax Expertise Available for Expert System
[Copied from AIList Digest V3 N 116 - 2 Sept]
Prof. Lewis Solomon is a specialist in tax law and is interested in
working with someone on an expert system in that domain. He would also
like to hear about existing systems. His US Mail address is:
George Washington University
National Law Center
Washington, D.C. 20052 Phone #(202)676-6753
------------------------------
Date: 20 Aug 1985 16:08-EDT
From: AHAAS at BBNG.ARPA
Subject: Seminars - Speech Recognition and NL Processing (BBN)
[Copied from AIList Digest V3 N 117 - 3 Sept, where it was:]
[Forwarded from the MIT bboard by SASW@MIT-MC.]
There will be an AI seminar on Monday August 26 at 10:30 in the second floor
conference room at 10 Moulton St. Jean-Francois Cloarec and Michel Gilloux
of Centre National d'Etudes des Telecommunications (CNET), Lannion, France
will speak. Their abstract:
SERAC : An Expert System for Acoustic-Phonetic Speech Recognition
We present a knowledge based approach to speech recognition at the
phonetic level. SERAC is a production system generating phonetic
hypotheses for continuously spoken French sentences.
We give the motivations for using such an approach and we describe
the knowledge representation language.
Then we present the knowledge base and report some preliminary
results.
There will be another talk by Karen Sparck Jones the next morning,
August 27th, at 10:00 in the 2nd floor conference room. Her abstract:
Natural Language Processing Research
at the
Computer Laboratory, University of Cambridge
The talk will outline recent and current work at the
Laboratory. This includes both research with a semantic
stimulus and research driven by parsing issues. The semantic
work is concerned with interpretation problems like reference
resolution, and with techniques for representation and
inference involving general as well as domain knowledge, in
the context of such tasks as database query and construction,
paraphrase, and indexing. The parsing work includes projects
on grammar construction, morphological analysis, and the use
of a large machine-readable dictionary, and research on finite
state techniques for compositional interpretation and on
robust phrase-based parsing strategies.
------------------------------
From: Henry Nussbacher <vshank%weizmann.BITNET@WISCVM>
Date: Fri, 30 Aug 85 10:39 O
How does the Inter-Network Database Server Work
Henry Nussbacher
This article will attempt to describe all the components
that go into the database server that is currently running
on host BITNIC in Bitnet. The work on developing this
inter-network database server is being funded by the Bitnet
Development and Operations Center (BITDOC).
DATABASE is an information retrieval system that will allow
users from any network to have access to various types of
information contained within a full-blown database system.
DATABASE is a server machine (daemon) that runs on the VM
operating system at BITNIC. It can accept commands in many
different ways. Users within Bitnet can send interactive
messages, punch files (record length of 80), print files
(record length of 133), and Note files (IBM standard for
electronic mail). In addition, users that are not located
within Bitnet, but reside in any of the other networks that
are connected to the Internet (Mailnet, Arpanet, Csnet,
UUCP, etc.) can send RFC822 mail to
DATABASE%BITNIC.BITNET@WISCVM.ARPA and the server will
accept it as a command.
Language used
=============
DATABASE is written in Rexx (approximately 1600 lines), a
high level macro language for VM. Rexx is a combination of
Algol (parsing capabilities), C and Pascal (structured
programming - Do, Do While, Do Until, Do Forever, Select-
When-Otherwise) and PL/1 (functions - Index, Verify, Substr,
Translate, Justify, etc.). Rexx has developed quite a
following among the hackers who use Bitnet, since it was
created using the best properties of existing high level
languages while leaving out the parts that everyone hates
(e.g. declarations: all Rexx variables are self-declaring).
By definition, Rexx has access to the VM file system. What
needed to be created were hooks into the VM mail system and
into a VM database system called Spires.
Interface to Bitnet mail system
===============================
It was decided that the most common aspect among all
computer networks in the world was RFC822 mail. Upon this
basis, the mail interface was developed. It is a separate
subroutine within DATABASE, so that when X.400 mail becomes
more accepted, additional coding can be done without
affecting any of the other segments of the code.
Most Bitnet sites run a package developed at Columbia
University which is another system server to handle mail
files. It performs all the validity checking and routing of
mail. When a mail file arrives at DATABASE, it is parsed to
find out from where it came. The first non-blank line after
the RFC822 header starts the command stream. Multiple
commands can be coded within one mail file for delivery to
DATABASE. The commands are passed into Spires for handling
and the results are stuffed into the VM file system.
DATABASE then takes the resultant file and places it inside
an RFC822 envelope and sends it to the mailer server for
handling. If the "From:" field that was sent is invalid
(e.g. xyzzy@Mitre-Bedford instead of the correct form
xyzzy@Mitre-Bedford.ARPA), then the mailer server will reject
the mail, since Bitnet cannot determine where to send the
reply.
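The parsing step described above can be sketched in Python rather than
Rexx. This is only an illustration, not the DATABASE code: the function
name extract_commands is hypothetical, and the sketch assumes one
command per non-blank body line, which the article does not state.

```python
from email import message_from_string


def extract_commands(raw_mail):
    """Return (sender, commands) for one RFC822 message.

    Assumption: each non-blank line of the body is one command.
    """
    msg = message_from_string(raw_mail)
    sender = msg.get("From", "")
    # For a non-multipart message the payload is the body as a string.
    body = msg.get_payload()
    # Leading blank lines are skipped, so the first non-blank line
    # after the header starts the command stream.
    commands = [line.strip() for line in body.splitlines() if line.strip()]
    return sender, commands


sample = (
    "From: xyzzy@Mitre-Bedford.ARPA\n"
    "To: DATABASE%BITNIC.BITNET@WISCVM.ARPA\n"
    "Subject: query\n"
    "\n"
    "\n"
    "HELP\n"
    "FIND TEXT EARN (IN INFO-NETS\n"
)
sender, commands = extract_commands(sample)
print(sender)    # xyzzy@Mitre-Bedford.ARPA
print(commands)  # ['HELP', 'FIND TEXT EARN (IN INFO-NETS']
```

The sender is kept alongside the commands because the reply has to be
wrapped in an RFC822 envelope addressed back to it.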
The Spires Database System
==========================
There are many database packages that are available for VM
systems: Focus, SQL, Adabas, etc. Spires (Stanford Public
Information Retrieval System) was selected due to its
functionality and its ability to accept commands from
Rexx.
Spires can locate a single record within a database of half
a million records after only 4 disk reads (maximum). Spires
is an index based database system. The definer can define
indices for any field that he/she so wishes. Spires is the
result of over 10 years of development at Stanford
University and has such advanced functions as phonetic
search capability as well as all the standard items one
expects to find in a database package (report generator,
sequential processing, etc.)
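The "4 disk reads" figure is plausible for an index tree. The article
does not say how Spires lays out its indices, so the following Python
sketch assumes a balanced multiway tree in which each disk read
fetches one node. It computes the smallest branching factor that
reaches half a million records in a given number of levels.

```python
def min_fanout(records, levels):
    """Smallest branching factor f such that f**levels >= records."""
    f = 1
    while f ** levels < records:
        f += 1
    return f


RECORDS = 500_000

# All four reads spent walking index levels.
print(min_fanout(RECORDS, 4))  # 27

# Three index levels plus one read for the record itself.
print(min_fanout(RECORDS, 3))  # 80
```

Under this assumption, a node holding a few dozen keys is enough to
stay within the stated maximum.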
Arpanet Digests
===============
One of the first projects was the incorporation of selected
Arpanet digests into the DATABASE system. The auto-digest
loader is written in Rexx (approximately 800 lines) and has
tables to control which digests to accept and which properties
they contain. Certain digests are digested (examples: Ai-List,
Info-Ibmpc) and some are rebroadcast immediately
(examples: Info-Nets, Security). Digested digests fall into
two categories. Some follow the standard of having exactly
30 '-' (hyphens) separating individual entries along with 70
hyphens separating the table of contents from the individual
entries. In addition, these digests are sequenced and their
first line is a title line for the digest.
Examples of these digests would be Ai-List, Info-Ibmpc,
Info-Kermit, and Sf-Lovers. The other category is
digested digests that are not sequenced and that do not have
a title line. An example of this style of digest is
Info-Graphics.
As Arpanet forums arrive, they are parsed into their
individual entries and added to the appropriate database
subfile. This process of database addition is performed
independently of the functioning of the DATABASE server.
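The separator convention for sequenced digests (70 hyphens after the
table of contents, 30 hyphens between entries) makes the splitting
step easy to sketch. The Python below is an illustration only; the
real loader is written in Rexx, and the name split_digest and the
sample digest are invented for this example.

```python
import re

# A line of exactly 70 hyphens ends the table of contents;
# a line of exactly 30 hyphens separates individual entries.
TOC_SEPARATOR = re.compile(r"^-{70}\s*$", re.MULTILINE)
ENTRY_SEPARATOR = re.compile(r"^-{30}\s*$", re.MULTILINE)


def split_digest(text):
    """Return (title_line, entries) for a sequenced digest."""
    parts = TOC_SEPARATOR.split(text, maxsplit=1)
    if len(parts) != 2:
        raise ValueError("no 70-hyphen separator found")
    preamble, body = parts
    title = preamble.strip().splitlines()[0]
    entries = [e.strip() for e in ENTRY_SEPARATOR.split(body) if e.strip()]
    return title, entries


sample = "\n".join([
    "Example Digest   Volume 1 : Issue 1",
    "Today's Topics:",
    "  Query - Example",
    "-" * 70,
    "From: a@example",
    "Subject: First",
    "",
    "First body.",
    "-" * 30,
    "From: b@example",
    "Subject: Second",
    "",
    "Second body.",
])
title, entries = split_digest(sample)
print(title)         # Example Digest   Volume 1 : Issue 1
print(len(entries))  # 2
```

A closing trailer after the last 30-hyphen line would come back as one
more chunk, so a real loader would need to recognize and drop it.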
Further documentation
=====================
To receive a detailed list of the valid command structure as
accepted by DATABASE, send an RFC822 piece of mail to the
address as stated above with the single line of HELP. You
should receive in return an RFC822 piece of mail with
introductory documentation on how to use DATABASE. In order
to receive further information on Arpanet digest searching,
issue the command HELP ARPANET.
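A request of this kind can be composed with Python's standard email
library. This is a sketch under assumptions: the helper name
build_request is hypothetical, the sender address is the example
address used earlier in this article, and commands are written one per
line. Actually sending the message is left to whatever mail transport
the reader's site provides.

```python
from email.message import EmailMessage

# Address given earlier in the article for users outside Bitnet.
DATABASE_ADDRESS = "DATABASE%BITNIC.BITNET@WISCVM.ARPA"


def build_request(sender, commands):
    """Build an RFC822 message whose body is the command stream."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = DATABASE_ADDRESS
    # Assumption: one command per line.
    msg.set_content("\n".join(commands) + "\n")
    return msg


request = build_request("xyzzy@Mitre-Bedford.ARPA", ["HELP"])
print(request.as_string())
```

Passing ["HELP ARPANET"] instead would ask for the digest-searching
documentation.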
Appendix - HELP ARPANET
======== ============
This service (Arpanet digests) will be available as of mid-October 1985.
DATABASE - Bitnet Inter-network Database Server (last updated 08/16/85)
--------
This Inter-network database server is currently under development
by the Development and Operations Center (BITDOC) of Bitnet.
Suggestions and comments should be forwarded to:
Henry Nussbacher
Bitnet: HANK@BITNIC
Internet: HANK%BITNIC.BITNET@WISCVM.ARPA
-------------------------------------------------------------------------
How to search Arpanet digests
=============================
This document is meant for advanced users who have mastered the beginning
help file.
For those who don't know what Arpanet digests are: There are currently
over 100 discussion forums that are maintained within Arpanet. These
range from discussions about the Apple Macintosh to new standards for
networks to information about computer security. Some of these
discussions appear as a digest; a moderator receives all contributions
and creates a formatted digest that is sent out to all subscribers. The
other form of discussion is an immediate redistribution list, where
contributions to the discussion are immediately rebroadcast to all
individuals who have registered for that discussion.
DATABASE now has the facility to search various selected Arpanet
digests. In order to receive a list of all valid subfiles, issue the
command 'LIST':
DRINKS         Demonstration subfile
EXPLAIN        Database System subfile
MOVIES         Demonstration subfile
PATHFINDER     Database System subfile
PRESIDENTS     Demonstration subfile
RECIPES        Demonstration subfile
RESTAURANT     Demonstration subfile
INFO-GRAPHICS  Arpanet discussion forum (digest)
AI-LIST        Arpanet discussion forum (digest)
INFO-NETS      Arpanet discussion forum (digest)
Arpanet "digested" digests, when loaded into the database, are pulled
apart into their individual components, so that when you perform a
search against a particular "digested" digest, you will not receive
the entire digest but rather just the entry that pertains to your
search request.
The following fields are defined as indices for all subfiles that
arrive from Arpanet:
Goal Records: ENTRIES, ENTRY
Simple Index: SD, SPIRES-DATE
Simple Index: SPIRES-TIME, ST
Simple Index: GRANDSEQ, GS
Simple Index: SUBJECT, T, TITLE
Simple Index: FROM
Simple Index: DATE
Simple Index: SEQ, SEQUENCE
Simple Index: TEXT
SPIRES-DATE (or SD) allows a user to find entries based upon the
date they were added. Examples:
FIND SD > 08/01/85 (IN AI-LIST
would find all entries that have been added to Ai-List after 08/01/85.
FIND SD < 07/01/85 (IN INFO-NETS
would find all entries that have been added to Info-Nets before July
1st, 1985.
SPIRES-TIME (or ST) allows you to refine your search even further when
wishing to find entries that have been added after (or before) a
particular day and time. Note should be taken that these indices of
date and time are not the date and time fields as mentioned within
the Arpanet digest but rather the actual date and time that the data
was loaded into the DATABASE system.
GRANDSEQ (or GS) is a unique integer number that is given to each
Arpanet digest as it is added. Certain immediate redistribution
digests (like Info-Nets) do not supply any sequencing number. This
sequence number is assigned by an alternate conferencing system within
Bitnet so that duplication of entries will not occur. This sequence
number will generally be different than the sequence number as assigned
by a "digested" digest moderator. Examples:
FIND GS 96 (IN AI-LIST
would find all individual entries from an Ai-List digest that was
assigned a Grand sequence number of 96.
FIND GS > 10 (IN INFO-NETS TABLE
would find all entries that have been assigned a Grand sequence
number greater than 10. In addition, since the list may be quite
long, this example has specified the TABLE option, which will display
a concise list of which entries have been found.
SUBJECT (or TITLE or T) is the 'Subject:' header line that generally
appears on each Arpanet digest entry. Examples:
FIND SUBJECT PIXEL (IN INFO-GRAPHICS
FIND TITLE PROLOG (IN AI-LIST
FIND SUBJECT WORKST* (IN INFO-GRAPHICS
FROM is the 'From:' header line that appears on each Arpanet digest
entry. Examples:
FIND FROM STRING DEC (IN INFO-NETS
would find any entry in Info-Nets that had a 'From:' field with
the character string 'DEC' anywhere within the field.
FIND FROM HENRY (IN AI-LIST
would find any entry in Ai-List that has the word HENRY in the
'From:' field.
DATE is the 'Date:' header line that appears on each Arpanet digest
entry. Examples:
FIND DATE EDT (IN INFO-IBMPC
would find all entries that have the word EDT in the 'Date:' field
of the entry.
FIND DATE JUL OR DATE JUN (IN INFO-NETS
would find all entries that have the word 'JUN' or 'JUL' in their
'Date:' field.
SEQUENCE (or SEQ) is only valid for Arpanet "digested" digests. Each
"digested" digest is assigned a sequence number by the moderator.
Examples:
FIND SEQ 102 (IN AI-LIST
FIND SEQ > 90 (IN INFO-IBMPC TABLE
TEXT is the text of the entry. This is defined as the section of an
entry that follows a blank line after the 'Date:', 'From:' and
'Subject:' (optional) fields. This entire section is keyword searchable.
Examples:
FIND TEXT EARN (IN INFO-NETS
FIND TEXT PROLOG AND TEXT LISP (IN AI-LIST
FIND TEXT XENIX AND SEQ > 85 (IN INFO-IBMPC
FIND (TEXT UNIX OR TEXT XENIX) AND DATE PST (IN INFO-IBMPC
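For readers curious how keyword searches with AND and OR can be
answered from an index, here is a minimal Python sketch of an inverted
index. It is an illustration only and makes no claim about how Spires
stores or evaluates its indices; the sample entries and the name
build_index are invented.

```python
from collections import defaultdict


def build_index(entries):
    """Map each upper-cased word to the set of entry ids containing it."""
    index = defaultdict(set)
    for entry_id, text in entries.items():
        for word in text.upper().split():
            # Drop surrounding punctuation so "Lisp." matches LISP.
            index[word.strip(".,;:!?()\"'")].add(entry_id)
    return index


entries = {
    1: "Prolog and Lisp compared for expert systems.",
    2: "A Lisp machine for sale.",
    3: "Xenix on the PC.",
}
index = build_index(entries)

# TEXT PROLOG AND TEXT LISP: intersect the two sets of entry ids.
both = index["PROLOG"] & index["LISP"]
# TEXT UNIX OR TEXT XENIX: take the union.
either = index["UNIX"] | index["XENIX"]

print(both)    # {1}
print(either)  # {3}
```

AND maps to set intersection and OR to set union, which is why a
keyword index can combine criteria without scanning the entry text.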
------------------------------
END OF IRList Digest
********************