Neuron Digest Wednesday, 22 May 1991 Volume 7 : Issue 29
Today's Topics:
New FKI-Report - An O(N^3) Learning Algorithm
Preprint: building sensory-motor hierarchies
Two new Tech Reports
TR - Kohonen Feature Maps in Natural Language Processing
Paper Available: RAAM
New ICSI TR on incremental learning
New Bayesian work
TR - Learning the past tense in a recurrent network
Send submissions, questions, address maintenance and requests for old issues to
"neuron-request@hplabs.hp.com" or "{any backbone,uunet}!hplabs!neuron-request"
Use "ftp" to get old issues from hplpm.hpl.hp.com (15.255.176.205).
------------------------------------------------------------
Subject: New FKI-Report - An O(N^3) Learning Algorithm
From: Juergen Schmidhuber <schmidhu@informatik.tu-muenchen.dbp.de>
Date: 06 May 91 12:43:22 +0200
Here is another one:
=---------------------------------------------------------------------
AN O(n^3) LEARNING ALGORITHM FOR FULLY RECURRENT NETWORKS
Juergen Schmidhuber
Technical Report FKI-151-91, May 6, 1991
The fixed-size storage learning algorithm for fully recurrent
continually running networks (e.g. (Robinson + Fallside, 1987),
(Williams + Zipser, 1988)) requires O(n^4) computations per time
step, where n is the number of non-input units. We describe a
method which computes exactly the same gradient and requires
fixed-size storage of the same order as the previous algorithm,
but whose average time complexity per time step is only O(n^3).
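(For readers who want to see where the O(n^4) figure comes from: the
standard fixed-size-storage method, real-time recurrent learning, carries
an n x n x n tensor of sensitivities and contracts it with the weight
matrix at every step. The Python/NumPy sketch below shows that baseline
update only -- it is not the O(n^3) method of the report, and all names
in it are made up for the example.)

import numpy as np

def rtrl_step(W, y, p):
    # One step of the baseline RTRL sensitivity update (illustrative sketch).
    # W : (n, n) recurrent weight matrix
    # y : (n,)   current activations
    # p : (n, n, n) sensitivities p[k, i, j] = d y_k / d w_ij
    # External inputs and biases are omitted for brevity.
    s = W @ y
    y_new = np.tanh(s)
    fprime = 1.0 - y_new ** 2
    # Kronecker-delta term: delta_{ki} * y_j
    direct = np.zeros_like(p)
    idx = np.arange(W.shape[0])
    direct[idx, idx, :] = y
    # The contraction below touches n^4 terms -- this is the O(n^4) per-step cost.
    p_new = fprime[:, None, None] * (direct + np.einsum('kl,lij->kij', W, p))
    return y_new, p_new

# Tiny example with n = 5 fully recurrent units.
n = 5
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(n, n))
y, p = np.zeros(n), np.zeros((n, n, n))
y, p = rtrl_step(W, y, p)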
=---------------------------------------------------------------------
To obtain a copy, do:
unix> ftp 131.159.8.35
Name: anonymous
Password: your name, please
ftp> binary
ftp> cd pub/fki
ftp> get fki151.ps.Z
ftp> bye
unix> uncompress fki151.ps.Z
unix> lpr fki151.ps
Please do not forget to leave your name (instead of your email address).
NOTE: fki151.ps is designed for European A4 paper format
(21.0cm x 29.7cm).
In case of ftp-problems send email to
schmidhu@informatik.tu-muenchen.de
or contact
Juergen Schmidhuber
Institut fuer Informatik,
Technische Universitaet Muenchen
Arcisstr. 21
8000 Muenchen 2
GERMANY
------------------------------
Subject: Preprint: building sensory-motor hierarchies
From: Mark Ring <ring@cs.utexas.edu>
Date: Wed, 08 May 91 16:16:31 -0500
Recently there's been some interest on this mailing list regarding neural
net hierarchies for sequence "chunking". I've placed a relevant paper in
the Neuroprose Archive for public ftp. This is a (very slightly
extended) copy of a paper to be published in the Proceedings of the
Eighth International Workshop on Machine Learning.
The paper summarizes the results to date of work begun a year and a half
ago to create a system that automatically and incrementally constructs
hierarchies of behaviors in neural nets. The purpose of the system is to
develop continuously through the encapsulation, or "chunking," of learned
behaviors.
=----------------------------------------------------------------------
INCREMENTAL DEVELOPMENT OF COMPLEX BEHAVIORS THROUGH AUTOMATIC
CONSTRUCTION OF SENSORY-MOTOR HIERARCHIES
Mark Ring
University of Texas at Austin
This paper addresses the issue of continual, incremental
development of behaviors in reactive agents. The reactive
agents are neural-network based and use reinforcement
learning techniques.
A continually developing system is one that is constantly
capable of extending its repertoire of behaviors. An agent
increases its repertoire of behaviors in order to increase
its performance in and understanding of its environment.
Continual development requires an unlimited growth
potential; that is, it requires a system that can
constantly augment current behaviors with new behaviors,
perhaps using the current ones as a foundation for those
that come next. It also requires a process for organizing
behaviors in meaningful ways and a method for assigning
credit properly to sequences of behaviors, where each
behavior may itself be an arbitrarily long sequence.
The solution proposed here is hierarchical and bottom up.
I introduce a new kind of neuron (termed a ``bion''),
whose characteristics permit it to be automatically
constructed into sensory-motor hierarchies as determined
by experience. The bion is being developed to resolve the
problems of incremental growth, temporal history
limitation, network organization, and credit assignment
among component behaviors.
A longer, more detailed paper will be announced shortly.
=----------------------------------------------------------------------
Instructions to retrieve the paper by ftp (no hard copies are available at
this time):
% ftp cheops.cis.ohio-state.edu (or 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get ring.ml91.ps.Z
ftp> bye
% uncompress ring.ml91.ps.Z
% lpr -P(your_postscript_printer) ring.ml91.ps
=----------------------------------------------------------------------
DO NOT "reply" DIRECTLY TO THIS MESSAGE!
If you have any questions or difficulties, please send e-mail to:
ring@cs.utexas.edu.
or send mail to:
Mark Ring
Department of Computer Sciences
Taylor 2.124
University of Texas at Austin
Austin, TX 78712
------------------------------
Subject: Two new Tech Reports
From: Scott.Fahlman@SEF1.SLISP.CS.CMU.EDU
Date: Mon, 13 May 91 13:31:04 -0400
The following two tech reports have been placed in the neuroprose
database at Ohio State. Instructions for accessing them via anonymous
FTP are included at the end of this message. (Maybe everyone should copy
down these instructions once and for all so that we can stop
repeating them with each announcement.)
=---------------------------------------------------------------------------
Tech Report CMU-CS-91-100
The Recurrent Cascade-Correlation Architecture
Scott E. Fahlman
Recurrent Cascade-Correlation (RCC) is a recurrent version of the
Cascade-Correlation learning architecture of Fahlman and Lebiere
\cite{fahlman:cascor}. RCC can learn from examples to map a sequence of
inputs into a desired sequence of outputs. New hidden units with
recurrent connections are added to the network one at a time, as they are
needed during training. In effect, the network builds up a finite-state
machine tailored specifically for the current problem. RCC retains the
advantages of Cascade-Correlation: fast learning, good generalization,
automatic construction of a near-minimal multi-layered network, and the
ability to learn complex behaviors through a sequence of simple lessons.
The power of RCC is demonstrated on two tasks: learning a finite-state
grammar from examples of legal strings, and learning to recognize
characters in Morse code.
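(To make the "recurrent" part concrete: each RCC hidden unit feeds its own
previous output back to itself through a single trained weight, which is
how the cascade accumulates state. A minimal Python sketch of one such
unit's forward pass follows; the sigmoid and the variable names are
illustrative assumptions, not code from the TR.)

import numpy as np

def rcc_hidden_unit(inputs, w_in, w_self):
    # Forward pass of one Recurrent Cascade-Correlation hidden unit (sketch).
    # inputs : (T, d) sequence of inputs and earlier-unit activations
    # w_in   : (d,)   incoming weights
    # w_self : float  self-recurrent weight (the unit's only recurrence)
    v = 0.0
    outputs = []
    for x_t in inputs:
        v = 1.0 / (1.0 + np.exp(-(x_t @ w_in + w_self * v)))
        outputs.append(v)
    return np.array(outputs)

# Example: drive a unit with a 3-dimensional input stream for 10 steps.
x = np.random.default_rng(1).normal(size=(10, 3))
v_seq = rcc_hidden_unit(x, w_in=np.full(3, 0.3), w_self=0.8)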
Note: This TR is essentially the same as the paper of the same name
in the NIPS 3 proceedings (due to appear very soon). The TR version
includes some additional experimental data and a few explanatory diagrams
that had to be cut in the NIPS version.
=---------------------------------------------------------------------------
Tech report CMU-CS-91-130
Learning with Limited Numerical Precision Using the Cascade-Correlation
Algorithm
Markus Hoehfeld and Scott E. Fahlman
A key question in the design of specialized hardware for simulation of
neural networks is whether fixed-point arithmetic of limited numerical
precision can be used with existing learning algorithms. We present an
empirical study of the effects of limited precision in
Cascade-Correlation networks on three different learning problems. We
show that learning can fail abruptly as the precision of network weights
or weight-update calculations is reduced below 12 bits. We introduce
techniques for dynamic rescaling and probabilistic rounding that allow
reliable convergence down to 6 bits of precision, with only a gradual
reduction in the quality of the solutions.
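(The probabilistic rounding mentioned above is the ingredient easiest to
illustrate in isolation: a value is rounded up or down at random with
probabilities chosen so the result is unbiased on average, which is what
keeps low-precision weight updates from stalling. The sketch below shows
the general idea only; the report's exact scaling and saturation rules may
differ.)

import numpy as np

def probabilistic_round(x, frac_bits, rng):
    # Round x onto a fixed-point grid with `frac_bits` fractional bits,
    # choosing the upper or lower grid point at random so that the
    # rounding is unbiased in expectation (illustrative sketch).
    scale = 2.0 ** frac_bits
    scaled = np.asarray(x) * scale
    floor = np.floor(scaled)
    frac = scaled - floor
    return (floor + (rng.random(scaled.shape) < frac)) / scale

rng = np.random.default_rng(0)
w = np.array([0.01234, -0.4567, 0.125])
print(probabilistic_round(w, frac_bits=6, rng=rng))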
Note: The experiments described here were conducted during a visit by
Markus Hoehfeld to Carnegie Mellon in the fall of 1990. Markus
Hoehfeld's permanent address is Siemens AG, ZFE IS INF 2, Otto-Hahn-Ring
6, W-8000 Munich 83, Germany.
=---------------------------------------------------------------------------
To access these tech reports in postscript form via anonymous FTP, do the
following:
unix> ftp cheops.cis.ohio-state.edu (or, ftp 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get <filename.ps.Z>
ftp> quit
unix> uncompress <filename.ps.Z>
unix> lpr <filename.ps> (use flag your printer needs for Postscript)
The TRs described above are stored as "fahlman.rcc.ps.Z" and
"hoehfeld.precision.ps.Z". Older reports "fahlman.quickprop-tr.ps.Z" and
"fahlman.cascor-tr.ps.Z" may also be of interest.
Your local version of ftp and other unix utilities may be different.
Consult your local system wizards for details.
=---------------------------------------------------------------------------
Hardcopy versions are now being printed and will be available soon, but
because of the high demand and tight budget, our school has
(reluctantly) instituted a charge for mailing out tech reports in
hardcopy: $3 per copy within the U.S. and $5 per copy elsewhere, and the
payment must be in U.S. dollars. To order hardcopies, contact:
Ms. Catherine Copetas
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
U.S.A.
------------------------------
Subject: TR - Kohonen Feature Maps in Natural Language Processing
From: SCHOLTES@ALF.LET.UVA.NL
Date: Wed, 15 May 91 13:42:00 +0700
TR Available on Recurrent Self-Organization in NLP:
Kohonen Feature Maps in Natural Language Processing
J.C. Scholtes
University of Amsterdam
Main points: showing the possibilities of Kohonen feature maps in symbolic
applications by pushing self-organization, and showing a different technique
in connectionist NLP by using only (unsupervised) self-organization.
Although the model is tested in an NLP context, the linguistic aspects of
these experiments are probably less interesting than the connectionist
ones. People requesting a copy should be aware of this.
Abstract
In the 1980s, backpropagation (BP) started the connectionist bandwagon in
Natural Language Processing (NLP). Although initial results were good,
some critical notes must be made about the blind application of BP.
Most such systems add contextual and semantic features manually by
structuring the input set. Moreover, these models form only a small subset
of the brain structures known from the neurosciences. They do not adapt
smoothly to a changing environment and can only learn input/output pairs.
Although these disadvantages of the backpropagation algorithm are
commonly known and accepted, other, more plausible learning algorithms,
such as unsupervised learning techniques, are still rare in the field of
NLP. The main reason is the sharply increasing complexity of unsupervised
learning methods when they are applied in the already complex field of NLP.
However, recent efforts to implement unsupervised language learning have
been made, with interesting conclusions (Elman and Ritter).
Building on this earlier work, a recurrent self-organizing model (based on
an extension of the Kohonen feature map), capable of deriving contextual
(and some semantic) information from scratch, is presented in detail.
The model implements a first step towards an overall unsupervised
language learning system. Simple linguistic tasks such as single word
clustering (representation on the map), syntactical group formation,
derivation of contextual structures, string prediction, grammatical
correctness checking, word sense disambiguation and structure assigning
are carried out in a number of experiments. The performance of the model
is at least as good as that achieved with recurrent backpropagation, and on
some points even better (e.g. unsupervised derivation of word classes and
syntactic structures).
Although preliminary, the first results are promising and show
possibilities for other, even more biologically inspired language
processing techniques such as truly Hebbian, genetic, or Darwinistic
models. Forthcoming research must overcome limitations still present in
the extended Kohonen model, such as the absence of within-layer learning,
restricted recurrence, the lack of look-ahead functions (absence of
distributed or unsupervised buffering mechanisms), and limited support for
an increased number of layers.
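(As background for readers new to feature maps, the plain Kohonen update
on which the model builds looks roughly as follows. The recurrent
extension described in the report -- feeding map activity back in as
context -- is not shown, and the names are illustrative.)

import numpy as np

def som_step(weights, grid, x, lr, sigma):
    # One unsupervised Kohonen update (sketch of the basic rule only).
    # weights : (M, d) codebook vectors, one per map unit
    # grid    : (M, 2) coordinates of each unit on the map
    # x       : (d,)   current input vector (e.g. a coded word plus context)
    winner = np.argmin(np.linalg.norm(weights - x, axis=1))   # best-matching unit
    dist2 = np.sum((grid - grid[winner]) ** 2, axis=1)        # distance on the map
    h = np.exp(-dist2 / (2.0 * sigma ** 2))                   # neighbourhood function
    weights += lr * h[:, None] * (x - weights)                # pull units toward x
    return weights, winner

# Example: a 10 x 10 map trained on 25-dimensional inputs.
rng = np.random.default_rng(0)
grid = np.array([(i, j) for i in range(10) for j in range(10)], dtype=float)
weights = rng.random((100, 25))
weights, bmu = som_step(weights, grid, rng.random(25), lr=0.1, sigma=2.0)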
A copy can be obtained by sending an email message to
SCHOLTES@ALF.LET.UVA.NL. Please indicate whether you want a hard copy or a
postscript file sent to you.
------------------------------
Subject: Paper Available: RAAM
From: doug blank <blank@copper.ucs.indiana.edu>
Date: Wed, 15 May 91 15:11:09 -0500
Exploring the Symbolic/Subsymbolic Continuum:
A Case Study of RAAM
Douglas S. Blank (blank@iuvax.cs.indiana.edu)
Lisa A. Meeden (meeden@iuvax.cs.indiana.edu)
James B. Marshall (marshall@iuvax.cs.indiana.edu)
Indiana University
Computer Science and Cognitive Science
Departments
Abstract:
This paper is an in-depth study of the mechanics of recursive
auto-associative memory, or RAAM, an architecture developed by Jordan
Pollack. It is divided into three main sections: an attempt to place the
symbolic and subsymbolic paradigms on a common ground; an analysis of a
simple RAAM; and a description of a set of experiments performed on
simple "tarzan" sentences encoded by a larger RAAM.
We define the symbolic and subsymbolic paradigms as two opposing corners
of an abstract space of paradigms. This space, we propose, has roughly
three dimensions: representation, composition, and functionality. By
defining the differences in these terms, we are able to place actual
models in the paradigm space, and compare these models in somewhat common
terms.
As an example of the subsymbolic corner of the space, we examine in
detail the RAAM architecture, representations, compositional mechanisms,
and functionality. In conjunction with other simple feed-forward
networks, we create detectors, decoders and transformers which act
holistically on the composed, distributed, continuous subsymbolic
representations created by a RAAM. These tasks, although trivial for a
symbolic system, are accomplished without the need to decode a composite
structure into its constituent parts, as symbolic systems must do.
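(For readers who have not met RAAMs: a binary RAAM compresses two child
vectors of width d into a single width-d code and decodes the code back
into the children; because codes have the same width as children, trees can
be composed recursively. The sketch below shows only the encode/decode
mechanics, with untrained placeholder weights; in the paper the weights are
of course trained by backpropagating the auto-association error.)

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class BinaryRAAM:
    # Minimal sketch of a binary recursive auto-associative memory.
    # Weights are random placeholders, not trained as in the paper.
    def __init__(self, d, rng):
        self.W_enc = rng.normal(scale=0.3, size=(d, 2 * d))   # (left, right) -> code
        self.W_dec = rng.normal(scale=0.3, size=(2 * d, d))   # code -> (left, right)

    def encode(self, left, right):
        return sigmoid(self.W_enc @ np.concatenate([left, right]))

    def decode(self, code):
        out = sigmoid(self.W_dec @ code)
        d = code.shape[0]
        return out[:d], out[d:]

# Compose ((a b) c) and unpack it again; with trained weights the
# reconstructions would approximate the original parts.
rng = np.random.default_rng(0)
d = 8
raam = BinaryRAAM(d, rng)
a, b, c = rng.random((3, d))
abc = raam.encode(raam.encode(a, b), c)
ab_hat, c_hat = raam.decode(abc)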
The paper can be found in the neuroprose archive as blank.raam.ps.Z; a
detailed example of how to retrieve the paper follows at the end of this
message. A version of the paper will also appear in your local bookstores
as a chapter in "Closing the Gap: Symbolism vs Connectionism," J.
Dinsmore, editor; LEA, publishers. 1992.
=----------------------------------------------------------------------------
% ftp cheops.cis.ohio-state.edu
Connected to cheops.cis.ohio-state.edu.
220 cheops.cis.ohio-state.edu FTP server (Ver Tue May 9 14:01 EDT 1989) ready.
Name (cheops.cis.ohio-state.edu:): anonymous
331 Guest login ok, send ident as password.
Password:neuron
230 Guest login ok, access restrictions apply.
ftp> binary
200 Type set to I.
ftp> cd pub/neuroprose
250 CWD command successful.
ftp> get blank.raam.ps.Z
200 PORT command successful.
150 Opening BINARY mode data connection for blank.raam.ps.Z (173015 bytes).
226 Transfer complete.
local: blank.raam.ps.Z remote: blank.raam.ps.Z
173015 bytes received in 1.6 seconds (1e+02 Kbytes/s)
ftp> bye
221 Goodbye.
% uncompress blank.raam.ps.Z
% lpr blank.raam.ps
=----------------------------------------------------------------------------
------------------------------
Subject: New ICSI TR on incremental learning
From: ethem@ICSI.Berkeley.EDU (Ethem Alpaydin)
Date: Tue, 21 May 91 10:53:29 -0700
The following TR is available by anonymous net access at
icsi-ftp.berkeley.edu (128.32.201.55) in postscript. Instructions to ftp
and uncompress follow text.
Hard copies may be requested by writing to either of the
addresses below:
ethem@icsi.berkeley.edu
Ethem Alpaydin
ICSI 1947 Center St. Suite 600
Berkeley CA 94704-1105 USA
=--------------------------------------------------------------------------
GAL:
Networks that grow when they learn and
shrink when they forget
Ethem Alpaydin
International Computer Science Institute
Berkeley, CA
TR 91-032
Abstract
Learning that is limited to the modification of parameters has limited
scope; the capability to modify the system's structure is also needed to
widen the range of what can be learned. In the case of artificial neural
networks, learning by iterative adjustment of synaptic weights can only
succeed if the network designer predefines an appropriate network
structure, i.e., the number of hidden layers and units, and the size and
shape of their receptive and projective fields. This paper advocates the
view that the network structure should not, as is usually done, be
determined by trial and error, but should instead be computed by the
learning algorithm. Incremental learning algorithms can modify the network
structure by adding and/or removing units and/or links. A survey of the
current connectionist literature along this line of thought is given.
``Grow and Learn'' (GAL) is a new algorithm that learns an association
in one shot, because it is incremental and uses a local representation.
During the so-called ``sleep'' phase, units that were previously
stored but are no longer necessary due to recent modifications
are removed to minimize network complexity. The incrementally
constructed network can later be fine-tuned off-line to improve
performance. Another proposed method that greatly increases
recognition accuracy is to train a number of networks and vote over
their responses. The algorithm and its variants are tested on the
recognition of handwritten numerals and seem promising, especially in
terms of learning speed. This makes the algorithm attractive for
on-line learning tasks, e.g., in robotics. The biological
plausibility of incremental learning is also discussed briefly.
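(The grow-on-mistake and prune-in-sleep idea can be illustrated with a
plain nearest-prototype learner, as in the sketch below. The actual GAL
add/remove criteria in the TR are more refined; this is only a caricature
of the incremental flavour, with names invented for the example.)

import numpy as np

class GrowingPrototypeClassifier:
    # Caricature of grow-and-prune learning with local units (sketch only).
    def __init__(self):
        self.protos = []                        # list of (vector, label) units

    def predict(self, x, skip=None):
        best, best_d = None, np.inf
        for i, (p, label) in enumerate(self.protos):
            if i == skip:
                continue
            d = np.linalg.norm(np.asarray(x) - p)
            if d < best_d:
                best, best_d = label, d
        return best

    def learn(self, x, label):
        if self.predict(x) != label:            # grow: add a unit on a mistake
            self.protos.append((np.array(x, dtype=float), label))

    def sleep(self):
        # prune units that the remaining units already classify correctly
        self.protos = [(p, label) for i, (p, label) in enumerate(self.protos)
                       if self.predict(p, skip=i) != label]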
Keywords
Incremental learning, supervised learning, classification, pruning,
destructive methods, growth, constructive methods, nearest neighbor.
=--------------------------------------------------------------------------
Instructions to ftp the above-mentioned TR (assuming you are under
UNIX and have a postscript printer --- messages in parentheses indicate
the system's responses):
ftp 128.32.201.55
(Connected to 128.32.201.55.
220 icsi-ftp (icsic) FTP server (Version 5.60 local) ready.
Name (128.32.201.55:ethem):)anonymous
(331 Guest login ok, send ident as password.
Password:)(your email address)
(230 Guest login Ok, access restrictions apply.
ftp>)cd pub/techreports
(250 CWD command successful.
ftp>)bin
(200 Type set to I.
ftp>)get tr-91-032.ps.Z
(200 PORT command successful.
150 Opening BINARY mode data connection for tr-91-032.ps.Z (153915 bytes).
226 Transfer complete.
local: tr-91-032.ps.Z remote: tr-91-032.ps.Z
153915 bytes received in 0.62 seconds (2.4e+02 Kbytes/s)
ftp>)quit
(221 Goodbye.)
(back to Unix)
uncompress tr-91-032.ps.Z
lpr tr-91-032.ps
Happy reading; I hope you'll enjoy it.
------------------------------
Subject: New Bayesian work
From: David MacKay <mackay@hope.caltech.edu>
Date: Tue, 21 May 91 10:40:57 -0700
Two new papers available
=-----------------------
The papers that I presented at Snowbird this year are now
available in the neuroprose archives.
The titles:
[1] Bayesian interpolation (14 pages)
[2] A practical Bayesian framework
for backprop networks (11 pages)
The first paper describes and demonstrates recent developments in
Bayesian regularisation and model comparison. The second applies this
framework to backprop. The first paper is a prerequisite for
understanding the second.
Abstracts and instructions for anonymous ftp follow.
If you have problems obtaining the files by ftp, feel free to
contact me.
David MacKay Office: (818) 397 2805
Fax: (818) 792 7402
Email: mackay@hope.caltech.edu
Smail: Caltech 139-74,
Pasadena, CA 91125
Abstracts
=--------
Bayesian interpolation
----------------------
Although Bayesian analysis has been in use since Laplace,
the Bayesian method of {\em model--comparison} has only recently
been developed in depth.
In this paper, the Bayesian approach to regularisation
and model--comparison is demonstrated by studying the inference
problem of interpolating noisy data. The concepts and methods
described are quite general and can be applied to many other
problems.
Regularising constants are set by examining their
posterior probability distribution. Alternative regularisers
(priors) and alternative basis sets are objectively compared by
evaluating the {\em evidence} for them. `Occam's razor' is
automatically embodied by this framework.
The way in which Bayes infers the values of regularising
constants and noise levels has an elegant interpretation in terms
of the effective number of parameters determined by the data set.
This framework is due to Gull and Skilling.
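(To make notions such as the effective number of parameters concrete, here
is a small Python sketch of the standard evidence-framework re-estimation
for a linear-in-the-parameters interpolation model. The formulae are the
usual Gull/MacKay ones; the variable names and the linear setting are
assumptions made for the example, not an excerpt from the paper.)

import numpy as np

def reestimate_hyperparameters(Phi, t, alpha, beta):
    # One evidence-framework update for the model t ~ Phi @ w + noise,
    # with Gaussian prior alpha/2 * ||w||^2 and noise precision beta.
    N, k = Phi.shape
    A = alpha * np.eye(k) + beta * (Phi.T @ Phi)      # posterior precision matrix
    w_mp = beta * np.linalg.solve(A, Phi.T @ t)       # most probable weights
    lam = beta * np.linalg.eigvalsh(Phi.T @ Phi)      # curvature eigenvalues
    gamma = np.sum(lam / (lam + alpha))               # effective no. of parameters
    E_D = 0.5 * np.sum((t - Phi @ w_mp) ** 2)         # data misfit
    alpha_new = gamma / (w_mp @ w_mp)                 # re-estimated regulariser
    beta_new = (N - gamma) / (2.0 * E_D)              # re-estimated noise precision
    return w_mp, gamma, alpha_new, beta_new

# Iterating these updates to a fixed point sets the regularising constants
# from the data alone, with no validation set.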
A practical Bayesian framework for backprop networks
----------------------------------------------------
A quantitative and practical Bayesian framework is
described for learning of mappings in feedforward networks. The
framework makes possible:
(1) objective comparisons between solutions using alternative
network architectures;
(2) objective stopping rules for deletion of weights;
(3) objective choice of magnitude and type of weight decay terms
or additive regularisers (for penalising large weights,
etc.);
(4) a measure of the effective number of well--determined
parameters in a model;
(5) quantified estimates of the error bars on network parameters
and on network output;
(6) objective comparisons with alternative learning and
interpolation models such as splines and radial basis
functions.
The Bayesian `evidence' automatically embodies `Occam's razor,'
penalising over--flexible and over--complex architectures. The
Bayesian approach helps detect poor underlying assumptions in
learning models. For learning models well--matched to a problem,
a good correlation between generalisation ability and the
Bayesian evidence is obtained.
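(Item (5), error bars on the network output, reduces under the Gaussian
posterior approximation to a single quadratic form. The fragment below is
a hedged sketch of that step only; obtaining the Hessian A and the output
gradient g for a real backprop net is the hard part and is not shown.)

import numpy as np

def output_variance(g, A):
    # Predictive variance of a network output under a Gaussian weight
    # posterior: g^T A^{-1} g, where g = d(output)/d(weights) at the most
    # probable weights and A is the Hessian of the regularised error.
    # (Add the noise variance to get the full error bar; sketch only.)
    return float(g @ np.linalg.solve(A, g))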
Instructions for obtaining copies by ftp from neuroprose:
=---------------------------------------------------------
unix> ftp cheops.cis.ohio-state.edu # (or ftp 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get mackay.bayes-interpolation.ps.Z
ftp> get mackay.bayes-backprop.ps.Z
ftp> quit
unix> [then `uncompress' files and lpr them.]
------------------------------
Subject: TR - Learning the past tense in a recurrent network
From: Gary Cottrell <gary@cs.UCSD.EDU>
Date: Tue, 21 May 91 18:27:56 -0700
The following paper will appear in the Proceedings of the Thirteenth
Annual Meeting of the Cognitive Science Society.
It is now available in the neuroprose archive as cottrell.cogsci91.ps.Z.
Learning the past tense in a recurrent network:
Acquiring the mapping from meaning to sounds
Garrison W. Cottrell Kim Plunkett
Computer Science Dept. Inst. of Psychology
UCSD University of Aarhus
La Jolla, CA Aarhus, Denmark
The performance of a recurrent neural network in mapping a set of plan
vectors, representing verb semantics, to associated sequences of
phonemes, representing the phonological structure of verb morphology, is
investigated. Several semantic representations are explored in an attempt to
evaluate the role of verb synonymy and homophony in determining the
patterns of error observed in the net's output performance. The model's
performance offers several unexplored predictions for developmental
profiles of young children acquiring English verb morphology.
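(Schematically, the mapping studied here runs a recurrent net forward from
a fixed semantic "plan" vector and reads off one phoneme per time step.
The sketch below is a generic simple-recurrent-network version of that
idea; the paper's actual architecture, phoneme coding, and training
procedure are not reproduced here.)

import numpy as np

def plan_to_phonemes(plan, W_plan, W_hh, W_out, steps):
    # Unroll a simple recurrent net from a fixed semantic plan vector and
    # emit the index of the most active phoneme unit at each step (sketch).
    h = np.zeros(W_hh.shape[0])
    phonemes = []
    for _ in range(steps):
        h = np.tanh(W_plan @ plan + W_hh @ h)
        phonemes.append(int(np.argmax(W_out @ h)))
    return phonemes

# Example: a 20-unit hidden layer, 12-dimensional plan, 30 phoneme units.
rng = np.random.default_rng(0)
W_plan, W_hh, W_out = (rng.normal(scale=0.3, size=s)
                       for s in [(20, 12), (20, 20), (30, 20)])
seq = plan_to_phonemes(rng.random(12), W_plan, W_hh, W_out, steps=6)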
To retrieve this from the neuroprose archive type the following:
ftp 128.146.8.62
anonymous
<your netname here>
bi
cd pub/neuroprose
get cottrell.cogsci91.ps.Z
quit
uncompress cottrell.cogsci91.ps.Z
lpr cottrell.cogsci91.ps
Thanks again to Jordan Pollack for this great idea for net distribution.
gary cottrell 619-534-6640 Sec'y: 619-534-5288 FAX: 619-534-7029
Computer Science and Engineering C-014
UCSD,
La Jolla, Ca. 92093
gary@cs.ucsd.edu (INTERNET)
{ucbvax,decvax,akgua,dcdwest}!sdcsvax!gary (USENET)
gcottrell@ucsd.edu (BITNET)
------------------------------
End of Neuron Digest [Volume 7 Issue 29]
****************************************