Machine Learning List: Vol. 3 No. 21
Wednesday, Dec. 4, 1991
Contents:
Application of Machine Learning for a Natural Language Problem
Wisconsin FTP Site for ML Papers
The Machine Learning List is moderated. Contributions should be relevant to
the scientific study of machine learning. Mail contributions to ml@ics.uci.edu.
Mail requests to be added or deleted to ml-request@ics.uci.edu. Back issues
may be FTP'd from ics.uci.edu in pub/ml-list/V<X>/<N> or N.Z where X and N are
the volume and number of the issue; ID: anonymous PASSWORD: <your mail address>
------------------------------
Date: Wed, 4 Dec 91 13:35:50 CST
From: Jim Barnett <barnett@mcc.COM>
Subject: Literature search for a natural language application of machine learning
I would like to apply machine learning techniques to a specific
problem in natural language processing and am looking for pointers to
the machine learning literature. My background is in NL rather than
learning, but I don't mind wading through some math.
General outline: the basic data structure I'm working with is a parse
tree (a directed graph). A disambiguator is a function that takes two
parse trees as input and indicates which one it prefers, or that the two
trees are equally good. Disambiguators can be ordered in the following
sense: given disambiguators {d1, ..., dn}, we produce an ordered list
<d1, ..., dn>, where we use d2 to decide between trees that d1 treats as
equal, then d3 if both d1 and d2 treat the trees as equal, etc.
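For concreteness, here is a minimal sketch of that combination rule
(Python, with purely illustrative names); the ordered list simply
returns the first non-Eq verdict:

EQ = "Eq"

def combine_ordered(disambiguators, tree1, tree2):
    """Apply an ordered list of disambiguators: return the verdict (1, 2,
    or EQ) of the first disambiguator that expresses a preference."""
    for d in disambiguators:
        verdict = d(tree1, tree2)   # each d returns 1, 2, or EQ
        if verdict != EQ:
            return verdict
    return EQ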
Task 1: Given a set of disambiguators {d1, ..., dn} and a set of
sentences with multiple parses (each sentence associated with multiple
parse trees), plus an indication of the preferred parse of each
sentence, how do you determine the optimal ordering of {d1, ..., dn}, in
the sense of the one that produces a ranking most closely resembling
the ranking given in the input?
I have been told that there is a standard algorithm for "learning
decision lists" which will accomplish this, but I don't know where to
find it.
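For a small number of disambiguators the ordering could of course be
found by brute force; the sketch below (building on combine_ordered
above, and assuming the labelled data comes as (parses, preferred_index)
pairs) scores each permutation by how often the human-preferred parse
wins. I assume the learning literature has something smarter than this.

from itertools import permutations

def score_ordering(order, labelled_sentences):
    """Count sentences whose human-preferred parse wins under this ordering."""
    correct = 0
    for parses, preferred in labelled_sentences:
        best = 0
        for i in range(1, len(parses)):
            # keep the current best unless the ordering prefers the challenger
            if combine_ordered(order, parses[best], parses[i]) == 2:
                best = i
        if best == preferred:
            correct += 1
    return correct

def best_ordering(disambiguators, labelled_sentences):
    """Exhaustive search over orderings; only feasible for a handful of d's."""
    return max(permutations(disambiguators),
               key=lambda order: score_ordering(order, labelled_sentences))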
Task 2: Consider more complicated ways of combining disambiguators.
Each disambiguator returns one of three values: 1, 2, or Eq. Now, given a
list of N disambiguators, you consider all combination functions
from a vector of N values in {1, 2, Eq} (that is, the rankings the N
individual disambiguators return) to a single value in {1, 2, Eq} (namely,
the final ranking). The ordering of disambiguators mentioned above is just
one such function: it starts at the left of the input vector and
takes the first non-Eq value it finds.
Given the same input as described above, how do we find the optimal
combination function?
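A naive reading of this task, sketched below assuming the training data
comes as (tree1, tree2, target) triples with target in {1, 2, Eq}, is to
tabulate the majority target for every verdict vector actually seen.
With N disambiguators there are 3**N possible vectors, so presumably the
interesting answers are learners that generalize across vectors:

from collections import Counter, defaultdict

def learn_combination_table(disambiguators, labelled_pairs):
    """Learn a combination function as a lookup table keyed by the vector of
    individual verdicts; the value is the majority target for that vector."""
    votes = defaultdict(Counter)
    for tree1, tree2, target in labelled_pairs:
        key = tuple(d(tree1, tree2) for d in disambiguators)
        votes[key][target] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

def apply_combination(table, disambiguators, tree1, tree2):
    """Unseen verdict vectors fall back to Eq."""
    key = tuple(d(tree1, tree2) for d in disambiguators)
    return table.get(key, "Eq")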
Task 3: Do task 2, but incrementally. You don't get the ranked
sentences all at once, but one at a time; the combination function is
updated after each one, gradually converging on the best fit.
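Under the same lookup-table reading, the incremental version would just
keep running vote counts and update them one labelled pair at a time
(again only a sketch, not a proposal):

from collections import Counter, defaultdict

class IncrementalCombiner:
    """Task 3 sketch: per-vector vote counts, updated one example at a time."""

    def __init__(self, disambiguators):
        self.disambiguators = disambiguators
        self.votes = defaultdict(Counter)

    def update(self, tree1, tree2, target):
        key = tuple(d(tree1, tree2) for d in self.disambiguators)
        self.votes[key][target] += 1

    def predict(self, tree1, tree2):
        key = tuple(d(tree1, tree2) for d in self.disambiguators)
        return self.votes[key].most_common(1)[0][0] if self.votes[key] else "Eq"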
I would be grateful for any pointers to relevant literature on any of
these problems. Please reply to me directly.
Thanks
Jim Barnett barnett@mcc.com
(512) 338-3607
[Editor's notes:
1. Please forward replies to ml-list as well.
2. If you are willing to make this data set generally available,
you might find 10 people willing to try different algorithms on your
problem. If so, contact "pmurphy@ics.uci.edu"
3. You are probably not going to find an optimal ordering. The best you
can hope for is one that minimizes error on the training data, or one that
trades off error on the training data against simplicity of the
combination function.
4. To me, your task 1 seems harder than tasks 2 or 3 (perhaps because
the combination function in task 1 is so constrained). Do you need task 1
solved, or would you be happy if just task 3 were solved?
5. A good start would be:
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1,
81-106. There are several later papers describing incremental versions
of decision tree learners (those will solve task 3 if this paper
solves task 2 for you). This, as well as a variety of other programs,
will be applicable if you can coerce the training data into a set
of examples, where each example is a set of attributes (perhaps
d1, d2, ..., dn), each attribute takes on a value (e.g., 1,
2, or Eq), and each example belongs to one of a small number of classes
(e.g., Preferred or Not Preferred). For example (a code sketch of this
encoding follows these notes):
<d1:1, d2:1, d3:Eq, d4:2> Preferred
<d1:1, d2:1, d3:Eq, d4:1> Preferred
<d1:2, d2:1, d3:1, d4:2> Not Preferred
From your description, I'm not positive that you can do this.
6. Since you mention "decision lists", here is a good reference:
Pagallo, G., & Haussler, D. (1990). Boolean feature discovery in
empirical learning. Machine Learning, 5, 71-100. This will also
be applicable if the previous system is.
7. A recent volume edited by Shavlik & Dietterich called something
like "Readings in Machine Learning", available from Morgan Kaufmann,
contains reprints of many "classic" papers.
-Mike]
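[Editor's addendum: a minimal sketch (Python, illustrative names only) of
the attribute-value encoding suggested in note 5; whether a pair of parse
trees can sensibly be reduced to a single Preferred/Not Preferred label is
exactly the doubt raised there.

def make_example(disambiguators, tree1, tree2, tree1_is_preferred):
    """Turn one pair of parses into (attributes, class) for a decision-tree
    or decision-list learner."""
    attributes = {"d%d" % (i + 1): d(tree1, tree2)
                  for i, d in enumerate(disambiguators)}
    label = "Preferred" if tree1_is_preferred else "Not Preferred"
    return attributes, label

# e.g. ({'d1': 1, 'd2': 1, 'd3': 'Eq', 'd4': 2}, 'Preferred')
]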
------------------------------
From: Jude Shavlik <shavlik@cs.wisc.EDU>
To: ml@ics.UCI.EDU
Subject: Wisconsin FTP Site for ML Papers
The connectionist community has a very successful archive of postscript
versions of recent papers. This archive, called neuroprose, is located at
Ohio State and managed by Jordan Pollack. I suggest the readers of ML-LIST
emulate this useful method of rapid dissemination of research.
A centrally-managed facility is a lot of work for one person, so I suggest we
create our own ftp sites and announce the availability of papers using
ML-LIST. (Of course, should anyone volunteer to be the central-site manager,
that would be great.)
Toward that end, attached is an announcement of some recent papers,
available from an ftp site at the University of Wisconsin.
Jude Shavlik
The following working papers are available via ftp from the Machine
Learning Research Group of the University of Wisconsin, Madison.
To retrieve them, you can anonymously ftp to: steves.cs.wisc.edu (128.105.2.201)
MLRG WP 91-2
Maclin, R. and Shavlik, J.W., Refining Algorithms with Knowledge-Based
Neural Networks: Improving the Chou-Fasman Algorithm for Protein Folding,
Machine Learning Research Group Working Paper 91-2 (to appear in
"Computational Learning Theory and Natural Learning Systems",
S. Hanson, G. Drastal, and R. Rivest (eds.), MIT Press).
File name: maclin.fskbann.ps.Z
MLRG WP 91-3
Scott, G.M., Shavlik, J.W., and Ray, W.H., Refining PID Controllers
using Neural Networks, Machine Learning Research Group Working Paper 91-3
(to be presented at NIPS-91).
File name: scott.nnpid.ps.Z
MLRG WP 91-4
Towell, G.G. and Shavlik, J.W., The Extraction of Refined Rules from
Knowledge-Based Neural Networks, Machine Learning Research Group Working
Paper 91-4 (to be presented at NIPS-91).
File name: towell.interpretation.ps.Z
The abstract of each paper and detailed ftp instructions follow:
Refining Algorithms with Knowledge-Based Neural Networks:
Improving the Chou-Fasman Algorithm for Protein Folding
Richard Maclin
Jude W. Shavlik
Computer Sciences Dept.
University of Wisconsin - Madison
email: maclin@cs.wisc.edu
We describe a method for using machine learning to refine
algorithms represented as generalized finite-state automata. The
knowledge in an automaton is translated into a corresponding
artificial neural network, and then refined by applying
backpropagation to a set of examples. Our technique for
translating an automaton into a network extends the KBANN
algorithm, a system that translates a set of propositional, non-
recursive rules into a corresponding neural network. The
topology and weights of the neural network are set by KBANN so
that the network represents the knowledge in the rules. We
present the extended system, FSKBANN, which augments the KBANN
algorithm to handle finite-state automata. We employ FSKBANN to
refine the Chou-Fasman algorithm, a method for predicting how
globular proteins fold. The Chou-Fasman algorithm cannot be
elegantly formalized using non-recursive rules, but can be
concisely described as a finite-state automaton. Empirical
evidence shows that the refined algorithm FSKBANN produces is
statistically significantly more accurate than both the original
Chou-Fasman algorithm and a neural network trained using the
standard approach. We also provide extensive statistics on the
type of errors each of the three approaches makes and discuss the
need for better definitions of solution quality for the protein-
folding problem.
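[Editor's note: for readers who have not seen KBANN, here is a toy
illustration, not the authors' code, of the rules-to-network translation
the abstract refers to. A conjunctive rule such as y :- a, b, not c
becomes a unit whose weights and bias make it fire exactly when the rule
does; backpropagation then refines those weights. The weight magnitude and
bias formula below follow my understanding of the usual KBANN recipe; the
paper is the authoritative description, and the finite-state extension
(FSKBANN) is not shown.

W = 4.0   # arbitrary "large" weight magnitude

def rule_to_unit(positive, negated):
    """Translate one conjunctive rule into (weights, bias) over boolean inputs."""
    weights = {a: W for a in positive}
    weights.update({a: -W for a in negated})
    bias = -(len(positive) - 0.5) * W   # net input > 0 only when the rule fires
    return weights, bias

def activate(weights, bias, inputs):
    total = bias + sum(w * inputs.get(name, 0) for name, w in weights.items())
    return 1 if total > 0 else 0        # training would use a sigmoid instead

weights, bias = rule_to_unit(positive=["a", "b"], negated=["c"])
assert activate(weights, bias, {"a": 1, "b": 1, "c": 0}) == 1   # rule fires
assert activate(weights, bias, {"a": 1, "b": 0, "c": 0}) == 0   # missing b
assert activate(weights, bias, {"a": 1, "b": 1, "c": 1}) == 0   # c blocks it
]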
Refining PID Controllers using Neural Networks
Gary M. Scott (Chemical Engineering)
Jude W. Shavlik (Computer Sciences)
W. Harmon Ray (Chemical Engineering)
University of Wisconsin
The KBANN (Knowledge-Based Artificial Neural Networks)
approach uses neural networks to refine knowledge that can be
written in the form of simple propositional rules. We extend
this idea further by presenting the MANNCON (Multivariable Artif-
icial Neural Network Control) algorithm by which the mathematical
equations governing a PID (Proportional-Integral-Derivative) con-
troller determine the topology and initial weights of a network,
which is further trained using backpropagation. We apply this
method to the task of controlling the outflow and temperature of
a water tank, producing statistically significant gains in accu-
racy over both a standard neural network approach and a non-
learning PID controller. Furthermore, using the PID knowledge to
initialize the weights of the network produces statistically less
variation in testset accuracy when compared to networks initial-
ized with small random numbers.
The Extraction of Refined Rules from Knowledge-Based Neural Networks
Geoffrey G. Towell
Jude W. Shavlik
Department of Computer Science
University of Wisconsin
E-mail Address: towell@cs.wisc.edu
Neural networks, despite their empirically-proven abilities, have been little
used for the refinement of existing knowledge because this task requires a
three-step process. First, knowledge in some form must be inserted into a
neural network. Second, the network must be refined. Third, knowledge must be
extracted from the network. We have previously described a method for the
first step of this process. Standard neural learning techniques can accomplish
the second step. In this paper, we propose and empirically evaluate a method
for the final, and possibly most difficult, step. This method efficiently
extracts symbolic rules from trained neural networks. The four major results
of empirical tests of this method are that the extracted rules:
(1) closely reproduce (and can even exceed) the accuracy
of the network from which they are extracted;
(2) are superior to the rules produced by
methods that directly refine symbolic rules;
(3) are superior to those produced by
previous techniques for extracting rules from trained neural networks;
(4) are ``human comprehensible.''
Thus, the method demonstrates that neural networks can be an effective tool
for the refinement of symbolic knowledge. Moreover, the rule-extraction
technique developed herein contributes to the understanding of how symbolic
and connectionist approaches to artificial intelligence can be profitably
integrated.
Detailed FTP Instructions:
unix> ftp steves.cs.wisc.edu (128.105.2.201)
Name: anonymous
Password: <your email address>
ftp> cd pub/
ftp> binary
ftp> get maclin.fskbann.ps.Z
OR... get scott.nnpid.ps.Z
OR... get towell.interpretation.ps.Z
ftp> quit
unix> uncompress maclin.fskbann.ps.Z
OR... uncompress scott.nnpid.ps.Z
OR... uncompress towell.interpretation.ps.Z
unix> lpr maclin.fskbann.ps
OR... lpr scott.nnpid.ps
OR... lpr towell.interpretation.ps
(or use whatever command you use to print PostScript)
------------------------------
END of ML-LIST 3.21