Machine Learning List: Vol. 2 No. 9
Thursday, May 24, 1990
Contents:
Machine Learning Journal special issue on discovery
Report on Change of Representation Workshop
Get Real! Comments on ML 2(8)
The Machine Learning List is moderated. Contributions should be relevant to
the scientific study of machine learning. Mail contributions to ml@ics.uci.edu.
Mail requests to be added or deleted to ml-request@ics.uci.edu. Back issues
may be FTP'd from ics.uci.edu in /usr2/spool/ftp/pub/ml-list/V<X>/<N> or N.Z
where X and N are the volume and number of the issue; ID & password: anonymous
------------------------------
From: ZYTKOW@wsuiar.wsu.ukans.edu
Date: Thu, 24 May 90 19:25 CST
Subject: Machine Learning Journal special issue on discovery
A special issue of the Machine Learning Journal devoted to machine
discovery is planned for the near future. Jan Zytkow will be a guest
editor for that issue.
The papers for the special issue should be submitted by October 30,
1990. All papers will be subject to the standard evaluation procedures.
An ideal paper should satisfy all the ML Journal criteria, as discussed
in various editorial statements. An ideal paper should, of course, make
a substantial contribution to machine discovery.
The preponderance of the work on machine discovery has been done in the
context of discovery in science. There are many good reasons to continue
that direction of research, but other applications are also welcome.
Papers are encouraged on discoveries in databases and virtually any
other application of discovery techniques, provided that the results are
substantial. Because sciences offer us a collection of relatively clear
discovery standards, it is a good idea to compare your method and the
application domain you use with an appropriate scientific counterpart,
and to relate your work to the existing research on scientific
discovery.
An issue of common interest to both the learning and the discovery
researchers is the borderline between discovery and learning, not merely
for the sake of making conceptual distinctions, but because important
research issues can be identified this way for both machine learning and
discovery. The boundary is vague, and although it may never be sharp, we
may try to go beyond the standard supervised vs. unsupervised
distinction and identify the mechanisms, missing from learning systems,
that allow an autonomous discovery system to collect data, decide on
important research problems, and evaluate its own results. By
contrasting the discoverer with the learner we may stimulate the
interest of the broader machine learning community in research on
discovery, and we may discover new problems for our future research.
Both comparisons (non-scientific to scientific discovery; discovery
to learning) should take up no more than about 10% of a paper, ideally
in the discussion following the main results, unless one of these
comparisons is a major motivation for the paper.
If you haven't been contacted yet and you are interested in submitting
a paper, please send a message to Jan Zytkow. You will be added to
the mailing list and may participate in the discussion on the
borderline between discovery and learning.
-Jan Zytkow
my email address: zytkow@wsuiar.wsu.ukans.edu or zytkow@twsuvax.bitnet
my USMail address: Computer Science Department
Wichita State University
Wichita, KS 67208
------------------------------
Subject: Report on Change of Representation Workshop
Date: Wed, 16 May 90 10:33:53 +0100
From: wrobel@gmdzi.uucp
Report about the 2nd Workshop on
Change of Representation and Problem Reformulation
Stefan Wrobel
GMD (German Natl. Research Center for Computer Science)
Research Group Expert Systems
wrobel @ gmdzi.uucp
May 10, 1990
### 1. Introduction ###
The *2nd Workshop on Change of Representation and Problem Refor-
mulation* was competently organized by Jeffrey van Baalen of
Price Waterhouse, an international business consulting firm, and
took place at their research center in Menlo Park, CA, on March
24 and 25, 1990. Even though a public call for papers had been
posted, attendance was by invitation only based on submitted
research summaries and/or extended paper abstracts. The resulting
audience of 34 researchers was highly qualified and focused on the
topic of the workshop, even though not all groups working on
representation adjustment were represented. European participation
was also low.
Each session consisted of two or three plenary talks of 35
minutes including discussion, augmented by special 30-minute
discussion periods that concluded each group of related talks. This
organization provided sufficient discussion time, even though the
summary discussions were often used for questions to individual
presenters. This suggests that longer question periods might have
been beneficial. Since meals were served ``on location'' and the
workshop was small, there was good interaction both within and
between sessions.
The research presented at the workshop reflects a subdivision of
the Change of Representation field into three lines of work, two
of which are oriented towards problem solving, and one which is
oriented towards Machine Learning. A fourth group of researchers
presented theoretically oriented work on representation change
based on category theory. In the following we summarize the
talks in each of those groups. Notably absent from the
workshop was any work on the emergence of representation from
perceptual information. There are no official proceedings, so
papers should be obtained by contacting the authors directly.
### 2. Reformulation for problem solving ###
The first group, which we might call ``reformulation for problem
solving'', is concerned with transformations of a problem's
search space in order to make it simpler, smaller, or more effi-
cient to work with. Those reformulations are usually predeter-
mined and carried out automatically, i.e., here, the focus is not
(yet) on automating the process of designing reformulations.
*Saul Amarel's (Rutgers)* classical work on the Missionaries and
Cannibals problem is a good example. At the workshop, Amarel
discussed this work and its continuations in an invited talk.
*Jeffrey van Baalen (Price Waterhouse)* described a technique
that automatically designs representations by moving general
logical axioms expressing a problem's constraints into those
representations. Such representations reflect constraints
syntactically, e.g., by changing the representation of a predicate
with functional properties into a function, or creating special
representational objects that locally enforce, e.g., symmetry (cf.
hybrid reasoners). He also showed how a theorem prover solves
problems much more efficiently in the representations that the
technique designs.
*Craig Knoblock (CMU)* presented a way of learning problem-
specific abstraction hierarchies for hierarchical planning. His
ALPINE system uses the operators' pre- and postconditions to
determine classes of literals that can be achieved without
interfering with classes higher in the hierarchy; since
condition arguments/instantiations are ignored, the granularity
of the classes is relatively coarse.
The work of *David Zhu and Jean-Claude Latombe (Stanford)* dealt
with reformulation in robot motion path planning. From the ini-
tial representation of the robot as a rectangle and the shapes of
the obstacles, their system computes ``grown'' shapes of the
obstacles that allow the representation of the robot as a point,
making path computations a lot easier.
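For the simplest case the growing step is easy to state. The sketch below assumes an axis-aligned, non-rotating rectangular robot (a simplification; the actual system handles richer shapes); the grown obstacle is the obstacle expanded by the robot's half-extents, a special case of the Minkowski-sum construction used in configuration-space planning:

```python
def grow_obstacle(obstacle, robot_w, robot_h):
    """Expand an axis-aligned rectangular obstacle by the half-extents
    of a robot_w x robot_h rectangular robot.  The grown rectangle
    contains exactly those robot-center positions at which the robot
    would overlap the original obstacle."""
    xmin, ymin, xmax, ymax = obstacle
    return (xmin - robot_w / 2, ymin - robot_h / 2,
            xmax + robot_w / 2, ymax + robot_h / 2)

def point_collides(center, grown):
    """After growing, the robot can be treated as the single point
    `center`, which is what makes path computations so much cheaper."""
    x, y = center
    xmin, ymin, xmax, ymax = grown
    return xmin <= x <= xmax and ymin <= y <= ymax
```

After every obstacle is grown this way, planning proceeds for a point robot in the transformed space.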
Finally, *Daniel Weld (Univ. of Washington)* examined how a sys-
tem can select an appropriate model for a prediction task based
on a given set of models, i.e., the focus of the work is on how
to compare two models, and estimate how their differences will
affect the predictions.
### 3. Compilation for problem solving ###
The second group, which we might call ``compilation for problem
solving'', is concerned with the generation of efficiently exe-
cutable forms of an available general model. In contrast to the
first group, this usually does not involve the introduction of
new representation elements (as in eg. van Baalen's work), but
consists instead of determining which parts of the general model
are relevant for a particular solution, and expressing them in
operational terms to yield a problem solving rule. Explanation-
based learning is of this general type.
At the workshop, *Giuseppe Cerbone and Tom Dietterich (Oregon
State Univ.)* addressed the problem of handling numeric models in
compilation, presenting a compilation-based solution for the
non-linear optimization task of 2-d structural design. Their
system analyzes numerically-derived designs to detect general
geometric patterns that can act as constraints in future design
tasks; new geometric features are created when necessary.
*Nick Flann (also Oregon State Univ.)* presented a method for
dealing with intractable proofs in EBL as they occur, e.g., in
opponent planning, where one needs to operationalize proofs that
include universal quantification over operators. By employing a
more abstract representation of the influence of operators on
goals, the proofs can be restricted to the relevant operators,
allowing an operationalization that is incomplete (applies to
only part of the problem domain), but guaranteed correct.
In short presentations during a special session on ``Reformula-
tion for engineering problem solving'', *Brian Falkenhainer
(Xerox PARC)*, *Mark Shirley (Xerox PARC)*, and *Richard Keller
(NASA Ames)* discussed the general issues in (qualitative) rea-
soning from general models. With respect to representation, they
emphasized the necessity of using multiple models at different
levels of granularity and with different viewpoints (Falk-
enhainer), and being able to combine them into specialized
models with less coverage but better efficiency (Keller).
### 4. Inductive change of representation ###
The third line of research, ``inductive change of representa-
tion'', seeks to automate the process of Change of Representation
by introducing new terms or selecting a better formalism. Most
of the presented work was within the ``traditional'' symbolic
learning framework.
*Ruediger Wirth (UC Irvine)* presented the inverse resolution
``G'' operators that are used in his LFP2 system. They intro-
duce new terms based on the inversion of resolution steps in
several (failed) proofs. This restricts the search space com-
pared to inverse resolution based on one example (as in
Muggleton's CIGOL) and allows the relaxation of the unit clause
restriction, but requires that all relevant facts are known to
prevent overgeneralization.
*Kevin Thompson and Pat Langley (NASA Ames)* described LA-
BYRINTH, an extension of existing conceptual clustering programs
(particularly COBWEB) to inputs using structured (relational)
information. The key idea in the approach is to regard struc-
tured objects as trees based on the ``part-of'' relation. The
system then first clusters the leaves of this tree (the atomic
parts of the structured object). It then replaces those leaves
with their now-known class names, so that the next higher level
in the tree is described by non-structured attributes and can
thus be clustered by repeated application of the same process.
This reuse of discovered terms is new in search-based cluster-
ing.
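The bottom-up recursion can be sketched as follows. The ``clustering'' here is a deliberately trivial stand-in (parts with identical attribute sets share a class name) rather than COBWEB's probabilistic categories, so the sketch shows only the part-of-tree traversal and the reuse of class names at the next level:

```python
def cluster_tree(obj, classes=None):
    """Sketch of LABYRINTH-style clustering over a part-of tree.

    obj is either a dict of primitive attributes (an atomic part) or a
    list of sub-parts.  `classes` maps an attribute description to a
    class name and is shared across calls so that discovered classes
    are reused."""
    if classes is None:
        classes = {}

    def classify(attrs):
        key = frozenset(attrs.items())
        if key not in classes:
            classes[key] = "class_%d" % len(classes)
        return classes[key]

    if isinstance(obj, dict):
        return classify(obj)          # leaf: cluster by raw attributes
    # Composite: cluster each part first, then describe this level by
    # the parts' class names alone -- a flat attribute vector again.
    part_classes = [cluster_tree(part, classes) for part in obj]
    return classify({"part_%d" % i: c for i, c in enumerate(part_classes)})
```

Two chairs built from identical legs and seats end up in the same class: once the legs are classified, both chairs look like the same flat attribute vector.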
*Stefan Wrobel (GMD)* presented the representation adjustment
method used in MOBAL, which also uses structured information, but
introduces terms in a demand-driven fashion, thus avoiding the
resource allocation problem between rule discovery and new term
introduction. The approach exploits rule exception sets as the
basis for forming new concepts, checks the quality of new con-
cepts, and reuses them for restructuring the rule base and future
learning.
*Raj Seshu (Univ. of Illinois)* showed how the introduction of
new features can overcome the ``parity problem'' that occurs in
decision-tree induction (TDIDT) systems (ID3 and successors):
they fail to learn concepts that are defined by the n-ary parity
operation. Seshu's system introduces a small number of parity
features, resulting in improved learning on the parity problem,
but only at the price of required ``bias'' training, and of
slightly worse performance on some other problems. Also, there
was some discussion in the audience as to how important the par-
ity problem is in practice.
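The parity problem itself is easy to reproduce. The sketch below is our own illustration, not Seshu's system: it checks that no single attribute carries any information about the parity class (which is why greedy TDIDT splits fail), and defines the kind of constructed feature that collapses the concept to a single test:

```python
from itertools import product

def parity(bits):
    """Target concept: 1 iff an odd number of the boolean inputs are set."""
    return sum(bits) % 2

def single_attribute_useless(n):
    """Conditioned on the value of any one attribute, the classes remain
    a perfect 50/50 split -- every attribute has zero information gain
    at the root of a decision tree."""
    for i in range(n):
        for v in (0, 1):
            labels = [parity(x) for x in product((0, 1), repeat=n)
                      if x[i] == v]
            if labels.count(0) != labels.count(1):
                return False
    return True

def parity_feature(x):
    """A constructed feature: the XOR of all inputs.  Adding it makes
    the concept learnable by a one-node tree, since the feature value
    equals the class label."""
    f = 0
    for b in x:
        f ^= b
    return f
```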
*Sharad Saxena (Univ. of Massachusetts Amherst)* addressed a
presently completely open problem in representation change: how
to evaluate a given representation. To tackle the problem, he
restricted it to the case of decision tree learning from exam-
ples, and proposed to measure representation quality by the
length of an approximate DNF description of the target concept
that is generated from the examples before the actual learning
algorithm is run. At present, computing this approximate concept
description is more expensive than running the actual learning
algorithm ($n^2$ vs. $n$), but he hopes to be able to reduce the
cost to linear in the future.
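The scoring idea can be illustrated with a naive greedy DNF cover (our own stand-in; Saxena's actual procedure is not described in this report). The score is the number of terms a greedy cover needs; concepts that are simple in the given representation get short covers:

```python
from itertools import combinations, product

def candidate_terms(n, k=2):
    """All conjunctions of at most k literals over n boolean attributes,
    each represented as {attribute_index: required_value}."""
    for size in range(1, k + 1):
        for idxs in combinations(range(n), size):
            for vals in product((0, 1), repeat=size):
                yield dict(zip(idxs, vals))

def covers(term, example):
    return all(example[i] == v for i, v in term.items())

def approx_dnf_length(positives, negatives, n, k=2):
    """Greedily cover the positive examples with terms that exclude all
    negatives; return the number of terms used (None if no such cover
    exists with at most k literals per term).  Lower is better: it means
    the target concept has a short description in this representation."""
    terms = [t for t in candidate_terms(n, k)
             if not any(covers(t, e) for e in negatives)]
    uncovered, length = set(positives), 0
    while uncovered:
        best = max(terms, default=None,
                   key=lambda t: sum(covers(t, e) for e in uncovered))
        if best is None or not any(covers(best, e) for e in uncovered):
            return None
        uncovered -= {e for e in uncovered if covers(best, e)}
        length += 1
    return length
```

A conjunctive concept such as x0 AND x1 scores 1, while XOR over two attributes scores 2, reflecting that parity-like concepts are poorly served by a DNF over the raw attributes.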
*Craig Shaefer (Rowland Inst. for Science)* was the sole ex-
ponent of another learning paradigm, genetic algorithms. His
ARGOT system improves the quality of genetic algorithm learning
by changing the representation employed for the chromosomes, so
that the GA sampling is restricted to parts of the total solu-
tion space. To that end, ARGOT employs several heuristics that
either narrow or widen the space based on the characteristics of
payoff evolution.
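The interval-adjustment idea can be sketched as follows; the trigger conditions and constants are our own illustrative guesses, not ARGOT's actual heuristics. A chromosome is assumed to decode into the real interval [lo, hi], and the heuristics zoom or widen that interval based on where the decoded population sits:

```python
def remap_interval(lo, hi, decoded_population, shrink=0.5, grow=1.5):
    """Adjust the interval a chromosome decodes into.

    If the (trimmed) population range has converged into a narrow
    sub-range, shrink the interval around it so the same number of bits
    gives finer resolution; if individuals crowd a boundary, widen the
    interval in that direction so the optimum is not cut off."""
    xs = sorted(decoded_population)
    span = hi - lo
    trim = len(xs) // 10                       # ignore the extreme 10%
    lo_q, hi_q = xs[trim], xs[-1 - trim]
    if hi_q - lo_q < shrink * span:            # converged: zoom in
        pad = 0.1 * (hi_q - lo_q)
        return lo_q - pad, hi_q + pad
    if xs[-1] > hi - 0.05 * span:              # crowding the upper end
        return lo, lo + grow * span
    if xs[0] < lo + 0.05 * span:               # crowding the lower end
        return hi - grow * span, hi
    return lo, hi
```

Narrowing restricts GA sampling to the promising part of the solution space; widening recovers when the encoding turns out to exclude good payoffs.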
### 5. Theoretical Foundations ###
The field of representation change does not have a solid theoret-
ical foundation yet. One approach that is being favored by a
number of people is category theoretic models. To introduce the
rest of the community to this mathematical theory, one session of
the workshop was devoted to a category theory tutorial given by
*Paul Benjamin (Philips)*, *Mike Lowry (Kestrel Inst.)* and
*Robert Zimmer (Brunel Univ.)*, who also presented a theoretical
investigation of Rubik's cube representations (Benjamin) and
some theoretical results about the nature of homomorphic
representation change as defined by Korf (Lowry).
Stefan Wrobel
Gesellschaft fuer Mathematik und Datenverarbeitung (GMD)
F3/XPS
Pf. 1240
D-5205 St. Augustin 1
Tel.: 02241/14-2093 E-mail: wrobel @ gmdzi.uucp
------------------------------
Date: Thu, 17 May 90 19:16:25 PDT
From: Jeff Shrager <shrager@parc.xerox.COM>
Subject: Get Real! Comments on ML 2(8) &c
I must say that machine learning has gotten pretty boring, what with
trivial (or worse, randomly-generated) test data, a notion of "explanation"
that amounts to logical proof+chunking, and a notion of "concept" that's in
the dark ages of objectivist psycho-philosophy.... Oh, yes, either these or
connectionism and its probable equivalents, which have an infinite
algorithmic search space that no one understands, and so are grounded
in no architectural, goal-directed, or psychological basis.
If those are the answers, I'm no longer interested in the questions!
What ever happened to the goals of understanding human learning,
explanation, development, concept formation? Instead of random test data
or soy bean data, why aren't you feeding your systems protocols of parent
or teacher input gathered from real homes or classrooms? Yeah, I know,
you're trying to get at the deep computational properties and scope of
these algorithms...blah blah blah. There's something to this, but it's a
shame to see a whole community drop into this black hole. And anyway, how
do you know that the algorithms you're spending so much time studying the
details of are even vaguely plausible either as engineering tools or as
psychological models?
Try this. Tape record "All Things Considered" or "MacNeil/Lehrer" one day
and transcribe the explanations that take place there. (Note that the
whole show is really one big explanation, in some very interesting sense of
explanation!) Alternatively, go sit on a park bench on a Saturday, or in a
museum and write down the explanations you hear people giving their
children or between peers...or watch them trying to figure out the exhibits
and try to gently engage these folks in explaining them to you... or even
just write down what the exhibit placards say! If you have children of 2
to 6 years old, or so, pay attention to what you tell them, what your
spouse tells them, or what their friends tell them. I challenge you to fit
these phenomena into something even vaguely related to the ML notions of
data, explanation, activity, or concept.
If real world contact scares you, you can just pick up Holland, Holyoak,
Nisbett, & Thagard's "Induction" (try really really hard to ignore the
computational mumbo-jumbo) and study the excellent examples! Here's the
beginning of a bibliography that every Machine Learning person who cares
about "real" learning and intelligence should read or at least know about.
There are certainly other works that don't happen to be in the tips of my
fingers, and a million papers in, around, and similar to these major works.
I've focussed on books simply to make my typing life easier. I encourage
folks to add to this list their favorite contributions to the
exemplification of real intelligence and learning.
-Jeff
-> On scientific reasoning:
Feyerabend, P. (1975). Against method. New York: Verso. **
Gholson, B., Shadish, W.R., Neimeyer, R.A., & Houts, A.C. (1989) Psychology
of science: Contributions to metascience. Cambridge University Press. %
Hacking, I. (1983). Representing and intervening. Cambridge: Cambridge
University Press. **
Lakatos, I. (1978). The methodology of scientific research programmes.
Cambridge University Press. **
(anything by I. Lakatos)
Mayr, E. (1982). The growth of biological thought. Cambridge, MA: Harvard
University Press. **
Suppe, F. (1977). The structure of scientific theories. Urbana, IL:
University of Illinois Press. (esp. Suppe's critical introduction and
conclusions) *
-> On culture and development:
Bruner, J. (1983). Child's talk. New York: Norton. **
Bruner, J., & Haste, H. (1987). Making sense: The child's construction of
the world. New York: Methuen. *
(anything by J. Bruner)
Geertz, C. (1983). Local knowledge. New York: Basic Books.**
Geertz, C. (1973). The interpretation of cultures. New York: Basic Books.**
Gruber, H.E., & Voneche, J.J. (1977). The essential Piaget. New York:
Basic Books. *
Heritage, J. (1984). Garfinkel and ethnomethodology. Cambridge, UK: Polity
Press. *
Landau, B., & Gleitman, L. R. (1985). Language and experience: Evidence
from the blind child. Harvard University Press. **
Vygotsky, L.S. (1934/1962). Thought and language. Cambridge, MA: MIT
Press and Wiley. **
Wertsch, J.V. (1985). Vygotsky and the social construction of mind.
Harvard University Press. *
-> On not-so-cognitive processes:
Arkes, H.R., & Hammond, K.R. (1986). Judgment and decision making: An
interdisciplinary reader. Cambridge University Press. (esp. those
chapters that give examples of real decisions, e.g., in politics) %
Luhrmann, T.M. (1989). Persuasions of the witch's craft. Cambridge, MA:
Harvard University Press. **
Neisser, U. (1982). Memory observed. New York: Freeman. *
Rogoff, B. & Lave, J. (1984). Everyday cognition: Its development in a
social context. Harvard University Press. %
(anything by B. Rogoff)
Sudnow, D. (?) Ways of the hand.
(I don't have a complete reference for this.)
---
Key:
* Collections or reviews of important work, so providing reviews
of the field (or of some line of research) at some interesting
historical point as well as a place to begin study.
% Collections or anthologies, some contents of which make good top-nodes
for entry into an important literature.
** Major contributions, giving insightful analyses of
activity or historical phenomena.
------------------------------
Date: Mon, 21 May 90 18:11:12 EDT
From: mostow@cs.rutgers.edu
Subject: Re: Get Real! Comments on ML 2(8) &c
Interesting statement. I hope it provokes informative discussion. Naturally I
don't think it applies to my own work, largely because I'm not trying to
model human learning.
On explanation-as-proof: Just because the initial version is simplistic is not
a reason to replace the powerful language and tools of logic and proof with
some new representation whose semantics and properties are less understood. It
is a good reason to extend the notion of explanation to overcome recognized
limitations. Examples of such extensions include work by William Cohen and
others on abductive explanation-based learning, and work by Tom Ellman, Prasad
Tadepalli, Scott Bennett, Steve Chien, Neeraj Bhatnagar, and others on using
simplifying assumptions to derive approximate explanations. Moreover, using
proofs as explanations may be quite reasonable for learning control knowledge;
the problems seem more to do with intractability than with conceptual
impoverishment. There is sometimes an unfortunate tendency to condemn a line
of research prematurely just because it doesn't achieve total immediate success
or suffers from limitations. Unless those limitations are truly intrinsic to
the entire approach, it can pay to figure out how to overcome them.
On plausibility: An emerging and informative (albeit imperfect) plausibility
criterion for learning is based on the PAC paradigm introduced by Valiant.
William Cohen's extension of this paradigm to performance improvement learning
shows that it is not limited to simple pure induction.
On connectionism: If neural networks really have an "algorithmic search space
that no one understands," but manage to do some interesting things anyway, I'd
think that would be a pretty good reason to study them in order to understand
them better.
In general, such broadsides are harder to address than specific criticisms of
specific research. However, they can be healthy insofar as they challenge us
to justify our work and reexamine our goals. Jack Mostow
------------------------------
Date: Mon, 21 May 1990 18:38:57 EDT
From: Jaime Carbonell <jgc@nl.cs.cmu.edu>
Subject: Re: Get Real! Comments on ML 2(8) &c
I do not find Shrager's unfocused attack on Machine Learning
particularly interesting or worthy of much reflection. "Broadside"
was Mostow's appropriate characterization thereof. A much more useful
criticism would focus upon one particular area of ML at a time,
analyze it in depth, and suggest well-thought-out reasons why the
research was not being well conducted, rather than Shrager's blanket
opinion that the field is boring. Experimental Psychology appreciates
the value of thorough empirical testing, of balanced data sets, of
replicable experiments, etc. Mathematics necessarily simplifies the
world in order to model formally some aspect of interest. Computer
Science addresses tractable problems and attempts to push the
boundaries of what can be computed tractably. Machine learning relies
on these disciplines, rather than some bull-in-a-china-shop approach.
--Jaime Carbonell
------------------------------
END of ML-LIST 2.9