Machine Learning List: Vol. 1 No. 3
Saturday, July 22, 1989
Contents:
ML-LIST Announcement
cups
Isolated examples, micro-worlds, mini-worlds, and "emotion world"
UMR/MDRL Repository of ML Algorithms
Job Announcement
The Machine Learning List is moderated. Contributions should be relevant to
the scientific study of machine learning. Mail contributions to ml@ics.uci.edu.
Mail requests to be added or deleted to ml-request@ics.uci.edu.
----------------------------------------------------------------------
1. The last message in this list is a job announcement. If you feel that such
announcements should or should not be included in future mailings, send
mail to ml-request@ics.uci.edu
2. Please keep messages brief (one or, at most, two 24-line screens of text).
----------------------------------------------------------------------
Subject: cups (Re: ML-LIST 1.2 submission by Tom Fawcett).
Date: Mon, 17 Jul 89 16:06:47 -0700
From: Paul O'Rorke <ororke@trix.ICS.UCI.EDU>
Tom, if you're saying you have no problem with the use of "cups" as an
illustrative example in EBL talks, I agree with you. I also agree
with your statement that one problem with toy domains is that if there
are few rules and they interact in few and uninteresting ways, it is
easy to get a method to work, sidestepping problems that will have to
be addressed in order to get something that scales up to more complex
domains.
The problem with your defense of "cups" (if your defense is that it's
no worse than "domains" such as suicide, economic sanctions, emotions,
and kidnapping) is that you oversimplify when you lump them all
together as "arbitrary simplifications of 'real world' domain
theories." Jerry DeJong has a nice distinction between "good
kludging" and "bad kludging" that is relevant here. In the good kind,
less relevant problems are deliberately temporarily sidestepped in
order to make progress and avoid getting mired down. In the bad kind,
major aspects of the problem under attack are ignored when they really
must be addressed in order to make real progress. This is called
"cheating" when it's deliberate, but it isn't always deliberate.
Sometimes important problems simply aren't noticed because they don't
show up in examples, while less important issues that do show up in
the examples appear to be more important than they really are.
I believe the isolated examples and micro-worlds used in early EBL
research resulted in some confusion of this sort. For example,
deduction was used as a model of the explanation process in early
models of EBL and operationality was viewed as a very important issue
(the Mitchell et al. model of EBL did not learn at the knowledge level
but translated non-operational concepts into operational ones). While
this was a fine first cut, when you look at more realistic examples
and more psychologically valid explanations, you see that it is
important to be able to leap to conclusions in constructing
explanations, and this is better modelled using abduction algorithms
capable of non-deductive inference. Issues such as plausibility then
become very important. Learning occurs at the knowledge level (see
[1]) and operationality isn't required to play the crucial role it
played in the purely deductive model of EBL.
Most people interested in EBL believe it is important to move on to
more sophisticated combinations of abduction and learning and to test
them on more complex, realistic domains. This is exactly what we are
trying to do in my group at UCI, and we have invested a lot of work on
the emotion knowledge-base as part of our effort. It pains me to see
the results of all this effort characterized as just another
"arbitrary simplification of a 'real world' domain theory."
Most philosophers of science believe that all theories are
approximations. Few, if any, hold out hope that we will ever have a
complete and correct theory of reality (whatever that is). But this
doesn't mean any theory is as good or as accurate as any other. Like
any theory, the emotion kb involves approximations and
simplifications. But this does not imply that it is no better than
"cups" as a testbed for research on abduction and learning.
REFS
1. O'Rorke, P., Morris, S., and Schulenburg, D., Theory Formation by
Abduction: Initial Results of a Case Study Based on the Chemical
Revolution, in Proceedings of the Sixth International Machine Learning
Workshop, held at Cornell University, Ithaca, New York, June 28-July 1,
1989, A. Segre (Ed.), Morgan Kaufmann Publishers, Inc., Los Altos,
California.
---Paul O
----------------------------------------------------------------------
Subject: Isolated examples, micro-worlds, mini-worlds, and "emotion world"
Date: Mon, 17 Jul 89 16:21:35 -0700
From: Paul O'Rorke <ororke@trix.ICS.UCI.EDU>
In Machine Learning List 1.2, Tom Fawcett <Tom%catherine@gte.COM> writes:
...in the context of a discussion of whether cups are a bad domain...
>I don't consider the "suicide domain" or the "folk
>psychology/emotion/anger" domain or the 10-rule economic sanctions
>domain, or even the kidnapping domain to be any less toy, since
>they're all arbitrary simplifications of "real world" domain theories.
>---Tom
No knowledge-base or theory will ever render an absolutely perfect
picture of any real world domain. We don't really want this in any
case because abstraction and simplicity are virtues of good theories.
However, it is still possible to distinguish between different degrees
of simplicity, completeness, and correctness. I like to use the terms
"isolated examples", "micro-worlds", and "mini-worlds" to mark
different domain levels. The point is that while some of the
so-called domains mentioned in the passage above consist of very
small sets of rules and facts covering at most a handful of isolated
examples, others are more complex and complete, making fewer
simplifying assumptions and covering relatively large sets of
examples. Like micro-worlds, mini-worlds should be large enough to
expose problems with proposed methods yet small enough to be
manageable. They should cover a semantic field if possible or at
least a large set of examples. A mini-world should drop major
assumptions associated with micro-worlds (such as the "one agent"
assumption making blocks-worlds so unrealistic).
The emotion kb mentioned above is under development as part of a
collaborative project funded by NSF grants at UCI (Paul O'Rorke) and
Illinois (Andrew Ortony and Jerry DeJong). One of the goals of the
collaboration is to move away from isolated examples and unrealistic
micro-worlds such as blocks world. At this point the knowledge base
associated with the emotion domain already contains a relatively large
number of rules and facts and covers on the order of a hundred
examples.
The kb is based on a theory of emotions developed over nearly a decade
by a cognitive scientist, a social psychologist, and a cognitive
psychologist (Andrew Ortony, Gerald Clore, and Allan Collins). The
theory is based on the psychological literature on emotions and on
psychological experiments conducted at Illinois and elsewhere. The 22
emotion types described by the theory cover roughly 500 emotion words
from the literature on emotions. Ortony's main goal in collaborating
on the development of the knowledge base is to elaborate the emotion
theory and explore its adequacy.
I am more concerned with the AI side of the project than with the
cognitive psychology side. That is, I am more concerned with using
the emotions kb as a way of guiding research and testing ideas about
integrated systems for abduction, plan recognition, and learning.
Some important issues that can be addressed in this domain are listed
below.
1. Issues of plausible reasoning associated with the construction of
explanations can be addressed. I believe this is as important as (and
not the same as) the "operationality" issue that has played such a
large role in early EBL research. Most of the causal associations in
the emotion theory are "soft" and some are stronger than others.
There are many interactions between inferences. Many inferences
supported by the emotion theory are mutually consistent (even
reinforcing each other) while other inferences are inconsistent. We
can test proposed measures of plausibility and explanatory coherence
using the emotion kb.
2. Issues of search control can be addressed. We have a large number
of rules, facts, and examples. This has forced us to use heuristics
to control the search for good explanations. We can do computational
experiments comparing and evaluating different search control methods.
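The two issues above, plausibility and search control, can be illustrated
together with a minimal sketch. It is only an illustration under stated
assumptions, not the system or the rules of the emotion kb: the predicates,
the rule strengths, the product-of-strengths plausibility measure, and the
names RULES and explain are all hypothetical. The sketch chains backward from
an observation through "soft" causal rules and uses a best-first heuristic,
with a cap on expansions, to decide which partial explanation to extend next.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical "soft" causal rules: effect -> [(candidate cause, strength)].
# Strengths lie in (0, 1]; these values are invented for illustration.
RULES: Dict[str, List[Tuple[str, float]]] = {
    "angry(john)": [("blames(john, mary)", 0.8), ("tired(john)", 0.3)],
    "blames(john, mary)": [("broke(mary, vase)", 0.9)],
}


def explain(
    observation: str,
    rules: Dict[str, List[Tuple[str, float]]],
    max_expansions: int = 100,
) -> List[Tuple[float, List[str]]]:
    """Best-first abductive search for explanations of an observation.

    A search state is a chain of hypotheses starting at the observation.
    Its plausibility (an assumed measure) is the product of the strengths
    of the rules used. A chain is complete when its last hypothesis has no
    rule that explains it, so it must simply be assumed.
    """
    # heapq is a min-heap, so plausibility is stored negated.
    frontier: List[Tuple[float, List[str]]] = [(-1.0, [observation])]
    explanations: List[Tuple[float, List[str]]] = []
    expansions = 0
    while frontier and expansions < max_expansions:
        neg_plausibility, chain = heapq.heappop(frontier)
        expansions += 1
        causes = rules.get(chain[-1], [])
        if not causes:
            explanations.append((-neg_plausibility, chain))
            continue
        for cause, strength in causes:
            if cause in chain:  # do not explain a hypothesis by itself
                continue
            heapq.heappush(frontier, (neg_plausibility * strength, chain + [cause]))
    return explanations


if __name__ == "__main__":
    for plausibility, chain in explain("angry(john)", RULES):
        print(round(plausibility, 2), " <- ".join(chain))
```

Because every strength is at most 1, plausibility never increases as a chain
grows, so the best-first order returns complete explanations from most to
least plausible. Replacing the product with another measure, or the heap with
another strategy, is the kind of comparison the computational experiments
mentioned above would make.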
REFS
O'Rorke, P., and Cain, T., Explanations Involving Plans and Emotions,
in Proceedings of the AAAI-88 Workshop on Plan Recognition.
Ortony, A., Clore, G. L., and Collins, A., The Cognitive Structure of
Emotions, Cambridge University Press, New York, 1988.
---Paul O
----------------------------------------------------------------------
From: "Dan St.Clair, Professor" <C0567@UMRVMB.UMR.EDU>
Subject: UMR/MDRL Repository of ML Algorithms
Date: Wed, 19 Jul 89 08:52:01 CDT
We are currently in the process of putting together a repository of machine
learning algorithms as noted below. Your comments and suggestions are most
welcome. Some questions you might consider are: Is it a good idea? Is this
the way to go about it? etc.
UMR/MDRL
REPOSITORY OF MACHINE LEARNING ALGORITHMS
Sponsored by: University of Missouri - Rolla
McDonnell Douglas Research Laboratories
The University of Missouri - Rolla in conjunction with McDonnell
Douglas Research Laboratories plans to establish a Repository of
Machine Learning Algorithms. This database will contain
algorithms which have been developed by researchers in machine
learning. The purpose of this database is not only to provide
researchers with access to the work of others but also to foster
communication between machine learning researchers. The database
will complement the current databases of "real" data which are
available to researchers.
Researchers from all areas of machine learning will be invited to
contribute machine learning code and a description including: who
wrote the code (including any desired copyright notice), what the
code does, how it does it, input/output requirements, and
hardware/software requirements. This code will be tested, as
facilities permit, to determine if correct instructions have been
given for using the code and if the code is executable using the
hardware/software specified. Algorithms passing these tests will
then be added to the database.
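As a rough sketch of what one repository entry and its acceptance check might
look like, assuming a simple record type: the field names, the Submission
class, and the two functions below are hypothetical and are not part of the
announcement. Only the list of requested items and the two tests are taken
from the paragraph above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Submission:
    """One contributed algorithm plus the description requested above."""

    author: str                     # who wrote the code
    purpose: str                    # what the code does
    method: str                     # how it does it
    io_requirements: str            # input/output requirements
    platform_requirements: str      # hardware/software requirements
    copyright_notice: str = ""      # optional ("any desired copyright notice")
    instructions_correct: bool = False      # result of the first test
    runs_on_stated_platform: bool = False   # result of the second test


REQUIRED = ["author", "purpose", "method", "io_requirements", "platform_requirements"]


def missing_fields(sub: Submission) -> List[str]:
    """Return the names of required description fields left blank."""
    return [name for name in REQUIRED if not getattr(sub, name).strip()]


def accepted(sub: Submission) -> bool:
    """A submission is added only if it is fully described and passes both tests."""
    return (
        not missing_fields(sub)
        and sub.instructions_correct
        and sub.runs_on_stated_platform
    )
```

A submission with a blank description field, or one that fails either test,
would be held back under this sketch rather than added to the database.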
Algorithms contained in the database will be accessible through
Internet at no cost to users. The database will be available 24
hours a day, 7 days a week with the exception of periods when the
hardware may be down for scheduled maintenance.
Estimated start date: Fall 1989
Your comments and suggestions would be most appreciated. Please
address comments and suggestions to:
Dan St. Clair
Professor of Computer Science
University of Missouri - Rolla
Graduate Engineering Center in St. Louis
St. Louis, Missouri 63121
Tel: (314) 553-5431
E-mail: C0567@umrvmb.umr.edu
----------------------------------------------------------------------
Date: Fri, 21 Jul 89 09:44:32 EDT
From: Gregory Piatetsky-Shapiro <gps0@gte.COM>
Subject: Open position in an exciting new project on Knowledge Discovery
I am currently looking for a dynamic, bright researcher/implementer
to join my project on Knowledge Discovery in Databases.
The project's aim is to develop and apply methods for analyzing very
large production databases and extracting from them condensed "knowledge":
rules, metadata, domain hierarchy, etc. The methods under consideration are a
combination of inductive, knowledge-based, statistical, and database approaches.
Possible applications include giving high-level and approximate
answers to user queries, semantic query optimization, and inducing diagnostic
expert systems from data.
The ideal candidate would be a (relatively) fresh Ph.D. or an advanced
M.S. with a knowledge of Machine Learning, Databases, and Statistics.
The work will include implementation of new algorithms, experimentation
with existing algorithms and packages, some theoretical analysis, and help in
defining applications. Implementation will be done using LISP and/or C
on a workstation/PC. Parallel hardware is also under consideration.
If you know any candidate who is interested and approximates the ideal,
please send me a resume ASAP.
Sincerely,
Gregory Piatetsky-Shapiro
GTE Laboratories, MS-45
40 Sylvan Road
Waltham MA 02254 USA
email: gps0%gte.com@relay.cs.net
CSnet: gps0@gte.com
fax: (617) 890-9320
phone: (617) 466-4236
----------------------------------------------------------------------