Machine Learning List: Vol. 1 No. 2
Saturday, July 15, 1989
Contents: ML-LIST Announcement
UCI Repository of Machine Learning Databases
Grand Challenges
The Cup Domain Theory
Knowledge Integration
The Machine Learning List is moderated. Contributions should be relevant to
the scientific study of machine learning. Mail contributions to ml@ics.uci.edu.
Mail requests to be added to or deleted from the list to ml-request@ics.uci.edu.
----------------------------------------------------------------------
The ML-LIST is locally redistributed by several universities and companies.
If you receive two copies of this message, it means that you are on a local
redistribution list as well as a direct subscriber to ML-LIST. Send a message
to ml-request@ics.uci.edu to be removed.
Please send submissions to ml@ics.uci.edu, not to pazzani@ics.uci.edu.
----------------------------------------------------------------------
Subject: UCI Repository of Machine Learning Databases
Date: Wed, 12 Jul 89 11:26:02 -0700
From: "David W. Aha" <aha@ICS.UCI.EDU>
The purpose of this repository is to centralize a collection of databases
that have been referenced in the machine learning literature. These
databases are of primary interest to researchers who would like to compare
and contrast their supervised and/or unsupervised learning algorithms with
other reported case studies in the literature. We have approximately 40
databases in this repository at this time. It can be ftp'd as follows:
node: ics.uci.edu
location: /usr2/spool/ftp/pub/machine-learning-databases
userid: anonymous
password: anonymous
Extensive documentation exists for each database. An overview is contained
in the HELLO file. We actively solicit contributions of new databases. Our
documentation requirements are outlined in DOC-REQUIREMENTS. Our current
extensions include measuring the "difficulty" of each database, as
approximated by the number of nodes required by a PDP algorithm to attain
convergence.
David W. Aha
----------------------------------------------------------------------
Date: Wed, 12 Jul 89 11:28:24 EDT
From: Prasad.Tadepalli@H.GP.CS.CMU.EDU
Subject: Grand Challenges
From: Michael Pazzani <pazzani@ICS.UCI.EDU>:
> At the recent Machine Learning Workshop, Tom Dietterich gave an interesting
> talk on Grand Challenges for Machine Learning. One of the challenges
> was learning from natural language texts. In some ways, I think it would
> be a good idea for people in machine learning to look at natural language
> processing.
Me too... I think that one reason this is a good problem is that it is a
problem in which the normative theory of performance cannot be isolated
from psychological considerations. Let me try to explain. In AI it is
often said, "this is AI; we do not care what people do or how people
think." As long as you are in a domain in which "good performance" can be
defined without resorting to any psychological considerations (e.g.,
winning most often at chess, solving problems in geometry, etc.), this
argument makes sense. But ultimately AI should address "the other"
problems -- i.e., problems for which there is no normative standard other
than PEOPLE. How do you judge a program that claims to "understand natural
language"? Only by comparing it with people. It should not only understand
the language as well as PEOPLE do, but should also understand that people
are not perfect, and should tolerate and correct the errors that PEOPLE
make. That requires a better understanding of how people think and what
errors they make, and hence closer ties between cognitive psychology and
AI than we currently have. I think this is a good direction in which to
move AI.
> Unlike many explanation-based learning systems, very few NLP systems use
> Horn clauses to represent knowledge and backward-chaining depth-first search
> as an inference process.
If you said this to provoke some controversy, I must disappoint you. I
agree with the latter part of your sentence. But, in fairness to EBL, I must
point out that EBL techniques do not require Horn clause representation of
knowledge or backward-chaining depth-first problem solvers.
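To make the standard presentation concrete: in its simplest case, explanation-based generalization (EBG) over a Horn-clause theory amounts to unfolding the target concept through the domain theory until only "operational" (directly testable) predicates remain. The sketch below uses the familiar cup theory; all rule and predicate names are my own, and, as noted above, EBL does not require this representation.

```python
# Minimal sketch of explanation-based generalization (EBG) in its most
# common Horn-clause presentation.  We unfold the target concept through
# the domain theory until only "operational" predicates remain; the
# collected leaf conditions form the learned rule.  Illustrative only.

RULES = {
    "cup": [("liftable", "?x"), ("stable", "?x"), ("open_vessel", "?x")],
    "liftable": [("light", "?x"), ("has_part", "?x", "handle")],
    "stable": [("has_part", "?x", "flat_bottom")],
    "open_vessel": [("has_part", "?x", "concavity")],
}
OPERATIONAL = {"light", "has_part"}  # predicates we can test directly

def unfold(goal):
    """Expand a goal through RULES, collecting operational leaf conditions."""
    if goal[0] in OPERATIONAL:
        return [goal]
    leaves = []
    for subgoal in RULES[goal[0]]:
        leaves.extend(unfold(subgoal))
    return leaves

# Learned rule:  cup(?x) :- light(?x), has_part(?x, handle), ...
print(unfold(("cup", "?x")))
```

Because every rule here shares the single variable ?x, no unification is needed; a full EBG implementation generalizes the proof of a specific training example, propagating only the variable bindings the theory forces.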
> However, many hard problems in learning from natural language text are
> pure natural language problems that have very little to do with
> learning (e.g., word sense disambiguation, finding the referents of
> noun phrases, etc.). Until progress is made in NLP, I doubt that much
> progress can be made in learning from natural language texts.
I must agree with you again there.
> Perhaps
> a better challenge for the learning community as opposed to the natural
> language community would be to incorporate the hand-coded representations
> of natural language texts into a large memory.
Before this is done, I think some research needs to be done on what the
input (and output) representations must look like. Is Schank's conceptual
dependency the final word on this? Are there any experiments using CD
on real natural language texts? How well did it perform?
> How much progress has the CYC project made on this project?
One thing that intrigues me about the CYC project is that it is not
clear to me whether the knowledge base they are constructing is supposed
to contain "common-sense knowledge" (e.g., running faster is more difficult
than running slower) or more "idiosyncratic knowledge" (e.g., the brain is a
collection of complex neural networks). Also, what are the performance
objectives of the project?
I also think understanding metaphors would go a long way toward this, since
(a) natural language seems to be full of metaphors, and (b) metaphors appear
to be very powerful in communicating new knowledge (e.g. look at any
introductory text on almost any subject). There seems to be almost no
other way to introduce a new subject to people. Any comments?
Prasad Tadepalli
----------------------------------------------------------------------
Date: Tue, 11 Jul 89 10:46 EDT
From: Tom Fawcett <Tom%catherine@gte.COM>
Subject: The Cup Domain Theory
[At the session on combining empirical and explanation-based learning,
Michael Pazzani jokingly put up a slide prohibiting the "cup" domain theory
from presentations because (1) it has no psychological validity (children
learn about cups before they have theories of containers and heat transfer);
(2) its pedagogical value is unnecessary for researchers; and (3) it is an
extreme toy: no interesting issues, such as incomplete or incorrect domain
theories, are present in it.]
I hate cups myself, and wouldn't argue about them, but your points
touch on the larger issue of domains.
1) "No psychological validity" - Why is this relevant? None of the
domain theories were psychologically valid, and none of the
presentations claimed that the methods were cognitively plausible.
2) "Nice pedagogical value - so what". We had 15 mins and 3 pages to
present our work. How much of that would you be willing to sacrifice
to discuss your domain? My impression of a lot of the non-cup-domain
talks was that each presenter had a single very busy slide of the
domain, which he or she was reluctant to spend any time discussing. I
don't blame them, but it makes it difficult to understand what the
method is doing.
3) "Extreme toy" - This is really the only relevant objection, but the
"toy/real world" distinction is trickier than you're probably willing
to admit. I don't consider the "suicide" domain, the "folk
psychology/emotion/anger" domain, the 10-rule economic sanctions domain,
or even the kidnapping domain to be any less of a toy, since they're all
arbitrary simplifications of "real world" domain theories.
So why pick on cups, other than that it was the most popular? The
real problem I see with a toy domain is that there are few rules and
they interact in few (and uninteresting) ways, so it's easy to get a
method to work even if it has real problems.
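That last point can be made concrete with a crude measure: count how many rule pairs actually interact, i.e. one rule's conclusion serving as a condition of another. The helper and the four-rule cup theory below are hypothetical, invented only to illustrate the claim.

```python
# Crude illustration of the "few rules, few interactions" objection:
# count rule pairs where one rule's conclusion appears among another
# rule's conditions.  Helper and theory are hypothetical.

def interaction_count(rules):
    """rules: list of (head_predicate, [body_predicates])."""
    return sum(1 for head, _ in rules
                 for _, body in rules
                 if head in body)

CUP_THEORY = [
    ("cup", ["liftable", "stable", "open_vessel"]),
    ("liftable", ["light", "has_handle"]),
    ("stable", ["has_flat_bottom"]),
    ("open_vessel", ["has_concavity"]),
]
print(interaction_count(CUP_THEORY))  # 3 -- every support rule feeds only "cup"
```

In a theory this sparse a method is exercised on essentially one proof shape, so success there says little about how it would handle richly interacting rules.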
-Tom
----------------------------------------------------------------------
Date: Fri, 14 Jul 89 09:45:04 -0700
From: Kenneth Murray <murray@cs.utexas.EDU>
Subject: knowledge integration
Michael Pazzani's note (7/9/89) comments on one of Tom Dietterich's Grand
Challenges for machine learning, specifically the challenge of learning
from natural language text. Pazzani suggests that the task of incorporating
hand-coded representations of natural language text into a knowledge base
might be a better challenge because it includes many of the same learning
issues without the complexities of natural language processing. In fact,
the learning task Pazzani proposes is essentially the task of knowledge
integration, also mentioned in Dietterich's Grand Challenges talk.
Knowledge integration is the incorporation of new information into
existing knowledge. My dissertation research with Bruce Porter involves
studying knowledge integration in the context of incorporating new,
hand-coded knowledge structures into the Botany Knowledge Base, a large
knowledge base representing plant anatomy and physiology developed in
collaboration with Lenat's CYC project.
Knowledge integration is an important learning task because new
information often has significant consequences for existing knowledge.
For example, new information may conflict with existing knowledge;
knowledge integration identifies and resolves such conflicts.
Alternatively, new information may explain or be explained by existing
knowledge; knowledge integration identifies and generalizes such
support relationships. Knowledge integration is difficult because
interactions between new information and existing knowledge can be
numerous and subtle. Furthermore, the task of knowledge integration is
ubiquitous: it occurs (to some extent) whenever a person comprehends
new information, and it is required whenever a knowledge base is
extended.
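As a toy illustration of one step described above -- new information conflicting with existing knowledge -- the sketch below integrates new ground facts into a knowledge base and flags contradictions rather than silently overwriting beliefs. The representation and the botany facts are invented for illustration; this is not the KI system or the Botany Knowledge Base.

```python
# Toy sketch of one knowledge-integration step: detecting conflicts
# between new information and existing knowledge.  A knowledge base is a
# map from ground propositions to truth values; a conflict is the same
# proposition asserted with the opposite value.  Illustrative only.

def integrate(kb, new_facts):
    """Return (updated_kb, conflicts).  Conflicting facts are flagged for
    resolution, and the existing belief is kept in the meantime."""
    updated = dict(kb)
    conflicts = []
    for prop, value in new_facts.items():
        if prop in updated and updated[prop] != value:
            conflicts.append(prop)
        else:
            updated[prop] = value
    return updated, conflicts

kb = {("annual", "maize"): True, ("has_part", "maize", "tassel"): True}
new = {("annual", "maize"): False, ("photosynthetic", "maize"): True}
updated, conflicts = integrate(kb, new)
print(conflicts)  # [('annual', 'maize')]
```

Identifying the conflict is the easy half; resolving it, and searching out the subtler consequences (support relationships and their generalizations), is where the real work described above lies.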
I'd be very happy to provide additional information on this research
to anyone who is interested.
Ken Murray
Murray, K.S.
``KI: An Experiment in Automating Knowledge Integration,''
(Ph.D. Dissertation Proposal), University of Texas Department
of Computer Sciences Technical Report, AI-TR88-90, October 1988.
Murray, K.S., and Porter, B.W.,
``Developing a Tool for Knowledge Integration: Initial Results,''
Proceedings of the Third Knowledge Acquisition for Knowledge-based
Systems Workshop, November, 1988.
Murray, K.S., and Porter, B.W.,
``Controlling Search for the Consequences of New Information
During Knowledge Integration,''
Proceedings of the Sixth International Workshop on Machine Learning,
July, 1989.
Bareiss, E.R., Porter, B.W., and Murray, K.S.,
``Supporting Start-to-Finish Development of Knowledge Bases,''
(to appear in Machine Learning, special issue on Knowledge Acquisition,
1989).
----------------------------------------------------------------------
End of ML-LIST 1.2