Machine Learning List Vol. 6 No. 30
Sunday, November 27, 1994
Contents: MLnetNEWS 3.1
The Machine Learning List is moderated. Contributions should be relevant to
the scientific study of machine learning. Mail contributions to ml@ics.uci.edu.
Mail requests to be added or deleted to ml-request@ics.uci.edu. Back issues
may be FTP'd from ics.uci.edu in pub/ml-list/V<X>/<N> or N.Z where X and N are
the volume and number of the issue; ID: anonymous PASSWORD: <your mail address>
URL- http://www.ics.uci.edu/AI/ML/Machine-Learning.html
----------------------------------------------------------------------
Date: Fri, 25 Nov 94 16:43:20 0000
From: MLnet Admin <mlnet@csd.abdn.ac.uk>
Subject: MLnetNEWS 3.1 - October 1994
**************************************
MLnetNEWS 3.1 October 1994
The Newsletter of the European Network
of Excellence in Machine Learning
**************************************
If you wish to receive the original newsletter in hard copy, please
send your *postal* address (i.e. NOT email) to:
mlnet@csd.abdn.ac.uk
**************************************
Contents:
- News from the Management Board and Technical Committees.
- ML Research at Three New Nodes of MLnet
- University of Helsinki, Finland
- Imperial Cancer Research Fund, London, UK
- Polytechnic of Milan, Italy.
- ECML95, 8th European Conference on ML, Call for Papers.
- ECML95 MLnet Familiarization Workshops, Call for Papers
- Knowledge Level Modelling and ML
- Learning Robots
- Statistics, ML, and Discovery in Data Bases.
- ECML97, Call for Proposals.
- ELSnet/MLnet Joint Workshop on ML of Natural Language and Speech,
Call for Participation.
- Report on the 4th Int. Workshop on Inductive Logic Programming (ILP94).
- Inductive Logic Programming European Scientific Network, ILPNET.
- What's new in the MLnet ML Archive at GMD.
- Procedures for Joining MLnet.
**************************************
**************************************
**************************************
News from the Management Board and the Technical
Committees Meetings at Dourdan, September '94
The latest set of Management Board and Technical Committee
meetings was squeezed in between the Industrial Liaison Workshop
(organised by Yves Kodratoff) and the 1994 Summer School in ML
(organised by Celine Rouveirol). All meetings were held at the
Dourdan VVF (Villages de Vacances Familiales), a very comfortable
complex set in a wooded area some 50 kilometres from central
Paris. Both the Industrial Liaison Workshop and the Summer School
were well attended. The organisers are to be congratulated on
their programmes, as is Dolores Cañamero (LRI) on her efficient
running of both events. Technical reports on these events will be
found in the next issue of MLnetNEWS.
Below I give highlights of the Management Board and Technical
Committee decisions:
Management Board
Three new Associate Nodes were accepted:
- ICRF (Imperial Cancer Research Fund, London)
- University of Helsinki
- Polytechnic of Milano
(See pages 2-5 for a research profile for each group.)
The Management Board agreed to revise the criteria which are
normally applied for Industrial Nodes to be accepted as
Associate Nodes. Candidate nodes must now:
- show a substantial involvement in an activity relevant to
MLnet over the last year,
- have planned relevant activity for at least the next year,
- be willing to commit up to 2 days annually to MLnet
business (attending meetings etc.).
The joint MLnet/ELSNET Workshop is to be held on 2 and 3
December (1994). See page 13 for further details.
Electronic Communication TC
Good progress has been made by GMD in installing the FTP
server; however problems have been encountered by Amsterdam
over the installation of the Andrew File System as it seems to
raise a number of system security issues.
Industrial Liaison TC
The Industrial Database has a sizeable number of entries -
particularly French and Dutch ones. Other groups are invited to
assist with this work; please contact Yves Kodratoff.
Research TC
The planning of ECML95 is progressing well. See page 7 for the
details of the Familiarization Workshop to be held at
Heraklion, immediately following the ECML95 conference. (Note
particularly that there are 2 sets of deadlines for receipt of
contributions to the workshops.)
The International Machine Learning Conference in 1996 will be
held at Bari (Italy). Lorenza Saitta will be the Program
Chair, and Floriana Esposito will be the Local Chair.
The revised "State of the Art" document is being edited and
will be available early in 1995.
Framework 4: calls for some topics are expected in December,
with a closing date in March 1995.
70 or so entries already exist on the Research Database. This
will be transferred shortly from Macintosh to UNIX and will be
distributed.
Training TC
A training questionnaire will be circulated to European
Computing Science/AI departments shortly. (If you have not
received one and would like to complete one, please contact
Katharina Morik.)
Written Communication TC
It was agreed that the Newsletters for the next year would be
produced in October, February and May.
**************************************
**************************************
**************************************
Machine Learning Research at the Department of Computer
Science, University of Helsinki
University of Helsinki is the largest university in Finland. The
Department of Computer Science was founded in 1967. The faculty of
the department currently comprises about 70 teachers and
researchers. The number of students majoring in computer science
is about one thousand.
The main research areas of the department are algorithms and data
structures, machine learning, neural computing, data mining,
database design, structured text databases, knowledge-based
systems, logic databases, knowledge representation and reasoning,
open distributed processing, modelling of concurrency, and
performance analysis.
The Research Group
There are many separate research projects currently active on
machine learning and closely related topics at the site. An
"umbrella" project, Pattern Matching and Data Mining (PMDM),
facilitates cooperation between the separate but related groups.
The number of faculty members involved in the PMDM project is 10.
In addition, several students work on machine learning topics.
Machine learning related projects are headed by Professors Esko
Ukkonen (algorithmic learning theory, string databases, biological
applications) and Heikki Mannila (algorithmic learning theory,
data mining, text databases). The other researchers whose interest
mainly focuses on machine learning are Helena Ahonen (grammatical
inference), Tapio Elomaa (classifier learning), Jyrki Kivinen
(algorithmic learning theory), Hannu Toivonen (knowledge
discovery), and Jaak Vilo (biological applications).
Research Topics
Currently our research on machine learning is directed mainly to
the following topics.
Concept Learning [2,3,6,7,10]
Learning algorithms for rule sets. We have developed new types
of algorithms for learning rule sets: hierarchical classifiers
and ripple-down rule sets suit some applications better than
traditional production rule/decision list approaches. Further
development of the algorithms and testing of their
applicability to biological data is under way.
Learning decision trees. Decision tree learning is one of the
most extensively covered topics of concept learning.
Nevertheless, several open problems still remain. Further
development of existing techniques is desired. Application of
the methodology to real-world problems is also within our
interest. Both theoretical and empirical lines of research are
followed.
Program development. We have developed systems that facilitate
empirical concept learning both in the traditional attribute-
based framework and in the more recent string learning
approaches.
Data Mining [5,8,9]
Choosing interesting rules. Large numbers of rules are
typically generated in knowledge discovery systems. However,
only a portion of the rules fulfilling the confidence criterion
are interesting to the user. We are exploring different
measures for choosing the interesting rules from among the
large collection of all rules.
Finding association rules from sequential data. Time-dependent
events (e.g., the fault reports in a telephone network) make up
an order-dependent sequence of actions. Identifying related
actions in such a sequence is a little-studied area with many
interesting open problems.
Data dependencies in knowledge discovery. We study the
expressive power of data dependencies, a well-known database
technique, and their applicability to knowledge discovery in
databases.
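As an illustration of the confidence criterion mentioned under
"Choosing interesting rules", the following minimal Python sketch
computes the support and confidence of a rule in the usual way for
association rules. The function names and the toy data are assumptions
made for this example only; they are not taken from the group's systems.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Confidence of the rule lhs -> rhs: support(lhs and rhs) / support(lhs)."""
    lhs_support = support(transactions, lhs)
    if lhs_support == 0:
        return 0.0  # the rule never fires, so treat its confidence as zero
    return support(transactions, set(lhs) | set(rhs)) / lhs_support


# Hypothetical toy data, invented for illustration only.
transactions = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"B", "C"}),
]

rule_confidence = confidence(transactions, {"A"}, {"B"})
```

A rule would pass the confidence criterion when this value reaches a
user-chosen threshold; measures of interestingness are then needed to
rank the rules that pass.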
Text and String Databases [1,4]
Description languages for text databases. Large text databases
often incorporate structured parts. Knowing the structure of
the text could be utilised by the user. It is, however,
practically impossible to describe the structure by hand.
Semiautomatic methods for inferring the description of the
structure incorporated in a text database are being developed.
String databases. The query languages of existing string
databases are predominantly based on ad-hoc techniques.
Increasing interest in biological data (e.g., DNA-sequences)
has underlined the deplorable state of affairs in the field. We
are working towards a theoretically solid extension of the
relational data model on which the implementation and query
language design of string databases can be based.
Recent Publications
[1] H. Ahonen, H. Mannila & E. Nikunen. Forming grammars for
structured documents: An application of grammatical inference.
In R. Carrasco & J. Oncina (eds.), Proc. Second International
Colloquium on Grammatical Inference and Applications, LNAI
862 (pp. 153-167). Springer-Verlag, 1994.
[2] T. Elomaa. In defence of C4.5: Notes on learning one-level
decision trees. In W. Cohen & H. Hirsh (eds.), Machine Learning:
Proc. Eleventh International Conference (pp. 62-69). Morgan
Kaufmann, 1994.
[3] T. Elomaa, N. Holsti & I. Hyv TELA. A platform for
experimenting with attribute-based learning programs.
Submitted.
[4] G. Grahne, M. Nykänen & E. Ukkonen. Reasoning about strings in
databases. In Proc. Thirteenth ACM SIGACT-SIGMOD-SIGART
Symposium on Principles of Database Systems (pp. 303-312). ACM
Press, 1994.
[5] J. Kivinen & H. Mannila. The power of sampling in knowledge
discovery. In Proc. Thirteenth ACM SIGACT-SIGMOD-SIGART
Symposium on Principles of Database Systems (pp. 77-85). ACM
Press, 1994.
[6] J. Kivinen, H. Mannila & E. Ukkonen. Learning rules with local
exceptions. In J. Shawe-Taylor & M. Anthony (eds.),
Computational Learning Theory: EuroCOLT 93 (pp. 35-46).
Clarendon Press, 1994.
[7] J. Kivinen, H. Mannila, E. Ukkonen & J. Vilo. An algorithm for
learning hierarchical classifiers. In F. Bergadano & L. De Raedt
(eds.), Machine Learning: ECML-94, LNAI 784 (pp. 375-378).
Springer-Verlag, 1994.
[8] H. Mannila & K.-J. Räihä. Algorithms for inferring functional
dependencies. Data & Knowledge Engineering 12 (pp. 83-99), 1994.
[9] H. Mannila, H. Toivonen & I. Verkamo. Efficient algorithms for
discovering association rules. In U. Fayyad & R. Uthurusamy
(eds.), AAAI Workshop on Knowledge Discovery in Databases (pp.
181-192). Seattle Wa., 1994.
[10] I. Sillitoe & T. Elomaa. Learning decision trees for mapping
the local environment in mobile robot navigation. In Proc.
MLC-COLT Workshop on Robot Learning (pp. 119-125). New
Brunswick NJ, 1994.
**************************************
**************************************
**************************************
Machine Learning Related Research at the Imperial Cancer
Research Fund (London)
The Imperial Cancer Research Fund (ICRF) is a medical charity
devoted to understanding and treating cancer. Machine learning
related research at ICRF is concentrated in three laboratories:
the Biomolecular Modelling Laboratory, the Biomedical Informatics
Unit, and the Advanced Computing Laboratory.
Biomolecular Modelling Laboratory Research Interests
Machine Learning - ILP
In collaboration with the Turing Institute and Oxford University
Computing Laboratory, we have pioneered the application of
Inductive Logic Programming (ILP) algorithms to real-world
problems. This work has highlighted areas of improvement for ILP
as well as providing new biological insights. ILP algorithms are
recognised as one of the most important recent advances in symbolic
machine learning. Work on ILP has produced important new
theoretical insights into the nature of learning, as well as
opened up new application areas for learning methods.
Machine Learning Applied to Drug Design
Over the last three years we have extensively investigated the
application of machine learning to drug design. On two important
drug design problems, trimethoprim analogues and triazine
analogues, we compared the performance of: symbolic machine
learning algorithms (CART, M5, GOLEM), a neural network algorithm
(backpropagation), and standard statistical methods. All the
methods were found to give comparable accuracy. However symbolic
machine learning algorithms, in particular ILP methods, produced
rules which gave more chemical insight into the design problems.
We are currently extending this work, using a new ILP algorithm
PROGOL, to learn about arbitrary chemical compounds, not just a
series of analogues. We are testing this method on a problem of
predicting compound mutagenesis.
Machine Learning Applied to Protein Structure Prediction
For eight years, the Biomolecular Modelling Laboratory has been
applying symbolic machine learning to the problem of predicting
protein structure. This is one of the most important unsolved
problems in molecular biology. We have studied two sub-problems
in particular: predicting secondary structure from primary
structure, and predicting secondary structure packing. Both
propositional and ILP symbolic machine learning algorithms have
been applied to the problem of predicting secondary structure.
These machine learning methods have shown themselves capable of
forming rules that are easier for humans to understand than those
of alternative techniques, and for the sub-problem of predicting
the secondary structure of proteins of type α/α, they are more
accurate than alternative techniques.
In collaboration with the Biomedical Informatics Unit, we have
used the ILP program, GOLEM, to learn secondary structure packing
rules for a subset of proteins known as α/β domains. This work
produced a number of interesting findings, some of which were
previously known, others that were novel and gave new insight into
the folding of proteins. These rules are being evaluated for
incorporation into constraint-based methods of protein structure
prediction.
Biomedical Informatics Unit Research Interests
Knowledge Acquisition in Molecular Biology
We have developed knowledge engineering methods for scientific
reasoning in Molecular Biology (specifically protein topology
prediction) and argued that an important aspect of this process
could be characterized as constraint satisfaction. A formal model
was presented which is capable of representing (a) the appropriate
declarative knowledge of molecular biologists for this task
(hypothesis generation processes and consistency relations) and
(b) the strategic reasoning operations this declarative knowledge
must support (data inclusion, hypothesis generation, consistency
checking, hypothesis updating, conflict resolution and data
exclusion). On the basis of these analyses we developed a
logical simulation capable of duplicating the reasoning steps
employed by molecular biologists. Subsequently these ideas have
been incorporated into a software tool capable of dealing with
real data for assisting molecular biologists in constraint-based
protein-topology prediction. This tool now forms part of the
PAPAIN suite of programs.
Advanced Computing Laboratory Research Interests
Advanced Database Technology
Recent developments in database technology promise major benefits
for developing advanced computer systems for molecular biology and
other fields of biomedical research. The IDEA project is funded by
the European Commission as part of its advanced technology ESPRIT
programme. The project aims at developing a high performance
database management system which integrates a number of novel
database techniques. The laboratory is responsible for developing
a genetics application to demonstrate the potential benefits of
such integrated technologies. The application will include
functions for data capture and analysis and management of
laboratory procedures, and provide an opportunity for
investigating advanced computing in genetic modelling. To
facilitate development and maintenance of the application we have
designed an executable specification language, SLOT. This has been
implemented in practical software and its value as a design tool
is currently being evaluated in the genetics laboratory
application.
Scientists Active in Areas related to ML Topics
Biomolecular Modelling Laboratory
R.D. King
M.J.E. Sternberg (Group leader)
Advanced Computing Laboratory
Z. Cui
J. Fox (Group leader)
C. Gordon
P. Hammond
C. Hearne
A. Jackson-Smale
P. Krause
S. Parsons
Biomedical Informatics Unit
D.A. Clark
C.J. Rawlings (Group leader)
Relevant Publications
Sternberg, M.J.E., Lewis, R.A., King, R.D., & Muggleton, S. (1992)
Modelling the structure and function of enzymes by machine
learning. The Royal Society of Chemistry: Faraday Discussion No:
93. 269-280.
King, R.D., Muggleton, S., Lewis R.A., & Sternberg, M.J.E. (1992)
Drug design by machine learning: The use of inductive logic
programming to model the structure-activity relationships of
trimethoprim analogues binding to dihydrofolate reductase. Proc.
Nat. Acad. Sci. U.S.A. 89, 11322-11326.
Bratko, I., & King, R.D. (1994). Applications of inductive logic
programming. SIGART Bulletin. 5. 43-49.
Sternberg, M.J.E., King, R.D., Lewis, R.A., & Muggleton, S.
(1994). Application of machine learning to structural molecular
biology. Phil. Trans. R. Soc. Lond. B. 344. 365-371.
Rawlings, C.J., Fox, J.P. (1994). Artificial Intelligence and
Molecular Biology - A Review and Assessment, Philosophical
Transactions of the Royal Society, Ser B.
Clark, D. A, Barton G. J. and Rawlings, C. R. (1990). A Knowledge-
Based Architecture for Protein Sequence Analysis and Structure
Prediction, Journal of Molecular Graphics, 8(3), 94-107.
Cui Z, Fox J & Hearne C. (1993). Knowledge Based Systems for
Molecular Biology: the Role of Advanced Database Technology and
Formal Specifications. IJCAI93 AI and the Genome Workshop.
Fox J. (1993). On the Soundness and Safety of Expert Systems.
Artificial Intelligence in Medicine, 5, 159-179.
**************************************
**************************************
**************************************
Machine Learning Research at Politecnico di Milano -
Artificial Intelligence and Robotics Project
The Polytechnic of Milan Artificial Intelligence & Robotics (PM-
AI&R) Project is a research group located at the Department of
Electronics and Information of the Politecnico di Milano, the
largest Italian technical university. It was founded in 1971 by
Marco Somalvico, the current director, and presently has a staff
of 13 researchers. Members of the Project are involved in teaching
activity at both undergraduate and graduate level, in the areas of
Basic Programming, Artificial Intelligence, Knowledge Engineering,
and Robotics. Some 60 Master and PhD theses are supervised
annually.
The laboratory of the PM-AI&R Project includes 7 Unix
workstations, 15 PCs, 12 Apple Macs, 49 Transputers, access to a
Connection Machine, 4 manipulators, 5 mobile robots, and the usual
range of robotic equipment. Software is developed in Common Lisp,
CLOS, Prolog, C, and C++.
The research activities of the group are mainly funded by CNR
(Italian National Research Council), MURST (Italian Ministry for
University and Research), and ESPRIT.
Six members of the PM-AI&R Project are active in the field of
machine learning, namely: Andrea Bonarini, Giuseppe Borghi, Marco
Colombetti, Marco Dorigo, Vittorio Maniezzo, and Fabio Marchese.
At present, the main interest is in behavioural learning for
autonomous agents, with a strong bias toward the application of
"natural" computational models, like genetic algorithms and neural
networks.
Research Topics
Training Autonomous Agents by Reinforcement Learning.
This research is concerned with the problem of transferring task
level knowledge to a reactive system. The approach we use is to
let a trainer, which has the role of an expert, provide
reinforcements to a learning robot. So far, we have implemented a
Learning Classifier System (ALECSYS) and a Fuzzy Learning
Classifier System (ELF), which have been used to train both
simulated agents and real robots. We are also comparing the
performances of different learning algorithms (various versions of
Q-learning).
Behaviour Engineering
This research aims at viewing agent training as a part of a larger
methodology for the development of behaviour-based autonomous
robots. The methodology is concerned with behaviour specification
and analysis, robot design, machine learning techniques for robot
training, and measuring the quality of the resulting behaviour.
Genetic Evolution of Neural Networks
This research led to the design of a novel genetic approach to
neural network design, and to its implementation in the ANNA
ELEONORA system. The system allows for the evolution of both
network weights and network topology, and has been applied to
Boolean function learning, function approximation, and robot
control.
Theoretical analysis of genetic algorithms
This research focuses on the study of theoretical aspects of
genetic algorithms. A theorem has been proven about a lower bound
on the quantity of information processed by a classical genetic
algorithm. We are now trying to extend the proof to the more
general class of evolutionary algorithms.
Recent Publications
Bertoni, A., and M. Dorigo (1993). Implicit parallelism in genetic
algorithms. Artificial Intelligence, 61 (2), 307-314.
Bonarini, A. (1994). Evolutionary learning of general fuzzy rules
with biased evaluation functions: competition and cooperation. In
Proceedings of IEEE WCCI - Evolutionary Computation, IEEE Computer
Press, pp. 51-56.
Bonarini, A. (1994). Some methodological issues about designing
autonomous agents which learn their behaviours: the ELF
experience. In R. Trappl, ed., Cybernetics and Systems Research
94, World Scientific, Singapore.
Colombetti, M. (in press). Adaptive agents: Steps to an ethology
of the artificial. In F. Masulli, P. Morasso and A. Schenone,
eds., Neural Networks in Biomedicine, World Scientific.
Colombetti, M., and M. Dorigo (1994). Training agents to perform
sequential behaviour. Adaptive Behaviour, 2 (3), 247-275.
Colombetti, M., M. Dorigo and G. Borghi (accepted for
publication). Behaviour Analysis and Training: A methodology for
Behaviour Engineering. IEEE Transactions on Systems, Man, and
Cybernetics.
Dorigo, M. (1993). Genetic and non-genetic operators in Alecsys.
Evolutionary Computation Journal, 1 (2), 151-164.
Dorigo, M. (to appear). Alecsys and the AutonoMouse: Learning to
control a real robot by distributed classifier systems. Machine
Learning (also available as Technical Report No. 92-011,
Dipartimento di Elettronica e Informazione, Politecnico di
Milano.)
Dorigo, M., and M. Colombetti (in press). Robot Shaping:
Developing autonomous agents through learning. Artificial
Intelligence.
Dorigo, M., and U. Schnepf (1993). Genetics-based Machine Learning
and Behaviour Based Robotics: A New Synthesis. IEEE Transactions
on Systems, Man, and Cybernetics, 23 (1), 141-154.
Maniezzo, V. (1994). Genetic evolution of the topology and weight
distribution of neural networks. IEEE Transactions on Neural
Networks, 5 (1), 39-53.
**************************************
**************************************
**************************************
ECML-95
8th EUROPEAN CONFERENCE ON MACHINE LEARNING
25-27 April 1995, Heraklion, Crete, Greece
Second Announcement and Final Call for Papers
General Information:
Continuing the tradition of previous EWSL and
ECML conferences, ECML-95 provides the major European
forum for presenting the latest advances in the area of
Machine Learning.
Program:
The scientific program will include invited talks,
presentations of accepted papers, poster and demo
sessions. ECML-95 will be followed by MLNet
familiarization workshops for which a separate call for
proposals will be published (see Pages 7-12 of this
newsletter).
Research areas:
Submissions are invited in all areas of Machine
Learning, including, but not limited to:
abduction                          analogy
applications of machine learning   automated discovery
case-based learning                comput. learning theory
explanation-based learning         inductive learning
inductive logic programming        genetic algorithms
learning and problem solving       multistrategy learning
reinforcement learning             representation change
revision and restructuring
Program Chairs:
Nada Lavrac (J. Stefan Institute, Ljubljana) and
Stefan Wrobel (GMD, Sankt Augustin).
Program Committee:
F. Bergadano (Italy)         I. Bratko (Slovenia)
P. Brazdil (Portugal)        W. Buntine (USA)
L. De Raedt (Belgium)        W. Emde (Germany)
J.G. Ganascia (France)       K. de Jong (USA)
Y. Kodratoff (France)        I. Kononenko (Slovenia)
W. Maass (Austria)           R. Lopez de Mantaras (Spain)
S. Matwin (Canada)           K. Morik (Germany)
S. Muggleton (UK)            E. Plaza (Spain)
L. Saitta (Italy)            D. Sleeman (UK)
W. van de Velde (Belgium)    G. Widmer (Austria)
R. Wirth (Germany)
Local chair:
Vassilis Moustakis, Institute of Computer Science,
Foundation of Research and Technology Hellas (FORTH),
P.O. Box 1385, 71110 Heraklion, Crete, Greece (E-mail
ecml-95@ics.forth.gr).
Submission of papers:
Paper submissions are limited to 5000 words. The title
page must contain the title, names and addresses of
authors, abstract of the paper, research area, a list
of keywords and demo request (yes/no). Full address,
including phone, fax and E-mail, must be given for the
first author (or the contact person). Title page must
also be sent by E-mail to ecml-95@gmd.de. If possible,
use the sample LaTeX title page available from
ftp.gmd.de, directory /ml-archive/general/ecml-95, also
accessible via the World-Wide Web ECML95 page at
ftp://ftp.gmd.de/ml-archive/general/ecml-95/ecml95.html.
Six (6) hard copies of the whole paper should be sent by
2 November 1994 to:
Nada Lavrac & Stefan Wrobel (ECML-95)
GMD, FIT.KI, Schloss Birlinghoven, 53754 Sankt Augustin,
Germany
Papers will be evaluated with respect to technical
soundness, significance, originality and clarity.
Papers will either be accepted as full papers (presented
at plenary sessions, published as full papers in the
proceedings) or posters (presented at poster sessions,
published as extended abstracts).
System and application exhibitions:
ECML-95 offers commercial and academic participants
an opportunity to demonstrate their systems and/or
applications. Please announce your intention to demo to
the local chair by 24 March 1995, specifying precisely
what type of hardware and software you need. We
strongly encourage authors of papers that describe
systems or applications to accompany their presentation
with a demo (please indicate on the title page).
Registration and further information:
Current conference information is available online on the
World-Wide Web as:
ftp://ftp.gmd.de/ml-archive/general/ecml-95/ecml95.html
For information about paper submission and program,
contact the program chairs (E-mail ecml-95@gmd.de).
For information about local arrangements or to request
a registration brochure, contact the local chair
(E-mail ecml-95@ics.forth.gr).
_____________________________________________________________
Important Dates:
Submission deadline : 2 November 1994
Notification of acceptance : 13 January 1995
Camera ready copy : 9 February 1995
Exhibition requests : 24 March 1995
Conference : 25 - 27 April 1995
**************************************
**************************************
**************************************
ECML95 Familiarisation Workshops
The following 3 workshops will be held at Heraklion on 28 and 29
April 1995:
Knowledge Level Modelling and Machine Learning.
Learning Robots.
Statistics, Machine Learning, and Discovery in Data
Bases.
Each of these workshops will aim to be informal, and to provide
presenters with considerable feedback. It is planned to have a
keynote speaker for each workshop, and for each of these talks to
be scheduled at a time when they can be attended by all the
participants of the workshops.
Additionally there will be a number of parallel sessions when
workshop attendees will have a chance to discuss in depth aspects
of the running of MLnet; we expect to have groups on Electronic
Communication, Industrial Liaison, Research, Training, and Written
Communication.
Finally there will also be a "wrap-up" session at the end when top
level overviews will be presented of the workshops and of the
"infrastructure" sessions.
Contributions for the Learning Robots and the Statistics,
Machine Learning, and Discovery in Data Bases workshops,
should be sent to the respective coordinators by the 16th
January 95. Decisions on acceptance will be made by the 6th
February (early registration for ECML95 closes on the 17th
February). Final camera-ready versions of the revised papers
should be sent to the coordinator of the individual workshops by
the 13th March.
Contributions for the Knowledge Level Modelling and Machine
Learning workshop should be sent to the coordinator by the 1st
February, and decisions on acceptance will be made by the 1st
March.
Knowledge Level Modelling and ML:
Dieter Fensel
Social Science Informatics
University of Amsterdam
Roetersstraat 15
1018 WB Amsterdam
The Netherlands
Email: fensel@swi.psy.uva.nl
Phone: +31 20 525 6791 (Secretary: 6789)
Fax: +31 20 525 6896
Learning Robots:
Michael Kaiser
University of Karlsruhe
Institute for Real-Time Computer Systems and Robotics
D-76128 Karlsruhe
Germany
Email: kaiser@ira.uka.de
Phone: +49 721 608 4051
Fax: +49 721 606 740
Info by anonymous FTP: ftpipr.ira.uka.de
Statistics, Machine Learning and Discovery in Data
Bases:
Gholamreza Nakhaeizadeh
Daimler Benz AG
Research and Technology / F3W
Postfach 2360
D-89013 Ulm
Germany
Email: nakhaeizadeh@dbag.ulm.DaimlerBenz.COM
Phone: +49 731 505 2860
Fax: +49 731 505 4210
General enquiries about the workshops should be addressed to
Derek Sleeman (Aberdeen) and about local arrangements to
Vassilis Moustakis (Crete).
MLnet Sponsored Familiarization Workshop:
Knowledge Level Modelling and
Machine Learning
Knowledge acquisition and machine learning were at first two very
closely related research fields, but there is currently little
interaction between them. One reason for this weakened
relationship is a paradigm shift in knowledge
acquisition (cf. [David et al. 93]). Originally, knowledge
acquisition was viewed as a direct transfer of problem-solving
expertise from a human expert to a computer program. The acquired
knowledge was immediately represented by a running prototype. That
is, it was immediately implemented using a knowledge
representation formalism. The underlying assumption was that
frames or production rules represent knowledge identical to the
cognitive foundation of human expertise. Machine learning
techniques could be included directly in the knowledge acquisition
process. The machine learning algorithms could use the implemented
knowledge as input which was improved by their application. Today,
knowledge acquisition is no longer viewed as a process
which directly transfers knowledge from a human to an implemented
computer program but rather as a modelling process. The result of
knowledge acquisition is no longer simply a running program but a
set of complementary models. One of these models, the so-called
model of the expertise, represents expertise in a manner which
differs significantly both from the cognitive base of human
expertise and from the final implementation. This model of
expertise describes the task which should be solved by the
knowledge-based system and the knowledge which is required to
solve the task effectively and efficiently. Both are described in
an implementation-independent manner. Both the human expert and
the implemented systems are instantiations (i.e., specific
problem-solving agents) of this model. Three important
requirements are postulated for such a model by the knowledge
acquisition community.
First, the separation of the symbol level from the knowledge
level. At the knowledge level, the expertise is described in an
implementation-independent manner. It is described in terms of
goals, operations, and knowledge about the relationships of goals
and operations. At the symbol level, a specific computational
agent is implemented which carries out the problem-solving process
by means of a computer program. In terms of software engineering,
a knowledge level description is a specification of the
functionality of the desired system and the required knowledge. A
symbol level description corresponds to an implementation or
design specification. Knowledge acquisition no longer produces
only a running prototype but also a description of the knowledge
which abstracts from its implementation. Distinguishing between
the knowledge and the symbol level therefore reflects the
distinction between specification and design/implementation in
software engineering and in information system development. The
difference lies in the fact that knowledge acquisition is not only
concerned with the desired functionality of the system but also
with acquiring knowledge about how this functionality can be
achieved.
Second, the use of generic problem-solving methods. The problem-
solving behaviour of the system should be described in a domain-
independent and reusable manner by a problem-solving method. Such
a method defines the different inferences, the different kinds of
domain knowledge which are required by the method, and knowledge
about the control flow between these inference steps. Such a
method is generic in the sense that it can be used to solve
similar problems in different domains. In contrast to general-
purpose methods, a problem-solving method is restricted to a
specific type of problem (i.e., to a specific task). For example,
problem-solving methods for diagnostic tasks include decision tables,
heuristic classification, cover-and-differentiate, case-based
reasoning, and model-based diagnosis. In addition to the kind of
task it is mainly the type of available knowledge which determines
the applicability of a problem-solving method to a given problem.
Third, different modelling primitives are required for
epistemologically different types of knowledge. A model of
expertise contains different types of knowledge. Most approaches
distinguish between domain knowledge, inference knowledge, and
task-specific control knowledge. A further type of knowledge
concerns the use of domain knowledge by the inference and control
knowledge. Therefore, a model of expertise must explicitly
distinguish between different types of knowledge, and several
modelling primitives must be defined for each type, since each type
in turn comprises different knowledge entities.
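The three requirements above can be illustrated with a minimal sketch. It
assumes the usual reading of heuristic classification as three inference
steps (abstract, match, refine), which this text does not spell out. All
function names, rule formats and the two toy domains (CAR, PRINTER) are
hypothetical. The inference steps and the control flow are written once,
and only the domain knowledge changes between domains.

```python
from typing import Any, Callable, Dict, List, Set, Tuple

# A rule pairs a set of required findings with a conclusion (illustrative format).
Rule = Tuple[Set[str], str]
Abstraction = Dict[str, List[Tuple[Callable[[Any], bool], str]]]


def abstract(observations: Dict[str, Any], rules: Abstraction) -> Set[str]:
    """Inference step 1: turn raw observations into abstract findings."""
    findings: Set[str] = set()
    for name, value in observations.items():
        for test, finding in rules.get(name, []):
            if test(value):
                findings.add(finding)
    return findings


def match(findings: Set[str], rules: List[Rule]) -> Set[str]:
    """Inference step 2: associate findings with coarse solution classes."""
    return {cls for required, cls in rules if required <= findings}


def refine(classes: Set[str], rules: Dict[str, List[Rule]],
           findings: Set[str]) -> Set[str]:
    """Inference step 3: specialise a class if the findings allow it."""
    result: Set[str] = set()
    for cls in classes:
        specific = {sub for required, sub in rules.get(cls, [])
                    if required <= findings}
        result |= specific or {cls}
    return result


def heuristic_classification(observations: Dict[str, Any],
                             domain: Dict[str, Any]) -> Set[str]:
    """Control knowledge: the fixed order abstract -> match -> refine."""
    findings = abstract(observations, domain["abstraction"])
    classes = match(findings, domain["heuristics"])
    return refine(classes, domain["refinement"], findings)


# Domain knowledge for two unrelated toy domains (hypothetical content).
CAR = {
    "abstraction": {
        "battery_volts": [(lambda v: v < 11.5, "low_voltage")],
        "engine_cranks": [(lambda c: not c, "no_crank")],
    },
    "heuristics": [({"no_crank"}, "electrical_fault")],
    "refinement": {"electrical_fault": [({"low_voltage"}, "flat_battery")]},
}
PRINTER = {
    "abstraction": {
        "pages_since_refill": [(lambda n: n > 2000, "heavy_use")],
        "print_faded": [(lambda f: bool(f), "faded_output")],
    },
    "heuristics": [({"faded_output"}, "consumable_problem")],
    "refinement": {"consumable_problem": [({"heavy_use"}, "toner_empty")]},
}

car_diagnosis = heuristic_classification(
    {"battery_volts": 10.9, "engine_cranks": False}, CAR)
printer_diagnosis = heuristic_classification(
    {"pages_since_refill": 2500, "print_faded": True}, PRINTER)
```

In this sketch the same method diagnoses both domains without change, which
is the sense of "generic" used above. A learning procedure could, in
principle, target each rule set separately.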
A widespread approach (especially in Europe) to model-based
knowledge acquisition is the KADS project (KADS-I and CommonKADS
[Schreiber et.al. 93]). The KADS model of expertise allows an
implementation-independent description of the knowledge using
several layers with pre-defined modelling primitives. Until now
little work has been done which examines possible improvements of
the performance and results of machine learning techniques when
they are applied in this type of a model-based framework. In fact,
there seems to be a kind of cultural barrier between people
working in machine learning and those working in model-based
knowledge acquisition. From a knowledge-level modelling point of
view, work in machine learning is regarded as mere symbol-level
work, while machine learning researchers regard knowledge-level
modellers as producers of nice graphics and natural language
descriptions without precise, executable semantics. Exceptions are
[Dompseler & van Someren 94], [Thomas
et.al. 93], and [Rouveirol & Albert 94] who use inference
structures to bias the learning process, and [Van de Velde &
Aamodt 92] and [Graner & Sleeman 93], who discuss the integration
of machine learning and knowledge-level modelling. [Fensel et.al.
93] shows the difficulties which arise when applying machine
learning techniques to learn knowledge for a diagnostic task to be
solved by the problem-solving method heuristic classification.
The goal of the workshop is to overcome this barrier by discussing
the new role which machine learning can have for model-based
knowledge acquisition. In fact, we are concerned with topics like:
How can the process of constructing a model of expertise be
supported by machine learning techniques?
How can current machine learning systems be used and integrated
in practical software and knowledge engineering? Will systems
like MOBAL, ENIGME etc. ever be used in daily life? If so, how?
If not, why not?
The different types of knowledge (i.e., domain knowledge,
inference knowledge, control knowledge and mapping knowledge)
require different machine learning procedures and different
combinations of them. How can one type of knowledge be used to
guide the automatic acquisition of other kinds of knowledge?
Problem-solving methods divide a complex reasoning task into
several subactivities. How can problem-solving methods be used
to improve the effectiveness and efficiency of the application of
machine learning procedures?
Machine learning techniques usually involve rather simple
problem solvers. But even simple tasks like diagnosis require
several specialized machine learning techniques and their
combination when a problem-solving method like heuristic
classification is used instead of decision tables. AI research
has also produced a variety of reasoning methods and
architectures. Are there appropriate learning techniques to
support them?
Knowledge acquisition involves many types of learning problems
like the transformation between representation languages. These
languages can be formal or executable but can also use
diagrammatic, natural, or structured text, etc. Do appropriate
learning techniques exist to support this?
In the meantime, a large number of formal and executable
knowledge specification languages have been defined for a model
of expertise. Languages like DESIRE, KARL, KBSSF, and (ML)2
allow a declarative description of the knowledge which
abstracts from implementational aspects. How can they help in
integrating machine learning techniques into model-based
knowledge acquisition?
Can machine learning techniques be improved by knowledge level
modelling?
How can the bias of a machine learning technique be represented
at the knowledge level? This seems to be a very important
criterion for the acceptance and usability of these techniques
for knowledge acquisition.
How can knowledge level descriptions be used to support the
selection, modification, combination, and creation of machine
learning techniques related to given learning tasks? Can the
above mentioned knowledge specification languages be used for
this purpose?
Besides being applied during the knowledge acquisition process,
machine learning procedures can also be integrated into its
product. The knowledge-based system would then not only solve a
given problem but also improve its performance and adapt itself
to modifications of the task and the knowledge. Would knowledge
level descriptions of the learning techniques be required to
enable such a learning system to be maintainable and to remain
intelligible?
This list is not exhaustive and we are interested in different and
controversial points of view. The main purpose of this workshop is
to evaluate the possibilities and limitations for the use of
existing machine learning technology in the context of knowledge
acquisition (and related disciplines such as information/software
engineering) as well as the application of knowledge acquisition
for machine learning. Furthermore, we are aiming at articulating
research goals which will help to increase these possibilities.
Format and kind of contributions
Contributions are invited which present theoretical or practical
results as well as position papers in the area of model-based
knowledge acquisition and machine learning. Submitted papers must
not exceed 15 pages, including abstract and bibliography.
Proceedings will be available at the workshop.
The submission of an electronic version (postscript) of a paper is
highly recommended. Participation without submitting a full paper
is possible but requires the submission of an abstract (up to two
pages) which clarifies the topics of interest.
Contributions for the workshop should be sent to the coordinator
by the 1st February, and decisions on acceptance will be made by
the 1st March.
Coordinator: Dieter Fensel (see address earlier)
Program Committee
Agnar Aamodt, University of Trondheim, Norway
Patrick Albert, ILOG Gentilly, France
Klaus-Dieter Althoff, University of Kaiserslautern, Germany
Enric Plaza i Cervera, AIRI Blanes, Spain
Paul Compton, University of New South Wales, Australia
Werner Emde, GMD-Bonn, Germany
Ronen Feldman, Bar-Ilan University, Israel
Dieter Fensel, University of Karlsruhe, Germany
Jean-Gabriel Ganascia, Universite Pierre et Marie Curie, Paris, France
Philippe Laublet, ONERA, Chatillon Cedex, France
Reza Nakhaeizadeh, Daimler-Benz Research Ulm, Germany
Claire Nedellec, Universite Paris-Sud, France
F. Puppe, University of Wuerzburg, Germany
M. M. Richter, University of Kaiserslautern, Germany
Celine Rouveirol, Universite Paris-Sud, France
Franz Schmalhofer, DFKI-Kaiserslautern, Germany
Guus Schreiber, University of Amsterdam, The Netherlands
Nigel Shadbolt, Nottingham University, United Kingdom
Derek Sleeman, University of Aberdeen, United Kingdom
Maarten van Someren, University of Amsterdam, The Netherlands
Rudi Studer, University of Karlsruhe, Germany
Walter Van De Velde, Vrije Universiteit Brussel, Belgium
Bob Wielinga, University of Amsterdam, The Netherlands
Stefan Wrobel, GMD-Bonn, Germany
References
[David et.al. 93] J.-M. David, J.-P. Krivine, and R. Simmons
(eds.): Second Generation Expert Systems, Springer-Verlag, Berlin,
1993.
[Dompseler & van Someren 94] H. J. H. van Dompseler and M. W. van
Someren: Using Models of Problem Solving as Bias in Automated
Knowledge Acquisition. In Proceedings of the 11th European
Conference on Artificial Intelligence (ECAI 94), Amsterdam, August
8-12, 1994.
[Fensel et.al. 93] D. Fensel, U. Gappa, and S. Schewe: Applying a
Machine Learning Algorithm In a Knowledge Acquisition Scenario. In
Proceedings of the IJCAI Workshop Machine Learning and Knowledge
Acquisition: Common Issues, Contrasting Methods, And Integrated
Approaches, W16, Chambery, France, August 29, 1993.
[Graner & Sleeman 93] N. Graner and D. Sleeman: MUSKRAT: a
Multistrategy Knowledge Refinement and Acquisition Toolbox. In
Proceedings of the Second International Workshop on Multistrategy
Learning, 1993.
[Rouveirol & Albert 94] C. Rouveirol and P. Albert: Knowledge
Level Model of a Configurable Learning System. To appear 1994.
[Schreiber et.al. 93] G. Schreiber, B. Wielinga, and J. Breuker
(eds.): KADS. A Principled Approach to Knowledge-Based System
Development, Knowledge-Based Systems, vol 11, Academic Press,
London, 1993.
[Thomas et.al. 93] J. Thomas, P. Laublet, and J.-G. Ganascia: A
Machine Learning Tool Designed for a Model-Based Knowledge
Acquisition Approach. In N. Aussenac et al. (eds.), Knowledge
Acquisition for Knowledge-Based Systems, Proceedings of the 7th
European Workshop (EKAW '93, Toulouse, France, September 6-10,
1993), Lecture Notes in AI no 723, Springer-Verlag, Berlin, 1993.
[Van de Velde & Aamodt 92] W. Van De Velde and A. Aamodt: Machine
Learning Issues in CommonKADS. Research report, ESPRIT Project
P5248 KADS-II, KADS-II/T2.4.3/TR/VUB/002/3.0, Vrije Universiteit
Brussel, January 1992.
MLnet Sponsored Familiarization Workshop and Third European
Workshop on:
Learning Robots
Background
The application of Machine Learning techniques in real-world
applications and especially in Robotics is currently a topic
gaining a lot of interest. However, the real world often poses
much stronger requirements on the learning methods than problems
considered in the Machine Learning community usually do. Missing
or noisy, continuous-valued data, context- or time-dependent
information and system behaviours have proven to be difficult to
handle and to require substantial extensions of existing Machine
Learning algorithms. On the other hand, the next generation of
robots and especially those to be employed for everyday tasks will
have to be much more communicative, adaptive, and safe. These
requirements have a direct impact on the cost of robot
programming, and, consequently, on the overall cost of the product
"robot". In this context, learning capabilities are obviously
becoming essential.
Scope
The workshop aims at bringing together researchers from both the
Machine Learning and the Robotics community with a special focus
on original work presented by young scientists. It intends to show
that Machine Learning techniques can be successfully employed to
solve some of the problems emerging in Robotics. Therefore, the
workshop's emphasis is on the application of Machine Learning in
real world Robotic applications, including, but not limited to:
Human-Robot Interaction and Programming by Demonstration
Mobile Robot Perception, Navigation, and Mission Planning
Architectures for Intelligent Robots
Robot Supervision, Fault Detection and Recovery
Learning and Adaptivity in Robot Control
Format and kind of contributions
Paper submissions are limited to 5000 words. The title page must
contain the title of the talk, name(s) and affiliation(s) of the
author(s) and a list of keywords as well as the full address
(including E-Mail) of the first author. A sample paper and a LaTeX
style file are available via ftp from ftpipr.ira.uka.de. A laser-
quality copy of the paper must be received by the workshop
organizers by January 16th, 1995. Alternatively, the submission of
papers in Postscript format via anonymous ftp to ftpipr.ira.uka.de
is encouraged. Accepted papers will be published in the Workshop
notes; the best papers will be selected for a special issue of
Robotics and Autonomous Systems.
Contributions should be sent to the coordinator by the 16th
January 95. Decisions on acceptance will be made by the 6th
February (early registration for ECML95 closes on the 17th
February). Final camera-ready versions of the revised papers
should be sent to the coordinator by the 13th March.
Coordinator: Michael Kaiser (see address earlier)
Program Committee:
L. Basanez (Spain)
L. Camarinha-Matos (Portugal)
R. Dillmann (Germany)
A. Giordana (Italy)
K. Morik (Germany)
L. Saitta (Italy)
A. Steiger-Garcao (Portugal)
C. Torras (Spain)
H. Van Brussel (Belgium)
G. Vernazza (Italy)
MLnet Sponsored Familiarization Workshop:
Statistics, Machine Learning and
Knowledge Discovery in Databases
Introduction
The intersection of, and interaction between, Machine Learning
(ML), Statistics and Knowledge Discovery is a rapidly growing area
of interest. There are several areas of common research.
On one hand, the considerable amount of data available in
databases is an extraordinary source for extracting knowledge and
tracing causal dependencies and for this reason, Knowledge
Discovery in Databases (KDD) has become an important and new
challenge, economically as well as scientifically. On the other
hand, the statistical and machine learning algorithms can be
considered robust instruments for KDD. These facts make it obvious
that KDD is an interdisciplinary field, benefiting from
statistics, machine learning and database technology.
The success of the previous workshops on KDD at IJCAI89 and the
AAAI (91, 93, 94) Conferences, as well as of the MLnet Workshop on
Machine Learning and Statistics, confirms a growing awareness of and
interest in this area.
The main topics of interest:
Statistical and ML approaches for discovery of causal
structures in databases
Incremental learning methods to handle the temporal aspects of
databases
Statistical and ML approaches for comprehensibility and
validation of discovered knowledge
Software architecture of KDD-Tools
Combining ML and statistical approaches to build hybrid
algorithms for KDD
Focusing aspects to improve the performance of KDD-tools
Classification and prediction algorithms to handle large-scale
data
Successful applications in medicine, business and industry
Form of the workshop
The main part of the workshop will be devoted to presentations and
invited talks. Talks will be grouped according to common interest
with plenty of time for discussions.
Publication
Workshop notes will be distributed at the beginning of the
workshop. We intend to publish revised versions of selected papers
as a book after the workshop.
Format and kind of contributions
Contributions are invited which present theoretical or practical
results as well as review papers in one of the above topics of
interest. Submitted papers must not exceed 12 pages, including
abstract and bibliography. The title page should include the title
of the talk, name(s) and affiliation(s) of the author(s) and a
list of keywords as well as the full address (including E-Mail) of
the first author. The submission of an electronic version
(postscript or LaTeX) of a paper is highly recommended.
Participation without submitting a full paper is possible but we
encourage all participants to submit an abstract (up to two pages)
to help the organisers to clarify the topics of interest. A
sample paper in LaTeX is available via ftp from amsta.leeds.ac.uk
in directory pub/ecml95.
Contributions should be sent to the coordinator by the 16th
January 95. Decisions on acceptance will be made by the 6th
February (early registration for ECML95 closes on the 17th
February). Final camera-ready versions of the revised papers
should be sent to the coordinator by the 13th March.
Coordinator: Gholamreza Nakhaeizadeh (see address earlier)
Chairs
Yves Kodratoff, University of Paris-Sud, France
Gholamreza Nakhaeizadeh, Daimler-Benz, Ulm, Germany
Charles Taylor, University of Leeds, GB
Program Committee
Peter Edwards, University of Aberdeen, GB
Usama Fayyad, Jet Propulsion Lab, USA
Attilio Giordana, University of Torino, Italy
Bob Henery, University of Strathclyde, GB
Willi Klosgen, GMD, Germany
Heikki Mannila, University of Helsinki, Finland
Marjorie Moulet, Orsay, France
Gregory Piatetsky-Shapiro, GTE, USA
Arno Siebes, CWI, Amsterdam, The Netherlands
Rudiger Wirth, Daimler-Benz, Ulm, Germany
Jan Zytkow, Wichita State University, USA
**************************************
**************************************
**************************************
ECML-97
Call for Proposals
As noted elsewhere in this Newsletter, the International
Conference on Machine Learning will take place in Bari during the
Summer of 1996 (Program Chair: Lorenza Saitta (Torino) and Local
Chair: Floriana Esposito). Thus there will not be an ECML96.
However, MLnet would like to establish a planning cycle for the
ECML conferences, and intends to make a decision at the Heraklion
meeting about the location and Program Chair for ECML97.
As with the International Meeting, it has been agreed that future
bids for a site to host the ECML conference should
include:
overall budgets for the event, including a statement of the
conference fee,
details of accommodation costs,
some details of the facilities which can be provided at the
actual conference site,
details of the distance between the likely residential
accommodation and the conference site,
a summary of how to reach the city by air, rail and road,
sponsorships expected.
Guidelines for drawing up budgets are available from Derek Sleeman
(Aberdeen). Both he and Lorenza Saitta will be happy to talk to
potential sites.
**************************************
**************************************
**************************************
CALL FOR PARTICIPATION
Workshop on
MACHINE LEARNING OF NATURAL LANGUAGE AND SPEECH
Organized under the auspices of the
European networks of Language
and Speech (ELSnet) and Machine Learning (MLnet)
by
The Institute for Language Technology and AI (ITK),
Tilburg University
The Institute for Logic, Language and Computation (ILLC),
University of Amsterdam
Sponsored by the European Commission through MLnet and ELSnet
December 2-3, 1994
Amsterdam
Programme Committee:
Walter Daelemans (ITK, Tilburg University)
Mark Ellison (CCS, University of Edinburgh)
Erik-Jan van der Linden (ILLC, University of Amsterdam)
Local Organization:
Marco de Vries (ILLC, University of Amsterdam)
Aims of the Workshop:
Raise awareness of the opportunities for applying Machine
Learning (ML) techniques in Language and Speech (L&S) research.
Demonstrate with selected papers that ML of L&S KBs is a viable
area of research.
Identify possible funding sources for joint ML and L&S
research.
Our main aim in organizing this workshop and bringing the two
research communities together is to contribute to an arena for
this work in Europe. The domain seems to be burgeoning in the
USA.
This workshop, while directed specifically to members at sites of
the MLnet and ELSnet networks of excellence, is also open to
attendance by other interested researchers in Machine Learning or
Language and Speech Technology.
MLnet and ELSnet have set aside a fixed sum to support node
members who wish to attend. There are only limited funds
available. We are aiming at a maximum of 35 participants. If you
are interested in participating in this workshop, please send a
brief (1-page) statement of interest to:
For MLnet members:
Derek Sleeman (ELSnet/MLnet WS)
Computing Science Department
The University
Aberdeen
AB9 2UE
Scotland, UK
Fax: +44 1224 273422
Email: sleeman@csd.abdn.ac.uk
For ELSnet members and other participants:
Marco de Vries
ILLC
Plantage Muidergracht 24
NL-1018 TV Amsterdam
The Netherlands
+31 20 525 6051 (tel)
+31 20 525 5101 (fax)
marco@fwi.uva.nl
WORKSHOP MOTIVATION
The Language and Speech Technology Perspective
Current knowledge-based approaches to Language and Speech
Technology use large amounts of hand-crafted knowledge to solve
ambiguity problems in the analysis of text and speech, and in
general to provide the necessary linguistic competence to systems.
Because handcrafting of these enormously complex knowledge sources
is extremely difficult and expensive, and completeness inherently
impossible, the field is confronted with a serious knowledge
acquisition bottleneck. At present, most knowledge sources have to
be rebuilt from scratch for each new application, domain,
theoretical framework or language.
Techniques from Machine Learning and Statistical Pattern
Recognition can alleviate the knowledge acquisition bottleneck,
and can also help in discovering new theories and models. Research
in speech processing has a long history of interest in learning
algorithms, especially by using simple statistical models (e.g.
HMMs) and connectionist learning algorithms. Symbolic learning
algorithms have not yet been tested extensively, however, and may
add to the toolbox of speech recognition and synthesis.
The combination of Machine Learning and Language Technology leads
to a number of interesting research topics with potentially useful
applications:
Which ML approaches are useful for which type of linguistic
knowledge acquisition? E.g. Inductive Logic Programming for
Logic Grammars, Explanation-Based Learning for Grammar
Adaptation, etc.
How can/should the learning algorithms be enriched with
linguistic domain bias?
How can supervised and unsupervised learning be combined to
solve bootstrapping problems and implicit learning problems?
How can the results of learning be made accessible to
linguistic engineers? Learning may result in distributed
representations, prototypes, exemplars, etc. which can be
translated to rules.
The Machine Learning Perspective
We see the following as important arguments for ML researchers to
consider working on Natural Language and Speech problems:
Learning from text has always been one of the long-term goals
of ML research. While the present focus of attention (induction
of lexical and grammatical knowledge) is more modest in its
scope, it is a necessary step towards achieving this goal.
The language processing problems tackled show a wide range of
complexity: from learning phonological regularities (easier) to
learning semantic structures (hard), and are typical of a
larger class of problems (in which generalizations,
subregularities and exceptions interact in complex ways).
Selected language processing problems may be a useful addition
to the existing benchmarks for comparing learning algorithms.
Humans process language, and learn to do so. A great deal of
work has been done in studying natural language acquisition,
which provides an almost unparalleled opportunity for those
using ML for cognitive modelling to compare the learning
behaviour of machines and people.
Symbolic and non-symbolic data is largely accessible (i.e.
understandable by the layman), compared to e.g.
electrocardiograph traces. Words can be read, or speech
synthesised, to see if structures learnt are at least making
the right predictions. Of course there is specialised knowledge
involved on the theoretical side.
PRELIMINARY PROGRAMME
Friday 2nd December
13.00 - 13.05 Opening by organizers
13.05 - 13.55 State of the Art in ML (Maarten Van Someren)
13.55 - 14.40 State of the Art in NL&S (Erik-Jan van der
Linden)
14.40 - 15.40 Keynote: Computational models of syntactic
processing in human language comprehension and
production (Gerard Kempen)
15.40 - 16.00 Tea/Coffee
16.00 - 16.30 Minimum Description Length Applications (Mark
Ellison)
16.30 - 17.00 Explanation-Based Learning of Grammar (Christer
Samuelsson)
17.00 - 17.30 Unsupervised Learning of Language Knowledge
(David Powers)
17.30 - 18.00 Discussion
18.00 - ..... drinks, dinner
Saturday 3rd December
10.00 - 10.30 Case-Based Language Learning (Walter Daelemans)
10.30 - 11.00 Generalization by analogy in language (Stefano
Federici, Vito Pirrelli)
11.00 - 11.30 Inductive Logic Programming and DCGs (Luc De
Raedt et al.)
11.30 - 12.00 Concept Learning (Roberto Basili)
12.00 - 13.30 Lunch
13.30 - 14.00 Formal Language Learning Theory (Dick de Jongh)
14.00 - 14.30 DOP versus Probabilistic Parsing Approaches
(Rens Bod, Remko Scha)
14.30 - 15.00 ML applications in Speech and Signal Processing
(Louis ten Bosch)
15.00 - 15.30 tea/coffee
15.30 - 16.00 Introduction to the discussion
16.00 - 17.00 Closing discussion: funding opportunities for ML
of NL&S; SIGNLL (ACL Special Interest Group Natural
Language Learning) (Moderators: Daelemans and Powers)
17.00 - ..... farewell drinks
**************************************
**************************************
**************************************
Report on the 4th International Workshop on
Inductive Logic Programming (ILP94)
by Giovanni Semeraro
Workshop Chair: Stefan Wrobel (GMD, Germany)
ILP94 was held in Bad Honnef/Bonn (Germany) from September 12th
to September 14th 1994. Local support was provided by GMD. ILP94
was sponsored by GI (Gesellschaft für Informatik e.V. - German
Society for Computer Science), GMD (Gesellschaft für Mathematik
und Datenverarbeitung mbH) and MLnet (ESPRIT Network of Excellence
in Machine Learning). The workshop was attended by 60
participants; about 16 papers were accepted as full papers and
presented in the plenary session, and 11 further papers were
selected for poster presentation. Each day of the workshop started
with an invited talk.
Katharina Morik gave a talk entitled "The Art of ILP
Applications", in which three real-world applications of ILP
algorithms were presented, namely classification of students'
choices about their place of residence, security management in a
telecommunication environment, and robot navigation. These
examples were selected as they show that real-world applications
of machine learning require several kinds of problems to be
solved. Moreover, they show the wide range of applicability of
machine learning.
Lorenza Saitta ("ILP: An Alternative View"), starting from an in-
depth analysis of several definitions of ILP in the literature,
helped to put ILP in the wider perspective of machine learning, by
pointing out differences and similarities with other machine
learning areas.
Paul Vitanyi ("Inductive Reasoning") presented a general theory of
inductive reasoning, based on a form of Bayes' rule with no prior
probabilities. A practical application of such a theory is
represented by the Minimum Description Length (MDL) principle
(Rissanen, 83). Furthermore, it can be proven that the MDL
principle selects typical hypotheses, where "typical" means most
likely for some computable prior probability. Such a proof relies
on a method of inference based on Kolmogorov Complexity
(Solomonoff, 64), which can, in turn, be traced to Occam's Razor
and Church's Thesis. The universal p-randomness test (Martin-Loef)
is used to validate this theory. Finally, the speaker showed an
example of application of this theory to the problem of inferring
a function from input/output examples.
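As a rough illustration of the MDL idea applied to inferring a function
from input/output examples, the sketch below uses a crude two-part code:
the bits needed to state a hypothesis plus the bits needed to list its
exceptions. It is not the construction presented in the talk. The
candidate hypotheses, their assumed model costs in bits, and the 8-bit
cost per output value are all hypothetical choices made for this example.

```python
import math


def description_length(model_bits, predict, examples, value_bits=8):
    """Two-part code length: hypothesis bits plus bits for its exceptions."""
    n = len(examples)
    errors = sum(1 for x, y in examples if predict(x) != y)
    # Each exception is coded as an example index plus the correct output.
    exception_bits = errors * (math.log2(n) + value_bits)
    return model_bits + exception_bits


def select_by_mdl(hypotheses, examples):
    """Return (name, total_bits) for the shortest total description."""
    scored = [(name, description_length(bits, f, examples))
              for name, bits, f in hypotheses]
    return min(scored, key=lambda pair: pair[1])


# Examples of y = 2x + 1 on x = 0..9, with one corrupted observation.
examples = [(x, 2 * x + 1) for x in range(10)]
examples[4] = (4, 0)

# The lookup table memorises every example, so its model cost grows with
# the data; the other model costs are assumed, not derived.
table = dict(examples)
hypotheses = [
    ("constant 0", 2, lambda x: 0),
    ("identity", 4, lambda x: x),
    ("2x + 1", 8, lambda x: 2 * x + 1),
    ("lookup table", 8 * len(examples), lambda x: table[x]),
]

best_name, best_bits = select_by_mdl(hypotheses, examples)
```

Under these assumed costs the linear hypothesis wins: it pays for one
exception, whereas the lookup table fits every example but is itself
expensive to describe. This is the sense in which such a criterion
guards against overfitting.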
Full papers can be roughly divided into four areas, namely:
applications,
ILP algorithms/systems,
computational learning theory, and
theory revision.
The first area includes the first two papers in the following
overview. In the presentation by A. Srinivasan, S. Muggleton, R.D.
King, and M.J.E. Sternberg ("Mutagenesis: ILP Experiments in a
Non-Determinate Biological Domain"), the new ILP system PROGOL was
used to discover rules for mutagenicity in nitroaromatic
compounds. From a biological point of view, this problem is
relevant because highly mutagenic nitroaromatics have been found
to be carcinogenic and often cause damage to DNA. From an ILP
point of view, this problem is interesting since it involves a
highly non-determinate relational representation that cannot be
handled by any of the ILP systems which incorporate the
ij-determinate restriction. PROGOL's main features are the ability
to use non-ground background knowledge (in the more general
form of Horn clauses rather than unit clauses), mode-directed
inverse resolution, and a best-first search strategy. Volker Klingspor
("GRDT: Enhancing Model-Based Learning for its Application in
Robot Navigation") presented a new system, called GRDT (Grammar
Based Rule Discovery Tool), that proved effective in solving the
task of learning concepts for navigation of autonomous mobile
robots. GRDT is mainly an extension of RDT (Kietz and Wrobel, 92),
and overcomes some shortcomings that FOIL (Quinlan, 90) and
GRENDEL (Cohen, 93) showed when used to solve the same task.
The majority of ILP algorithms/systems can be subdivided into
methods for learning logic programs and methods for learning
concept descriptions in a logical form, where the boundary between
them is represented by recursion (allowing recursive rules or
not). A. Hamfelt and J.F. Nilsson ("Inductive Metalogic
Programming") proposed a method and a testbed for induction of a
class of logic programs. The method exploits higher order cliches
to restrict the hypothesis language to predefined program
recursion schemes. F. Bergadano and D. Gunetti ("Learning Clauses
by Tracing Derivations") presented TRACY, a system that learns
logic programs from examples. TRACY learns a logic program as a
whole rather than learning a single clause at each learning step.
In other words, the hypothesis space of TRACY is the space of
logic programs, that is, the power set of the space of possible
clauses. The search performed by TRACY is based on an intensional
evaluation of learned clauses and makes use of backtracking to
choose an alternative derivation when some negative example can be
derived from the learned logic program. The algorithm is proved to
be correct and sufficient and it does not depend on the kind and
number of training examples. Matevz Kovacic ("MILP - A Stochastic
Approach to Inductive Logic Programming") presented the system
MILP, which replaces the greedy search technique common to many ILP
systems with stochastic search. Moreover, MILP's evaluation
function makes use of the MDL principle in order to avoid
overfitting and ranks the remaining hypotheses according to the
classification accuracy on the training set. Tests in the domains
of king-rook-king chess end-games and the finite element mesh
design showed that MILP significantly outperforms other ILP
algorithms. Werner Emde ("Inductive Learning of Characteristic
Concept Descriptions") presented an improved version of his system
COLA (Emde, 94), called COLA-2. COLA-2 adopts a novel approach to
the problem of learning characteristic descriptions from examples.
Such an approach is also applicable when a small set of classified
examples is available, while most of the training examples are
unclassified. COLA-2 embodies a conceptual clustering algorithm,
called SPRITE, to take advantage of the unclassified observations.
Michele Sebag and Celine Rouveirol ("Induction of Maximally
General Clauses Consistent with Integrity Constraints") presented
an extension to definite clauses of earlier work in propositional
logic (Sebag, 94a) and in a restriction of first-order logic
(Sebag, 94b). The aim of this work is to show that
only near-miss examples are useful to build the set of maximally
general hypotheses which are consistent with the available
negative examples. Jorg-Uwe Kietz and Marcus Lubbe ("An Efficient
Subsumption Algorithm for Inductive Logic Programming") addressed
a central problem for ILP, namely the efficiency of theta-
subsumption. Generally speaking, the test for establishing whether
a clause d theta-subsumes a clause c is NP-complete even if we
restrict ourselves to linked Horn clauses and fix the number of
literals in c to a small constant. Thus, the authors show that two
restrictions on the hypothesis space, namely determinacy (of d wrt
c) and k-local clauses, can be used to make theta-subsumption
tractable. The corresponding two theta-subsumption algorithms
constitute the foundations of a future efficient bottom-up
algorithm for learning (the lgg of any two) determinate k-local
Horn clauses. Furthermore, the reduction algorithm (under
theta-subsumption) can be greatly improved by these approaches. S.
Muggleton and C.D. Page Jr. ("Self-Saturation of Definite
Clauses") investigated the development of complete algorithms for
lgg computation. Indeed, most of these algorithms invert theta-
subsumption and not implication, thus they are incomplete. This
incompleteness has been precisely characterized (Gottlob, 87).
Some fundamental questions concerning the existence and the
computability of lgg are related to the notion of inversion of
implication. Based on these questions, the paper introduces and
analyses the concepts of self-saturation and direct root self-
saturation of definite clauses and of arbitrary clauses. The main
result is that a finite lgg under implication of any two clauses
exists and is efficiently computable if the two clauses have
finite self-saturation. I. Stahl and I. Weber ("The Arguments of
Newly Invented Predicates in ILP") addressed the problem of
searching for the appropriate arguments of a newly invented
predicate. Predicate invention is largely used in ILP in order to
extend the hypothesis language (and consequently the hypothesis
space) when the original language is not rich enough for the
learning task.
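As a rough illustration of the theta-subsumption test discussed
above (and not the algorithm of Kietz and Lubbe), the following
Python sketch checks by naive backtracking whether a clause d
theta-subsumes a clause c, i.e. whether some substitution theta maps
every literal of d onto a literal of c. The clause encoding (tuples,
a "~" prefix for body literals, uppercase names for variables,
function-free terms) and all function names are assumptions made for
this example only.

```python
def is_variable(term):
    """Assumed convention: variables are strings starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def match_literal(lit_d, lit_c, theta):
    """Try to extend theta so that lit_d under theta equals lit_c.

    Literals are tuples (predicate, arg1, arg2, ...). Terms of c are
    treated as fixed symbols; only variables of d are bound.
    Returns the extended substitution, or None if no match exists.
    """
    if lit_d[0] != lit_c[0] or len(lit_d) != len(lit_c):
        return None
    theta = dict(theta)  # copy, so that backtracking needs no undo step
    for term_d, term_c in zip(lit_d[1:], lit_c[1:]):
        if is_variable(term_d):
            # Bind the variable, or check consistency with an earlier binding.
            if theta.setdefault(term_d, term_c) != term_c:
                return None
        elif term_d != term_c:
            return None
    return theta


def theta_subsumes(d, c, theta=None):
    """Return True if some substitution maps every literal of d into c."""
    theta = {} if theta is None else theta
    if not d:
        return True
    first, rest = d[0], d[1:]
    for lit_c in c:
        extended = match_literal(first, lit_c, theta)
        if extended is not None and theta_subsumes(rest, c, extended):
            return True
    return False


# Hypothetical example clauses: a general rule and a more specific ground clause.
rule = [("grandparent", "X", "Z"), ("~parent", "X", "Y"), ("~parent", "Y", "Z")]
ground = [("grandparent", "ann", "carl"), ("~parent", "ann", "bob"),
          ("~parent", "bob", "carl"), ("~female", "ann")]
```

With these definitions, theta_subsumes(rule, ground) succeeds with
X=ann, Y=bob, Z=carl, while theta_subsumes(ground, rule) fails. In
the worst case the search tries every way of mapping literals of d
onto literals of c, which is exponential in the length of d; this is
the cost that restrictions such as determinacy and k-locality are
meant to avoid.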
Two papers can be ascribed to the area of computational learning
theory. S. Muggleton and C.D. Page Jr. ("A Learnability Model for
Universal Representation") proposed a new computational model of
inductive learning, called u-learnability (universal
learnability). This model extends existing models by allowing
time-bounded concepts and a probability distribution over
hypotheses. The emphasis is placed upon distribution rather than
representation. Luc De Raedt and Saso Dzeroski ("First Order
JK-Clausal Theories are PAC-Learnable") presented positive
PAC-learning results for the nonmonotonic ILP setting. Specifically,
first order range-restricted clausal theories, whose clauses have
up to k literals of size at most j, are polynomial-sample
polynomial-time PAC-learnable from positive examples only.
In the area of theory revision, the paper by H. Bostrom and P.
Idestam-Almquist ("Specialization of Logic Programs by Pruning
SLD-Trees for Definite Clauses") presented a specializing operator
based on unfolding and clause removal. J. Paakki, T. Gyimothy and
T. Horvath ("Effective Algorithmic Debugging for Inductive Logic
Programming") improved Shapiro's algorithmic debugging of logic
programs (Shapiro, 83) by means of a technique which integrates
category partition testing and static program slicing. The
improvement consists in a reduction of the number of questions to
the oracle.
Such a reduction is achieved by avoiding questions to the oracle
when a verification of the results of a procedure call can be
inferred from the test database. Furthermore, only the relevant
program execution paths that may have affected the value of an
incorrect output are analysed. P.R.J. Van der Laag and S.H.
Nienhuys-Cheng ("A Note on Ideal Refinement Operators in Inductive
Logic Programming") provided sufficient conditions for
nonexistence of ideal refinement operators (ideal means locally
finite, complete and proper) and showed that such conditions are
met when the hypothesis space is the set of Horn clauses and the
model of generalization is theta-subsumption or logical
implication. Therefore, ideal refinement operators for both of
these quasi-orderings do not exist.
Outside the tentative classification of the ILP94 papers, the
contribution by P. Flach ("Inductive Logic Programming and
Philosophy of Science") investigated the relations between work in
ILP and similar work in philosophy of science on the logical
characterization of scientific theory formation.
The proceedings of ILP-94 have been published as a GMD Technical
Report and are still available upon request from:
Ms. Ulrike Teuber
GMD, FIT.KI, Schloss Birlinghoven
53754 Sankt Augustin 1
E-Mail ulrike.teuber@gmd.de
As long as supplies last, they are free of charge.
Finally, I would like to thank Stefan Wrobel and Christine Harms
for the perfect organization of the workshop, for the pleasant
trip on the Rhine and in particular for the pioneering
introduction of a very interesting special session on "Rhine
Valley Wine Tasting".
Bibliography
(Cohen, 93) Cohen, W.W., Rapid Prototyping of ILP Systems Using
Explicit Bias. Proceedings of the IJCAI Workshop on ILP, 1993.
(Emde, 94) Emde, W., Inductive Learning of Characteristic Concept
Descriptions from Small Sets of Classified Examples. In F.
Bergadano and L. De Raedt (Eds.), Machine Learning: ECML-94,
Proceedings of the European Conference on Machine Learning,
Lecture Notes in Artificial Intelligence, 103-121, Springer-
Verlag, 1994.
(Gottlob, 87) Gottlob, G., Subsumption and Implication,
Information Processing Letters, 24:109-111, 1987.
(Kietz and Wrobel, 92) Kietz, J.U., and Wrobel, S., Controlling
the Complexity of Learning Through Syntactic and Task-Oriented
Models. In S. Muggleton (Ed.), Inductive Logic Programming, 107-
126, Academic Press, 1992.
(Quinlan, 90) Quinlan, J.R., Learning Logical Definitions from
Relations, Machine Learning, 5(3):239-266, 1990.
(Rissanen, 83) Rissanen, J., A Universal Prior for Integers and
Estimation by Minimum Description Length, Annals of Statistics,
11(1):416-431, 1983.
(Sebag, 94a) Sebag, M., Using Constraints to Building Version
Spaces. In F. Bergadano and L. De Raedt (Eds.), Machine Learning:
ECML-94, Proceedings of the European Conference on Machine
Learning, Lecture Notes in Artificial Intelligence, Springer-
Verlag, 1994.
(Sebag, 94b) Sebag, M., A Constraint-Based Induction Algorithm in
FOL. In Proceedings of ICML94: International Conference on Machine
Learning, Morgan Kaufmann, 1994.
(Shapiro, 83) Shapiro, E., Algorithmic Program Debugging, MIT
Press, 1983.
(Solomonoff, 64) Solomonoff, R.J., A Formal Theory of Inductive
Inference, Information and Control, 7:376-388, 1964.
**************************************
**************************************
**************************************
Inductive Logic Programming European Scientific Network
- ILPNET
ILPNET is the Inductive Logic Programming European Scientific
Network, financially supported by the Action for Cooperation in
Science and Technology with Central and Eastern European Countries
(PECO 92), contract no. CIPA3510OCT920044. ILPNET is being
financed for three years, starting on 26th July 1993.
ILPNET gathers 19 leading European institutions involved in
Inductive Logic Programming research:
Romanian Academy of Sciences (RA), Bucharest, Romania (G.
Tecuci),
University of Dortmund (UDO), Dortmund, Germany (K. Morik),
Technical University Graz (TUG), Graz, Austria (M. Kubat),
Katholieke Universiteit Leuven (KUL), Heverlee, Belgium (L. De
Raedt),
J. Stefan Institute and Faculty of Electrical Engineering and
Computer Science (LAI), Ljubljana, Slovenia (N. Lavrac),
Faculty of Technical Sciences Maribor (TFM), Maribor, Slovenia
(B. Dolsak),
Universite Paris-Sud (LRI), Orsay, France (C. Rouveirol),
Oxford University (OUCL), Oxford, United Kingdom (S.
Muggleton),
University of Porto (LIACC), Porto, Portugal (P. Brazdil),
Czech Technical University (CTU), Prague, Czech Republic (O.
Stepankova),
German National Research Center for Computer Science (GMD),
Sankt Augustin, Germany (S. Wrobel),
Bulgarian Academy of Sciences (IINF), Sofia, Bulgaria (Z.
Markov),
University of Stockholm (STO), Stockholm, Sweden (C.G.
Jansson),
Universitat Stuttgart (STU), Stuttgart, Germany (B. Tausend),
Hungarian Academy of Sciences (ARJ), Szeged, Hungary (T.
Gyimothy),
Tilburg University (ITK), Tilburg, the Netherlands (P. Flach),
Universita di Torino (TO), Torino, Italy (F. Bergadano),
Research Institute for Applied Knowledge-Processing (FAW), Ulm,
Germany (R. Wirth),
Austrian Research Institute for Artificial Intelligence
(ARIAI), Vienna, Austria (I. Mozetic).
Inductive Logic Programming (ILP) is a research area in the
intersection of machine learning and logic programming, whose goal
is the development of the theory and practical algorithms for
inductive learning in first-order logic representation formalisms.
The aim of the scientific network ILPNET is to stimulate the
development, coordination, communication, mobility and exchange of
results of European ILP researchers and to disseminate the
research results to a wider research community. In the scientific
sense, the aim of ILPNET is to provide the infrastructure for the
ongoing European research in ILP, most of which is currently
performed within the ESPRIT III Basic Research Project No. 6020
Inductive Logic Programming (coordinated by Luc De Raedt and
Maurice Bruynooghe, Katholieke Universiteit Leuven). A particular
aim of ILPNET is to build new communication and dissemination
channels and to make them available to Central and Eastern
European researchers interested in ILP.
ILPNET objectives
Support existing communication channels between ILPNET nodes and
build new ones.
Enable joint research activities by supporting short visits of
researchers at other ILPNET nodes.
Support organizing and attending specialized meetings and
workshops.
Build a common database of ILP scientific publications, data
and systems.
Promote the results of ILP research outside ILPNET as well.
The management structure of ILPNET
Academic Coordinator: Nada Lavrac, J. Stefan Institute,
Ljubljana, Slovenia
Management Board: Project Managers of ILPNET nodes
Academic Secretary: Darko Zupanic, J. Stefan Institute,
Ljubljana, Slovenia
Contact persons: Nada Lavrac, Darko Zupanic
Jozef Stefan Institute
Jamova 39, 61111 Ljubljana, Slovenia
phone: +386 61 12 59 199,
fax : +386 61 12 58 058 or +386 61 219 385
Email: {Nada.Lavrac,Darko.Zupanic}@ijs.si
More information about ILPNET and its activities can be obtained
via World-Wide Web (WWW) from the ILPNET title page at the J.
Stefan Institute in Ljubljana, at
http://www-ai.ijs.si/ilpnet.html.
ILPNET activities
1. ILPNET publishes the ILP Newsletter, edited by S. Dzeroski and
N. Lavrac. Two issues have been published and sent electronically
to approx. 250 subscribers. To subscribe, send a message to
ilpnet@ijs.si with the subject heading SUBSCRIBE ILPNEWS. The
newsletter includes material relevant to ILPNET and ILP in
general.
The contents of ILP Newsletter 1 (1), 22nd February 1994, are:
Introduction to ILPNET
The GMD repository of ILP publications, data and programs
SIGART special issue on ILP
List of ILP and ILP-related books
ILP book announcements
Calls for papers
The content of ILP Newsletter 1 (2), 23rd May 1994, is actually a
booklet (40 pages) with descriptions of the nineteen ILPNET nodes,
for each node including a list of researchers, main research
areas, description of current research, list of recent
publications and exact address of the contact person.
The ILP Newsletter is also accessible via WWW from
http://www-ai.ijs.si/ilpnet.html.
2. ILPNET supports ILP workshops by providing travel and
subsistence to ILPNET members.
The Third International Inductive Logic Programming Workshop
ILP93 (program chair: Stephen Muggleton, conference chair: Nada
Lavrac), Bled, April 1-3, 1993, was organized by the J. Stefan
Institute (organized before the start of ILPNET).
The Fourth International Inductive Logic Programming Workshop
ILP94 (program and conference chair: Stefan Wrobel) was
organized by GMD in Bad Honnef/Bonn on 12-14 September 1994 (see
the ILP94 report earlier in this issue).
3. Mailing lists of researchers interested in ILP have been
established:
ILPNETlist: a list of ILPNET Project Managers and some other
ILPNET node members.
ILPNEWSlist: a list of researchers subscribed to the ILP
Newsletter.
ILPWORLDlist: a list of individuals interested in ILP.
4. One of the main goals of ILPNET is to establish a central
database which will store lists of scientific publications, public
domain prototypes of ILP systems, and data concerning applications
of ILP. The common database of applications will be used as a
testbed for novel ILP systems.
Currently, the prototype databases are collected at GMD. They can
be accessed by anonymous ftp to ftp.gmd.de in the directories
/MachineLearning/ILP/public/bib, /MachineLearning/ILP/public/data
and /MachineLearning/ILP/public/software. A uniform environment
for accessing ILP references, data and software is being designed
at the J. Stefan Institute - a prototype version of it is already
accessible via WWW from http://www-ai.ijs.si/ilpnet.html.
Currently, an updated list of ILP publications is being collected.
A complete list of ILP and ILP-related books has already been
collected and is accessible via WWW from
http://www-ai.ijs.si/ilpnet.html.
The archive of ILP software at GMD, directory
/MachineLearning/ILP/public/software, presently contains GOLEM,
MILES, MOBAL, macCLINT, INDEX, FILP and FOIL6.1. The archive of
ILP data at GMD, directory /MachineLearning/ILP/public/data,
presently contains data for the finite element mesh design
learning problem.
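For readers who prefer to script the anonymous ftp access described
above, the following Python sketch lists the three ILP directories
using the standard ftplib module. It is only a sketch: the host name
and paths are taken verbatim from the text, the function and
variable names are made up for this example, and no claim is made
that the server is still reachable or laid out this way.

```python
from ftplib import FTP

# Directory paths as given in the text above.
ILP_DIRECTORIES = {
    "bib": "/MachineLearning/ILP/public/bib",
    "data": "/MachineLearning/ILP/public/data",
    "software": "/MachineLearning/ILP/public/software",
}


def list_directory(ftp, directory):
    """Change to `directory` on an open FTP connection and return its entries, sorted."""
    ftp.cwd(directory)
    return sorted(ftp.nlst())


def list_ilp_archive(host="ftp.gmd.de"):
    """Log in anonymously and list each ILP directory (requires network access)."""
    with FTP(host, timeout=30) as ftp:
        ftp.login()  # with no arguments, ftplib performs an anonymous login
        return {name: list_directory(ftp, path)
                for name, path in ILP_DIRECTORIES.items()}
```

Calling list_ilp_archive() would return a dictionary mapping "bib",
"data" and "software" to the file names found in each directory.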
Nada Lavrac
Academic Coordinator of ILPNET
**************************************
**************************************
**************************************
What's new in the MLnet ML Archive at GMD
The ML Archive is now accessible via WWW using the URL
ftp://ftp.gmd.de/ml-archive/README.html. FTP access continues to
be available from ftp.gmd.de, directory /ml-archive.
New arrivals:
Consultant-2, the MLT advice-giving system Version 2.2, March
12, 1993
(ftp-directory: ml-archive/MLT/public/software/Consultant)
Past MLnet newsletters, available in Postscript-Format and
ASCII.
MLnet Policy Statement, available in Postscript-Format.
MLnet First Year Annual Report, available in Postscript-Format.
MLnet List of Nodes, available in Postscript-Format.
Please address all submissions and/or questions about the archive
to ml-archive@gmd.de.
**************************************
**************************************
**************************************
Procedures for joining MLnet
Initial enquiries will receive a standard information pack
(including a copy of the Technical Annex).
All centres interested in joining MLnet are asked to send the
following to MLnet's Academic Coordinator:
A signed statement on Institutional notepaper saying that
you have read and agreed with the general aims of MLnet
given in the Technical Annex;
One hard-copy document listing the Machine Learning (and
related) activities at the proposed node and three copies of
any enclosures; the document should include a list of
scientists involved in these field(s), a half-page curriculum
vitae for each of the senior scientists, a list of current
research students, and lists of recent grants and relevant
publications over the last 5-year period;
A statement of the Technical Committees which the Centre
would be interested in joining, and a succinct statement of
the potential contributions of the Centre to the Network and
its Technical Committees.
Two members of the Management Board will be asked to look at the
material in detail and will present the proposal at the next
Management Board meeting. Through the Network's Coordinator, the
members may ask for additional information.
The Academic Coordinator will be in touch with the Centre as soon
as possible after the Management Board meeting.
The Management Board is not planning to set a fixed timetable for
applications, but advises potential nodes that it currently holds
Management Board meetings in November, April and September, and
that papers would have to be received at least six weeks before a
Management Board meeting to be considered. (Contact Derek Sleeman
for details).
**************************************
**************************************
**************************************
Academic Coordinator:
Derek Sleeman
Department of Computing Science
University of Aberdeen
King's College
Aberdeen AB9 2UE
Scotland, UK
Tel: +44 1224 27 2288/2304
Fax: +44 1224 27 3422
email: {mlnet, sleeman}@csd.abdn.ac.uk
Documents available from Aberdeen:
State of the Art Overview of ML and KA
Recently Announced projects (ESPRIT III)
MLnet Flyer
First Year Report
Policy Statement
**************************************
***** End of MLnetNEWS 3.1 *****
**************************************
------------------------------
End of ML-LIST (Digest format)
****************************************