NL-KR Digest (Mar 13 1989) Volume 6 No. 7
Today's Topics:
Software for Automatic Translation
N (N>2) party dialogue
addresses of NANOKLAUS authors
e-mail address E.WANNER
Logical and Semantic Paradoxes
Submissions: nl-kr@cs.rpi.edu
Requests, policy: nl-kr-request@cs.rpi.edu
Back issues are available from host archive.cs.rpi.edu [128.213.1.10] in
the files nl-kr/Vxx/Nyy (ie nl-kr/V01/N01 for V1#1), mail requests will
not be promptly satisfied. If you can't reach `cs.rpi.edu' you may want
to use `turing.cs.rpi.edu' instead.
---------------------------------------------------------
To: nl-kr@cs.rpi.edu
Date: Wed, 22 Feb 89 14:42:58 IST
From: Itamar Even-Zohar <B10%TAUNIVM.BITNET@CUNYVM.CUNY.EDU>
Subject: Software for Automatic Translation
Here is a short description of TOVNA, a machine translation system
which I have been preoccupied with recently. I believe this
information can be of interest to others who are both skeptical of
and fascinated by machine translation. This particular system is very
promising indeed.
(I wish to declare that I am in no commercial or other way connected
to this product. My report is wholly based on information received
from the company when I was testing it, as well as on my personal
experience with its performance. Though I have not operated it
independently, I managed to test it in a sufficient variety of ways
to be able to express some opinion about its capacities.
- Itamar Even-Zohar)
TOVNA - MACHINE TRANSLATION SYSTEM
TOVNA MTS (I will refer to it in the following as "Tovna") is a
sophisticated AI solution for multi-language environments. It
currently allows automatic translation between French and English and
between Russian and English (in both directions). The French-English
option is at a more advanced stage than the Russian-English one.
In accordance with new developments in this field, automatic
translation (AT) is no longer conceived of as man-independent.
Translation is interactive in the sense that both a "regular" and an
advanced user (a "power user") can intervene in the various stages of
the MTS decision making. The system consequently can be taught both
rules and new material, including personal preferences on various
levels, both directly and indirectly (through extraction - see
below).
Tovna maintains a complete and rigorous separation of knowledge of
the language from the software. This means that there is only one set
of software programs which work in exactly the same way with *all
languages* available with Tovna. There is only one system for the
user to learn.
Moreover, Tovna is a learning system which improves with use. The
more it translates, the better its performance.
Ambiguity (which leads to incorrect translation) is handled by
discovering, at each phase of the translation process, all the
possible alternatives, passing them on to the next phase in the
expectation that later phases will reject the incorrect alternatives.
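(A minimal sketch, in Python and entirely my own, of this general
strategy as I understand it from the description, not of Tovna's actual
implementation: each phase keeps every alternative it cannot yet rule
out, and a later phase is expected to reject the wrong ones. The toy
tagging and filtering phases, and all the names, are invented purely for
illustration.)

    # Carry all alternatives forward; let later phases reject the bad ones.
    def run_phases(sentence, phases):
        alternatives = [sentence]
        for phase in phases:
            next_alternatives = []
            for alt in alternatives:
                next_alternatives.extend(phase(alt))   # expand or reject
            alternatives = next_alternatives
        return alternatives

    # Hypothetical phase 1: 'saw' is ambiguous between noun and verb.
    def tag_phase(words):
        options = {"saw": ["saw/N", "saw/V"]}
        readings = [[]]
        for w in words:
            readings = [r + [t] for r in readings for t in options.get(w, [w + "/X"])]
        return readings

    # Hypothetical phase 2: reject readings in which 'saw' is not a verb.
    def filter_phase(tagged):
        return [tagged] if "saw/V" in tagged else []

    print(run_phases(["I", "saw", "her"], [tag_phase, filter_phase]))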
The problem of incomplete specification of grammar (which leads to
incomplete translations) is handled by Tovna's capacity to extract
(construct) rules from examples. The linguist who "teaches" Tovna a
language's grammar can do so by either specifying a rule, or where
more convenient, by providing a local solution to a specific case,
i.e., an example. One is never required to specify an algorithm and
in fact has no way of doing so.
Although ambiguity and incomplete specification of grammar are the
crucial problems which must be solved by an MT system, they are
hardly the only ones whose solution is critical to the success of the
system. Other, less technical but still important issues which must
be addressed are:
a. Pre-editing and post-editing of text.
b. Adding new words and phrases to dictionaries.
c. Adding new languages to the system.
Pre-editing and post-editing of text consume valuable time because
the user must hunt down sections which require post-editing, and the
output format is often not suitable for word processors and
typesetting equipment. With Tovna, no pre-editing is required. A high
degree of accuracy will eventually eliminate the need for most post-
editing as the system improves its performance. Moreover, Tovna
maintains typesetting and control codes for complete compatibility
with word processors and typesetting equipment (existing and future).
Tovna makes it easy to add words and phrases to dictionaries by
providing sophisticated, easy-to-use, menu-based screens which enable
the user to enter the required data accurately and quickly in a
user-friendly environment.
The problem of adding new languages to an MT system is especially
vexing. Most existing systems have to be completely rewritten to
accommodate a new language, a process which takes several years.
Often, the new system has different capabilities and a different user
interface, thus confusing the user. Since Tovna maintains a complete
and rigorous separation of knowledge of the language from the
software, new languages can be added relatively quickly. ("Quickly"
is, of course, relative: I am told that each new language requires
something between 6 months and 2 years, depending on how remote the
relevant language is from the extant material.)
The language's complexity is reflected not in the algorithms but
rather in the rules and in the example-based language model. More
complex languages simply have more rules and more examples in their
models. The software is the same for all languages, and the system's
capabilities and user interface are consistently maintained across
all languages.
In addition to being language independent (that is, the same software
works with all languages), Tovna is also operating system independent
and machine independent. Tovna can work with most commonly available
operating systems and most commonly available computers. It works
best, however, with large memory and large storage, which means that
it would be fastest with an advanced SUN. When I worked with it on a
SUN (with 16 MB of memory), its speed was very impressive, especially
in entering new material and teaching it new features.
Tovna headquarters are located in Tel-Aviv and Jerusalem. The
European sales office is located in London. Here are the addresses
for those who wish further information:
Tovna TM Ltd.
Yigal Alon 127
Tel Aviv 67443
Israel
(Phones: 03-256252/3; Fax: 03-256257)
Tovna TM Ltd.
Betar 17
Jerusalem
(Phones: 02-712623, 02-719157)
Tovna TM Ltd.
C.I.B.C Building
Cottons Lane
London SE12QL
England
(Phones: 1-2346633/4/5. Fax: 1-2346897)
Itamar Even-Zohar
Porter Institute for Poetics and Semiotics
Tel Aviv University
------------------------------
To: nl-kr@cs.rpi.edu
Date: Fri, 24 Feb 89 13:47:19 GMT
From: acwf%doc.imperial.ac.uk@NSS.Cs.Ucl.AC.UK
Subject: N (N>2) party dialogue
I am interested in models of dialogue/conversation. In particular models
which deal with dialogue control (turn-taking etc) rather than linguistic
processing per se. I would be grateful for any guidance on discussions
of N-party dialogue where N>2. If these are discussions of computational
models so much the better.
Copies of references mailed directly to me will be circulated to interested
parties.
Anthony Finkelstein Imperial College, Dept. of Computing (Univ. of London)
uk.ac.ic.doc
------------------------------
To: nl-kr@cs.rpi.edu
Date: Mon 27 Feb 89 14:37:53-PDT
From: Steve Albrecht <ALBRECHT@INTELLICORP.COM>
Subject: addresses of NANOKLAUS authors
I need current addresses for Norman Haas and Gary G. Hendrix (perhaps
formerly) of SRI International.
NANOKLAUS is a knowledge acquisition system which uses an English-language
interface. It was reported on by Haas and Hendrix in the Proc. of the First
Annual Conf. on Artificial Intelligence, 1980.
References to subsequent related work would also be appreciated.
Thanks for any assistance.
(:::::::::::::::::::::::::::::::::::::::::::::)
) Steve Albrecht IntelliCorp,Inc. (
( Knowledge Systems Product Development )
) "Opinions expressed in this message are my (
( own, if anyone's, and not my employer's." )
) CSNET albrecht@intellicorp.com (
( UUCP ...!sun!intellicorp.com!albrecht )
) or ...!sun!icmv!albrecht%caliph (
(:::::::::::::::::::::::::::::::::::::::::::::)
------------------------------
To: nl-kr@cs.rpi.edu
From: prlb2!kulcs!siegeert@uunet.UU.NET (Geert Adriaens)
Newsgroups: comp.ai.nlang-know-rep
Subject: e-mail address E.WANNER
Keywords: ATNs, parallelism
Date: 7 Mar 89 11:46:16 GMT
I'm looking for E. Wanner's e-mail address (need urgent contact
in relation to parallelism in ATNs). Help.
- -
Geert Adriaens                 SIEMENS-METAL Project
Maria Theresiastraat 21        siegeert@kulcs.uucp or
B-3000 Leuven                  siegeert@blekul60.bitnet or
tel: ..32 16 285091            siegeert@cs.kuleuven.ac.be
------------------------------
To: nl-kr@cs.rpi.edu
From: Aaron Sloman <mcvax!cvaxa.sussex.ac.uk!aarons@uunet.UU.NET>
Date: 2 Mar 89 10:56:44 GMT
Subject: Submission for comp-ai-nlang-know-rep
There's been a lot of discussion of logical and semantic paradoxes
in comp.ai. As my comments are also relevant to comp.ai.nlang-know-rep
I am cross posting.
Here's an example of a type that has not yet appeared in comp.ai:
The father of the subject of this sentence is bald
This is a case of infinite recursion in the semantics of the referring
expression
"The father of the subject of this sentence"
Take any function symbol f (e.g. 'the father of'), and an expression s
such that s refers to the thing denoted by 'f(s)' (in my example s
is 'the subject of this sentence'). Then if P is any predicate, the
sentence
P(f(s))
will have this property of infinite recursion (or infinite iteration
if you prefer) in the natural semantic interpretation of the argument
of the predicate.
This, like all the old philosophical examples ('This statement is
false', 'The present king of france is bald' and the like), are
illustrations of the very same general principle:
In a natural language (or a sufficiently rich formal language) it is
impossible to guarantee that syntactically well formed expressions
(whether referring expressions, predicate expressions, or whole
sentences) are semantically well formed in the sense of identifying some
entity (an object, a function, a truth value) of the type normally
identified by expressions of that syntactic category.
I.e. you cannot use syntactic well formedness to guarantee extension (or
reference, or denotation.)
This is not to say that the resulting complex expressions are
meaningless.
I think Gottlob Frege had all the essential insights required to
understand these phenomena (see the collection of translations of his
papers edited by P. Geach and M. Black, Oxford: Blackwell, 1960, especially
the paper on 'Sense and Reference', German version 'Sinn und Bedeutung').
(Incidentally, all the main ideas of lambda calculus come from Frege.)
Frege's key idea, which I think disposes of the paradoxes, is a
distinction between what we might call extensional meaning (= reference,
denotation, extension, and includes objects, sets, truth-values etc) and
intensional meaning (= sense, connotation, intension).
Roughly, but only VERY roughly (remember that qualification), the latter
is closer to what gets preserved when you translate from one language to
another. Two referring expressions that have different senses can refer
to the same object, i.e. have the same extension (Frege's example was
'the evening star is the morning star' i.e. both expressions refer to
the planet Venus, but in different ways. I.e. different procedures are
relevant to checking whether an object is the one referred to. I am
not saying, however, that the procedures are well defined in this case).
Similarly two predicate expressions or function expressions may
correspond to the same mathematical function in the sense of the same
set of argument/value pairs, yet identify that function in different
ways, simple examples being the functions
f(x) = x*x - 16
f(x) = (x - 4)*(x + 4)
or the predicates
x has a heart
x has kidneys (I think these are co-extensive)
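(A concrete illustration, mine rather than the posting's, of the same
point in Python: two procedures that differ as procedures yet agree on
every argument you try, i.e. two senses with one extension.)

    # Two different procedures (senses) for the same function (extension):
    # x*x - 16 and (x - 4)*(x + 4) agree on every argument.
    def f1(x):
        return x * x - 16

    def f2(x):
        return (x - 4) * (x + 4)

    assert all(f1(x) == f2(x) for x in range(-1000, 1001))  # same extension
    assert f1 is not f2                                      # distinct procedures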
Frege showed that a great deal of linguistic complexity can be accounted
for as resulting from the application of functions to arguments,
including higher order functions, which is how he (with great
originality) analysed quantifiers ("all", "some", "every" etc.)
Some of the functions (and here he generalized the work of Boole),
including not only things like "and", "not" etc, but also predicates and
quantifiers, were analyzed as having truth values for their values (i.e.
some arbitrary pair of objects T and F, treated in an asymmetrical way
pragmatically, but otherwise TOTALLY symmetrical).
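(Again a sketch of my own, in Python, of the idea that quantifiers are
higher-order functions whose values are truth values; it is restricted
to a finite, explicitly listed domain and is not meant as a rendering of
Frege's own notation.)

    # Quantifiers as higher-order functions: each takes a predicate
    # (a function from objects to truth values) and returns a truth value.
    DOMAIN = range(10)

    def every(pred):
        return all(pred(x) for x in DOMAIN)

    def some(pred):
        return any(pred(x) for x in DOMAIN)

    is_even = lambda x: x % 2 == 0

    print(every(is_even))   # False: 1, 3, 5, ... are in the domain
    print(some(is_even))    # True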
He then suggested that just as the denotation (reference, extension,
value or whatever you want to call it) of a complex expression was
determined by the way in which it was composed of sub-expressions (e.g.
'(3 * 5) + (6 - 99)') similarly the SENSE (intension, connotation, Sinn,
or whatever you want to call it) is determined by the senses of the
sub-expressions and the ways they are combined.
Although he did not use a computational explanation of all this, I think
a very natural interpretation is that the SENSE of an expression
corresponds to what we would now call a PROCEDURE that can be executed
to compute the value, and the DENOTATION is the result you get.
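(Here is that gloss made concrete as a Python sketch of my own, not
Frege's apparatus: the sense of '(3 * 5) + (6 - 99)' is a procedure
composed from the procedures for its sub-expressions, and the denotation
is the value that procedure returns when executed.)

    # Sense = a procedure composed from the senses of the sub-expressions;
    # denotation = the value obtained by executing that procedure.
    def lit(n):
        return lambda: n                 # sense of a numeral

    def plus(a, b):
        return lambda: a() + b()

    def minus(a, b):
        return lambda: a() - b()

    def times(a, b):
        return lambda: a() * b()

    sense = plus(times(lit(3), lit(5)), minus(lit(6), lit(99)))
    denotation = sense()                 # executing the sense yields -78
    print(denotation)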
The notion of a procedure here is actually very difficult to define with
sufficient generality, and there are problems defining criteria for
identity of procedures, especially when they are expressed in totally
different formalisms -- discussing that would lead into a discussion of
layers of procedures involving different virtual machines. It is
particularly difficult to make precise the notion of a procedure that is
not applied to internal datastructures, but to objects in the real world
to compute a value. Worse, we have many expressions that allude to an
ill defined assumed equivalent family of procedures, without selecting
one unambiguously as THE sense of the expression. (Bill Woods has been
trying to clarify these notions for years.)
Anyhow, once you have gone down this route, the paradoxes are relatively
easy to dispose of, because, assuming that semantic complexity does
derive from the application of procedures to arguments which themselves
may have to be identified by procedures applied to arguments, it
follows that there are various ways in which a simple or complex
expression that is well formed may fail to determine a denotation, just
as every programming language rich enough to be general purpose allows
syntactically legal programs to be constructed that generate run time
errors or loop forever.
Examples of ways in which reference of an expression can fail are:
a. It's a totally undefined expression - e.g. the subject expression in
Zappwiddle is bald.
b. The execution of the corresponding procedure fails to identify any
object because that's how the reality referred to happens to be
The present king of france
The largest prime number between 24 and 28
(The sense is pretty clear - if there were such an object we'd
know what the expression referred to.)
c. There is not a unique object
The person in the next room
(There may be five persons in the next room)
d. The procedure cannot be executed because the arguments to which it is
applied are of the wrong type
Thursday + 17.3
The king of space
Thursday is bald
e. The procedure can be executed but it fails to terminate because of
some aspect of the reality it is applied to
Start with 0 and keep adding 1 and stop when you get to the
largest number (or some number x such that x + 1 = x, or ...)
The original male ancestor of Fred
(in a world with an infinite past)
f. The procedure has an internal loop
f(10), where f(x) = x * f(x-1)
or
f(f), where f(x) = not(x(x))
(Note that this "Russell" function is easy to define in
languages like Lisp and Pop-11.
What this sentence says is false
(You first have to identify what it says, then check its truth
value, but to do that you have to identify what it says and
check its truth value, ... etc.)
What this sentence says is true
(It has EXACTLY the same problems)
The father of the subject of this sentence is bald
(You have to find the subject, then get his father, then that
is the subject, but then you have to get his father, then that
is the subject ... etc.)
The set of all sets that do not contain themselves
The set of all sets that do contain themselves
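(As promised under (f): the "Russell" function, and the base-case-free f
above it, written out in Python rather than Lisp or Pop-11, purely for
illustration. Both definitions are syntactically legal and have a
perfectly clear sense, but executing either expression never yields a
value; Python reports the failure as a RecursionError when the call
stack limit is reached.)

    # Non-terminating procedures from (f): well formed, but no value.
    def f(x):
        return x * f(x - 1)      # no base case: the recursion never bottoms out

    def russell(x):
        return not x(x)          # the "Russell" function not(x(x))

    for expr in (lambda: f(10), lambda: russell(russell)):
        try:
            expr()
        except RecursionError:
            print("well formed, but no value is ever computed")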
In all these cases except the use of undefined symbols, we have a
syntactically well formed expression with a well defined SENSE (i.e. a
complex procedure defined in terms of the application of simpler
procedures to their arguments), and in some cases, though not all (e.g.
where a primitive argument expression totally fails to refer), we can
even begin to execute the procedure. But there is no DENOTATION
(reference, extension, value), though for different reasons in the
different cases.
Frege's definition of "sense" was not expressed in terms of procedures.
He used a host of unsatisfactory metaphors (including comparing the
sense with the image in a telescope). Neither is it clear that it can be
used in connection with all referring expressions (e.g. personal
pronouns and other indexicals), as he found when he tried.
He also at one stage abandoned one of his main insights when he
proposed that if an expression in his formal language failed to denote
anything then it should be taken to denote "the false" (removing the
symmetry of truth values).
By assuming the law of the excluded middle ( 'P or not P' must be true,
no matter what P is) without allowing for cases where there is no value,
you can, of course, get contradictions out of these examples. That was
the source of Bertrand Russell's misery. He did not wish to give up the
law. This forced him into a totally unnatural interpretation of
(1a) The present king of france is bald
as equivalent to something of the form:
(1b) There is at least one thing which is a KofF &
There is at most one thing which is a KofF
and for all x, if x is a KofF then x is bald
NB this is not a circular analysis because "KofF" is treated as
a predicate in (1b), not a referring expression.
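(For readers who prefer the symbolic form, one standard first-order
rendering of (1b), in my transcription with KofF and Bald as predicates,
is:

    \exists x \, \bigl( KofF(x) \;\wedge\; \forall y \, ( KofF(y) \rightarrow y = x ) \;\wedge\; Bald(x) \bigr)   )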
(1b) is false because of the first conjunct, whereas on the above,
Fregean, analysis (1a) would simply fail to denote any truth value
because the function 'x is bald' is not supplied with an argument since
the subject fails to refer, because that's how the world is.
Frege also pointed out that there are some contexts in which we
manage to refer to the SENSE normally expressed by an expression
(as I have done several times above). The expression whose reference is
a sense then has a higher order sense. Defining the semantics of such
higher order expressions is tricky, e.g. in contexts like
Fred wants to meet the King of France
Joe is trying to find the largest prime number
There are many loose ends in the above theory especially when you try to
apply it to something as rich, messy and ill defined as a natural
language.
However, the main point of this posting is that you can't hope to
understand the problems generated by the paradoxes (or most other deep
philosophical problems) without exploring a lot of the existing
literature. Unfortunately, my own explorations are probably now out of
date (the above analysis was done long ago). In particular, it is
possible that recent work on the semantics of programming languages is
very relevant, and perhaps a suitably informed reader will comment. But
I suspect that even that work has not addressed all these issues
properly, since programming languages are not YET rich enough to
generate the problems. Just wait till we program computers in English!
Aaron Sloman,
School of Cognitive and Computing Sciences,
Univ of Sussex, Brighton, BN1 9QN, England
ARPANET : aarons%uk.ac.sussex.cogs@nss.cs.ucl.ac.uk
aarons%uk.ac.sussex.cogs%nss.cs.ucl.ac.uk@relay.cs.net
JANET aarons@cogs.sussex.ac.uk
BITNET: aarons%uk.ac.sussex.cogs@uk.ac
UUCP: ...mcvax!ukc!cogs!aarons
or aarons@cogs.uucp
IN CASE OF DIFFICULTY use "syma" instead of "cogs"
------------------------------
End of NL-KR Digest
*******************