Copy Link
Add to Bookmark
Report
NL-KR Digest Volume 03 No. 38
NL-KR Digest (10/22/87 21:42:24) Volume 3 Number 38
Today's Topics:
Re: Infinite alphabets - (Turing via Berke)
----------------------------------------------------------------------
Date: Mon, 12 Oct 87 21:26 EDT
From: berke@locus.ucla.edu
Subject: Re: Infinite alphabets - (Turing via Berke)
I thought you might find Turing's words on the subject interesting.
These are from his 1936/7 article "On Computable Numbers, with
an application to the Entscheidungsproblem." I think they are
important. I also quote them in my article "That Does Not Compute"
from ICNN '87, distinguishing brains from computers from neural
networks, an expanded version of which I have submitted to Computer
Magazine. Turing:
"If we regard a symbol as literally printed on a square... The symbol
is defined as a set of points in theis square, viz. the set occupied
by the printer's ink. If these sets are restricted to be measurable,
we can define the "distance" between two symbols as the cost of
transforming one symbol into the other if the cost of moving unit area
of printer's ink unit distance is unity..."
He later goes on...
"I shall also suppose that the number of symbols which may be printed
is finite. If we were to allow an infinity of symbols, then there would
be symbols differing to an arbitrarily small extent. The effect of this
restriction of the number of symbols is not very serious. It is always
possible to use sequences of symbols in the place of single symbols.
Thus an Arabic numeral such as 17 or 999999999999 is normally treated
as a single symbol.
"Similarly, in any European language words are treated as single symbols
(Chinese however, attempts to have an enumerable infinity of symbols).
The differences from our point of view between the single and compound
symbols is that the compound symbols, if they are too lengthy, cannot
be observed at one glance. This is in accord with experience. We
cannot tell at a glance whether 999999999999999 and 9999999999999999
are the same."
These passages led me to suggest that what we want to do with neural
networks is to mechanize perception, not simply to build faster computers.
(Though that is a desirable goal in its own right.)
Please note that it is Turing claiming Chinese "attempts" to have an
enumerable infinity of symbols. I am ignorant of Chinese. At some point,
with an increasing number of symbols, all members of an alphabet cannot be
recognized at-a-glance. Or, looking at it another way, if you presume
an infinity of symbols upon which you are operating, you are not computing.
This is also the intuitive reason I claim ambiguity resolution
is not computable. I think of ambiguity resolution as the
general case of "object recognition" and "problem formulation."
Imagine a vague scene or situation not already described by a finite
enumeration of the objects in it and their relationships, etc.
Such a vague scene is equivalent, in Turing's sense, to supposing
an infinite number of things in the scene, which you are going to reduce
to a finite number of options from which to choose. Since you are
supposing an infinite alphabet (or the equivalent "vague" scene) technically,
you are not computing.
By Turing's above remarks, I think by his definition, Chinese cannot
succeed at having an enumerable infinity of symbols. It can only
"attempt" to have them. Unless you "go down a level" and consider
"things that make up" Chinese symbols "the symbols." Then, there
must be a finite number of them. Does anyone know if this is true
about Chinese? It would seem that even in English it does not apply to
orthography, though it apparently does to letters,
since we use in hand-writing no fixed "alphabet" of strokes, etc. Well,
I meant only to introduce Turing's words, forgive me for going on.
Regards,
Peter Berke
------------------------------
Date: Tue, 13 Oct 87 14:30 EDT
From: Hideyuki Nakashima <russell!nakashim@labrea.stanford.edu>
Subject: Re: Infinite alphabets - (Turing via Berke)
In article <8583@shemp.UCLA.EDU> berke@CS.UCLA.EDU (Peter Berke) writes:
>
>By Turing's above remarks, I think by his definition, Chinese cannot
>succeed at having an enumerable infinity of symbols. It can only
>"attempt" to have them. Unless you "go down a level" and consider
>"things that make up" Chinese symbols "the symbols." Then, there
>must be a finite number of them. Does anyone know if this is true
>about Chinese? It would seem that even in English it does not apply to
>orthography, though it apparently does to letters,
>since we use in hand-writing no fixed "alphabet" of strokes, etc. Well,
>I meant only to introduce Turing's words, forgive me for going on.
>
The comparison between Europian alphabetical system and Chinese
character system is very interesting. Chinese can DO have infinite
characters. What they do is to assign one character to each concept.
Of course, they don't have infinite characters now. But I say it is
possible to have them if they want.
One chinese character usually consists of smaller constituents. For
example, a character for "maple tree" consists of two parts: one
designating "tree" and the other designating "wind". The character
for "forest" consists of three "tree"s.
The character for "tree" comes from a pictorial representation of a
real tree. It is like this:
|
----+----
/|\
/ | \
/ | \
So are characters for "sun", "house", "river" and so on. By this way,
you can just invent a new character for a concept if it is a basic
one. Otherwise you can combine several characters to define a new
one.
I think that Europian way of thinking is analytic while Eastern is
holistic. Having alphabets to compose words vs. having characters for
each concepts, is one of the examples. I further think that digital
computer follows Europian way. I want to come up with Eastern
equivalent of digital computer (not an analog one, though).
Connectionism is one of the possibilities.
--
Hideyuki Nakashima
CSLI and ETL
nakashima@csli.stanford.edu (until Aug. 1988)
nakashima%etl.jp@relay.cs.net (afterwards)
------------------------------
Date: Wed, 14 Oct 87 08:27 EDT
From: mitch bayersdorfer <kodak!bayers@cs.rochester.edu>
Subject: Re: Infinite alphabets - (Turing via Berke)
In article <417@russell.STANFORD.EDU> nakashim@russell.UUCP (Hideyuki Nakashima) writes:
>
>The comparison between Europian alphabetical system and Chinese
>character system is very interesting. Chinese can DO have infinite
>characters. What they do is to assign one character to each concept.
>Of course, they don't have infinite characters now. But I say it is
>possible to have them if they want.
>
I don't mean to make a reductio ad adsurdum, but can any alphabet which
a given human can perceive be truely infinite? Given that human beings have
a finite (but very large) number of neurons in their visual and cerebral
cortex, and that any distinct alphabetic character would exceed the
thresholds of an enumerable permutation of those neurons, mustn't there
be only a finite number of characters (and concepts?). I am assuming that
neural thresholds for a certain learned concept are relatively constant. If
another concept produces the same permutation, but only to a greater
degree, then couldn't that be reduced down to two concepts-- one a measure
of the concept, and the other a measure of its degree?
- Mitch Bayersdorfer
...{}!rochester!kodak!bayers
"The above does not represent the views of the Eastman Kodak Company or
its management."
------------------------------
Date: Wed, 14 Oct 87 11:17 EDT
From: David Feldman <super.upenn.edu!linc.cis.upenn.edu!david@rutgers.edu>
Subject: Re: Infinite alphabets - (Turing via Berke)
In <8583@shemp.UCLA.EDU>, Peter asks:
>By Turing's above remarks, I think by his definition, Chinese cannot
>succeed at having an enumerable infinity of symbols. It can only
>"attempt" to have them. Unless you "go down a level" and consider
>"things that make up" Chinese symbols "the symbols." Then, there
>must be a finite number of them. Does anyone know if this is true
>about Chinese? It would seem that even in English it does not apply to
>
>Peter Berke
Chinese characters are by and large composed of 'radicals' which have base
meanings. These meanings are not necessarily related to the meanings of the
characters that are composed of the radicals, but they sometimes do. For
instance, the question particle 'ma' has the 'ko' radical which is associated
with the mouth. And indeed, there is a finite number of radicals. I have
a couple of sheets describing them. Anything component of a character that is
not a radical can be classified as a stroke, and there are specific kinds of
strokes also.
Dave Feldman
david@linc.cis.upenn.edu
------------------------------
Date: Thu, 15 Oct 87 13:53 EDT
From: Dr. Robin Lake <rbl@nitrex.UUCP>
Subject: Re: Infinite alphabets - (Turing via Berke)
In article <8583@shemp.UCLA.EDU> berke@CS.UCLA.EDU (Peter Berke) writes:
>
> ...
>This is also the intuitive reason I claim ambiguity resolution
>is not computable. I think of ambiguity resolution as the
>general case of "object recognition" and "problem formulation."
>Imagine a vague scene or situation not already described by a finite
>enumeration of the objects in it and their relationships, etc.
>Such a vague scene is equivalent, in Turing's sense, to supposing
>an infinite number of things in the scene, which you are going to reduce
>to a finite number of options from which to choose. Since you are
>supposing an infinite alphabet (or the equivalent "vague" scene) technically,
>you are not computing.
>
> ...
>
>Peter Berke
"Computable/Computing" with the current mathematics. If you go back to the fundamentals
of math (Robinson's axiomatic set theory) and remove the Axiom of Choice, one
can derive a new mathematics which is more appropriate for situations where
uncertainty prevails. This line of reasoning has worked extremely well in
developing some powerful tools for "smart" data analyzers where the data is
too dirty and uncertain for classic statistical techniques.
--
Rob Lake
{decvax,ihnp4!cbosgd}!mandrill!nitrex!rbl
------------------------------
Date: Thu, 15 Oct 87 16:21 EDT
From: Jianhua Zhu <zhu@ogcvax.UUCP>
Subject: Re: Infinite alphabets - (Turing via Berke)
In article <417@russell.STANFORD.EDU>, nakashim@russell.STANFORD.EDU (Hideyuki Nakashima) writes:
> In article <8583@shemp.UCLA.EDU> berke@CS.UCLA.EDU (Peter Berke) writes:
> >
> >By Turing's above remarks, I think by his definition, Chinese cannot
> >succeed at having an enumerable infinity of symbols. It can only
> >"attempt" to have them. Unless you "go down a level" and consider
> >"things that make up" Chinese symbols "the symbols." Then, there
> >must be a finite number of them. Does anyone know if this is true
> >about Chinese? It would seem that even in English it does not apply to
> >orthography, though it apparently does to letters,
> >since we use in hand-writing no fixed "alphabet" of strokes, etc. Well,
> >I meant only to introduce Turing's words, forgive me for going on.
> >
>
> The comparison between Europian alphabetical system and Chinese
> character system is very interesting. Chinese can DO have infinite
> characters. What they do is to assign one character to each concept.
> Of course, they don't have infinite characters now. But I say it is
> possible to have them if they want.
>
...
>
> each concepts, is one of the examples. I further think that digital
> computer follows Europian way. I want to come up with Eastern
> equivalent of digital computer (not an analog one, though).
> Connectionism is one of the possibilities.
>
It looks like a symbol is defined as a finite set of discrete points in
a *finitely bounded* square (otherwise, the number of symbols would be
infinite). According to this definition, I cannot image how Chinese can
have infinite symbols. On the other hand, if one is allowed to use certain
operators operating on symbols to introduce new symbols, then the number of
symbols in any language can potentially be infinite.
As far as written languages are concerned, the main difference between Chinese
and English (or any Europian language) is that English words are composed
of letters from an finite alphabet in a one-dimensional manner (namely,
concatenation), whereas Chinese words are composed of *strokes* from a finite
stroke set as two-dimentional pictures. (YES the stroke set is VERY finite,
in fact not that bigger than English alphabet.)
Although we do have one more degree of freedom by which we can build words
from elements of one level down, we wouldn't want to go too far in this
other direction (we still want to have neatly printed document in our
language). As a matter of fact, currently nobody is inventing new words
by piling up existing words any more, and new words are added, as new
concepts come along, almost exclusively in a form of *word combinations*,
whose English analogy would be hyphenated words.
Yes I quite agree with Mr. Nakashima in that digital computers follow
Europian way. But I don't see how Connectionism in particular can lead
to digital computers (or equivalents) of Eastern way (I do know that
Connectionism architecture can do extremely well for tasks such as
pattern recognition), and would be delighted if someone could provide
further information.
--
/__|__ _/-\_ | /__|__ * Jianhua Zhu *** UUCP: ...!tektronix!ogcvax!zhu
"___|__, --- || /| ,|_/ * CSE Dept - OGC *** CSNet: zhu@oregon-grad
,"|", \|/ || ---+--- * 19600NW von Neumann Dr ***
/ \| \ /--- \| | * Beaverton, OR 97006 ***
------------------------------
Date: Fri, 16 Oct 87 04:17 EDT
From: Tim Smith <tsmith@gryphon.CTS.COM>
Subject: Re: Infinite alphabets - (Turing via Berke)
In article <971@kodak.UUCP> bayers@kodak.UUCP (mitch bayersdorfer) writes:
+=====
| I don't mean to make a reductio ad adsurdum, but can any alphabet which
| a given human can perceive be truely infinite? Given that human beings have
| a finite (but very large) number of neurons in their visual and cerebral
| cortex, and that any distinct alphabetic character would exceed the
| thresholds of an enumerable permutation of those neurons, mustn't there
| be only a finite number of characters (and concepts?). I am assuming that
| neural thresholds for a certain learned concept are relatively constant. If
| another concept produces the same permutation, but only to a greater
| degree, then couldn't that be reduced down to two concepts-- one a measure
| of the concept, and the other a measure of its degree?
| - Mitch Bayersdorfer
+=====
You are addressing a problem that linguists have had to wrestle
with for a long time. Back in the 1950's, the linguist Noam
Chomsky did some original research on the properties of formal
grammars that might be used to generate sentences in a natural
language. Among other things, he discovered that any reasonable
grammar (or production system) to generate sentences in natural
language would have recursive properties that would, in effect,
make the set of sentences of the language infinite in size. The
corollary to this, of course, is that there is no "longest
sentence" in a natural language. Realizing that it is somewhat
silly to claim that a human can utter or comprehend a sentence
that might be a few giga-centuries in duration, Chomsky fudged
around and invented a distinction between linguistic
"competence" and linguistic "performance". Actually, it's a
reasonable fudge.
A good analogy for this is simple arithmetic. If you know the
rules of, say, multiplication, there is no reason why you can't
multiply two numbers that require a few giga-light-years worth of
paper to write down (in 10-point type). You are theoretically
competent to do this. Your performance abilities (life span,
attention span, etc.) will, however, undoubtedly keep you from
achieving the end product.
Cantor discovered several kinds of infinities. Perhaps what we
need to discover is a kind of sub-infinity. The actual number of
sentences in any natural language is "infinite" in the same
sense that the number of grains of sand on a beach is infinite,
i.e. "not really".
Are alphabets like sentences? Uh, no, not at all! I have stayed out
of this discussion about "infinite alphabets", since I do not
understand the issue. I enter reluctantly. Here goes...
An alphabet is a set (decidedly finite) of glyphs (e.g., little
marks on paper). A language adopts an alphabet, and tries to map
its sound system (its phonology) onto the alphabet. Sometimes
this works well, sometimes not so well. For example, both
Italian and English use the Latin alphabet. Neither language's
phonology maps directly to the alphabet, but Italian maps better
than English. In Italian there are only a few little problems
(the letters "c", "g", "e", and "o" do not map one-to-one to
Italian phonemes). In English, there are many, many problems
(which we are all aware of).
The sound system of every language contains a finite number of
phonemes, and a finite (but much larger) number of syllables.
Therefore, any alphabet (phomeme-glyph mapping), or any
syllabary (syllable-glyph mapping) should be finite.
Ideographic writing systems, such as Japanese "kanji", are not
alphabets, and they are not syllabaries. If we assume that the
languages that use ideographic writing systems allow free
creation of new glyphs, then these writing systems have an
infinite number of glyphs, in exactly the same sense that there
is an infinite number of sentences in English, or that there is
an infinite number of art works that can be created by mankind.
--
Tim Smith
INTERNET: tsmith@gryphon.CTS.COM
UUCP: {hplabs!hp-sdd, sdcsvax, ihnp4, ....}!crash!gryphon!tsmith
UUCP: {philabs, trwrb}!cadovax!gryphon!tsmith
------------------------------
Date: Fri, 16 Oct 87 12:07 EDT
From: Richard Wojcik <rwojcik@bcsaic.UUCP>
Subject: Re: Infinite alphabets - (Turing via Berke)
There seems to be some confusion over the nature of writing systems for
human languages. There are three 'ideal' types: logographic
(word-based), syllabic, and alphabetic. Alphabetic writing systems are
based on phonemes, the basic units of sound that make up words. Since
most languages have between 30 and 50 phonemes, it makes little sense to
talk about natural writing systems with infinite alphabets. Logographic
writing systems, the least efficient type of writing, can have only as
many symbols as there are words (or morphemes) in the vocabulary. If it
is necessary to carry on this discussion, why not debate the size of the
class of objects that written symbols map onto?
------------------------------
Date: Fri, 16 Oct 87 12:16 EDT
From: Hideyuki Nakashima <nakashim@russell.STANFORD.EDU>
Subject: Re: Infinite alphabets - (Turing via Berke)
In article <971@kodak.UUCP> bayers@kodak.UUCP (mitch bayersdorfer) writes:
>In article <417@russell.STANFORD.EDU> nakashim@russell.UUCP (Hideyuki Nakashima) writes:
>>
>> Chinese can DO have infinite characters.
>
>I don't mean to make a reductio ad adsurdum, but can any alphabet which
>a given human can perceive be truely infinite?
OK. Let me change "infinite" to "unbounded". Who cares for infinity
in real life, anyway?
What I wanted to say was, however, not that point. What Chinese tried
was (or so seems tome was) to assign a symbol for each distinct
concepts. So, as the number of concepts grow, so does the number of
symbols (potentially). Europian "word"s correspond to Chinese
"character"s. No Chinese equivalent of Europian "alphabet"s exist.
--
Hideyuki Nakashima
CSLI and ETL
nakashima@csli.stanford.edu (until Aug. 1988)
nakashima%etl.jp@relay.cs.net (afterwards)
------------------------------
Date: Sun, 18 Oct 87 17:11 EDT
From: mitchell spector <spector@suvax1.UUCP>
Subject: Re: Infinite alphabets - (Turing via Berke)
In article <557@nitrex.UUCP>, rbl@nitrex.UUCP ( Dr. Robin Lake ) says:
> .... If you go back to the fundamentals
> of math (Robinson's axiomatic set theory) and remove the Axiom of Choice, one
> can derive a new mathematics which is more appropriate for situations where
> uncertainty prevails. This line of reasoning has worked extremely well in
> developing some powerful tools for "smart" data analyzers where the data is
> too dirty and uncertain for classic statistical techniques.
What specifically are you thinking of? Robinson developed non-standard
analysis, which is basically calculus with infinitesimals, done rigorously.
The axiom of choice *is* generally used to construct a hyperreal number
system (a structure containing both real numbers and infinitesimals which
has certain desirable properties), as in Robinson's theory.
While there is a substantial body of mathematics in which the axiom of
choice is rejected, two things should be pointed out. The first is that it
is almost never satisfactory merely to remove the axiom of choice from set
theory (or even to adopt as an axiom the negation of the axiom of choice;
the resulting theory is far too weak to prove many interesting results.
Instead, various strong axioms which happen to contradict the axiom of choice
are used as replacements; the chief of these is the axiom of determinateness.
The second is that the acceptance or rejection of the axiom of choice has
no known effect on any practical problem (for example, anything in physics).
(This is certainly true if one is willing to adopt the much weaker axiom of
dependent choices, as almost everyone in the field does.) For example, the
same sentences of number theory can be proven with the axiom of choice as
without it. Moreover, all statements of calculus or analysis of the type
used in physics can be rephrased as statements of number theory, although
this can get rather tedious and technical. It follows that none of these
statements require the axiom of choice for their proofs either. It is only
when one starts to consider things at a more abstract mathematical level that
the axiom of choice comes into play (for example, the existence of a
non-measurable set of real numbers).
I'd be interested in the examples referred to in the previous message
(although I don't know if this discussion belongs in sci.lang -- that depends
on the examples, I guess).
--
Mitchell Spector
Dept. of Computer Science/Software Engineering, Seattle University
Path: ...!uw-beaver!uw-entropy!dataio!suvax1!spector
or: dataio!suvax1!spector@entropy.ms.washington.edu
------------------------------
Date: Mon, 19 Oct 87 11:27 EDT
From: "Virendra Verma, DTN 297-5510" <verma@hpscad.dec.com>
Subject: Re: Infinite alphabets - (Turing via Berke)
In article <sci.lang:1530> tsmith@gryphon.CTS.COM (Tim Smith) writes:
> The sound system of every language contains a finite number of
> phonemes, and a finite (but much larger) number of syllables.
> Therefore, any alphabet (phomeme-glyph mapping), or any
> syllabary (syllable-glyph mapping) should be finite.
That is very true in Devanagari script. There are only 33 alphabets
in Devanagari script and they produce unique phonetic words.
This means that you can pronounce any unheard word (of any language)
exactly the way it is written in Devanagari script without any ambiguity.
Devanagari script is used by Sanskrit and Hindi, the two prominent
languages of India.
- Virendra
------------------------------
Date: Mon, 19 Oct 87 11:59 EDT
From: Randy Gordon <randyg@iscuva.ISCS.COM>
Subject: Re: Infinite alphabets - (Turing via Berke)
In article <436@russell.STANFORD.EDU> nakashim@russell.UUCP (Hideyuki Nakashima) writes:
>............. No Chinese equivalent of Europian "alphabet"s exist.
>--
>Hideyuki Nakashima
>nakashima@csli.stanford.edu (until Aug. 1988)
>nakashima%etl.jp@relay.cs.net (afterwards)
sigh, I am probably gonna get my head bit off, but....
I don't know much about Chinese writing, but in terms of formal game theory,
does it really make much of a difference to the equivalence of the languages
that chinese use simple tokens(well sorta) and europeans use complex tokens?
From what I can tell, there is nothing that is impossible to translate either
way(tho it may take a number of manipulations).
RANDY GORDON(TAO KUO TSE FUN PEE)
------------------------------
End of NL-KR Digest
*******************