NL-KR Digest Volume 02 No. 57
NL-KR Digest (6/22/87 16:40:06) Volume 2 Number 57
Today's Topics:
What are facts in linguistics?
The ISI Grapher
database for syntax analysis (or other) trees
Re: Could X's inflection in [X (Y) Z] depend on the presence of Y?
----------------------------------------------------------------------
Date: Thu, 11 Jun 87 10:35 EDT
From: Bruce Nevin <bnevin@cch.bbn.com>
Subject: What are facts in linguistics?
In NL-KR 2.53 (6/11/87) Meg (withgott.pa@Xerox.COM) objects to my
`surprising characterization' of aspirated h in French. She compares it
with the a/an alternation in English, `but with more wrinkles'. She
accepts it as an empirical fact of French, allowing
MW> If you try an empirical approach, and test all the aspirated h words in
MW> all the dictionaries you can find, you will no doubt find some
MW> distributional regularities (if you want to try, look at number of
MW> syllables, etymology, initial vowel quality). Then what do you do? You
MW> still have to figure out how to represent the thing, whether you write
MW> generative analyses, computer systems, or French text books.
But the issue is not how to represent facts, but what indeed constitute
significant facts in linguistics. Gross addresses this issue at some
length. It would help if you would read the article that I cited, On
the Failure of Generative Grammar, in _Language_ 55.4 (1979).
Let me quote a relevant excerpt from that article:
Consider two cases of liaison:
(16) les haricots: /leariko/, */lezariko/ 'the beans'
les animaux: */lesanimo/, /lezanimo/ 'the animals'
These are orthographically distinguished by means of aspirated h. But
it can be easily verified that this distinction is entirely artificial,
and has been explicitly imposed by the French educational system.
Only so-called educated persons possess the h, while most French
speakers struggle in vain to pronounce /leariko/, ending invariably
with /lezariko/. Furthermore, children never have h at the age when
they master the complete phonological system of French, i.e. before
they enter school. Teaching h is difficult, as can be heard daily in
the classroom and in the street. The non-existence of the linguistic
problem is confirmed by the lack of internal coherence of data:
he'ros 'hero' has h, but not fem. he'roi:n; he'ron 'heron' has h, and
fem. he'ronne does too. The verb harnacher 'to harness' is supposed
to have h, implying that 1sg. je harnacherai is pronounced with schwa.
However, in 3pl ils harnacheront, the form with h is not accepted:
*/ilarnashro~/. [That should be s-hachek for sh and nasalized open o
(turned c) for o~, but I lack the character set!--BN] There are
numerous similar cases. Moreover, school teachers do not `correct'
liaisons of pupils beyond the commonest syntactic positions, between
article and noun and between subject pronoun and verb. Hence, in
constructions not taught at school, all liaisons are made in the
natural way, i.e. following the dominant consonant-vowel rule:
[examples omitted--read the article.--BN]
. . . These aspects of the use of h are artifacts of pedagogy, and
have nothing to do with the way in which the phonological and
syntactic system of French is learned. Generative linguists, unaware
of such considerations, have argued about this phenomenon as if it
were illuminating for the structure of language. . . . [Reference to
detailed review omitted.] (op cit, 868-9)
So much for similarity to the a/an contrast, whose loss is strictly
dialectal in English.
If researchers are discussing a significant fact about a language, from
which claims about Language and Universal Grammar might be drawn, then
yes, the representational issues may be worth spilling some ink over, and yes
indeed `looking at data only gets you part way there'. But when the
abstract issues of representation overshadow mere data (and the liaison
case was only one instance of many Gross cites)--well, to quote Minsky
not too long ago re Generative grammar, `a mind is a terrible thing to
waste.'
PS--an error on my part: a document by two authors, like the long piece
on the historiography of structuralism by Hymes and Fought, can't be
called a monograph, rather a--hmm!--a digraph? :-)
Bruce Nevin
bn@cch.bbn.com
(This is my own personal communication, and in no way expresses or
implies anything about the opinions of my employer, its clients, etc.)
------------------------------
Date: Sat, 13 Jun 87 16:35 EDT
From: Gabriel Robins <gabriel@vaxa.isi.edu>
Subject: The ISI Grapher
[Excerpted from AIList]
Greetings,
Due to the considerable interest drawn by the ISI Grapher so far, I am
posting this abstract summarizing its function and current status. Interested
parties may obtain further information by directly sending EMail to
"gabriel@vaxa.isi.edu" or by writing to:
Gabriel Robins
Intelligent Systems Division
Information Sciences Institute
4676 Admiralty Way
Marina Del Rey, Ca 90292-6695
If you want documentation in hardcopy, please include your U.S. Mail address.
Gabe
The ISI Grapher
June, 1987
Gabriel Robins
Intelligent Systems Division
Information Sciences Institute
The ISI Grapher is a set of functions that convert an arbitrary graph
structure (or relation) into an equivalent pictorial representation and
display the resulting diagram. Nodes and edges in the graph become boxes and
lines on the workstation screen, and the user may then interact with the
Grapher in various ways via the mouse and the keyboard.
The fundamental motivation which gave birth to the ISI Grapher is the
observation that graphs are very basic and common structures, and the belief
that the ability to quickly display, manipulate, and browse through graphs may
greatly enhance the productivity of a researcher, both quantitatively and
qualitatively. This seems especially true in knowledge representation and
natural language research.
The ISI Grapher is both powerful and versatile, allowing an
application-builder to easily build other tools on top of it. The ISI NIKL
Browser is an example of one such tool. The salient features of the ISI
Grapher are its portability, speed, versatility, and extensibility. Several
additional applications have already been built on top of the ISI Grapher,
providing the ability to graph lists, flavors, packages, divisors, functions,
and Common-Loops classes.
Several basic Grapher operations may be user-controlled via the specification
of alternate functions for performing these tasks. These operations include
the drawing of nodes and edges, the selection of fonts, the determination of
print-names, pretty-printing, and highlighting operations. Standard
definitions are already provided for these operations and are used by default
if the application-builder does not override them by specifying his own
custom-tailored functions for performing the same tasks.
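
The override mechanism described above can be illustrated with a minimal
sketch. It is written in Python rather than the Grapher's CommonLisp, and
every name in it (GrapherHooks, print_name, draw_node, render) is
hypothetical. It shows only the general pattern of default operations that an
application-builder may replace, not the actual ISI Grapher interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


# Hypothetical default operations; an application-builder may replace either.
def default_print_name(node: object) -> str:
    return str(node)


def default_draw_node(label: str) -> str:
    return "[" + label + "]"


@dataclass
class GrapherHooks:
    print_name: Callable[[object], str] = default_print_name
    draw_node: Callable[[str], str] = default_draw_node


def render(graph: Dict[str, List[str]], root: str,
           hooks: Optional[GrapherHooks] = None) -> List[str]:
    """Return one text line per node, indented by depth.

    Plain text stands in here for the boxes and lines of a real display.
    """
    hooks = hooks or GrapherHooks()
    lines: List[str] = []

    def visit(node: str, depth: int, path: frozenset) -> None:
        lines.append("  " * depth + hooks.draw_node(hooks.print_name(node)))
        for child in graph.get(node, []):
            if child not in path:  # skip edges that would close a cycle
                visit(child, depth + 1, path | {child})

    visit(root, 0, frozenset([root]))
    return lines


# Defaults are used unless the caller overrides them.
graph = {"thing": ["animal", "plant"], "animal": ["dog"]}
plain = render(graph, "thing")
shouting = render(graph, "thing", GrapherHooks(print_name=lambda n: str(n).upper()))
```

Only the print-name operation is overridden in the second call; the drawing
operation still falls back to its default.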
The ISI Grapher now spans about 100 pages of CommonLisp code. The 120-page
ISI Grapher manual is available; this manual describes the general ideas, the
interface, the application-builder's back-end, the algorithms, the
implementation, and the data structures. The ISI Grapher presently runs on
both Symbolics (6 & 7) and TI Explorer workstations.
If you are interested in more information, the sources themselves, or just
the documentation/manual, please feel free to forward your U.S. Mail address to
"gabriel@vaxa.isi.edu" or write to "Gabriel Robins, c/o Information Sciences
Institute, 4676 Admiralty Way, Marina Del Rey, Ca 90292-6695."
------------------------------
Date: Tue, 16 Jun 87 16:21 EDT
From: COR_HVH%HNYKUN52.BITNET@wiscvm.wisc.edu
Subject: database for syntax analysis (or other) trees
Below you find a copy of an information folder about a database system
for tree structures. The pictures have lost some of their
attractiveness in the translation from graphics to characters, but I
hope they still give a reasonable impression of the system at work.
Hans van Halteren (COR_HVH@HNYKUN52.BITNET)
$$$
The LDB (Linguistic DataBase) project is concerned with the
construction and maintenance of a computer system for the
exploitation of analyzed corpus material.
To make widespread use by linguists possible, the system is
designed without the need for specialized hardware and without
the need for computer expertise on the part of the user. The
first complete version features a menu system for overall
control, a sub-system for the examination of analysis trees on
standard terminal screens and a query language in which the
linguist can specify database actions in his own terminology.
The database has already been in use at 20 universities
throughout the world in its mainframe (VM/CMS) and supermini (VAX
with VMS or UNIX) versions. Now the availability of the database
has been improved even further with the completion of a version
for PC/AT (with the same possibilities and user-interface as the
other versions).
Packaged with the database system comes a 130,000 word corpus of
modern English with a full syntactical analysis of each
utterance (the Nijmegen corpus, analyzed in the CCPP project). In
the future more corpora will become available. Furthermore, as
the database system is formalism and language independent, it is
possible to use it for your own analyzed corpus material.
For scientific research, the system is available at a nominal fee.
For information about obtaining it, write to:
TOSCA Work Group
Dept. of English
University of Nijmegen
P.O. Box 9103
6500 HD Nijmegen
The Netherlands
or E-mail to COR_HVH @ HNYKUN52.BITNET
$$$
Figure I: The tree map view in the Tree Viewer
MANY A DOCTOR <# WHO APPEARS HESITANT AND RESERVED IN SOCIETY #> DONS
           .-1- .  .  .  .  .  MANY
           |-2- .  .  .  .  .  A
           |-3- .  .  .  .  .  DOCTOR
     .-1---|     .-1- .  .  . #WHO
     |     |     |-2- .  .  . #APPEARS
     |     '(4)--|     .-1- . #HESITANT
     |           |-3---+-2- . #AND
     |           |     '-3- . #RESERVED
-*---|           '-4-----1- . #IN
     |                 '-2- . #SOCIETY
     |-2- .  .  .  .  .  .  .  DONS
     |-3-----1- .  .  .  .  .  COMPLETE
     |     '-2- .  .  .  .  .  CONFIDENCE
     |     .-1- .  .  .  .  .  WITH
     '-4---|     .-1- .  .  .  HIS
           '-2---+-2- .  .  .  WHITE
                 |-3- .  .  .  PROFESSIONAL
                 '-4- .  .  .  COAT.
POSTMODIFIER:FINITE SENTENCE()
command:
scroll:YUDLR<>() focus:FS1-90PNMJ amb:CA view:V help:? exit:X
$$$
Figure II: A search pattern for sentences with noun phrases showing
a non-initial determiner and a postmodifying finite
sentence with subject WHO or THAT and a subject complement
of more than one word (for an example, see figure I)
                          .____________________.
                       1__|FUN = 'DET'         |
                       |  |SNO > 1             |
                       |  |                    |
                       |  `____________________'
.____________________. |  .____________________.
|CAT = 'NP'          |_2__|FUN = 'HD'          |
|                    | |  |                    |
|                    | |  |                    |
`____________________' |  `____________________'
                       |  .____________________.    ######################
                       3__|FUN = 'POM'         |_1__#FUN = 'SU'          #
                          |CAT = 'SF'          | |  #WOR = 'WHO' OR WOR =>
                          |                    | |  #                    #
                          `____________________' |  ######################
                                                 |  .____________________.
                                                 2__|FUN = 'CS'          |
                                                    |WCT > 1             |
                                                    |                    |
                                                    `____________________'
FUN = 'SU' ; WOR = 'WHO' OR WOR = 'THAT'
command:
scroll:YUDLR()<> focus:FS1-90PN edit:IETCOW view:V help:? exit:X
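
To make the search pattern of Figure II concrete, here is a minimal Python
sketch of the same query run over a small in-memory tree. It is not LDB's
query language or data model. The Node class, the reading of SNO as a node's
position among its siblings (counting from 1) and of WCT as the number of
words a node dominates, the placeholder label 'OTHER', and the abbreviated
version of the Figure I tree are all assumptions made for illustration. The
sketch also ignores the relative order of the three branches.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    fun: str                      # function label, e.g. 'HD'
    cat: str = ""                 # category label, e.g. 'NP'
    word: Optional[str] = None    # set on leaves only
    children: List["Node"] = field(default_factory=list)

    def word_count(self) -> int:
        """Number of words dominated by this node (assumed meaning of WCT)."""
        if self.word is not None:
            return 1
        return sum(c.word_count() for c in self.children)


def postmodifier_ok(pom: Node) -> bool:
    """FUN='SU' with WOR 'WHO' or 'THAT', plus FUN='CS' with WCT > 1."""
    has_subject = any(c.fun == "SU" and c.word in ("WHO", "THAT")
                      for c in pom.children)
    has_long_cs = any(c.fun == "CS" and c.word_count() > 1
                      for c in pom.children)
    return has_subject and has_long_cs


def matches(np: Node) -> bool:
    if np.cat != "NP":
        return False
    kids = np.children
    # Non-initial determiner: sibling position > 1, i.e. list index > 0.
    has_det = any(c.fun == "DET" and i > 0 for i, c in enumerate(kids))
    has_head = any(c.fun == "HD" for c in kids)
    has_pom = any(c.fun == "POM" and c.cat == "SF" and postmodifier_ok(c)
                  for c in kids)
    return has_det and has_head and has_pom


def find(root: Node) -> List[Node]:
    """Collect every node in the tree that satisfies the pattern."""
    hits = [root] if matches(root) else []
    for child in root.children:
        hits.extend(find(child))
    return hits


# Abbreviated version of the Figure I sentence; 'OTHER' is a placeholder label.
noun_phrase = Node("SU", "NP", children=[
    Node("OTHER", word="MANY"),
    Node("DET", word="A"),
    Node("HD", word="DOCTOR"),
    Node("POM", "SF", children=[
        Node("SU", word="WHO"),
        Node("OTHER", word="APPEARS"),
        Node("CS", children=[Node("OTHER", word="HESITANT"),
                             Node("OTHER", word="AND"),
                             Node("OTHER", word="RESERVED")]),
    ]),
])
sentence = Node("OTHER", "S", children=[noun_phrase, Node("OTHER", word="DONS")])
hits = find(sentence)
```

With this tree, find(sentence) returns the single noun phrase headed by
DOCTOR, since its determiner A is the second sibling and the subject
complement spans three words.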
------------------------------
Date: Tue, 16 Jun 87 06:12 EDT
From: Amos Shapir <amos@instable.UUCP>
Subject: Re: Could X's inflection in [X (Y) Z] depend on the presence of Y?
In article <7421@boring.cwi.nl> lambert@boring.UUCP (Lambert Meertens) writes:
>Isn't there also something in Hebrew where X in [X Y] is different from
>stand-alone [X], something like
>
> ruach elohim vs ruch
> (spirit of god) (spirit)
>
>Hebrew isn't one of my strong points, so the transcription and translation
>may be off, but there is definitely something of that nature there.
Yes, in Hebrew adjacency (parallel to the use of 'of') does change the
next words, but 'ruach' is one of the (few) words that do not change.
A better example would be:
beyt elohim vs bayit
(house of god) (a house)
--
Amos Shapir
National Semiconductor (Israel)
6 Maskit st. P.O.B. 3007, Herzlia 46104, Israel Tel. (972)52-522261
amos%nsta@nsc.com @{hplabs,pyramid,sun,decwrl} 34 48 E / 32 10 N
------------------------------
Date: Tue, 16 Jun 87 07:40 EDT
From: Jim Scobbie <jim@epistemi.UUCP>
Subject: Re: Could X's inflection in [X (Y) Z] depend on the presence of Y?
In article <1645@pbhye.UUCP> rob@pbhye.UUCP (Rob Bernardo) writes:
>
>In article <1425@etlcom.etl.JUNET> hasida@etlcom.etl.JUNET (Hasida Koiti) writes:
>+Wanted: Natural languages which have the property as follows:
>+(1) X's inflection differs between the two contexts [X Y Z]
>+ and [X Z], for some (equivalence classes of) grammatical
>+ categories X, Y, and Z, such that both [X Y Z] and [X Z]
>+ constitute grammatical categories with Z the head, under
>+ some inflection (or conjugation, declension, or the like)
>+ of X, Y, and Z.
>
>How about English?
>
> an orange house vs a house
>
>From the desk of the Arbiter of Good Taste.
>Rob Bernardo, San Ramon, CA (415) 823-2417 {pyramid|ihnp4|dual}!ptsfa!rob
This is phonological, not grammatical conditioning. The form of the
indefinite article depends on the phonetic features of the following segment.
Thus we get:
1) a house          an orange house
2) an orange        a large orange
In 1, [X Y Z] has X as 'an' against 'a' in [X Z], while 2 shows just the
opposite.
nb: It's pronunciation that counts, not spelling. Thus we have
3) an NBC broadcast
4) a NATO exercise
And 'an historic ...' etc sound awful since they break the phonological
rule. Another example of prescriptive linguistics (like double negatives,
preposition stranding, hopefully etc etc etc etc) being silly.
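
The phonological rule behind examples 1 to 4 can be sketched in a few lines
of Python. The sketch assumes a tiny hand-written table,
FIRST_SOUND_IS_VOWEL, recording whether each word begins with a vowel sound
when spoken. The table and the function names are illustrative only; a real
system would consult a pronunciation lexicon.

```python
# Hypothetical mini-lexicon: does the spoken form begin with a vowel sound?
FIRST_SOUND_IS_VOWEL = {
    "house": False,
    "orange": True,
    "large": False,
    "NBC": True,    # spoken letter by letter, starting with "en"
    "NATO": False,  # spoken as a word, starting with "n"
}


def indefinite_article(next_word: str) -> str:
    """Pick 'a' or 'an' from the first sound of the following word only."""
    return "an" if FIRST_SOUND_IS_VOWEL[next_word] else "a"


def phrase(*words: str) -> str:
    """Prefix a word sequence with the article its first word calls for."""
    return " ".join((indefinite_article(words[0]),) + words)


examples = [
    phrase("house"),
    phrase("orange", "house"),
    phrase("orange"),
    phrase("large", "orange"),
    phrase("NBC", "broadcast"),
    phrase("NATO", "exercise"),
]
```

The function never looks past the first word after the article, and it never
looks at spelling, which is why NBC and NATO come out differently.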
Question: Since here we have an example of phonological rules ruling out
something as ungrammatical (thus acting like semantics in 'filtering'
syntactic output) can anyone think of other examples? Also, what is the
difference between the 'n' of 'an' and more phoneticky epentheticals? Is it
just that a-an is recognised in orthography and is obligatory?
--
Jim Scobbie: Centre for Cognitive Science, Edinburgh University,
2 Buccleuch Place, Edinburgh, EH8 9LW, SCOTLAND
UUCP: ...!ukc!cstvax!epistemi!jim
JANET: jim@uk.ac.ed.epistemi
------------------------------
Date: Tue, 16 Jun 87 11:33 EDT
From: D.GERTLER <gertler@mtuxo.UUCP>
Subject: Re: Could X's inflection in [X (Y) Z] depend on the presence of Y?
In article <7421@boring.cwi.nl>, lambert@cwi.nl (Lambert Meertens) writes:
> In article <1645@pbhye.UUCP> rob@pbhye.UUCP (Rob Bernardo) writes:
> >
> | In article <1425@etlcom.etl.JUNET> hasida@etlcom.etl.JUNET (Hasida Koiti) writes:
> | +Wanted: Natural languages which have the property as follows:
> | + [...]
> | How about English?
> |
> > an orange house vs a house
>
> Isn't there also something in Hebrew where X in [X Y] is different from
> stand-alone [X]
>
> Hebrew isn't one of my strong points, so the transcription and translation
> may be off, but there is definitely something of that nature there.
You are talking about the Hebrew "s@miykhuth" (construct). The structure
[X Y] usually is read "X of Y." Many very common examples of the construct
are based on the word "bayith" (house). When [X] is "bayith" (almost
rhymes with "buy it"), its construct form is "beyth" (rhymes with the
English "bait"). For example:
bayith shel ya`aqov (house of Ya`aqov) = beyth ya`aqov
school = beyth sefer (... book)
hospital = beyth Holiym (... sick)
Not all words inflect when used in the construct. The word "ruaH"
(spirit/wind), used in the example, is one of these. Regular plural
masculine nouns (ending in "iym", which rhymes with "seem") change
their endings to "ey", while the first letter of the second word loses
its dagesh (emphasis). For example:
   ben pinHas           (son of PinHas)
       ^
-> baniym shel pinHas   (sons of PinHas)
      ^^^      ^
 = b@ney finHas
      ^^ ^
(This loss of dagesh is not a feature particular to the construct
form, but an artifact of the new relationship between X's ending
and Y's beginning.)
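
As a rough illustration, the pattern just described can be written as a small
lookup-plus-suffix rule in Python. The sketch works on the ad hoc
transliteration used in this message, covers only the forms cited here, and
does not model the loss of dagesh in the following word. The names STORED,
construct and of are hypothetical.

```python
# Construct forms that are stored whole (transliteration as in this message).
STORED = {
    "bayith": "beyth",   # house -> house-of
    "baniym": "b@ney",   # sons -> sons-of (stored whole in this sketch)
    "ruaH": "ruaH",      # spirit/wind: no change in the construct
}


def construct(noun: str) -> str:
    """Return the construct form of a transliterated noun (rough sketch)."""
    if noun in STORED:
        return STORED[noun]
    if noun.endswith("iym"):          # regular masculine plural: -iym -> -ey
        return noun[:-3] + "ey"
    return noun                       # assumption: unknown nouns are left as-is


def of(x: str, y: str) -> str:
    """Build 'X of Y' as a construct phrase; Y is passed through untouched."""
    return construct(x) + " " + y


school = of("bayith", "sefer")        # "beyth sefer"
sons = of("baniym", "pinHas")         # "b@ney pinHas"; dagesh loss not modeled
```

Because the second word is passed through unchanged, the sketch yields
"b@ney pinHas" where the spoken form is "b@ney finHas".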
--
-Don Gertler UUCP: ...!mtuxo!gertler
"If this works, we'll eat like kings."
------------------------------
Date: Wed, 17 Jun 87 01:40 EDT
From: Koiti Hasida <hasida@etlcom.etl.JUNET>
Subject: Re: Could X's inflection in [X (Y) Z] depend on the presence of Y?
In article <1425@etlcom.etl.JUNET> I wrote:
>Wanted: Natural languages which have the property as follows:
>(1) X's inflection differs between the two contexts [X Y Z]
> [...]
In article <1645@pbhye.UUCP> rob@pbhye.UUCP (Rob Bernardo) writes:
>How about English?
>
> an orange house vs a house
>
>The same thing happens with "the" when the following word begins with a vowel,
>except the difference in pronunciation is not reflected in spelling.
I'm sorry the property (1) as defined in my first article is
incomplete and wants further qualification. The above English examples
fit the present form of (1), of course, but do not falsify the
conjecture:
(0) Syntactic processing on X does not pay attention to Y,
unless the presence of Y is pragmatically predicted.
It is my fault that (1) failed to fully reflect what I should have
said in order to verify this conjecture.
The reason why 'a/an' and 'the' are not counterexamples to (0) is that
they are regarded as phonological phenomena and are handled in terms
of a 'lookahead' by only one syllable. Whether 'a' or 'an' is produced
depends only upon the first phoneme of the next word, but not upon
any syntactic property of the head of Y; same for 'the'. Put another
way, 'n' (or 'e' in 'the') phonologically is a part of Y (or of Z, in
the absence of Y), rather than of X. In fact, 'n' and the subsequent
vowel constitute a single syllable, don't they?
In a real counterexample, if any, the inflection of X might depend
upon, for instance, whether the head Y0 of Y is a verbal participle
or a genuine adjective, but not upon whether there is an adverb,
adjoining to Y0, between X and Y0. The difference between the 'a/an'
('the') case and such a case is that the former can be attributed to
the finite-state phonetic process, while the latter cannot.
So, a qualification to (1) should be that the inflection of X must not
be dealt with by a simple finite-state procedure as in the 'a/an'
case.
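
The finite-state character of the 'a/an' case can be made concrete with a
small Python sketch. It scans a word sequence once, holds an indefinite
article until the next word arrives, and then consults nothing but that
word's first sound. The marker 'A/AN', the function names, and the crude
spelling-based vowel test (a stand-in for a real phonemic check) are
assumptions for illustration.

```python
from typing import Iterable, Iterator

VOWEL_LETTERS = set("aeiou")


def starts_with_vowel_sound(word: str) -> bool:
    # Crude stand-in: tests spelling, not pronunciation, so it is wrong for
    # words such as 'hour' or 'unit'.
    return word[:1].lower() in VOWEL_LETTERS


def realize_articles(words: Iterable[str]) -> Iterator[str]:
    """Replace each 'A/AN' marker using only the next word (one-word lookahead)."""
    pending = False                   # the only state: is an article waiting?
    for w in words:
        if pending:
            yield "an" if starts_with_vowel_sound(w) else "a"
            pending = False
        if w == "A/AN":
            pending = True
        else:
            yield w
    if pending:                       # assumption: default to 'a' at end of input
        yield "a"


with_adverb = " ".join(realize_articles("A/AN very old house".split()))
without_adverb = " ".join(realize_articles("A/AN old house".split()))
```

Inserting the adverb changes the article ('a very old house' against 'an old
house'), yet the procedure never inspects the category of the head of Y. It
needs one bit of state and one word of lookahead.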
(1) should be further revised, however. One thing that I found
misleading, thanks to HORI (<379@tansei.cc.u-tokyo.JUNET>), is that
(1) does not explicitly state that [X Y] should not be a constituent.
(1) should have said that X must be 'governed' by Z, as might be
seen in the examples that I mentioned in my original article.
Still another qualification was motivated, due to SIRAI (personal
communication). His example was one of Japanese:
          X          Y        Z
(a)   [kare ga] [Hanako ni] [itta]    koto
(b)   [kare ga]             [itta]    koto
(c)  ?[kare no] [Hanako ni] [itta]    koto
(d)   [kare no]             [itta]    koto
       he NOM    Hanako DAT  say PAST  thing
      'what he said (to Hanako)'
Even if (c) is ungrammatical, this cannot be a counterexample to (0),
because, after saying 'kare no', we can still speak grammatically,
simply by omitting 'Hanako ni'. If (b) were ungrammatical, then that
would be a real counterexample; the presence of Y cannot be
pragmatically predicted and, after saying 'kare ga', there is no easy
way, such as simply omitting something, to keep grammaticality.
All these things considered, to complete (1) would be a rather
cumbersome job. So perhaps I should instead ask:
Is there any counterexample to (0)?
HASIDA Koiti ('HASIDA' is the family name.)
JUNET: hasida@etl.junet
------------------------------
End of NL-KR Digest
*******************