NL-KR Digest Volume 02 No. 53
NL-KR Digest (6/11/87 00:27:00) Volume 2 Number 53
Today's Topics:
Request: Speech Data Compression
Re: more than one paradigm in linguistics
Could X's inflection in [X (Y) Z] depend on the presence of Y?
Re: Could X's inflection in [X (Y) Z] depend on the presence of Y?
re: parsing free word order languages
AAAI's Preregistration Deadline
East Asian MT work
----------------------------------------------------------------------
Date: Tue, 2 Jun 87 18:10 EDT
From: Rob Peck <imagen!auspyr!dlb!dana!rap@ucbvax.Berkeley.EDU>
Subject: Speech Data Compression
[Excerpted from AIList]
I am interested in finding some kind of data compression algorithm
that is suitable for compressing speech data. As I understand it,
human speech has a great deal of redundancy to it, i.e. repetitions
of virtually the same waveforms over a period of time, as well
as slow changes in many cases from one waveform to the next.
However, if one takes a set of audio samples of a spoken word,
the samples will not fall in the right spots to show up any such
redundancy. Thus, for a simplistic compression algorithm that
looks for repeated sequences, no opportunity to compress would
be noticed.
Could someone point me to the appropriate literature? Or is there
some public domain source code that is already available for this?
The code needn't be fast on the analysis and compression. On
playback, it should be pretty easy to expand, though. That is,
play so many repetitions of this waveform at this sampling rate,
then do this next one (or better still, adjust the current waveform
until it looks like this new one, as a slewing to the new output...
that'd be neat).
I've read a little about FFT's, but once calculated, I have no
idea how to use it or if it gives me remotely what I am looking
for here.
Please EMAIL directly to me. I will summarize any interesting
responses to the Net. Thanks very much.
Rob Peck ...ihnp4!hplabs!dana!rap
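[Illustrative sketch, not part of the original message. One common way to
find the "repetitions of virtually the same waveform" described above is to
estimate the pitch period by correlating the signal with delayed copies of
itself, then keep a single period as a template for playback. The code below
is a minimal sketch under stated assumptions: the test signal is synthetic
(a fundamental plus one harmonic), the period of 36.37 samples is chosen
deliberately so that samples do not line up from one cycle to the next, and
all function names (synth_voiced, normalized_correlation, estimate_period,
expand) are hypothetical. Real speech would also need voiced/unvoiced
handling and some smoothing between successive templates.]

```python
import math

def synth_voiced(n_samples, period, amplitude=100.0):
    """Toy 'voiced' waveform: a fundamental plus one harmonic, rounded to ints."""
    out = []
    for n in range(n_samples):
        phase = 2.0 * math.pi * n / period
        out.append(round(amplitude * (math.sin(phase) + 0.5 * math.sin(2.0 * phase))))
    return out

def normalized_correlation(x, lag, window):
    """Correlation between x[0:window] and x[lag:lag+window], scaled to [-1, 1]."""
    a = x[:window]
    b = x[lag:lag + window]
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return dot / norm if norm else 0.0

def estimate_period(x, min_lag, max_lag, window):
    """Return the integer lag in [min_lag, max_lag] where x best matches itself."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: normalized_correlation(x, lag, window))

def expand(template, repeats):
    """Playback side: repeat one stored period the requested number of times."""
    return template * repeats

# Assumed test signal: the true period (36.37 samples) is not an integer,
# so consecutive cycles never contain exactly the same sample values.
samples = synth_voiced(400, period=36.37)
period = estimate_period(samples, min_lag=20, max_lag=60, window=200)
template = samples[:period]          # one stored cycle
rebuilt = expand(template, 5)        # "play so many repetitions of this waveform"
print(period, len(template), len(rebuilt))
```

[Because the estimated period is an integer number of samples, repeating the
template drifts slowly away from the original; that drift is one reason
practical schemes re-estimate the template every few cycles.]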
------------------------------
Date: Wed, 3 Jun 87 20:22 EDT
From: withgott.pa@Xerox.COM
Subject: Re: more than one paradigm in linguistics
One of the surprises in Nevin's (6/3/87) reply to Pesetsky concerned the
nature of facts, empiricism, and phonology. Nevin writes:
[START QUOTE] Another `fact' about which floods of ink have been spilled
is socalled `aspirated h' and liaison in French, which can supposedly
motivate (or refute) a cycle in French grammar. This is in fact only an
artifact of a completely artificial policy of the French educational
system.[END QUOTE]
This surprising characterization makes aspirated h seem like some
unpleasant legislative ruse that can't be evaded, but that should be
avoided, like taxes. (For those in the dark, aspirated h shows up as a
set of effects in strings of words, somewhat on the order of the a/an
distinction in English, but with more wrinkles. Lightning will not
strike if you say 'a apple', ditto *l'hero* for *le hero*, but that
doesn't mean such a (an?) historically-based word boundary phenomenon is
artificial.)
If you try an empirical approach, and test all the aspirated h words in
all the dictionaries you can find, you will no doubt find some
distributional regularities (if you want to try, look at number of
syllables, etymology, initial vowel quality). Then what do you do? You
still have to figure out how to represent the thing, whether you write
generative analyses, computer systems, or French text books.
Representational issues caused the flood of ink, which is not so
unusual. Do you associate rules with the entries like !don't drop
the schwa when the word <le> precedes! ? Do you encode a sort of
consonant on aspirated h entries? Do you evade the issue? The point is
that looking at data only gets you part way there, and depending on the
particular problem, this might be pretty far or not very far at all.
--Meg
------------------------------
Date: Thu, 4 Jun 87 02:17 EDT
From: Hasida Koiti <hasida@etlcom.etl.JUNET>
Subject: Could X's inflection in [X (Y) Z] depend on the presence of Y?
Wanted: Natural languages which have the following property:
(1) X's inflection differs between the two contexts [X Y Z]
and [X Z], for some (equivalence classes of) grammatical
categories X, Y, and Z, such that both [X Y Z] and [X Z]
constitute grammatical categories with Z the head, under
some inflection (or conjugation, declension, or the like)
of X, Y, and Z.
By contrast, there are languages which have the following property.
(2) Y's inflection differs between the two contexts [X Y Z] and
[Y Z], ... (The rest of the lines are the same as in (1).)
For instance, German, Dutch, etc. have the property (2). In fact,
consider the following (nominative) noun phrases of German:
    das kleine Maedchen      kleines Maedchen
    (the little girl)        (little girl)
Here let X, Y, and Z be 'das', 'kleine(s)', and 'Maedchen',
respectively. Perhaps some Slavic languages including Russian have
the same property, where X is a numeral rather than an article, though
the noun Z might not be regarded as the head.
The preverbal clitics of some Romance languages seem to have
the property (1), as in the Spanish example that follows:
    se lo digo                      le digo
    (to-you it I-tell)              (to-you I-tell)
     him                             him
     her                             her
    'I tell it to you/him/her'      'I tell you/him/her ...'
'Se (le)', 'lo', and 'digo' correspond to X, Y, and Z, respectively.
Such phenomena suggest how much partial sentence structure humans must
keep in mind while they speak. That is, syntactic constructions which
embody (1) require that you know much about Y (including whether it
should appear at all) when you are about to utter X.
My assumption is that the human language faculty is not built so that
the (non-)existence of Y must be determined before X is generated. That
is:
(1) is not a property of any natural language.
Here, (1) requires the proviso that the existence of Y is unpredictable
in principle. The apparent Spanish counterexample shown above is thus
accounted for, because whether the clitic 'lo' should show up or not
can be predicted on the basis of the foregoing pragmatic context;
i.e., it must be there if and only if its semantic content (what "I
say" (digo)) has already been set up in the context. By contrast, Y
(an adjective) in the German case is not pragmatically predictable in
general.
I would appreciate responses from anybody who knows real counterexamples to
the above thesis. Information on related studies is also welcome.
HASIDA Ko^iti
JUNET: hasida@etl.junet
------------------------------
Date: Wed, 10 Jun 87 01:41 EDT
From: Rob Bernardo <rob@pbhye.UUCP>
Subject: Re: Could X's inflection in [X (Y) Z] depend on the presence of Y?
In article <1425@etlcom.etl.JUNET> hasida@etlcom.etl.JUNET (Hasida Koiti) writes:
+Wanted: Natural languages which have the property as follows:
+(1) X's inflection differs between the two contexts [X Y Z]
+ and [X Z], for some (equivalence classes of) grammatical
+ categories X, Y, and Z, such that both [X Y Z] and [X Z]
+ constitute grammatical categories with Z the head, under
+ some inflection (or conjugation, declension, or the like)
+ of X, Y, and Z.
How about English?
an orange house vs a house
The same thing happens with "the" when the following word begins with a vowel,
except the difference in pronunciation is not reflected in spelling.
--
From the desk of the Arbiter of Good Taste.
Rob Bernardo, San Ramon, CA (415) 823-2417 {pyramid|ihnp4|dual}!ptsfa!rob
------------------------------
Date: Thu, 4 Jun 87 09:19 EDT
From: Linda G. Means <MEANS%gmr.com@RELAY.CS.NET>
Subject: re: parsing free word order languages
In response to Elizabeth Hinkelman's query:
Keep in mind that every language has some way of specifying
relationships among constituents of a sentence. Languages lacking
in a high degree of syntactic restriction usually make up for it
with a greater degree of morphology.
Linda Means
means%gmr.com@relay.cs.net
------------------------------
Date: Thu, 4 Jun 87 13:23 EDT
From: AAAI <AAAI-OFFICE@SUMEX-AIM.STANFORD.EDU>
Subject: AAAI's Preregistration Deadline
The AAAI would like to remind those individuals interested in attending
AAAI-87 in Seattle, July 13-17, that the preregistration deadline of Friday,
June 12, draws very near. If you would like registration materials, please
call or send us a msg with your name and mailing address. Thanks!
AAAI
445 Burgess Drive
Menlo Park, CA 94025
(415) 328-3123
AAAI-Office@sumex-aim.stanford.edu
------------------------------
Date: Fri, 5 Jun 87 03:41 EDT
From: Klaus Schubert <mcvax!dlt1!schubert@seismo.CSS.GOV>
Subject: East Asian MT work
An answer to the anonymous asker in NL-KR vol. 2, no. 43:
This is a reference to non-Japanese East Asian MT work:
- Udom Warotamasikkadhit (1986): Computer aided translation project,
University Sains Malaysia, Penang, Malaysia.
In: Computers and Translation 1: 113
The work is on Malay, Thai and English.
Regards,
Klaus Schubert
------------------------------
End of NL-KR Digest
*******************