NL-KR Digest Volume 10 No. 12

NL-KR Digest      (Fri Mar 19 10:28:54 1993)      Volume 10 No. 12 

Today's Topics: 

	 Query: algorithms to split words into morphemes 
	 Query: English to Italian translation 
	 Talk: Jon Ogborn on Modelling clay for computers at BBN 
	 CFP: New OED Conference - Making Sense of Words 
	 Announcement: IJCAI-93 server 
	 Announcement: AISB93 Dinner speaker 
	 Announcement: Corpus-Based Frequency Count of Modern Chinese 
	 Announcement: HCRC Map Task Corpus on CD 

Submissions: nl-kr@cs.rpi.edu 
Requests, policy: nl-kr-request@cs.rpi.edu 
Back issues are available from host archive.cs.rpi.edu [128.213.3.18] in 
the files nl-kr/Vxx/Nyy (ie nl-kr/V01/N01 for V1#1), mail requests will 
not be promptly satisfied.  Starting with V9, there is a subject index 
in the file INDEX.  If you can't reach `cs.rpi.edu' you may want 
to use `turing.cs.rpi.edu' instead. 
BITNET subscribers: we now have a LISTSERVer for nl-kr. 
  You may send submissions to NL-KR@RPITSVM 
  and any listserv-style administrative requests to LISTSERV@RPITSVM. 
----------------------------------------------------------------- 

To: nl-kr@cs.rpi.edu 
From: J_KANE@unhh.unh.edu (John J Kane) 
Newsgroups: comp.ai.nlang-know-rep 
Subject: Query: algorithms to split words into morphemes 
Date: 16 Mar 1993 23:25:10 GMT 

... possibly including discussion of methods for handling ambiguous cases. 
Suggestions welcome.  Will share results of search.  

Limited news access; prefer mail at jjk%nhstrat@virgin.mv.com 
[Explaining astrophysics is child's play compared to explaining child's play.] 

------------------------------ 

To: nl-kr@cs.rpi.edu 
Newsgroups: comp.ai.nlang-know-rep 
From: ferretti@ipvmv1.unipv.it 
Subject: Query: English to Italian translation 
Summary: Is there any package ? 
Keywords: nat-language, translation 

Is anybody aware of a package for automatic translation from 
English to Italian for specific language domains, such as 
computer science, EE, and so on ? 
The ideal tool would allow the associated dictionary to be tailored 
and would be capable of handling a fairly simple syntax. 

If this group is the wrong one, a redirection would be gratefully 
received. 

Hints through the Net or directly to 

ferretti@ipvmv1.unipv.it 

Marco Ferretti 
DIS-University of Pavia, Italy 

------------------------------ 

To: nl-kr@cs.rpi.edu 
Date:     Thu, 11 Mar 93 9:54:08 EST 
From: Helene George <hgeorge@BBN.COM> 
Subject: Talk: Jon Ogborn on Modelling clay for computers at BBN 

                               AI Seminar Series                   

Who:    Jon Ogborn 
        Professor of Science Education 
        Institute of Education 
        University of London 

Title:  Modelling clay for computers 
  
Where:  6/471 

Time:  12:30 - 1:30 

Date:  March 30, 1993 

Abstract 

How can students of all ages use the computer to model the real  
world?   Modelling systems which iteratively solve difference  
equations are now common,  and useful for older students.   But  
they require that the world be imagined as composed of variables,  
not things.  And they need some minimum mathematical  
sophistication.  This paper discusses two new modelling tools  
suitable for quite young students,  which could provide an  
introduction to modelling.   One tool allows systems of variables  
to be constructed,  without having to specify mathematical  
relations between them.   The other provides for interacting  
objects whose behaviour can be specified,  again without  
mathematics,  through drawing "before and after" pictures to  
express interactions of objects.   It is argued that the different types  
of models fit naturally into a developmental sequence,  matching  
modelling at various ages to students' intellectual growth.   A  
radical re-sequencing of teaching about Mathematics in Science is  
proposed. 

To create a world,  whether constituted of variables or of objects,    
and to watch it evolve is a remarkable experience.   It can teach  
one what it means to have a model of reality,   which is to say  
what it is to think.    It can show both how good and how bad such  
models can be.    And by becoming a game played for its own sake  
it can be a beginning of purely theoretical thinking about forms.   
The microcomputer brings something of this within the reach of  
most pupils and teachers. 

------------------------------ 

To: nl-kr@cs.rpi.edu 
Date: Wed, 17 Mar 93 16:45:17 -0500 
From: Frank Wm Tompa <fwtompa@daisy.uwaterloo.ca> 
Subject: CFP: New OED Conference - Making Sense of Words 

                              CALL FOR PAPERS 
                           MAKING SENSE OF WORDS 

                        9th Annual Conference of the 
      University of Waterloo Centre for the New OED and Text Research 

                          September 27 - 28, 1993 
                             St. Cross Building 
                              Oxford, England 

     The  Ninth Annual Conference of the University of Waterloo Centre 
     for  the  New  OED  and  Text  Research, jointly sponsored by the 
     University  of  Waterloo and the Oxford University Press, will be 
     held  at  St.  Cross  Building (with accommodations at St. Edmund 
     Hall), Oxford, England, on September 27-28, 1993. 

     This  year's  conference will focus on computational solutions to 
     problems of equivalence among words and phrases.  Within lexicog- 
     raphy,  one of the most important problems in this area is one of 
     grouping  equivalents:  sifting  through corpus citations to form 
     sense  groups.   Within lexicology and computational linguistics, 
     there  are problems of finding equivalents: matching citations to 
     dictionary   senses,   aligning   one  dictionary's  senses  with 
     another's,  and  aligning parts of texts with their translations. 
     In  related  fields,  there  are problems of forming equivalents: 
     generating  translations,  expanding full-text queries to include 
     synonyms,   and  tailoring  texts  to  suit  specific  audiences. 
     Conference  participants will again include researchers from com- 
     puter science and the humanities, as well as representatives from 
     publishing houses and other industries. 

     Papers  presenting  original  research on theoretical and applied 
     aspects of the theme are being sought.  Typical but not exclusive 
     areas of interest include computational lexicology, computational 
     linguistics, syntactic and semantic analysis, computational lexi- 
     cography,  lexical  databases, computer-assisted translation, and 
     online reference works. 

     Submissions  will  be  refereed  by  the program committee listed 
     below.   Authors  should send seven copies of a detailed abstract 
     (5 to 10 pages) by April 27, 1993, to: 

                      Prof. Frank Tompa, Program Chair 
                UW Centre for the New OED and Text Research 
                           University of Waterloo 
                     Waterloo, Ontario, Canada N2L 3G1 
                                     or 
                         email: newoed@uwaterloo.ca 
                                     or 
                             fax: 519-885-1208 

     Late  submissions  risk rejection without consideration.  Authors 
     will  be notified of acceptance or rejection by June 18, 1993.  A 
     working  draft  of the paper, not exceeding 15 pages, will be due 
     by July 16, 1993, for inclusion in proceedings which will be made 
     available at the conference. 

                             Program Committee 

              Beryl T. Atkins (Oxford University Press) 
              Kenneth Church (AT&T Bell Laboratories) 
              Eduard Hovy (University of Southern California) 
              Nancy Ide (Vassar College) 
              Robert Ingria (BBN Laboratories) 
              Frank Tompa, Chair (University of Waterloo) 

------------------------------ 

To: nl-kr@cs.rpi.edu 
From: Jean-Pierre Laurent <jplaure@imag.fr> 
Date: Tue, 16 Mar 1993 17:49:24 +0100 
Subject: Announcement: IJCAI-93 server 

*************************************************************** 
*  INFORMATION ABOUT IJCAI-93,  USING THE EMAIL IJCAI SERVER  * 
*************************************************************** 

The IJCAI server contains the Conference Brochure of IJCAI-93  
and the list of accepted papers. 

To access this information, send mail to the IJCAI server  
as follows: 

*  First, to obtain the content of the IJCAI server,  
   send a mail to  

                ijcai-serv@imag.fr 

   the subject can be empty (or anything you want), 
   the content must be: 

                index 

   You will receive a reply with the list of all available files  
   in the IJCAI server (name and brief description of the content). 

* Second, to receive the file NAME, send a new mail to the  
  same address: 

                ijcai-serv@imag.fr 

  the subject is again empty or anything you want, 

  the content must be : 

                get  NAME 

  You will receive a reply with the content of the file NAME. 

*************************************************************** 
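
The two-step exchange above can be sketched in Python. This is only an
illustration of composing the two request messages, not a tested client:
the helper names (build_request, index_request, get_request) and the
sender address are hypothetical, and the exact spacing the server accepts
in the "get" command is an assumption. Only the server address and the
"index" and "get NAME" bodies come from the announcement.

```python
from email.message import EmailMessage

# Server address as given in the announcement.
SERVER = "ijcai-serv@imag.fr"


def build_request(sender, command, subject=""):
    """Compose a plain-text request whose body is a single server command."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER
    if subject:
        # The announcement says the subject may be empty or anything at all.
        msg["Subject"] = subject
    msg.set_content(command)
    return msg


def index_request(sender):
    """Step 1: ask for the list of available files."""
    return build_request(sender, "index")


def get_request(sender, name):
    """Step 2: ask for one file named in the index reply."""
    # Assumption: a single space between "get" and the file name is accepted.
    return build_request(sender, f"get {name}")


if __name__ == "__main__":
    # Hypothetical sender address, for illustration only.
    print(index_request("user@example.org"))
    print(get_request("user@example.org", "NAME"))
```

Sending the messages is left out on purpose, since it depends on the
local mail setup.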

-- 
JP Laurent 

------------------------------ 

To: nl-kr@cs.rpi.edu 
To: comp-ai-nlang-know-rep 
From: axs@cs.bham.ac.uk (Aaron Sloman) 
Newsgroups: comp.ai,comp.ai.edu,comp.ai.neural-nets,comp.ai.nlang-know-rep 
Subject: Announcement: AISB93 Dinner speaker 
Date: 18 Mar 93 23:06:23 GMT 
Organization: School of Computer Science, University of Birmingham, UK 

I am very pleased to announce that Professor Derek Partridge, University 
of Exeter, has agreed to give the "after dinner" talk at the Conference 
Banquet on Thursday 1st April in the City of Birmingham's Repertory 
Theatre. 

His title is 

    "If you think connectionism killed AI wait till you hear 
    what it did to computer science." 

Reminder: the AISB93 conference, at the University of Birmingham 
March 30th to April 2nd has the theme "Prospects for AI as the 
General Science of Intelligence". There are very large reductions for 
student registrations. Full registration (excluding accommodation and 
meals) 175 pounds (+30 pounds for non AISB members). 40 pounds for 
full time students. 

* For a programme and registration form please email the auto-reply 
service aisb93-info@cs.bham.ac.uk 

Brochures and posters available from: 

* Other enquiries: AISB'93, School of Computer Science, The University of 
                   Birmingham, Edgbaston, Birmingham, B15 2TT, U.K. 

                   Phone:  +44-(0)21-414-3711  Fax: +44-(0)21-414-4281 
                   Email aisb93-prog@cs.bham.ac.uk 

Aaron Sloman (Programme Chair) 
======================================================================= 

------------------------------ 

To: nl-kr@cs.rpi.edu 
From: rocltsh@iis.sinica.edu.tw 
Subject: Announcement: Corpus-Based Frequency Count of Modern Chinese 
Date: Tue, 16 Mar 93 16:20:04 EAT 

	 Corpus-Based Frequency Count of Modern Chinese 

Corpus-based study of Chinese is one of the research projects of 
the Chinese Knowledge Information Processing Group (CKIP) at 
Academia Sinica.  The current research is based on a Chinese 
newspaper corpus, which amounts to 20,698,116 characters 
(9,540,444 words after word segmentation).  Four technical reports 
in Chinese have been published.  These include: 

1. Corpus-Based Frequency Count of Characters in Journal Chinese 
   30 pages (US$ 5) 
2. Corpus-Based Frequency Count of Words in Journal Chinese 
   300 pages (US$ 20) 
3. The Most Frequent Verbs in Journal Chinese and Their 
   Classification 
   140 pages (US$ 10) 
4. The Most Frequent Nouns in Journal Chinese and Their 
   Classification 
   150 pages (US$ 10) 

The first report lists 5,666 distinct characters which appear in 
the entire corpus.  The second report contains 42,686 words that 
occur more than three times in the corpus.   The most common 14,956 
words constitute more than 99.9995 percent of all the words 
occurring in the corpus.  The third and fourth reports list 
19,907 verbs and 21,368 nouns, respectively, that occur more than 
twice in the corpus, together with their syntactic or semantic 
classification.  To order, please list the desired title(s) and 
enclose a cheque of the appropriate amount payable to the 
Computational Linguistic Society of the R.O.C. (ROCLING).  The 
prices listed above include postage and handling. 

     Address   : Miss Tsai Shu-hui 
		    ROCLING 
		    Institute of Information Science 
		    Academia Sinica, Nankang 
		    Taipei, Taiwan 11529 
		    R.O.C. 

	  Tel.	: 886-2-788-1638 
	  Fax	: 886-2-788-1638 
       E-Mail	: rocltsh@iis.sinica.edu.tw 

------------------------------ 

To: nl-kr@cs.rpi.edu 
From: "Henry S. Thompson" <ht@cogsci.edinburgh.ac.uk> 
Date: Thu, 18 Mar 93 23:03:02 GMT 
Subject: Announcement: HCRC Map Task Corpus on CD 

			The HCRC Map Task Corpus 

The Human Communication Research Centre (HCRC) is happy to announce 
the release of the Map Task Corpus. The Map Task Corpus is a set of 8 
CD-ROMs containing linked audio and transcriptions of a total of about 
18 hours of spontaneous speech that was recorded from 128 two-person 
conversations according to a detailed experimental design. 

Altogether, the corpus as distributed provides a thorough and 
invaluable set of resources and tools for use in analyzing all levels 
of linguistic structure, via both text-based and speech-based 
investigation.  The range of research questions that are addressable 
using this corpus spans a wide spectrum of linguistic and cognitive 
issues.  We have kept the price as low as possible to encourage 
researchers from many disciplines to use this corpus as a common 
reference point for many different kinds of research. 

The HCRC is an interdisciplinary research centre at the Universities 
of Edinburgh and Glasgow, supported by the UK Economic and Social 
Research Council and the Universities Funding Council.  The publication 
of the Map Task Corpus was made possible by assistance from the 
Linguistic Data Consortium. 

Corpus Details 

64 different speakers, 32 female, 32 male, all adults, each took part 
in four conversations in a quiet recording studio.  They were all 
students at the University of Glasgow, 61 of them being native Scots. 
The conversations were carried out in an experimental setting in which 
each participant has a schematic map in front of them, not visible to 
the other. Each map is comprised of an outline and roughly a dozen 
labelled features (e.g.  "a white cottage", "an oak forest", "Green 
Bay", etc). Most features are common to the two maps, but not all. One 
map has a route drawn in, the other does not. The task is for the 
participant without the route to draw one on the basis of discussion 
with the participant with the route. In addition to the conversations, 
each speaker provides a wordlist reading, consisting of the major 
vocabulary items contained in the conversations.  All recordings were 
direct to Digital Audio Tape (DAT) at 48 kHz, providing very good 
acoustic quality. 

The experimental design allows a number of different phonemic, 
syntactico-semantic and pragmatic contrasts to be explored in a 
controlled way.  In particular, maps and feature names were designed 
to allow for controlled exploration of phonological reductions of 
various kinds in a number of different referential contexts, and to 
provide, via varying patterns of matches and mis-matches between the 
two maps, a range of different stimuli for referent negotiation.  Also 
the conditions of the conversations were carefully balanced: In half 
of them the speakers were strangers, in half friends; in half of them 
the speakers could see each other's faces, in half they could not. 

Subjects accommodated easily to the task and experimental setting, and 
produced evidently unselfconscious and fluent speech.  The syntax is 
largely clausal rather than sentential, showing good turn-taking, with 
modest amounts of overlap and interruption.  The total corpus runs to 
about 18 hours of speech, with the transcripts consisting of around 
150,000 word tokens drawn from just over 2,000 word form types. 

Transcription is at the orthographic level, quite detailed, 
including filled pauses, false starts and repetitions, broken words, 
etc.  Considerable care has been taken to ensure consistency of 
notation, which is thoroughly documented.  Although the full 
complexity of overlapped regions has not been reflected in the 
transcriptions, such regions are clearly set off from the rest of the 
transcripts.  Transcripts are connected to the acoustic sampled data 
by sample numbers marked every few turns. 

CD-ROM Contents 

The waveform data are provided in "raw" (headerless) files (16-bit 
samples, 20 kHz sample rate, 2 channels per conversation), and 
alternative header files are provided for use with software based on 
either the NIST "SPHERE" header structure or the European "SAM" header 
structure.  Transcriptions are provided for each conversation, marked 
up with TEI-compliant SGML, in a minimally intrusive and easily 
separated way.  PostScript files of the map images used in the 
experiments are provided, along with full documentation of the 
experimental design and data collection protocol, resources for using 
SGML tools on the transcriptions and other text materials, and an 
extensive set of source code for performing basic signal processing 
functions on the waveform data, such as down-sampling, 
de-multiplexing, channel summation, and D/A conversion for Sun 
workstations (including playback of segments selected via inspection 
of transcripts in Emacs). 
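
As a rough illustration of what de-multiplexing and channel summation
involve for data in this format, here is a minimal Python sketch.  It is
not the source code shipped on the CD-ROMs.  The 16-bit sample width, the
20 kHz rate and the two channels come from the description above; the
function names, the assumption that the two channels are interleaved
sample by sample, the byte order, and the assumption that transcript
sample numbers count samples at the 20 kHz rate are all hypothetical and
should be checked against the corpus documentation.

```python
import struct

SAMPLE_RATE_HZ = 20_000  # stated in the announcement
CHANNELS = 2             # stated: 2 channels per conversation
BYTES_PER_SAMPLE = 2     # stated: 16-bit samples


def demultiplex(raw, byte_order="<"):
    """Split raw 16-bit PCM bytes into two per-channel sample lists.

    Assumption: samples are interleaved (ch0, ch1, ch0, ch1, ...).
    The byte order is not stated in the announcement; pass ">" for
    big-endian data.
    """
    frame_bytes = BYTES_PER_SAMPLE * CHANNELS
    usable = len(raw) - len(raw) % frame_bytes  # ignore a trailing partial frame
    count = usable // BYTES_PER_SAMPLE
    samples = struct.unpack(f"{byte_order}{count}h", raw[:usable])
    return list(samples[0::CHANNELS]), list(samples[1::CHANNELS])


def sum_channels(first, second):
    """Mix two channels into one, clamping to the signed 16-bit range."""
    return [max(-32768, min(32767, a + b)) for a, b in zip(first, second)]


def sample_to_seconds(sample_number):
    """Convert a sample number to a time offset in seconds.

    Assumption: sample numbers refer to the 20 kHz data.
    """
    return sample_number / SAMPLE_RATE_HZ
```

For example, under these assumptions a sample number of 20000 would
correspond to one second into the recording.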

The CD-ROMs are in High Sierra (ISO 9660) format with the RockRidge 
extensions, and are compatible with (inter alia) Unix, MS-DOS and 
Macintosh operating systems. 

Copies of the Map Task Corpus are available from the LDC for $200 or 
from HCRC for 164.50 UK pounds (including VAT) at the addresses given 
below, plus postage and packing as necessary.  Please contact us (by 
e-mail if possible) for details of payment methods and shipping costs. 

In Europe please contact 

	Henry Thompson 
	University of Edinburgh 
	Human Communication Research Centre 
	2 Buccleuch Place 
	Edinburgh EH8 9LW 
	Scotland 
	Tel: +44 31 650-4440 
	Fax: +44 31 650-4587 
	email: maptask@cogsci.ed.ac.uk 

or 
	Dawn Griesbach 
	ELSNET 
	2 Buccleuch Place 
	Edinburgh EH8 9LW 
	Scotland 
	Tel: +44 31 650-4594 
	Fax: +44 31 650-4587 
	email: elsnet@cogsci.ed.ac.uk 

Outside Europe please contact 

	Elizabeth Hodas 
	Linguistic Data Consortium 
	441 Williams Hall 
	University of Pennsylvania 
	Philadelphia, PA 19104-6305 

	Tel: (215) 898-0464 
	Fax: (215) 573-2175 
	email: ehodas@unagi.cis.upenn.edu 

------------------------------ 
End of NL-KR Digest 
*******************