Copy Link
Add to Bookmark
Report

AIList Digest Volume 8 Issue 015

eZine's profile picture
Published in 
AIList Digest
 · 1 year ago

AIList Digest            Monday, 18 Jul 1988       Volume 8 : Issue 15 

Today's Topics:

More on the Soundex algorithm

----------------------------------------------------------------------

Date: 11 Jul 88 18:21:16 GMT
From: sundc!netxcom!sdutcher@seismo.css.gov (Sylvia Dutcher)
Subject: Re: Soundex algorithm

In article <12520@sunybcs.UUCP> stewart@sunybcs.UUCP (Norman R. Stewart) writes:
>
> The source I've used for Soundex (developed by the
>Remington Rand Corp., I believe), is
>
> Huffman, Edna K. (1972) Medical Record Management.
> Berwyn, Illonois: Physicians' Record Company.

I've written a soundex program based on the rules in Knuth's _Searching_and_
Sorting. These are also the rules used at the National Archives to sort
census data. These rules differ slightly from the ones posted by Mr. Stewart.
If you don't need to match anyone else's soundex, the most important rule is
to be consistent. I will insert Knuth's rules below.
>The algorithm is very simple,

1. Retain the first letter of the name, and drop all occurrances of a, e, h,
i, o, u, w, y in other positions.

>1: Assign number values to all but the first letter of the
>word, using this table
> 1 - B P F V 2 - C S K G J Q X Z
> 3 - D T 4 - L
> 5 - M N 6 - R
> 7 - A E I O U W H Y

2. Assign number values as above, except for 7.

>2: Apply the following rules to produce a code of one letter and
> three numbers.
> A: The first letter of the word becomes the initial character
> in the code.
> B: When two or more letters from the same group occur together
> only the first is coded.
> C: If two letters from the same group are seperated by an H or
> a W, code only the first.

3. If two or more letters with the same code were adjacent in the original
name (begore step 1), omit all but the first.

> D: Group 7 letters are never coded (this does not include the
> first letter in the word, which is always coded).

4. Convert to the form "letter, digit, digit, digit" by adding trailing
zeros or dropping rightmost digits.


BTW according to the reference in Knuth's book, this algorithm was
developed by Margaret Odell and Robert Russell in 1922.


>Norman R. Stewart Jr. * How much more suffering is
>C.S. Grad - SUNYAB * caused by the thought of death
>internet: stewart@cs.buffalo.edu * than by death itself!
>bitnet: stewart@sunybcs.bitnet * Will Durant


--
Sylvia Dutcher * The likeliness of things
NetExpress Communications, Inc. * to go wrong is in direct
1953 Gallows Rd. * proportion to the urgency
Vienna, Va. 22180 * with which they shouldn't.

------------------------------

Date: Wed, 13 Jul 88 10:35:06 EDT
From: "William J. Joel" <JZEM%MARIST.BITNET@MITVMA.MIT.EDU>
Subject: Re: Soundex algorithm

/* The following is source code for a Soundex algorithm written in */
/* Waterloo Prolog. */
/* William J. Joel*/
/* Marist College */
/* Poughkeepsie, NY */
/* jzem@marist.bitnet */

key(a,-1).
key(b,1).
key(c,2).
key(d,3).
key(e,-1).
key(f,1).
key(g,2).
key(h,-2).
key(i,0).
key(j,2).
key(k,2).
key(l,4).
key(m,5).
key(n,5).
key(o,-1).
key(p,1).
key(q,2).
key(r,6).
key(s,2).
key(t,3).
key(u,-1).
key(v,1).
key(w,-3).
key(x,2).
key(y,-2).
key(z,2).

soundex(Name,Code)<-
string(Name,Code1) & write(Code1) &
soundex1(Code1,A.B.C.D.Rem) &
string(Code,A.B.C.D.nil).

soundex1(Head.Code1,Head.Code)<-
keycode(Head.Code1,Code2) & write(Code2) &
reduce(Code2,T.Code3) & write(T.Code3) &
eliminate(Code3,Code4) & write(Code4) &
append(Code4,0.0.0.nil,Code).

reduce(X.(-2).X.Rem,List)<-
reduce(X.Rem,List).
reduce(X.X.Rem,List)<-
reduce(X.Rem,List).
reduce(X.Y.Z.Rem,X.List)<-
^X==Z &
reduce(Y.Z.Rem,List).
reduce(X.Y.Rem,X.List)<-
^X==Y &
reduce(Y.Rem,List).
reduce(X.nil,X.nil).
reduce(nil,nil).

eliminate(X.Rem,List)<-
lt(X,0) &
eliminate(Rem,List).
eliminate(X.Rem,X.List)<-
gt(X,0) &
eliminate(Rem,List).
eliminate(nil,nil).

keycode(H.T,N.CodeList)<-
key(H,N) &
keycode(T,CodeList).
keycode(nil,nil).


append(Head.Tail,List,Head.NewList)<-
append(Tail,List,NewList).
append(nil,List,List).

------------------------------

Date: 13 Jul 88 17:02:31 GMT
From: ihnp4!twitch!anuck!jrl@bloom-beacon.mit.edu (j.r.lupien)
Subject: Re: Soundex algorithm

>From article <1552@randvax.UUCP>, by leverich@randvax.UUCP (Brian Leverich):
> Incidentally, does anyone know if there's been any genealogy applications
> built using Prolog or the like? Looks like a logic programming approach
> to maintaining relations between individuals might be a big win. -B

Not really a C question anymore, but there is such a beast. The
United Kingdom imigration and naturalization department has a
Prolog implementation for their citizenship status analysis system.

Apparently the rules involved are very complicated and perversely
interconnected, having to do with date and place of birth, date and
place of birth of both parents and their citizenship at the time of
your birth as well as now, and probably the phase of the moon etc.

I can't give you a reference for this offhand, since I heard about
it in a keynote lecture by the MIT AI lab. They should be able to
point you in the right direction.

John Lupien
twitch!mvuxe!anuxh!jrl

------------------------------

End of AIList Digest
********************

← previous
next →
loading
sending ...
New to Neperos ? Sign Up for free
download Neperos App from Google Play
install Neperos as PWA

Let's discover also

Recent Articles

Recent Comments

Neperos cookies
This website uses cookies to store your preferences and improve the service. Cookies authorization will allow me and / or my partners to process personal data such as browsing behaviour.

By pressing OK you agree to the Terms of Service and acknowledge the Privacy Policy

By pressing REJECT you will be able to continue to use Neperos (like read articles or write comments) but some important cookies will not be set. This may affect certain features and functions of the platform.
OK
REJECT