What is a radix?
The Insert Counting algorithm is the fastest sort algo i've ever seen. Read this. This document explains the radix sort algorithm that doesn't use linked lists, but rather an index hashtable, or insert counting, which makes it mush faster. It also includes some samplecode in pascal. Later I may include C++ samplecode as well.
Contents
- What is a radix?
- Sort a list by one radix
- Sort a list by n radices
- Optimization
- Pascal samplecode
What da phuck is a radix?
A radix is a position in a value. The value 342 has three radices. The value 17 has two radices. So a radix is a number at a position in a value. The radices are counted from the least significant number to the most significant number in the value. So, in the value 153, 3 would be the 1st radix, 5 the second radix and 1 the third radix.
The radices of a 10 (sedecimal) based value is 1,10,100,1000, ...etc...
The radices of a 16 (hexadecimal) based value is 1,16,256,4096, ...etc...
The radices of a 2 (binary) based value is 1,2,4,8,16, ...etc...
One may also choose a value with the base 256. Such a value would have the radices 1,256,65536, ...etc...
This would make it easy to get each radix out of a value. Since a byte has the size 256, each byte is a radix. A word with the base 256 would then have 2 radices. The lower byte is the 1st and the upper is the 2nd. The radixsort algo I'm going to explain in this document sorts values by their radices, from least significant to most significant. Byte by byte.
Unsorted -> 1st radix sorted -> 2nd radix sorted
435Fh 5A1Bh 4320h
5A36h 4320h 435Fh
4320h 5A36h 5A1Bh
5A1Bh 435Fh 5A36h
To sort a list with the entrysize of one byte (one radix to sort by):
We have a sourcelist with the unsorted values. We sort that list into a second destinationlist, using a hashtable to reindex the sourcelist entries to the right destlist entries. Then we copy the destlist to the sourcelist. Since the radixbase is 256 we need a hashtable of the size 256+1 and each entry must be able to hold the number of entries with of that value (hash[0..256] of word will do).
Get each value of the entries in sourcelist. Fo every entry, store it in index and increase the entry index+1 of the hashtable.
- index = source[entry];
- inc (hash[index+1]);
A list of the size 0..13 that looks like this.
00 01 02 03 04 05 06 07 08 09 10 11 12 13
15 01 06 10 04 14 11 13 04 15 03 04 15 11
will make a hashtable[0..15] that looks like this.
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 .. 256
0 0 1 0 1 3 0 1 0 0 0 1 2 0 1 1 3 0 0 .. 0
You sorta count the number of 01's in the list and store that number in the 02nd entry of the hashtable. Then you count the number of 02's in the list and store that number into the 03rd entry of the hashtable. Then add every hash entry with the next hashentry.
01th's number is added to 02th's number.
02th's number is added to 03th's number.
hash[entry] = hash[entry] + hash[entry-1];
So the hashtable before the addon looks like this.
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 .. 256
0 0 1 0 1 3 0 1 0 0 0 1 2 0 1 1 3 0 0 .. 0
And the hashtable after the addon looks like this.
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 .. 256
0 0 1 1 2 5 5 6 6 6 6 7 9 9 10 11 14 14 14 .. 14
It kinda builds a staircase of the hashtable. Now we get the index (n'th radix) of each value in the the source list. We copy the current value of the source list to the entry in the destination list that the hashtable at entry index is "pointing" to. We also increase the entry in the hashtable, to make it "point" on the next entry in the destination list, that is to get the same value.
- index = source[entry];
- dest[hash[index]] = source[entry];
- inc (hash[index]);
In a scheme this would look something like this.
SourceList
00 01 02 03 04 05 06 07 08 09 10 11 12 13
15 01 06 10 04 14 11 13 04 15 03 04 15 11
HashTable
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 .. 256
0 0 1 1 2 5 5 6 6 6 6 7 9 9 10 11 14 14 14 .. 14
1 2 3 6 7 8 10 11 12
4 9 13
5 14
DestinationList
00 01 02 03 04 05 06 07 08 09 10 11 12 13
01 03 04 04 04 06 10 11 11 13 14 15 15 15
The number 15 at the first entry of the sourcelist will make as an entryindex to the hashtable. At entry 15 of the hashtable has the value 11. That means that the first entry of the sourcelist is to be copied to the 11th entry of the destinationlist. Dest[11]=Source[0]; Then the value 11 at the 15th entry of the hashtable in increased to 12. The number 01 in the sourcelist's second entry refers to the value 0 at the 01th entry in the hastable, which points to the 00th entry in the destinationlist. Therefor the second entry of the sourcelist will be copied to the 00th entry of the destinationlist. Dest[0]=Source[1]; The value 00 at the 01th entry of the hashtable in increased to 01. Note that the source and destination lists are nullbased. So when I say the first entry of the sourcelist I mean the 0th entry, or Source[0]. To increase the HashTable is nessesary to prevent the next entry in the SourceList, that has the same value as this entry, to replace this entry. First SourceList[00] has value 15 and is copied to DestList[11] according to the hashtable[15]. The later SourceList[09] is then copied to the DestList[12] since the hashtable[15] has been increased. If it hadn't been increase all the SourceList entries with the value 15 would have copied to the same DestList entry, namely DestList[11].
Short summary...
Count the number of every present value in the sourcelist. Store that number in the hastable at the entry of the counted value+1. I say it again. You count the times the value n appears in the sourcelist and store that count in hash[n+1]. Add every entry of the hashtable to the next entry of the hashtable. Take every sourcelist-entry and copy it to the destinationlist-entry that the hashtable-entry of that sourcelist-entry point to. Also increase that hashtable-entry so that it later points to the next destinationlist-entry, for the next sourcelist-entry with the same value to be copied to.
To sort a list with the entrysize n (n radices to sort by):
Sort the least significant radix, just as you sort a list with only one-radix-sized values. Then you sort the next least significant radix. So the sortalgo must be modified slightly to satisfy any number of radices. Instead of sorting the 1st byte we change it so shift the value radix*8 bits right and and it with $FF to get only the first byte, but the wanted radix. So, for every radix we do the same algo as for one radix but shift+and the value to get the right radix out.
Optimization
Finally you should have pointers to the lists instead of passing the whole list to the procedure and not copy the whole destlist to sourcelist but rather just swap their pointers in between each radix. Destlist must then either be allocated and deallocated inside and by the procedure with the size of (MaxIndex*IndexSize) or rather allocated outside, by you, and then passed to the procedure. As soon as you sort more then one or two times you gain on allocating and deallocating it outside only once instead of every time you enter and leave the sorting procedure. Also, this algo should be easy to do in assembler, since it's not so very complex.
Btw, thanks to Mikael Kalms who told me about this way to do the algo. Also read OTMSort.TXT by Voltair/OTM (Zach Mortensen) for other radixsort algorithms, such using linked lists.
Pascal sample code - One radix (not tested)
Type
tList = Array[0..99] of Byte;
{---------------------------------------------------------------------------}
Procedure RadixSort (Var SourceList: tList);
Var
Entry : Byte;
Index : Byte;
Hash : Array[0..256] of Word;
DestList : tList;
Begin
FillChar (Hash,SizeOf(Hash),0);
For Entry:=0 to 99 do Begin
Index:=SourceList[Entry];
Inc (Hash[Index+1]);
End;
For Entry:=1 to 256 do Hash[Entry]:=Hash[Entry]+Hash[Entry-1];
For Entry:=0 to 99 do Begin
Index:=SourceList[Entry];
DestList[Hash[Index]]:=SourceList[Entry];
Inc (Hash[Index]);
End;
SourceList:=DestList;
End;
{---------------------------------------------------------------------------}
Var
MyList : tList;
Loop : Byte;
Begin
{ Generate random list values }
For Loop:=0 to 99 do MyList[Loop]:=Random(255);
{ Sort list }
RadixSort(MyList);
{ Display the result }
For Loop:=0 to 99 do WriteLn (MyList[Loop]);
End.
Pascal sample code - Four radices (not tested)
Type
tList : Array[0..99] of LongInt;
{---------------------------------------------------------------------------}
Procedure RadixSort (Var SourceList: tList);
Var { Preallocate these, instead of every time we enter SortRadix }
DestList : tList;
Hash : Array[0..256] of Word;
{- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -}
Procedure SortRadix (Shift: Byte);
Var
Entry : Byte;
Index : Byte;
Begin
FillChar (Hash,SizeOf(Hash),0);
For Entry:=0 to 99 do Begin
Index:=(SourceList[Entry] Shr Shift) And $FF;
Inc (Hash[Index+1]);
End;
For Entry:=1 to 256 do Hash[Entry]:=Hash[Entry]+Hash[Entry-1];
For Entry:=0 to 99 do Begin
Index:=(SourceList[Entry] Shr Shift) And $FF;
DestList[Hash[Index]]:=SourceList[Entry];
Inc (Hash[Index]);
End;
SourceList:=DestList;
End;
{- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -}
Begin
For Radix:=0 to 3 do Begin
SortRadix (Radix*8);
End;
End;
{---------------------------------------------------------------------------}
Begin
For Loop:=0 to 99 do MyList[Loop]:=Random($FFFF)*$FFFF;
RadixSort (MyList);
For Loop:=0 to 99 do WriteLn(MyList[Loop]);
End.