PACKING ALGORITHMS
by Axe of Superior
Axe of Superior (previously known as Axe of Delight) is a German hacker that has gained quite a lot of respect and fame by creating what it assumed to be the best packer on the ST - "Pack Ice". In this issue, we are proud to offer you an EXCLUSIVE article he wrote about packing techniques - as well as the latest version of the packer, which you will find in the PROGRAMS folder. Be warned, dear reader, that some knowledge of programming is required to understand this article!
Hello everybody. Well, here is an article that should interest you if you want to save some disk space. In fact, it should interest everybody who does some programming or other serious work with a computer.
I am sure you have all been confronted with packers, that can often by recognised by a color flashing with random colors in the background color of the screen. Depending on the size of the program and the speed of the packer, the flashing lasts longer or shorter.
I was not the first programmer who coded a packer on the ST, but I have always been interested in compressing data. When I had my Apple Computer, I was always angry that keeping screen shots takes so much memory, even if there is not much on the screens. Every picture took 8192 bytes (I think) no matter if the screen was empty or filled with nice graphics. For that reason I programmed my first packer ever - a picture compressor. I first wrote the program in Basic and then the unpacking routine in Assembler. Unpacking time took one second, which is very fast for an 8-Bit machine.
On the ST, there was need to pack more than just pictures. All data blocks and programs were bigger and needed to be packed. In search for a good packer, I tried several packing algorithms:
Run Length Decoding
It is always easy to program a packer that packs repeating bytes and sets one command byte if the following bytes are packed, but this way was not very effective. I will now explain it anyway, because it is very easy. Suppose you have the following bytes to pack:
--> 06, f0, 46, 46, 46, 46, 46, 46, 46, 46, 46, ef, 34.
You can define one byte as command byte (say ff) after which you indicate the byte and amount to be repeated. This would give you the following:
--> 06, f0, ff, 46, 09, ef, 34.
You have packed 13 bytes to 7 bytes which is about 50%. This is a good packing rate, but most times, you will not have 9 repeating bytes in a sequence in Assembler or C code. Well, it is good for packing blank data areas or screens with big areas of one color.
By the way, it will be more effective if you select a command byte that appears least often in your programming code. So, first count all bytes and define the command byte as first byte in your packed data. This Packing Algorithm is called Run Length Encoding.
Bytekiller
Don't get too enthusiasic with a packing rate of 50% in the upper example. The bytes were specially selected. Just imagine you have a data block with a repeating pattern. For example:
Assembler code: moveq #0,d0
nop
nop
nop
nop
nop
nop
bra.s test
If you assemble this, you will get the following bytes:
--> 70 00 4e 71 4e 71 4e 71 4e 71 4e 71 4e 71 60 08
Using the upper routine, you will not be able to reduce the length of the program by a single byte. You need a new routine that looks for repeating bytes and sets an information for the unpacker, like the following:
--> 70 00 4e 71 60 08
This would exactly look like the following:
Packed data: Unpacked data:
Write: 70 00 4e 71 > 70 00 4e 71
Copy the string with the length
of 2 bytes from position 2 to the
current position. 7000 4e71 > 4e 71
Copy string: length 4 from pos.
2 to current pointer. 7000 4e714e71 > 4e 71 4e 71
Copy string: length 4 from pos.
2 to current pointer 7000 4e714e71 4e714e71 > 4e714e71
Write: 60 08 70004e714e714e714e714e714e71 > 6008
That means: You kept 6 bytes of data in your packed block and you need 3 informations to copy bytes from another position to the current. If you consider that the informations to copy bytes don't need to be bytes themselves, but can also be bits, you can save some memory. Are you wondering how to mix bits with bytes? No problem. There are 2 ways. The first one is used in the "Bytekiller" (also known as "Jek-packer" - which was not programmed but converted from Amiga to ST by Sharaz Jek. By the way, "Bytekiller" is .‰still.Ä one of the most often used packers). You rip every byte apart to 8 bits and combine them with the bits for the packing information. Those bits are together put in the packed memory as bytes.
The other - 40% faster - method is used in "Pack-Ice" and most of the good packers. The idea is to combine unpacked bytes and information bytes, that consist of bits. The unpacking routine will know when there is an information byte and when there is an unpacked byte.
Now I will tell you in detail what these information bits look like and what they do in the example of "Pack-Ice".
When data can not be packed, that means there are no repeating byte patterns, it must be copied to the packed data directly, so it can be copied back to unpacked data when unpacking. This is the only way to keep all data. So you put those un-packable bytes in the packed data, followed by an information about the length in the upper mentioned format.
Let's say you have the following bytes:
d3 a7 7b 3a 44 69 79 b0 Those bytes can not be packed, so they
can also be found in the packed data,
followed by the length information (8)
in bits (see below).
The number of bytes is indicated by the following bits.
Number of bytes: 0 Information Bits: %0
1 %10
2 %1100
3 %1101
4 %1110
5 %111100
6 %111101
7 %111110
8 %111111000
9 %111111001
10 %111111010
..... .....
As you can see, the smallest length (0) needs just one bit. The reason is that this length appears most often. In fact it appears every time when packed data follows and not unpacked bytes that just have to be copied. So there .‰is .Äa good reason why the bits look so strange and do not all have the same length.
If data can be packed, then you need the %0 from the upper table for no unpacked bytes to switch to packed bytes and after that follows the information for the packed data, which is the following:
Copy data with from to the current position.
For length and object you use a similar table as the one shown above. When you have read the two values, you simply copy the needed bytes.
Example: 01 02 03 04 05 06 07 08 09 0a 0b <> 0c 0d 0e
Note that in "Pack-Ice" and nearly all other packers the offset is not counted from the beginning of the file, but it is counted backwards from the current position, so you will copy the following bytes: 08 09 0a (4 to the left from << and then 3 bytes taken). So you will get the following unpacked data:
01 02 03 04 05 06 07 08 09 0a 0b 08 09 0a 0c 0d 0e
Well, now you should know everything about the most often used packing algorithm. And if you use it in an optimized way, like in "Pack-Ice", you will get good Packing-Rates. The old "Bytekiller" uses the same algorithm, but not as optimized which makes the file a lot bigger and unpacking half as fast.
Huffmann
Now I have introduced you to two packing routines and will now show you a third method. It might be the best known - the Huffmann Packer.
As the Huffmann Packer is quite slow for unpacking, it is not often used, except for some archive programs like "lzh" or "arc". Right now I don't feel like explaining in detail how it works, otherwise I won't finish this article in time and it won't be in .ÅST NEWS.Ä at all. I am already late with writing it. But if you are interested in the Huffmann Packer, you will find lots of documentation in your local library.
The Huffmann Packer expects that the bytes in a file appear with a different frequency. Most times the bytes $00, $ff, $01, $02 are more likely to come up than $be or $97. For that reason, you can shorten the 8 bits of the most frequent bytes to - say 6 bits (to save 2 bits whenever this byte appears) and enlargen other rare bytes' 8 bit to 10 or more bits. If you do this in the best possible way, you will save a lot of bytes all in all. To pack with Huffmann, you will need to create a tree and sort the bytes for the most often and least often appearing bytes. If every byte appears the same times as the others, you will not save any bytes. In fact you will lose some, because for the Huffmann packer you need a table in front of the packed data that tells the unpacker how many bits every byte takes.
Dynamic Huffmann Packer
If you still improve your Huffmann Packer, you will get a Dynamic Huffmann Packer. It is better because it modifies its table while unpacking. Imagine you have a lot of 00 in the beginning of the file and a lot of ff bytes in the end. In this case it will be useful to change your table during unpacking.
Shannon-Fano Compression
Shannon-Fano Compression is another algorithm to reduce your amount of bytes. Like Huffmann, the SF Packer is a variable length packer. That means that you operate with bits, not with bytes. To explain how it works, imagine you have a text. Like in Huffmann, you count how often each character appears in this text. Then, arrange your character set in the right order based upon the probability of occurrence. After that is done, the set must be divided into two equal or almost equal subsets based upon the probability of occurrence of the characters in each subset. The first digit in one subset is assigned a %0 and the first digit in the second subset is %1. Now keep on dividing the subsets until each character has its own binary number.
Compared to the Huffmann Packer, you will get similar packing rates, so use the one you like best.
Others
There are still some other Packing Algorithms, that are good for special uses, like compression of text, pictures, digital sound etc.
To pack text, you can sometimes use the 8th bit that is not used by ASCII. Or you can use a dictionary packer, in which the text is analysed and every known word, like "and" or "computer" is replaced by a number. This is a special use and requires a big dictionary. You can also replace spaces by tabs (good for source codes). There are many ways to save some bytes for texts.
Pictures can be reorganized. Low res screens consist of 4 planes. One pixel is saved in 4 bits, that are in 4 different words in a sequence. If you put the 4 bits in 1 word, repeating pixels can be packed. You need different routines for medium res screens. You don't need to do any changes for hires screens, because there is only one plane. Another way to pack pictures is to use the fact that repeating pixels are often to be found in vertical lines.
Digital Sound is not easy to be packed, but it can be done with relative packing. That means, you always calculate the difference from every byte to the preceeding byte. Those differences repeat more often than the other bytes.
Animations are also packed with relative packing. That means, you just save the difference to the old picture. This saves a lot of data. Remember this when you digitize your next porno show...
Packers - a short summary of the most popular ones
Now I have shown you the most common packing routines and I think that your brain is burning now. So let us change the subject to some easier-to-read subject. I will introduce some different packers:
The first packer I have seen on the ST was the "Happy-Packer". It was named Happy Packer because at that time there was an article about packing algorithms in a German magazine called "Happy Computer". That was maybe 4 years ago and as you know this packer was not very good in size and speed of unpacking. A 60 kB program was unpacked in about 10 seconds.
The "Happy Packer" is a two-pass packer. It first does run length encoding and then uses the Huffmann Packer. In fact, it works exactly as explained in the magazine.
Another packer that was made was an improved "Happy Packer". This was done by -me- of The Exceptions. He added a pass and reduced the size of the packed programs. This packer was used in the legendary "Union Demo".
The next packer was the "Jek-Packer" ("Bytekiller" on the Amiga). It had best results compared to other packers and I often used it to pack some data. Until I suddenly found out that there were bugs in the unpacked data. The unpacked data was different than the original data. I changed the program a little and inserted a verify routine, that simply checks if packing was done correctly and I was stunned to see how often there were errors. The reasons was that the data was unpacked to the same address were the packed data was. This is a nice thing, but has the effect that packed data is sometimes overwritten by the unpacked data. This causes some fatal errors. There were some other bugs in the packer and an improved version coming from Ilja/Level 16, STC/Brainpower, -me-/Tex and Axe/Delight (note: now Superior!) was finished more than one year ago (February 1990). I named this packer "Delight Packer".
Oh yes, the "Jek Packer" only uses string packing, and it is also one of the packers that unpack all data from behind. There is not much advantage in this method, except that the files can be packed from the front, starting with the first byte and ending with the last. The "Bytekiller" uses d5 for a checksum and after unpacking a byte tells if the file was damaged on the disk.
The "Automation Packer" was well known, but there were also a lot of bugs in there. The packer didn't work on hard drives and the like. In fact, it was some bad programming and I am sure that the packing routine was also ported from the Amiga. It was the only intelligent part of the program, even though it was also full of bugs. Now, the "Automation Packer" has a nice shell and has been improved a lot. I don't know if the packing routine has also been improved or not. I never use it anyway.
The "Pompey Pirates Packer" is one of the better packers around.
It can compress strings, repeating bytes and replace bytes with only 1 bit set (1,2,4,8,..) by less than 8 bits. This is quite useful, as those bytes come up very often in files. Some things could be better though: The string packer is not very optimized, the offset could be set higher for better compression (up to $10000) and packing takes very long (Hi JPM! Are these enough improvement suggestions?).
Note: I am referring to the "Pompey Pirates Packer" 1.5, as I don't have 1.9 yet. Well, it might be improved. I don't know.
A valuable lesson
By doing a lot of packing stuff, I have learned that it is very useful to have a verify option in a packer, because there are often bugs hidden in the packer, even if you think that you have eliminated all. And nothing is worse than creating damaged files when packing. So I suggest all packer writers to include verify.
Another thing that is interesting for all lamers: Please learn to remove this horrible color flashing in the background. It hurts my eyes, it makes good graphics look like shit and uses up memory and processor time. Especially those guys who use "Pack- Ice" and insert their color flashing shit in the unpacking routine should be hot into space where they can do as much color flashing as they want to. Why do I optimize an unpack routine as much as possible if some lamer just adds 20 bytes for color flashing.
If you want to demonstrate that a program is packed, then say "Packed by...". That should do it.
I have often been asked when "Pack-Ice" 3.0 will be finished.... Well, to be honest, I haven't done any coding on "Pack-Ice" for 2 or 3 months. But right now I have the time to get back to coding. Maybe you will not get 3.0, but only 2.5 or something, but I'll try to give you the best packer in your hands, so you and everybody else can save a lot of disk-space.