Machine language data structures
Programmers' Workshop Conference
April 5, 1990
by John Leeson
The design and use of code or program structures for data used in programming is one of the most important considerations in early software development. This is particularly true for programs which process or use large amounts of data, but can be important in nearly any programming job. Modern programming methods concentrate more emphasis on data structures, and modern programming languages provide more code mechanisms for their design, definition, processing and protection. Machine language, however, being the lowest level of program code, is far removed from the high level languages when it comes to data definition. This does not make the ML programmers job any easier, and it is even more important to understand the types of data structures that might be used and how to implement them in the lowest level form. This paper outlines several of the most common data structures used and suggests how they can be implemented in 65xx assembly code.
Data structure TYPES that this paper discusses are ARRAY, LIST, QUEUE, and RECORD. Stacks are generally similar to the queue or list structure except for how they are accessed. Sets, maps and graphs are beyond the scope of this paper. Trees will be briefly discussed and related to list structures for the most common case of binary trees.
Array
The array can be single or multiple dimensions and is characterized by the ability of it's associated processing to randomly select, replace or swap the contents of the elements of the array. Each element of the array is like every other element of the array and a sub-division of an array has the same characteristics as it's parent has, except for size. Arrays are nearly always fixed in size rather than dynamic sizes. The array may be at a location fixed at assembly time or may be dynamically allocated memory at run time. The array, once it is set up, always exists. There is no such thing as an empty element. Each element can contain data of the type or size defined for the element. If an array element has not been assigned data by the program then the contents of that element are undefined. But the element still exists.
There are various ways to set up an array data structure in assembly coding. The method chosen depends on whether a generalized array structure is desired or you want an optimized setup for the particular array characteristics. The general case can be setup similar to the most complex case so it will be shown first, followed by simplifications for special cases that can be optimized. Array structures are important because most other structures will be set up using the principles shown for arrays.
In the most general case, the array will contain multi-byte values, have a large order (number of elements), and be in dynamically allocated memory. That is, the array will not be at a location determined at assembly time. For this structure, each element will be set up to occupy consecutive memory locations and have the index values increasing in memory starting at a pointer value which addresses the start of the array in memory. The zeroth element of the array using n bytes per element will thus start at the address contained in a pointer "array'ptr" and occupy n consecutive bytes. Element number 1 of this array will start at the address in "array'ptr" + n bytes and occupy n consecutive bytes from there. The total size of the array will be n * k bytes where k is the maximum dimension of the array. Here is the assembly language code to set up an array like this with maximum size equal to the assembler constant max'size and number of bytes equal to the assembler constant element'size.
zp'array'ptr = $20 ;zero page pointer (2 bytes) where a working
;pointer can be calculated and used
max'size = 300 ;assembler constant, max number of elements
element'size = 7 ;assembler constant, number of bytes per element
array'ptr .word 0 ;location where the address of the array will
;be placed at run time
;
;get'element : call with .X = element index low byte, .Y = element
; index high byte.
; returns with zp'array'ptr set to address of element
; in the array.
;
get'element = *
stx zp'array'ptr ;save the index for calculation
sty zp'array'ptr+1
lda #element'size ;get the size of each element
jsr multiply ;multiply the index in zp'array'ptr by the size
lda array'ptr ;and add the base address of the array
clc
adc zp'array'ptr
sta zp'array'ptr
lda array'ptr+1
adc zp'array'ptr+1
sta zp'array'ptr+1
rts
;
multiply = *
; this routine multiplies the value in the pointer at location
; zp'array'ptr by the contents of the accumulator and returns
; with the result left at zp'array'ptr.
sta mpy'multiplier ;set up work locations
lda #0
sta mpy'result
sta mpy'result+1
ldx #8 ;count of bits to multiply
- ror mpy'multiplier ;get bit of multiplier in carry
bcc + ;don't add if bit is %0
clc ;else add current value to result
lda zp'array'ptr
adc mpy'result
sta mpy'result
lda zp'array'ptr+1
adc mpy'result+1
sta mpy'result+1
+ dex
beq + ;finished if 8 bits counted
asl zp'array'ptr ;else multiply value * 2
rol zp'array'ptr+1
jmp - ;then go back and do next bit
+ lda mpy'result ;done. move result to zp'array'ptr
sta zp'array'ptr
lda mpy'result+1
sta zp'array'ptr+1
rts
mpy'multiplier .byte 0 ;temporary location to hold multiplier
mpy'result .word 0 ;working space to calculate result
Notice that it is necessary to have a general purpose multiply routine that is used to multiply the element size by the array index. I have included the general purpose multiply routine here for completeness. Even then, this is not a fully general purpose 16 bit multiply. It is a multiply of a 16 bit value by an 8 bit value and is optimized to do that. In general you don't want to do extensively more than is required in the assembly code. You could have set up for and used the BASIC interpreter floating point multiply here instead, but that would take a couple milliseconds to calculate where this special purpose integer multiply gets the job done in about 300 microseconds. For the best routine here you should have also checked to make sure that the index value did not exceed max'size and returned with an error flag if it happened.
There are many smaller cases of arrays that allow simplifications. If the maximum index value is 255 or less then one byte is sufficient to hold the index instead of 2 bytes being required. If the product of the number of elements and the element size in bytes is greater than 256 you will still need to do the 2 byte pointer and use a zero page location for it. The use of the zero page pointer allows you to use indirect-indexed addressing mode to get the contents of the array element addressed. For example, after executing the code above to set the element pointer then you could read the 7 values from the array with...
ldy #0
lda (zp'array'ptr),y ;get the first byte
iny
lda (zp'array'ptr),y ;get the next byte
<etc...>
The operation becomes a bit simpler if the array is fixed in memory at assembly time. That allows you to do some of the math with assembler arithmetic instead of having to do the arithmetic at run time. For example, to access element number 2 of the array at fixed memory address "array" you could do the following:
ldx #0
lda array+7*2,x ;get the first byte
inx
lda array+7*2,x ;get the next byte
<etc...>
The array would be declared at a location in absolute memory with an assembler label assignment. Or, if you wanted to imbed the array in your code to make it be part of the loaded executable program then...
array = *
.buf max'size*element'size
The operation becomes trivial when the product of the element size and max'size is less than 255. In that case you can simply use absolute indexed addressing mode if the array is at a location fixed at assembly time or use indirect indexed addressing mode if it is not. Simply multiply the index by the element size and put it in the .Y register, then do a lda array,y or, for a dynamic location array, lda (zp'array'ptr),y. You can take advantage of this simplification for multi-byte elements if the max size of the array is less than 255. To do this, instead of locating each byte of an element in consecutive memory locations you can put the first byte of each element together in consecutive memory locations, then put the second byte in another set of consecutive locations, etc. For example, to declare the array with 2 bytes per element and the max index of 199 you could do thus:
array0 = *
.buf 200
array1 = *
.buf 200
or the alternative for fixed memory outside of your code area...
array0 = $c000
array1 = $c000+200
Now to access element n of the array you can use a subroutine like this...
;get'element : enter with .A containing the index value to access.
; return with .X containing the first byte, .A containing
; the next byte of the contents of the array element
get'element = *
tay ;index to element
lda array0,y ;get the first byte of the contents
tax ;put it in the return parameter location
lda array1,y ;get the next byte of the contents
rts ;done
The simplest form of all for an array using ML is the conversion table. This is often used to convert 8 bit data bytes from one coding system to another. For example, to convert ASCII to CBM ASCII. In this structure the table is an array that is normally assembled into your program. The table is 256 bytes long and the contents of each element of the array are the CBM ASCII code for the equivalent ASCII character code used as an index to the array. A conversion routine to convert the contents of the accumulator as an ASCII coded character to a CBM ASCII coded character looks like this:
ascii2pet = *
tax
lda ascii2pet'table,x
rts
ascii2pet'table = *
.byte 0, 0, 0, 0, 0, 0, 7, 157, 0, 0, 0, 12, 13, 0, 0, 0
.byte 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
.byte 32, 33, ...
.byte 48, 49, ...
.byte 64, 97, 98, ...
Notice that the ASCII code 8 is the backspace character, which gets translated to the CBM ASCII code for cursor left. ASCII characters which have no CBM ASCII equivalent are translated to a byte value of 0, which can be printed with no effect. Also notice that ASCII code 65 is an upper case 'A' which is translated to character code 197, the CBM ASCII code for upper case A. If I had put in the complete table you would find that ASCII code 97 accesses the table with an index value of 97 and gets 65, the CBM ASCII code for a lower case 'a'. Also, using an index value of 127 for the ASCII DELETE character would get CBM ASCII code 20, the delete character. Values greater than 127 would have identical table entries to the values for character codes 0 through 127 since standard ASCII does not define characters greater than code 127. Thus the table can be shortened to 128 bytes by adding 'AND #$7F' before the 'TAX' in the above routine. That trades the execution time of the one instruction for 128 bytes of memory. If you wanted the routine to run as fast as possible you might not do that, however.
List
The list is a collection of similar items in any order. It can be likened to a single dimensioned array in most respects. However, the processing performed on a list structure is generally sequential rather than random. The abstract 'list' is defined specifically as a sequential structure rather than a random access structure. You can add or remove things only at the bottom of the list. To remove an item from inside a list you split the list into two lists at the place the item is to be removed, then remove the item from the bottom, then rejoin the two lists. Lists are often dynamic size or memory location and have a pointer linkage from one item to another on the list. Unlike an array, it is possible for a list to not have an element. A list that is only 9 items long does not have any 10th item, thus it is necessary for lists to define an empty item. Usually this is called a NULL item and is the result of the last item on the list having a NULL pointer.
Two forms of list are used, depending on what kind of access is desired to the list elements. The simplest form is a singly linked list where each piece of data has a LINK to the next piece of data. This list can only be read from 'top' to 'bottom' since that is the way the links go. The top item on the list points to the second item, which points to the third item, etc. The last item on the list points to a NULL item using a NULL pointer. An empty list has a NULL pointer to the list itself. The more general form of a list is doubly linked. There is a pointer from each element to the next item lower on the list and to the next item higher on the list. In this way you can traverse a list either 'top' to 'bottom' or vica-versa. The use of a linked list allows the items in the list to be ordered differently than the order the list was created in without moving the data in memory. Thus this structure is ideal for sorting items or for adding and removing items. Only the pointers get changed as items move around.
In the 6502 family of processors a LINK is a 2 byte absolute address, in general. For special cases of small lists it could be a one byte value that is added to the base address of the list in memory to get an absolute address. For a list of multi-byte values and a maximum size of less than 256 items, the link could be a single byte value that is multiplied by the size of the item in bytes and added to the base address of the list. Set it up with the link bytes first followed by data bytes in consecutive memory locations. Like with arrays, it is so convenient to have a power of 2 for the size of an item that you often may want to set the size of an item in bytes to be so, even if you don't need all the bytes. For example, if you need 12 bytes for a name string plus 2 bytes for pointers then you may set it up to use 16 bytes per item and leave the extra 2 bytes unused. Then a multiply of a pointer by the item size can be done by shifting the pointer value left 4 times instead of a general purpose multiply.
Working with a list is best accomplished in general by using a doubly linked list. You will need 3 pointer variables to maintain a list in memory. The first is a pointer to the memory address of the start of the list. If the list is fixed in memory at assembly time then this can be an assembler label rather than a pointer storage place. The second is a pointer to the HEAD of the list. This pointer addresses the first item on the list. Finally you need a pointer variable for the next available memory address where data can be added. An optional fourth is a pointer to the TAIL of the list. If an item is to be added to the tail of the list then this pointer gives direct access to it. You CAN get by without this pointer variable if the list has to be searched before adding an item to the list anyway. You know when you have reached the end of the list when you find an item with a NULL link and this is by definition the TAIL of the list.
This gives a structure of the list in memory as shown below:
back link <--- (ptr to start of list memory)
forward link
...
back link (null) <--- (ptr to head of list)
forward link
back link
forward link
...
back link <--- (ptr to tail of list)
forward link (null)
...
back link
forward link
start of available memory <--- (ptr to free memory)
The process of adding an item to a list is to put the data at the location pointed to by the next available memory address pointer. Then add the item's size in bytes (including the pointers) to the next memory address pointer. Set the new data item's forward pointer to NULL (usually a 0 value is used for NULL). Then set the back link of the new item to the pointer value for the tail of the list, set the previously null pointer in the item that was at the tail of the list to the pointer value for the item just put in memory, and update the pointer to the tail of the list to set it to the pointer value for the item just added.
To remove an item from the list you take the data of the last item in memory and overwrite the data of the item being removed. Then subtract the size of the item from the next memory address pointer. Copy the backward link of the deleted item to the backward link of the item that the deleted item pointed forwards to. Copy the forward link of the deleted item to the forward link of the item that the deleted item pointed backwards to. Change the link of the items pointing to the moved item (one forward and one backward) to point to it's new location. Finally, copy the old forward and backward pointers from the moved item's old location to its new location. These operations require two temporary pointer variable locations (usually zero page) plus one temporary storage location which holds the pointer to the deleted item (and the new location of the item moved during a delete).
To swap two items you simply swap the two item's pointers. A temporary pointer variable space is needed but can usually be the stack, while you use the accumulator for a pointer byte and the .Y index register to access the pointers using indirect indexed addressing. The code might look like...
;swap with 2 byte pointers for doubly linked list
swap ldy #0
- lda (from'ptr),y : pha
lda (to'ptr),y : sta (from'ptr),y
pla : sta (to'ptr),y
iny : cpy #4 : bne -
rts
A binary TREE is similar to a LIST except there are TWO pointers to the next data item. One points to the item referred to as the LEFT CHILD, another to the item referred to as the RIGHT CHILD. If it is a doubly linked tree then there is also a pointer backwards to the PARENT of the item. A NODE which has no children is called a LEAF in the tree. Binary trees are excellent for setting up and accessing large amounts of data sorted in some specified order. It is common practice to have the left child contain data that is less than the node value and the right child contain data that is greater than the node value. Each node on the tree contains both data and pointers and looks pretty much like an item in the LIST except that there are two forward pointers. A simpler form of tree has only pointer nodes in the tree structure itself. The pointers point to either another pointer node or to a leaf node. Leaf nodes contain only data. Pointer nodes in this tree structure sometimes contain some 'quality' of the data that is below them to be used for searching the tree. For example, in a decision tree the pointer nodes contain a question which can be answered true or false (or yes or no). A false answer for the question might then mean to take the left child of that node and a true answer to take the right child of that node. If the resulting child is a leaf then it contains the data being searched for. If the resulting child is another pointer node (has another question) then the process is repeated.
The storage of the tree in memory should, in most cases, be the same as for a list except for including space for the two forward pointers. The pointer to the HEAD of the list will instead be a pointer to the TOP of the tree. You will need no pointer to the TAIL since there is no such thing as the tail of a tree. You may need to have a flag byte reserved in the item storage space to determine if a node is a leaf node with data in it or a decision node.
The QUEUE is commonly used as a BUFFER in programming. A queue can be in many forms: First-in/First-out (FIFO - which is the common form for a buffer); Last-in/First-out (LIFO - which is nearly the same as a STACK); Or other forms with priority, balking or random access. Whole books are written on queues and queuing theory, so I obviously cannot cover much on the subject here. The FIFO form, however, is so common as a buffer that it really needs to be covered.
Depending on the data that is queued, the queue could take the internal data structure of a linked list or a simple list. Two variables are needed to control the queue - the FRONT of the queue and either the END or the LENGTH of the queue. In order to get to some examples and useful processing I am only going to show forms which have single byte elements. This is likely to be just what you need anyway, and the ideas are easily extended to larger data items.
Three forms of the queue are common for byte buffers; the static linear buffer, dynamic linear buffer and circular buffer or queue. An example of the first is a disk buffer. It is first filled with bytes by an input routine, then emptied by an output routine. For this structure, a flag is needed to allow the input routine to tell the output routine when there is no more data to be put into the queue (the EOF flag). The control flow for a disk buffer is generally that the input routine reads data into the buffer until it is full or the end of file is reached. It then calls the output routine, which removes data from the buffer until it is empty. Then the output routine checks the input EOF flag and if not set, calls the input routine. If it is set, then the output routine returns to the higher level control program. Buffer empty occurs if the length of the queue is zero or if the end of queue variable is equal to the start of queue variable. Buffer full occurs if the length is equal to the maximum length or the end of queue variable is one greater than the maximum address allowed for the queue storage space. The end of queue variable always points to the next position where data can be written. If a queue length variable is used then the next position for write is equal to the start of queue variable plus the length.
A dynamic linear buffer is used where the input routine and output routine are independent from each other. Data is put into the queue asynchronously with it's removal from the queue. An example of this type of queue is the keyboard buffer. Characters are put into the queue by the keyboard scan routine executing during IRQ interrupts. Characters are removed from the queue by an application program with calls to the Kernal GETIN routine with the default input device active. It is usually only used where the length of the queue is quite short. The keyboard buffer is only 10 characters long. It should never be used where the length of the queue exceeds 255, so that indexed addressing can be used to access it. The length of the queue then becomes the index register value used to write to the queue. The data is removed from the queue if the length is non-zero. The next byte is removed from the start of queue position and all remaining bytes are moved down by one in memory until queue'length-1 bytes have been moved and the length is then decremented by one. If the buffer is assembled into your code then the move is simplified. If the buffer is at a location determined at run time then indirect indexed addressing must be used. A comparison of the two move routines follows.
move'queue = * ;fixed location move'queue = * ;dynamic location
ldy #1 ldy #1
- cpy length - cpy length
bcs done bcs done
lda qstart,y lda (qstart'ptr),y
sta qstart-1,y dey
iny sta (qstart'ptr),y
bne - ;unconditional iny
done dec length iny
rts bne -
done dec length
rts
"qstart" is an assembled-in absolute location for the queue. "qstart'ptr" is a zero page pointer variable that is set to the start of the queue at run time.
The third form of the buffer queue is also asynchronous. An example of the circular queue is the RS-232 buffer. Bytes are placed into the receive queue by interrupt service routines for RS-232 input. They are removed from the queue by calls to the Kernal GETIN routine with the RS-232 channel active as input. Bytes are placed into the output queue by calls to the Kernal CHROUT routine with RS-232 channel active. They are removed from the queue by the RS-232 transmit routine which is interrupt activated as long as there is data in the queue. The output start is initiated by writing a byte to the transmit queue when it is empty. The input start is initiated during setup of the channel. For the circular queue there are 3 variables needed. The start of queue, the Next Read Position (NRP) and the Next Write Position (NWP). The buffer is empty when the NRP and NWP are equal. The buffer is full if the NWP is one less than the NRP. When the variable NWP or NRP is incremented after writing a byte or reading a byte and it then exceeds the end address for the queue then it is reset to the start address for the queue. A queue length of 256 bytes makes the circular queue work extremely well for the 65xx. It is only necessary to increment the index and ignore overflow to make it be circular. The code follows for a circular queue of 256 bytes.
qwrite = * ;enter with byte to write in .A
ldy nwp
iny ;wraps around from 255 to 0, overflow ignored
cpy nrp
beq queue'full
sta (qstart'ptr),y
sty nwp ;nwp updated AFTER the byte is stored, NEVER before.
rts
queue'full
qread = * ;return a byte in .A or return with carry set if empty
ldy nrp
cpy nwp
beq queue'empty ;equal condition also has carry set
lda (qstart'ptr),y
iny ;wraps around from 255 to 0, overflow ignored
sty nrp ;nrp updated AFTER the byte is read, NEVER before.
clc
queue'empty rts
For queue lengths less than 256 you can do a CPY MAX'LENGTH after the INY in these routines and reset it to 0 if .Y is greater than the desired length. For queue lengths of greater than 256 then the NRP and NWP become zero page pointers which are incremented until they exceed the maximum size then reset to the start of the buffer. Then the loads and stores use (NWP),Y with .Y set to zero. The compares for NWP = NRP need to be 16 bit compares instead of simple register compares.
A RECORD contains several pieces of information which are not similar but all pertain to a single object. A person, for example, has a name, address, phone number, and possibly other characteristics such as bowling average or amount that he owes you. A item of stock in a store might have an inventory number, manufacturers name, amount you pay for it, amount you sell it for, and how many you have in stock. The collection of information about the object is the RECORD. Each separate piece of information is called a FIELD. A number of records can be put together into a DATA BASE which might be structured like an array, a tree or a list. Each field in the record has a name which can be used to access it. In high level languages a specific field is completely determined by concatenating the field name to the record identifier. For example, you might refer to your zip code as ME.ZIP where ME is the identifier of the record containing info on yourself, and ZIP is the name for the field containing your zip code.
For working with records in assembly language you generally will want to keep the information in fixed length fields. This is the Pascal style record structure. With fixed length fields, each field has a predetermined number of memory bytes reserved for it. The data in the field either exactly fills the field or has a predetermined terminator byte marking its end. To access a specific field in a record you can then use the address for the record and an index value for the offset to the field from the start of the record. I prefer to define assembler constants for the offsets using labels which are appropriate for field names. A structure for storing a disk directory in memory as an array of records then might look like this in assembly code:
directory = *
fileflag = *-directory
.byte 0 ;flag for file selected or not: multipurpose
blocks = *-directory
.byte 0,0 ;file size, low byte/high byte disk blocks
filename = *-directory
.asc "16 char filename" ;file name padded with bytes of zero
splat = *-directory
.asc "*" ;"*" if splat, else space
filetype = *-directory
.asc "p" ; p, s, u, r, c, d (1st letter of type)
lock = *-directory
.asc ">" ;">" if file locked, else space
spare = *-directory
.byte 0,0 ;unused
.buf 4680 ;space for 195 more. Total 196 for 1581 directory.
This structure is flexibly defined in assembler code such that it would be easy to change. If you wanted to add a second flag byte after 'fileflag' you could take one away from the spare bytes and just stick in a new label for the field as the others were done at the place where you want the byte to appear. All the rest of the offset labels will be adjusted accordingly. Also, the total count of bytes in the record have been set up for 24. This is a fairly easy number to work with in assembly language math for indexing. Here is an example program fragment to "process" each marked directory entry:
do'em = *
inc an'entry ;present entry number
lda an'entry
cmp numb'files ;all done?
bcs exit
jsr get'new'fileflag ;get its flag
beq + ;no processing needed
jsr process ;whatever that is
+ jmp do'em ;note - structured block
exit = * ;continue
get'new'fileflag = * ;get directory(.A).fileflag for new entry
jsr set'dir'ptr ;set zp'dir'ptr to point to directory(.A)
get'fileflag = * ;get fileflag of present entry
ldy #fileflag ;offset of fileflag field in directory record
lda (zp'dir'ptr),y ;here it is
rts ;destroys .Y, .X is preserved, result in .A
;
set'dir'ptr = * ;set pointer to directory(.A)
ldy #0
sty zp'dir'ptr+1 ;directory pointer variable, high byte
asl : rol zp'dir'ptr+1 ;*2
asl : rol zp'dir'ptr+1 ;*4
asl : rol zp'dir'ptr+1 ;*8
sta zp'dir'ptr : ldx zp'dir'ptr+1 ;*8 in .A and .Y
asl : rol zp'dir'ptr+1 ;*16 in .A and memory
adc zp'dir'ptr : sta zp'dir'ptr ;*24 low byte (carry always clear)
tya : adc zp'dir'ptr+1 : sta zp'dir'ptr+1 ;*24 high byte
clc
lda zp'dir'ptr : adc #directory : sta zp'dir'ptr+1
rts ;Destroys .A and .Y, .X is preserved.
Data structures have been presented to help familiarize you with how to set up and use ARRAY, LIST QUEUE, and RECORD type structures in 65xx assembly language. Also, a small amount of information on TREE structures was included. These structures SHOULD cover nearly all the ones you will need in programming as a home hobbiest. There are other exotic structures, maps, graphs and sets, which are beyond the scope of what I can cover in just a short paper such as this. I also did not cover the STACK, but these structures are very similar to an array or a list structure, depending on whether it is open-ended or fixed. The main characteristics differentiating them are the processing that is permitted and the way the structures are accessed. I believe that sufficient information is included in what IS here so that the extension to these structures is relatively easy.
This set of data structures is sufficient to provide program implementation of the data that is needed for any application. The methods to do so should be fairly obvious to a person that understands the structures presented in this paper. It may require use of an array of records as in the last example worked or some other combination of data structures. I hope that this paper helps you in doing so.
disclaimer
The above document is the sole work of the author and is for informational and educational purposes. It is intended as a review and should be used as such. I except no money, royalties, or gratuities for its contents. I also will not be liable for misuse or any damages, either direct or consequential, from use of any information found here. ALL INFO IS USE AT YOUR OWN RISK !!!