Issue # 7 DTACK GROUNDED Newsletter - February and March 1982
LET HAPPINESS REIGN TRIUMPHANT THROUGHOUT THE LAND! Joy, joy, halloo! Oh sweet rapture, felicitous delectations! TANDY HAS ANNOUNCED ITS 68000 MACHINE!
And the Tandy 68000 machine, dear friends, is essentially identical in concept to the approach we have taken. The Tandy 68000 will continue to use the existing Z80 operating system of the Model II. All existing Model II programs can be run on the new machine. Sound familiar? Old Model IIs are upgradeable to the new configuration. Some - but not much - software using the power of the Tandy 68000 will be available as the first upgrades are shipped, but much more (naturally) will be available around the end of this year (also naturally).
If any part of the above information comes as a surprise to you, this is obviously the first issue of this newsletter you have ever read. And if you think that we are unhappy over having another 68000 processor/upgrade competing with us, you are out of your mind! Here are a few reasons for our happiness:
FIRST: Although we didn't invent the attached processor, we WERE the first to introduce the concept to the mass personal computer market. And now the LARGEST domestic manufacturer of personal computers has essentially endorsed our concept. That is called instant credibility!
SECOND: There will now be an enormous number of independent software vendors developing software for 68000 attached (or upgrade) processors. True, most of them will be developing software to work with CP/M, but there are a few Apple owners with softcards who are running CP/M, no? And any software written for one 68000 attached/upgrade processor can be readily adapted to another with a different host, as we have pointed out previously (you may recall).
THIRD: Although Tandy developed the upgrade processor for its own Model II, it is the Apple Model II which dominates the U.S. personal computer market. Our personal estimate is that for every TRS 80 II there is one (big) Pet and about twelve Apple IIs (domestically). WE, not Tandy, are selling attached processors for the Apple II. And a few for the Pet. So we have the larger market for OUR upgrade.
FOURTH: Now that Tandy has retained the existing operating system for their 68000 upgrade processor, the pressure is removed from US to develop an operating system for OUR attached 68000 processor. Now, we had no intention of doing so. But many of the subscribers to this newsletter, who have perhaps not thought the problem through, have written that we need a new operating system. WRONG!
FIFTH: We have tried to explain that since the 68000 is new, there CAN'T be a whole lot of software for it RIGHT NOW. Tandy is ALSO admitting that they don't have much 68000 software right now (SURPRISE!). But with both Tandy and - ahem! - Dtack Grounded developing software for the 68000 processor, there WILL be.
Page 2
INTRODUCING THE TANDY 68000 UPGRADE PROCESSOR:
Much of the following information is taken from the 20 column inch story that appeared in the Jan 20 issue of the Wall Street Journal. Other information came in via several phone calls the same day (one caller was nice enough to read information from the Tandy multi-color brochure to us over the phone).
The new unit is a $1499 upgrade, 128K expandable to 256K, for the TRS 80 Model II. It is a two board unit which plugs inside the Model II.
For people who do not already own a Model II, they will sell a VERY slightly different unit to you. They call this the TRS 80 Model 16. It has an ivory colored case and is expandable to 512K. Otherwise it is the same as the Model II upgrade. The price is $5798 and you get TWO 8 inch floppies, the new compressed style so they fit in the same space as the old single drive. Cheapskates can buy the one disk drive version for only $4999.
The 68000 SHARES the bus with a Z80. When one is running, the other isn't. With our boards, the 6502 and the 68000 both run full time. On the other hand, neither can access the other's memory space directly, as can the Tandy system. There are advantages both ways. We prefer not to shut down the 68000 while the old 8 bit chip monitors the keyboard or does a disk access.
It appears that our prediction of a low end (pricewise) machine with a 4 MHz clock was wrong, they came in at 6MHz instead (looks like the engineers won over the marketing types). However, it is clear that our 8 MHz unit is a lot faster than the Tandy machine.
If YOU had any reservations over whether the 68000 is REALLY faster than the 8088, listen to the WS Journal: "... much more powerful but about the same price as personal computers made by Apple Computer Inc. and International Business Machines Corp."
THERE AIN'T NO INSTANT 68000 SOFTWARE: The Journal also reported: "buyers of the Model 16 won't be stuck with a machine that can't do anything until new software written specifically for the Model 16 becomes available later this year..." Translation: you get to run Z80 code until Tandy gets around to writing some 68000 stuff.
TANDY - DTACK GROUNDED DIFFERENCES: Aside from clock frequency (and hence performance) differences, the smallest Tandy upgrade is the $1499 128K two board unit. This compares with $695 for a 4K or $1395 for a 92K SINGLE board configuration from us. We use static RAM, they use dynamic.
The use of static RAM is the penalty WE have to pay to run at 8MHz and also to have a nice reliable memory system. The Tandy unit has a parity error indicator but no error correction, and with the new 64K dynamics that is not quite good enough for serious computing purposes.
In any event, if you already own a Pet or an Apple, you are stuck with US. Tandy is only supporting their own Model II. Even their Model III is left out in the cold. But isn't it nice they didn't call their new unit the TRS 80 ////////////////?
Page 3
CLEAR ALL THE CHILDREN FROM THE ROOM: You will not want to expose any of the following to your children. We remember when a certain minicomputer manufacturer introduced a multi-user system which (supposedly) supported 16 ea. teletypes, the old ASR33. The CPU on this minicomputer was 8 bits wide. As soon as one user learned how to write a loop, everybody else found themselves with a dead terminal. Even without a loop, the response time was ludicrous.
WE ARE PREJUDICED: Possibly as a result of that exposure, we are severely prejudiced in favor of ONE MAN, ONE (or more) CPU(s). Radio Shack has decided that their 6 MHz Model 16 can support 3 (three) users. So Radio Shack users get to learn about foreground-background, system and user stack pointers, the fact that the user cannot execute many 68000 instructions because they are 'privileged' (available to the system only) and a whole lot of other wonderful stuff.
WE do NOT like someone to tell us there are parts of OUR computer system that we cannot access. Whether that someone is an imperious Data Processing Supervisor in his white coat or over-complicated operating system software. There is very definitely NONE of this nonsense in our Dtack Grounded hardware or software.
Please note that we used the name Radio Shack rather than Tandy just now. We would not want the respectable name of Tandy associated with an (ugh!) multi-user system!
Oh? You don't think the Tandy name is respectable? You are hereby advised that the Tandy operating profit for the most recent three month period is $70 million. The corresponding figure for Apple Computer is about $13 million.
THE CP/M CONNECTION:
We had intended to eventually work our way around to the S100-CP/M systems via the Apple Softcard. Our attached processor doesn't care WHAT processor is in charge of the Apple bus - 6502, Z80, 6809 or 8088 (anybody working with the 4 bit TMS 1000 yet?). So, with appropriate software for the floating point formats used by the Microsoft Z80 BASIC, our board can be used to enhance CP/M systems.
The question is, do we write that software ourselves, or do we let one of the independent suppliers of CP/M based software to the new Tandy 16 develop it? And then adapt it to run on the Apple softcard because there are a lot of softcards? And a SYSGEN program to determine whether our 68000 board is available? (These are called rhetorical questions, folks.)
EVEN MORE 68000/Z80 SYSTEMS YET!
Cromemco has jumped on the 68000 bandwagon, and they have ALSO kept the Z80 so they will have some software to run. They are running at 8MHz nominal (but with DTACK not grounded, so it is a little slower than our board). They are using 64K dynamic RAMs - CORRECTLY! "Each 16-bit data word has 6 bits appended to allow a modified Hamming-code detection-and-correction algorithm to detect 1- or 2- bit errors in each word and to correct 1-bit errors. An error log on each card stores error locations." We applaud!
Page 4
That quotation is taken from Electronics magazine, 10 Feb '82, p. 182, and that method is absolutely the ONLY way that 64K dynamic RAMs can be used reliably. It is NOT simple, however. If you have or like S-100 systems and you like the 68000 (and the Z80) you should go talk to Cromemco. Their prices are higher than ours, but then EVERY other 68000 system we have seen is priced higher than ours, with the exception of the Motorola educational board which has not yet surfaced (and is not designed for speed).
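For the curious, here is a minimal sketch of how 6 check bits can protect a 16 bit word. It assumes the plain textbook layout (a 21 bit Hamming code plus one overall parity bit); the quotation says Cromemco uses a "modified" code whose exact layout is not given, so the bit positions and the names `encode` and `decode` below are illustrative assumptions, not Cromemco's design.

```python
PARITY_POSITIONS = (1, 2, 4, 8, 16)            # five Hamming check bits
DATA_POSITIONS = [p for p in range(1, 22) if p not in PARITY_POSITIONS]  # 16 slots


def encode(word):
    """Return 22 bits: index 0 is the overall parity bit, 1..21 the Hamming code."""
    bits = [0] * 22
    for i, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (word >> i) & 1
    for p in PARITY_POSITIONS:
        # Check bit p covers every position whose index has bit p set.
        parity = 0
        for pos in range(1, 22):
            if pos & p and pos != p:
                parity ^= bits[pos]
        bits[p] = parity
    bits[0] = sum(bits[1:]) & 1                # make the total number of 1s even
    return bits


def decode(bits):
    """Return (word, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    bits = list(bits)
    syndrome = 0
    for pos in range(1, 22):
        if bits[pos]:
            syndrome ^= pos                    # XOR of the positions holding a 1
    overall = sum(bits) & 1
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome < 22:
        bits[syndrome] ^= 1                    # single error: syndrome names the bit
        status = "corrected"
    else:
        return None, "uncorrectable"           # even parity but nonzero syndrome
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        word |= bits[pos] << i
    return word, status


codeword = encode(0xBEEF)
damaged = list(codeword)
damaged[13] ^= 1                               # flip one bit "in the RAM"
print(decode(damaged))
```

Flip any one of the 22 bits and the word comes back intact; flip any two and the decoder reports the error instead of returning bad data. The error log mentioned in the quotation would simply record the syndrome and address each time a correction occurs.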
MORE RHETORICAL QUESTIONS: Do you think the industry has caught on to the fact that 68000 systems need (in 1982) an older processor which has substantial existing software support to 'carry' the system until more 68000 support is forthcoming? Now let's see: just WHO has been pushing that idea for some time now? Did you know that the Tandy system was originally slated to have a 6809 as an I/O processor, NOT the old Z80? And that the decision to keep the old Z80 was made (according to a reliable source) AFTER our fourth newsletter was published? Maybe you should re-read the second paragraph on page 5 of our newsletter #4? Did you know that large numbers of (photocopied) Dtack Grounded newsletters are circulating in the Dallas-Ft. Worth-Houston area?
Here is the scenario as we see it: a technician or junior engineer who has read our newsletter brings up the idea of keeping the Z80 (and all that software) with his engineering group leader. The engineering group leader, recognizing a good idea, takes it to the management oversight committee. Result: five middle management types are walking around Tandy Towers taking credit for coming up with the idea of keeping all that Z80 software. NONE of them have ever heard of Dtack Grounded, and ALL of them would be indignant if presented with this hypothetical scenario.
But you can be absolutely certain that WE didn't steal the idea from THEM!
EXCITING NEWS FROM MOTOROLA:
Motorola has just announced the obvious: They are taking the 68000 architecture to its next logical step. A true 32 bit data bus, inside and outside. Not, repeat not, multiplexed with the address bus. The device will be called the 68020 and will be made in an 80 pin (or more) leadless JEDEC package. Yes, this package is socketable. It will have indirect addressing, what we would call double indirect: the address register (or other addressing mode) points to a memory location which contains the address of the operand. The 68020 will have an instruction prefetch 'cache' which Motorola seems to think is a new invention. The 8086 has been using one for the past three years!
The 68020 will have 'hooks' to a math coprocessor chip, called the 68881. The 68881 is an 8087 with a Motorola label. Oops, reset! The 68881 will be a math coprocessor which will function in a nearly identical way in nearly all respects to the 8087. See the distinction? You do? Would you please explain it to us?
The 68881 will be usable with the 68000 as a peripheral, not as a coprocessor.
Page 5
WE AREN'T EITHER DUMBER THAN THAT BIG GREEN FROG: Although the 32 bit 68020 has been ANNOUNCED, we suggest that you refrain from rushing to your nearest Motorola distributor with cash in hand. It is very unlikely that commercial quantities of this product will be available before the second half of 1983. Nevertheless, this announcement confirms that we were on the right track when we picked the 68000 over other competing microprocessors. There will be a very logical and simple conversion of software from the 68000 to the (extremely high performance) 68020/68881.
Don't let the fact that we poke a little fun at Motorola obscure the fact that the 68020/68881 will be THE highest performance system available (when it is available!).
FIGHT! FIGHT! Motorola and Intel are slugging it out for performance leadership in the microprocessor market. We, as users, are fortunate to have two such competent organizations around. Without such vigorous competition, we might still be using the 8008 or 4004.
Both organizations (and others) have adopted the practice of announcing wayyyyyy in advance of delivery of product. Our problem, as users, is to correctly select which family of microprocessors to use. This is an important decision because, as some of you may have noticed, switching microprocessors can be a traumatic experience.
BAD, BAD 16 BITS (?): The experience is evidently even more traumatic when switching from an 8 to a 16 bit data bus (WE don't understand that; we have had no problems with 16 bits at all) and when switching from a $2.50 processor to a $100 (appx) processor. Most purchasing agents are too dumb to understand that a $100 price tag is irrelevant when it comes to choosing the power plant of a computer with typewriter keyboard, 80 column CRT, two floppy disks and a daisywheel printer. The performance that comes WITH that $100 price tag is VERY relevant.
Intel apparently correctly identified the problem about five years ago (Intel does a lot of things right). They chose to build their 16 bit processors in a way that preserved downward (backward?) compatibility with 8080 code. Unfortunately, this meant they had to retain the architectural features of the 8080 rather than adopting an architecture selected for high performance.
Since the 8086 has been around for over three years - it had a two year lead over the 68000 in the marketplace - we cannot understand why the IBMs and the Victor/Sirius computers and such did not appear much earlier. Especially since the 8080 code those machines are running was available before the 8086 became real. This is an obvious case of a (very nearly) missed marketing window.
Motorola either did not identify the problems of upward migration or (and this is our opinion) they assumed that only the minicomputer crowd would be using the 68000. WRONG! There were, and are, too many people from the microcomputer crowd who appreciate the advanced features of the 68000 to let them get away with that! But they did manage to stonewall US for over six months after we bought our first two samples (8MHz, $249 each).
By choosing to go with a completely new, highest performance architecture in the 68000, Motorola guaranteed a MAXIMUM HASSLE upward migration path.
Page 6
8086 - 68000 UPGRADE EQUIVALENCE:
It turns out that what has just emerged as the de facto standard 68000 upward migration technique is functionally identical to the 8086 technique. In both cases, we start out running programs and code written for older 8 bit processors while true 16 bit code is developed which takes advantage of the vastly increased performance possible in the new processors. And while we feel that the 68000 is significantly superior to the 8086, it is unquestionable that the 8086 is greatly superior to ALL of the previous generation 8 bitters.
The difference is that the 8086 runs the old code in a quasi-emulation mode while the 68000 runs the old code in the old $2.50 processor! Result: when real code is written for the two contending 16 bit processors, the 68000 is better because it is not handicapped by having to retain obsolete architectural features. Intel loses the performance race because of superior advance planning and Motorola wins by ignoring the 8 bit micro crowd. Life is not always fair.
WHAT HAS MOTOROLA DONE RIGHT THAT INTEL HAS DONE WRONG?
Simple. Motorola (evidently) researched past developments in the field of very high performance minicomputers. The highest performance minicomputer over the preceding decade was the PDP11 series, especially the 11/45 and the 11/70. The general architecture of the 68000 is therefore modelled after this line (the basic features are VERY similar).
But Motorola avoided the fatal mistake made by Fairchild with their defunct Microflame 16 bitter. Fairchild copied the old Nova 1200 architecture so closely that they also copied all the ten-year-old mistakes such as a 4 (yes, four) bit ALU in the 16 bit machine.
MISTAKES? IN THE PDP11/70? Yep. First, the address registers are limited to 16 bits, so that segment switching must be used to address beyond 64K. Second, the 16 bit data registers are too small. But the biggest problem, the one that forced DEC to develop the VAX (32 bit) systems to achieve higher performance, was (and is) that fabled, patented and trademarked bottleneck called the UNIBUS. The UNIBUS was/is a 16 bit wide bus over which all data and address information must be passed. Since both data and address information appear on the bus, the bus is multiplexed (obviously). DEC is, of course, the venerable Digital Equipment Corporation.
DON'T GET STUCK ON THE BUS:
As ALU (Arithmetic and Logic Unit) speeds converge asymptotically toward zero, the limiting factor in high speed computations turns out to be the 'bus bandwidth'. Motorola decided to use separate, non-multiplexed buses for data and addresses in the 68000. They also provided what most of us consider adequate linearly addressable memory with no segment switching (linearly addressable with no segment switching is a redundant phrase). The 68020 carries this philosophy to the next higher plateau by doubling the data bus bandwidth.
The bus bandwidth is the limiting speed factor for von Neumann architectures. This accounts for the recently-renewed interest in non- von Neumann computer architectures.
Page 7
HOW DOES THE INTEL BUS STRUCTURE WORK?
Slowly. Seriously, the 8086, the newly announced 80286 and the 60 foot tall gorilla called the iAPX 432 all use a SINGLE 16 bit path over which both data and addresses must be transferred. In the case of the iAPX 432, up to 5 (five) 32 (thirty-two) bit execution units are hung on 1 (one) 16 (sixteen) bit bus which, as we have already stated, must carry both data and addresses. By the way, EACH 'execution unit' consists of 2 (two) integrated circuits.
We think that the bus structure adopted by Intel is a mistake.
PAST SINS DEPT: In the last issue, we briefly discussed data path analysis as an introduction to comments on the iAPX 432. What we did not know at the time was that the latest issue of Computer (the IEEE publication) was at the printers and featured - guess what? - data path analysis (they call it 'data flow', but what does the IEEE know?). The whole darn Feb '82 issue is dedicated to the subject. If this interests you, don't miss it.
For ourselves, we were reminded of the 'penguin problem'. You know, "This book told me more about penguins than I really wanted to know."
Also, about that iAPX 432 critique in the last issue: the ideas, opinions and interpretations expressed originated completely and exclusively here at Dtack Grounded. The main reason for that writeup was our inability to obtain information (from the industry media, technical representatives or personal friends and co-workers) about the 432 that was either consistent or sensible.
We wish to make this explicitly clear because it turns out that an Intel competitor, one who also manufactures high performance microprocessors, has an in-house document on the iAPX 432 (an in-house document is to be greatly preferred over its opposite). According to a phone call we received AFTER the last newsletter was mailed, the information in that document is VERY similar to what we printed. We are now lobbying to be let into the inner circle of those who have been privileged to read it.
WE ARE NOT, REPEAT NOT, ANYONE'S STALKING HORSE: Because the contents of that document and what we wrote are (we are told) so similar, and because the conclusions we reached do not immediately follow from the promotional information on the iAPX 432 provided by Intel, an informed person could reasonably conclude that the item was either 'planted' or that we had violated a confidence.
Neither conclusion is true. Although we regularly publish information from other sources - this is being written several days ahead of the official unveiling of the 68020 - we pass ALL of that information through the official Dtack Grounded filter prior to publication.
FOR INSTANCE: We are happy that Motorola has no 'NIH' factor (in engineering, anyway) and that they are willing to incorporate advances into their products whatever the source. The 68020 incorporates an instruction prefetch cache, coprocessor hooks and works with a math coprocessor. In each case, the 8086 was there first.
Page 8
PAST SINS II: With this issue of the newsletter you will have received, if you are a subscriber, a copy of a letter written just before we decided to go commercial with Dtack Grounded. There is a MISTAKE (no! gasp!) on the first page which, along with other parts of the letter, became part of issue #1 of this newsletter. We stated that:
1) The 16 bit add in the 6809 is not an add with carry,
2) Two 16 bit adds therefore cannot be combined as a 32 bit add,
3) The 6809 is not pipelined like the 6502 and requires an extra clock for simple instructions such as a zero page fetch,
4) That the 6809 is therefore slower than the 6502 in performing a 32 bit add,
5) That the 6809 is nevertheless faster overall than the 6502, although by a small margin.
In newsletter #1, we added the following:
6) That the 6809 would require 48 microseconds (1 MHz) to perform a 32 bit add,
7) That the 68000 performs a 32 bit add in 0.75 microseconds,
8) That the 68000 is therefore 64 times faster than the 6809 for this particular operation.
SEVERAL 6809 ENTHUSIASTS WHO READ THAT LETTER have written to conclusively disprove 4) and 6). In the process, they have also conclusively proven that 5) is emphatically correct. All of the writers have avoided discussion of 8), which is also wrong ( 1), 2), 3) and 7) are correct).
IN OUR IGNORANCE, we had overlooked the double-byte loads and stores which operate slightly faster than two single byte moves in the 6502. As a result, the 6809 can add 32 bits in 34 microseconds versus 38 microseconds for the 6502. The 6809 is therefore 11.8% faster than the 6502 (whoopee), while the 68000 is ONLY 4,433.3% faster than the 6809 (for this particular operation). NOT 6,300.0% as claimed in 8).
WHY, FRIENDS, OH WHYYYY? We wonder why the several letter writers who so vigorously defended the 6809 performance versus the 6502 neglected to point out that the 68000 is a mere 4,433.3% faster?
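For anyone who wants to check our arithmetic, the percentages above follow directly from the 32 bit add times quoted. A small sketch (the function name `percent_faster` is ours, purely for illustration):

```python
def percent_faster(slow_time, fast_time):
    """How much faster, in percent, the quicker machine is."""
    return (slow_time / fast_time - 1.0) * 100.0


# 32 bit add times in microseconds, as quoted in the text.
T_6502 = 38.0
T_6809 = 34.0
T_6809_OLD_CLAIM = 48.0     # the figure from newsletter #1, since disproven
T_68000 = 0.75

print(f"6809 over 6502:   {percent_faster(T_6502, T_6809):.1f}%")
print(f"68000 over 6809:  {percent_faster(T_6809, T_68000):.1f}%")
print(f"old (wrong) claim: {percent_faster(T_6809_OLD_CLAIM, T_68000):.1f}%")
```

Note that "64 times faster" and "6,300% faster" are the same statement: a ratio of 64 is an improvement of 63 times the original speed.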
MORE 6809 STUFF: We have been prodding the two magazines dedicated primarily to 6502 systems to provide some coverage of the 68000 processor. Not OUR processor particularly (honest!), but the 68000 processor in general. Both have declined, based (we think) on a lack of knowledge among the editors and publishers about the 68000. A similar state of knowledge about the 6502 existed in 1971, no?
We have written to both magazines, not always in the most tactful manner, that they were risking becoming irrelevant by overlooking the TRUE new generation processors. (The 6809 is NOT a new generation processor, it is a SLIGHTLY improved older generation processor.) The matter is now moot because someone, seeing a vacuum, is starting a real magazine (not a newsletter) dedicated to the 68000.
Page 9
A NEW INFORMATION CHANNEL FOR 68000 COGNOSCENTI:
The magazine is to be called '68000 MICRO NEWS' and is subtitled 'A Journal for the Serious Micro Computer User'. Welcome! We had been wondering how we could maybe learn something about COMPLEX 68000 systems (although we don't want to manufacture one).
As soon as this magazine is ready to accept subscriptions, we will publish the scoop here. They already have a very professional looking two page letterhead for their correspondence.
MORE NEW 68000 STUFF: Motorola is also announcing the 68008 which is to the 68000 as the 8088 is to the 8086. That is, it is software identical but requires two 8 bit fetches for each 16 bit word fetch. One of these will doubtless show up on an Apple II processor card, one which takes over the Apple II bus which ours doesn't. The 68010 is optimized for virtual memory systems, of which more next issue. There is even, from another vendor, a one-chip 68000. Well, it's kinda like a 68000. We don't think any of these will be as interesting to YOU as the 32 bit version, so enough, already.
SOFTWARE STATUS REPORT:
This month we have by far the largest incremental software report yet, which is why we are beginning it this early in the newsletter. Unfortunately for you Commodore types, most of this is, for now, mostly applicable to the Apple II. We lead off with a report of a COMMERCIAL cross-assembler:
A REAL, HONEST-TO-GOODNESS 68000 CROSS ASSEMBLER?
Yep, for the Apple II. It's written in 6502 machine language by a professional software house. It will be available in late March, we are told. At $95, it has to be a bargain! Requires a DOS based text editor such as DOS Toolkit. Contact:
PHASE ZERO LTD.
2509 N. Campbell
Tucson, AZ 85719
These folks have already contributed a 68000 monitor which runs under Applesoft with our board to the public domain. This program (written by David Rifkind) was included in our recent Apple II Software release #2, which has already been distributed to all prior Dtack Grounded customers (Apple section).
Release #2 (Pet section) is about a month off.
3D GRAPHICS DEMO: Below is the three dimensional figure of rotation which we will be discussing on the next page and a half. The next page is taken directly from software release #2.
Page 10
THREE DIMENSIONAL GRAPHICS:
Included are several versions of a 3-D demonstration program originated, we believe, by MTU (Micro Technology Unlimited). The original version runs in 53 MINUTES and the latest version runs in 18.9 SECONDS! Here is the chronology:
D3.ORIG
This is the program originally published by MTU as adapted for the Apple II. We suggest that you list the program on your CRT; it is quite short and ELEGANT. Being elegant, it is also slow. Runs in 53 minutes on your Apple II.
D3A
Delete line 20 and this is the original program but optimized for speed. It is not NEARLY as elegant as the original. With line 20 deleted, it takes 30 minutes and 5 seconds to run. That's as fast as Applesoft gets, folks.
Leave line 20 in and you have a primitive 'link' into our 68000 board. The run time drops to 10 minutes and 40 seconds.
D3A.C
This is program D3A compiled, courtesy of the Hayden compiler. Line 20 is left in, so we keep the 68000 to help with the floating point, plus the compiler removes much of the BASIC interpretive overhead. Run time is 5 minutes 6 seconds.
D3.68K
The 6502 runs HPLOT routines very slowly, we discovered. So we wrote a 68000 machine code version of HPLOT, a very simple threaded programmable calculator simulator, and stuck the whole program (well, almost all - list what's left and see!) into the 68000. The Apple II is used exclusively as an I/O processor, something we have been preaching for some time. The run time is 45 seconds.
D3.FAST
We hate waiting around for 45 seconds, so we redid our 68000 floating point package into a FAST graphics version with a 16 bit mantissa. A resolution of 4.5 digits is adequate to plot on a 280 X 192 grid! The run time is 18.9 seconds. Are there any questions?
Isn't it a rotten shame we don't have DMA on our board? You will, of course, be able to run this demo MUCH faster using one of those 6809 processors which directly access the Apple memory space. Won't you?
(The run times given do not include disk access time.)
Page 11
D3.FASTER
We were curious how much faster we could make that program run. In addition, we wanted to go on to the next phase of our graphics demonstration programs with the fastest possible routines. So, we stripped out the threaded programmable calculator simulator. Although the time 'overhead' of the simulator was (and is) quite small, it does tie up three of the address registers. Getting rid of the simulator left us with six undedicated address registers, which we promptly dedicated (during floating point calculations) as pointers to S1, S2, M1, M2, X1 and X2. Each reference to one of these memory locations got two bytes shorter and half a microsecond faster. For instance:
    1E 38    MOVE.B  S1,D7
    16 16            S1
    BF 38    EOR.B   D7,S2
    16 1A            S2
BECOMES:
    1E 11    MOVE.B  (A1),D7
    BF 12    EOR.B   D7,(A2)
Since the number of memory references is cut in half (for this short example), the program is both more compact and significantly faster. As a result, the run time is about 15.8 seconds, about three seconds faster than before.
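The hex in that listing can be rebuilt from the 68000 instruction formats. The sketch below assembles just these two instructions by hand; the helper names are ours, and the addresses $1616 and $161A for S1 and S2 are simply read off the listing.

```python
# Effective-address fields as (mode, register) pairs.
ABS_SHORT = (0b111, 0b000)      # absolute short: a 16 bit address word follows


def indirect(an):
    """(An): address register indirect, no extension word needed."""
    return (0b010, an)


def move_b_to_dn(src, dn, address=None):
    """Instruction words for MOVE.B <src>,Dn."""
    mode, reg = src
    opcode = 0x1000 | (dn << 9) | (mode << 3) | reg
    return [opcode] if address is None else [opcode, address]


def eor_b_from_dn(dn, dst, address=None):
    """Instruction words for EOR.B Dn,<dst>."""
    mode, reg = dst
    opcode = 0xB000 | (dn << 9) | (0b100 << 6) | (mode << 3) | reg
    return [opcode] if address is None else [opcode, address]


S1, S2 = 0x1616, 0x161A         # addresses as they appear in the listing

before = move_b_to_dn(ABS_SHORT, 7, S1) + eor_b_from_dn(7, ABS_SHORT, S2)
after = move_b_to_dn(indirect(1), 7) + eor_b_from_dn(7, indirect(2))

print(" ".join(f"{w:04X}" for w in before))
print(" ".join(f"{w:04X}" for w in after))
print(2 * len(before), "bytes before,", 2 * len(after), "bytes after")
```

This prints 1E38 1616 BF38 161A, then 1E11 BF12, then "8 bytes before, 4 bytes after". The two bytes saved per reference are the address word that no longer has to be fetched, and at 8 MHz the quoted half microsecond is four clock periods.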
HOW ABOUT SOME VERY COMPACT CODE?
Program D3.FASTER contains the four basic floating point routines, the sine and square root routine, the FRAC (fraction) routine (needed by sine), a dedicated HPLOT routine, TABGEN to generate the lookup tables for HPLOT and oh, yes, the sequence of formulas and for - next loops to run the program. Exclusive of variables and table, guess how much memory is required for ALL, repeat ALL, of the code just listed? We talk about BYTES (NOT words), and decimal numbers, please. Write down your guess, then look at the bottom of page 12 for the right answer.
IT'S BYTES, NOT WORDS:
Yes, we know that the 68000 memory is organized by 16 bit words, at least for program information. However, you should immediately and forthwith adopt the practice of referring to chunks of memory as BYTES exclusively. Forget words.
There are two reasons for this: First, it is the industry standard method of referring to memory (some computers use 36 bit words and how now, brown cow?). Second, text (that's ASCII code) is VERY commonly encountered and talking about 'words' of text causes more than one problem!
Some youngsters just getting into 16 bitters like to talk about words to show that they are au courant. In fact, they are half ASCII.
Page 12
IS THAT THE END OF THE SOFTWARE STATUS REPORT? Of course not. We have also extended the nine decimal digit Microsoft compatible floating point package to include sine, cosine and square root. The square root is calculated using Newton's method, which is much faster than taking the exponent of half the log. The execution time is now about 0.6 milliseconds versus about 50 milliseconds in standard Applesoft (those are approximate times).
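For readers who have not met Newton's method for square roots, here is a minimal sketch of the idea. It is NOT our 68000 routine: the initial guess, the iteration count and the use of Python's `math.frexp` to split off the exponent are all assumptions made for illustration.

```python
import math


def newton_sqrt(x, iterations=5):
    """Square root by Newton's method on the mantissa, exponent halved separately."""
    if x < 0:
        raise ValueError("square root of a negative number")
    if x == 0:
        return 0.0
    m, e = math.frexp(x)            # x = m * 2**e with 0.5 <= m < 1
    if e % 2:                       # make the exponent even so it halves exactly
        m *= 2.0
        e -= 1
    guess = 0.5 + 0.5 * m           # crude first estimate, within about 6%
    for _ in range(iterations):
        guess = 0.5 * (guess + m / guess)   # Newton step for g*g = m
    return math.ldexp(guess, e // 2)


def log_exp_sqrt(x):
    """The slow way: the exponential of half the logarithm."""
    return math.exp(0.5 * math.log(x))


print(newton_sqrt(2.0), log_exp_sqrt(2.0))
```

Each Newton step roughly doubles the number of correct digits and costs one divide, one add and a halving, which is why a handful of steps beats evaluating a log series and then an exponential series.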
And THAT is the end of the software status report.
COMING ATTRACTIONS: The extraordinarily fast 3D graphics (over 100 times faster than Applesoft with the 6502) makes it obvious that we can do things with the 68000, graphics-wise, that are impossible with the 6502. We have an INTERACTIVE three dimensional graphics package half written. User has control in three linear axes, three angular axes, control of acceleration on any axis. The standard version simulates Newton's equations of motion in free space. You cannot control the position or velocity DIRECTLY. Instead, you have control over the acceleration to control the object. Take your hands off the keyboard and all motion continues without change.
Another version lets you control the velocity directly. Take your hands off the keyboard and the object stays put. The initial release will be 'transparent', hidden lines coming later.
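The difference between the two control schemes can be sketched for a single axis as follows. This is only an illustration, assuming a simple fixed time step; the function names and the step size are ours, and the real package handles three linear and three angular axes.

```python
def step_acceleration_mode(pos, vel, accel_input, dt):
    """Standard version: the keyboard sets acceleration only."""
    vel = vel + accel_input * dt    # velocity accumulates the acceleration
    pos = pos + vel * dt            # position accumulates the velocity
    return pos, vel


def step_velocity_mode(pos, vel_input, dt):
    """Alternate version: the keyboard sets velocity directly."""
    return pos + vel_input * dt


DT = 0.5
pos, vel = 0.0, 0.0
for _ in range(4):                  # key held down: constant acceleration
    pos, vel = step_acceleration_mode(pos, vel, 1.0, DT)
for _ in range(4):                  # hands off: velocity is unchanged, object coasts
    pos, vel = step_acceleration_mode(pos, vel, 0.0, DT)
print("acceleration mode, hands off:", pos, vel)

p = 0.0
for _ in range(4):                  # key held down: constant velocity
    p = step_velocity_mode(p, 1.0, DT)
for _ in range(4):                  # hands off: the object stays put
    p = step_velocity_mode(p, 0.0, DT)
print("velocity mode, hands off:", p)
```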
HARDWARE STATUS REPORT: Work will resume on the expansion board, with a top priority, as soon as we get that interactive 3D graphics demo out. We have not (yet) held up a single customer for lack of the expansion board but a couple of them are just about ready for some more memory.
No, we have not yet priced this board, but figure it's a 200 nanosecond static memory board and look in Byte magazine for equivalent pricing and you won't be far off.
ACKNOWLEDGEMENTS: Apple; singular, II and soft are trademarks of the Apple Computer Co. Pet is a trademark of Commodore Business Machines, TRS 80 is a trademark of the Tandy Corp. and DTACK GROUNDED is OUR trademark. PDP11, 11/45, 11/70 and UNIBUS are trademarks of DEC. CP/M is a trademark of Digital Research of CA. Did we forget anybody?
SUBSCRIPTIONS: $15/6 issues U.S. and Canada, $25 U.K. or Germany. We use strictly first class mail. Payment should be made to DTACK GROUNDED. The subscription will start with the first issue unless otherwise specified. The address is:
DTACK GROUNDED
1415 E. McFADDEN, St. F
SANTA ANA CA 92705
D3.FASTER uses a total of #1050 bytes of code. And it is written for SPEED, not compactness.
REDLANDS: Redlands is back, if just barely. We thought that the really fast 68000 HPLOT stuff might interest you.
THE PLOT ROUTINE: First we check to see that X (D0) and Y (D1) are within limits, then we set a number of pointers into the address registers using the MOVEM (move multiple registers) instruction. A5 points to the pixel table, so an indexed lookup including D0 fetches a byte to D7 which has the pixel (the one bit) in the correct location.
Then we double Y in D2 and use A2 indexed by D2 to move the two byte base address of the pixel into D4. Next we add the byte offset using index register A1 as the pointer and again using D0 as the offset. In this case, a byte add into a word is legal because no page overflow can occur.
Finally, we add A4. This is the base address of the current HIRES graphics page, either $2000 or $4000. We now have the address (in the Apple) of the pixel in D4 and the pixel in D7. Elapsed time, including testing for the limits: 16 microseconds! Try that on your 6502!
This plotting method requires two 280 byte tables and one table of 192 two-byte entries (384 bytes), a total of 944 bytes. Do we carry all that around with us in the binary object file? No, we have an 86 byte subroutine which generates the three tables, listed on the next page. We did this to make the object code file smaller and also to prove that the table lookup method was used by choice rather than desperation! We're really NOT stone age programmers here. And it's not OUR fault that Apple decided to store 7 (seven) pixels per byte and to thoroughly mix up the starting address of each line.
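To see what the lookups are standing in for, here is the same address calculation done by plain arithmetic in Python. This is a sketch, not our 68000 code: the function name and the None return for an off-screen point are assumptions (the listing branches to SIGOFF in that case). The line layout used is the standard Apple II HIRES arrangement.

```python
def hires_plot_address(x, y, page_base=0x2000):
    """Return (address, pixel_bit) for screen point x, y, or None if off screen."""
    if not (0 <= x < 280 and 0 <= y < 192):
        return None  # outside the limits; the listing branches to SIGOFF here
    # Start address of line y within the page: the "thoroughly mixed up" part.
    line_base = 0x0400 * (y % 8) + 0x0080 * ((y // 8) % 8) + 0x0028 * (y // 64)
    address = page_base + line_base + x // 7  # seven pixels per byte
    pixel_bit = 1 << (x % 7)                  # the one bit for this pixel
    return address, pixel_bit
```

For example, hires_plot_address(0, 0) gives ($2000, $01) and hires_plot_address(279, 191) gives ($3FF7, $40). The divides and remainders by 7 are exactly what is expensive on a small processor, which is why table lookups pay off.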
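The table scheme can be modelled in Python as follows. Treat it as a sketch under stated assumptions rather than a translation of the listing: all names are made up for illustration, and the base address table is indexed by Y here, where the 68000 code indexes two-byte entries by 2 * Y. The loop order (K outermost, I innermost) mirrors the listing.

```python
def build_tables():
    """Build the byte offset, pixel and line base address tables."""
    byte_offset = []  # 280 entries: which byte in the line holds column x
    pixel_mask = []   # 280 entries: which bit in that byte is column x
    for column_byte in range(40):      # 40 bytes per line
        mask = 1
        for _ in range(7):             # 7 pixels per byte
            byte_offset.append(column_byte)
            pixel_mask.append(mask)
            mask <<= 1
    line_base = [0] * 192              # two bytes each on the 68000
    base = 0
    for k in range(8):                 # line within a group of 8
        for j in range(8):             # group of 8 within a band of 64
            for i in range(3):         # band of 64 lines
                line_base[64 * i + 8 * j + k] = base
                base += 40
            base += 8
    return byte_offset, pixel_mask, line_base


BYTE_OFFSET, PIXEL_MASK, LINE_BASE = build_tables()


def plot_lookup(x, y, page_base=0x2000):
    """Return (address, pixel_bit) by table lookup, or None if off screen."""
    if not (0 <= x < 280 and 0 <= y < 192):
        return None
    return page_base + LINE_BASE[y] + BYTE_OFFSET[x], PIXEL_MASK[x]
```

Counting two bytes per line base entry, the three tables come to 280 + 280 + 384 = 944 bytes, and plot_lookup needs no multiply, divide or shift at plot time.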
Code Listing
                               OPT     P=68000,BRS,FRS
001000                         ORG     $001000

                       * -- TEST X AND Y FOR LIMITS --

001000 0C40 0118       PLOT    CMPI.W  #280,D0
001004 64 34                   BCC     SIGOFF
001006 0C41 00C0               CMPI.W  #192,D1
00100A 64 2E                   BCC     SIGOFF

                       * -- LIMITS OK; SET UP THE TABLE POINTERS --

00100C 4CB8 7E00 133A          MOVEM.W TABLES,A1-A6
                       * SET POINTERS A1 THRU A6

                       * -- NEXT FETCH THE PIXEL TO D7 --

001012 1E35 0000               MOVE.B  (A5,D0.W),D7

                       * - CALC THE APPLE II HIRES GRAPHICS ADDRESS -
                       * - CORRESPONDING TO X, Y (D0, D1) -

001016 3401                    MOVE.W  D1,D2
001018 D441                    ADD.W   D1,D2           D2 = 2 * Y
00101A 3832 2000               MOVE.W  (A2,D2.W),D4
                       * FETCH THE START OF YTH LINE
00101E D831 0000               ADD.B   (A1,D0.W),D4
                       * ADD BYTE OFFSET IN LINE
001022 D84C                    ADD.W   A4,D4           HIRES 1 OR 2

                       * -- READY TO SEND THE TWO BYTE ADDRESS
                       * OF THE PIXEL, THEN THE PIXEL ITSELF --

                       * MOVE THE POINTERS INTO THE ADDRESS REGISTERS
001024 4CB8 6E00 133A  TABGEN  MOVEM.W TABLES,A1-A3/A5-A6

                       * -- FIRST GENERATE THE TWO 280 BYTE
                       * BYTE OFFSET TABLES (INDEXED BY X) --

                       * -- SET FOR 40 OUTER LOOPS --

00102A 72 27                   MOVEQ   #39,D1
00102C 4207                    CLR.B   D7              ZERO BYTE OFFSET
00102E 70 06           STADR   MOVEQ   #6,D0           SET FOR 7 LOOPS
001030 7C 01                   MOVEQ   #1,D6           FIRST DOT = D0

                       * -- THIS IS THE INNER LOOP, REPEAT 7 TIMES --

001032 12C7            STDOT   MOVE.B  D7,(A1)+
001034 1AC6                    MOVE.B  D6,(A5)+
001036 E30E                    LSL.B   #1,D6           INCR DOT POS
001038 51C8 FFF8               DBF     D0,STDOT        LOOP 7 TIMES
00103C 5207                    ADDQ.B  #1,D7           INCR BYT ADR
00103E 51C9 FFEE               DBF     D1,STADR        LOOP 40 TIMES

                       * -- NOW GENERATE THE 384 BYTE BASE
                       * ADDRESS TABLE (INDEXED BY 2 * Y) --

001042 347C 1884               MOVE.W  #TABLE,A2
                       * SET THE TABLE ADDR IN THE 68K
001046 4244                    CLR.W   D4              READY FOR LOOP 1
001048 4242            K       CLR.W   D2              READY FOR LOOP 2
00104A 4241            J       CLR.W   D1              READY FOR LOOP 3
00104C 3001            I       MOVE.W  D1,D0
00104E D042                    ADD.W   D2,D0
001050 D043                    ADD.W   D3,D0
001052 3584 0000               MOVE.W  D4,(A2,D0.W)
                       * STORE ADDRESS IN THE TABLE
001056 D87C 0028               ADD.W   #40,D4
                       * INCREMENT D4 BY #40
00105A D27C 0080               ADD.W   #128,D1
00105E 0C41 0140               CMPI.W  #320,D1
001062 65 E8                   BCS     I

001064 5044                    ADDQ    #8,D4
001066 D47C 0010               ADD.W   #16,D2
00106A 0C42 0078               CMPI.W  #120,D2
00106E 65 DA                   BCS     J

001070 5443                    ADDQ.W  #2,D3
001072 0C43 000F               CMPI.W  #15,D3
001076 65 D0                   BCS     K

001078 4E75                    RTS

# 0000103A             SIGOFF  EQU     $00103A
# 0000133A             TABLES  EQU     $00133A
# 00001884             TABLE   EQU     $001884