Playstation GPU information
===========================================================================
GPU information.
===========================================================================
About this document.
---------------------------------------------------------------------------
This document is a collection of all info on the GPU I could find and my
own notes. Most of this is the result of experiment, so not all info might
be correct. This document is most probably not complete, and not all
capabilities and quirks of the GPU are documented. No responsibility is
taken for anything that might occur using the information in this document.
The K-communications text and the one by Nagra/Blackbag are the basis of
this document.
Notations and conventions
When the format of data is given it's shown as a bitwise representation
like this:
pixel|                                               |
bit  |0f|0e 0d 0c 0b 0a|09 08 07 06 05|04 03 02 01 00|
desc.|S |Blue          |Green         |Red           |
The "pixel" row shows how large the data is in the frame buffer. Each mark
on this line denotes the size of the data in frame buffer pixels, as that
is the minimum size that can be addressed.
The bit row shows which bits of the data are used, and separators are used
to show where the different elements of the data stop and start. MSB is on
the left, LSB is on the right. Stuff like |0f-08| means bit $0f to bit $08.
The desc. row shows the description of the different elements, with
separators where each element starts and ends.
--------------------------------------------------------------------------
The Graphics Processing Unit (GPU) - overview.
--------------------------------------------------------------------------
The GPU is the unit responsible for the graphical output of the PSX. It
handles display and drawing of all graphics. It has control over a 1MB
frame buffer and contains a 2KB texture cache. It has a command port and a
data port. It has a 64 byte command FIFO buffer, which can hold up to
3 commands. It is connected to one DMA channel for transfer of image data
and linked command lists, and to another DMA channel for reverse clearing
an OT (ordering table).
---------------------------------------------------------------------------
The Frame Buffer.
---------------------------------------------------------------------------
The frame buffer is the memory which stores all graphic data which the GPU
can access and manipulate while drawing and displaying an image. The
memory is under the GPU and cannot be accessed by the CPU directly. It is
operated solely by the GPU. The frame buffer has a size of 1 MB and is
treated as a space of 1024 pixels wide and 512 pixels high. Each "pixel"
has the size of one word (16 bit). It is not treated linearly like usual
memory, but is accessed through coordinates, with an upper left corner of
(0,0) and a lower right corner of (1023,511).
When data is displayed from the frame buffer, a rectangular area is read
from the specified coordinate within this memory. The size of this area can
be chosen from several hardware defined types. Note that these hardware
sizes are only valid when the X and Y stop/start registers are at their
default values. This display area can be displayed in two color formats,
being 15bit direct and 24bit direct. The data format of one pixel is as
follows:
15bitDirect display.
pixel|                                               |
bit  |0f|0e 0d 0c 0b 0a|09 08 07 06 05|04 03 02 01 00|
desc.|M |Blue          |Green         |Red           |
This means each color has a value of 0-31. The MSB of a pixel (M) is used
to mask the pixel.
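The layout above can be sketched in a few lines of Python. This is only an
illustration of the bit positions in the table; the function names are made
up for this example and are not part of any library.

```python
def pack_pixel15(r, g, b, m=0):
    """Pack 5 bit red, green, blue and the mask bit into one 16 bit pixel."""
    return ((m & 1) << 15) | ((b & 0x1f) << 10) | ((g & 0x1f) << 5) | (r & 0x1f)


def unpack_pixel15(pixel):
    """Return (r, g, b, m) from a 16 bit frame buffer pixel."""
    return (pixel & 0x1f, (pixel >> 5) & 0x1f, (pixel >> 10) & 0x1f, (pixel >> 15) & 1)
```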
24bit Direct Display.
The GPU can also be set to 24bit mode, in which case 3 bytes form one
pixel, 1 byte for each color. Data in this mode is arranged as follows:
pixel|0      |1      |2      |
Bit  |F-8|7-0|F-8|7-0|F-8|7-0|
desc.|G0 |R0 |R1 |B0 |B1 |G1 |
Thus 2 display pixels are encoded in 3 frame buffer pixels. They are
displayed as follows: [R0,G0,B0] [R1,G1,B1]
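As a small sketch of the 24bit arrangement, the following Python functions
pack two display pixels into three frame buffer pixels and back, following
the table above. The function names are hypothetical.

```python
def pack_rgb24_pair(p0, p1):
    """Pack two (r, g, b) display pixels into three 16 bit frame buffer pixels."""
    r0, g0, b0 = p0
    r1, g1, b1 = p1
    return [(g0 << 8) | r0, (r1 << 8) | b0, (b1 << 8) | g1]


def unpack_rgb24_pair(words):
    """Recover the two (r, g, b) display pixels from three frame buffer pixels."""
    w0, w1, w2 = words
    p0 = (w0 & 0xff, w0 >> 8, w1 & 0xff)
    p1 = (w1 >> 8, w2 & 0xff, w2 >> 8)
    return p0, p1
```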
---------------------------------------------------------------------------
Primitives.
---------------------------------------------------------------------------
A basic figure which the GPU can draw is called a primitive, and it can
draw the following:
* Polygon
The GPU can draw 3 point and 4 point polygons. Each point of the polygon
specifies a point in the frame buffer. The polygon can be Gouraud shaded.
The correct order of vertices for 4 point polygons is as follows:
1--2
|  |
3--4
Note: A 4 point polygon is processed internally as two 3 point polygons.
Note: When drawing a polygon the GPU will not draw the rightmost and
bottom edge. So a (0,0)-(32,32) rectangle will actually be drawn as
(0,0)-(31,31). Make sure adjoining polygons have the same coordinates if
you want them to touch each other! I haven't checked how this works with
3 point polygons.
* Polygon with texture
A primitive of this type is the same as above, except that a texture is
applied. Each vertex of the polygon maps to a point on a texture page in
the frame buffer. The polygon can be Gouraud shaded.
Note: Because a 4 point polygon is processed internally as two 3 point
polygons, texture mapping is also done independently for both halves.
This has some annoying consequences.
* Rectangle
A rectangle is defined by the location of the top left corner and its width
and height. Width and height can be either free, 8*8 or 16*16. It's drawn
much faster than a polygon, but Gouraud shading is not possible.
* Sprite
A sprite is a textured rectangle, defined as a rectangle with coordinates
on a texture page. Like the rectangle, it is drawn much faster than the
polygon equivalent. No Gouraud shading is possible.
Note: Even though the primitive is called a sprite, it has nothing in
common with the traditional sprite, other than that it's a rectangular
piece of graphics. Unlike the PSX sprite, the traditional sprite is NOT
drawn to the bitmap, but gets sent to the screen instead of the actual
graphics data at that location at display time.
* Line
A line is a straight line between 2 specified points. The line can be
Gouraud shaded. A special form is the polyline, for which an arbitrary
number of points can be specified.
* Dot
The dot primitive draws one pixel at the specified coordinate and in the
specified color. It is actually a special form of rectangle, with a size
of 1*1.
---------------------------------------------------------------------------
Texture
---------------------------------------------------------------------------
A texture is an image put on a polygon or sprite. It is necessary to
prepare the data beforehand in the frame buffer. This image is called a
texture pattern. The texture pattern is located on a texture page which
has a standard size and is located somewhere in the frame buffer, see
below. The data of a texture can be stored in 3 different modes:
* 15bitDirect mode.
bit  |0f|0e 0d 0c 0b 0a|09 08 07 06 05|04 03 02 01 00|
desc.|S |Blue          |Green         |Red           |
This means each color has a value of 0-31. The MSB of a pixel (S) is used
to specify if the pixel is semi transparent or not. More on that later.
* 8bit CLUT mode.
Each pixel is defined by 8 bits and the value of the pixel is converted to
a 15bit color using the CLUT (color lookup table), much like standard VGA
pictures. So in effect you have 256 colors which are in 15bit precision.
Bit: |0F-08|07-00|
desc:|I1   |I0   |
I0 is the index to the CLUT for the left pixel, I1 for the right.
* 4bit CLUT mode.
Same as above except that only 16 colors can be used. Data is arranged as
follows:
Bit   |F-C|B-8|7-4|3-0|
desc. |I3 |I2 |I1 |I0 |
I0 is drawn to the left.
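To illustrate the two CLUT layouts, here is a short Python sketch that
splits one 16 bit frame buffer pixel into its CLUT indices. It assumes, as
described for the 8bit mode, that the lowest numbered index is the leftmost
texture pixel. The function name is hypothetical.

```python
def clut_indices(word, bits):
    """Split one 16 bit frame buffer pixel into CLUT indices, leftmost first."""
    if bits not in (4, 8):
        raise ValueError("bits must be 4 or 8")
    mask = (1 << bits) - 1
    # I0 sits in the lowest bits, so walk upwards from bit 0.
    return [(word >> shift) & mask for shift in range(0, 16, bits)]
```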
* Texture Pages
Texture pages have a unit size of 256*256 pixels, regardless of colormode.
This means that in the frame buffer they will be 64 pixels wide for 4bit
CLUT, 128 pixels wide for 8bit CLUT and 256 pixels wide for 15bit direct.
The pixels are addressed with coordinates relative to the location of the
texture page, not the framebuffer. So the top left texture coordinate on
a texture page is (0,0) and the bottom right one is (255,255).
The pages can be located in the frame buffer on X multiples of 64 and Y
multiples of 256. More than one texture page can be set up, but each
primitive can only contain texture from one page.
* Texture Windows
The area within a texture window is repeated throughout the texture
page. The data is not actually stored all over the texture page but
the GPU reads the repeated patterns as if they were there. The X and Y
and H and W must be multiples of 8.
* CLUT (Color Lookup Table)
The CLUT is the table where the colors are stored for the image data in
the CLUT modes. The pixels of those images are used as indexes to this
table. The CLUT is arranged in the frame buffer as a 256x1 image for the
8bit CLUT mode, and a 16x1 image for the 4bit CLUT mode. Each pixel is a 16
bit value, the first 15 bits used for a 15 bit color, and the 16th used for
semitransparency. The CLUT data can be arranged in the frame buffer at X
multiples of 16 (X=0,16,32,48,etc) and anywhere in the Y range of 0-511.
More than one CLUT can be prepared but only one can be used for each
primitive.
* Texture Caching
If polygons with texture are displayed, the GPU needs to read these from
the frame buffer. This slows down the drawing process, and as a result
the number of polygons that can be drawn in a given timespan. To speed up
this process the GPU is equipped with a texture cache, so a given piece
of texture need not be read multiple times in succession.
The texture cache size depends on the color mode used for the textures.
In 4 bit CLUT mode it has a size of 64x64, in 8 bit CLUT it's 32x64 and in
15bitDirect it's 32x32. A general speed up can be achieved by setting up
textures according to these sizes. For further speed gain a more precise
knowledge of how the cache works is necessary.
- Cache blocks
The texture page is divided into non-overlapping cache blocks, each of a
unit size according to color mode. These cache blocks are tiled within
the texture page.
+-----+-----+-----+--
|cache|     |     |
|block|     |     |
|    0|    1|    2| ..
+-----+-----+-----+--
|     |     |
  ..
- Cache entries
Each cache block is divided into 256 cache entries, which are numbered
sequentially, and are 8 bytes wide. So a cache entry holds 16 4bit CLUT
pixels, 8 8bit CLUT pixels, or 4 15bitDirect pixels.
4bit and 8bit clut:       15bitdirect:
+----+----+----+----+     +----+----+----+----+----+----+----+----+
|   0|   1|   2|   3|     |   0|   1|   2|   3|   4|   5|   6|   7|
+----+----+----+----+     +----+----+----+----+----+----+----+----+
|   4|   5|   6|   7|     |   8|   9|   a|   b|   c|   d|   e|   f|
+----+----+----+----+     +----+----+----+----+----+----+----+----+
|   8|   9| ..            |  10|  11| ..
+----+----+--             +----+----+--
|   c|  ..|               |  18|  ..|
+----+--                  +----+--
|  ..                     |  ..
The cache can hold only one cache entry by the same number, so if for
example a piece of texture spans multiple cache blocks and it has data on
entry 9 of block 1, but also on entry 9 of block 2, these cannot be in the
cache at once.
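The numbering described above can be worked out with a few lines of Python.
This is a sketch based purely on the block sizes and the row by row
numbering shown in the diagrams; the table and function names are made up,
and real hardware may behave differently in details not covered here.

```python
# (block width, block height, texture pixels per 8 byte entry) per color mode
CACHE_GEOMETRY = {4: (64, 64, 16), 8: (32, 64, 8), 15: (32, 32, 4)}


def cache_entry(u, v, mode):
    """Return (block_x, block_y, entry) for texture coordinate (u, v)."""
    width, height, per_entry = CACHE_GEOMETRY[mode]
    entries_per_row = width // per_entry
    entry = (v % height) * entries_per_row + (u % width) // per_entry
    return u // width, v // height, entry


def conflict(a, b, mode):
    """True if two texture coordinates compete for the same cache entry."""
    ax, ay, ae = cache_entry(a[0], a[1], mode)
    bx, by, be = cache_entry(b[0], b[1], mode)
    return ae == be and (ax, ay) != (bx, by)
```

For example, in 15bitDirect mode the texture coordinates (4,1) and (36,1)
both map to entry 9, but in neighbouring cache blocks, so they would evict
each other.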
---------------------------------------------------------------------------
Rendering options.
---------------------------------------------------------------------------
There are 3 modes which affect the way the GPU renders the primitives to
the frame buffer.
* Semi Transparency
When semi transparency is set for a pixel, the GPU first reads the pixel it
wants to write to, and then calculates the color it will write from the 2
pixels according to the semitransparency mode selected. Processing speed is
lower in this mode because additional reading and calculating are
necessary. There are 4 semitransparency modes in the GPU.
B = the pixel read from the image in the frame buffer, F = the
half transparent pixel being drawn
* 0.5 x B + 0.5 x F
* 1.0 x B + 1.0 x F
* 1.0 x B - 1.0 x F
* 1.0 x B +0.25 x F
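The four formulas can be tried out with the Python sketch below, which
blends a single 5 bit color channel. The modes are numbered 0-3 in the
order listed above. Clamping the result to 0-31 and the rounding of the
divisions are assumptions of this example, not something taken from the
hardware.

```python
def blend_channel(b, f, mode):
    """Blend one 5 bit channel. b = frame buffer value, f = new value."""
    if mode == 0:
        value = (b + f) // 2      # 0.5 x B + 0.5 x F
    elif mode == 1:
        value = b + f             # 1.0 x B + 1.0 x F
    elif mode == 2:
        value = b - f             # 1.0 x B - 1.0 x F
    elif mode == 3:
        value = b + f // 4        # 1.0 x B + 0.25 x F
    else:
        raise ValueError("mode must be 0-3")
    return max(0, min(31, value))  # assumption: results saturate at 0 and 31
```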
A new semi transparency mode can be set for each primitive. For primitives
without texture, semi transparency can be selected for the whole primitive.
For primitives with texture, semi transparency is stored in the MSB of each
pixel, so some pixels can be set to STP while others are drawn opaque. For
the CLUT modes the STP bit is obtained from the CLUT. So if a color index
points to a color in the CLUT with the MSB set, it will be drawn semi
transparent.
When the color is black (BGR=0), STP is processed differently from when it's
not black (BGR<>0). The table below shows the differences:
                transparency processing (bit 1 of command packet)
BGR    STP      off                on
0,0,0  0        Transparent        Transparent
0,0,0  1        Non-transparent    Non-transparent
x,x,x  0        Non-transparent    Non-transparent
x,x,x  1        Non-transparent    Transparent
* Shading
The GPU has a shading function, which will scale the color of a primitive
to a specified brightness. There are 2 shading modes: flat shading and
Gouraud shading. Flat shading is the mode in which one brightness value is
specified for the entire primitive. In Gouraud shading mode, a different
brightness value can be given for each vertex of a primitive, and the
brightness between these points is automatically interpolated.
* Mask
The mask function will prevent the GPU from writing to specific pixels when
drawing in the framebuffer. This means that when the GPU is drawing a
primitive to a masked area, it will first read the pixel at the coordinate
it wants to write to, check if its masking bit is set, and if so refrain
from writing to that particular pixel. The masking bit is the MSB of the
pixel, just like the STP bit.
To set this masking bit, the GPU provides a mask out mode, which will set
the MSB of any pixel it writes. If both mask out and mask evaluation are
on, the GPU will not draw to pixels with a set MSB, and will draw pixels
with a set MSB to the others, these in turn becoming masked pixels.
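The interaction of the two mask settings can be sketched as follows. This
is a model of the behaviour described above for a single pixel write; the
function and parameter names are hypothetical.

```python
def write_pixel(dest, src, mask_out=False, mask_eval=False):
    """Return the 16 bit value left in the frame buffer after drawing src over dest."""
    if mask_eval and dest & 0x8000:
        return dest            # destination is masked, leave it alone
    if mask_out:
        src |= 0x8000          # mark the written pixel as masked
    return src & 0xffff
```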
---------------------------------------------------------------------------
Drawing Environment
---------------------------------------------------------------------------
The drawing environment specifies all global parameters the GPU needs for
drawing primitives.
* Drawing offset.
This locates the top left corner of the drawing area. Coordinates of
primitives are relative to this point. So if the drawing offset is (0,240)
and a vertex of a polygon is located at (16,20) it will be drawn to the
frame buffer at (0+16,240+20).
* Drawing clip area
This specifies the maximum range the GPU draws primitives to. So in effect
it specifies the top left and bottom right corner of the drawing area.
* Dither enable
When dither is enabled the GPU will dither areas during shading. It will
process internally in 24 bit and dither the colors when converting back to
15bit. When it is off, the lower 3 bits of each color simply get
discarded.
* Draw to display enable.
This will enable/disable any drawing to the area that is currently
displayed.
* Mask enable
When turned on any pixel drawn to the framebuffer by the GPU will have a
set masking bit. (= set MSB)
* Mask judgement enable
Specifies if the mask data from the frame buffer is evaluated at the time
of drawing.
---------------------------------------------------------------------------
Display Environment.
---------------------------------------------------------------------------
This contains all information about the display, and the area displayed.
* Display area in frame buffer
This specifies the resolution of the display. The size can be set
as follows:
Width: 256,320,384,512 or 640 pixels
Height: 240 or 480 pixels
These sizes are only an indication of how many pixels will be displayed
using the default start/end values. These settings only specify the
resolution of the display.
* Display start/end.
Specifies where the display area is positioned on the screen, and how
much data gets sent to the screen. The screen sizes of the display area
are valid only if the horizontal/vertical start/end values are default. By
changing these you can get bigger/smaller display screens. On most TVs
there is some black around the edge, which can be utilised by setting the
start of the screen earlier and the end later. The size of the pixels is
NOT changed with these settings, the GPU simply sends more data to the
screen. Some monitors/TVs have a smaller display area and the extended
size might not be visible on those sets. (Mine is capable of about 330
pixels horizontal, and 272 vertical in 320*240 mode.)
* Interlace enable
When enabled the GPU will display the even and odd lines of the display
area alternately. It is necessary to set this when using 480 lines, as the
number of scan lines on a TV screen is not sufficient to display 480
lines.
* 15bit/24bit direct display
Switches between 15bit/24bit display mode.
* Video mode
Selects which video mode to use, which are either PAL or NTSC.
--------------------------------------------------------------------------
Communication and OT's.
--------------------------------------------------------------------------
All data regarding drawing and drawing environment are sent as packets to
the GPU. Each packet tells the GPU how and where to draw one primitive, or
it sets one of the drawing environment parameters. The display environment
is set up through single word commands using the control port of the GPU.
Packets can be forwarded word by word through the data port of the GPU, or
more efficiently for large numbers of packets through DMA. A special DMA
mode was created for this so large numbers of packets can be sent and
managed easily. In this mode a list of packets is sent, where each entry in
the list consists of a one word header, containing the address of the next
entry and the size of the packet, followed by the packet itself. A result of
this is that the packets do not need to be stored sequentially. This makes
it possible to easily control the order in which packets get processed. The
GPU processes the packets it gets in the order they are offered. So the
first entry in the list also gets drawn first. To insert a packet into the
middle of the list simply find the packet after which you want it to be
processed, replace the address in that packet with the address of the new
packet, and let the new packet point to the address you replaced.
To aid you in finding a location in the list the Ordering Table was
invented. At first this is basically a linked list with entries of packet
size 0, so it's a list of only list entry headers, where each entry points
to the next entry. Then as primitives are generated by your program you can
add them to the table at a certain index. Just read the address in the
table entry, replace it with the address of the new packet, and store the
address from the table in the packet. When all packets are generated and
you want to draw, just pass the address of the first list entry to the DMA
and the packets will get drawn in the order you entered the packets to the
table. Packets entered at a higher table index will get drawn after those
entered at a lower table index. Packets entered at the same index will get
drawn in reverse order of entry, the last one first.
In 3d drawing it's most common that you want the primitives with the highest
Z value to be drawn first, so it would be nice if the table would be drawn
the other way around, so the Z value can be used as index. This is a simple
thing, just make a table of which each entry points to the previous entry,
and start the DMA with the address of the last table entry. To assist you
in making such a table, a special DMA channel is available which creates
it for you.
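The list handling described in this section can be modelled in Python with
memory represented as a dictionary of word addresses. This is a sketch only.
The header layout used here (packet size in words in bits 24-31, next
address in bits 0-23) and the end marker $ffffff are assumptions that this
text does not spell out, and all names are made up for the example.

```python
END = 0xffffff  # assumed end-of-list marker in the address field


def make_header(next_addr, size):
    """Assumed layout: packet size in words in bits 24-31, next address in bits 0-23."""
    return ((size & 0xff) << 24) | (next_addr & 0xffffff)


def make_reverse_ot(mem, base, length):
    """Build a table where each entry points to the previous one; entry 0 ends the list."""
    for i in range(length):
        prev = END if i == 0 else base + 4 * (i - 1)
        mem[base + 4 * i] = make_header(prev, 0)
    return base + 4 * (length - 1)   # address of the last entry, where drawing starts


def add_packet(mem, entry_addr, packet_addr, words):
    """Link a packet in right after the table entry at entry_addr."""
    old_next = mem[entry_addr] & 0xffffff
    mem[packet_addr] = make_header(old_next, len(words))
    for i, word in enumerate(words):
        mem[packet_addr + 4 + 4 * i] = word
    mem[entry_addr] = make_header(packet_addr, mem[entry_addr] >> 24)


def walk(mem, start):
    """Return the packets in the order the list would deliver them."""
    order, addr = [], start
    while addr != END:
        header = mem[addr]
        size = header >> 24
        if size:
            order.append([mem[addr + 4 + 4 * i] for i in range(size)])
        addr = header & 0xffffff
    return order
```

With a 4 entry table at $1000, adding packets at index 1, then index 3, then
index 1 again, and walking from the last entry, yields the index 3 packet
first, followed by the two index 1 packets with the later one first.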
--------------------------------------------------------------------------
GPU operation
--------------------------------------------------------------------------
* GPU control registers.
There are two 32 bit IO ports for the GPU, which are:
$1f801810 GPU Data
$1f801814 GPU control/Status
The data register is used to exchange data with the GPU.
The control/status register, gives the status of the GPU when read, and
sets the control bits when written to.
* Control/Status Register $1f801814
Status (Read)
-----------------------------------------------------------------------------
|1f |1e 1d|1c |1b |1a  |19 18|17 |16     |15     |14   |13    |12 11 |10    |
|lcf|dma  |com|img|busy| ?  ?|den|isinter|isrgb24|Video|Height|Width0|Width1|
-----------------------------------------------------------------------------
           W0 W1
Width:     00 0    256 pixels
           01 0    320
           10 0    512
           11 0    640
           00 1    384
Height:    0       240 pixels
           1       480
Video:     0       NTSC
           1       PAL
isrgb24:   0       15 bit direct mode
           1       24 bit direct mode
isinter:   0       Interlace off
           1       Interlace on
den:       0       Display enabled
           1       Display disabled
busy:      0       GPU is Busy (ie. drawing primitives)
           1       GPU is Idle
img:       0       Not Ready to send image (packet $c0)
           1       Ready
com:       0       Not Ready to receive commands
           1       Ready
dma:       00      DMA off, communication through GP0
           01
           10      DMA CPU -> GPU
           11      DMA GPU -> CPU
lcf:       0       Drawing even lines in interlace mode
           1       Drawing odd lines in interlace mode
---------------------------------------------------
|0f 0e 0d|0c|0b|0a |09 |08 07|06 05|04|03 02 01 00|
| ?  ?  ?|me|md|dfe|dtd|tp   |abr  |ty|tx         |
---------------------------------------------------
tx:   0      0               Texture page X = tx*64
      1      64
      2      128
      3      192
      4      ...
ty:   0      0               Texture page Y
      1      256
abr:  %00    0.5xB+0.5 xF    Semi transparent state
      %01    1.0xB+1.0 xF
      %10    1.0xB-1.0 xF
      %11    1.0xB+0.25xF
tp:   %00    4bit CLUT       Texture page color mode
      %01    8bit CLUT
      %10    15bit
dtd:  0      Dither off
      1      Dither on
dfe:  0      Draw to display area prohibited
      1      Draw to display area allowed
md:   0      off
      1      on              Apply mask bit to drawn pixels.
me:   0      off
      1      on              No drawing to pixels with set mask bit.
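As an illustration of the status layout, the Python sketch below splits a
status word into the fields listed above. The dictionary keys follow the
abbreviations in the tables; the function name is made up, and the two
groups of unknown bits are ignored.

```python
WIDTHS = {0: 256, 1: 320, 2: 512, 3: 640}


def decode_status(status):
    """Split a GPU status word into the fields listed in the tables."""
    width0 = (status >> 17) & 3
    width1 = (status >> 16) & 1
    return {
        "lcf": (status >> 31) & 1,
        "dma": (status >> 29) & 3,
        "com": (status >> 28) & 1,
        "img": (status >> 27) & 1,
        "busy": (status >> 26) & 1,
        "den": (status >> 23) & 1,
        "isinter": (status >> 22) & 1,
        "isrgb24": (status >> 21) & 1,
        "video": "PAL" if (status >> 20) & 1 else "NTSC",
        "height": 480 if (status >> 19) & 1 else 240,
        "width": 384 if width1 else WIDTHS[width0],
        "me": (status >> 12) & 1,
        "md": (status >> 11) & 1,
        "dfe": (status >> 10) & 1,
        "dtd": (status >> 9) & 1,
        "tp": (status >> 7) & 3,
        "abr": (status >> 5) & 3,
        "ty": ((status >> 4) & 1) * 256,
        "tx": (status & 0xf) * 64,
    }
```

Decoding the value $14802000 gives com=1 (ready for commands), busy=1
(idle) and den=1 (display disabled) at 256x240 NTSC.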
Control (Write)
--------------------------------------------------------------------------
A control command is composed of one word as follows:
bit  1f-18    17-00
     command  parameter
The composition of the parameter is different for each command.
--------------------------------------------------------------------------
*Reset GPU
command $00
parameter $000000
Description Resets the GPU. Also seems to turn off screen.
(sets status to $14802000)
--------------------------------------------------------------------------
*Reset Command Buffer
command $01
parameter $000000
Description Resets the command buffer.
--------------------------------------------------------------------------
*Reset IRQ
command $02
parameter $000000
Description Resets the IRQ. No idea of what this means.
--------------------------------------------------------------------------
*Display Enable
command $03
parameter $000000 Display enable
$000001 Display disable
Description Turns on/off display. Note that a turned off
screen still gives the flicker of NTSC on a
pal screen if NTSC mode is selected..
--------------------------------------------------------------------------
*DMA setup.
command $04
parameter $000000 DMA disabled
$000001 DMA ?
$000002 DMA CPU to GPU
$000003 DMA GPU to CPU
Description Sets DMA direction. K-comm also mentions something
about parameter $01, but I wasn't able to translate it.
--------------------------------------------------------------------------
*Start of display area
command $05
parameter bit $00-$09 X (0-1023)
bit $0A-$12 Y (0-511)
= (Y<<10) + X
description Locates the top left corner of the display area.
--------------------------------------------------------------------------
*Horizontal Display range
command $06
parameter bit $00-$0b X1 ($1f4-$CDA)
bit $0c-$17 X2
= X1 + (X2<<12)
description Specifies the horizontal range within which the
display area is displayed. The display is relative
to the display start, so X coordinate 0 will be at
the value in X1. The display end is not relative to
the display start. The number of pixels that get sent
to the screen in 320 mode are (X2-X1)/8. How many
actually are visible depends on your TV/monitor.
(normally $260-$c56)
--------------------------------------------------------------------------
*Vertical Display range
command $07
parameter bit $00-$09 Y1
bit $0a-$14 Y2
= Y1 + (Y2<<10)
description Specifies the vertical range within which the
display area is displayed. The display is relative
to the display start, so Y coordinate 0 will be at
the value in Y1. The display end is not relative to
the display start. The number of pixels that get sent
to the display are Y2-Y1, in 240 mode.
(Not sure about the default values, should be
something like NTSC $010-$100, PAL $023-$123)
--------------------------------------------------------------------------
*Display mode
command $08
parameter bit $00-$01 Width 0
bit $02 Height
bit $03 Videomode See above
bit $04 Isrgb24
bit $05 Isinter
bit $06 Width1
bit $07 Reverseflag
description Sets the display mode.
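The control words for the display commands above can be assembled as in
this Python sketch. It only follows the bit positions given in the command
descriptions; the function names are hypothetical, and nothing here talks
to real hardware.

```python
def gp1(command, parameter=0):
    """Build a control word: command in bits 1f-18, parameter below it."""
    return ((command & 0xff) << 24) | (parameter & 0xffffff)


def display_start(x, y):
    return gp1(0x05, (y << 10) | x)


def horizontal_range(x1, x2):
    return gp1(0x06, (x2 << 12) | x1)


def vertical_range(y1, y2):
    return gp1(0x07, (y2 << 10) | y1)


def display_mode(width0=0, height=0, video=0, rgb24=0, interlace=0, width1=0, reverse=0):
    parameter = (width0 | (height << 2) | (video << 3) | (rgb24 << 4) |
                 (interlace << 5) | (width1 << 6) | (reverse << 7))
    return gp1(0x08, parameter)
```

For example, horizontal_range(0x260, 0xc56) gives $06c56260, the value for
the normal horizontal range mentioned above.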
--------------------------------------------------------------------------
*GPU Info
command $10
parameter $000000
$000001
$000002
$000003 Draw area top left
$000004 Draw area bottom right
$000005 Draw offset
$000006
$000007 GPU Type, should return 2 for a standard GPU.
description Returns requested info. Read result from GP0.
0,1 seem to return draw area top left also
6 seems to return draw offset too.
--------------------------------------------------------------------------
*Some other commands I do not know the function of:
*?????
command $20
parameter ???????
description I've seen it used with value $000504.
What does it do?
*?????
command $09
parameter $000001 ??
description I've seen it used with value $000001.
What does it do?
--------------------------------------------------------------------------
Command Packets, Data Register.
--------------------------------------------------------------------------
Primitive command packets use an 8 bit command value which is present in
all packets. It contains a 3 bit type block and a 5 bit option block, of
which the meaning of the bits depends on the type. Layout is as follows:
Type:
000 GPU command
001 Polygon primitive
010 Line primitive
011 Sprite primitive
100 Transfer command
111 Environment command
Configuration of the option blocks for the primitives is as follows:
Polygon:
| 7 6 5 | 4 | 3 | 2 | 1 | 0 |
| 0 0 1 |IIP|3/4|Tme|Abe|Tge|
Line:
| 7 6 5 | 4 | 3 | 2 | 1 | 0 |
| 0 1 0 |IIP|Pll| 0 |Abe| 0 |
Sprite:
| 7 6 5 | 4  3 | 2 | 1 | 0 |
| 0 1 1 | Size |Tme|Abe| 0 |
IIP   0   Flat Shading
      1   Gouraud Shading
3/4   0   3 vertex polygon
      1   4 vertex polygon
Tme   0   Texture mapping off
      1   Texture mapping on
Abe   0   Semi transparency off
      1   Semi transparency on
Tge   0   Brightness calculation at time of texture mapping on
      1   Brightness calculation off (draw texture as is)
Size  00  Free size (Specified by W/H)
      01  1 x 1
      10  8 x 8
      11  16 x 16
Pll   0   Single line (2 vertices)
      1   Polyline (n vertices)
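The type and option bits can be decoded with the Python sketch below. It
uses the type list above, in which sprites are type 011, and the option
bit positions from the three layouts. The function name and the keys of
the returned dictionary are made up for this example.

```python
TYPES = {0: "gpu command", 1: "polygon", 2: "line", 3: "sprite",
         4: "transfer", 7: "environment"}
SIZES = {0: "free", 1: "1x1", 2: "8x8", 3: "16x16"}


def decode_command(command):
    """Decode the 8 bit command value of a packet into type and options."""
    kind = TYPES.get((command >> 5) & 7, "unknown")
    info = {"type": kind}
    if kind == "polygon":
        info.update(gouraud=bool(command & 0x10),
                    vertices=4 if command & 0x08 else 3,
                    textured=bool(command & 0x04),
                    semi_transparent=bool(command & 0x02),
                    raw_texture=bool(command & 0x01))
    elif kind == "line":
        info.update(gouraud=bool(command & 0x10),
                    polyline=bool(command & 0x08),
                    semi_transparent=bool(command & 0x02))
    elif kind == "sprite":
        info.update(size=SIZES[(command >> 3) & 3],
                    textured=bool(command & 0x04),
                    semi_transparent=bool(command & 0x02))
    return info
```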
* Color information
Color information is forwarded as 24 bit data. It is reduced to
15 bit by the GPU.
Layout is as follows:
bit  |17-10|0f-08|07-00|
desc.|Blue |Green|Red  |
* Shading information.
For textured primitives, shading data is forwarded by this packet.
Layout is the same as for color data, the RGB values controlling
the brightness of the individual colors ($00-$7f). A value of $80 in a
color leaves the texture's value for that color unchanged.
*Texture Page information
The data is 16 bits wide, layout is as follows:
|F E D C B A 9|8 7|6 5|4 |3 2 1 0|
|0            |tp |abr|ty|tx     |
tx   0-f  X*64          texture page x coord
ty   0    0             texture page y coord
     1    256
abr  0    0.5xB+0.5 xF  Semi transparency mode
     1    1.0xB+1.0 xF
     2    1.0xB-1.0 xF
     3    1.0xB+0.25xF
tp   0    4bit CLUT
     1    8bit CLUT
     2    15bit direct
*CLUT-ID
Specifies the location of the CLUT data. Data is 16 bits wide.
F-6  Y coordinate 0-511
5-0  X coordinate X/16
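Both 16 bit values can be built from frame buffer coordinates as in the
following Python sketch. The function names are hypothetical; the checks
simply enforce the placement rules given for texture pages and CLUTs.

```python
def make_tpage(x, y, abr=0, tp=0):
    """x must be a multiple of 64 (0-960), y either 0 or 256."""
    if x % 64 or not 0 <= x <= 960 or y not in (0, 256):
        raise ValueError("texture pages sit on X multiples of 64 and Y 0 or 256")
    return ((tp & 3) << 7) | ((abr & 3) << 5) | ((y // 256) << 4) | (x // 64)


def make_clut_id(x, y):
    """x must be a multiple of 16 (0-1008), y in 0-511."""
    if x % 16 or not 0 <= x <= 1008 or not 0 <= y <= 511:
        raise ValueError("CLUTs sit on X multiples of 16 and Y 0-511")
    return (y << 6) | (x // 16)
```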
--------------------------------------------------------------------------
abbreviations in packet list
--------------------------------------------------------------------------
BGR Color/Shading info see above.
xn,yn 16 bit values of X and Y in frame buffer.
un,vn 8 bit values of X and Y in texture page
tpage texture page information packet, see above
clut clut ID, see above.
--------------------------------------------------------------------------
Packet list.
--------------------------------------------------------------------------
The packets sent to the GPU are processed as a group of data,
each one word wide. The data must be written to the GPU data register
($1f801810) sequentially. Once all data has been received, the GPU
starts operation.
Overview of packet commands:
Primitive drawing packets
$20 monochrome 3 point polygon
$24 textured 3 point polygon
$28 monochrome 4 point polygon
$2c textured 4 point polygon
$30 gradated 3 point polygon
$34 gradated textured 3 point polygon
$38 gradated 4 point polygon
$3c gradated textured 4 point polygon
$40 monochrome line
$48 monochrome polyline
$50 gradated line
$58 gradated polyline
$60 rectangle
$64 sprite
$68 dot
$70 8*8 rectangle
$74 8*8 sprite
$78 16*16 rectangle
$7c 16*16 sprite
GPU command & Transfer packets
$01 clear cache
$02 frame buffer rectangle draw
$80 move image in frame buffer
$a0 send image to frame buffer
$c0 copy image from frame buffer
Draw mode/environment setting packets
$e1 draw mode setting
$e2 texture window setting
$e3 set drawing area top left
$e4 set drawing area bottom right
$e5 drawing offset
$e6 mask setting
--------------------------------------------------------------------------
Packet Descriptions
--------------------------------------------------------------------------
Primitive Packets
--------------------------------------------------------------------------
$20 monochrome 3 point polygon
  |1f-18|17-10|0f-08|07-00|
 1|$20  |BGR              |command+color
 2|y0         |x0         |vertexes
 3|y1         |x1         |
 4|y2         |x2         |
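As an example of how such a packet is put together, the Python sketch below
builds the four words of a monochrome 3 point polygon. The helper names are
made up. Treating coordinates as signed 16 bit values is an assumption of
this example.

```python
def bgr(r, g, b):
    """24 bit color word: blue in bits 17-10, green in 0f-08, red in 07-00."""
    return ((b & 0xff) << 16) | ((g & 0xff) << 8) | (r & 0xff)


def xy(x, y):
    """Vertex word: y in the upper 16 bits, x in the lower 16 bits."""
    return ((y & 0xffff) << 16) | (x & 0xffff)


def mono_triangle(color, v0, v1, v2):
    """Return the words of a $20 packet, in the order they must be written."""
    return [(0x20 << 24) | color, xy(*v0), xy(*v1), xy(*v2)]
```

The words returned by mono_triangle would then be written one after another
to the GPU data register, or placed in a list entry for DMA.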
--------------------------------------------------------------------------
$24 textured 3 point polygon
  |1f-18|17-10|0f-08|07-00|
 1|$24  |BGR              |command+color
 2|y0         |x0         |vertex 0
 3|clut       |v0   |u0   |clutid+ texture coords vertex 0
 4|y1         |x1         |
 5|tpage      |v1   |u1   |
 6|y2         |x2         |
 7|           |v2   |u2   |
--------------------------------------------------------------------------
$28 monochrome 4 point polygon
  |1f-18|17-10|0f-08|07-00|
 1|$28  |BGR              |command+color
 2|y0         |x0         |vertexes
 3|y1         |x1         |
 4|y2         |x2         |
 5|y3         |x3         |
--------------------------------------------------------------------------
$2c textured 4 point polygon
  |1f-18|17-10|0f-08|07-00|
 1|$2c  |BGR              |command+color
 2|y0         |x0         |vertex 0
 3|clut       |v0   |u0   |clutid+ texture coords vertex 0
 4|y1         |x1         |
 5|tpage      |v1   |u1   |
 6|y2         |x2         |
 7|           |v2   |u2   |
 8|y3         |x3         |
 9|           |v3   |u3   |
--------------------------------------------------------------------------
$30 gradated 3 point polygon
  |1f-18|17-10|0f-08|07-00|
 1|$30  |BGR0             |command+color
 2|y0         |x0         |vertexes
 3|     |BGR1             |
 4|y1         |x1         |
 5|     |BGR2             |
 6|y2         |x2         |
--------------------------------------------------------------------------
$34 shaded textured 3 point polygon
  |1f-18|17-10|0f-08|07-00|
 1|$34  |BGR0             |command+color
 2|y0         |x0         |vertex 0
 3|clut       |v0   |u0   |clutid+ texture coords vertex 0
 4|     |BGR1             |
 5|y1         |x1         |
 6|tpage      |v1   |u1   |
 7|     |BGR2             |
 8|y2         |x2         |
 9|           |v2   |u2   |
--------------------------------------------------------------------------
$38 gradated 4 point polygon
  |1f-18|17-10|0f-08|07-00|
 1|$38  |BGR0             |command+color
 2|y0         |x0         |vertexes
 3|     |BGR1             |
 4|y1         |x1         |
 5|     |BGR2             |
 6|y2         |x2         |
 7|     |BGR3             |
 8|y3         |x3         |
--------------------------------------------------------------------------
$3c shaded textured 4 point polygon
  |1f-18|17-10|0f-08|07-00|
 1|$3c  |BGR0             |command+color
 2|y0         |x0         |vertex 0
 3|clut       |v0   |u0   |clutid+ texture coords vertex 0
 4|     |BGR1             |
 5|y1         |x1         |
 6|tpage      |v1   |u1   |texture page location
 7|     |BGR2             |
 8|y2         |x2         |
 9|           |v2   |u2   |
 a|     |BGR3             |
 b|y3         |x3         |
 c|           |v3   |u3   |
--------------------------------------------------------------------------
$40 monochrome line
  |1f-18|17-10|0f-08|07-00|
 1|$40  |BGR              |command+color
 2|y0         |x0         |vertex 0
 3|y1         |x1         |vertex 1
--------------------------------------------------------------------------
$48 single color polyline
  |1f-18|17-10|0f-08|07-00|
 1|$48  |BGR              |command+color
 2|y0         |x0         |vertex 0
 3|y1         |x1         |vertex 1
 4|y2         |x2         |vertex 2
 .|yn         |xn         |vertex n
 .|$55555555              |Termination code.
Any number of points can be entered, end with the termination code.
--------------------------------------------------------------------------
$50 gradated line
  |1f-18|17-10|0f-08|07-00|
 1|$50  |BGR0             |command+color
 2|y0         |x0         |
 3|     |BGR1             |
 4|y1         |x1         |
--------------------------------------------------------------------------
$58 gradated polyline
  |1f-18|17-10|0f-08|07-00|
 1|$58  |BGR0             |command+color
 2|y0         |x0         |
 3|     |BGR1             |
 4|y1         |x1         |
 5|     |BGR2             |
 6|y2         |x2         |
 .|     |BGRn             |
 .|yn         |xn         |
 .|$55555555              |Termination code.
Any number of points can be entered, end with the termination code.
--------------------------------------------------------------------------
$60 rectangle
|1f-18|17-10|0f-08|07-00|
1|$60 |BGR |command+color
2|y |x |
3|h |w |
--------------------------------------------------------------------------
$64 sprite
|1f-18|17-10|0f-08|07-00|
1|$64 |BGR |command+color
2|y |x |
3|clut |v |u |clut location, texture page y,x
4|h |w |
--------------------------------------------------------------------------
$68 dot
|1f-18|17-10|0f-08|07-00|
1|$68 |BGR |command+color
2|y |x |
--------------------------------------------------------------------------
$70 8*8 rectangle
|1f-18|17-10|0f-08|07-00|
1|$70 |BGR |command+color
2|y |x |
--------------------------------------------------------------------------
$74 8*8 sprite
|1f-18|17-10|0f-08|07-00|
1|$74 |BGR |command+color
2|y |x |
3|clut |v |u |clut location, texture page y,x
--------------------------------------------------------------------------
$78 16*16 rectangle
|1f-18|17-10|0f-08|07-00|
1|$78 |BGR |command+color
2|y |x |
--------------------------------------------------------------------------
$7c 16*16 sprite
|1f-18|17-10|0f-08|07-00|
1|$7c |BGR |command+color
2|y |x |
3|clut |v |u |clut location, texture page y,x
--------------------------------------------------------------------------
GPU command & Transfer packets
--------------------------------------------------------------------------
$01 clear cache
|1f-18|17-10|0f-08|07-00|
1|$01 |0 |clear cache.
Seems to be the same as the GP1 command.
--------------------------------------------------------------------------
$02 frame buffer rectangle draw
|1f-18|17-10|0f-08|07-00|
1|$02 |BGR |command+color
2|Y |X |Topleft corner
3|H |W |Height & Width
Fills the area in the frame buffer with the color given in BGR. This command
will draw without regard to drawing environment settings. Coordinates are
absolute frame buffer coordinates. Max width is $3ff, max height is $1ff.
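The fill packet can be sketched in C like this. The function name is hypothetical, and the size check simply applies the limits quoted above.

```c
#include <stdint.h>

/* Builds the three words of a $02 fill. Returns 0 on success, -1 if
 * the size exceeds the limits quoted above ($3ff wide, $1ff high). */
static int gpu_pack_fill(uint32_t out[3], uint32_t bgr,
                         uint16_t x, uint16_t y, uint16_t w, uint16_t h)
{
    if (w > 0x3ff || h > 0x1ff)
        return -1;
    out[0] = (0x02u << 24) | (bgr & 0x00ffffffu);
    out[1] = ((uint32_t)y << 16) | x;
    out[2] = ((uint32_t)h << 16) | w;
    return 0;
}
```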
--------------------------------------------------------------------------
$80 move image in frame buffer
|1f-18|17-10|0f-08|07-00|
1|$80 | 0|command
2|sY |sX |Source coord.
3|dY |dX |Destination coord.
4|H |W |Height+Width of transfer
Copies data within the frame buffer.
--------------------------------------------------------------------------
$01 $a0 send image to frame buffer
|1f-18|17-10|0f-08|07-00|
|$01 | |Reset command buffer (write to GP1 or GP0)
1|$A0 | |
2|Y |X |Destination coord.
3|H |W |Height+Width of transfer
4|pix1 |pix0 |image data
5..
?|pixn |pixn-1 |
Transfers data from main memory to the frame buffer.
If the number of pixels to be sent is odd, an extra pixel should be
sent, because each packet is 32 bits.
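The word count and the odd pixel padding can be sketched as follows. Both helper names are hypothetical, and the filler pixel is assumed to be zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 32 bit words needed for w*h 16 bit pixels, rounded up. */
static size_t gpu_image_words(unsigned w, unsigned h)
{
    return ((size_t)w * h + 1) / 2;
}

/* Packs pixels two per word, pix0 in the low half. An odd pixel count
 * gets a zero filler in the final high half. Returns the words written. */
static size_t gpu_pack_pixels(uint32_t *out, const uint16_t *pix, size_t count)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i += 2) {
        uint32_t hi = (i + 1 < count) ? pix[i + 1] : 0;
        out[n++] = (hi << 16) | pix[i];
    }
    return n;
}
```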
---------------------------------------------------------------------------
$01 $c0 copy image from frame buffer
|1f-18|17-10|0f-08|07-00|
|$01 | |Reset command buffer (write to GP1 or GP0)
1|$C0 | |
2|Y |X |Source coord.
3|H |W |Height+Width of transfer
4|pix1 |pix0 |image data (read from data port)
5..
?|pixn |pixn-1 |
Transfers data from the frame buffer to main memory. Wait for bit 27
of the status register to be set before reading the image data.
When the number of pixels is odd, an extra pixel is read at the
end, because one packet is 32 bits.
--------------------------------------------------------------------------
Draw mode/environment setting packets
--------------------------------------------------------------------------
Some of these settings can also be set by primitive packets. In any
case it is the last packet of either kind that the GPU received
that is active. So if a primitive sets tpage info, it will
overwrite the existing data, even if it was sent by an $e? packet.
--------------------------------------------------------------------------
$e1 draw mode setting
|1f-18|17-0b|0a |09 |08 07|06 05|04|03 02 01 00|
1|$e1 | |dfe|dtd|tp |abr |ty|tx | command +values
see above for explanations
It seems that bits $0b-$0d of the status reg can also be passed with this
command on some GPUs other than type 2. (ie. those where command $10000007
doesn't return 2)
--------------------------------------------------------------------------
$e2 texture window setting
|1F-18|17-14|13-0F|0E-0A|09-05|04-00|
1|$E2 |twy |twx |twh |tww | command + value
twx Texture window X, (twx*8)
twy Texture window Y, (twy*8)
tww Texture window width, 256-(tww*8)
twh Texture window height, 256-(twh*8)
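Going by the formulas above, a $e2 word could be built from pixel values like this. It is only a sketch: the function name is made up, and the accepted ranges are simply what the 5 bit fields can express.

```c
#include <stdint.h>

/* Builds a $e2 word from pixel values. x and y must be multiples of 8
 * below 256; w and h must be multiples of 8 in the range 8..256.
 * Returns 0 for values the 5 bit fields cannot express. */
static uint32_t gpu_pack_texwin(unsigned x, unsigned y, unsigned w, unsigned h)
{
    if ((x | y | w | h) & 7u)
        return 0;
    if (x > 248 || y > 248 || w < 8 || w > 256 || h < 8 || h > 256)
        return 0;
    uint32_t twx = x / 8;
    uint32_t twy = y / 8;
    uint32_t tww = (256 - w) / 8;     /* width  = 256 - tww*8 */
    uint32_t twh = (256 - h) / 8;     /* height = 256 - twh*8 */
    return (0xe2u << 24) | (twy << 15) | (twx << 10) | (twh << 5) | tww;
}
```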
--------------------------------------------------------------------------
$e3 set drawing area top left
|1f-18|17-14|13-0a|09-00|
1|$e3 | |Y |X |
sets the drawing area topleft corner. X&Y are absolute frame
buffer coords.
--------------------------------------------------------------------------
$e4 set drawing area bottom right
|1f-18|17-14|13-0a|09-00|
1|$e4 | |Y |X |
sets the drawing area bottom right. X&Y are absolute frame
buffer coords.
--------------------------------------------------------------------------
$e5 drawing offset
|1f-18|17-15|14-0b|0a-00|
1|$e5 | |OffsY|OffsX|
(offset Y = y << 11)
Sets the drawing offset. X&Y are offsets in the frame buffer that are
added to the coordinates of everything drawn.
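The $e3, $e4 and $e5 words can be packed as in the C sketch below. The function names are hypothetical, and the field widths are read straight off the bit rows above (Y from bit $0a for the area, from bit $0b for the offset).

```c
#include <stdint.h>

/* $e3 / $e4: Y in bits 13-0a, X in bits 09-00. cmd is 0xe3 or 0xe4. */
static uint32_t gpu_pack_draw_area(uint8_t cmd, unsigned x, unsigned y)
{
    return ((uint32_t)cmd << 24) | ((y & 0x3ffu) << 10) | (x & 0x3ffu);
}

/* $e5: offset Y starts at bit 0b, offset X is in bits 0a-00. */
static uint32_t gpu_pack_draw_offset(unsigned x, unsigned y)
{
    return (0xe5u << 24) | ((y & 0x3ffu) << 11) | (x & 0x7ffu);
}
```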
--------------------------------------------------------------------------
$e6 mask setting
|1f-18|17-02|01 |00 |
1|$e6 | |Mask2|Mask1|
Mask1 Set mask bit while drawing. 1 = on
Mask2 Do not draw to mask areas. 1= on
While mask1 is on, the GPU will set the MSB of all pixels it draws.
While mask2 is on, the GPU will not write to pixels with set MSB's
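The effect of the two mask bits on a single pixel can be modelled in C like this. It is a software model of the description above, not a description of what the hardware does internally, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of one pixel write under the $e6 settings.
 * Returns the value the frame buffer pixel holds afterwards. */
static uint16_t gpu_mask_write(uint16_t dst, uint16_t src,
                               bool mask1, bool mask2)
{
    if (mask2 && (dst & 0x8000u))
        return dst;                 /* protected pixel stays untouched */
    if (mask1)
        src |= 0x8000u;             /* force the MSB on drawn pixels */
    return src;
}
```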
--------------------------------------------------------------------------
DMA
--------------------------------------------------------------------------
The GPU has two DMA channels allocated to it. DMA channel 2 is used to send
linked packet lists to the GPU and to transfer image data to and from the
frame buffer. DMA channel 6 sets up an empty linked list, of which each
entry points to the previous (ie. reverse clear an OT.)
--------------------------------------------------------------------------
D2_MADR DMA base address. $1f8010a0
bit |1f 00|
desc|madr |
madr pointer to the address the DMA will start reading from/writing to
--------------------------------------------------------------------------
D2_BCR DMA block control $1f8010a4
bit |1f 10|0f 00|
desc|ba |bs |
ba Amount of blocks
bs Blocksize (words)
Sets up the DMA blocks. Once started the DMA will send ba blocks of bs
words. Don't set a blocksize larger than $10 words, as the command buffer
of the GPU is 64 bytes.
--------------------------------------------------------------------------
D2_CHCR DMA channel control $1f8010a8
bit |1f-19|18|17-0c|0b|0a|09|08|07 01|00|
desc| 0|Tr| 0| 0|Li|Co| 0| 0|Dr|
Tr 0 No DMA transfer busy.
1 Start DMA transfer/DMA transfer busy.
Li 1 Transfer linked list.
Co 1 Transfer continuous stream of data.
Dr 0 direction to memory
1 direction to GPU
This configures the DMA channel. The DMA starts when bit $18 is set. DMA
is finished as soon as bit $18 is cleared again. To send or receive data
to/from VRAM send the appropriate GPU packets first ($a0/$c0)
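The D2_BCR and D2_CHCR fields above translate to C roughly as follows. The macro and function names are made up for this sketch. The bit positions come from the tables.

```c
#include <stdint.h>

#define D2_CHCR_TO_GPU  (1u << 0x00)  /* Dr: direction to GPU */
#define D2_CHCR_CONT    (1u << 0x09)  /* Co: continuous stream */
#define D2_CHCR_LINKED  (1u << 0x0a)  /* Li: linked list */
#define D2_CHCR_START   (1u << 0x18)  /* Tr: start / busy */

/* D2_BCR: block count in the high halfword, block size (words) low. */
static uint32_t d2_bcr(uint16_t blocks, uint16_t block_size)
{
    return ((uint32_t)blocks << 16) | block_size;
}

/* True while the channel still reports a transfer in progress. */
static int d2_busy(uint32_t chcr)
{
    return (chcr & D2_CHCR_START) != 0;
}
```

Combining START, LINKED and TO_GPU gives $01000401, and START, CONT and TO_GPU gives $01000201.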
--------------------------------------------------------------------------
D6_MADR DMA base address. $1f8010e0
bit |1f 00|
desc|madr |
madr Last table entry.
--------------------------------------------------------------------------
D6_BCR DMA block control $1f8010e4
bit |1f 00|
desc|bc |
bc Number of list entries.
--------------------------------------------------------------------------
D6_CHCR DMA channel control $1f8010e8
bit |1f-1d|1c|1b-19|18|17-02|01|00|
desc| 0|OT| 0|Tr| 0|Ot| 0|
Tr 0 No DMA transfer busy.
1 Start DMA transfer/DMA transfer busy.
Ot 1 Set to do an OT clear.
When this register is set to $11000002, the DMA channel will create an
empty linked list of D6_BCR entries ending at the address in D6_MADR. Each
entry has a size of 0, and points to the previous. The first entry is
the terminator, $00ffffff.
So if D6_MADR = $80100010, D6_BCR=$00000004, and the DMA is kicked this
will result in a list looking like this:
$80100000 $00ffffff
$80100004 $00100000
$80100008 $00100004
$8010000c $00100008
$80100010 $0010000c
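For comparison, here is a software version of the same table in C, useful for checking what the DMA produced. The function name is hypothetical, and the number of entries is passed in explicitly to match the rows of the listing above rather than derived from D6_BCR.

```c
#include <stddef.h>
#include <stdint.h>

#define OT_END 0x00ffffffu   /* list terminator */

/* Fills ot[0..n-1] as an empty ordering table located at address base.
 * ot[0] gets the terminator, every other entry gets the 24 bit address
 * of the entry before it. */
static void ot_clear(uint32_t *ot, size_t n, uint32_t base)
{
    for (size_t i = 0; i < n; i++) {
        if (i == 0)
            ot[i] = OT_END;
        else
            ot[i] = (base + 4u * (uint32_t)(i - 1)) & 0x00ffffffu;
    }
}
```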
--------------------------------------------------------------------------
DPCR Dma control register $1f8010f0
|1f 1c|1b 18|17 14|13 10|0f 0c|0b 08|07 04|03 00|
| |Dma6 |Dma5 |Dma4 |Dma3 |Dma2 |Dma1 |Dma0 |
Each channel has a 4 bit control block allocated in this
register.
Bit 3: 1= Dma Enabled
2: ?
1: ?
0: ?
Bit 3 must be set for a channel to operate.
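Setting the enable bit for one channel can be sketched as below. The function name is hypothetical, and the other three bits of each block are left as they are since their meaning is unknown.

```c
#include <stdint.h>

/* Returns dpcr with the enable bit (bit 3 of the channel's 4 bit
 * block) set for the given channel, 0..6. Other bits are preserved. */
static uint32_t dpcr_enable(uint32_t dpcr, unsigned channel)
{
    return dpcr | (0x8u << (channel * 4));
}
```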
--------------------------------------------------------------------------
Common GPU functions, step by step.
--------------------------------------------------------------------------
* Initializing the GPU.
First thing to do when using the GPU is to initialize it. To do that take
the following steps:
1 - Reset the GPU (GP1 command $00). This turns off the display as well.
2 - Set horizontal and vertical start/end. (GP1 command $06, $07)
3 - Set display mode. (GP1 command $08)
4 - Set display offset. (GP1 command $05)
5 - Set draw mode. (GP0 command $e1)
6 - Set draw area. (GP0 command $e3, $e4)
7 - Set draw offset. (GP0 command $e5)
8 - Enable display. (GP1 command $03)
* Sending a linked list.
The normal way to send large numbers of primitives is by using a linked
list dma transfer. This list is built up of entries of which each points to
the next. One entry looks like this:
dw $nnYYYYYY ; nn = the number of words in the list entry
; YYYYYY = address of next list entry & $00ffffff
1 dw .. ; here goes the primitive.
2 dw .. ;
. dw .. ;
nn-1 dw .. ;
nn dw .. ;
The last entry in the list should have $ffffff as pointer, which is the
terminator. As soon as this value is found DMA is ended. If the entry
size is set to 0, no data will be transferred to the GPU and the next
entry is processed.
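The header word of a list entry can be built like this in C. The function and macro names are assumptions for the example.

```c
#include <stdint.h>

#define LIST_END 0x00ffffffu   /* pointer value that terminates the list */

/* Header word: nn (payload words) in the top byte, 24 bit address of
 * the next entry below it. */
static uint32_t list_header(uint8_t words, uint32_t next_addr)
{
    return ((uint32_t)words << 24) | (next_addr & 0x00ffffffu);
}
```

An entry carrying a 3 word primitive followed by an entry at $80100020 would start with list_header(3, 0x80100020), and the last entry of a list with list_header(nn, LIST_END).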
To send the list do this:
1 - Wait for the GPU to be ready to receive commands. (bit $1c == 1)
2 - Enable DMA channel 2
3 - Set GPU to DMA cpu->gpu mode. ($04000002)
4 - Set D2_MADR to the start of the list
5 - Set D2_BCR to zero.
6 - Set D2_CHCR to link mode, mem->GPU and dma enable. ($01000401)
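The sequence above could look like this in C. This is a sketch under assumptions: the register block struct is hypothetical, and the registers are passed in as pointers so the same code can be tried against plain variables on a host machine. On the real machine they would point at the D2_* and DPCR addresses listed earlier and at the GPU control/status port.

```c
#include <stdint.h>

/* Hypothetical register block, supplied by the caller. */
typedef struct {
    volatile uint32_t *gp1;      /* GP1: control on write, status on read */
    volatile uint32_t *dpcr;
    volatile uint32_t *d2_madr;
    volatile uint32_t *d2_bcr;
    volatile uint32_t *d2_chcr;
} GpuDmaRegs;

static void gpu_send_list(const GpuDmaRegs *r, uint32_t list_addr)
{
    while ((*r->gp1 & (1u << 0x1c)) == 0)
        ;                               /* wait until the GPU is ready */
    *r->dpcr |= 0x8u << (2 * 4);        /* enable DMA channel 2 */
    *r->gp1 = 0x04000002u;              /* GPU DMA mode cpu->gpu */
    *r->d2_madr = list_addr;            /* start of the list */
    *r->d2_bcr = 0;
    *r->d2_chcr = 0x01000401u;          /* linked list, to GPU, start */
}
```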
* Uploading Image data through DMA.
To upload an image to VRAM take the following steps:
1 - Wait for the GPU to be idle and DMA to finish. Enable DMA channel 2
if necessary.
2 - Send the 'Send image to VRAM' primitive. (You can send this through
dma if you want. Use the linked list method described above)
3 - Set DMA to CPU->GPU ($04000002) (if you didn't do so already in the
previous step)
4 - Set D2_MADR to the start of the image data
5 - Set D2_BCR with : bits 31-16 = Number of words to send (H*W /2)
bits 15- 0 = Block size of 1 word. ($01)
if H*W is odd, add 1. (Pixels are 2 bytes, send
an extra blank pixel in case of an odd amount)
6 - Set D2_CHCR to continuous mode, mem -> GPU and dma enable. ($01000201)
Note that H, W, X and Y are always in frame buffer pixels, even if you send
image data in other formats.
You can use bigger block sizes if you need more speed. If the number of
words to be sent is not a multiple of the blocksize, you'll have to send
the remainder separately, because the GPU only accepts an extra halfword
if the number of pixels is odd. (ie. of the last word sent, only the low
half word is used.) Also take care not to use blocksizes bigger than $10, as
the buffer of the GPU is only 64 bytes (=$10 words).
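The block arithmetic described above can be worked out up front. The sketch below uses a hypothetical UploadPlan struct that splits a transfer into full blocks for D2_BCR and a remainder to be sent separately.

```c
#include <stdint.h>

typedef struct {
    uint32_t bcr;        /* D2_BCR value for the full blocks, 0 if none */
    uint32_t remainder;  /* words left over, to be sent separately */
} UploadPlan;

/* Plans an upload of w*h pixels with the given block size in words
 * (1..$10). Returns a plan with bcr == 0 and remainder == 0 on bad
 * input. */
static UploadPlan gpu_plan_upload(unsigned w, unsigned h, unsigned block_size)
{
    UploadPlan p = { 0, 0 };
    if (block_size == 0 || block_size > 0x10)
        return p;
    uint32_t words = ((uint32_t)w * h + 1) / 2;   /* odd count rounds up */
    uint32_t blocks = words / block_size;
    if (blocks > 0xffff)
        return p;
    p.remainder = words % block_size;
    if (blocks)
        p.bcr = (blocks << 16) | block_size;
    return p;
}
```

For example a 10*5 image with a block size of $10 comes out as one full block plus 9 words that need a second transfer.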
* Waiting to send commands
You can send new commands as soon as DMA has ceased and the GPU is ready.
1 - Wait for bit $18 to become 0 in D2_CHCR
2 - Wait for bit $1c to become 1 in the GPU status register (read GP1).
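Both conditions can be folded into one check, sketched here in C. The function name is hypothetical. It takes the two register values as arguments, so the caller decides how they are read.

```c
#include <stdint.h>

/* Nonzero when DMA channel 2 is idle (bit $18 of D2_CHCR clear) and
 * the GPU reports ready (bit $1c of the status register set). */
static int gpu_ready_for_commands(uint32_t d2_chcr, uint32_t gpu_status)
{
    return (d2_chcr & (1u << 0x18)) == 0 &&
           (gpu_status & (1u << 0x1c)) != 0;
}
```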
* Vsync
Step by step for a VSYNC counter coming up (not)soon.
Meanwhile you can init the pad driver and as soon as you want to
check for VSYNC, fill the return buffer with 0 and wait for it to change.
The pad driver checks the pads every VSYNC. Check the greentro source for
an example.
--------------------------------------------------------------------------
Missing info.
--------------------------------------------------------------------------
There's still a lot yet uncovered, so if you have/know anything that's not
in here please mail it to me. Things i'm looking for particularly are
info on the differences between the various versions and revisions of the
GPU, and something about drawing speeds and other timing.
--------------------------------------------------------------------------
History:
--------------------------------------------------------------------------
23/apr/1999 First public release.
28/apr/1999 Some bugfixes and rewrites.
Info on texture pages corrected. <Silpheed>
8/may/1999 Detailed packet composition.
20/may/1999 DMA & Step by steps added.
25/jun/1999 More DMA, OT and lists.
30/aug/1999 Correction. ($03)
--------------------------------------------------------------------------
Maintained by doomed/padua. Any errors, additions -> <doomed@c64.org>
--------------------------------------------------------------------------
--== http://psx.rules.org/ ==--
--== http://www.padua.org/ ==--
--------------------------------------------------------------------------
Thanx & Hello to:
Silpheed Groepaz Brainwalker & Hitmen, Antiloop Middy Danzig & Napalm,
K-Communications, Blackbag, TDJ Sander & Focus, Burglar LCF & SCS*TRC,
Deekay & Crest, Graham NO-XS & Oxyron, MrAlpha Fungus & F4CG, Zealot &
Wrath Design, Shape, Naphalm Jazzcat & Onslaught, Reyn Ouwehand, WHW & WOW,
all active people on PSX and C64, #psxdev, #c-64.
--------------------------------------------------------------------------