Copy Link
Add to Bookmark
Report
Xine - issue #3 - Phile 103
/-----------------------------\
| Xine - issue #3 - Phile 103 |
\-----------------------------/
*----------------------------------*
* Mid-Infection on relocations *
* by b0z0/iKX, Padania 1998 *
*----------------------------*
1) The story so far:
--------------------
Midfile infection is undoubtely one of the most interesting but not yet
totally explored topics in virus writing. There aren't actually many midfile
infectors around and there are even less "real" midfile infectors. With "real"
I mean viruses that get control in a random moment from the host, not just
viruses that places themselves physically in the middle of the infected
file but anyway gain the control immediately at the start of execution or
viruses that gain control at the beginning of the execution after jumps
or pieces of code placed somewhere in the middle of the host. This kind of
midinfection has been more developed and is anyway easier to implement. Of
course it is also easier to detect such a virus with a good code emulator,
since it gets control directly or after some garbage code quite soon.
While the other kind of midfile infectors that gain control somewhere in
the middle of the host execution are undoubtely more complicated and more
interesting. For example there a few viruses (Nexiv_Der, Sailor_Moon and
The_Bugger) that put a CALL in the infected COM file that passes the control
to the virus from a random position of the file. Of course they have to
execute the host step-by-step to be sure that a valid instruction is
real starting there, check that the COM files aren't selfmodifying and
so on. Of course the CALL to the virus may not always be executed,
depending on the actual program execution. For an antivirus is very hard
to find such a virus, since to get to it's entrypoint the antivirus should
have to emulate the entire COM file, trying all the possible combinations
of the program execution. And this is quite impossible. At least some
antiviruses will try to examine some amount of bytes at the end of the file,
even without being sure it will became executed, to scan for viruses. But this
lame scanning is simply avoidable by the virus by using a good poly engine.
But COM files are fading away, there are just a few of them around on newer
operating systems and programs. Under both DOS and Windows (3.x/95/98/NT/...)
you quite don't have anymore any COM files, but rather more types of EXE files.
The format and organization of the DOS EXE and other Windows executables are
anyway more complex and midfile infection should not be as easy as in COM
files. But more complex doesn't mean impossible to implement, and some
features expecially when working under Windows in protected mode can came
in our help.
Ah, from this point ahead some knowledge of EXEs infection and EXE structure
is required, but I'll try to put some references to other material and to
explain as easier as possible the main concepts.
2) Infection on relocations:
----------------------------
Since on all kind of EXEs, from DOS to various Windows ones, we don't have
anymore just one 64k segment to work on like with COMs, we can't just put a
CALL to our code, since it should be too far away (with DOS EXE) or should be
just in another code segment (for example when dealing with NE files).
So one way to get control from somewhere in the user code we should use a
CALL FAR or a JMP FAR that will jump to a given pair segment:offset. Of course
since we don't know where the operating system will load the various segments
in memory we need to use a relocation, this is correct the adress where
we are going to jump depending on where the stuff has been loaded.
The calculation of this adress in memory should be done in some cases
manually with a few calculations, but it is always done at load time by the
operating system. Infact when the operating system loads the executable that
is going to be run it checks in its tables (I'll be more precise for various
EXEs later) for adresses that need to be corrected, or with another word need
to be relocated.
So if you would like to call your code from the original host you should put
a JMP/CALL FAR using the physical offset (this is the offset expressed as pair
segment:offset in the real file on the disk) and then add an entry in the
relocation table that will make sure the real adress of the code loaded in
memory will be put instead of the physical one. There is noone that prevents
you from adding some relocation entry, you will just have to mess around with
various structures and look for avaiable space, depending on the type of EXE.
While you should get some ideas from this I'll now focus the attention on a
method for NE infection based on relocations, but that is slightly different
from the concept introduced above.
3) Implementation in the NewEXE:
--------------------------------
It's well known that Windows programs are everyday bigger and bigger and
so as you can imagine jumps from one NE segment to another (each has a limit
of 64k) are very common. Even more common under Windows is the use of APIs
to perform various tasks. In both cases the programmer has to code an adress
that will be dinamically at load time updated with the actual real position
of the needed piece of code in memory.
In this method of infection we will modify one of those entries to point to
our virus code. In this way when the original host will have to call that API
(or jump somewhere or whatever) the control will pass to our virus. The virus
will do its work and then will do the call (or whatever) the host was executing
before infection.
It's not so hard, so let's go step-by-step considering we already have our
file to infect. I won't present code here since it should became messy and
you can anyway find quite commented code in the Free_Padania virus that comes
with the Xine that uses this method. Now go on:
*) First of all we will have to do the various checks if it is already
infected, if it is suitable for infection and such normal stuff
*) Now we have to add our code to the executable and of course add the
needed changes to the NE header so it won't be treated as junk :-).
The virus provided with the zine for the demonstration of this
method (the Free_Padania) adds its own code segment in the NE, using
sorta VLAD method (check out VLAD#5 and the code for more!). You should
also infect the NE in some other way anyway, just remember that we will
need to add a relocation too!
*) Now we suppose to have the NE header ready and we have our pair of
segment:offset that points to our virus code. Now we have to read the
NE segment table (which starting offset is at offset 22h). Of course
this should be done before infection, but that are just small details
in which we aren't going to lose our time :) Each segment table is 8
bytes long, please refeer to the bottom of this document for the
details.
*) Now we select one of the segments in which we will change the relocation
entry. Of course we should not consider the just added virus segment and
usually we should not consider the last host segment, which is usually
a DATA segment with stack and such. So it is important to check that the
segment we would like to use for infection is type CODE, so it will be
executed, and has relocations, so we have something to infect. This is
simply done by checking the segment attributes from bits 0 and 8.
*) Selected the segment now we have to read the relocation items of the
segment. The relocation items are put just after the segment itself.
So we can just get the offset where the segment starts from the segment
table entry (pay attention since it is expressed in logical sectors, so
you must shift this value left by the value at offset 32h of the NE
header) and add its lenght (already in bytes at offset 02h of the
segment entry). The first word just after the segment contains the
number of relocations present. After this we will find all the relocation
entries for the segment.
*) Now we have a number of potential relocation items to change. We have
now to select randomly one. The only check we will do is that the
relocation is made on a 32bit pointer. Infact our CALL FARs to the virus
will be 32bit pointers, so we don't want to get just a 16bit relocation
or some other type of relocation (look at tables below for the possible
ones).
*) When a relocation item has been choosen we just have to change it so
it will point to the virus entry point. The relocation item should be
a 32bit PTR (so 03h at offset 0 in the relocation item) and should be
a normal (not additive! since this would relocate the entry in the
opposite way) relocation (so 00h at offset 1). The location of the
relocatable in the file is of course unchanged. We must then finally
change the module and the offset to which the relocation will point.
So we will put the segment of the virus at offset 04h and the offset
of the virus entry point at offset 06h of the relocation table entry.
Since we will need the original relocation entry later you must put
a copy of it somewhere.
*) Now we have to write the modified relocation entry again to the file
where it originally was. Now the job to pass the control from the host
to the virus is done. We must just make the restoration work.
*) Since we deleted a call to a function we must make this somewhere else.
This will be done in the restoration part of the virus. We should have
to make a CALL FAR to the old function after our virus will make its work
(going resident or infecting files at runtime). Of course the virus
will have to preserve the registers and then jump back after the
modified call in the host. But remember that the virus got control
via a CALL FAR (-theorically- it should also get control via a JMP FAR,
but this isn't present in NExes. Anyway if this should succeed we won't
need to return back to the calling point, since JMP never returns),
so the adress is still on the stack. So instead of doing the old CALL
FAR we should just do a JMP FAR and when the routine will try to came
back it will get the right old adress and the host will continue on its
way.
So shortly this restoration part:
- save all registers and such
- do virus work
- restore all registers and such
- jmp far to the old routine and then continue on it's way
Of course there is just a minor thing: the adress after the JMP FAR
needs to be relocated as the old funcion! So we can just simply add
a relocation that is just the same as the old infected one, just with
the difference that we have to put the right offset to it in at offset
02h.
And that's it. Now when the host will call the infected call (very usually
an API under Windows) it will pass the control to the virus. This will
do its work, execute the original call and then continue the host code.
Pay attention that the virus should be called more than once during the
execution of the program! So for example if your virus is polymorphic after
the first execution you must change the code in memory so the decryptor
won't be executed again or the effects will be crashes and such like, since
the virus has been already decrypted in the previous call! This is anyway
easy to solve with a JMP over the decryptor. This will also make the execution
faster, since just the first time the virus is called it should be nice to
check to be activated. As for the rest of the times the virus is called (if
you infected a very used call it should be called even twenty or more times!)
it should just directly jump to the JMP FAR to old code without losing time
at residency check and such, because anyway we know we are resident.
Before discussing of the aspects of this way of infection I would like to
point out about some things I founded with relocations when coding this
method. On a lot of specifications and books I haven't founded about this,
but it should be interesting at least for those, like me, that don't have
this documentation :) This should be of use to someone in future projects
maybe.
You'll be able to find written that to make a call to an API for example in
pure assembly you'll have to do something like a:
call far ptr 0000h:0ffffh (err, db 9a,ff,ff,00,00 of course)
and then put in the relocation entry at the bottom of the file that the
adress where the ffffh is must be relocated (as a 32bit PTR).
This is true, but isn't the only way it is done. While studing NExes I
founded that with one relocation entry you can relocate as many pointers
as you want. Infact if instead of putting the 0ffffh you put the offset
(calculated from the start of the segment) to another pointer then it
will be relocated too. So:
0001h:0100h call far ptr 0000:00111h ; 9a,11,01,00,00
0001h:0110h call far ptr 0000:00201h ; 9a,01,02,00,00
0001h:0120h call far ptr 0000:0ffffh ; 9a,ff,ff,00,00
0001h:0200h call far ptr 0000:00121h ; 9a,21,01,00,00
By putting just one relocation entry on 0001h:0101h (the first of the
"chain") all the other ones in the example will be also relocated, since the
first points to the second, the second to the fourth and the fourth to the
third that, since it has 0ffffh as offset, is the last to be relocated with
the used relocation entry. Of course this works when relocations are in the
same segment (module).
Of course this a great advantage for this method of infection, since by
infecting the first relocation entry in the previous example the virus will
be called from four different points, so the probability to get a chance to
get to virus code is even bigger. Since all the four points originally
wanted to be relocated as the same call then also the restoration code in the
virus will be just one for all of them. Of course we also don't have to change
(and that's better, one work less :)) ) the code that is going to be relocated,
since if we would put a typical call far 0:ffff to the virus it should break
the chain and some code shouldn't be relocated thus making windows crash
(and here is where I had initally problems and so I had to investigate WTF are
that values different to ffffh even if the fucked manual doesn't talk about
them :) ).
4) Pro and cons about this way of infection:
--------------------------------------------
First let's try to see the good aspects of this type of infection. It is
simple to see that the virus should get control in a quite random moment
of the execution. The virus should trigger just on some special events and
so it is not so simple to spot it. For example it should just activate when
a dialog box is going to be drawn (if it infected a call of the COMMDLG.xx)
or just when the user will exit from the application or when the user uses
the cos function in the calculator or something else. The time the virus
needs do its work in this way isn't concentrated anymore at the beginning
of the program execution, so this should also decrease possible user
preoccupations.
On the other side in this way the virus should not always get activated. If
it infects a rarely used call it should not get control. But of course this
should be solved by infecting the file more than once or by scanning for
some typical APIs to be infected (this should be anyway harder since
you should have to find first the module reference number and then look for
the typical function. But with a few given assumpions it should be done
quite easily), like an initialization or exit procedure.
Another "bad" aspect that many should notice is that anyway there is a
reference to our virus from at least one relocation, unlike with the COM
infection mentioned before where absolutely nothing points to the CALL.
Well, yes, an antivirus should definitely scan all the relocations of all
the segments of the executable and analyze the code the calls points to.
But, man, even the most banal NExe has hundreds of relocations (for ex.
CLOCK has about 90 and CHARMAP about 120... and they are just 20kb files...
try to give a look to something like Word ;)) ) and scanning up and down
the entire EXE would make the user waste a lot of time and shouldn't be
reailable. Infact if the virus should have a quite good poly engine
then the antivirus should need to emulate kbs of code of every relocation.
Actual AVs don't do something like that, but I don't think that AVs will going
to implement this sort of scanning over all possible relocations as entry
points to the virus. Should get too much time on scanning of every file.
Heh, no, the user will eat the AV producer's hat ;))
AVs should anyway dumbly try to check the last segment (or the segments that
are shorter than a predefined value) for viruses. Well, yea this should work
fine with non polymorphic viruses but with a little of work from the VW this
method should be easily made ineffective. Just put some bytes of nonsense code
at the start of the virus segment, then make the poly decryptor (where the call
will point to, of course not to the nonsense bytes before :)) ) that decrypts
the body and maybe again some nonsense bytes after it. Of course you'll need
to anyway keep clear the dwords to be relocated, but that's not a big flaw.
The Free_Padania sample virus uses the scheme described above, so the virus
segment looks like
Ú----------------Â-----------------------------------Â----------------Â-------¿
| nonsense bytes | poly decryptor and encrypted code | nonsense bytes | reloc |
À----------------Á-----------------------------------Á----------------Á-------Ù
Where the reloc is the relocation item entry. The dword to be relocated (the
0:ffff) is put somewhere in the first chunk of the nonsense bytes and the
offset to it is stored in the encrypted code. The nonsense bytes have the
job to make the scanning for the entry point of the poly harder, so just
trying to begin emulating from the beginning of the segment won't work and
a scan of all the relocations in the executable should be needed. Of course
this must be used with a good poly.
5) Possible add-ons:
--------------------
Some things that should be done to make the stuff better:
- make some more fake relocations to make the segment seem more simillar
to other segments
- use relocations to change your code in the way you want. So on the disk
the usual 0:ffff will be there, but the code you wants will be put by
the loader by adding some good relocations. This isn't too hard to do,
but should be done with caution. But is should make the life of AVs very
hard, thus they should have to consider the code with relocated code, not
the one they see on the hd. Of course just some part of the relocated
code should be used, like offsets. Since we don't know how the segment
is relocated we can't use that. Do this for homework :)
- swap some segments. So the virus segment won't be always the last, making
scanning even and even harder. This anyway should require quite a lot,
maybe too much work. Besides the moving of large chunks of code you'll
have also to correct all the relocations in all the segments pointing
to the changed one.
- use an already present segment. Find a hole an put the virus there instead
of creating a new segment.
That's a lot you should do in NExes, just let your mind go wild :) Even if
NExes are not used anymore, the format is very interesting. It is quite
complicated, thus a lot of strange things can be done to make the life of AVs
look like hell :) Even in the newest Windowses (like 98) there are many NExes
(some of which very used, like SOL.EXE :)) ), so at least they should be used
to keep the system permanelty infected with a good infection scheme.
6) Implementation in other executables:
---------------------------------------
I explained and brought the code for a NE infector on relocations. What about
with normal DOS executables? The things there are a lot worse. The general idea
of the relocation should work, but the problem is that DOS isn't in protected
mode, so you must also pay attention that the original program doesn't
overwrite your code! This is the big advantage in working in protected mode,
we don't have to worry that someone will acidentally overwrite our code since
if someone should try to do a general protection fault will occour. In
protected mode applications are a lot more correct than the DOS ones. Infact
a DOS application should just write without any limitation anywhere. And
it is very usual that DOS programs just assume that after their body (the
original one, not the infected) there isn't anything vital and so they just
write there (and should corrupt our virus). This of course should be
prevented by working like the Nexiv_Der, that is by putting a definite char
after the loaded program in memory and at the end of the execution try to
look for a place big enough that hasn't been modified. But, as for personal
tests (I still have a sorta prototype of infector on relocations for normal
DOS exes that I wrote some time ago somewhere), the number of possible victims
will be very limted since a lot of them use memory (at least for stack) after
the body. Yea, you should just search for space a lot later in memory (and so
leaving some hole between the host and the virus code where the original host
should write) and hope all will be ok. But it is definitely a not too secure
way of infection even if many checks are done (since the code should be
executed differently from the time when the infection was done) and a lot of
code is needed to implement it. But it actually should be done of course.
As for midinfection in PE since the methods and theories behind that are
quite different I'll talk about that in another article, look for it!
7) Appendix:
------------
Here are some tables and such thing as I refeered mostly in the text. For
more (expecially about the NE header which you'll need to undestand too to
add the virus code) check out Ralph Brown Interrupt Lists (AX=4b00) and
Qark and Quantum article about NewEXE infection in VLAD #5.
Format of new executable segment table record:
00h WORD offset in file (shift left by align shift to get byte offs)
02h WORD length of image in file (0000h = 64K)
04h WORD segment attributes (see below)
06h WORD number of bytes to allocate for segment (0000h = 64K)
Note: the first segment table entry is entry number 1
Bitfields for segment attributes:
Bit(s) Description
0 data segment rather than code segment
1 unused???
2 real mode
3 iterated
4 movable
5 sharable
6 preloaded rather than demand-loaded
7 execute-only (code) or read-only (data)
8 relocations (directly following code for this segment)
9 debug info present
10,11 80286 DPL bits
12 discardable
13-15 discard priority
Format of new executable relocation data (immediately follows segment image):
Offset Size Description
00h WORD number of relocation items
02h 8N BYTEs relocation items
Offset Size Description
00h BYTE relocation type
00h LOBYTE
02h BASE
03h PTR
05h OFFS
0Bh PTR48
0Dh OFFS32
01h BYTE flags
bit 2: additive
02h WORD offset within segment
04h WORD target address segment
06h WORD target address offset