29A Issue 01 02 03
Polymorphism
─────────────────────────────────────────────────────────────────────────>
Aníbal Lecter
I know it may be a bit much to feature both an encryption article and a
polymorphism one in the same issue, but this one is dedicated to those of
you at a more advanced level. If you are still a bit confused about
encryption, better skip this article and try YAM.
We'll give a very basic introduction to polymorphic routines: their
design, construction and functioning.
In this article we'll study a 'pseudo-polymorphic' generator, that is:
starting from a basic routine, making the detection of the virus more
difficult (even though the routine's kernel does not vary), depending on
your aims.
What's a PER (Polymorphic Encryption Routine)?
PERs were born to avoid detection schemes based on searching for fixed
strings of bytes.
These schemes rely on the idea that viruses always preserve a number of
stable bytes in each generation (at least in the header, when encrypted).
With PERs we try to avoid this inconvenience by always varying that
header, either:
1. Lexically: directly substituting some hex codes for others.
      label:      2825    sub [di],ah
      turns into: 0005    add [di],al
   In this case it would be enough to subtract 2820h from the word at
   label:, although we must bear in mind which register, AH or AL, holds
   the value to use; depending on the case, that code would change too.
   How? Keep on reading }:-)
2. Syntactically: changing the order of the instructions, but seeking the
   same result.
      label:      add [di], ah
                  xor [di], ah
      turns into: xor [di], ah
                  add [di], ah
   This time we should keep the order in mind during encryption so as to
   invert it correctly.
3. Morphologically: varying its external appearance while maintaining its
   kernel, by including garbage in between the code.
   In order to do this, we can make use of the classics:
90h = nop
f8h = clc
f9h = stc
fah = cli
fbh = sti
fch = cld
fdh = std
   Look out! These last two are dangerous if you are using registers SI,
   DI and CX at the same time, because you'll have to bear them in mind
   depending on whether you intend the loop to increase or decrease :-P
   Of course, you can combine them; the easiest example is for the
   typical bait files:
dw 2000 dup (90fbh)
mov ax,4c00h
int 21h
   This way, we keep the virus from refusing to infect the file because
   it found a large run of identical bytes. For the decrypting header, we
   would have a routine with the XOR and a little algorithm which would
   add a series of 'non-code' bytes.
4. Finally, combining the three preceding ways as you like. Further on,
   we'll see a bit of language grammar, specifically Conway notation ;-)
   It would be equally possible to use BNF, but as far as I can see, it
   is less clear.
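The remark in point 2 about keeping the order in mind can be checked with
plain byte arithmetic. The following is only a minimal Python sketch, not
code from the article: the function names are hypothetical, and it works
on single byte values rather than on any real routine. It assumes the
forward transform is "add, then xor" with the same key, so the inverse has
to undo the xor first and the add second.

```python
def encode(value: int, key: int) -> int:
    """Forward transform (assumed for illustration): add, then xor."""
    return ((value + key) & 0xFF) ^ key


def decode(value: int, key: int) -> int:
    """Correct inverse: undo the xor first, then undo the add."""
    return ((value ^ key) - key) & 0xFF


def decode_wrong_order(value: int, key: int) -> int:
    """Same two inverse steps, applied in the wrong order."""
    return ((value - key) & 0xFF) ^ key


# Every byte survives the round trip when the order is inverted properly.
assert all(decode(encode(v, k), k) == v
           for v in range(256) for k in range(256))

# With the order mixed up, the round trip breaks for at least some inputs.
print(decode_wrong_order(encode(0, 1), 1))  # 254, not the original 0
```

The two operations do not commute, which is why swapping them on one side
means swapping them on the other side as well.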
Of the three basic methods we've just seen for making mutations possible,
the one I find easiest is the third, as it just affects the code size and
the 'innocent' instructions we include in it. The second one is a bit
more complex, because the order of the instructions has to change even
though the instructions themselves stay the same. As for the first one,
it's necessary to know all of the opcodes and build a table (there is an
example in MtE).
I'll start with the third case; once you pick it up, you will all know
enough to jump directly to the second type as well as the first, or even
combine them all, or whatever you think of.
Let's see it with a direct example; let's take the decryption routine
from the 13th July virus (original source code by Jordi Mas, a Spanish
ex-AVer dickhead who was known to have some of his viruses in the
wild... and these viruses sucked as much as he did) }:-)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ->8
longitud equ fin-inicio_virus ; Defines the size
inicio_virus:
inicio: mov al,cs:hasta_final ; Decrypts the executable
xor al,090h ; viral code
mov si,offset hasta_final
mov cx,longitud
des_bl: xor byte ptr cs:[si],al
inc si
loop des_bl
hasta_final db 90h ; Stores the random value
; it uses for the encryption
; .... ; routine
; 'Normal' viral code
; ....
fin: ; End of the code
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ->8
In this case, we can see that the string of bytes that starts at 'inicio'
and runs up to 'hasta_final' (leaving the encryption value out) will
always be the same in each infection; only the encrypted body changes,
with 255 different possibilities plus one more in which the code is
visible (00h). But since this string doesn't change, the virus is 100%
detectable.
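The counting argument can be reproduced on harmless data. This is a
minimal Python sketch under stated assumptions: HEADER and BODY are
hypothetical stand-ins (a constant prefix and some sample text), and
xor_bytes is an illustrative helper, not anything taken from the listing
above. It shows that a one-byte XOR key yields 256 different bodies, that
key 00h leaves the body visible, and that the constant prefix is present
in every variant, which is all a string scanner needs.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key (0..255)."""
    return bytes(b ^ key for b in data)


# Hypothetical stand-ins: a prefix that never changes and a sample body.
HEADER = b"FIXED-HEADER"
BODY = b"harmless sample text"

# One image per possible key; only the part after the header differs.
images = [HEADER + xor_bytes(BODY, key) for key in range(256)]

distinct_bodies = {image[len(HEADER):] for image in images}
print(len(distinct_bodies))                  # 256 (255 hidden + 1 visible)
print(xor_bytes(BODY, 0x00) == BODY)         # True: key 00h changes nothing
print(all(image.startswith(HEADER) for image in images))  # True
```

However many keys there are, searching for the unchanged prefix finds
every image.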
It can be mutated, but it must be done by hand, for instance by fitting
some of these 'innocent' bytes in between the first and the second
instruction. Nevertheless, this can be done by any lamer ;-) We are leet
and can do it better, can't we? ;-)
In this approach to polymorphism by the third method, we can notice three
of its characteristics:
1. The routine's kernel doesn't change.
2. The size does.
3. The distance in bytes between the main instructions also varies.
How can we do it? Let's see:
Grammar in Conway notation
──────────────────────────
This kind of grammar, as some of you will already know, is mostly used to
describe the structure of programming languages; it is also valid for
spoken language, and it is used in AI for speech recognition, or at least
that's what I think.
The aim is to represent the morphological structure of a language by
means of a flow chart. Take as an example:
                              ┌─────«────────«────────────«───────────┐
                              │                        ┌───»────end───┤
start                         │                        │              │
»─┬──article───subject─┬─verb─┼──»──adverb───»─────────┘              │
  │                    │      │                            ┌─mode─────┤
  │                    │      │                            │          │
  └─»──pronoun───»─────┘      └─────»───manner─────────────┼─time─────┤
                                                           │          │
                                                           └─place────┘
Let's see it in detail:
                             ┌───────────«─────────────────«────────────┐
                             │                  ┌───»────(end)────»─────┤
(start)                      │                  │                       │
─»──The──┬─virus──┬─infected─┼────always───»────┘         ┌─massively───┤
         │        │          │                            │             │
         └─Zhengxi┘          └────────────────────────────┼─fr.3h to 5h─┤
                                                          │             │
                                                          └─some PCs────┘
Experts on language will pardon me... this ain't a grammar lesson ;-)
If we observe and analyze the diagram, we can see it is possible to say
the same thing with different words. E.g.:
"The virus always infected"
"The virus infected massively"
"The Zhengxi virus infected from 3h till 5h"
But we're also allowed to 'toy' around. Eg:
"The virus infected always, always"
"The Zhengxi virus infected massively from 3h till 5h some PCs"
"The Zhengxi virus infected always some PCs, always from 3h till 5h,
always massively"
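The sentence diagram can also be walked mechanically. What follows is
only a rough Python sketch of the natural-language example, under one
reading of the diagram: a subject, the verb, then one or more modifiers
in any order with repeats allowed. The names SUBJECTS, MODIFIERS,
generate and accepts are hypothetical, and the commas used in some of the
sentences above are left out.

```python
import random

# One reading of the diagram: subject, verb, then 1..n modifiers.
SUBJECTS = [["The", "virus"], ["The", "Zhengxi", "virus"]]
VERB = "infected"
MODIFIERS = ["always", "massively", "from 3h till 5h", "some PCs"]


def generate(rng: random.Random, max_modifiers: int = 3) -> str:
    """Follow one random path through the diagram and return the sentence."""
    words = list(rng.choice(SUBJECTS)) + [VERB]
    for _ in range(rng.randint(1, max_modifiers)):
        words.append(rng.choice(MODIFIERS))
    return " ".join(words)


def _only_modifiers(rest: str) -> bool:
    """True if rest is a space-separated run of known modifiers."""
    if rest == "":
        return True
    for modifier in MODIFIERS:
        if rest == modifier or rest.startswith(modifier + " "):
            if _only_modifiers(rest[len(modifier):].lstrip(" ")):
                return True
    return False


def accepts(sentence: str) -> bool:
    """True if the sentence is one of the paths the diagram allows."""
    for subject in SUBJECTS:
        prefix = " ".join(subject + [VERB]) + " "
        if sentence.startswith(prefix):
            rest = sentence[len(prefix):]
            return rest != "" and _only_modifiers(rest)
    return False


rng = random.Random(0)
for _ in range(3):
    print(generate(rng))
print(accepts("The virus infected massively"))  # True
print(accepts("The virus massively"))           # False: the verb is missing
```

Every generated sentence has a different length and wording, yet all of
them are accepted by the same diagram.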
Can you get the 'hidden' intention of this explanation? ;-)
You'd better, because next issue we'll put all of this into practice, and
it's better to have all the concepts assimilated.
* Aníbal *