Phrack Inc. Volume 07 Issue 48 File 14

  

                             ==Phrack Magazine== 

             Volume Seven, Issue Forty-Eight, File 14 of 18 

 
			 [ IP-spoofing Demystified ] 
		      (Trust-Relationship Exploitation) 

 
			by daemon9 / route / infinity 
			     for Phrack Magazine 
		      June 1996 Guild Productions, kid 

		       comments to route@infonexus.com 

 
        The purpose of this paper is to explain IP-spoofing to the 
masses.  It assumes little more than a working knowledge of Unix and 
TCP/IP.  Oh, and that you're not a moron... 
        IP-spoofing is a complex technical attack that is made up of 
several components.  (In actuality, IP-spoofing is not the attack, but 
a step in the attack.  The attack is actually trust-relationship 
exploitation.  However, in this paper,  IP-spoofing will refer to the 
whole attack.)  In this paper, I will explain the attack in detail, 
including the relevant operating system and networking information. 

 
                [SECTION I.  BACKGROUND INFORMATION] 

 
        --[ The Players ]-- 

 
        A:      Target host 
        B:      Trusted host 
        X:      Unreachable host 
        Z:      Attacking host 
        (1)2:   Host 1 masquerading as host 2 

 
        --[ The Figures ]-- 

 
        There are several figures in the paper and they are to be 
interpreted as per the following example: 

tick    host a      control     host b 
1       A       ---SYN--->      B 

tick:   A tick of time.  There is no distinction made as to *how* 
much time passes between ticks, just that time passes.  It's generally 
not a great deal. 
host a: A machine participating in a TCP-based conversation. 
control: This field shows any relevant control bits set in the TCP 
header and the direction the data is flowing. 
host b: A machine participating in a TCP-based conversation. 

In this case, at the first referenced point in time host a is sending 
a TCP segment to host b with the SYN bit on.  Unless stated, we are 
generally not concerned with the data portion of the TCP segment. 

 
        --[ Trust Relationships ]-- 

 
        In the Unix world, trust can be given all too easily.  Say you 
have an account on machine A, and on machine B.  To facilitate going 
betwixt the two with a minimum amount of hassle, you want to set up a 
full-duplex trust relationship between them.  In your home directory 
at A you create a .rhosts file: `echo "B username" > ~/.rhosts`.  In 
your home directory at B you create a .rhosts file: `echo "A username" 
> ~/.rhosts`.  (Alternately, root can set up similar rules in 
/etc/hosts.equiv, the difference being that the rules are hostwide, 
rather than just on an individual basis.)  Now, you can use any of the 
r* commands without that annoying hassle of password authentication. 
These commands will allow address-based authentication, which will 
grant or deny access based off of the IP address of the service 
requestor. 

 
        --[ Rlogin ]-- 

 
        Rlogin is a simple client-server based protocol that uses TCP 
as its transport.  Rlogin allows a user to login remotely from one 
host to another, and, if the target machine trusts the other, rlogin 
will allow the convenience of not prompting for a password.  It will 
instead have authenticated the client via the source IP address.  So, 
from our example above, we can use rlogin to remotely login to A from 
B (or vice-versa) and not be prompted for a password. 

 
        --[ Internet Protocol ]-- 

 
        IP is the connectionless, unreliable network protocol in the 
TCP/IP suite.  It has two 32-bit header fields to hold address 
information.  IP is also the busiest of all the TCP/IP protocols as 
almost all TCP/IP traffic is encapsulated in IP datagrams.  IP's job 
is to route packets around the network.  It provides no mechanism for 
reliability or accountability; for that, it relies on the upper 
layers.  IP simply sends out datagrams and hopes they make it intact. 
If they don't, IP can try to send an ICMP error message back to the 
source; however, this packet can get lost as well.  (ICMP is Internet 
Control Message Protocol and it is used to relay network conditions 
and different errors to IP and the other layers.)  IP has no means to 
guarantee delivery.  Since IP is connectionless, it does not maintain 
any connection state information.  Each IP datagram is sent out without 
regard to the last one or the next one.  This, along with the fact that 
it is trivial to modify the IP stack to allow an arbitrarily chosen IP 
address in the source (and destination) fields, makes IP easily subverted. 
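
As a rough illustration of those two 32-bit address fields, here is a
minimal Python sketch that unpacks a fixed 20-byte IPv4 header (no
options assumed).  The sample header and the 192.0.2.x addresses are
made up for the example; the point is only that the source address is
a plain field that the parser reads back exactly as written.

```python
import socket
import struct

# Fixed 20-byte IPv4 header (no options), network byte order:
# ver/IHL, TOS, total length, ID, flags/frag, TTL, protocol, checksum,
# source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")

def parse_ipv4_addresses(header: bytes) -> tuple[str, str]:
    """Return (source, destination) as dotted-quad strings."""
    fields = IPV4_HEADER.unpack(header[:20])
    return socket.inet_ntoa(fields[8]), socket.inet_ntoa(fields[9])

# Hypothetical sample header: version 4, IHL 5, TTL 64, protocol 6 (TCP).
# The checksum is left at 0 because this sketch never validates it.
sample = IPV4_HEADER.pack(
    0x45, 0, 40, 0, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.10"),
    socket.inet_aton("192.0.2.20"),
)

print(parse_ipv4_addresses(sample))
```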

 
        --[ Transmission Control Protocol ]-- 

 
        TCP is the connection-oriented, reliable transport protocol 
in the TCP/IP suite.  Connection-oriented simply means that the two 
hosts participating in a discussion must first establish a connection 
before data may change hands.  Reliability is provided in a number of 
ways but the only two we are concerned with are data sequencing and 
acknowledgement.  TCP assigns sequence numbers to every segment and 
acknowledges any and all data segments received from the other end. 
(A pure ACK does not consume a sequence number, and is not itself ACK'd.) 
This reliability makes TCP harder to fool than IP. 

 
        --[ Sequence Numbers, Acknowledgements and other flags ]-- 

 
        Since TCP is reliable, it must be able to recover from 
lost, duplicated, or out-of-order data.  By assigning a sequence 
number to every byte transferred, and requiring an acknowledgement from 
the other end upon receipt, TCP can guarantee reliable delivery.  The 
receiving end uses the sequence numbers to ensure proper ordering of 
the data and to eliminate duplicate data bytes. 
        TCP sequence numbers can simply be thought of as 32-bit 
counters.  They range from 0 to 4,294,967,295.  Every byte of 
data exchanged across a TCP connection (along with certain flags) 
is sequenced.  The sequence number field in the TCP header will 
contain the sequence number of the *first* byte of data in the 
TCP segment.  The acknowledgement number field in the TCP header 
holds the value of next *expected* sequence number, and also 
acknowledges *all* data up through this ACK number minus one. 
        TCP uses the concept of window advertisement for flow 
control.  It uses a sliding window to tell the other end how much 
data it can buffer.  Since the window size is 16 bits, a receiving TCP 
can advertise up to a maximum of 65535 bytes.  Window advertisement 
can be thought of as an advertisement from one TCP to the other of how 
high acceptable sequence numbers can be. 
        Other TCP header flags of note are RST (reset), PSH (push) 
and FIN (finish).  If a RST is received, the connection is 
immediately torn down.  RSTs are normally sent when one end 
receives a segment that just doesn't jibe with the current connection 
(we will encounter an example below).  The PSH flag tells the 
receiver to pass all the data it has queued to the application, as 
soon as possible.  The FIN flag is the way an application begins a 
graceful close of a connection (connection termination is a 4-way 
process). When one end receives a FIN, it ACKs it, and does not 
expect to receive any more data (sending is still possible, however). 
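
The sequence and window arithmetic above can be sketched in a few
lines of Python.  The function names are mine, not from any TCP
implementation.  The sketch assumes the flags that occupy sequence
space are SYN and FIN, and that all comparisons are done modulo 2^32
so the 32-bit counter may wrap.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit counters

def next_expected(seq: int, data_len: int,
                  syn: bool = False, fin: bool = False) -> int:
    """ACK number a receiver returns after accepting this segment.

    Every data byte takes one sequence number; SYN and FIN take one each.
    """
    return (seq + data_len + int(syn) + int(fin)) % SEQ_MOD

def in_window(seq: int, rcv_nxt: int, window: int) -> bool:
    """True if seq lies in [rcv_nxt, rcv_nxt + window), wraparound included."""
    return (seq - rcv_nxt) % SEQ_MOD < window

print(next_expected(1000, 100))             # 100 data bytes starting at 1000
print(next_expected(SEQ_MOD - 1, 0, syn=True))  # a SYN at the top of the space
print(in_window(5, SEQ_MOD - 6, 65535))     # window that straddles the wrap
```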

 
        --[ TCP Connection Establishment ]-- 

 
        In order to exchange data using TCP, hosts must establish a 
connection.  TCP establishes a connection in a 3 step process called 
the 3-way handshake.  If machine A is running an rlogin client and 
wishes to connect to an rlogin daemon on machine B, the process is as 
follows: 

                fig(1) 

1       A       ---SYN--->      B 

2       A    <---SYN/ACK---     B 

3       A       ---ACK--->      B 

 
At (1) the client is telling the server that it wants a connection. 
This is the SYN flag's only purpose.  The client is telling the 
server that the sequence number field is valid, and should be checked. 
The client will set the sequence number field in the TCP header to 
its ISN (initial sequence number).  The server, upon receiving this 
segment (2) will respond with its own ISN (therefore the SYN flag is 
on) and an ACKnowledgement of the client's first segment (which is the 
client's ISN+1).  The client then ACK's the server's ISN (3).  Now, 
data transfer may take place. 
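
To make the numbers in fig(1) concrete, here is a small Python
sketch that fills in the sequence and acknowledgement fields for the
three segments.  The dictionary layout and the ISN values in the
usage line are invented for illustration.

```python
SEQ_MOD = 2 ** 32

def handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the seq/ack fields of the three handshake segments."""
    syn = {"from": "A", "flags": "SYN",
           "seq": client_isn, "ack": None}
    # The SYN occupies one sequence number, so B ACKs client_isn + 1.
    syn_ack = {"from": "B", "flags": "SYN/ACK",
               "seq": server_isn, "ack": (client_isn + 1) % SEQ_MOD}
    # A ACKs B's SYN in the same way: server_isn + 1.
    ack = {"from": "A", "flags": "ACK",
           "seq": (client_isn + 1) % SEQ_MOD,
           "ack": (server_isn + 1) % SEQ_MOD}
    return [syn, syn_ack, ack]

for tick, segment in enumerate(handshake(1000, 5000), start=1):
    print(tick, segment)
```

Note that the final ACK is only acceptable to B if it carries B's
ISN plus one, which is a value the client normally learns from the
SYN/ACK it receives.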

 
        --[ The ISN and Sequence Number Incrementation ]-- 

 
        It is important to understand how sequence numbers are 
initially chosen, and how they change with respect to time.  The 
initial sequence number when a host is bootstrapped is initialized 
to 1. (TCP actually calls this variable 'tcp_iss' as it is the initial 
*send* sequence number.  The other sequence number variable, 
'tcp_irs' is the initial *receive* sequence number and is learned 
during the 3-way connection establishment.  We are not going to worry 
about the distinction.)  This practice is wrong, and is acknowledged 
as such in a comment in the tcp_init() function where it appears.  The ISN 
is incremented by 128,000 every second, which causes the 32-bit ISN 
counter to wrap every 9.32 hours if no connections occur.  However, 
each time a connect() is issued, the counter is incremented by 
64,000. 
        One important reason behind this predictability is to 
minimize the chance that data from an older stale incarnation 
(that is, from the same 4-tuple of the local and remote 
IP-addresses and TCP ports) of the current connection could arrive 
and foul things up.  The concept of the 2MSL wait time applies 
here, but is beyond the scope of this paper.  If sequence 
numbers were chosen at random when a connection arrived, no 
guarantees could be made that the sequence numbers would be different 
from a previous incarnation.  If some data that was stuck in a 
routing loop somewhere finally freed itself and wandered into the new 
incarnation of its old connection, it could really foul things up. 
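
The increment rule described above is simple enough to model
directly.  The following Python sketch only restates the arithmetic
from the text (128,000 per second, 64,000 per connect, 32-bit wrap);
the function names are mine and nothing here is taken from the BSD
source.

```python
SEQ_MOD = 2 ** 32
PER_SECOND = 128_000   # ISN increment per second, per the text
PER_CONNECT = 64_000   # additional increment per connect(), per the text

def wrap_hours() -> float:
    """Hours for the counter to wrap if no connections occur."""
    return SEQ_MOD / PER_SECOND / 3600

def isn_after(base_isn: int, seconds: int, connects: int = 0) -> int:
    """Value of the ISN counter `seconds` later, after `connects` connects."""
    return (base_isn + PER_SECOND * seconds + PER_CONNECT * connects) % SEQ_MOD

print(round(wrap_hours(), 2))   # matches the 9.32 hours quoted above
print(isn_after(1, 10, 2))      # counter 10 seconds and 2 connects after boot
```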

 
        --[ Ports ]-- 

 
        To grant simultaneous access to the TCP module, TCP provides 
a user interface called a port.  Ports are used by the kernel to 
identify network processes.  These are strictly transport layer 
entities (that is to say that IP couldn't care less about them). 
Together with an IP address, a TCP port provides an endpoint 
for network communications.  In fact, at any given moment *all* 
Internet connections can be described by 4 numbers: the source IP 
address and source port and the destination IP address and destination 
port.  Servers are bound to 'well-known' ports so that they may be 
located on a standard port on different systems.  For example, the 
rlogin daemon sits on TCP port 513. 
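
As a small illustration of the 4-tuple idea, the Python sketch below
keys a table of connection states on (source IP, source port,
destination IP, destination port).  The class name, the addresses and
the state strings are placeholders chosen for the example.

```python
from typing import NamedTuple

class ConnectionId(NamedTuple):
    """The four numbers that identify one TCP connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

connections: dict[ConnectionId, str] = {}

# Two connections from the same client host to the same rlogin daemon
# (port 513) are still distinct, because their source ports differ.
first = ConnectionId("192.0.2.10", 1023, "192.0.2.20", 513)
second = ConnectionId("192.0.2.10", 1022, "192.0.2.20", 513)
connections[first] = "ESTABLISHED"
connections[second] = "SYN_RCVD"

print(len(connections))
```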

 
                [SECTION II.  THE ATTACK] 

 
        ...The devil finds work for idle hands.... 

 
        --[ Briefly... ]-- 

 
        IP-spoofing consists of several steps, which I will 
briefly outline here, then explain in detail.  First, the target host 
is chosen.  Next, a pattern of trust is discovered, along with a 
trusted host.  The trusted host is then disabled, and the target's TCP 
sequence numbers are sampled.  The trusted host is impersonated, the 
sequence numbers guessed, and a connection attempt is made to a 
service that only requires address-based authentication.  If 
successful, the attacker executes a simple command to leave a 
backdoor. 

 
        --[ Needful Things ]-- 

 
        There are a couple of things one needs to wage this attack: 

                (1) brain, mind, or other thinking device 
                (1) target host 
                (1) trusted host 
                (1) attacking host (with root access) 
                (1) IP-spoofing software 

Generally the attack is made from the root account on the attacking 
host against the root account on the target.  If the attacker is 
going to all this trouble, it would be stupid not to go for root. 
(Since root access is needed to wage the attack, this should not 
be an issue.) 

 
        --[ IP-Spoofing is a 'Blind Attack' ]-- 

 
        One often overlooked, but critical factor in IP-spoofing 
is the fact that the attack is blind.  The attacker is going to be 
taking over the identity of a trusted host in order to subvert the 
security of the target host.  The trusted host is disabled using the 
method described below.  As far as the target knows, it is carrying on 
a conversation with a trusted pal.  In reality, the attacker is 
sitting off in some dark corner of the Internet, forging packets 
purportedly from this trusted host while it is locked up in a denial 
of service battle.  The IP datagrams sent with the forged IP-address 
reach the target fine (recall that IP is a connectionless 
protocol--  each datagram is sent without regard for the other end) 
but the datagrams the target sends back (destined for the trusted 
host) end up in the bit-bucket.  The attacker never sees them.  The 
intervening routers know where the datagrams are supposed to go.  They 
are supposed to go to the trusted host.  As far as the network layer is 
concerned, this is where they originally came from, and this is where 
responses should go.  Of course once the datagrams are routed there, 
and the information is demultiplexed up the protocol stack, and 
reaches TCP, it is discarded (the trusted host's TCP cannot respond-- 
see below).  So the attacker has to be smart and *know* what was sent, 
and *know* what response the server is looking for.  The attacker 
cannot see what the target host sends, but she can *predict* what it 
will send; that prediction, coupled with the knowledge of what the 
server expects in reply, allows the attacker to work around this blindness. 

 
        --[ Patterns of Trust ]-- 

 
        After a target is chosen the attacker must determine the 
patterns of trust (for the sake of argument, we are going to assume 
the target host *does* in fact trust somebody.  If it didn't, the 
attack would end here).  Figuring out who a host trusts may or may 
not be easy.  A 'showmount -e' may show where filesystems are 
exported, and rpcinfo can give out valuable information as well. 
If enough background information is known about the host, it should 
not be too difficult.  If all else fails, trying neighboring IP 
addresses in a brute force effort may be a viable option. 

 
        --[ Trusted Host Disabling Using the Flood of Sins ]-- 

 
        Once the trusted host is found, it must be disabled.  Since 
the attacker is going to impersonate it, she must make sure this host 
cannot receive any network traffic and foul things up.  There are 
many ways of doing this; the one I am going to discuss is TCP SYN 
flooding. 
        A TCP connection is initiated with a client issuing a 
request to a server with the SYN flag on in the TCP header.  Normally 
the server will issue a SYN/ACK back to the client identified by the 
32-bit source address in the IP header.  The client will then send an 
ACK to the server (as we saw in figure 1 above) and data transfer 
can commence.  There is an upper limit of how many concurrent SYN 
requests TCP can process for a given socket, however.  This limit 
is called the backlog, and it is the length of the queue where 
incoming (as yet incomplete) connections are kept.  This queue limit 
applies to both the number of incomplete connections (the 3-way 
handshake is not complete) and the number of completed connections 
that have not been pulled from the queue by the application by way of 
the accept() system call.  If this backlog limit is reached, TCP will 
silently discard all incoming SYN requests until the pending 
connections can be dealt with.  Therein lies the attack. 
        The attacking host sends several SYN requests to the TCP port 
she desires disabled.  The attacking host also must make sure that 
the source IP-address is spoofed to be that of another, currently 
unreachable host (the target TCP will be sending its response to 
this address).  (IP may inform TCP that the host is unreachable, 
but TCP considers these errors to be transient and leaves the 
resolution of them up to IP (reroute the packets, etc), effectively 
ignoring them.)  The IP-address must be unreachable because the 
attacker does not want any host to receive the SYN/ACKs that will be 
coming from the target TCP (this would result in a RST being sent to 
the target TCP, which would foil our attack).  The process is as 
follows: 

                fig(2) 

1       Z(x)    ---SYN--->      B 

        Z(x)    ---SYN--->      B 

        Z(x)    ---SYN--->      B 

        Z(x)    ---SYN--->      B 

        Z(x)    ---SYN--->      B 

                ... 

2       X    <---SYN/ACK---     B 

        X    <---SYN/ACK---     B 

                ... 

3       X      <---RST---       B 

 
At (1) the attacking host sends a multitude of SYN requests to the 
target (remember the target in this phase of the attack is the 
trusted host) to fill its backlog queue with pending connections. 
(2) The target responds with SYN/ACKs to what it believes is the 
source of the incoming SYNs.  During this time all further requests 
to this TCP port will be ignored. 
        Different TCP implementations have different backlog sizes. 
BSD generally has a backlog of 5 (Linux has a backlog of 6).  There 
is also a 'grace' margin of 3/2.  That is, TCP will allow up to 
backlog*3/2+1 connections.  This will allow a socket one connection 
even if it calls listen with a backlog of 0. 
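
The 'grace' margin works out as follows.  This Python sketch assumes
the expression is evaluated with C-style integer division, which is
my reading of the formula rather than something the text states; the
function name is hypothetical.

```python
def max_queued(backlog: int) -> int:
    """Connections allowed on the queue for a given listen() backlog.

    Implements backlog*3/2+1, assuming integer (truncating) division.
    """
    return backlog * 3 // 2 + 1

for backlog in (0, 5, 6):
    print(backlog, max_queued(backlog))
```

With a backlog of 0 the result is 1, which is the single connection
mentioned above.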

        AuthNote: [For a much more in-depth treatment of TCP SYN 
flooding, see my definitive paper on the subject.  It covers the 
whole process in detail, in both theory, and practice.  There is 
robust working code, a statistical analysis, and a lengthy paper. 
Look for it in issue 49 of Phrack. -daemon9 6/96] 

 
        --[ Sequence Number Sampling and Prediction ]-- 

 
        Now the attacker needs to get an idea of where in the 32-bit 
sequence number space the target's TCP is.  The attacker connects to 
a TCP port on the target (SMTP is a good choice) just prior to launching 
the attack and completes the three-way handshake.  The process is 
exactly the same as fig(1), except that the attacker will save the 
value of the ISN sent by the target host.  Often times, this process is 
repeated several times and the final ISN sent is stored.  The attacker 
needs to get an idea of what the RTT (round-trip time) from the target 
to her host is like.  (The process can be repeated several times, and an 
average of the RTT's is calculated.)  The RTT is necessary in being 
able to accurately predict the next ISN.  The attacker has the baseline 
(the last ISN sent) and knows how the sequence numbers are incremented 
(128,000/second and 64,000 per connect) and now has a good idea of 
how long it will take an IP datagram to travel across the Internet to 
reach the target (approximately half the RTT, as most times the 
routes are symmetrical).  After the attacker has this information, she 
immediately proceeds to the next phase of the attack (if another TCP 
connection were to arrive on any port of the target before the 
attacker was able to continue the attack, the ISN predicted by the 
attacker would be off by 64,000). 
        When the spoofed segment makes its way to the target, 
several different things may happen depending on the accuracy of 
the attacker's prediction: 
- If the sequence number is EXACTly where the receiving TCP expects 
it to be, the incoming data will be placed on the next available 
position in the receive buffer. 
- If the sequence number is LESS than the expected value the data 
byte is considered a retransmission, and is discarded. 
- If the sequence number is GREATER than the expected value but 
still within the bounds of the receive window, the data byte is 
considered to be a future byte, and is held by TCP, pending the 
arrival of the other missing bytes.  If a segment arrives with a 
sequence number GREATER than the expected value and NOT within the 
bounds of the receive window the segment is dropped, and TCP will 
send a segment back with the *expected* sequence number. 
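
The receiver-side cases listed above can be written as one small
decision function.  This is a simplified Python sketch, not the logic
of any particular TCP stack: it assumes "less than" is decided by
treating the lower half of the 32-bit space ahead of the expected
value as "greater" and the other half as "less", and the return
strings are labels of my own.

```python
SEQ_MOD = 2 ** 32

def classify(seq: int, rcv_nxt: int, window: int) -> str:
    """Classify an incoming sequence number against the receiver's state.

    rcv_nxt is the next sequence number the receiver expects;
    window is the size of its receive window in bytes.
    """
    offset = (seq - rcv_nxt) % SEQ_MOD
    if offset == 0:
        return "accept"              # exactly where expected
    if offset < window:
        return "hold"                # future data inside the window
    if offset >= SEQ_MOD // 2:
        return "discard"             # behind rcv_nxt: a retransmission
    return "drop-and-ack"            # ahead of the window: drop, re-ACK

print(classify(1000, 1000, 4096))
print(classify(900, 1000, 4096))
print(classify(2000, 1000, 4096))
print(classify(100000, 1000, 4096))
```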

 
        --[ Subversion... ]-- 

 
        Here is where the main thrust of the attack begins: 

                fig(3) 

1       Z(b)    ---SYN--->      A 

2       B     <---SYN/ACK---    A 

3       Z(b)    ---ACK--->      A 

4       Z(b)    ---PSH--->      A 

                [...] 

 
The attacking host spoofs her IP address to be that of the trusted 
host (which should still be in the death-throes of the D.O.S. attack) 
and sends its connection request to port 513 on the target (1).  At 
(2), the target responds to the spoofed connection request with a 
SYN/ACK, which will make its way to the trusted host (which, if it 
*could* process the incoming TCP segment, would consider it an 
error, and immediately send a RST to the target).  If everything goes 
according to plan, the SYN/ACK will be dropped by the gagged trusted 
host.  After (1), the attacker must back off for a bit to give the 
target ample time to send the SYN/ACK (the attacker cannot see this 
segment).  Then, at (3) the attacker sends an ACK to the target with 
the predicted sequence number (plus one, because we're ACKing it). 
If the attacker is correct in her prediction, the target will accept 
the ACK.  The target is compromised and data transfer can 
commence (4). 
        Generally, after compromise, the attacker will insert a 
backdoor into the system that will allow a simpler way of intrusion. 
(Often a `cat + + >> ~/.rhosts` is done.  This is a good idea for 
several reasons: it is quick, allows for simple re-entry, and is not 
interactive.  Remember the attacker cannot see any traffic coming from 
the target, so any responses are sent off into oblivion.) 

 
        --[ Why it Works ]-- 

 
        IP-Spoofing works because trusted services only rely on 
network address based authentication.  Since IP is easily duped, 
address forgery is not difficult.  The hardest part of the attack is 
in the sequence number prediction, because that is where the guesswork 
comes into play.  Reduce unknowns and guesswork to a minimum, and 
the attack has a better chance of succeeding.  Even a machine that 
wraps all its incoming TCP bound connections with Wietse Venema's TCP 
wrappers, is still vulnerable to the attack.  TCP wrappers rely on a 
hostname or an IP address for authentication... 

 
                [SECTION III. PREVENTIVE MEASURES] 

 
        ...A stitch in time, saves nine... 

 
        --[ Be Un-trusting and Un-trustworthy ]-- 

 
        One easy solution to prevent this attack is not to rely 
on address-based authentication.  Disable all the r* commands, 
remove all .rhosts files and empty out the /etc/hosts.equiv file. 
This will force all users to use other means of remote access 
(telnet, ssh, skey, etc). 
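
As an aid to that cleanup, here is a hedged Python sketch that flags
wildcard entries in the text of a .rhosts or hosts.equiv file.  It
assumes the simple "host [user]" line format used earlier in this
paper, treats a '+' in either field as a wildcard, and skips lines
starting with '#'; real implementations accept more syntax than this,
so treat it as a starting point only.  The function name is mine.

```python
def risky_entries(text: str) -> list[str]:
    """Return the lines whose host or user field is the '+' wildcard."""
    risky = []
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: blank lines and '#' lines carry no trust rules.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if "+" in fields[:2]:
            risky.append(stripped)
    return risky

sample = "B username\n+ +\n\nA +\n# + +\n"
print(risky_entries(sample))
```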

 
        --[ Packet Filtering ]-- 

 
        If your site has a direct connection to the Internet, you 
can use your router to help you out.  First make sure only hosts 
on your internal LAN can participate in trust-relationships (no 
internal host should trust a host outside the LAN).  Then simply 
filter out *all* traffic from the outside (the Internet) that 
purports to come from the inside (the LAN). 
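
The filtering rule itself is a one-line test.  The Python sketch
below expresses it with the standard ipaddress module; the
192.0.2.0/24 network stands in for the internal LAN and is purely an
assumption for the example, as is the function name.  A real router
would express the same rule in its own filter language.

```python
import ipaddress

# Hypothetical internal LAN; substitute the site's real prefix.
INTERNAL_NET = ipaddress.ip_network("192.0.2.0/24")

def should_drop(src_ip: str, arrived_on_external: bool) -> bool:
    """Drop packets that arrive from outside but claim an inside source."""
    claims_inside = ipaddress.ip_address(src_ip) in INTERNAL_NET
    return arrived_on_external and claims_inside

print(should_drop("192.0.2.7", arrived_on_external=True))     # forged source
print(should_drop("198.51.100.7", arrived_on_external=True))  # ordinary traffic
print(should_drop("192.0.2.7", arrived_on_external=False))    # genuine LAN host
```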

 
        --[ Cryptographic Methods ]-- 

 
        An obvious method to deter IP-spoofing is to require 
all network traffic to be encrypted and/or authenticated.  While 
several solutions exist, it will be a while before such measures are 
deployed as de facto standards. 

 
        --[ Initial Sequence Number Randomizing ]-- 

 
        Since the sequence numbers are not chosen randomly (or 
incremented randomly) this attack works.  Bellovin describes a 
fix for TCP that involves partitioning the sequence number space. 
Each connection would have its own separate sequence number space. 
The sequence numbers would still be incremented as before, however, 
there would be no obvious or implied relationship between the 
numbering in these spaces.  Suggested is the following formula: 

        ISN=M+F(localhost,localport,remotehost,remoteport) 

Where M is the 4 microsecond timer and F is a cryptographic hash. 
F must not be computable from the outside or the attacker could 
still guess sequence numbers.  Bellovin suggests F be a hash of 
the connection-id and a secret vector (a random number, or a host 
related secret combined with the machine's boot time). 
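
A minimal Python sketch of the formula above follows.  Several
choices here are mine and not Bellovin's: SHA-256 is used as the hash
F, the secret is 16 random bytes drawn at start-up, the connection-id
is encoded as a '|'-separated string, and the 4 microsecond timer is
derived from the wall clock.  It is meant to show the shape of the
computation, not to serve as a kernel-grade implementation.

```python
import hashlib
import os
import struct
import time

SEQ_MOD = 2 ** 32
SECRET = os.urandom(16)  # hypothetical per-boot secret, never disclosed

def timer_4us(now=None):
    """M: a counter that ticks once every 4 microseconds."""
    if now is None:
        now = time.time()
    return int(now * 250_000) % SEQ_MOD

def conn_hash(localhost, localport, remotehost, remoteport, secret=SECRET):
    """F: keyed hash of the connection-id, truncated to 32 bits."""
    conn_id = f"{localhost}|{localport}|{remotehost}|{remoteport}".encode()
    digest = hashlib.sha256(conn_id + secret).digest()
    return struct.unpack("!I", digest[:4])[0]

def isn(localhost, localport, remotehost, remoteport, now=None, secret=SECRET):
    """ISN = M + F(localhost, localport, remotehost, remoteport), mod 2^32."""
    offset = conn_hash(localhost, localport, remotehost, remoteport, secret)
    return (timer_4us(now) + offset) % SEQ_MOD

print(isn("192.0.2.20", 513, "192.0.2.10", 1023))
```

Within one connection-id the ISN still advances with the timer, so
the stale-incarnation protection is kept, while an outsider who
samples one connection learns nothing useful about another as long
as the secret stays secret.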

 
                [SECTION IV.  SOURCES] 

 
        -Books:         TCP/IP Illustrated vols. I, II & III 
        -RFCs:          793, 1825, 1948 
        -People:        W. Richard Stevens, and the users of the 
                        Information Nexus for proofreading 
        -Sourcecode:    rbone, mendax, SYNflood 

 
This paper made possible by a grant from the Guild Corporation.