Sunday, April 21, 2013

Connecting to a remote website: Explained


When we access a URL from a browser, what actually happens? Let us examine it step by step.

On the client side, the URL is parsed to extract a hostname, and the hostname is translated into an IP address. The browser is assigned an unprivileged port (for example, TCP port 14000) for the connection. An HTTP message is constructed for the web server. It will be encapsulated in a TCP segment, wrapped in an IP packet, and sent to the web server listening on port 80.
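The parsing and lookup step can be sketched in Python as follows. This is only an illustration: the helper names host_and_port and resolve and the DEFAULT_PORTS table are assumptions made for this example, and a real browser does considerably more (caching, proxies, IPv6 preference and so on).

```python
import socket
from urllib.parse import urlsplit

# Well-known default ports, used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}

def host_and_port(url):
    """Split a URL into the hostname and the TCP port to connect to."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port

def resolve(hostname, port):
    """Translate a hostname into IP addresses (needs a working resolver)."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return [info[4][0] for info in infos]

print(host_and_port("http://www.google.co.in/"))
```

resolve() is defined but not called here, since its result depends on the DNS configuration of the machine that runs it.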

First, though, a TCP connection needs to be established with the server using a three-way handshake:

When the client program (say, a browser) sends its first connection message, the SYN flag is set and is accompanied by a synchronization sequence number. This sequence number is the starting point for numbering all the data bytes the client will send.

On the server machine, the kernel receives the SYN on behalf of the server. The server is listening on port 80, and the SYN (the connection synchronization request flag) signals an incoming connection request from the source socket pair (client IP address, 14000). The server waits for connections in accept(). A new socket is allocated on the server end (web server IP address, 80) and associated with the client socket.

The kernel on the server machine sends an acknowledgment (ACK) of the SYN message sent by the client, along with its own synchronization request (SYN). This combined message is the SYN/ACK. The connection is now half open.

Along with the ACK flag, the server includes the client's sequence number incremented by one. The purpose of the ACK flag is to acknowledge the data the client identified by its sequence number: the sequence number plus one is the next data byte the server expects to receive. The client is now free to discard its original SYN message, since the server has acknowledged receipt of it.

The server also sets the SYN flag in its first message. As in the client's first message, the SYN flag is accompanied by a synchronization sequence number. The server is passing along its own starting sequence number for its half of the connection.
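The "sequence number plus one" rule can be checked with a few lines of Python. This is a small sketch: the function name ack_for_syn is made up for this example, and the sample value is the client's starting sequence number from the tcpdump capture shown later in this post.

```python
# TCP sequence numbers are 32-bit values, so arithmetic wraps around at 2**32.
SEQ_MODULUS = 2 ** 32

def ack_for_syn(peer_initial_seq):
    """Acknowledgment number that answers a SYN.

    The SYN itself consumes one sequence number, so the next byte
    expected from the peer is its initial sequence number plus one.
    """
    return (peer_initial_seq + 1) % SEQ_MODULUS

client_isn = 3247485552  # seq carried by the client's SYN in the capture
print(ack_for_syn(client_isn))
```

The printed value matches the ack field of the server's SYN/ACK line in that capture (3247485553).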

The first message is the only message the server will send with the SYN flag set. This and all subsequent messages have the ACK flag set. The presence of the ACK flag in every server message, compared with its absence in the client's first message, is a critical difference.

The client machine receives the SYN/ACK message sent by the server and replies with its own acknowledgment (ACK). The TCP three-way handshake is now complete and the connection is ESTABLISHED.

From here on, both the client and server set the ACK flag. The SYN flag won't be set by either program.

The TCP three-way handshake can be summarized as follows:

1) Client end
    Client sends SYN

2) Server end
    Kernel receives SYN for the listening server
    Server waits in accept()
    Kernel sends SYN/ACK to the client

3) Client end
    Client sends ACK

TCP connection is ESTABLISHED.

Now the first HTTP request data is passed from the client to the server.
On the server end, the kernel hands the established connection to the server through its accept() call.
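The same sequence can be tried on the loopback interface with Python's socket module. This is a minimal sketch rather than a real web server: the function name loopback_handshake, the use of 127.0.0.1 with a kernel-chosen port instead of port 80, and the tiny HEAD request are all assumptions made for this example.

```python
import socket

def loopback_handshake():
    # Server side: bind to a free port on loopback and start listening.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the kernel pick a free port
    server.listen(1)
    listening_addr = server.getsockname()

    # Client side: connect() returns once the three-way handshake is done.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listening_addr)

    # accept() hands the established connection to the server program.
    conn, peer = server.accept()

    # First request data travels from the client to the server.
    client.sendall(b"HEAD / HTTP/1.1\r\nHost: localhost\r\n\r\n")
    request = conn.recv(1024)

    result = {
        "listening_socket": listening_addr,
        "server_socket": conn.getsockname(),
        "client_socket": client.getsockname(),
        "peer_seen_by_server": peer,
        "request": request,
    }
    for s in (conn, client, server):
        s.close()
    return result

info = loopback_handshake()
print(info["client_socket"], "->", info["server_socket"])
```

The client's (IP address, port) pair reported by the client itself is the same pair the server sees as its peer, which is the socket pair described above.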

So, a TCP connection for a typical HTTP request looks as follows:

SYN (client->server), 
SYN/ACK (server->client), 
ACK (client->server) - TCP Handshake Complete. TCP Connection Established 

Now the request data is passed from client to server

HTTP Request Data (client->server)

Illustration with example

Let us illustrate this with the output of the tcpdump command.

Suppose I am accessing the website 74.125.236.183 (www.google.co.in) using the curl command as follows:


$ curl -IL 74.125.236.183
HTTP/1.1 200 OK
Date: Sun, 21 Apr 2013 07:48:49 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: NID=67=Y-udEdGqbMrB-O-d_Hw09tH2nhhZJ6LmLNt-LAGXVymAq1Fzejl5qFZFdK68DwpE6wxzVLZ7KJFTucQ2zIs6MLxD7KY3MiR2XBqcGvMrmBc5eUaMn-W6Tw7XStpL9QhI; expires=Mon, 21-Oct-2013 07:48:49 GMT; path=/; domain=.; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked

The transaction above can be captured with the tcpdump command. Start the capture before running curl:

# tcpdump -w google.pcap -i bond0 host 74.125.236.183

Press Ctrl+C to stop the capture once the curl command above has completed.

Examine the captured packets in the file google.pcap as follows:

Here, 192.168.1.33 is the client IP
      74.125.236.183 is the server IP

# tcpdump -nnr google.pcap

reading from file google.pcap, link-type EN10MB (Ethernet)
13:20:36.724978 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [S], seq 3247485552, win 14600, options [mss 1460,sackOK,TS val 1544059 ecr 0,nop,wscale 7], length 0
13:20:36.756286 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [S.], seq 3589360082, ack 3247485553, win 62392, options [mss 1430,sackOK,TS val 977680945 ecr 1544059,nop,wscale 6], length 0
13:20:36.756326 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 1, win 115, options [nop,nop,TS val 1544091 ecr 977680945], length 0

At this point, the TCP three-way handshake is complete and the TCP connection is established between the client and the server.

Once the TCP connection is established, the first request data is passed from the client to the server:
13:20:36.756402 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [P.], seq 1:171, ack 1, win 115, options [nop,nop,TS val 1544091 ecr 977680945], length 170

13:20:36.788779 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [.], ack 171, win 992, options [nop,nop,TS val 977680978 ecr 1544091], length 0
13:20:36.844585 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [P.], seq 1:598, ack 171, win 992, options [nop,nop,TS val 977681032 ecr 1544091], length 597
13:20:36.844603 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 598, win 124, options [nop,nop,TS val 1544179 ecr 977681032], length 0
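The seq and ack values in these lines can be checked by hand. Below is a small sketch of that arithmetic; the function name expected_ack is made up for this example, and it works on the relative sequence numbers that tcpdump prints after the handshake, so wrap-around is ignored.

```python
def expected_ack(seq, length, fin=False):
    """Acknowledgment number the receiver should send back.

    Each data byte consumes one sequence number, and a FIN
    consumes one more.
    """
    return seq + length + (1 if fin else 0)

print(expected_ack(1, 170))            # request: seq 1:171, length 170
print(expected_ack(1, 597))            # response: seq 1:598, length 597
print(expected_ack(171, 0, fin=True))  # client FIN at seq 171, no data
```

The three results (171, 598 and 172) are the ack values that appear in the capture in reply to the request, the response, and the client's FIN respectively.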

Data transfer is complete. The connection is now closed: each side sends a FIN, and the other side acknowledges it.

13:20:36.844773 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [F.], seq 171, ack 598, win 124, options [nop,nop,TS val 1544179 ecr 977681032], length 0
13:20:36.875132 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [F.], seq 598, ack 172, win 992, options [nop,nop,TS val 977681064 ecr 1544179], length 0
13:20:36.875166 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 599, win 124, options [nop,nop,TS val 1544209 ecr 977681064], length 0
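To read a capture like this more quickly, the Flags field of each line can be decoded with a short script. This is a rough sketch: the regular expression and the describe function are assumptions written for this example and only cover the line layout shown above. The sample lines are taken from the capture, with the options field left out for brevity.

```python
import re

# Single-character flags as printed by tcpdump; "." stands for ACK.
FLAG_NAMES = {"S": "SYN", "F": "FIN", "P": "PSH", "R": "RST", ".": "ACK"}

LINE_RE = re.compile(
    r"IP (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]+)\]"
)

def describe(line):
    """Return (source, destination, [flag names]) for one tcpdump line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    flags = [FLAG_NAMES.get(ch, ch) for ch in m.group("flags")]
    return m.group("src"), m.group("dst"), flags

sample = [
    "13:20:36.724978 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [S], seq 3247485552, win 14600, length 0",
    "13:20:36.756286 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [S.], seq 3589360082, ack 3247485553, win 62392, length 0",
    "13:20:36.756326 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 1, win 115, length 0",
    "13:20:36.844773 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [F.], seq 171, ack 598, win 124, length 0",
]

for line in sample:
    src, dst, flags = describe(line)
    print(src, "->", dst, "/".join(flags))
```

Run against the sample lines, this prints the SYN, SYN/ACK and ACK of the handshake, followed by the client's FIN/ACK that starts the close.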


A very interesting read

http://igoro.com/archive/what-really-happens-when-you-navigate-to-a-url/
