August 8, 2006
TCP is the Transmission Control Protocol, the Layer 4 protocol for communication over both wireline and wireless links. It is one of the most widely used protocols today. The key applications that define the web today, such as HTTP, email transfer and file transfer, all use TCP as the underlying transport protocol.
TCP offers a stream-oriented, reliable delivery service. Unusually for a transport layer protocol, it has no explicit support for QoS. Perhaps this is one of the reasons for its success: by eschewing features, it has managed to be simultaneously very efficient and very powerful. It is also designed to operate on top of any standard network layer protocol with as few assumptions as possible.
The key feature of TCP is a sophisticated scheme of congestion management and rate adaptation. There are multiple variants of TCP, each differing principally in how it implements these features.
For the definitive specification of the TCP protocol, see the standard in RFC 793. In the rest of this article we review the theory of TCP congestion avoidance and the different algorithms available for it.
Congestion collapse takes place when multiple sources start transmitting together at a constant rate. If the number of sources is sufficiently high, their combined rate will be greater than the clearing rate of the network. This leads to buffer overflows and packet drops. The dropped packets, in turn, lead to retransmissions, which only increase the load on the network. Eventually we reach a stage where almost no useful data gets through.
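To see that feedback loop in numbers, here is a tiny, hypothetical calculation (the rates are invented and the model is far cruder than any real stack): every dropped packet is simply retransmitted, so the drops in one round add to the offered load in the next.

    # Toy model of the congestion-collapse feedback loop (made-up numbers,
    # not real TCP behaviour): every drop is blindly retransmitted.
    clearing_rate = 100.0            # packets/sec the network can actually deliver
    for new_data in (80.0, 120.0):   # aggregate rate of fresh data from all sources
        offered = new_data
        for _ in range(10):
            drops = max(0.0, offered - clearing_rate)
            offered = new_data + drops   # retransmissions add to the next round's load
        print(new_data, "->", round(offered, 1))
    # 80.0  -> 80.0   (below the clearing rate, the load stays bounded)
    # 120.0 -> 320.0  (above it, the load grows every round and never settles)

Above the clearing rate there is no stable operating point: retransmissions beget drops, which beget more retransmissions.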
The job of the transmitter, then, is to discover a transmission rate such that the combined rate of all transmitters stays below the network clearing rate. The fact that this is even possible, with each sender acting only on its own local observations, comes from a very deep result in distributed autonomous control; see [Kleinrock95].
TCP is a sliding-window, asynchronous ARQ protocol. It is termed asynchronous because it does not use clocks explicitly for most operations, and it does not need clocks of very fine granularity. Rather, the arrival of a packet is treated as an event that triggers the next activity. This freedom from clocks keeps the processing load relatively low, and TCP has run successfully on five or six generations of computers, ranging from the 5 MHz VAX machines of the 1970s to modern-day embedded systems and teraflop supercomputers.
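As a rough illustration of the event-driven, sliding-window idea, here is a pedagogical sketch (the class, its names and the window size are inventions for this example, not the real TCP state machine): nothing runs off a timer; an arriving acknowledgment is the event that slides the window and triggers the next transmissions.

    # Minimal event-driven sliding-window sender (illustrative sketch only).
    class SlidingWindowSender:
        def __init__(self, segments, window=4):
            self.segments = segments   # data to send, already split into segments
            self.window = window       # maximum number of unacknowledged segments
            self.base = 0              # oldest unacknowledged segment
            self.next_seq = 0          # next segment to send

        def on_send_opportunity(self, send):
            # Keep sending as long as the window is not full.
            while (self.next_seq < len(self.segments)
                   and self.next_seq < self.base + self.window):
                send(self.next_seq, self.segments[self.next_seq])
                self.next_seq += 1

        def on_ack(self, ack_num, send):
            # A cumulative ACK slides the window forward; the arriving packet
            # is the event that drives the protocol, not a clock tick.
            if ack_num > self.base:
                self.base = ack_num
                self.on_send_opportunity(send)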
TCP Reno is currently the most widespread variant of TCP. It is derived from an earlier variant, TCP Tahoe, which was circulated as part of the BSD protocol stack. The TCP stacks in Solaris, Linux and Windows are all variants of TCP Reno.
TCP Reno uses two key techniques for congestion management. One is slow-start and the other is congestion avoidance.
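In outline (a simplified sketch counting whole segments rather than bytes, with an arbitrary initial ssthresh and fast retransmit/fast recovery reduced to a single line), the congestion window evolves roughly as follows:

    # Simplified sketch of slow start and congestion avoidance, in whole segments.
    # Real Reno works in bytes and has more machinery around fast recovery.
    class RenoWindow:
        def __init__(self):
            self.cwnd = 1.0        # congestion window, in segments
            self.ssthresh = 64.0   # slow-start threshold (arbitrary initial value)

        def on_ack(self):
            if self.cwnd < self.ssthresh:
                self.cwnd += 1.0               # slow start: roughly doubles per RTT
            else:
                self.cwnd += 1.0 / self.cwnd   # congestion avoidance: about +1 per RTT

        def on_timeout(self):
            self.ssthresh = max(self.cwnd / 2.0, 2.0)   # remember half the window
            self.cwnd = 1.0                             # restart in slow start

        def on_triple_dupack(self):
            self.ssthresh = max(self.cwnd / 2.0, 2.0)
            self.cwnd = self.ssthresh                   # Reno halves and keeps going

Slow start grows the window by one segment per ACK, roughly doubling it every round trip, until ssthresh is reached; congestion avoidance then adds about one segment per round trip; a loss halves the sender's estimate of the safe operating point.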
Standard TCP was designed for a network where the links were relatively reliable but the computers were slow and starved for memory. Link speeds were limited (the first Internet ran on 56 kb/s lines, which in the USA would have meant a noise margin of more than 40 dB), but the ability of the computers to transmit over them was slower still. Consequently, packet dropping was usually due to a lack of buffering capacity.
Standard TCP thus uses packet dropping as the signal of congestion. A drop shows up either as a gap in the sequence seen by the receiver or, in cases of heavy congestion, as the loss of an entire burst. The difference between the two cases lies in the feedback available. A gap in the received sequence can be signalled back to the transmitter through duplicate acknowledgments, but the loss of a whole burst can only be detected autonomously by the transmitter through a retransmission timeout (RTO). One of the biggest challenges in TCP protocol design is to compute the RTO wait period correctly, taking into account queueing delay, which can vary substantially and will often dominate the link delay.
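One widely used answer is the smoothed round-trip-time estimator standardized in RFC 2988 (Jacobson's algorithm). The sketch below follows the RFC's constants but omits the clock-granularity term and the upper bound on the timer.

    # Sketch of the smoothed RTO estimator from RFC 2988 (Jacobson's algorithm).
    # Clock granularity and the maximum RTO clamp are omitted for brevity.
    ALPHA, BETA, K, MIN_RTO = 1 / 8, 1 / 4, 4, 1.0   # constants from the RFC (seconds)

    class RtoEstimator:
        def __init__(self):
            self.srtt = None       # smoothed round-trip time
            self.rttvar = None     # round-trip time variation

        def on_rtt_sample(self, r):
            if self.srtt is None:  # first measurement
                self.srtt = r
                self.rttvar = r / 2
            else:                  # update the variation before the mean, as in the RFC
                self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
                self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
            return max(MIN_RTO, self.srtt + K * self.rttvar)

Because the estimate tracks both the mean and the variation of the measured round-trip time, the timeout inflates automatically when queueing delay becomes large and variable.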