We’ve talked about network latency several times in previous blog posts. Today, let’s dig a bit deeper into that topic, and especially how it influences application performance.
Network latency is the time it takes for a packet to travel across the network. It’s usually measured and reported as the Round Trip Time (RTT). An easy way to measure RTT is with the ping command. The traceroute (tracert under Windows) command goes a step further and reports the RTT for each hop between the local system and the remote system being tracerouted.
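Where ICMP is filtered, or a script needs a number rather than terminal output, timing a TCP connection setup gives a rough RTT estimate, because the connect call returns after one round trip. The Python sketch below shows that substitute technique, not ping itself. The function name `tcp_connect_time_ms` is made up for illustration, and the demo only connects to a listener on the loopback interface.

```python
import socket
import time


def tcp_connect_time_ms(host: str, port: int, timeout: float = 3.0) -> float:
    """Return how long the TCP connection setup to (host, port) took, in ms.

    Pass an IP address rather than a hostname, otherwise the DNS lookup
    is included in the measurement.
    """
    start = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=timeout)
    elapsed = time.perf_counter() - start
    sock.close()
    return elapsed * 1000.0


if __name__ == "__main__":
    # Self-contained demo: connect to a listener on the loopback interface.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        ms = tcp_connect_time_ms("127.0.0.1", port)
        print(f"loopback connect time: {ms:.3f} ms")
```

Against a real server the result also includes a little processing time on both ends, so it is best read as an upper bound on the network RTT.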
The end-to-end latency between two systems depends on four factors:
- The bitrate. The average packet size on the internet is about 500 bytes, or 4000 bits. For instance, at 10 Mbps, 10,000,000 / 4000 = 2500 packets can be transmitted per second. So a 10 Mbps link adds 0.4 milliseconds (1000 ms / 2500 packets) to the RTT, a 100 Mbps link 0.04 ms and a 1 Gbps link just 4 microseconds.
- The speed of light. Light travels at about 200,000 kilometers per second through optical fibers, roughly two thirds of the speed of light in a vacuum. This means that every 100 km or 60 miles of distance a packet has to traverse adds half a millisecond to the one-way latency and thus 1 ms to the RTT.
- Processing. Routers along the way and the destination system each need time to process packets. These days, routers and hosts measure their packet handling capacity in millions of packets per second, but that doesn’t necessarily mean that handling an individual packet takes less than a microsecond. Still, for normal data packets, processing time is usually negligible. Special packets such as ping and traceroute packets and other ICMP (Internet Control Message Protocol) packets, however, may be processed on a slower path, so the RTT shown by these tools may not reflect the RTT incurred by regular data packets.
- Time spent in buffers. If packets come in faster than they can be transmitted, they’re buffered for some time. Queuing theory shows that the average queue length depends on the utilization of the link. In the simplest model, a link under 90% load has an average of 9 packets queued (0.9 / (1 - 0.9) = 9). At 99%, this is 99 packets, at 99.9% it’s 999 packets, and so on. This adds up quickly for slower links, but in the core of the internet, where 10 Gbps is a common speed, a queue of 999 packets is drained in less than a millisecond.
In other words: in most situations, the dominant factor that determines latency is the physical distance that packets have to travel. Obviously, not much can be done about the actual distance between the two systems that communicate, but sometimes packets are routed over much longer paths than necessary.
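The per-factor figures above can be reproduced with a few lines of arithmetic. The Python sketch below uses the same inputs as the text (4000-bit packets, 200,000 km/s in fiber, and the utilization / (1 - utilization) queue formula). The function names are made up for illustration.

```python
SPEED_IN_FIBER_KM_PER_S = 200_000  # roughly two thirds of c, as in the text


def serialization_delay_ms(packet_bits: int, link_bps: float) -> float:
    """Time needed to clock one packet onto a link of the given bitrate."""
    return packet_bits / link_bps * 1000.0


def propagation_rtt_ms(distance_km: float) -> float:
    """Round-trip propagation delay over fiber for a one-way distance."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_S * 1000.0


def average_queue_packets(utilization: float) -> float:
    """Average number of queued packets: utilization / (1 - utilization)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization)


def queue_drain_ms(packets: float, packet_bits: int, link_bps: float) -> float:
    """Time to transmit a queue of equally sized packets."""
    return packets * serialization_delay_ms(packet_bits, link_bps)


if __name__ == "__main__":
    print(serialization_delay_ms(4000, 10e6))  # 10 Mbps link: about 0.4 ms
    print(propagation_rtt_ms(100))             # 100 km of fiber: about 1 ms
    print(average_queue_packets(0.9))          # 90% load: about 9 packets
    print(queue_drain_ms(999, 4000, 10e9))     # 10 Gbps: about 0.4 ms
```

Plugging in a few thousand kilometers shows why distance dominates: 5000 km already costs about 50 ms of RTT, far more than the other terms on a fast, lightly loaded path.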
We now know where latency comes from. Let’s look at its effects.
The obvious effect of high latency is that it takes longer for a network request to be handled. In theory, if the RTT is 10 ms, it takes 5 ms for a request to flow from the client system to a server, and then 5 ms for the requested data to be returned. In practice, it’s longer, because most protocols need to send multiple packets back and forth before data can start to flow. For instance, TCP uses a three-way handshake: the client sends the first packet to the server. The server sends back an acknowledgment, and then a third packet from the client to the server is the first that can carry data in the form of (for instance) an HTTP request. The fourth packet in the exchange is then the first that can deliver the requested data to the client. With an RTT of 10 ms, this takes 20 ms, but if the RTT is 100 ms, all of this takes 200 ms. This is well above the value suggested by the rule of thumb that users start noticing latency of applications at 100 ms.
Also, before a TCP session can be opened, an application will almost always need to look up the destination server’s IP address in the Domain Name System (DNS). That adds at least one more round trip, so it’s important to avoid using a DNS server that is far away and has a high RTT. Some server software first does a reverse DNS lookup for the client’s IP address before it accepts a connection or processes a request. This can add significant delays for users, especially ones far away, while testers who come from previously seen IP addresses won’t see such a delay. Make sure that such reverse lookups, if needed for logging purposes, happen asynchronously and don’t block handling connections and requests.
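One way to keep a reverse lookup off the request path is to start it as a background task and reply without waiting for it. The following is a minimal Python sketch using `asyncio`, assuming Python 3.9 or later for `asyncio.to_thread`. The handler, the logging step and the in-memory `log` list are hypothetical stand-ins for whatever a real server does.

```python
import asyncio
import socket


async def reverse_lookup(ip: str, timeout: float = 2.0):
    """Return the reverse DNS name for ip, or None on failure or timeout."""
    try:
        # gethostbyaddr() blocks, so run it in a worker thread.
        host, _aliases, _addresses = await asyncio.wait_for(
            asyncio.to_thread(socket.gethostbyaddr, ip), timeout
        )
        return host
    except (OSError, asyncio.TimeoutError):
        return None


async def log_client(ip: str, log: list) -> None:
    """Hypothetical logging step: record the client IP with its name."""
    log.append((ip, await reverse_lookup(ip)))


async def handle_request(ip: str, log: list, background: set) -> str:
    """Hypothetical handler: reply at once, resolve the name in the background."""
    task = asyncio.create_task(log_client(ip, log))
    background.add(task)  # keep a reference until the task finishes
    task.add_done_callback(background.discard)
    return "response"  # returned without waiting for the lookup


async def main() -> None:
    log, background = [], set()
    print(await handle_request("127.0.0.1", log, background))
    await asyncio.gather(*background)  # only so the demo can print the log
    print(log)


if __name__ == "__main__":
    asyncio.run(main())
```

The response is produced before the lookup has even started, so a slow or unreachable DNS server only delays the log entry, not the user.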
Another source of application delays is the negotiation of security parameters. Setting up TLS/SSL takes several more round trips as encryption and hashing algorithms are negotiated, the identity of the server is verified by the client and a session encryption key is determined. Again, because developers and testers are usually relatively close to a system, the impact of the additional round trips is usually not obvious to them. But users far away will see a noticeable delay as a dozen or so packets need to go back and forth before actual data can be transferred.
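To get a feel for how these round trips add up, the setup cost can be estimated as a simple sum. The Python sketch below is a rough model, not a measurement: it ignores server processing time, and the number of TLS round trips is a parameter because it depends on the protocol version and on session reuse. The example value of 2 is an assumption that matches a full TLS 1.2 handshake.

```python
def time_to_first_byte_ms(rtt_ms: float, dns_rtt_ms: float = 0.0,
                          tls_round_trips: int = 0) -> float:
    """Estimate the delay before the first byte of the response arrives.

    Counts one round trip for the TCP handshake, tls_round_trips for the
    TLS negotiation, and one for the request and first response packet.
    dns_rtt_ms is the round trip to the DNS resolver, if a lookup is needed.
    """
    tcp_handshake = rtt_ms
    tls_negotiation = tls_round_trips * rtt_ms
    request_response = rtt_ms
    return dns_rtt_ms + tcp_handshake + tls_negotiation + request_response


if __name__ == "__main__":
    for rtt in (10, 100):
        plain = time_to_first_byte_ms(rtt)
        secure = time_to_first_byte_ms(rtt, dns_rtt_ms=20, tls_round_trips=2)
        print(f"RTT {rtt} ms: plain {plain} ms, with DNS and TLS {secure} ms")
```

With a 100 ms RTT, a 20 ms resolver and two TLS round trips, the estimate is 420 ms before any data arrives, compared with 60 ms for a user whose RTT is 10 ms.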
However, the delays aren’t over once the first data packet starts to flow.
A very simple way to communicate over a network is to send a data packet, wait for the other side to acknowledge that the packet has been received, and then send the next packet. This works well over short distances, but it quickly becomes unworkable over longer distances. Suppose the RTT is 50 ms, because the distance between the sender and receiver is about 5000 km (3000 miles). After sending a packet, the sender then waits 50 ms before receiving an acknowledgment and sending the next packet. So the transmission rate is one packet per RTT. This allows for just 20 packets per second across a continent or an ocean, or just 30 kilobytes per second with standard 1500-byte Ethernet packets!
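The one-packet-per-RTT limit is easy to check numerically. This small Python sketch (the function name is made up for illustration) computes the packet rate and byte rate of such a send-and-wait scheme, ignoring the time needed to transmit the packet itself.

```python
def stop_and_wait_throughput(rtt_s: float, packet_bytes: int = 1500):
    """Return (packets per second, bytes per second) at one packet per RTT."""
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    packets_per_second = 1.0 / rtt_s
    return packets_per_second, packets_per_second * packet_bytes


if __name__ == "__main__":
    pps, bytes_per_s = stop_and_wait_throughput(0.050)  # 50 ms RTT
    print(f"{pps:.0f} packets/s, {bytes_per_s / 1000:.0f} kilobytes/s")
```

Note that the link speed does not appear in the formula at all: with this scheme, a faster link does not help once the RTT dominates.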
So TCP’s approach is to try to figure out how many packets need to be “in flight” in order to fully utilize the available bandwidth. Suppose the available bandwidth is 100 Mbps. That’s about 8300 1500-byte packets per second, or 415 packets per 50 ms RTT. Thus, TCP sends 415 packets. Then it waits for the first packet to be acknowledged before sending the 416th, and so on, keeping a window of 415 packets in flight. In practice, TCP is continuously probing for available bandwidth, so it will inject more packets into the network than the available bandwidth can handle. At some point, one or more packets are lost, and TCP reduces its window size and, as a result, slows down its transmission rate.
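The window needed to keep a link full is the bandwidth multiplied by the RTT, often called the bandwidth-delay product. The Python sketch below expresses it in packets. The function name is made up for illustration, and the exact result for the example is about 417 packets, which the text rounds to 415.

```python
def window_packets(bandwidth_bps: float, rtt_s: float,
                   packet_bytes: int = 1500) -> float:
    """Packets that must be in flight to fill the link for one RTT."""
    bits_in_flight = bandwidth_bps * rtt_s  # bandwidth-delay product
    return bits_in_flight / (packet_bytes * 8)


if __name__ == "__main__":
    for rtt_ms in (10, 50, 100):
        packets = window_packets(100e6, rtt_ms / 1000)
        print(f"100 Mbps, {rtt_ms} ms RTT: {packets:.0f} packets in flight")
```

The required window grows linearly with the RTT, so a path that is twice as long needs twice as many unacknowledged packets to reach the same speed.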
In order to avoid massively overloading the network at the beginning of a session, TCP uses an algorithm called slow start. The name of the algorithm isn’t particularly well-chosen, as TCP doubles its window and thus its transmission rate every round trip, so it ramps up pretty quickly. The trouble is, it starts with a window size of only a few packets. As a result, it takes about five round trips for the window to reach the 415 packets needed to fill up a 100 Mbps connection with a 50 ms RTT. That’s 250 ms before TCP reaches full speed. This gets worse fast as RTT increases: with a 100 ms RTT, it now takes six round trips, but each round trip takes twice as long, so it’s 600 ms before TCP is running at full speed.
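The ramp-up can be modeled by doubling a window until it reaches the target. The Python sketch below is a simplified model of slow start that ignores losses and delayed acknowledgments. The initial window is an assumption: the default of 10 packets is a common setting in current TCP stacks, and the exact count of round trips depends on it. With 10 packets the model gives six and seven doublings for the two cases in the text, and with 16 packets it gives five and six.

```python
def slow_start_round_trips(target_window: int, initial_window: int = 10) -> int:
    """Count the doublings needed for the window to reach target_window."""
    if initial_window < 1 or target_window < 1:
        raise ValueError("windows must be at least 1 packet")
    window = initial_window
    round_trips = 0
    while window < target_window:
        window *= 2  # slow start doubles the window every round trip
        round_trips += 1
    return round_trips


def ramp_up_time_ms(target_window: int, rtt_ms: float,
                    initial_window: int = 10) -> float:
    """Time spent doubling before the window reaches target_window."""
    return slow_start_round_trips(target_window, initial_window) * rtt_ms


if __name__ == "__main__":
    for rtt_ms, target in ((50, 415), (100, 830)):
        for initial in (10, 16):
            trips = slow_start_round_trips(target, initial)
            total = ramp_up_time_ms(target, rtt_ms, initial)
            print(f"RTT {rtt_ms} ms, initial window {initial}: "
                  f"{trips} round trips, {total:.0f} ms")
```

Whatever the initial window, the pattern matches the text: doubling the RTT doubles the target window, which costs one extra round trip, and every round trip is also twice as long.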
The moral of the story: when dealing with any kind of interactive network application, it’s always helpful to keep latency as low as possible. Lower latency lets the data start flowing sooner and lets TCP reach its full speed faster. Also, high latency tends to make the problems caused by packet loss worse, and vice versa.