What Is Latency and How Do You Reduce It?
Latency Definition
Latency refers to the delay that happens between when a user takes an action on a network or web application and when they get a response. Another latency definition is the total time, or "round trip," needed for a packet of data to travel from its source to its destination.
Latency can be caused by a variety of factors and components within the network itself. Adding elements to a network can therefore increase latency.
Latency is typically measured in milliseconds. While it is possible to design a network where latency is reduced to just a few milliseconds, it is impossible to have a zero-latency network because of how data travels.
How Does Latency Work?
Latency occurs due to the distance between the user and key elements of the network, including their internal local-area network (LAN) and the internet or a privately managed wide-area network (WAN). As the user initiates a command on their device, several steps have to happen before the request gets fulfilled.
For example, when a user tries to add something to an online shopping cart, the following has to happen:
- The user adds the item to their cart.
- The browser the user is using sends a request to the servers of the website that has the shopping cart.
- The request travels to the site's server, carrying all the necessary information. Transmitting this information takes a certain amount of time, depending on how much data is being sent.
- The site's server receives the request, completing the first part of the latency cycle.
- The server then accepts or rejects the request and processes it. Each of these steps takes a certain amount of time, depending on the server's capabilities and the amount of data being processed.
- The site's server sends a reply to the user with the necessary information pertaining to the purchase.
- The user's browser receives the response, and the product appears in their cart. This completes the latency cycle.
If you add up all the increments of time, starting from when the user clicks on the button to add the item to their cart and when they see that it has been added, you get the total latency resulting from the request.
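To make this concrete, here is a minimal TypeScript sketch that times a single round trip from the browser. The `/cart/add` endpoint and the `itemId` payload are placeholders for illustration, not a real shopping-cart API:

```ts
// Minimal sketch: time a single round trip from the browser.
// The endpoint and payload are placeholders, not a real shopping-cart API.
async function measureRequestLatency(url: string): Promise<number> {
  const start = performance.now();          // timestamp before the request leaves
  await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ itemId: 42 }),   // hypothetical cart item
  });
  return performance.now() - start;         // total round trip in milliseconds
}

measureRequestLatency("/cart/add").then((ms) =>
  console.log(`Round-trip latency: ${ms.toFixed(1)} ms`)
);
```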
What Causes Network Latency?
One of the main causes of latency in a network is distance: specifically, the distance between the client devices making requests and the servers responding to them. In many cases, the client device is the computer or device the end user is using, but it can also be an intermediate device, positioned between the end user and the server they are trying to get information from.
For example, the latency between a firewall and a server receiving and sending data can be measured, in addition to the overall latency between the user’s request and when it is fulfilled.
Distance impacts latency because data has to travel from point A to point B, so the longer the distance, the greater the latency. For example, a request that originates in New York will experience more latency if it has to interact with a server in California than if it merely has to travel to Philadelphia. The difference could be as much as 40 milliseconds. This may not seem like a lot, but when instantaneous results to queries are necessary, 40 milliseconds, particularly when multiplied across several concurrent requests, can make a significant difference.
Furthermore, data often has to traverse several networks as it travels. Each network presents opportunities for more latency: the routers joining those networks have to process each packet, sometimes fragmenting it into smaller packets, before forwarding it to the next node. Every time this happens, it takes time.
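As a rough illustration, the total latency of a path can be thought of as the sum of the delay each hop contributes. The hop names and millisecond values below are invented for the example, not measurements:

```ts
// Illustrative only: path latency as the sum of per-hop delays.
// Hop names and delay values are invented numbers, not measurements.
const hops = [
  { name: "home router", delayMs: 1 },
  { name: "ISP edge router", delayMs: 5 },
  { name: "internet backbone", delayMs: 20 },
  { name: "destination network", delayMs: 4 },
];

const totalMs = hops.reduce((sum, hop) => sum + hop.delayMs, 0);
console.log(`Total one-way latency: ${totalMs} ms`); // 30 ms
```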
Transmission Medium
The transmission medium refers to the physical path that exists between where the data begins its journey and where it ends. The kind of transmission medium can affect latency. For example, using copper wiring instead of fiber optic cables can increase latency because an optic connection transmits data faster.
Propagation
Propagation refers to the time it takes for a packet of data to go from its source to the desired destination. As data travels greater distances, latency increases. However, the final latency also depends on the components the data passes through along the way.
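A back-of-the-envelope sketch shows how distance alone sets a floor on latency. It assumes signals in fiber travel at roughly two-thirds the speed of light, about 200,000 km/s, and that a New York-to-California path is around 4,000 km; both figures are approximations:

```ts
// Back-of-the-envelope propagation delay. Assumes signals in fiber travel
// at roughly two-thirds the speed of light (~200,000 km/s).
const SPEED_IN_FIBER_KM_PER_S = 200_000;

function propagationDelayMs(distanceKm: number): number {
  return (distanceKm / SPEED_IN_FIBER_KM_PER_S) * 1000;
}

// New York to California is roughly 4,000 km one way.
const oneWay = propagationDelayMs(4000);
console.log(`${oneWay} ms one way, ${oneWay * 2} ms round trip`); // 20 ms / 40 ms
```

This lines up with the roughly 40-millisecond coast-to-coast difference mentioned above: distance alone accounts for it before any processing delay is added.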
Routers
Because routers have to receive and forward data, the speed at which they do this has a significant impact on latency. In many networks, there are several routers working in a chain. Each one adds latency to the process.
Storage Delays
A storage network needs time to process information and send it to the device making the request. The specs of the storage network can therefore affect latency.
Latency vs. Bandwidth vs. Throughput—What Is the Difference?
Latency, throughput, and bandwidth are all connected, but they refer to different things. Bandwidth measures the maximum amount of data that can pass through a network at any given time. A network with 1 Gbps of bandwidth, for example, will often perform better than a network with only 10 Mbps of bandwidth.
Throughput refers to how much data can pass through, on average, over a specific period of time. Throughput is impacted by latency, so there may not be a linear relationship between how much bandwidth a network has and how much throughput it is capable of producing. For example, a network with high bandwidth may have components that process their various tasks slowly, while a lower-bandwidth network may have faster components, resulting in higher overall throughput.
Latency is distinct from both. It refers to the amount of time it takes data to travel after a request has been made. High latency drags down effective throughput, because the link sits idle while data is in transit, so a high-bandwidth network can still feel slow if its latency is high.
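The relationship can be sketched with a simplified model: the time to complete a transfer is the round-trip latency plus the time needed to push the data through the link. This ignores real-world effects such as TCP slow start and packet loss, but it shows why latency dominates small transfers regardless of bandwidth:

```ts
// Simplified model: transfer time = round-trip latency + serialization time.
// Ignores real-world effects such as TCP slow start and packet loss.
function transferTimeS(sizeMbit: number, bandwidthMbps: number, latencyMs: number): number {
  return latencyMs / 1000 + sizeMbit / bandwidthMbps;
}

// A 1 Mbit page over a 1,000 Mbps link with 100 ms of latency...
console.log(transferTimeS(1, 1000, 100)); // 0.101 s
// ...barely beats a 10 Mbps link with 5 ms of latency.
console.log(transferTimeS(1, 10, 5));     // 0.105 s
```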
How Do You Reduce Latency?
One way to reduce latency is to use a content delivery network (CDN). A CDN gives you the ability to cache content: copies of content that will be needed are kept on CDN servers so they can be served on demand without having to be fetched from the origin server each time.
CDN servers can be placed in various locations strategically to ensure content is stored in close proximity to end-users and their devices. In this way, the data packets do not have to travel as far after a request is made. As a result, the website delivering the content is able to load faster, and content reaches its final destination sooner.
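The core idea can be sketched as a simple lookup: serve a stored copy when one exists, and go back to the origin server only on a miss. The origin URL below is a placeholder, and a real CDN adds expiry and invalidation logic this sketch omits:

```ts
// Simplified sketch of a CDN edge cache: serve a stored copy on a hit,
// fetch from the origin only on a miss. The origin URL is a placeholder.
const cache = new Map<string, string>();

async function serveFromEdge(path: string): Promise<string> {
  const cached = cache.get(path);
  if (cached !== undefined) return cached;   // hit: no trip to the origin

  const response = await fetch(`https://origin.example.com${path}`);
  const body = await response.text();
  cache.set(path, body);                     // store for later requests
  return body;                               // first request pays full latency
}
```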
You can also change your content itself to reduce latency. For example, you can cut down on the number of resources that block rendering. If you defer JavaScript so it loads as the last step in the rendering process, the visible content can reach the user faster. You can also optimize the images on your website so they load faster, such as by reducing the size of the image files.
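One common approach is to inject non-critical scripts only after the page has finished loading, so they never block the initial render. This sketch uses standard browser APIs; the script URL is hypothetical:

```ts
// Inject a non-critical script only after the page has finished loading,
// so it never blocks the initial render. The script URL is hypothetical.
window.addEventListener("load", () => {
  const script = document.createElement("script");
  script.src = "/static/analytics.js";
  script.async = true;                 // fetch without blocking parsing
  document.body.appendChild(script);
});
```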
In general, reducing the file sizes of your content reduces latency. One way to reduce file size is to minify your code: every byte has to be transmitted, so sending less code means less time in transit.
In some instances, latency is a matter of perception being reality. For example, if a user goes to a website looking for a specific type of content, you may want to make sure the site delivers that content first. Other aspects of the site may contribute to its aesthetic appeal, but in the end, the user visits to fill a specific need. When they get the content they came for quickly, the site feels like it loads faster than it actually does.
The area of a webpage that is visible on the user's screen without scrolling is referred to as being above the fold. If the key text, images, or video the user is looking for can be placed above the fold, you can load that content first and give the user what they came for right away. As they consume that content, the site's other assets can load in the background below the fold.
You can also set up a website so it only loads the assets that are needed in the moment. This is sometimes called “lazy loading.” The assets the user needs the most are loaded, while others are kept on the server. When those are needed, they can be loaded, giving the user the impression they are getting all they came for from the site.
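A minimal lazy-loading sketch using the browser's IntersectionObserver API might look like the following. It assumes markup in which deferred images carry their real URL in a `data-src` attribute, which is a common convention rather than a built-in feature:

```ts
// Lazy-load images: assign the real source only when an image nears the
// viewport. Assumes <img data-src="..."> markup, a common convention.
const observer = new IntersectionObserver((entries) => {
  for (const entry of entries) {
    if (!entry.isIntersecting) continue;
    const img = entry.target as HTMLImageElement;
    img.src = img.dataset.src ?? "";   // load the asset on demand
    observer.unobserve(img);           // each image only needs loading once
  }
});

document.querySelectorAll<HTMLImageElement>("img[data-src]").forEach((img) => {
  observer.observe(img);
});
```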
Users can take steps to reduce latency on their end as well. For example, they can close bandwidth-heavy applications running in the background, connect over Ethernet instead of Wi-Fi, or restart their router and modem.
How Fortinet Can Help
The Fortinet next-generation firewall (NGFW), FortiGate, can significantly reduce latency in your network without sacrificing protection. FortiGate is equipped with high-throughput processors that can filter large amounts of information quickly, forwarding the approved data packets through to users faster than other solutions.
For example, the FortiGate 3900E can process data at a rate of 1.05 Tbps, and the FortiGate 4200F NGFW provides throughput at 800 Gbps.
FAQs
What does latency mean?
Latency refers to the delay that happens between when a user takes an action on a network or web application and when they get a response.
What is a good latency?
A good, acceptable latency depends on the user and application, but generally speaking, anything below 150 milliseconds is considered good.
Is high or low latency better?
Lower latency is generally better than higher latency because higher latency means users cannot get what they need as soon as they want it.
What does high latency mean?
High latency means a long time passes between when a user clicks or taps on something and when they get what they want.