Numerous components on a network, such as DNS servers, switches, load balancers, and others, can generate errors anywhere in the life of a given request. A common way to handle these failures is to retry the request, ideally with backoff and jitter.
As an engineer, you should apply these practices whenever your code communicates over a network.
Retries, as mentioned previously, are a good way to deal with transient remote API errors in client applications. When a client receives an error response or a timeout, it is the client's responsibility to retry.
Therefore, having a good retry mechanism is important for making our operations run smoothly.
Backoff is a technique for performing retries gracefully, without overloading or burning out your backend systems. A simple way to perform retries is to add a fixed delay between calls. This fixed-delay approach is sometimes loosely called a linear backoff. While it is easy to implement and handles transient failures in a majority of cases, it does not help when a downstream service is impacted for a prolonged period of time, because retries sent at a fixed rate will continue to overload the service.
Exponential backoff is a less aggressive form of backoff. As the name suggests, the delay between retries increases exponentially: clients multiply their backoff by a constant after each attempt, until the request succeeds or a maximum backoff limit is hit. This is a more graceful strategy because it avoids overloading downstream servers, which can result in resource starvation.
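As a rough illustration of the capped doubling described above, here is a minimal Java sketch. The class and method names (ExponentialBackoffSketch, exponentialDelay) are made up for this example, the multiplier is assumed to be 2, and durations are assumed to be non-negative.

```java
import java.time.Duration;

final class ExponentialBackoffSketch {

    // Delay to wait after `failures` failed attempts (0 for the first retry):
    // minBackoff * 2^failures, capped at maxBackoff.
    static Duration exponentialDelay(Duration minBackoff, Duration maxBackoff, int failures) {
        if (failures < 0) {
            throw new IllegalArgumentException("failures must be >= 0");
        }
        long minMillis = minBackoff.toMillis();
        long maxMillis = maxBackoff.toMillis();
        // minMillis * 2^failures > maxMillis is checked with a right shift so
        // the comparison itself can never overflow a long.
        if (failures >= 62 || minMillis > (maxMillis >> failures)) {
            return maxBackoff;
        }
        return Duration.ofMillis(minMillis << failures);
    }

    public static void main(String[] args) {
        Duration min = Duration.ofMillis(100);
        Duration max = Duration.ofSeconds(2);
        for (int n = 0; n < 7; n++) {
            System.out.println("failures=" + n + " delay="
                    + exponentialDelay(min, max, n).toMillis() + " ms");
        }
    }
}
```

With a minimum of 100 ms and a maximum of 2 s, the delays grow as 100, 200, 400, 800 and 1600 ms, and every later retry waits the capped 2000 ms.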
Exponential backoff with Jitter
Most exponential backoff algorithms use jitter (randomized delay) to prevent successive collisions. In this case, we introduce randomness to the retry intervals.
This is especially beneficial with many concurrent clients: without jitter, clients that failed at the same moment would also retry at the same moments.
Reactor - RetryBackoffSpec
RetryBackoffSpec is a Retry strategy based on exponential backoff with jitter.
The client waits for a brief initial time on the first failure. As the operation continues to fail, it waits proportionally to 2^n, where n is the number of failures that have occurred. A well-chosen amount of random jitter is added to each client's wait time.
I will refer to minBackoff and maxBackoff as min and max respectively.
I will refer to jitterOffset as j.
jitterOffset is obtained by multiplying the jitterFactor (default = 0.5) by the computed delay. It defines the interval from which we will pick the jitter.
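The delay is calculated as follows.
For each retry, minBackoff is multiplied by 2^n, where n is the number of failures that have occurred, and the result is capped at maxBackoff.
The next delay is therefore
delay = minimum(min × 2^n, max)
where minimum is the usual minimum function, and max and min are maxBackoff and minBackoff respectively.
So far this is a plain exponential backoff. As discussed earlier, we want to add some randomness:
nextBackoff = delay + epsilon
where epsilon is the jitter and j is the jitterOffset.
But we have to ensure that nextBackoff stays in the correct interval, which means
min ≤ delay + epsilon ≤ max, that is, min - delay ≤ epsilon ≤ max - delay
But we also have
-j ≤ epsilon ≤ j
This leads us to pick epsilon from the interval
maximum(min - delay, -j) ≤ epsilon ≤ minimum(max - delay, j)
where maximum is the usual maximum function. This is the bound computation that RetryBackoffSpec performs before it draws the jitter.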
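The bounds on epsilon can be sketched in plain Java as follows. This is an illustration of the derivation, not the verbatim Reactor source: the class and method names (JitterBoundsSketch, jitterBounds, jitteredDelay) are hypothetical, all durations are plain milliseconds, and the upper bound is treated as inclusive for simplicity. It assumes min ≤ delay ≤ max and a jitterFactor between 0 and 1.

```java
import java.util.concurrent.ThreadLocalRandom;

final class JitterBoundsSketch {

    // Returns {low, high}: the interval (in ms) from which epsilon is drawn.
    static long[] jitterBounds(long minMillis, long maxMillis, long delayMillis, double jitterFactor) {
        // j = jitterFactor * delay, the jitterOffset from the text.
        long j = (long) (delayMillis * jitterFactor);
        // epsilon must keep delay + epsilon inside [min, max] AND stay inside [-j, j].
        long low = Math.max(minMillis - delayMillis, -j);
        long high = Math.min(maxMillis - delayMillis, j);
        return new long[] {low, high};
    }

    // Returns delay + epsilon, with epsilon picked uniformly from the bounds above.
    static long jitteredDelay(long minMillis, long maxMillis, long delayMillis, double jitterFactor) {
        long[] bounds = jitterBounds(minMillis, maxMillis, delayMillis, jitterFactor);
        // low <= 0 <= high under the stated assumptions, so low < high + 1 always holds.
        long epsilon = ThreadLocalRandom.current().nextLong(bounds[0], bounds[1] + 1);
        return delayMillis + epsilon;
    }

    public static void main(String[] args) {
        long[] bounds = jitterBounds(100, 2000, 800, 0.5);
        System.out.println("epsilon in [" + bounds[0] + ", " + bounds[1] + "]");
        System.out.println("jittered delay: " + jitteredDelay(100, 2000, 800, 0.5) + " ms");
    }
}
```

With min = 100 ms, max = 2000 ms and jitterFactor = 0.5, a delay of 800 ms gives epsilon in [-400, 400]. On the first retry (delay = 100 ms) the lower bound is clipped to 0 so the result never drops below min, and once the delay has reached max the upper bound is clipped to 0 so it never exceeds max.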
If the request still fails once the maximum number of retry attempts has been exhausted, an error is reported.
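To show how the pieces fit together outside of Reactor, here is a small blocking retry loop in Java that waits with exponential backoff and jitter and reports an error once the retries are used up. Everything in it is an assumption made for illustration: RetryLoopSketch, callWithRetry and RetriesExhaustedException are hypothetical names, and Reactor reports exhaustion through its own exception type and does so without blocking a thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

final class RetryLoopSketch {

    // Hypothetical exception for this sketch; it wraps the last failure.
    static final class RetriesExhaustedException extends RuntimeException {
        RetriesExhaustedException(int maxRetries, Throwable lastFailure) {
            super("Giving up after " + maxRetries + " retries", lastFailure);
        }
    }

    // Capped exponential delay plus bounded jitter, in milliseconds.
    static long jitteredDelayMillis(long minMillis, long maxMillis, int failures, double jitterFactor) {
        long delay = (failures >= 62 || minMillis > (maxMillis >> failures))
                ? maxMillis
                : minMillis << failures;
        long j = (long) (delay * jitterFactor);
        long low = Math.max(minMillis - delay, -j);
        long high = Math.min(maxMillis - delay, j);
        return delay + ThreadLocalRandom.current().nextLong(low, high + 1);
    }

    // Runs the call, retrying up to maxRetries times after the initial attempt.
    static <T> T callWithRetry(Callable<T> call, int maxRetries,
                               long minMillis, long maxMillis, double jitterFactor) {
        for (int failures = 0; ; failures++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (failures >= maxRetries) {
                    // No retries left: report the error to the caller.
                    throw new RetriesExhaustedException(maxRetries, e);
                }
            }
            sleep(jitteredDelayMillis(minMillis, maxMillis, failures, jitterFactor));
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while backing off", e);
        }
    }
}
```

A call such as callWithRetry(() -> client.fetch(), 3, 100, 2000, 0.5), where client.fetch() stands in for any remote call, would make at most four attempts in total and throw RetriesExhaustedException if all of them fail.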
In this tutorial, we've explored how we can improve how client applications retry failed calls by augmenting exponential backoff with jitter.