Mastering Request Retrying in Python with Tenacity: A Developer’s Journey

7 September 2024 at 01:49

Meet Jafer, a talented developer (self boast) working at a fast growing tech company. His team is building an innovative app that fetches data from multiple third-party APIs in realtime to provide users with up-to-date information.

Everything is going smoothly until one day, a spike in traffic causes their app to face a wave of “HTTP 500” and “Timeout” errors. Requests start failing left and right, and users are left staring at the dreaded “Data Unavailable” message.

Jafer realizes that he needs a way to make their app more resilient against these unpredictable network hiccups. That’s when he discovers Tenacity a powerful Python library designed to help developers handle retries gracefully.

Join Jafer as he dives into Tenacity and learns how to turn his app from fragile to robust with just a few lines of code!

Step 0: Mock FLASK Api

from flask import Flask, jsonify, make_response
import random
import time

app = Flask(__name__)

# Scenario 1: Random server errors
@app.route('/random_error', methods=['GET'])
def random_error():
    if random.choice([True, False]):
        return make_response(jsonify({"error": "Server error"}), 500)  # Simulate a 500 error randomly
    return jsonify({"message": "Success"})

# Scenario 2: Timeouts
@app.route('/timeout', methods=['GET'])
def timeout():
    time.sleep(5)  # Simulate a long delay that can cause a timeout
    return jsonify({"message": "Delayed response"})

# Scenario 3: 404 Not Found error
@app.route('/not_found', methods=['GET'])
def not_found():
    return make_response(jsonify({"error": "Not found"}), 404)

# Scenario 4: Rate-limiting (simulated with a fixed chance)
@app.route('/rate_limit', methods=['GET'])
def rate_limit():
    if random.randint(1, 10) <= 3:  # 30% chance to simulate rate limiting
        return make_response(jsonify({"error": "Rate limit exceeded"}), 429)
    return jsonify({"message": "Success"})

# Scenario 5: Empty response
@app.route('/empty_response', methods=['GET'])
def empty_response():
    if random.choice([True, False]):
        return make_response("", 204)  # Simulate an empty response with 204 No Content
    return jsonify({"message": "Success"})

if __name__ == '__main__':
    app.run(host='localhost', port=5000, debug=True)

To run the Flask app, use the command,

python mock_server.py

Step 1: Introducing Tenacity

Jafer decides to start with the basics. He knows that Tenacity will allow him to retry failed requests without cluttering his codebase with complex loops and error handling. So, he installs the library,

pip install tenacity

With Tenacity ready, Jafer decides to tackle his first problem, retrying a request that fails due to server errors.

Step 2: Retrying on Exceptions

He writes a simple function that fetches data from an API and wraps it with Tenacity’s @retry decorator

import requests
import logging
from tenacity import before_log, after_log
from tenacity import retry, stop_after_attempt, wait_fixed

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(stop=stop_after_attempt(3),
        wait=wait_fixed(2),
        before=before_log(logger, logging.INFO),
        after=after_log(logger, logging.INFO))
def fetch_random_error():
    response = requests.get('http://localhost:5000/random_error')
    response.raise_for_status()  # Raises an HTTPError for 4xx/5xx responses
    return response.json()
 
if __name__ == '__main__':
    try:
        data = fetch_random_error()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

This code will attempt the request up to 3 times, waiting 2 seconds between each try. Jafer feels confident that this will handle the occasional hiccup. However, he soon realizes that he needs more control over which exceptions trigger a retry.

Step 3: Handling Specific Exceptions

Jafer’s app sometimes receives a “404 Not Found” error, which should not be retried because the resource doesn’t exist. He modifies the retry logic to handle only certain exceptions,

import requests
import logging
from tenacity import before_log, after_log
from requests.exceptions import HTTPError, Timeout
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_fixed
 

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(stop=stop_after_attempt(3),
        wait=wait_fixed(2),
        retry=retry_if_exception_type((HTTPError, Timeout)),
        before=before_log(logger, logging.INFO),
        after=after_log(logger, logging.INFO))
def fetch_data():
    response = requests.get('http://localhost:5000/timeout', timeout=2)  # Set a short timeout to simulate failure
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    try:
        data = fetch_data()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

Now, the function retries only on HTTPError or Timeout, avoiding unnecessary retries for a “404” error. Jafer’s app is starting to feel more resilient!

Step 4: Implementing Exponential Backoff

A few days later, the team notices that they’re still getting rate-limited by some APIs. Jafer recalls the concept of exponential backoff a strategy where the wait time between retries increases exponentially, reducing the load on the server and preventing further rate limiting.

He decides to implement it,

import requests
import logging
from tenacity import before_log, after_log
from tenacity import retry, stop_after_attempt, wait_exponential

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@retry(stop=stop_after_attempt(5),
       wait=wait_exponential(multiplier=1, min=2, max=10),
       before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_rate_limit():
    response = requests.get('http://localhost:5000/rate_limit')
    response.raise_for_status()
    return response.json()
 
if __name__ == '__main__':
    try:
        data = fetch_rate_limit()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

With this code, the wait time starts at 2 seconds and doubles with each retry, up to a maximum of 10 seconds. Jafer’s app is now much less likely to be rate-limited!

Step 5: Retrying Based on Return Values

Jafer encounters another issue: some APIs occasionally return an empty response (204 No Content). These cases should also trigger a retry. Tenacity makes this easy with the retry_if_result feature,

import requests
import logging
from tenacity import before_log, after_log

from tenacity import retry, stop_after_attempt, retry_if_result

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
  

@retry(retry=retry_if_result(lambda x: x is None), stop=stop_after_attempt(3), before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_empty_response():
    response = requests.get('http://localhost:5000/empty_response')
    if response.status_code == 204:
        return None  # Simulate an empty response
    response.raise_for_status()
    return response.json()
 
if __name__ == '__main__':
    try:
        data = fetch_empty_response()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

Now, the function retries when it receives an empty response, ensuring that users get the data they need.

Step 6: Combining Multiple Retry Conditions

But Jafer isn’t done yet. Some situations require combining multiple conditions. He wants to retry on HTTPError, Timeout, or a None return value. With Tenacity’s retry_any feature, he can do just that,

import requests
import logging
from tenacity import before_log, after_log

from requests.exceptions import HTTPError, Timeout
from tenacity import retry_any, retry, retry_if_exception_type, retry_if_result, stop_after_attempt
 
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(retry=retry_any(retry_if_exception_type((HTTPError, Timeout)), retry_if_result(lambda x: x is None)), stop=stop_after_attempt(3), before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_data():
    response = requests.get("http://localhost:5000/timeout")
    if response.status_code == 204:
        return None
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    try:
        data = fetch_data()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

This approach covers all his bases, making the app even more resilient!

Step 7: Logging and Tracking Retries

As the app scales, Jafer wants to keep an eye on how often retries happen and why. He decides to add logging,

import logging
import requests
from tenacity import before_log, after_log
from tenacity import retry, stop_after_attempt, wait_fixed

 
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
 
@retry(stop=stop_after_attempt(2), wait=wait_fixed(2),
       before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_data():
    response = requests.get("http://localhost:5000/timeout", timeout=2)
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    try:
        data = fetch_data()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

This logs messages before and after each retry attempt, giving Jafer full visibility into the retry process. Now, he can monitor the app’s behavior in production and quickly spot any patterns or issues.

The Happy Ending

With Tenacity, Jafer has transformed his app into a resilient powerhouse that gracefully handles intermittent failures. Users are happy, the servers are humming along smoothly, and Jafer’s team has more time to work on new features rather than firefighting network errors.

By mastering Tenacity, Jafer has learned that handling network failures gracefully can turn a fragile app into a robust and reliable one. Whether it’s dealing with flaky APIs, network blips, or rate limits, Tenacity is his go-to tool for retrying operations in Python.

So, the next time your app faces unpredictable network challenges, remember Jafer’s story and give Tenacity a try you might just save the day!

Reading view