Avoid Cache Pitfalls: Key Problems and Fixes

By: Mr.ParottaSalna

16 February 2025 at 09:22

Caching is an essential technique for improving application performance and reducing the load on databases. However, improper caching strategies can lead to serious issues.

This post was inspired by a ByteByteGo post: https://www.linkedin.com/posts/bytebytego_systemdesign-coding-interviewtips-activity-7296767687978827776-Dizz

In this blog, we will discuss four common cache problems: Thundering Herd Problem, Cache Penetration, Cache Breakdown, and Cache Crash, along with their causes, consequences, and solutions.

Thundering Herd Problem

What is it?

The Thundering Herd Problem occurs when a large number of keys in the cache expire at the same time. When this happens, all requests bypass the cache and hit the database simultaneously, overwhelming it and causing performance degradation or even a system crash.

Example Scenario

Imagine an e-commerce website where product details are cached for 10 minutes. If the cache entries for all products expire at the same time, thousands of concurrent user requests fall through to the database and overload it.

Solutions

Staggered Expiration: Instead of setting a fixed expiration time for all keys, introduce a random expiry variation.
Allow Only Core Business Queries: Limit direct database access only to core business data, while returning stale data or temporary placeholders for less critical data.
Lazy Rebuild Strategy: Instead of all requests querying the database, the first request fetches data and updates the cache while others wait.
Batch Processing: Queue multiple requests and process them in batches to reduce database load.
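
Staggered expiration can be sketched as follows. This is a minimal illustration, not a definitive implementation: the 10-minute base TTL comes from the example scenario, while the function name `ttl_with_jitter` and the 120-second jitter window are assumptions chosen for the example.

```python
import random

BASE_TTL_SECONDS = 600      # the 10-minute TTL from the example scenario
MAX_JITTER_SECONDS = 120    # assumed spread; tune for your workload


def ttl_with_jitter(base=BASE_TTL_SECONDS, max_jitter=MAX_JITTER_SECONDS, rng=random):
    """Return the base TTL plus a random offset in [0, max_jitter] seconds."""
    return base + rng.randint(0, max_jitter)


if __name__ == "__main__":
    # Each product gets a slightly different expiry, so the entries do not
    # all fall out of the cache in the same instant.
    for product_id in ("p1", "p2", "p3"):
        print(product_id, ttl_with_jitter())
```

The value returned would be passed as the expiry when writing each key to the cache, so that keys written together expire at different moments.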

Cache Penetration

What is it?

Cache Penetration occurs when requests are made for keys that exist neither in the cache nor in the database. Since these requests can never be served from the cache, every one of them hits the database and puts excessive pressure on the system.

Example Scenario

A malicious user could attempt to query random user IDs that do not exist, forcing the system to repeatedly query the database and skip the cache.

Solutions

Cache Null Values: If a key does not exist in the database, store a null value in the cache to prevent unnecessary database queries.
Use a Bloom Filter: A Bloom filter helps check whether a key may exist before querying the database. It can return false positives but never false negatives, so if the Bloom filter does not contain the key, the request can be rejected immediately.
Rate Limiting: Implement request throttling to prevent excessive access to non-existent keys.
Data Prefetching: Predict and load commonly accessed data into the cache before it is needed.

Cache Breakdown

What is it?

Cache Breakdown is similar to the Thundering Herd Problem, but it occurs specifically when a single hot key (a frequently accessed key) expires. This results in a surge of database queries as all users try to retrieve the same data.

Example Scenario

A social media platform caches trending hashtags. If the cached entry for a trending hashtag expires, millions of users will query the same hashtag at once, hitting the database hard.

Solutions

Never Expire Hot Keys: Keep hot keys permanently in the cache unless an update is required.
Preload the Cache: Refresh the cache asynchronously before expiration by setting a background task to update the cache regularly.
Mutex Locking: Ensure only one request updates the cache, while others wait for the update to complete.
Double Buffering: Maintain a secondary cache layer to serve requests while the primary cache is being refreshed.

Cache Crash

What is it?

A Cache Crash occurs when the cache service itself goes down. When this happens, all requests fall back to the database, overloading it and causing severe performance issues.

Example Scenario

If a Redis instance storing session data for a web application crashes, all authentication requests will be forced to hit the database, leading to a potential outage.

Solutions

Cache Clustering: Use a cluster of cache nodes instead of a single instance to ensure high availability.
Persistent Storage for Cache: Enable persistence modes like Redis RDB or AOF to recover data quickly after a crash.
Automatic Failover: Configure automated failover with tools like Redis Sentinel to ensure availability even if a node fails.
Circuit Breaker Mechanism: Prevent the application from directly accessing the database if the cache is unavailable, reducing the impact of a crash.

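import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=30.0):
        self.failure_count = 0
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # assumed cool-down in seconds
        self.opened_at = None  # time the circuit opened, or None if closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            # Circuit is open: reject calls until the cool-down has elapsed.
            if time.monotonic() - self.opened_at < self.reset_timeout:
                return "Service unavailable"
            self.opened_at = None  # cool-down over: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failure_count += 1
            if self.failure_count >= self.failure_threshold:
                self.opened_at = time.monotonic()  # open (or re-open) the circuit
            return "Error"
        self.failure_count = 0  # a success resets the failure streak
        return result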

Caching is a powerful mechanism to improve application performance, but improper strategies can lead to severe bottlenecks. Problems like Thundering Herd, Cache Penetration, Cache Breakdown, and Cache Crash can significantly degrade system reliability if not handled properly.

Learning Notes #53 – The Expiration Time Can Be Unexpectedly Lost While Using Redis SET EX

By: Mr.ParottaSalna

12 January 2025 at 09:14

Redis, a high-performance in-memory key-value store, is widely used for caching, session management, and various other scenarios where fast data retrieval is essential. One of its key features is the ability to set expiration times for keys. However, when using the SET command with the EX option, developers might encounter unexpected behaviors where the expiration time is seemingly lost. Let’s explore this issue in detail.

Understanding `SET` with `EX`

The Redis SET command with the EX option allows you to set a key’s value and specify its expiration time in seconds. For instance:


SET key value EX 60

This command sets the key `key` to the value `value` with an expiration time of 60 seconds.

The Problem

In certain cases, the expiration time might be unexpectedly lost. This typically happens when subsequent operations overwrite the key without specifying a new expiration. For example,


SET key value1 EX 60
SET key value2

In the above sequence,

The first SET command assigns a value to key and sets an expiration of 60 seconds.
The second SET command overwrites the value of key but does not include an expiration time, resulting in the key persisting indefinitely.

This behavior can lead to subtle bugs, especially in applications that rely on key expiration for correctness or resource management.

Why Does This Happen?

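The Redis SET command is designed to replace the entire state of a key, including its expiration. When you use SET without one of the expiration options (EX, PX, EXAT, or PXAT), any existing expiration is removed and the key becomes persistent. This behavior aligns with the principle that SET is a complete update operation. To keep the remaining time to live while changing the value, Redis 6.0 and later accept the KEEPTTL option (SET key value2 KEEPTTL); otherwise, pass EX again on every overwrite.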
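
The overwrite behavior can be illustrated with a toy in-memory model. The `TinyStore` class below is hypothetical and is not a Redis client; it only mimics how SET treats the expiry, including the `keepttl` flag that mirrors the KEEPTTL option available in Redis 6.0 and later. Expired keys are not evicted in this sketch.

```python
import time


class TinyStore:
    """Toy in-memory model of Redis SET/TTL semantics (not a Redis client)."""

    def __init__(self):
        self._values = {}
        self._expires_at = {}   # key -> absolute expiry time; absent = no expiry

    def set(self, key, value, ex=None, keepttl=False):
        if ex is not None and keepttl:
            raise ValueError("ex and keepttl are mutually exclusive")
        self._values[key] = value
        if ex is not None:
            self._expires_at[key] = time.monotonic() + ex
        elif not keepttl:
            # A plain SET discards any previous expiry, as Redis does.
            self._expires_at.pop(key, None)

    def ttl(self, key):
        """Mimic TTL: -2 if the key is missing, -1 if it has no expiry."""
        if key not in self._values:
            return -2
        if key not in self._expires_at:
            return -1
        return max(0, round(self._expires_at[key] - time.monotonic()))


if __name__ == "__main__":
    store = TinyStore()

    store.set("key", "value1", ex=60)
    store.set("key", "value2")                 # expiry is silently removed
    print("after plain SET:", store.ttl("key"))

    store.set("key", "value1", ex=60)
    store.set("key", "value2", keepttl=True)   # expiry is preserved
    print("after SET with keepttl:", store.ttl("key"))
```

With the redis-py client, the corresponding calls would be `r.set("key", "value2", ex=60)` to reapply the expiry or `r.set("key", "value2", keepttl=True)` to preserve it.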

When using Redis SET with EX, be mindful of operations that might overwrite keys without reapplying expiration. Understanding Redis’s behavior and implementing robust patterns can save you from unexpected issues, ensuring your application remains efficient and reliable.

Learning Notes #28 – Unlogged Table in Postgres

By: Mr.ParottaSalna

2 January 2025 at 17:30

Today, as part of my daily reading, I came across https://raphaeldelio.com/2024/07/14/can-postgres-replace-redis-as-a-cache/, which discusses using Postgres as a cache and compares it with Redis. The title surprised me, so I read it through. There I came across the concept of the UNLOGGED table, which can serve as a fast, cache-like store. In this blog, I jot down notes on unlogged tables for future reference.

Highly Recommended Links: https://martinheinz.dev/blog/105, https://raphaeldelio.com/2024/07/14/can-postgres-replace-redis-as-a-cache/, https://www.crunchydata.com/blog/postgresl-unlogged-tables

Unlogged tables offer unique benefits in scenarios where speed is paramount, and durability (the guarantee that data is written to disk and will survive crashes) is not critical.

What Are Unlogged Tables?

Postgres architecture: https://miro.com/app/board/uXjVLD2T5os=/

In PostgreSQL, a table is a basic unit of data storage. By default, PostgreSQL ensures that data in regular tables is durable. This means that all data is written to the disk and will survive server crashes. However, in some situations, durability is not necessary. Unlogged tables are special types of tables in PostgreSQL where the database does not write data changes to the WAL (Write-Ahead Log).

The absence of WAL logging for unlogged tables makes them faster than regular tables because PostgreSQL doesn’t need to ensure data consistency across crashes for these tables. However, this also means that if the server crashes or the system is powered off, the data in unlogged tables is lost.

Key Characteristics of Unlogged Tables

No Write-Ahead Logging (WAL) – By default, PostgreSQL writes changes to the WAL to ensure data durability. For unlogged tables, this step is skipped, making operations like INSERTs, UPDATEs, and DELETEs faster.
No Durability – Without WAL, PostgreSQL cannot recover the contents of an unlogged table, so the table is automatically truncated after a crash or unclean shutdown. This makes unlogged tables unsuitable for critical data.
Faster Performance – Since WAL writes are skipped, unlogged tables are faster for data insertion and modification. This can be beneficial for use cases where data is transient and can be regenerated if it is lost.
Support for Indexes and Constraints – Unlogged tables can have indexes and constraints like regular tables. However, the data in these tables is still non-durable.
Automatic Cleanup – After a crash or unclean shutdown, PostgreSQL automatically truncates unlogged tables during recovery. The table definition remains and the data does survive a clean shutdown and restart, but it should still be treated as disposable.

Drawbacks of Unlogged Tables

Data Loss on Crash – The most significant disadvantage of unlogged tables is the loss of data after a crash or unclean shutdown. If the application depends on this data, then using unlogged tables would not be appropriate.
Not Suitable for Critical Applications – Applications that require data persistence (such as financial or inventory systems) should avoid using unlogged tables, as the risk of data loss outweighs any performance benefits.
No Replication – Unlogged tables are not replicated to standby servers in a replication setup, as the data is not written to the WAL.

Creating an Unlogged Table

Creating an unlogged table is very straightforward in PostgreSQL. You simply need to add the UNLOGGED keyword when creating the table.


CREATE UNLOGGED TABLE temp_data (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    value INT
);

In this example, temp_data is an unlogged table. All operations performed on this table will not be logged to the WAL.

When to Avoid Unlogged Tables?

If you are working with critical data that needs to be durable and persistent across restarts.
If your application requires data replication, as unlogged tables are not replicated in standby servers.
If your workload involves frequent crash scenarios where data loss cannot be tolerated.

Examples

1. Temporary Storage for processing


CREATE UNLOGGED TABLE etl_staging (
    source_id INT,
    raw_data JSONB,
    processed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Insert raw data into the staging table
INSERT INTO etl_staging (source_id, raw_data)
VALUES 
    (1, '{"key": "value1"}'),
    (2, '{"key": "value2"}');

-- Perform transformations on the data
INSERT INTO final_table (id, key, value)
SELECT source_id, 
       raw_data->>'key' AS key, 
       'processed_value' AS value
FROM etl_staging;

-- Clear the staging table
TRUNCATE TABLE etl_staging;

2. Caching


CREATE UNLOGGED TABLE user_sessions (
    session_id UUID PRIMARY KEY,
    user_id INT,
    last_accessed TIMESTAMP DEFAULT NOW()
);

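-- Insert session data (uuid_generate_v4() requires the uuid-ossp extension;
-- on PostgreSQL 13+ the built-in gen_random_uuid() can be used instead)
INSERT INTO user_sessions (session_id, user_id)
VALUES 
    (uuid_generate_v4(), 101),
    (uuid_generate_v4(), 102);

-- Update last accessed timestamp (replace the placeholder with a real session UUID)
UPDATE user_sessions
SET last_accessed = NOW()
WHERE session_id = '00000000-0000-0000-0000-000000000000';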

-- Delete expired sessions
DELETE FROM user_sessions WHERE last_accessed < NOW() - INTERVAL '1 hour';
