Learning Notes #58 – Command Query Responsibility Segregation – An Idea Overview

17 January 2025 at 16:42

Today, I came across a ByteMonk video on Event Sourcing. In that video, they mentioned CQRS, so I delved into it. This blog is about understanding CQRS at a high level. I am planning to dive deep into Event-Driven Architecture conceptually in the upcoming weekend.

In this blog, I jot down notes for a basic understanding of CQRS.

In the world of software development, there are countless patterns and practices aimed at solving specific problems. One such pattern is CQRS, short for Command Query Responsibility Segregation. While it might sound complex (it did for me), the idea is quite straightforward when broken down into simple terms.

What is CQRS?

Imagine you run a small bookstore. Customers interact with your store in two main ways:

  1. They buy books.
  2. They ask for information about books.

These two activities, buying (command) and asking (querying), are fundamentally different. Buying a book changes something in your store (your inventory decreases), whereas asking for information doesn’t change anything; it just retrieves details.

CQRS applies the same principle to software. It separates the operations that change data (called commands) from those that read data (called queries). This separation brings clarity and efficiency (not sure yet 🙂).

In simpler terms:

  • Commands are actions like β€œAdd this book to the inventory” or β€œUpdate the price of this book.” These modify the state of your system.

  • Queries are questions like β€œHow many books are in stock?” or β€œWhat’s the price of this book?” These fetch data but don’t alter it.

By keeping these two types of operations separate, you make your system easier to manage and scale.

Why Should You Care About CQRS?

Let’s revisit our bookstore analogy. Imagine if every time someone asked for information about a book, your staff had to dig through boxes in the storage room. It would be slow and inefficient!

Instead, you might keep a catalog at the front desk that’s easy to browse.

In software, this means:

  • Better Performance: By separating commands and queries, you can optimize them individually. For instance, you can have a simple, fast database for queries and a robust, detailed database for commands.

  • Simpler Code: Each part of your system does one thing, making it easier to understand and maintain.

  • Flexibility: You can scale the command and query sides independently. If you get a lot of read requests but fewer writes, you can optimize the query side without touching the command side.

CQRS in Action

Let’s say you’re building an app for managing a library. Here’s how CQRS might look:

  • Command: A librarian adds a new book to the catalog or updates the details of an existing book.

  • Query: A user searches for books by title or checks the availability of a specific book.

The app could use one database to handle commands (storing all the book details and history) and another optimized database to handle queries (focused on quickly retrieving book information).
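
To make this concrete, here is a minimal sketch in Python. It is not a full CQRS implementation: the class and method names are illustrative, and both handlers share one in-memory dict purely so the snippet runs on its own (a real system would keep a separate, read-optimized store in sync with the write store).

from dataclasses import dataclass

# --- Write side (commands) ---
@dataclass
class AddBookCommand:
    book_id: str
    title: str
    copies: int

class BookCommandHandler:
    """Handles state-changing operations against the write store."""
    def __init__(self, write_store: dict):
        self.write_store = write_store

    def handle_add_book(self, cmd: AddBookCommand) -> None:
        self.write_store[cmd.book_id] = {"title": cmd.title, "copies": cmd.copies}

# --- Read side (queries) ---
class BookQueryHandler:
    """Answers questions from a read-optimized view; never mutates state."""
    def __init__(self, read_store: dict):
        self.read_store = read_store

    def get_availability(self, book_id: str) -> int:
        return self.read_store.get(book_id, {}).get("copies", 0)

# Both handlers point at the same dict only for illustration; in practice the
# read store would be a separate, denormalized database kept in sync (often via events).
store = {}
commands = BookCommandHandler(store)
queries = BookQueryHandler(store)

commands.handle_add_book(AddBookCommand("b1", "Clean Architecture", 3))
print(queries.get_availability("b1"))  # 3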

Does CQRS Always Make Sense?

As of now, it seems to make things more complicated for small applications. As usual, every pattern is devised for its own niche of problems; a single bolt cannot go through all nuts.

In upcoming blogs, let’s learn more on CQRS.


Learning Notes #54 – Architecture Decision Records

14 January 2025 at 02:35

Over the last few days, I was learning how to make accountable decisions on technical matters. Then I came across ADRs. So far I haven’t used them, nor seen them used by our team. I think this is a necessary step to incorporate for making accountable decisions. In this blog I share details on ADRs for my future reference.

What is an ADR?

An Architectural Decision Record (ADR) is a concise document that captures a single architectural decision, its context, the reasoning behind it, and its consequences. ADRs help teams document, share, and revisit architectural choices, ensuring transparency and better collaboration.

Why Use ADRs?

  1. Documentation: ADRs serve as a historical record of why certain decisions were made.
  2. Collaboration: They promote better understanding across teams.
  3. Traceability: ADRs link architectural decisions to specific project requirements and constraints.
  4. Accountability: They clarify who made a decision and when.
  5. Change Management: ADRs help evaluate the impact of changes and facilitate discussions around reversals or updates.

ADR Structure

A typical ADR document follows a standard format. Here’s an example:

  1. Title: A clear and concise title describing the decision.
  2. Context: Background information explaining the problem or opportunity.
  3. Decision: A summary of the chosen solution.
  4. Consequences: The positive and negative outcomes of the decision.
  5. Status: Indicates whether the decision is proposed, accepted, superseded, or deprecated.

Example:

Optimistic locking on MongoDB https://docs.google.com/document/d/1olCbicQeQzYpCxB0ejPDtnri9rWb2Qhs9_JZuvANAxM/edit?usp=sharing
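
For a quick feel of the format, a minimal, entirely hypothetical ADR (not a summary of the linked document) might read:

Title: Use optimistic locking for concurrent order updates
Context: Multiple workers may update the same order document, and we need to avoid lost updates without holding long-lived locks.
Decision: Add a version field to each document and reject writes whose version does not match the current one (optimistic locking).
Consequences: No blocking between writers and a simpler failure mode; writers must handle retries on version conflicts.
Status: Accepted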

References

  1. https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions
  2. https://www.infoq.com/podcasts/architecture-advice-process/
  3. Recommended: https://github.com/joelparkerhenderson/architecture-decision-record/tree/main

Learning Notes #53 – The Expiration Time Can Be Unexpectedly Lost While Using Redis SET EX

12 January 2025 at 09:14

Redis, a high-performance in-memory key-value store, is widely used for caching, session management, and various other scenarios where fast data retrieval is essential. One of its key features is the ability to set expiration times for keys. However, when using the SET command with the EX option, developers might encounter unexpected behaviors where the expiration time is seemingly lost. Let’s explore this issue in detail.

Understanding SET with EX

The Redis SET command with the EX option allows you to set a key’s value and specify its expiration time in seconds. For instance:


SET key value EX 60

This command sets the key key to the value value and sets an expiration time of 60 seconds.

The Problem

In certain cases, the expiration time might be unexpectedly lost. This typically happens when subsequent operations overwrite the key without specifying a new expiration. For example:


SET key value1 EX 60
SET key value2

In the above sequence:

  1. The first SET command assigns a value to key and sets an expiration of 60 seconds.
  2. The second SET command overwrites the value of key but does not include an expiration time, resulting in the key persisting indefinitely.

This behavior can lead to subtle bugs, especially in applications that rely on key expiration for correctness or resource management.

Why Does This Happen?

The Redis SET command is designed to replace the entire state of a key, including its expiration. When you use SET without the EX, PX, or EXAT options, the expiration is removed, and the key becomes persistent. This behavior aligns with the principle that SET is a complete update operation.

When using Redis SET with EX, be mindful of operations that might overwrite keys without reapplying expiration. Understanding Redis’s behavior and implementing robust patterns can save you from unexpected issues, ensuring your application remains efficient and reliable.
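
Here is a small sketch of the pitfall and two ways around it, using the redis-py client against a local Redis instance (the KEEPTTL option assumes Redis 6.0 or newer):

import redis

r = redis.Redis(host="localhost", port=6379)

# Set a key with a 60-second expiration
r.set("session:42", "value1", ex=60)
print(r.ttl("session:42"))   # ~60

# A plain SET overwrites the value AND silently drops the expiration
r.set("session:42", "value2")
print(r.ttl("session:42"))   # -1 -> the key now persists forever

# Option 1: re-specify the expiration on every write
r.set("session:42", "value3", ex=60)

# Option 2: keep the existing TTL while updating the value (Redis 6.0+)
r.set("session:42", "value4", keepttl=True)
print(r.ttl("session:42"))   # still counting down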

Learning Notes #24 – Competing Consumer | Messaging Queue Patterns

1 January 2025 at 09:45

Today, I learnt about the competing consumer pattern; it’s a simple concept of consuming messages with many consumers. In this blog, I jot down notes on the competing consumer pattern for better understanding.

The competing consumer pattern is a commonly used design paradigm in distributed systems for handling workloads efficiently. It addresses the challenge of distributing tasks among multiple consumers to ensure scalability, reliability, and better resource utilization. In this blog, we’ll delve into the details of this pattern, its implementation, and its benefits.

What is the Competing Consumer Pattern?

The competing consumer pattern involves multiple consumers that independently compete to process messages or tasks from a shared queue. This pattern is particularly effective in scenarios where the rate of incoming tasks is variable or high, as it allows multiple consumers to process tasks concurrently.

Key Components

  1. Producer: The component that generates tasks or messages.
  2. Queue: A shared storage medium (often a message broker) that holds tasks until a consumer is ready to process them.
  3. Consumer: The component that processes tasks. Multiple consumers operate concurrently and compete for tasks in the queue.
  4. Message Broker: Middleware (e.g., RabbitMQ, Kafka) that manages the queue and facilitates communication between producers and consumers.

How It Works (Messages as Tasks)

  1. Task Generation
    • Producers create tasks and push them into the queue.
    • Tasks can represent anything, such as processing an image, sending an email, or handling a database operation.
  2. Task Storage
    • The queue temporarily stores tasks until they are picked up by consumers.
    • Queues often support features like message persistence and delivery guarantees to enhance reliability.
  3. Task Processing
    • Consumers pull tasks from the queue and process them independently.
    • Each consumer works on one task at a time, and no two consumers process the same task simultaneously.
  4. Task Completion
    • Upon successful processing, the consumer acknowledges the task’s completion to the message broker.
    • The message broker then removes the task from the queue.

Handling Poison Messages

A poison message is a task or message that a consumer repeatedly fails to process. Poison messages can cause delays, block the queue, or crash consumers if not handled appropriately.

Strategies for Handling Poison Messages

  1. Retry Mechanism
    • Allow a fixed number of retries for a task before marking it as failed.
    • Use exponential backoff to reduce the load on the system during retries (a small sketch follows this list).
  2. Dead Letter Queue (DLQ)
    • Configure a Dead Letter Queue to store messages that cannot be processed after a predefined number of attempts.
    • Poison messages in the DLQ can be analyzed or reprocessed manually.
  3. Logging and Alerting
    • Log details about the poison message for further debugging.
    • Set up alerts to notify administrators when a poison message is encountered.
  4. Idempotent Consumers
    • Design consumers to handle duplicate processing gracefully. This prevents issues if a message is retried multiple times.
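
To make the retry idea concrete, here is a minimal, library-agnostic sketch of retries with exponential backoff. The process function and MAX_RETRIES value are placeholders; in the RabbitMQ example below, the "give up" branch would correspond to a basic_nack with requeue=False so the message lands in the DLQ.

import time
import random

MAX_RETRIES = 3

def process(task: str) -> None:
    """Placeholder for the real work; raises on failure."""
    if task == "poison":
        raise ValueError("cannot process this task")
    print(f"processed {task}")

def process_with_retry(task: str) -> bool:
    """Retry with exponential backoff; return False if the task never succeeds."""
    for attempt in range(MAX_RETRIES):
        try:
            process(task)
            return True
        except Exception as exc:
            # Back off 1s, 2s, 4s... plus a little jitter
            delay = (2 ** attempt) + random.uniform(0, 0.5)
            print(f"attempt {attempt + 1} failed ({exc}), retrying in {delay:.1f}s")
            time.sleep(delay)
    return False  # the caller can now route the message to a DLQ

if not process_with_retry("poison"):
    print("giving up; send this message to the dead letter queue")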

RabbitMQ Example

Producer


import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)

messages = ["Task 1", "Task 2", "Task 3"]

for message in messages:
    channel.basic_publish(
        exchange='',
        routing_key='task_queue',
        body=message,
        properties=pika.BasicProperties(
            delivery_mode=2,  # Makes the message persistent
        )
    )
    print(f"[x] Sent {message}")

connection.close()

Dead Letter Exchange


channel.queue_declare(queue='task_queue', durable=True, arguments={
    'x-dead-letter-exchange': 'dlx_exchange'
})
channel.exchange_declare(exchange='dlx_exchange', exchange_type='fanout')
channel.queue_declare(queue='dlq', durable=True)
channel.queue_bind(exchange='dlx_exchange', queue='dlq')

Consumer Code


import pika
import time

def callback(ch, method, properties, body):
    try:
        print(f"[x] Received {body}")
        # Simulate task processing
        if body == b"Task 2":
            raise ValueError("Cannot process this message")
        time.sleep(1)
        print(f"[x] Processed {body}")
        ch.basic_ack(delivery_tag=method.delivery_tag)
    except Exception as e:
        print(f"[!] Failed to process message: {body}, error: {e}")
        ch.basic_nack(delivery_tag=method.delivery_tag, requeue=False)

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)
print('[*] Waiting for messages. To exit press CTRL+C')

channel.basic_qos(prefetch_count=1)
channel.basic_consume(queue='task_queue', on_message_callback=callback)

channel.start_consuming()

Benefits of the Competing Consumer Pattern

  1. Scalability – Adding more consumers allows the system to handle higher workloads.
  2. Fault Tolerance – If a consumer fails, other consumers can continue processing tasks.
  3. Resource Optimization – Consumers can be distributed across multiple machines to balance the load.
  4. Asynchronous Processing – Decouples task generation from task processing, enabling asynchronous workflows.

Challenges and Considerations

  1. Message Duplication – In some systems, messages may be delivered more than once. Implement idempotent processing to handle duplicates (a minimal sketch follows this list).
  2. Load Balancing – Ensure tasks are evenly distributed among consumers to avoid bottlenecks.
  3. Queue Overload – High task rates may lead to queue overflow. Use rate limiting or scale your infrastructure to prevent this.
  4. Monitoring and Metrics – Implement monitoring to track queue sizes, processing rates, and consumer health.
  5. Poison Messages – Implement a robust strategy for handling poison messages, such as using a DLQ or retry mechanism.
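
A minimal illustration of idempotent processing (challenge 1), assuming each message carries a unique ID. A plain Python set stands in here for what would be a database table or shared cache in production:

processed_ids = set()  # in production: a DB table or Redis set shared by consumers

def handle_message(message_id: str, payload: str) -> None:
    """Process a message at most once, even if the broker redelivers it."""
    if message_id in processed_ids:
        print(f"skipping duplicate message {message_id}")
        return
    print(f"processing: {payload}")
    processed_ids.add(message_id)

# A redelivered message is silently ignored the second time
handle_message("msg-1", "charge customer 42")
handle_message("msg-1", "charge customer 42")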

References

  1. https://www.enterpriseintegrationpatterns.com/patterns/messaging/CompetingConsumers.html
  2. https://dev.to/willvelida/the-competing-consumers-pattern-4h5n
  3. https://medium.com/event-driven-utopia/competing-consumers-pattern-explained-b338d54eff2b

Learning Notes #22 – Claim Check Pattern | Cloud Pattern

31 December 2024 at 17:03

Today, I learnt about the claim check pattern, which tells how to handle a big message in the queue. Every message broker has a defined message size limit. If our message size exceeds that limit, it won’t work.

The Claim Check Pattern emerges as a pivotal architectural design to address the challenge of managing large payloads in a decoupled and efficient manner. In this blog, I jot down notes on my learning for my future self.

What is the Claim Check Pattern?

The Claim Check Pattern is a messaging pattern used in distributed systems to manage large messages efficiently. Instead of transmitting bulky data directly between services, this pattern extracts and stores the payload in a dedicated storage system (e.g., object storage or a database).

A lightweight reference or β€œclaim check” is then sent through the message queue, which the receiving service can use to retrieve the full data from the storage.

This pattern is inspired by the physical process of checking in luggage at an airport: you hand over your luggage, receive a claim check (a token), and later use it to retrieve your belongings.

How Does the Claim Check Pattern Work?

The process typically involves the following steps:

  1. Data Submission – The sender service splits a message into two parts:
    • Metadata: A small piece of information that provides context about the data.
    • Payload: The main body of data that is too large or sensitive to send through the message queue.
  2. Storing the Payload
    • The sender uploads the payload to a storage service (e.g., AWS S3, Azure Blob Storage, or Google Cloud Storage).
    • The storage service returns a unique identifier (e.g., a URL or object key).
  3. Sending the Claim Check
    • The sender service places the metadata and the unique identifier (claim check) onto the message queue.
  4. Receiving the Claim Check
    • The receiver service consumes the message from the queue, extracts the claim check, and retrieves the payload from the storage system.
  5. Processing
    • The receiver processes the payload alongside the metadata as required.

Use Cases

1. Media Processing Pipelines – In video transcoding systems, raw video files can be uploaded to storage while metadata (e.g., video format and length) is passed through the message queue.

2. IoT Systems – IoT devices generate large datasets. Using the Claim Check Pattern ensures efficient transmission and processing of these data chunks.

3. Data Processing Workflows – In big data systems, datasets can be stored in object storage while processing metadata flows through orchestration tools like Apache Airflow.

4. Event-Driven Architectures – For systems using event-driven models, large event payloads can be offloaded to storage to avoid overloading the messaging layer.

Example with RabbitMQ

1. Sender Service


import json
import boto3
import pika

s3 = boto3.client('s3')
bucket_name = 'my-bucket'
object_key = 'data/large-file.txt'

s3.upload_file('large-file.txt', bucket_name, object_key)  # upload the payload; upload_file returns None
claim_check = f's3://{bucket_name}/{object_key}'

# Connect to RabbitMQ
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare a queue
channel.queue_declare(queue='claim_check_queue')

# Send the claim check
message = {
    'metadata': 'Some metadata',
    'claim_check': claim_check
}
channel.basic_publish(exchange='', routing_key='claim_check_queue', body=json.dumps(message))

connection.close()

2. Consumer


import json
import boto3
import pika

s3 = boto3.client('s3')

# Connect to RabbitMQ
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare a queue
channel.queue_declare(queue='claim_check_queue')

# Callback function to process messages
def callback(ch, method, properties, body):
    message = json.loads(body)  # parse the claim check message (avoid eval on untrusted input)
    claim_check = message['claim_check']

    bucket_name, object_key = claim_check.replace('s3://', '').split('/', 1)
    s3.download_file(bucket_name, object_key, 'retrieved-large-file.txt')
    print("Payload retrieved and processed.")

# Consume messages
channel.basic_consume(queue='claim_check_queue', on_message_callback=callback, auto_ack=True)

print('Waiting for messages. To exit press CTRL+C')
channel.start_consuming()

References

  1. https://learn.microsoft.com/en-us/azure/architecture/patterns/claim-check
  2. https://medium.com/@dmosyan/claim-check-design-pattern-603dc1f3796d

Learning Notes #18 – Bulk Head Pattern (Resource Isolation) | Cloud Pattern

30 December 2024 at 17:48

Today, I learned about the bulkhead pattern and how it makes a system resilient to failures and resource exhaustion. In this blog I jot down notes on this pattern for better understanding.

In today’s world of distributed systems and microservices, resiliency is key to ensuring applications are robust and can withstand failures.

The Bulkhead Pattern is a design principle used to improve system resilience by isolating different parts of a system to prevent failure in one component from cascading to others.

What is the Bulkhead Pattern?

The term β€œbulkhead” originates from shipbuilding, where bulkheads are partitions that divide a ship into separate compartments. If one compartment is breached, the others remain intact, preventing the entire ship from sinking. Similarly, in software design, the Bulkhead Pattern isolates components or services so that a failure in one part does not bring down the entire system.

In software systems, bulkheads:

  • Isolate resources (e.g., threads, database connections, or network calls) for different components.
  • Limit the scope of failures.
  • Allow other parts of the system to continue functioning even if one part is degraded or completely unavailable.

Example

Consider an e-commerce application with a product-service that has two endpoints:

  1. /product/{id} – This endpoint gives detailed information about a specific product, including ratings and reviews. It depends on the rating-service.
  2. /products – This endpoint provides a catalog of products based on search criteria. It does not depend on any external services.

Consider that product-service has a fixed amount of resources and is flooded with /product/{id} calls; these calls can monopolize the thread pool. This delays /products requests, causing users to experience slowness even though the requests are independent, which leads to resource exhaustion and failures.

With the bulkhead pattern, we can allocate separate clients and connection pools to isolate the service interactions. For example, we can implement a bulkhead by allocating one connection pool (of 10) to /product/{id} requests, while /products requests use a different connection pool (of 5).

Even if /product/{id} requests are slow or encounter high traffic, /products requests remain unaffected.
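
A minimal sketch of this idea in Python, using two separate thread pools as bulkheads (the endpoint functions, pool sizes, and sleep are illustrative; real services would use separate HTTP client or database connection pools instead):

from concurrent.futures import ThreadPoolExecutor
import time

# Separate, fixed-size pools act as bulkheads: one endpoint cannot
# consume the workers reserved for the other.
product_detail_pool = ThreadPoolExecutor(max_workers=10)  # /product/{id}
catalog_pool = ThreadPoolExecutor(max_workers=5)          # /products

def get_product_detail(product_id: str) -> str:
    time.sleep(2)  # simulate a slow call to rating-service
    return f"details for product {product_id}"

def list_products() -> str:
    return "catalog page"  # fast, no external dependency

# Flood the detail pool with slow requests...
detail_futures = [product_detail_pool.submit(get_product_detail, str(i)) for i in range(20)]

# ...catalog requests still run immediately on their own pool.
print(catalog_pool.submit(list_products).result())

# (The script waits for the detail pool to drain before exiting.)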

Scenarios Where the Bulkhead Pattern is Needed

  1. Microservices with Shared Resources – In a microservices architecture, multiple services might share limited resources such as database connections or threads. If one service experiences a surge in traffic or a failure, it can exhaust these shared resources, impacting all other services. Bulkheading ensures each service gets a dedicated pool of resources, isolating the impact of failures.
  2. Prioritizing Critical Workloads – In systems with mixed workloads (e.g., processing user transactions and generating reports), critical operations like transaction processing must not be delayed or blocked by less critical tasks. Bulkheading allocates separate resources to ensure critical tasks have priority.
  3. Third-Party API Integration – When an application depends on multiple external APIs, one slow or failing API can delay the entire application if not isolated. Using bulkheads ensures that issues with one API do not affect interactions with others.
  4. Multi-Tenant Systems – In SaaS applications serving multiple tenants, a single tenant’s high resource consumption or failure should not degrade the experience for others. Bulkheads can segregate resources per tenant to maintain service quality.
  5. Cloud-Native Applications – In cloud environments, services often scale independently. A spike in one service’s load should not overwhelm shared backend systems. Bulkheads help isolate and manage these spikes.
  6. Event-Driven Systems – In event-driven architectures with message queues, processing backlogs for one type of event can delay others. By applying the Bulkhead Pattern, separate processing pipelines can handle different event types independently.

What are the Key Points of the Bulkhead Pattern? (Simplified)

  • Define Partitions – (Think of a ship) it’s divided into compartments (partitions) to keep water from flooding the whole ship if one section gets damaged. In software, these partitions are designed around how the application works and its technical needs.
  • Designing with Context – If you’re using a design approach like DDD (Domain-Driven Design), make sure your bulkheads (partitions) match the business logic boundaries.
  • Choosing Isolation Levels – Decide how much isolation is needed. For example: Threads for lightweight tasks. Separate containers or virtual machines for more critical separations. Balance between keeping things separate and the costs or extra effort involved.
  • Combining Other Techniques – Bulkheads work even better with patterns like Retry, Circuit Breaker, Throttling.
  • Monitoring – Keep an eye on each partition’s performance. If one starts getting overloaded, you can adjust resources or change limits.

When Should You Use the Bulkhead Pattern?

  • To Isolate Critical Resources – If one part of your system fails, other parts can keep working. For example, you don’t want search functionality to stop working because the reviews section is down.
  • To Prioritize Important Work – For example, make sure payment processing (critical) is separate from background tasks like sending emails.
  • To Avoid Cascading Failures – If one part of the system gets overwhelmed, it won’t drag down everything else.

When Should You Avoid It?

  • Complexity Isn’t Needed – If your system is simple, adding bulkheads might just make it harder to manage.
  • Resource Efficiency is Critical – Sometimes, splitting resources into separate pools can mean less efficient use of those resources. If every thread, connection, or container is underutilized, this might not be the best approach.

Challenges and Best Practices

  1. Overhead: Maintaining separate resource pools can increase system complexity and resource utilization.
  2. Resource Sizing: Properly sizing the pools is critical to ensure resources are efficiently utilized without bottlenecks.
  3. Monitoring: Use tools to monitor the health and performance of each resource pool to detect bottlenecks or saturation.

References:

  1. AWS https://aws.amazon.com/blogs/containers/building-a-fault-tolerant-architecture-with-a-bulkhead-pattern-on-aws-app-mesh/
  2. Resilience https://resilience4j.readme.io/docs/bulkhead
  3. https://medium.com/nerd-for-tech/bulkhead-pattern-distributed-design-pattern-c673d5e81523
  4. Microsoft https://learn.microsoft.com/en-us/azure/architecture/patterns/bulkhead

Learning Notes #12 – Alternate Exchanges | RabbitMQ

27 December 2024 at 10:36

Today I learnt about alternate exchanges, which provide a way to handle undeliverable messages. In this blog, I share notes on what alternate exchanges are, why they are useful, and how to implement them in your RabbitMQ setup.

What Are Alternate Exchanges?

In the normal flow, the producer sends a message to an exchange, and if a queue is bound correctly, the message is placed in that queue.

An alternate exchange in RabbitMQ is a fallback exchange configured for another exchange. If a message cannot be routed to any queue bound to the primary exchange, RabbitMQ will publish the message to the alternate exchange instead. This mechanism ensures that undeliverable messages are not lost but can be processed in a different way, such as logging, alerting, or storing them for later inspection.

When does this scenario happen?

A message goes to an alternate exchange in RabbitMQ in the following scenarios:

1. No Binding for the Routing Key

  • The primary exchange does not have any queue bound to it with the routing key specified in the message.
  • Example: A message with routing key invalid_key is sent to a direct exchange that has no queue bound to invalid_key.

2. Unbound Queues:

  • Even if a queue exists, it is not bound to the primary exchange or the specific routing key used in the message.
  • Example: A queue exists for the primary exchange but is not explicitly bound to any routing key.

3. Exchange Type Mismatch

  • The exchange type (e.g., direct, fanout, topic) does not route the message the way the producer expected, leaving it with no matching queue.
  • Example: A message is published with a routing key to a fanout exchange that has no queues bound to it; since fanout ignores routing keys, the message has nowhere to go and is handed to the alternate exchange.

4. Misconfigured Bindings

  • Bindings exist but do not align with the routing requirements of the message.
  • Example: A topic exchange has a binding for user.* but receives a message with the routing key order.processed.

5. Queue Deletion After Binding

  • A queue was bound to the exchange but is deleted or unavailable at runtime.
  • Example: A message with a valid routing key arrives, but the corresponding queue is no longer active.

6. TTL (Time-to-Live) Expired Queues

  • Messages routed to a queue with a time-to-live setting expire before being consumed. Strictly speaking, expired messages are handled by dead-lettering (a dead-letter exchange) rather than by the alternate exchange, but the two mechanisms are easy to confuse.
  • Example: A primary exchange routes messages to a TTL-bound queue, and expired messages are forwarded to the configured dead-letter exchange.

7. Exchange Misconfiguration

  • The primary exchange is operational, but its configurations prevent messages from being delivered to any queue.
  • Example: A missing or incorrect alternate-exchange argument setup leads to misrouting.

Use Cases for Alternate Exchanges

  • Error Handling: Route undeliverable messages to a dedicated queue for later inspection or reprocessing.
  • Logging: Keep track of messages that fail routing for auditing purposes.
  • Dead Letter Queues: Use alternate exchanges to implement dead-letter queues to analyze why messages could not be routed.
  • Load Balancing: Forward undeliverable messages to another exchange for alternative processing

How to Implement Alternate Exchanges in Python

Let’s walk through the steps to configure and use alternate exchanges in RabbitMQ using Python.

Scenario 1: Handling Messages with Valid and Invalid Routing Keys

producer.py

import pika

# Connect to RabbitMQ
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare the alternate exchange
channel.exchange_declare(exchange='alternate_exchange', exchange_type='fanout')

# Declare a queue and bind it to the alternate exchange
channel.queue_declare(queue='unroutable_queue')
channel.queue_bind(exchange='alternate_exchange', queue='unroutable_queue')

# Declare the primary exchange with an alternate exchange argument
channel.exchange_declare(
    exchange='primary_exchange',
    exchange_type='direct',
    arguments={'alternate-exchange': 'alternate_exchange'}
)

# Declare and bind a queue to the primary exchange
channel.queue_declare(queue='valid_queue')
channel.queue_bind(exchange='primary_exchange', queue='valid_queue', routing_key='key1')

# Publish a message with a valid routing key
channel.basic_publish(
    exchange='primary_exchange',
    routing_key='key1',
    body='Message with a valid routing key'
)

print("Message with valid routing key sent to 'valid_queue'.")

# Publish a message with an invalid routing key
channel.basic_publish(
    exchange='primary_exchange',
    routing_key='invalid_key',
    body='Message with an invalid routing key'
)

print("Message with invalid routing key sent to 'alternate_exchange'.")

# Close the connection
connection.close()

consumer.py

import pika

# Connect to RabbitMQ
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Consume messages from the alternate queue
method_frame, header_frame, body = channel.basic_get(queue='unroutable_queue', auto_ack=True)
if method_frame:
    print(f"Received message from alternate queue: {body.decode()}")
else:
    print("No messages in the alternate queue")

# Close the connection
connection.close()

Scenario 2: Logging Unroutable Messages

producer.py

import pika

# Connect to RabbitMQ
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare the alternate exchange
channel.exchange_declare(exchange='logging_exchange', exchange_type='fanout')

# Declare a logging queue and bind it to the logging exchange
channel.queue_declare(queue='logging_queue')
channel.queue_bind(exchange='logging_exchange', queue='logging_queue')

# Declare the primary exchange with a logging alternate exchange argument
channel.exchange_declare(
    exchange='primary_logging_exchange',
    exchange_type='direct',
    arguments={'alternate-exchange': 'logging_exchange'}
)

# Publish a message with an invalid routing key
channel.basic_publish(
    exchange='primary_logging_exchange',
    routing_key='invalid_logging_key',
    body='Message for logging'
)

print("Message with invalid routing key sent to 'logging_exchange'.")

# Close the connection
connection.close()

consumer.py

import pika

# Connect to RabbitMQ
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Consume messages from the logging queue
method_frame, header_frame, body = channel.basic_get(queue='logging_queue', auto_ack=True)
if method_frame:
    print(f"Logged message: {body.decode()}")
else:
    print("No messages in the logging queue")

# Close the connection
connection.close()

Learning Notes #10 – Lazy Queues | RabbitMQ

26 December 2024 at 06:54

What Are Lazy Queues?

  • Lazy Queues are designed to store messages primarily on disk rather than in memory.
  • They are optimized for use cases involving large message backlogs where minimizing memory usage is critical.

Key Characteristics

  1. Disk-Based Storage – Messages are stored on disk immediately upon arrival, rather than being held in memory.
  2. Low Memory Usage – Only minimal metadata for messages is kept in memory.
  3. Scalability – Can handle millions of messages without consuming significant memory.
  4. Message Retrieval – Retrieving messages is slower because messages are fetched from disk.
  5. Durability – Messages persist on disk, reducing the risk of data loss during RabbitMQ restarts.

Trade-offs

  • Latency: Fetching messages from disk is slower than retrieving them from memory.
  • Throughput: Not suitable for high-throughput, low-latency applications.

Choose Lazy Queues if

  • You need to handle very large backlogs of messages.
  • Memory is a constraint in your system.
  • Latency and throughput are less critical.

Implementation

Pre-requisites

1. Install and run RabbitMQ on your local machine.


docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:4.0-management

2. Install the pika library


pip install pika

Producer (producer.py)

This script sends a persistent message to a Lazy Queue.

import pika

# RabbitMQ connection parameters for localhost
connection_params = pika.ConnectionParameters(host="localhost")

# Connect to RabbitMQ
connection = pika.BlockingConnection(connection_params)
channel = connection.channel()

# Custom Exchange and Routing Key
exchange_name = "custom_exchange"
routing_key = "custom_routing_key"
queue_name = "lazy_queue_example"

# Declare the custom exchange
channel.exchange_declare(
    exchange=exchange_name,
    exchange_type="direct",  # Direct exchange routes messages based on the routing key
    durable=True
)

# Declare a Lazy Queue
channel.queue_declare(
    queue=queue_name,
    durable=True,
    arguments={"x-queue-mode": "lazy"}  # Configure the queue as lazy
)

# Bind the queue to the custom exchange with the routing key
channel.queue_bind(
    exchange=exchange_name,
    queue=queue_name,
    routing_key=routing_key
)

# Publish a message
message = "Hello from the Producer via Custom Exchange!"
channel.basic_publish(
    exchange=exchange_name,
    routing_key=routing_key,
    body=message,
    properties=pika.BasicProperties(delivery_mode=2)  # Persistent message
)

print(f"Message sent to Lazy Queue via Exchange: {message}")

# Close the connection
connection.close()

Consumer (consumer.py)

import pika

# RabbitMQ connection parameters for localhost
connection_params = pika.ConnectionParameters(host="localhost")

# Connect to RabbitMQ
connection = pika.BlockingConnection(connection_params)
channel = connection.channel()

# Custom Exchange and Routing Key
exchange_name = "custom_exchange"
routing_key = "custom_routing_key"
queue_name = "lazy_queue_example"

# Declare the custom exchange
channel.exchange_declare(
    exchange=exchange_name,
    exchange_type="direct",  # Direct exchange routes messages based on the routing key
    durable=True
)

# Declare the Lazy Queue
channel.queue_declare(
    queue=queue_name,
    durable=True,
    arguments={"x-queue-mode": "lazy"}  # Configure the queue as lazy
)

# Bind the queue to the custom exchange with the routing key
channel.queue_bind(
    exchange=exchange_name,
    queue=queue_name,
    routing_key=routing_key
)

# Callback function to process messages
def callback(ch, method, properties, body):
    print(f"Received message: {body.decode()}")
    ch.basic_ack(delivery_tag=method.delivery_tag)  # Acknowledge the message

# Start consuming messages
channel.basic_consume(queue=queue_name, on_message_callback=callback, auto_ack=False)

print("Waiting for messages. To exit, press CTRL+C")
try:
    channel.start_consuming()
except KeyboardInterrupt:
    print("Stopped consuming.")

# Close the connection
connection.close()

Explanation

  1. Producer
    • Defines a custom exchange (custom_exchange) of type direct.
    • Declares a Lazy Queue (lazy_queue_example).
    • Binds the queue to the exchange using a routing key (custom_routing_key).
    • Publishes a persistent message via the custom exchange and routing key.
  2. Consumer
    • Declares the same exchange and Lazy Queue to ensure they exist.
    • Consumes messages routed to the queue through the custom exchange and routing key.
  3. Custom Exchange and Binding
    • The direct exchange type routes messages based on an exact match of the routing key.
    • Binding ensures the queue receives messages published to the exchange with the specified key.
  4. Lazy Queue Behavior
    • Messages are stored directly on disk to minimize memory usage.

Learning Notes #7 – AMQP Protocol and RabbitMQ | An Overview

24 December 2024 at 18:22

Today, I learned about the AMQP protocol and the components of RabbitMQ (connections, channels, queues, exchanges, bindings and the different types of exchanges, acknowledgements and publisher confirmations). I learned all of these from CloudAMQP. In this blog, you will find crisp details on these topics.

1. Overview of AMQP Protocol

  • Advanced Message Queuing Protocol (AMQP) is an open standard for messaging middleware. It enables systems to exchange messages in a reliable and flexible manner.
  • Key components:
    • Producers: Applications that send messages.
    • Consumers: Applications that receive messages.
    • Broker: Middleware (e.g., RabbitMQ) that manages message exchanges.
    • Message: A unit of data transferred between producer and consumer.

2. How AMQP Works in RabbitMQ

  • RabbitMQ implements AMQP to facilitate message exchange. It acts as the broker, managing queues, exchanges, and bindings.
  • AMQP Operations:
    1. Producer sends a message to an exchange.
    2. The exchange routes the message to one or more queues based on bindings.
    3. Consumer retrieves the message from the queue.

3. Connections and Channels

Connections

A connection is a persistent, long-lived TCP connection between a client application and the RabbitMQ broker. Connections are relatively resource-intensive because they involve socket communication and the overhead of establishing and maintaining the connection. Each connection is uniquely identified by the broker and can be shared across multiple threads or processes.

When an application establishes a connection to RabbitMQ, it uses it as a gateway to interact with the broker. This includes creating channels, declaring queues and exchanges, publishing messages, and consuming messages. Connections should ideally be reused across the application to reduce overhead and optimize resource usage.

Channels

A channel is a lightweight, logical communication pathway established within a connection. Channels provide a way to perform multiple operations concurrently over a single connection. They are less resource-intensive than connections and are designed to handle operations such as queue declarations, message publishing, and consuming.

Using channels allows applications to:

  1. Scale efficiently: Instead of opening multiple connections, applications can open multiple channels over a single connection.
  2. Isolate operations: Each channel operates independently. For instance, one channel can consume messages while another publishes.

How They Work Together

When a client connects to RabbitMQ, it first establishes a connection. Within that connection, it can open multiple channels. Each channel operates as a virtual connection, allowing concurrent tasks without needing separate TCP connections.


import pika

# Establish a connection to RabbitMQ
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))

# Create multiple channels on the same connection
channel1 = connection.channel()
channel2 = connection.channel()

# Declare queues on each channel
channel1.queue_declare(queue='queue1')
channel2.queue_declare(queue='queue2')

# Publish messages on different channels
channel1.basic_publish(exchange='', routing_key='queue1', body='Message for Queue 1')
channel2.basic_publish(exchange='', routing_key='queue2', body='Message for Queue 2')

print("Messages sent to both queues!")

# Close the connection
connection.close()

Best Practices (Not Tried; Got this from the video)

  1. Reusing Connections: Establish one connection per application or service and share it across threads or processes for efficiency.
  2. Using Channels for Parallelism: Open separate channels for different operations like publishing and consuming.
  3. Graceful Cleanup: Always close channels and connections when done to avoid resource leaks.

4. Queues

  • Act as message storage.
  • Can be:
    • Durable: Survives broker restarts.
    • Exclusive: Used by a single connection.
    • Auto-delete: Deleted when the last consumer disconnects.

# Declaring a durable queue
channel.queue_declare(queue='durable_queue', durable=True)

# Sending a persistent message
channel.basic_publish(
    exchange='',  # Default exchange
    routing_key='durable_queue',
    body='Persistent message',
    properties=pika.BasicProperties(delivery_mode=2)  # Persistent
)

5. Exchanges

An exchange in RabbitMQ is a routing mechanism that determines how messages sent by producers are directed to queues. Exchanges act as intermediaries between producers and queues, enabling flexible and efficient message routing based on routing rules and patterns.

Types of Exchanges

RabbitMQ supports four types of exchanges, each with its unique routing mechanism:

1. Direct Exchange

  • Routes messages to queues based on an exact match of the routing key.
  • If the routing key in the message matches the binding key of a queue, the message is routed to that queue.
  • Use case: Task queues where each task has a specific destination.

Example:

  • Queue queue1 is bound to the exchange with the routing key info.
  • A message with the routing key info is routed to queue1.

channel.exchange_declare(exchange='direct_exchange', exchange_type='direct')
channel.queue_declare(queue='direct_queue')
channel.queue_bind(exchange='direct_exchange', queue='direct_queue', routing_key='info')
channel.basic_publish(exchange='direct_exchange', routing_key='info', body='Direct message')

2. Fanout Exchange

  • Broadcasts messages to all queues bound to the exchange, ignoring routing keys.
  • Use case: Broadcasting events to multiple consumers, such as notifications or logs.

Example:

  • All queues bound to the exchange receive the same message.

channel.exchange_declare(exchange='fanout_exchange', exchange_type='fanout')
channel.queue_declare(queue='queue1')
channel.queue_declare(queue='queue2')

channel.queue_bind(exchange='fanout_exchange', queue='queue1')
channel.queue_bind(exchange='fanout_exchange', queue='queue2')

channel.basic_publish(exchange='fanout_exchange', routing_key='', body='Broadcast message')

3. Topic Exchange

  • Routes messages to queues based on pattern matching of routing keys.
  • Routing keys are dot-separated words, and queues can bind with patterns using wildcards:
    • * matches exactly one word.
    • # matches zero or more words.
  • Use case: Complex routing scenarios, such as logging systems with multiple log levels and sources.

Example:

  • Queue queue1 is bound with the pattern logs.info.*.
  • A message with the routing key logs.info.app1 is routed to queue1.

channel.exchange_declare(exchange='topic_exchange', exchange_type='topic')
channel.queue_declare(queue='topic_queue')

channel.queue_bind(exchange='topic_exchange', queue='topic_queue', routing_key='logs.info.*')

channel.basic_publish(exchange='topic_exchange', routing_key='logs.info.app1', body='Topic message')

4. Headers Exchange

  • Routes messages based on message header attributes instead of routing keys.
  • Headers can specify conditions like x-match:
    • x-match = all: All specified headers must match.
    • x-match = any: At least one specified header must match.
  • Use case: Advanced filtering scenarios.

Example:

  • Queue queue1 is bound with headers format=json and type=report.

channel.exchange_declare(exchange='headers_exchange', exchange_type='headers')
channel.queue_declare(queue='headers_queue')

channel.queue_bind(
    exchange='headers_exchange',
    queue='headers_queue',
    arguments={'format': 'json', 'type': 'report', 'x-match': 'all'}
)

channel.basic_publish(
    exchange='headers_exchange',
    routing_key='',
    body='Headers message',
    properties=pika.BasicProperties(headers={'format': 'json', 'type': 'report'})
)

Exchange Lifecycle

  1. Declaration: Exchanges must be explicitly declared before use. If an exchange is not declared and a producer tries to publish a message to it, an error will occur.
  2. Binding: Queues are bound to exchanges with routing keys or header arguments.
  3. Publishing: Producers publish messages to exchanges with optional routing keys.

Durable and Non-Durable Exchanges

  • Durable Exchange: Survives broker restarts. Useful for critical applications.
  • Non-Durable Exchange: Deleted when the broker restarts. Suitable for transient tasks.

# Declare a durable exchange
channel.exchange_declare(exchange='durable_exchange', exchange_type='direct', durable=True)

Default Exchange

RabbitMQ provides a built-in default exchange (unnamed exchange) that routes messages directly to a queue with a name matching the routing key.


channel.queue_declare(queue='default_queue')
channel.basic_publish(exchange='', routing_key='default_queue', body='Default exchange message')

Best Practices for Exchanges

  • Use durable exchanges for critical applications that require persistence across broker restarts.
  • Use direct exchanges for targeted delivery when routing keys are predictable.
  • Use fanout exchanges for broadcasting to multiple queues.
  • Use topic exchanges for complex routing needs, especially with hierarchical routing keys.
  • Use headers exchanges for advanced filtering based on metadata.

6. Bindings

Bindings connect queues to exchanges with routing rules.


# Binding a queue with a routing key
channel.queue_bind(exchange='direct_logs', queue='error_logs', routing_key='error')

7. Consumer Acknowledgments

Two acknowledgment types:

  • Manual: Consumer explicitly sends an acknowledgment.
  • Automatic: RabbitMQ assumes successful processing.

# Auto ACK
channel.basic_consume(queue='test_queue', on_message_callback=lambda ch, method, properties, body: print(body), auto_ack=True)

# Manual ACK
def callback(ch, method, properties, body):
    print(f"Received {body}")
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue='test_queue', on_message_callback=callback)
channel.start_consuming()

8. Publisher Confirmations

  • Guarantees that RabbitMQ successfully received a message.
  • Enable publisher confirms for robust systems.

# Enable delivery confirmation
channel.confirm_delivery()

# Publish and handle confirmation
try:
    channel.basic_publish(exchange='direct_logs', routing_key='error', body='Message with confirmation', mandatory=True)  # mandatory=True so unroutable messages raise UnroutableError
    print("Message successfully published!")
except pika.exceptions.UnroutableError:
    print("Message could not be routed.")

9. Virtual Hosts (vhosts)

  • Logical partitions to segregate exchanges, queues, and users.
  • Use vhosts for multi-tenant setups.
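
A quick sketch of connecting to a specific vhost with pika (the vhost name and credentials are illustrative, and the vhost must already exist on the broker, e.g. created via rabbitmqctl add_vhost):

import pika

# Connect to a specific virtual host instead of the default "/"
credentials = pika.PlainCredentials('tenant_a_user', 'tenant_a_password')
params = pika.ConnectionParameters(
    host='localhost',
    port=5672,
    virtual_host='tenant_a',   # illustrative vhost name
    credentials=credentials
)

connection = pika.BlockingConnection(params)
channel = connection.channel()

# Queues and exchanges declared here are scoped to the 'tenant_a' vhost
channel.queue_declare(queue='tenant_a_queue')

connection.close()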

Tomorrow, I am planning to explore more on RabbitMQ. Let’s see tomorrow.
