❌

Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

RabbitMQ – All You Need To Know To Start Building Scalable Platforms

1 February 2025 at 02:39

  1. Introduction
  2. What is a Message Queue ?
  3. So Problem Solved !!! Not Yet
  4. RabbitMQ: Installation
  5. RabbitMQ: An Introduction (Optional)
    1. What is RabbitMQ?
    2. Why Use RabbitMQ?
    3. Key Features and Use Cases
  6. Building Blocks of Message Broker
    1. Connection & Channels
    2. Queues – Message Store
    3. Exchanges – Message Distributor and Binding
  7. Producing, Consuming and Acknowledging
  8. Problem #1 – Task Queue for Background Job Processing
    1. Context
    2. Problem
    3. Proposed Solution
  9. Problem #2 – Broadcasting NEWS to all subscribers
    1. Problem
    2. Solution Overview
    3. Step 1: Producer (Publisher)
    4. Step 2: Consumers (Subscribers)
      1. Consumer 1: Mobile App Notifications
      2. Consumer 2: Email Alerts
      3. Consumer 3: Web Notifications
      4. How It Works
  10. Intermediate Resources
    1. Prefetch Count
    2. Request Reply Pattern
    3. Dead Letter Exchange
    4. Alternate Exchanges
    5. Lazy Queues
    6. Quorom Queues
    7. Change Data Capture
    8. Handling Backpressure in Distributed Systems
    9. Choreography Pattern
    10. Outbox Pattern
    11. Queue Based Loading
    12. Two Phase Commit Protocol
    13. Competing Consumer
    14. Retry Pattern
    15. Can We Use Database as a Queue
  11. Let’s Connect

Introduction

Let’s take the example of an online food ordering system like Swiggy or Zomato. Suppose a user places an order through the mobile app. If the application follows a synchronous approach, it would first send the order request to the restaurant’s system and then wait for confirmation. If the restaurant is busy, the app will have to keep waiting until it receives a response.

If the restaurant’s system crashes or temporarily goes offline, the order will fail, and the user may have to restart the process.

This approach leads to a poor user experience, increases the chances of failures, and makes the system less scalable, as multiple users waiting simultaneously can cause a bottleneck.

In a traditional synchronous communication model, one service directly interacts with another and waits for a response before proceeding. While this approach is simple and works for small-scale applications, it introduces several challenges, especially in systems that require high availability and scalability.

The main problems with synchronous communication include slow performance, system failures, and scalability issues. If the receiving service is slow or temporarily unavailable, the sender has no choice but to wait, which can degrade the overall performance of the application.

Moreover, if the receiving service crashes, the entire process fails, leading to potential data loss or incomplete transactions.

In this book, we are going to solve how this can be solved with a message queue.

What is a Message Queue ?

A message queue is a system that allows different parts of an application (or different applications) to communicate with each other asynchronously by sending and receiving messages.

It acts like a buffer or an intermediary where messages are stored until the receiving service is ready to process them.

How It Works

  1. A producer (sender) creates a message and sends it to the queue.
  2. The message sits in the queue until a consumer (receiver) picks it up.
  3. The consumer processes the message and removes it from the queue.

This process ensures that the sender does not have to wait for the receiver to be available, making the system faster, more reliable, and scalable.

Real-Life Example

Imagine a fast-food restaurant where customers place orders at the counter. Instead of waiting at the counter for their food, customers receive a token number and move aside. The kitchen prepares the order in the background, and when it’s ready, the token number is called for pickup.

In this analogy,

  • The counter is the producer (sending orders).
  • The queue is the token system (storing orders).
  • The kitchen is the consumer (processing orders).
  • The customer picks up the food when ready (message is consumed).

Similarly, in applications, a message queue helps decouple systems, allowing them to work at their own pace without blocking each other. RabbitMQ, Apache Kafka, and Redis are popular message queue systems used in modern software development. πŸš€

So Problem Solved !!! Not Yet

It seems like problem is solved, but the message life cycle in the queue is need to handled.

  • Message Routing & Binding (Optional) – How a message is routed ?. If an exchange is used, the message is routed based on predefined rules.
  • Message Storage (Queue Retention) – How long a message stays in the queue. The message stays in the queue until a consumer picks it up.
  • If the consumer successfully processes the message, it sends an acknowledgment (ACK), and the message is removed. If the consumer fails, the message requeues or moves to a dead-letter queue (DLQ).
  • Messages that fail multiple times, are not acknowledged, or expire may be moved to a Dead-Letter Queue for further analysis.
  • Messages stored only in memory can be lost if RabbitMQ crashes.
  • Messages not consumed within their TTL expire.
  • If a consumer fails to acknowledge a message, it may be reprocessed twice.
  • Messages failing multiple times may be moved to a DLQ.
  • Too many messages in the queue due to slow consumers can cause system slowdowns.
  • Network failures can disrupt message delivery between producers, RabbitMQ, and consumers.
  • Messages with corrupt or bad data may cause repeated consumer failures.

To handle all the above problems, we need a tool. Stable, Battle tested, Reliable tool. RabbitMQ is one kind of that tool. In this book we will cover the basics of RabbitMQ.

RabbitMQ: Installation

For RabbitMQ Installation please refer to https://www.rabbitmq.com/docs/download. In this book we will go with RabbitMQ docker.

docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:4.0-management


RabbitMQ: An Introduction (Optional)

What is RabbitMQ?

Imagine you’re sending messages between friends, but instead of delivering them directly, you drop them in a mailbox, and your friend picks them up when they are ready. RabbitMQ acts like this mailbox, but for computer programs. It helps applications communicate asynchronously, meaning they don’t have to wait for each other to process data.

RabbitMQ is a message broker, which means it handles and routes messages between different parts of an application. It ensures that messages are delivered efficiently, even when some components are running at different speeds or go offline temporarily.

Why Use RabbitMQ?

Modern applications often consist of multiple services that need to exchange data. Sometimes, one service produces data faster than another can consume it. Instead of forcing the slower service to catch up or making the faster service wait, RabbitMQ allows the fast service to place messages in a queue. The slow service can then process them at its own pace.

Some key benefits of using RabbitMQ include,

  • Decoupling services: Components communicate via messages rather than direct calls, reducing dependencies.
  • Scalability: RabbitMQ allows multiple consumers to process messages in parallel.
  • Reliability: It supports message durability and acknowledgments, preventing message loss.
  • Flexibility: Works with many programming languages and integrates well with different systems.
  • Efficient Load Balancing: Multiple consumers can share the message load to prevent overload on a single component.

Key Features and Use Cases

RabbitMQ is widely used in different applications, including

  • Chat applications: Messages are queued and delivered asynchronously to users.
  • Payment processing: Orders are placed in a queue and processed sequentially.
  • Event-driven systems: Used for microservices communication and event notification.
  • IoT systems: Devices publish data to RabbitMQ, which is then processed by backend services.
  • Job queues: Background tasks such as sending emails or processing large files.

Building Blocks of Message Broker

Connection & Channels

In RabbitMQ, connections and channels are fundamental concepts for communication between applications and the broker,

Connections: A connection is a TCP link between a client (producer or consumer) and the RabbitMQ broker. Each connection consumes system resources and is relatively expensive to create and maintain.

Channels: A channel is a virtual communication path inside a connection. It allows multiple logical streams of data over a single TCP connection, reducing overhead. Channels are lightweight and preferred for performing operations like publishing and consuming messages.

Queues – Message Store

A queue is a message buffer that temporarily holds messages until a consumer retrieves and processes them.

1. Queues operate on a FIFO (First In, First Out) basis, meaning messages are processed in the order they arrive (unless priorities or other delivery strategies are set).

2. Queues persist messages if they are declared as durable and the messages are marked as persistent, ensuring reliability even if RabbitMQ restarts.

3. Multiple consumers can subscribe to a queue, and messages can be distributed among them in a round-robin manner.

Consumption by multiple consumers,

Can also be broadcasted,

4. If no consumers are available, messages remain in the queue until a consumer connects.

Analogy: Think of a queue as a to-do list where tasks (messages) are stored until someone (a worker/consumer) picks them up and processes them.

Exchanges – Message Distributor and Binding

An exchange is responsible for routing messages to one or more queues based on routing rules.

When a producer sends a message, it doesn’t go directly to a queue but first reaches an exchange, which decides where to forward it.πŸ”₯

The blue color line is called as Binding. A binding is the link between the exchange and the queue, guiding messages to the right place.

RabbitMQ supports different types of exchanges

Direct Exchange (direct)

  • Routes messages to queues based on an exact match between the routing key and the queue’s binding key.
  • Example: Sending messages to a specific queue based on a severity level (info, error, warning).


Fanout Exchange (fanout)

  • Routes messages to all bound queues, ignoring routing keys.
  • Example: Broadcasting notifications to multiple services at once.

Topic Exchange (topic)

  • Routes messages based on pattern matching using * (matches one word) and # (matches multiple words).
  • Example: Routing logs where log.info goes to one queue, log.error goes to another, and log.* captures all.

Headers Exchange (headers)

  • Routes messages based on message headers instead of routing keys.
  • Example: Delivering messages based on metadata like device: mobile or region: US.

Analogy: An exchange is like a traffic controller that decides which road (queue) a vehicle (message) should take based on predefined rules.

Binding

A binding is a link between an exchange and a queue that defines how messages should be routed.

  • When a queue is bound to an exchange with a binding key, messages with a matching routing key are delivered to that queue.
  • A queue can have multiple bindings to different exchanges, allowing it to receive messages from multiple sources.

Example:

  • A queue named error_logs can be bound to a direct exchange with a binding key error.
  • Another queue, all_logs, can be bound to the same exchange with a binding key # (wildcard in a topic exchange) to receive all logs.

Analogy: A binding is like a GPS route guiding messages (vehicles) from the exchange (traffic controller) to the right queue (destination).

Producing, Consuming and Acknowledging

RabbitMQ follows the producer-exchange-queue-consumer model,

  • Producing messages (Publishing): A producer creates a message and sends it to RabbitMQ, which routes it to the correct queue.
  • Consuming messages (Subscribing): A consumer listens for messages from the queue and processes them.
  • Acknowledgment: The consumer sends an acknowledgment (ack) after successfully processing a message.
  • Durability: Ensures messages and queues survive RabbitMQ restarts.

Why do we need an Acknowledgement ?

  1. Ensures message reliability – Prevents messages from being lost if a consumer crashes.
  2. Prevents message loss – Messages are redelivered if no ACK is received.
  3. Avoids unintentional message deletion – Messages stay in the queue until properly processed.
  4. Supports at-least-once delivery – Ensures every message is processed at least once.
  5. Enables load balancing – Distributes messages fairly among multiple consumers.
  6. Allows manual control – Consumers can acknowledge only after successful processing.
  7. Handles redelivery – Messages can be requeued and sent to another consumer if needed.

Problem #1 – Task Queue for Background Job Processing

Context

A company runs an image processing application where users upload images that need to be resized, watermarked, and optimized before they can be served. Processing these images synchronously would slow down the user experience, so the company decides to implement an asynchronous task queue using RabbitMQ.

Problem

  • Users upload large images that require multiple processing steps.
  • Processing each image synchronously blocks the application, leading to slow response times.
  • High traffic results in queue buildup, making it challenging to scale the system efficiently.

Proposed Solution

1. Producer Service

  • Publishes image processing tasks to a RabbitMQ exchange (task_exchange).
  • Sends the image filename as the message body to the queue (image_queue).

2. Worker Consumers

  • Listen for new image processing tasks from the queue.
  • Process each image (resize, watermark, optimize, etc.).
  • Acknowledge completion to ensure no duplicate processing.

3. Scalability

  • Multiple workers can run in parallel to process images faster.

producer.py

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare exchange and queue
channel.exchange_declare(exchange='task_exchange', exchange_type='direct')
channel.queue_declare(queue='image_queue')

# Bind queue to exchange
channel.queue_bind(exchange='task_exchange', queue='image_queue', routing_key='image_task')

# List of images to process
images = ["image1.jpg", "image2.jpg", "image3.jpg"]

for image in images:
    channel.basic_publish(exchange='task_exchange', routing_key='image_task', body=image)
    print(f" [x] Sent {image}")

connection.close()

consumer.py

import pika
import time

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare exchange and queue
channel.exchange_declare(exchange='task_exchange', exchange_type='direct')
channel.queue_declare(queue='image_queue')

# Bind queue to exchange
channel.queue_bind(exchange='task_exchange', queue='image_queue', routing_key='image_task')

def process_image(ch, method, properties, body):
    print(f" [x] Processing {body.decode()}")
    time.sleep(2)  # Simulate processing time
    print(f" [x] Finished {body.decode()}")
    ch.basic_ack(delivery_tag=method.delivery_tag)

# Start consuming
channel.basic_consume(queue='image_queue', on_message_callback=process_image)
print(" [*] Waiting for image tasks. To exit press CTRL+C")
channel.start_consuming()

Problem #2 – Broadcasting NEWS to all subscribers

Problem

A news application wants to send breaking news alerts to all subscribers, regardless of their location or interest.

Use a fanout exchange (news_alerts_exchange) to broadcast messages to all connected queues, ensuring all users receive the alert.

πŸ”Ή Example

  • mobile_app_queue (for users receiving push notifications)
  • email_alert_queue (for users receiving email alerts)
  • web_notification_queue (for users receiving notifications on the website)

Solution Overview

  • We create a fanout exchange called news_alerts_exchange.
  • Multiple queues (mobile_app_queue, email_alert_queue, and web_notification_queue) are bound to this exchange.
  • A producer publishes messages to the exchange.
  • Each consumer listens to its respective queue and receives the alert.

Step 1: Producer (Publisher)

This script publishes a breaking news alert to the fanout exchange.

import pika

# Establish connection
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Declare a fanout exchange
channel.exchange_declare(exchange="news_alerts_exchange", exchange_type="fanout")

# Publish a message
message = "Breaking News: Major event happening now!"
channel.basic_publish(exchange="news_alerts_exchange", routing_key="", body=message)

print(f" [x] Sent: {message}")

# Close connection
connection.close()

Step 2: Consumers (Subscribers)

Each consumer listens to its respective queue and processes the alert.

Consumer 1: Mobile App Notifications

import pika

# Establish connection
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Declare exchange
channel.exchange_declare(exchange="news_alerts_exchange", exchange_type="fanout")

# Declare a queue (auto-delete if no consumers)
queue_name = "mobile_app_queue"
channel.queue_declare(queue=queue_name)
channel.queue_bind(exchange="news_alerts_exchange", queue=queue_name)

# Callback function
def callback(ch, method, properties, body):
    print(f" [Mobile App] Received: {body.decode()}")

# Consume messages
channel.basic_consume(queue=queue_name, on_message_callback=callback, auto_ack=True)
print(" [*] Waiting for news alerts...")
channel.start_consuming()

Consumer 2: Email Alerts

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="news_alerts_exchange", exchange_type="fanout")

queue_name = "email_alert_queue"
channel.queue_declare(queue=queue_name)
channel.queue_bind(exchange="news_alerts_exchange", queue=queue_name)

def callback(ch, method, properties, body):
    print(f" [Email Alert] Received: {body.decode()}")

channel.basic_consume(queue=queue_name, on_message_callback=callback, auto_ack=True)
print(" [*] Waiting for news alerts...")
channel.start_consuming()

Consumer 3: Web Notifications

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="news_alerts_exchange", exchange_type="fanout")

queue_name = "web_notification_queue"
channel.queue_declare(queue=queue_name)
channel.queue_bind(exchange="news_alerts_exchange", queue=queue_name)

def callback(ch, method, properties, body):
    print(f" [Web Notification] Received: {body.decode()}")

channel.basic_consume(queue=queue_name, on_message_callback=callback, auto_ack=True)
print(" [*] Waiting for news alerts...")
channel.start_consuming()

How It Works

  1. The producer sends a news alert to the fanout exchange (news_alerts_exchange).
  2. All queues (mobile_app_queue, email_alert_queue, web_notification_queue) bound to the exchange receive the message.
  3. Each consumer listens to its queue and processes the alert.

This setup ensures all users receive the alert simultaneously across different platforms. πŸš€

Intermediate Resources

Prefetch Count

Prefetch is a mechanism that defines how many messages can be delivered to a consumer at a time before the consumer sends an acknowledgment back to the broker. This ensures that the consumer does not get overwhelmed with too many unprocessed messages, which could lead to high memory usage and potential performance issues.

To Know More: https://parottasalna.com/2024/12/29/learning-notes-16-prefetch-count-rabbitmq/

Request Reply Pattern

The Request-Reply Pattern is a fundamental communication style in distributed systems, where a requester sends a message to a responder and waits for a reply. It’s widely used in systems that require synchronous communication, enabling the requester to receive a response for further processing.

To Know More: https://parottasalna.com/2024/12/28/learning-notes-15-request-reply-pattern-rabbitmq/

Dead Letter Exchange

A dead letter is a message that cannot be delivered to its intended queue or is rejected by a consumer. Common scenarios where messages are dead lettered include,

  1. Message Rejection: A consumer explicitly rejects a message without requeuing it.
  2. Message TTL (Time-To-Live) Expiry: The message remains in the queue longer than its TTL.
  3. Queue Length Limit: The queue has reached its maximum capacity, and new messages are dropped.
  4. Routing Failures: Messages that cannot be routed to any queue from an exchange.

To Know More: https://parottasalna.com/2024/12/28/learning-notes-14-dead-letter-exchange-rabbitmq/

Alternate Exchanges

An alternate exchange in RabbitMQ is a fallback exchange configured for another exchange. If a message cannot be routed to any queue bound to the primary exchange, RabbitMQ will publish the message to the alternate exchange instead. This mechanism ensures that undeliverable messages are not lost but can be processed in a different way, such as logging, alerting, or storing them for later inspection.

To Know More: https://parottasalna.com/2024/12/27/learning-notes-12-alternate-exchanges-rabbitmq/

Lazy Queues

  • Lazy Queues are designed to store messages primarily on disk rather than in memory.
  • They are optimized for use cases involving large message backlogs where minimizing memory usage is critical.

To Know More: https://parottasalna.com/2024/12/26/learning-notes-10-lazy-queues-rabbitmq/

Quorom Queues

  • Quorum Queues are distributed queues built on the Raft consensus algorithm.
  • They are designed for high availability, durability, and data safety by replicating messages across multiple nodes in a RabbitMQ cluster.
  • Its a replacement of Mirrored Queues.

To Know More: https://parottasalna.com/2024/12/25/learning-notes-9-quorum-queues-rabbitmq/

Change Data Capture

CDC stands for Change Data Capture. It’s a technique that listens to a database and captures every change that happens in it. These changes can then be sent to other systems to,

  • Keep data in sync across multiple databases.
  • Power real-time analytics dashboards.
  • Trigger notifications for certain database events.
  • Process data streams in real time.

To Know More: https://parottasalna.com/2025/01/19/learning-notes-63-change-data-capture-what-does-it-do/

Handling Backpressure in Distributed Systems

Backpressure occurs when a downstream system (consumer) cannot keep up with the rate of data being sent by an upstream system (producer). In distributed systems, this can arise in scenarios such as

  • A message queue filling up faster than it is drained.
  • A database struggling to handle the volume of write requests.
  • A streaming system overwhelmed by incoming data.

To Know More: https://parottasalna.com/2025/01/07/learning-notes-45-backpressure-handling-in-distributed-systems/

Choreography Pattern

In the Choreography Pattern, services communicate directly with each other via asynchronous events, without a central controller. Each service is responsible for a specific part of the workflow and responds to events produced by other services. This pattern allows for a more autonomous and loosely coupled system.

To Know More: https://parottasalna.com/2025/01/05/learning-notes-38-choreography-pattern-cloud-pattern/

Outbox Pattern

The Outbox Pattern is a proven architectural solution to this problem, helping developers manage data consistency, especially when dealing with events, messaging systems, or external APIs.

To Know More: https://parottasalna.com/2025/01/03/learning-notes-31-outbox-pattern-cloud-pattern/

Queue Based Loading

The Queue-Based Loading Pattern leverages message queues to decouple and coordinate tasks between producers (such as applications or services generating data) and consumers (services or workers processing that data). By using queues as intermediaries, this pattern allows systems to manage workloads efficiently, ensuring seamless and scalable operation.

To Know More: https://parottasalna.com/2025/01/03/learning-notes-30-queue-based-loading-cloud-patterns/

Two Phase Commit Protocol

The Two-Phase Commit (2PC) protocol is a distributed algorithm used to ensure atomicity in transactions spanning multiple nodes or databases. Atomicity ensures that either all parts of a transaction are committed or none are, maintaining consistency in distributed systems.

To Know More: https://parottasalna.com/2025/01/03/learning-notes-29-two-phase-commit-protocol-acid-in-distributed-systems/

Competing Consumer

The competing consumer pattern involves multiple consumers that independently compete to process messages or tasks from a shared queue. This pattern is particularly effective in scenarios where the rate of incoming tasks is variable or high, as it allows multiple consumers to process tasks concurrently.

To Know More: https://parottasalna.com/2025/01/01/learning-notes-24-competing-consumer-messaging-queue-patterns/

Retry Pattern

The Retry Pattern is a design strategy used to manage transient failures by retrying failed operations. Instead of immediately failing an operation after an error, the pattern retries it with an optional delay or backoff strategy. This is particularly useful in distributed systems where failures are often temporary.

To Know More: https://parottasalna.com/2024/12/31/learning-notes-23-retry-pattern-cloud-patterns/

Can We Use Database as a Queue

Developers try to use their RDBMS as a way to do background processing or service communication. While this can often appear to β€˜get the job done’, there are a number of limitations and concerns with this approach.

There are two divisions to any asynchronous processing: the service(s) that create processing tasks and the service(s) that consume and process these tasks accordingly.

To Know More: https://parottasalna.com/2024/06/15/can-we-use-database-as-queue-in-asynchronous-process/

Let’s Connect

Telegram: https://t.me/parottasalna/1

LinkedIn: https://www.linkedin.com/in/syedjaferk/

Whatsapp Channel: https://whatsapp.com/channel/0029Vavu8mF2v1IpaPd9np0s

Youtube: https://www.youtube.com/@syedjaferk

Github: https://github.com/syedjaferk/

Learning Notes #39 – Compensation Pattern | Cloud Pattern

5 January 2025 at 12:50

Today i learnt about compensation pattern, where it rollback a transactions when it face some failures. In this blog i jot down notes on compensating pattern and how it relates with SAGA pattern.

Distributed systems often involve multiple services working together to perform a business operation. Ensuring data consistency and reliability across these services is challenging, especially in cases of failure. One solution is the use of compensation transactions, a mechanism designed to maintain consistency by reversing the effects of previous operations when errors occur.

What Are Compensation Transactions?

A compensation transaction is an operation that undoes the effect of a previously executed operation. Unlike traditional rollback mechanisms in centralized databases, compensation transactions are explicitly defined and executed in distributed systems to maintain consistency after a failure.

Key Characteristics

  • Explicit Definition: Compensation logic must be explicitly implemented.
  • Independent Execution: Compensation operations are separate from the main transaction.
  • Eventual Consistency: Ensures the system reaches a consistent state over time.
  • Asynchronous Nature: Often triggered asynchronously to avoid blocking main processes.

Why Are Compensation Transactions Important?

1. Handling Failures in Distributed Systems

In a distributed architecture, such as microservices, different services may succeed or fail independently. Compensation transactions allow partial rollbacks to maintain overall consistency.

2. Avoiding Global Locking

Traditional transactions with global locks (e.g., two-phase commits) are not feasible in distributed systems due to performance and scalability concerns. Compensation transactions provide a more flexible alternative.

3. Resilience and Fault Tolerance

Compensation mechanisms make systems more resilient by allowing recovery from failures without manual intervention.

How Compensation Transactions Work

  1. Perform Main Operations: Each service performs its assigned operation, such as creating a record or updating a database.
  2. Log Operations: Log actions and context to enable compensating transactions if needed.
  3. Detect Failure: Monitor the workflow for errors or failures in any service.
  4. Trigger Compensation: If a failure occurs, execute compensation transactions for all successfully completed operations to undo their effects.

Example Workflow

Imagine an e-commerce checkout process involving three steps

  • Step 1: Reserve inventory.
  • Step 2: Deduct payment.
  • Step 3: Confirm order.

If Step 3 fails, compensation transactions for Steps 1 and 2 might include

  • Releasing the reserved inventory.
  • Refunding the payment.

Design Considerations for Compensation Transactions

1. Idempotency

Ensure compensating actions are idempotent, meaning they can be executed multiple times without unintended side effects. This is crucial in distributed systems where retries are common.

2. Consistency Model

Adopt an eventual consistency model to align with the asynchronous nature of compensation transactions.

3. Error Handling

Design robust error-handling mechanisms for compensating actions, as these too can fail.

4. Service Communication

Use reliable communication protocols (e.g., message queues) to trigger and manage compensation transactions.

5. Isolation of Compensation Logic

Keep compensation logic isolated from the main business logic to maintain clarity and modularity.

Use Cases for Compensation Transactions

1. Financial Systems

  • Reversing failed fund transfers or unauthorized transactions.
  • Refunding payments in e-commerce platforms.

2. Travel and Booking Systems

  • Canceling a hotel reservation if flight booking fails.
  • Releasing blocked seats if payment is not completed.

3. Healthcare Systems

  • Undoing scheduled appointments if insurance validation fails.
  • Revoking prescriptions if a linked process encounters errors.

4. Supply Chain Management

  • Canceling shipment orders if inventory updates fail.
  • Restocking items if order fulfillment is aborted.

Challenges of Compensation Transactions

  1. Complexity in Implementation: Designing compensating logic for every operation can be tedious and error-prone.
  2. Performance Overhead: Logging operations and executing compensations can introduce latency.
  3. Partial Rollbacks: It may not always be possible to fully undo certain operations, such as sending emails or notifications.
  4. Failure in Compensating Actions: Compensation transactions themselves can fail, requiring additional mechanisms to handle such scenarios.

Best Practices

  1. Plan for Compensation Early: Design compensating transactions as part of the initial development process.
  2. Use SAGA Pattern: Combine compensation transactions with the SAGA pattern to manage distributed workflows effectively.
  3. Test Extensively: Simulate failures and test compensating logic under various conditions.
  4. Monitor and Log: Maintain detailed logs of operations and compensations for debugging and audits.

Learning Notes #29 – Two Phase Commit Protocol | ACID in Distributed Systems

3 January 2025 at 13:45

Today, i learnt about compensating transaction pattern which leads to two phase commit protocol which helps in maintaining the Atomicity of a distributed transactions. Distributed transactions are hard.

In this blog, i jot down notes on Two Phase Commit protocol for better understanding.

The Two-Phase Commit (2PC) protocol is a distributed algorithm used to ensure atomicity in transactions spanning multiple nodes or databases. Atomicity ensures that either all parts of a transaction are committed or none are, maintaining consistency in distributed systems.

Why Two-Phase Commit?

In distributed systems, a transaction might involve several independent nodes, each maintaining its own database. Without a mechanism like 2PC, failures in one node can leave the system in an inconsistent state.

For example, consider an e-commerce platform where a customer places an order.

The transaction involves updating the inventory in one database, recording the payment in another, and generating a shipment request in a third system. If the payment database successfully commits but the inventory database fails, the system becomes inconsistent, potentially causing issues like double selling or incomplete orders. 2PC mitigates this by providing a coordinated protocol to commit or abort transactions across all nodes.

The Phases of 2PC

The protocol operates in two main phases

1. Prepare Phase (Voting Phase)

The coordinator node initiates the transaction and prepares to commit it across all participating nodes.

  1. Request to Prepare: The coordinator sends a PREPARE request to all participant nodes.
  2. Vote: Each participant checks if it can commit the transaction (e.g., no constraints violated, resources available). It logs its decision (YES or NO) locally and sends its vote to the coordinator. If any participant votes NO, the transaction cannot be committed.

2. Commit Phase (Decision Phase)

Based on the votes received in the prepare phase, the coordinator decides the final outcome.

Commit Decision:

If all participants vote YES, the coordinator logs a COMMIT decision, sends COMMIT messages to all participants, and participants apply the changes and confirm with an acknowledgment.

Abort Decision:

If any participant votes NO, the coordinator logs an ABORT decision, sends ABORT messages to all participants, and participants roll back any changes made during the transaction.

Implementation:

For a simple implementation of 2PC, we can try out the below flow using RabbitMQ as a medium for Co-Ordinator.

Basically, we need not to write this from scratch, we have tools,

1. Relational Databases

Most relational databases have built-in support for distributed transactions and 2PC.

  • PostgreSQL: Implements distributed transactions using foreign data wrappers (FDWs) with PREPARE TRANSACTION and COMMIT PREPARED.
  • MySQL: Supports XA transactions, which follow the 2PC protocol.
  • Oracle Database: Offers robust distributed transaction support using XA.
  • Microsoft SQL Server: Provides distributed transactions through MS-DTC.

2. Distributed Transaction Managers

These tools manage distributed transactions across multiple systems.

  • Atomikos: A popular Java-based transaction manager supporting JTA/XA for distributed systems.
  • Bitronix: Another lightweight transaction manager for Java applications supporting JTA/XA.
  • JBoss Transactions (Narayana): A robust Java transaction manager that supports 2PC, often used in conjunction with JBoss servers.

3. Message Brokers

Message brokers provide transaction capabilities with 2PC.

  • RabbitMQ: Supports the 2PC protocol using transactional channels.
  • Apache Kafka: Supports transactions, ensuring β€œexactly-once” semantics across producers and consumers.
  • ActiveMQ: Provides distributed transaction support through JTA integration

4. Workflow Engines

Workflow engines can orchestrate 2PC across distributed systems.

  • Apache Camel: Can coordinate 2PC transactions using its transaction policy.
  • Camunda: Provides BPMN-based orchestration that can include transactional boundaries.
  • Zeebe: Supports distributed transaction workflows in modern architectures.

Key Properties of 2PC

  1. Atomicity: Ensures all-or-nothing transaction behavior.
  2. Consistency: Guarantees system consistency across all nodes.
  3. Durability: Uses logs to ensure decisions survive node failures.

Challenges of 2PC

  1. Blocking Nature: If the coordinator fails during the commit phase, participants must wait indefinitely unless a timeout or external mechanism is implemented.
  2. Performance Overhead: Multiple message exchanges and logging operations introduce latency.
  3. Single Point of Failure: The coordinator’s failure can stall the entire transaction.

Lucifer and the Git-Powered Calculator: The Complete Adventure

26 August 2024 at 01:43

In losangels , a young coder named Lucifer set out on a mission to build his very own calculator. Along the way, he learned how to use Git, a powerful tool that would help him track his progress and manage his code. Here’s the complete story of how Lucifer built his calculator, step by step, with the power of Git.

Step 1: Setting Up the Project with git init

Lucifer started by creating a new directory for his project and initializing a Git repository. This was like setting up a magical vault to store all his coding adventures.


mkdir MagicCalculator
cd MagicCalculator
git init

This command created the .git directory inside the MagicCalculator folder, where Git would keep track of everything.

Step 2: Configuring His Identity with git config

Before getting too far, Lucifer needed to make sure Git knew who he was. He configured his username and email address, so every change he made would be recorded in his name.


git config --global user.name "Lucifer"
git config --global user.email "lucifer@codeville.com"

Step 3: Writing the Addition Function and Staging It with git add

Lucifer began his calculator project by writing a simple function to add two numbers. He created a new Python file named main.py and added the following code,


# main.py

def add(x, y):
    return x + y

# Simple test to ensure it's working
print(add(5, 3))  # Output should be 8

Happy with his progress, Lucifer used the git add command to stage his changes. This was like preparing the code to be saved in Git’s memory.


git add main.py

Step 4: Committing the First Version with git commit

Next, Lucifer made his first commit. This saved the current state of his project, along with a message describing what he had done.


git commit -m "Added addition function"

Now, Git had recorded the addition function as the first chapter in the history of Lucifer’s project.

Step 5: Adding More Functions and Committing Them

Lucifer continued to add more functions to his calculator. First, he added subtraction,


# main.py

def add(x, y):
    return x + y

def subtract(x, y):
    return x - y

# Simple tests
print(add(5, 3))       # Output: 8
print(subtract(5, 3))  # Output: 2

He then staged and committed the subtraction function,


git add main.py
git commit -m "Added subtraction function"

Lucifer added multiplication and division next,


# main.py

def add(x, y):
    return x + y

def subtract(x, y):
    return x - y

def multiply(x, y):
    return x * y

def divide(x, y):
    if y != 0:
        return x / y
    else:
        return "Cannot divide by zero!"

# Simple tests
print(add(5, 3))       # Output: 8
print(subtract(5, 3))  # Output: 2
print(multiply(5, 3))  # Output: 15
print(divide(5, 3))    # Output: 1.666...
print(divide(5, 0))    # Output: Cannot divide by zero!

Again, he staged and committed these changes,


git add main.py
git commit -m "Added multiplication and division functions"

Step 6: Branching Out with git branch and git checkout

Lucifer had an idea to add a feature that would let users choose which operation to perform. However, he didn’t want to risk breaking his existing code. So, he created a new branch to work on this feature.


git branch operation-choice
git checkout operation-choice

Now on the operation-choice branch, Lucifer wrote the code to let users select an operation,


# main.py

def add(x, y):
    return x + y

def subtract(x, y):
    return x - y

def multiply(x, y):
    return x * y

def divide(x, y):
    if y != 0:
        return x / y
    else:
        return "Cannot divide by zero!"

def calculator():
    print("Select operation:")
    print("1. Add")
    print("2. Subtract")
    print("3. Multiply")
    print("4. Divide")

    choice = input("Enter choice (1/2/3/4): ")

    num1 = float(input("Enter first number: "))
    num2 = float(input("Enter second number: "))

    if choice == '1':
        print(f"{num1} + {num2} = {add(num1, num2)}")

    elif choice == '2':
        print(f"{num1} - {num2} = {subtract(num1, num2)}")

    elif choice == '3':
        print(f"{num1} * {num2} = {multiply(num1, num2)}")

    elif choice == '4':
        print(f"{num1} / {num2} = {divide(num1, num2)}")

    else:
        print("Invalid input")

# Run the calculator
calculator()

Step 7: Merging the Feature into the Main Branch

After testing his new feature and making sure it worked, Lucifer was ready to merge it back into the main branch. He switched back to the main branch and merged the changes


git checkout main
git merge operation-choice

With this, the feature was successfully added to his calculator project.

Conclusion: Lucifer’s Git-Powered Calculator

By the end of his adventure, Lucifer had built a fully functional calculator and learned how to use Git to manage his code. His calculator could add, subtract, multiply, and divide, and even let users choose which operation to perform.

Thanks to Git, Lucifer’s project was well-organized, and he had a complete history of all the changes he made. He knew that if he ever needed to revisit an old version or experiment with new features, Git would be there to help.

Lucifer’s calculator project was a success, and with his newfound Git skills, he felt ready to take on even bigger challenges in the future.

Git Commands

By: Elavarasu
2 March 2024 at 12:24
  • git clone
  • git add
  • git commit
  • git push
  • git pull
  • git branch
  • git checkout
  • git status

git clone remote

to create new repository in github cloud , then create a local repo and clone 2 repositories using git clone remote (repo link).

git add filename
git add .

to add a single file using git add and add all the files which present in the current repo the use git add . | . represent add all the files.

git commit -m "message"

commit the added file using git commit and put some mesagges about the file using -m ” ” .

some ideas of commit messages are available in below mentioned link.

Reference : conventional commit.org | https://www.conventionalcommits.org/en/v1.0.0/

git rm --cached filename
if incase want to remove the added file use git remove
git rm --cached -f filename
-f --> forcefully remove the file.
git push

push the commited file from the local to cloud repo using git push. it will push the file to remote repository.

git pull 

when u need to change or edit or modify the remote file first you will pull the file to local repository from remote using git pull.

git branch 
Branch means where will you store the files in git remote.git branch is to specify the branch name it will push the file to current remote branch.

git checkout -b "branchname"
to create a new branch.

git checkout branchname
check the created branch or new branch

git branch -M branchname
to modify or change the branch name

git branch -a
to listout all branches.
git status
check all the status of previously used command like know the added file is staged or unstaged

Reference : https://www.markdownguide.org/cheat-sheet/

markdown cheathseet for readme file

Reference : conventional commit.org | https://www.conventionalcommits.org/en/v1.0.0/

for commit messages.

Git Commands

By: Elavarasu
2 March 2024 at 12:24
  • git clone
  • git add
  • git commit
  • git push
  • git pull
  • git branch
  • git checkout
  • git status

git clone remote

to create new repository in github cloud , then create a local repo and clone 2 repositories using git clone remote (repo link).

git add filename
git add .

to add a single file using git add and add all the files which present in the current repo the use git add . | . represent add all the files.

git commit -m "message"

commit the added file using git commit and put some mesagges about the file using -m ” ” .

some ideas of commit messages are available in below mentioned link.

Reference : conventional commit.org | https://www.conventionalcommits.org/en/v1.0.0/

git rm --cached filename
if incase want to remove the added file use git remove
git rm --cached -f filename
-f --> forcefully remove the file.
git push

push the commited file from the local to cloud repo using git push. it will push the file to remote repository.

git pull 

when u need to change or edit or modify the remote file first you will pull the file to local repository from remote using git pull.

git branch 
Branch means where will you store the files in git remote.git branch is to specify the branch name it will push the file to current remote branch.

git checkout -b "branchname"
to create a new branch.

git checkout branchname
check the created branch or new branch

git branch -M branchname
to modify or change the branch name

git branch -a
to listout all branches.
git status
check all the status of previously used command like know the added file is staged or unstaged

Reference : https://www.markdownguide.org/cheat-sheet/

markdown cheathseet for readme file

Reference : conventional commit.org | https://www.conventionalcommits.org/en/v1.0.0/

for commit messages.

❌
❌