❌

Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Learning Notes #40 – SAGA Pattern | Cloud Patterns

5 January 2025 at 17:08

Today, I learnt about SAGA Pattern, followed by Compensation Pattern, Orchestration Pattern, Choreography Pattern and Two Phase Commit. SAGA is a combination of all the above. In this blog, i jot down notes on SAGA, for my future self.

Modern software applications often require the coordination of multiple distributed services to perform complex business operations. In such systems, ensuring consistency and reliability can be challenging, especially when a failure occurs in one of the services. The SAGA design pattern offers a robust solution to manage distributed transactions while maintaining data consistency.

What is the SAGA Pattern?

The SAGA pattern is a distributed transaction management mechanism where a series of independent operations (or steps) are executed sequentially across multiple services. Each operation in the sequence has a corresponding compensating action to roll back changes if a failure occurs. This approach avoids the complexities of distributed transactions, such as two-phase commits, by breaking down the process into smaller, manageable units.

Key Characteristics

  1. Decentralized Control: Transactions are managed across services without a central coordinator.
  2. Compensating Transactions: Every operation has an undo or rollback mechanism.
  3. Asynchronous Communication: Services communicate asynchronously in most implementations, ensuring loose coupling.

Types of SAGA Patterns

There are two primary types of SAGA patterns:

1. Choreography-Based SAGA

  • In this approach, services communicate with each other directly to coordinate the workflow.
  • Each service knows which operation to trigger next after completing its own task.
  • If a failure occurs, each service initiates its compensating action to roll back changes.

Advantages:

  • Simple implementation.
  • No central coordinator required.

Disadvantages:

  • Difficult to manage and debug in complex workflows.
  • Tight coupling between services.
import pika

class RabbitMQHandler:
    def __init__(self, queue):
        self.connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue=queue)
        self.queue = queue

    def publish(self, message):
        self.channel.basic_publish(exchange='', routing_key=self.queue, body=message)

    def consume(self, callback):
        self.channel.basic_consume(queue=self.queue, on_message_callback=callback, auto_ack=True)
        self.channel.start_consuming()

# Define services
class FlightService:
    def book_flight(self):
        print("Flight booked.")
        RabbitMQHandler('hotel_queue').publish("flight_booked")

class HotelService:
    def on_flight_booked(self, ch, method, properties, body):
        try:
            print("Hotel booked.")
            RabbitMQHandler('invoice_queue').publish("hotel_booked")
        except Exception:
            print("Failed to book hotel. Rolling back flight.")
            FlightService().cancel_flight()

    def cancel_flight(self):
        print("Flight booking canceled.")

# Setup RabbitMQ
flight_service = FlightService()
hotel_service = HotelService()

RabbitMQHandler('hotel_queue').consume(hotel_service.on_flight_booked)

# Trigger the workflow
flight_service.book_flight()

2. Orchestration-Based SAGA

  • A central orchestrator service manages the workflow and coordinates between the services.
  • The orchestrator determines the sequence of operations and handles compensating actions in case of failures.

Advantages:

  • Clear control and visibility of the workflow.
  • Easier to debug and manage.

Disadvantages:

  • The orchestrator can become a single point of failure.
  • More complex implementation.
import pika

class Orchestrator:
    def __init__(self):
        self.rabbitmq = RabbitMQHandler('orchestrator_queue')

    def execute_saga(self):
        try:
            self.reserve_inventory()
            self.process_payment()
            self.generate_invoice()
        except Exception as e:
            print(f"Error occurred: {e}. Initiating rollback.")
            self.compensate()

    def reserve_inventory(self):
        print("Inventory reserved.")
        self.rabbitmq.publish("inventory_reserved")

    def process_payment(self):
        print("Payment processed.")
        self.rabbitmq.publish("payment_processed")

    def generate_invoice(self):
        print("Invoice generated.")
        self.rabbitmq.publish("invoice_generated")

    def compensate(self):
        print("Rolling back invoice.")
        print("Rolling back payment.")
        print("Rolling back inventory.")

# Trigger the workflow
Orchestrator().execute_saga()

How SAGA Works

  1. Transaction Initiation: The first operation is executed by one of the services.
  2. Service Communication: Subsequent services execute their operations based on the outcome of the previous step.
  3. Failure Handling: If an operation fails, compensating transactions are triggered in reverse order to undo any changes.
  4. Completion: Once all operations are successfully executed, the transaction is considered complete.

Benefits of the SAGA Pattern

  1. Improved Resilience: Allows partial rollbacks in case of failure.
  2. Scalability: Suitable for microservices and distributed systems.
  3. Flexibility: Works well with event-driven architectures.
  4. No Global Locks: Unlike traditional transactions, SAGA does not require global locking of resources.

Challenges and Limitations

  1. Complexity in Rollbacks: Designing compensating transactions for every operation can be challenging.
  2. Data Consistency: Achieving eventual consistency may require additional effort.
  3. Debugging Issues: Debugging failures in a distributed environment can be cumbersome.
  4. Latency: Sequential execution may increase overall latency.

When to Use the SAGA Pattern

  • Distributed systems where global ACID transactions are infeasible.
  • Microservices architectures with independent services.
  • Applications requiring high resilience and eventual consistency.

Real-World Applications

  1. E-Commerce Platforms: Managing orders, payments, and inventory updates.
  2. Travel Booking Systems: Coordinating flight, hotel, and car rental reservations.
  3. Banking Systems: Handling distributed account updates and transfers.
  4. Healthcare: Coordinating appointment scheduling and insurance claims.

Learning Notes #38 – Choreography Pattern | Cloud Pattern

5 January 2025 at 12:21

Today i learnt about Choreography pattern, where each and every service is communicating using a messaging queue. In this blog, i jot down notes on choreography pattern for my future self.

What is the Choreography Pattern?

In the Choreography Pattern, services communicate directly with each other via asynchronous events, without a central controller. Each service is responsible for a specific part of the workflow and responds to events produced by other services. This pattern allows for a more autonomous and loosely coupled system.

Key Features

  • High scalability and independence of services.
  • Decentralized control.
  • Services respond to events they subscribe to.

When to Use the Choreography Pattern

  • Event-Driven Systems: When workflows can be modeled as events triggering responses.
  • High Scalability: When services need to operate independently and scale autonomously.
  • Loose Coupling: When minimizing dependencies between services is critical.

Benefits of the Choreography Pattern

  1. Decentralized Control: No single point of failure or bottleneck.
  2. Increased Flexibility: Services can be added or modified without affecting others.
  3. Better Scalability: Services operate independently and scale based on their workloads.
  4. Resilience: The system can handle partial failures more gracefully, as services continue independently.

Example: E-Commerce Order Fulfillment

Problem

A fictional e-commerce platform needs to manage the following workflow:

  1. Accepting an order.
  2. Validating payment.
  3. Reserving inventory.
  4. Sending notifications to the customer.

Each step is handled by an independent service.

Solution

Using the Choreography Pattern, each service listens for specific events and publishes new events as needed. The workflow emerges naturally from the interaction of these services.

Implementation

Step 1: Define the Workflow as Events

  • OrderPlaced: Triggered when a customer places an order.
  • PaymentProcessed: Triggered after successful payment.
  • InventoryReserved: Triggered after reserving inventory.
  • NotificationSent: Triggered when the customer is notified.

Step 2: Implement Services

Each service subscribes to events and performs its task.

shared_utility.py

import pika
import json

def publish_event(exchange, event_type, data):
    connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
    channel = connection.channel()
    channel.exchange_declare(exchange=exchange, exchange_type='fanout')
    message = json.dumps({"event_type": event_type, "data": data})
    channel.basic_publish(exchange=exchange, routing_key='', body=message)
    connection.close()

def subscribe_to_event(exchange, callback):
    connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
    channel = connection.channel()
    channel.exchange_declare(exchange=exchange, exchange_type='fanout')
    queue = channel.queue_declare('', exclusive=True).method.queue
    channel.queue_bind(exchange=exchange, queue=queue)
    channel.basic_consume(queue=queue, on_message_callback=callback, auto_ack=True)
    print(f"Subscribed to events on exchange '{exchange}'")
    channel.start_consuming()

Order Service


from shared_utils import publish_event

def place_order(order_id, customer):
    print(f"Placing order {order_id} for {customer}")
    publish_event("order_exchange", "OrderPlaced", {"order_id": order_id, "customer": customer})

if __name__ == "__main__":
    # Simulate placing an order
    place_order(order_id=101, customer="John Doe")

Payment Service


from shared_utils import publish_event, subscribe_to_event
import time

def handle_order_placed(ch, method, properties, body):
    event = json.loads(body)
    if event["event_type"] == "OrderPlaced":
        order_id = event["data"]["order_id"]
        print(f"Processing payment for order {order_id}")
        time.sleep(1)  # Simulate payment processing
        publish_event("payment_exchange", "PaymentProcessed", {"order_id": order_id})

if __name__ == "__main__":
    subscribe_to_event("order_exchange", handle_order_placed)

Inventory Service


from shared_utils import publish_event, subscribe_to_event
import time

def handle_payment_processed(ch, method, properties, body):
    event = json.loads(body)
    if event["event_type"] == "PaymentProcessed":
        order_id = event["data"]["order_id"]
        print(f"Reserving inventory for order {order_id}")
        time.sleep(1)  # Simulate inventory reservation
        publish_event("inventory_exchange", "InventoryReserved", {"order_id": order_id})

if __name__ == "__main__":
    subscribe_to_event("payment_exchange", handle_payment_processed)

Notification Service


from shared_utils import subscribe_to_event
import time

def handle_inventory_reserved(ch, method, properties, body):
    event = json.loads(body)
    if event["event_type"] == "InventoryReserved":
        order_id = event["data"]["order_id"]
        print(f"Notifying customer for order {order_id}")
        time.sleep(1)  # Simulate notification
        print(f"Customer notified for order {order_id}")

if __name__ == "__main__":
    subscribe_to_event("inventory_exchange", handle_inventory_reserved)

Step 3: Run the Workflow

  1. Start RabbitMQ using Docker as described above.
  2. Run the services in the following order:
    • Notification Service: python notification_service.py
    • Inventory Service: python inventory_service.py
    • Payment Service: python payment_service.py
    • Order Service: python order_service.py
  3. Place an order by running the Order Service. The workflow will propagate through the services as events are handled.

Key Considerations

  1. Event Bus: Use an event broker like RabbitMQ, Kafka, or AWS SNS to manage communication between services.
  2. Event Versioning: Include versioning to handle changes in event formats over time.
  3. Idempotency: Ensure services handle repeated events gracefully to avoid duplication.
  4. Monitoring and Tracing: Use tools like OpenTelemetry to trace and debug distributed workflows.
  5. Error Handling:
    • Dead Letter Queues (DLQs) to capture failed events.
    • Retries with backoff for transient errors.

Advantages of the Choreography Pattern

  1. Loose Coupling: Services interact via events without direct knowledge of each other.
  2. Resilience: Failures in one service don’t block the entire workflow.
  3. High Autonomy: Services operate independently and can be deployed or scaled separately.
  4. Dynamic Workflows: Adding new services to the workflow requires subscribing them to relevant events.

Challenges of the Choreography Pattern

  1. Complex Debugging: Tracing errors across distributed services can be difficult.
  2. Event Storms: Poorly designed workflows may generate excessive events, overwhelming the system.
  3. Coordination Overhead: Decentralized logic can lead to inconsistent behavior if not carefully managed.

Orchestrator vs. Choreography: When to Choose?

  • Use Orchestrator Pattern when workflows are complex, require central control, or involve many dependencies.
  • Use Choreography Pattern when you need high scalability, loose coupling, or event-driven workflows.

❌
❌