Learning Notes #40 – SAGA Pattern | Cloud Patterns
Today, I learnt about the SAGA pattern, along with the Compensation, Orchestration, and Choreography patterns and Two-Phase Commit. SAGA draws on the first three and is usually positioned as an alternative to two-phase commit. In this blog, I jot down notes on SAGA for my future self.
Modern software applications often require the coordination of multiple distributed services to perform complex business operations. In such systems, ensuring consistency and reliability can be challenging, especially when a failure occurs in one of the services. The SAGA design pattern offers a robust solution to manage distributed transactions while maintaining data consistency.
What is the SAGA Pattern?
The SAGA pattern is a distributed transaction management mechanism where a series of independent operations (or steps) are executed sequentially across multiple services. Each operation in the sequence has a corresponding compensating action to roll back changes if a failure occurs. This approach avoids the complexities of distributed transactions, such as two-phase commits, by breaking down the process into smaller, manageable units.
Key Characteristics
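- Decentralized Control: Each service commits its own local transaction, and no global transaction coordinator holds locks across services (an orchestration-based SAGA still uses a central service, but only to sequence the steps).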
- Compensating Transactions: Every operation has an undo or rollback mechanism.
- Asynchronous Communication: Services communicate asynchronously in most implementations, ensuring loose coupling.
Types of SAGA Patterns
There are two primary types of SAGA patterns:
1. Choreography-Based SAGA
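- In this approach, services coordinate the workflow by publishing events and reacting to events from other services, rather than through a central controller.
- Each service knows which operation to trigger next after completing its own task.
- If a failure occurs, the failing service publishes a failure event, and each service that has already completed its step runs its compensating action to roll back its changes.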
Advantages:
- Simple implementation.
- No central coordinator required.
Disadvantages:
- Difficult to manage and debug in complex workflows.
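- Implicit coupling between services: each one must know which events the others emit, and the overall workflow is not defined in any single place.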
```python
import pika


class RabbitMQHandler:
    """Thin wrapper around a single RabbitMQ queue on localhost."""

    def __init__(self, queue):
        self.connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue=queue)
        self.queue = queue

    def publish(self, message):
        self.channel.basic_publish(exchange='', routing_key=self.queue, body=message)

    def consume(self, callback):
        self.channel.basic_consume(queue=self.queue, on_message_callback=callback, auto_ack=True)
        self.channel.start_consuming()  # blocks until consuming is stopped


# Define services
class FlightService:
    def book_flight(self):
        print("Flight booked.")
        # Emit an event; the hotel service reacts to it.
        RabbitMQHandler('hotel_queue').publish("flight_booked")

    def cancel_flight(self):
        # Compensating action for book_flight.
        print("Flight booking canceled.")


class HotelService:
    def on_flight_booked(self, ch, method, properties, body):
        try:
            print("Hotel booked.")
            RabbitMQHandler('invoice_queue').publish("hotel_booked")
        except Exception:
            print("Failed to book hotel. Rolling back flight.")
            # Simplification: a real choreography would publish a failure event instead.
            FlightService().cancel_flight()


# Setup RabbitMQ
flight_service = FlightService()
hotel_service = HotelService()

# Trigger the workflow first, because consume() blocks once it starts.
flight_service.book_flight()
RabbitMQHandler('hotel_queue').consume(hotel_service.on_flight_booked)
```
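The RabbitMQ example above only exercises the happy path and needs a running broker. To see the compensation path of a choreography-based SAGA, here is a minimal in-memory sketch. The `EventBus` class, the `hotel_booking_failed` event name, and the `rooms_available` flag are assumptions made for illustration; they are not part of the original example or of any library.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventBus:
    """In-memory stand-in for a message broker (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self.handlers: DefaultDict[str, List[Callable[[], None]]] = defaultdict(list)

    def subscribe(self, event: str, handler: Callable[[], None]) -> None:
        self.handlers[event].append(handler)

    def publish(self, event: str) -> None:
        # Deliver the event synchronously to every subscriber.
        for handler in self.handlers[event]:
            handler()


events: List[str] = []  # ordered record of what each service did


class FlightService:
    def __init__(self, bus: EventBus) -> None:
        self.bus = bus
        # The flight service listens for the failure event and compensates itself.
        bus.subscribe("hotel_booking_failed", self.cancel_flight)

    def book_flight(self) -> None:
        events.append("flight booked")
        self.bus.publish("flight_booked")

    def cancel_flight(self) -> None:
        # Compensating action for book_flight.
        events.append("flight canceled")


class HotelService:
    def __init__(self, bus: EventBus, rooms_available: bool) -> None:
        self.bus = bus
        self.rooms_available = rooms_available
        bus.subscribe("flight_booked", self.on_flight_booked)

    def on_flight_booked(self) -> None:
        if self.rooms_available:
            events.append("hotel booked")
            self.bus.publish("hotel_booked")
        else:
            # Announce the failure; upstream services decide how to react.
            events.append("hotel booking failed")
            self.bus.publish("hotel_booking_failed")


bus = EventBus()
flight = FlightService(bus)
hotel = HotelService(bus, rooms_available=False)
flight.book_flight()
print(events)
```

Note that neither service calls the other directly: the hotel service only announces that it failed, and the flight service owns its own rollback. That is the property that distinguishes choreography from orchestration.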
2. Orchestration-Based SAGA
- A central orchestrator service manages the workflow and coordinates between the services.
- The orchestrator determines the sequence of operations and handles compensating actions in case of failures.
Advantages:
- Clear control and visibility of the workflow.
- Easier to debug and manage.
Disadvantages:
- The orchestrator can become a single point of failure.
- More complex implementation.
```python
# Reuses the RabbitMQHandler class from the choreography example above.
class Orchestrator:
    def __init__(self):
        self.rabbitmq = RabbitMQHandler('orchestrator_queue')
        self.completed = []  # names of the steps that have succeeded so far

    def execute_saga(self):
        try:
            self.reserve_inventory()
            self.process_payment()
            self.generate_invoice()
        except Exception as e:
            print(f"Error occurred: {e}. Initiating rollback.")
            self.compensate()

    def reserve_inventory(self):
        print("Inventory reserved.")
        self.rabbitmq.publish("inventory_reserved")
        self.completed.append("inventory")

    def process_payment(self):
        print("Payment processed.")
        self.rabbitmq.publish("payment_processed")
        self.completed.append("payment")

    def generate_invoice(self):
        print("Invoice generated.")
        self.rabbitmq.publish("invoice_generated")
        self.completed.append("invoice")

    def compensate(self):
        # Undo only the steps that actually completed, most recent first.
        for step in reversed(self.completed):
            print(f"Rolling back {step}.")


# Trigger the workflow
Orchestrator().execute_saga()
```
How SAGA Works
- Transaction Initiation: The first operation is executed by one of the services.
- Service Communication: Subsequent services execute their operations based on the outcome of the previous step.
- Failure Handling: If an operation fails, compensating transactions are triggered in reverse order to undo any changes.
- Completion: Once all operations are successfully executed, the transaction is considered complete.
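The four steps above can be sketched as a small, broker-free executor. This is a minimal illustration under stated assumptions, not a production implementation: `SagaStep`, `run_saga`, and the simulated `fail_payment` failure are hypothetical names introduced here, and real services would be remote calls rather than lambdas.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # forward operation
    compensation: Callable[[], None]  # undo for that operation


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception as exc:
            print(f"{step.name} failed: {exc}")
            # The failed step never completed, so only earlier steps are undone.
            for done in reversed(completed):
                done.compensation()
            return False
    return True


log: List[str] = []  # ordered record of actions and compensations


def fail_payment() -> None:
    # Simulated failure in the second step.
    raise RuntimeError("card declined")


steps = [
    SagaStep("reserve_inventory",
             lambda: log.append("inventory reserved"),
             lambda: log.append("inventory released")),
    SagaStep("process_payment",
             fail_payment,
             lambda: log.append("payment refunded")),
    SagaStep("generate_invoice",
             lambda: log.append("invoice generated"),
             lambda: log.append("invoice voided")),
]

succeeded = run_saga(steps)
print(succeeded, log)
```

Because the payment step fails, the invoice step never runs and only the inventory reservation is compensated. Pairing each action with its compensation in one object keeps the rollback order tied to what actually completed.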
Benefits of the SAGA Pattern
- Improved Resilience: Allows partial rollbacks in case of failure.
- Scalability: Suitable for microservices and distributed systems.
- Flexibility: Works well with event-driven architectures.
- No Global Locks: Unlike traditional transactions, SAGA does not require global locking of resources.
Challenges and Limitations
- Complexity in Rollbacks: Designing compensating transactions for every operation can be challenging.
- Data Consistency: SAGA provides eventual consistency rather than isolation, so other transactions can observe intermediate states while a saga is still in progress.
- Debugging Issues: Debugging failures in a distributed environment can be cumbersome.
- Latency: Sequential execution may increase overall latency.
When to Use the SAGA Pattern
- Distributed systems where global ACID transactions are infeasible.
- Microservices architectures with independent services.
- Applications requiring high resilience and eventual consistency.
Real-World Applications
- E-Commerce Platforms: Managing orders, payments, and inventory updates.
- Travel Booking Systems: Coordinating flight, hotel, and car rental reservations.
- Banking Systems: Handling distributed account updates and transfers.
- Healthcare: Coordinating appointment scheduling and insurance claims.