❌

Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Learning Notes #29 – Two Phase Commit Protocol | ACID in Distributed Systems

3 January 2025 at 13:45

Today, i learnt about compensating transaction pattern which leads to two phase commit protocol which helps in maintaining the Atomicity of a distributed transactions. Distributed transactions are hard.

In this blog, i jot down notes on Two Phase Commit protocol for better understanding.

The Two-Phase Commit (2PC) protocol is a distributed algorithm used to ensure atomicity in transactions spanning multiple nodes or databases. Atomicity ensures that either all parts of a transaction are committed or none are, maintaining consistency in distributed systems.

Why Two-Phase Commit?

In distributed systems, a transaction might involve several independent nodes, each maintaining its own database. Without a mechanism like 2PC, failures in one node can leave the system in an inconsistent state.

For example, consider an e-commerce platform where a customer places an order.

The transaction involves updating the inventory in one database, recording the payment in another, and generating a shipment request in a third system. If the payment database successfully commits but the inventory database fails, the system becomes inconsistent, potentially causing issues like double selling or incomplete orders. 2PC mitigates this by providing a coordinated protocol to commit or abort transactions across all nodes.

The Phases of 2PC

The protocol operates in two main phases

1. Prepare Phase (Voting Phase)

The coordinator node initiates the transaction and prepares to commit it across all participating nodes.

  1. Request to Prepare: The coordinator sends a PREPARE request to all participant nodes.
  2. Vote: Each participant checks if it can commit the transaction (e.g., no constraints violated, resources available). It logs its decision (YES or NO) locally and sends its vote to the coordinator. If any participant votes NO, the transaction cannot be committed.

2. Commit Phase (Decision Phase)

Based on the votes received in the prepare phase, the coordinator decides the final outcome.

Commit Decision:

If all participants vote YES, the coordinator logs a COMMIT decision, sends COMMIT messages to all participants, and participants apply the changes and confirm with an acknowledgment.

Abort Decision:

If any participant votes NO, the coordinator logs an ABORT decision, sends ABORT messages to all participants, and participants roll back any changes made during the transaction.

Implementation:

For a simple implementation of 2PC, we can try out the below flow using RabbitMQ as a medium for Co-Ordinator.

Basically, we need not to write this from scratch, we have tools,

1. Relational Databases

Most relational databases have built-in support for distributed transactions and 2PC.

  • PostgreSQL: Implements distributed transactions using foreign data wrappers (FDWs) with PREPARE TRANSACTION and COMMIT PREPARED.
  • MySQL: Supports XA transactions, which follow the 2PC protocol.
  • Oracle Database: Offers robust distributed transaction support using XA.
  • Microsoft SQL Server: Provides distributed transactions through MS-DTC.

2. Distributed Transaction Managers

These tools manage distributed transactions across multiple systems.

  • Atomikos: A popular Java-based transaction manager supporting JTA/XA for distributed systems.
  • Bitronix: Another lightweight transaction manager for Java applications supporting JTA/XA.
  • JBoss Transactions (Narayana): A robust Java transaction manager that supports 2PC, often used in conjunction with JBoss servers.

3. Message Brokers

Message brokers provide transaction capabilities with 2PC.

  • RabbitMQ: Supports the 2PC protocol using transactional channels.
  • Apache Kafka: Supports transactions, ensuring β€œexactly-once” semantics across producers and consumers.
  • ActiveMQ: Provides distributed transaction support through JTA integration

4. Workflow Engines

Workflow engines can orchestrate 2PC across distributed systems.

  • Apache Camel: Can coordinate 2PC transactions using its transaction policy.
  • Camunda: Provides BPMN-based orchestration that can include transactional boundaries.
  • Zeebe: Supports distributed transaction workflows in modern architectures.

Key Properties of 2PC

  1. Atomicity: Ensures all-or-nothing transaction behavior.
  2. Consistency: Guarantees system consistency across all nodes.
  3. Durability: Uses logs to ensure decisions survive node failures.

Challenges of 2PC

  1. Blocking Nature: If the coordinator fails during the commit phase, participants must wait indefinitely unless a timeout or external mechanism is implemented.
  2. Performance Overhead: Multiple message exchanges and logging operations introduce latency.
  3. Single Point of Failure: The coordinator’s failure can stall the entire transaction.

❌
❌