Add documentation for Two-phase Commit Transactions (#303)
brfrn169 authored Sep 17, 2021
1 parent c9a7290 commit f5433ac
Showing 5 changed files with 179 additions and 6 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -16,6 +16,7 @@ dependencies {
## Docs
* [Getting started](docs/getting-started.md)
* [Scalar DB supported databases](docs/scalar-db-supported-databases.md)
* [Two-phase Commit Transactions](docs/two-phase-commit-transactions.md)
* [Design document](docs/design.md)
* Slides
* [Making Cassandra more capable, faster, and more reliable](https://www.slideshare.net/scalar-inc/making-cassandra-more-capable-faster-and-more-reliable-at-apacheconhome-2020) at ApacheCon@Home 2020
2 changes: 1 addition & 1 deletion conf/database.properties
@@ -8,7 +8,7 @@ scalar.db.contact_points=localhost
scalar.db.username=cassandra
scalar.db.password=cassandra

# Storage implementation. cassandra, cosmos or dynamo can be set. Default storage is cassandra.
# Storage implementation. "cassandra" or "cosmos" or "dynamo" or "jdbc" or "grpc" can be set. Default storage is "cassandra".
#scalar.db.storage=cassandra

# Namespace prefix. The default is empty.
@@ -6,9 +6,9 @@
/**
* A transaction manager based on a two-phase commit protocol.
*
* <p>Just like a well-known two-phase commit protocol, there are 2 roles, a coordinator and a
* <p>Just like a well-known two-phase commit protocol, there are two roles, a coordinator and a
* participant, that execute a single transaction collaboratively. A coordinator process first
* starts a transaction with {@link #start()} or {@link #start(String)} and participant processes
* starts a transaction with {@link #start()} or {@link #start(String)}, and participant processes
* join the transaction with {@link #join(String)} with the transaction ID. Also, participants can
* resume the transaction with {@link #resume(String)} with the transaction ID.
*/
6 changes: 3 additions & 3 deletions docs/getting-started-with-scalardb.md
@@ -72,10 +72,10 @@ For JDBC databases
$ java -jar scalar-schema-standalone-<version>.jar --jdbc -j <JDBC_URL> -u <USERNAME> -p <PASSWORD> -f emoney-storage.json
```

## Store & retrieve data with storage service
## Store & retrieve data with storage API

[`ElectronicMoneyWithStorage.java`](./getting-started/src/main/java/sample/ElectronicMoneyWithStorage.java)
is a simple electronic money application with storage service.
is a simple electronic money application with storage API.
(Be careful: it is simplified for ease of reading and far from practical and is certainly not production-ready.)

```java
@@ -194,7 +194,7 @@ $ java -jar scalar-schema-standalone-<version>.jar --jdbc -j <JDBC_URL> -u <USER
$ java -jar scalar-schema-standalone-<version>.jar --jdbc -j <JDBC_URL> -u <USERNAME> -p <PASSWORD> -f emoney-transaction.json
```

## Store & retrieve data with transaction service
## Store & retrieve data with transaction API

The previous application seems fine under ideal conditions, but it is problematic when some failure happens during its operation or when multiple operations occur at the same time because it is not transactional.
For example, money transfer (pay) from `A's balance` to `B's balance` is not done atomically in the application, and there might be a case where only `A's balance` is decreased (and `B's balance` is not increased) if a failure happens right after the first `put` and some money will be lost.
172 changes: 172 additions & 0 deletions docs/two-phase-commit-transactions.md
@@ -0,0 +1,172 @@
# Two-phase Commit Transactions

Scalar DB also supports two-phase commit style transactions called *Two-phase Commit Transactions*.
With Two-phase Commit Transactions, you can execute a transaction that spans multiple processes/applications (e.g., microservices).

This document briefly explains how to execute Two-phase Commit Transactions in Scalar DB.

## Configuration

The configuration for Two-phase Commit Transactions is the same as the one for the transaction API.

For example, you can set the following configuration when you use Cassandra:
```
# Comma separated contact points
scalar.db.contact_points=cassandra
# Port number for all the contact points. Default port number for each database is used if empty.
scalar.db.contact_port=9042
# Credential information to access the database
scalar.db.username=cassandra
scalar.db.password=cassandra
# Storage implementation. Either cassandra or cosmos or dynamo or jdbc can be set. Default storage is cassandra.
scalar.db.storage=cassandra
```

Please see [Getting Started](getting-started.md) for configurations of other databases/storages.
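
The configuration above uses the Java properties format, so it can be inspected with `java.util.Properties`. The following is a minimal sketch, not part of Scalar DB: the class `ConfigSketch` and its methods are hypothetical names introduced only for illustration. It shows how the comma-separated contact points and the documented default storage (`cassandra`) could be read.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration only; Scalar DB reads the file itself.
class ConfigSketch {
  // Parses properties-formatted text such as the example configuration.
  static Properties parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  // Splits the comma-separated contact points into a list.
  static List<String> contactPoints(Properties props) {
    return Arrays.asList(props.getProperty("scalar.db.contact_points", "").split(","));
  }

  // Falls back to "cassandra", the default storage stated in the configuration comments.
  static String storage(Properties props) {
    return props.getProperty("scalar.db.storage", "cassandra");
  }

  public static void main(String[] args) {
    Properties props = parse("scalar.db.contact_points=host1,host2\nscalar.db.contact_port=9042\n");
    System.out.println(contactPoints(props) + " " + storage(props));
  }
}
```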

### Scalar DB server

You can also execute Two-phase Commit Transactions through the Scalar DB server.
You don't need a special configuration for Two-phase Commit Transactions, so you can follow [the Scalar DB server document](scalardb-server.md) to use it.

## How to execute Two-phase Commit Transactions

This section explains how to execute Two-phase Commit Transactions.

As in the well-known two-phase commit protocol, there are two roles, a coordinator and a participant, that collaboratively execute a single transaction.
The coordinator process first starts a transaction, and the participant processes join the transaction after that.

### Get a TwoPhaseCommitTransactionManager instance

First, you need to get a `TwoPhaseCommitTransactionManager` instance to execute Two-phase Commit Transactions.
You can use `TransactionFactory` to get a `TwoPhaseCommitTransactionManager` instance as follows:
```Java
TransactionFactory factory = new TransactionFactory(new DatabaseConfig(new File("<configuration file path>")));
TwoPhaseCommitTransactionManager manager = factory.getTwoPhaseCommitTransactionManager();
```

### Start a transaction (coordinator only)

You can start a transaction as follows:
```Java
TwoPhaseCommitTransaction tx = manager.start();
```

The process/application that starts the transaction acts as a coordinator, as mentioned.

You can also start a transaction by specifying a transaction ID as follows:
```Java
TwoPhaseCommitTransaction tx = manager.start("<transaction ID>");
```

You can get the transaction ID with `getId()` as follows:
```Java
tx.getId();
```

### Join the transaction (participant only)

If you are a participant, you can join the transaction that has been started by the coordinator as follows:
```Java
TwoPhaseCommitTransaction tx = manager.join("<transaction ID>");
```

You need to specify the transaction ID associated with the transaction that the coordinator has started.

#### Resume the transaction (participant only)

You can get the transaction object (the `TwoPhaseCommitTransaction` instance) that you have already joined with `TwoPhaseCommitTransactionManager.resume()`:
```Java
TwoPhaseCommitTransaction tx = manager.resume("<transaction ID>");
```

`TwoPhaseCommitTransactionManager` manages the transaction objects that you have joined, and you can retrieve one by its transaction ID.
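
To make the join/resume relationship concrete, here is a minimal sketch of a registry keyed by transaction ID. It only illustrates the behavior described above and is not Scalar DB's implementation: `JoinedTransactionRegistry` and its methods are hypothetical, and the error handling (throwing `IllegalStateException`) is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry; T stands in for the joined transaction object.
class JoinedTransactionRegistry<T> {
  private final Map<String, T> joined = new ConcurrentHashMap<>();

  // Remembers the transaction object created when a participant joins.
  void register(String txId, T tx) {
    if (joined.putIfAbsent(txId, tx) != null) {
      throw new IllegalStateException("already joined: " + txId);
    }
  }

  // Returns the object registered for this ID, as a later resume would.
  T resume(String txId) {
    T tx = joined.get(txId);
    if (tx == null) {
      throw new IllegalStateException("not joined: " + txId);
    }
    return tx;
  }

  // Forgets the transaction once it has been committed or rolled back.
  void remove(String txId) {
    joined.remove(txId);
  }

  public static void main(String[] args) {
    JoinedTransactionRegistry<String> registry = new JoinedTransactionRegistry<>();
    registry.register("tx-1", "handle-1");
    System.out.println(registry.resume("tx-1"));
  }
}
```

A lookup by ID like this lets a participant handle each request of the same transaction (CRUD, prepare, commit) without passing the transaction object around.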

### CRUD operations for the transaction

CRUD operations in Two-phase Commit Transactions are executed in the same way as in the transaction API.
`TwoPhaseCommitTransaction` has `get()`/`put()`/`delete()`/`mutate()` to execute CRUD operations.

The following is example code for CRUD operations in Two-phase Commit Transactions:
```java
TwoPhaseCommitTransaction tx = ...

// Retrieve the current balances for ids
Get fromGet = new Get(new Key(ID, fromId));
Get toGet = new Get(new Key(ID, toId));
Optional<Result> fromResult = tx.get(fromGet);
Optional<Result> toResult = tx.get(toGet);

// Calculate the balances (it assumes that both accounts exist)
int newFromBalance = fromResult.get().getValue(BALANCE).get().getAsInt() - amount;
int newToBalance = toResult.get().getValue(BALANCE).get().getAsInt() + amount;

// Update the balances
Put fromPut = new Put(new Key(ID, fromId)).withValue(BALANCE, newFromBalance);
Put toPut = new Put(new Key(ID, toId)).withValue(BALANCE, newToBalance);
tx.put(fromPut);
tx.put(toPut);
```

### Prepare/Commit/Rollback the transaction

After finishing CRUD operations, you need to commit the transaction.
As in the well-known two-phase commit protocol, there are two phases: the prepare phase and the commit phase.
First, prepare the transaction in all the coordinator/participant processes. Then call `commit()` in the coordinator process, followed by `commit()` in the participant processes, as follows:
```Java
TwoPhaseCommitTransaction tx = ...

try {
  // Execute CRUD operations in the coordinator/participant processes
  ...

  // Prepare phase: Prepare the transaction in all the coordinator/participant processes
  tx.prepare();
  ...

  // Commit phase: Commit the transaction in all the coordinator/participant processes
  tx.commit();
  ...
} catch (TransactionException e) {
  // When an error happens, you need to roll back the transaction in all the coordinator/participant processes
  tx.rollback();
  ...
}
```

If an error happens, you need to call `rollback()` in all the coordinator/participant processes.
Note that you need to call it in the coordinator process first, and then call it in the participant processes in parallel.

You can call `prepare()` in the coordinator/participant processes in parallel.
Similarly, you can also call `commit()` in the participant processes in parallel.
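
The ordering rules above can be sketched with an in-memory stand-in. This is a simplified illustration under stated assumptions, not Scalar DB code: `RecordingHandle` and `TwoPhaseOrderSketch` are hypothetical classes that only record which call was made, and the calls run sequentially so that the recorded order is deterministic (in practice, the steps marked as parallelizable may run in parallel).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the transaction object held by one process.
// It is NOT TwoPhaseCommitTransaction; it only records the calls it receives.
class RecordingHandle {
  private final String name;
  private final List<String> log;
  private final boolean failOnPrepare;

  RecordingHandle(String name, List<String> log, boolean failOnPrepare) {
    this.name = name;
    this.log = log;
    this.failOnPrepare = failOnPrepare;
  }

  void prepare() {
    if (failOnPrepare) {
      throw new IllegalStateException(name + " failed to prepare");
    }
    log.add(name + ".prepare");
  }

  void commit() {
    log.add(name + ".commit");
  }

  void rollback() {
    log.add(name + ".rollback");
  }
}

class TwoPhaseOrderSketch {
  // Returns true if the transaction committed, false if it was rolled back.
  static boolean run(RecordingHandle coordinator, List<RecordingHandle> participants) {
    try {
      // Prepare phase: every process prepares (parallelizable in practice).
      coordinator.prepare();
      for (RecordingHandle p : participants) {
        p.prepare();
      }
      // Commit phase: coordinator first, then the participants.
      coordinator.commit();
      for (RecordingHandle p : participants) {
        p.commit();
      }
      return true;
    } catch (RuntimeException e) {
      // On error: roll back in the coordinator first, then in the participants.
      coordinator.rollback();
      for (RecordingHandle p : participants) {
        p.rollback();
      }
      return false;
    }
  }

  public static void main(String[] args) {
    List<String> log = new ArrayList<>();
    RecordingHandle coordinator = new RecordingHandle("coordinator", log, false);
    RecordingHandle participant = new RecordingHandle("participant", log, false);
    System.out.println(run(coordinator, Arrays.asList(participant)) + " " + log);
  }
}
```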

#### Validate the transaction

Depending on the concurrency control protocol, you need to call `validate()` in all the coordinator/participant processes after `prepare()` and before `commit()`:
```java
// Prepare phase 1: Prepare the transaction in all the coordinator/participant processes
tx.prepare();
...

// Prepare phase 2: Validate the transaction in all the coordinator/participant processes
tx.validate();
...

// Commit phase: Commit the transaction in all the coordinator/participant processes
tx.commit();
...
```

Similar to `prepare()`, you can call `validate()` in the coordinator/participant processes in parallel.

Currently, you need to call `validate()` when you use the `Consensus Commit` transaction manager with the `EXTRA_READ` serializable strategy at the `SERIALIZABLE` isolation level.
In other cases, `validate()` does nothing.
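
As a rough sketch of how the phases fit together when they run in parallel, the following self-contained example runs each phase on all processes concurrently and waits for the whole phase to finish before starting the next one. Everything here is hypothetical and for illustration only: `PhasedExecutionSketch`, the process names, and the `needsValidation` flag are assumptions, and each "call" is simulated by appending to a log rather than invoking Scalar DB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class PhasedExecutionSketch {
  // Runs one phase on every given process in parallel and waits for all of them.
  static void runPhase(String phase, List<String> processes, List<String> log) {
    CompletableFuture<?>[] futures = processes.stream()
        .map(process -> CompletableFuture.runAsync(() -> log.add(phase + ":" + process)))
        .toArray(CompletableFuture[]::new);
    CompletableFuture.allOf(futures).join();
  }

  static List<String> run(String coordinator, List<String> participants, boolean needsValidation) {
    List<String> log = Collections.synchronizedList(new ArrayList<>());
    List<String> all = new ArrayList<>();
    all.add(coordinator);
    all.addAll(participants);

    // Prepare in all processes in parallel.
    runPhase("prepare", all, log);
    // Validate in all processes in parallel, only when it is required.
    if (needsValidation) {
      runPhase("validate", all, log);
    }
    // Commit in the coordinator first, then in the participants in parallel.
    runPhase("commit", Collections.singletonList(coordinator), log);
    runPhase("commit", participants, log);
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run("coordinator", Arrays.asList("p1", "p2"), true));
  }
}
```

Entries within a single phase may appear in any order, but no entry of a later phase can appear before the earlier phase has completed.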

## Further documentation

One of the use cases for Two-phase Commit Transactions is Microservice Transaction.
Please see the following sample to learn more about Two-phase Commit Transactions:
- [Microservice Transaction Sample](https://github.com/scalar-labs/scalardb-samples/microservice-transaction-sample/)
