Running Transactions In Splice Machine
Splice Machine is a fully transactional database that supports ACID transactions. This allows you to perform actions such as commit and rollback; in a transactional context, this means that the database does not make changes visible to others until a commit has been issued.
This topic includes brief overview information about transaction processing with Splice Machine, in these sections:
- Transactions Overview
- Using Transactions
A transaction is a unit of work performed in a database; to maintain the integrity of the database, each transaction must:
- complete in its entirety or have no effect on the database
- be isolated from other transactions that are running concurrently in the database
- produce results that are consistent with existing constraints in the database
- write its results to durable storage upon successful completion
The properties that describe how transactions must maintain integrity are Atomicity, Consistency, Isolation, and Durability. Transactions adhering to these properties are often referred to as ACID transactions. Here’s a summary of ACID transaction properties:
|Atomicity||Requires that each transaction be atomic, i.e. all-or-nothing: if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged. Splice Machine guarantees atomicity in each and every situation, including power failures, errors, and crashes.|
|Consistency||Ensures that any transaction will bring the database from one valid state to another. Splice Machine makes sure that any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof.|
|Isolation||Ensures that the concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially. Splice Machine implements snapshot isolation using MVCC to guarantee that this is true.|
|Durability||Ensures that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. Splice Machine stores changes in durable storage when they are committed.|
MVCC and Snapshot Isolation
Splice Machine employs a lockless snapshot isolation design that uses Multiple Version Concurrency Control (MVCC) to create a new version of the record every time it is updated and enforce consistency. Database systems use concurrency control systems to manage concurrent access. The simplest control method is to use locks that make sure that the writer is finished before any reader can proceed; however, this approach can be very slow. With snapshot isolation, each transaction has its own virtual snapshot of the database, which means that multiple transactions can operate concurrently without creating deadlock conditions.
When Splice Machine needs to update an item in the database, it doesn’t actually overwrite the old data value. Instead, it creates a new version with a new timestamp. Which means that readers have access to the data that was available when they began reading, even if that data has been updated by a writer in the meantime. This is referred to as point-in-time consistency and ensures that:
- Every transaction runs in its own transactional context, which includes a snapshot of the database from when the transaction began.
- Every read made during a transaction will see a consistent snapshot of the database.
- A transaction can only commit its changes if they do not conflict with updates that have been committed while the transaction was running.
Additional Information About Snapshot Isolation
This web page provides an overview of isolation levels and MVCC. https://vladmihalcea.com/a-beginners-guide-to-the-phantom-read-anomaly-and-how-it-differs-between-2pl-and-mvcc/. Some specifics that apply to Splice Machine:
Our implementation does not have the Phantom Reads anomaly, when you apply the strict interpretation of that anomaly: in Splice Machine, if you execute the same query twice on the same transaction, you will get the same results.
Splice Machine does not, however, protect against one form of this anomaly in its broadest interpretation; this has to do with how additions or modifications of rows in a predicate interact between transactions. This is described in the MVCC section of this document.
For additional information about isolation levels in SQL, see this technical paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-95-51.pdf. Of particular interest:
- The diagram and table (Table 4) in the summary section of this paper, which summarize the different isolation levels and which anomalies they allow or disallow. Splice Machine allows: Write skew and a specific type of phantom reads (as noted above), but not phantom reads in general.
Reading and Writing Database Values During a Transaction
When you begin a transaction, you start working within a transactional context that includes a snapshot of the database. The operations that read and write database values for your transaction modify your transactional context. When your transaction is complete, you can commit those modifications to the database.
The commit of your transaction’s changes succeeds unless a write-write conflict occurs, which happens when your transaction attempts to commit an update to a value, and another update to that value has already been committed by a transaction that started before your transaction.
This means that the following statements are true with regard to reading values from and writing values to the database during a transaction:
- When you read a value during a transaction, you get the value that was most recently set within your transactional context. If you’ve not already set the value within your context, then this is the value that had been most recently committed in the database before your transaction began (and before your transactional context was established).
- When you write a value during a transaction, the value is set within your transactional context. It is only written to the database when you commit the transaction; that time is referred to as the commit timestamp for your transaction. The value changes that you commit then become visible to transactions that start after your transaction’s commit timestamp (until another transaction modifies the value).
If two parallel transactions attempt to change the same value, then a write-write conflict occurs, and the commit of the transaction that started later fails.
When a transaction fails due to a write-write conflict, your application must restart the transaction; Splice Machine does not automatically restart such transactions.
A Snapshot Isolation Example
The following diagram shows an example of snapshot isolation for a set of transactions, some of which are running in parallel.
In this example, the T3’ transaction has to be retried by the user.
Here’s a tabular version of the same transactional timeline, showing the values committed in the database over time, with added commentary:
|t2||T1 Start||T1 starts. The starting values within its transactional context are:
|T1 modifies the value of
|t4||T2 Start||T2 starts. The starting values within its transactional context are the same as for T1:
|T1 again modifies the value of
||Commit||T1 commits its modifications to the database.|
|t7||T3 Start||T3 starts. The starting values within its transactional context include the commits from T1:
|T2 modifies the value of
|T2 modifies the value of
||Commit||T2 commits its changes.|
T3 attempts to commit its changes, which causes a write-write conflict, since T2 already committed an update to value
T3 rolls back and resets.
|t16||T3' Start||T3 reset (T3') is restarted by the application; remember that Splice Machine does not automatically restart transactions. The starting values within its transactional context include the commits from T1 and T2:
|T3 modifies the value of
||Commit||T3 commits its changes.|
The Isolation Guarantee
The Isolation guarantee makes sure that the resulting state of the database is consistent with a serial execution of the transactions that were completed even if they ran concurrently. Note, however, that this does not mean that you’ll see the same results from all possible concurrent executions of transactions, even when they all complete successfully. What it does mean is this:
For a given start state S, if you run transactions T1, T2, and T3 concurrently, then the state at which you end up, S’, will be equivalent to the state generated by a serial execution of those transactions.
Those three transactions can execute in a number of different orders, yielding a different S’. For example, consider a starting state S in which
row A = 10, and we’ll be running these three transactions:
|T1||A += 10|
|T2||A = A*2|
|T3||A += 30|
When you run these transactions concurrently, the possible resulting states include these:
|Transaction Sequence||S' Value||Comments|
|S + T1 + T3 + T2||A = 100|
|S + T3 + T2 + T1||A = 90|
|S + T3||A = 40||
|S + T1 + T2||A = 40||
As you can see, even when all transactions complete successfully, the end result depends on the order in which the transactions committed. And all of these results meet the Isolation guarantee. Splice Machine will enforce that guarantee by aborting a transaction when it detects a write-write conflict.
The Isolation guarantee would be violated by generating an ending state that could not be generated by serially running the transactions. For example, this sequence of actions is not valid because there is no serial sequence that could generate an end result of
A = 20 if all transactions successfully commit.
A = 10
A = 10
A = 10
A = 10 + 30 = 40
A = 10 * 2 = 20
A = 10 + 10 = 20
- End result is
A = 20
This section describes using transactions in your database, in these subsections:
- Committing and Rolling Back Transaction Changes introduces autocommit, commit, and rollback of transactions.
- A Simple Transaction Example presents an example of a transaction using the splice> command line interface.
- Using Savepoints describes how to use savepoints within transactions.
- Using Rollback versus Rollback to Savepoint discusses the differences between rolling back a transaction, and rolling back to a savepoint.
Committing and Rolling Back Transaction Changes
Within a transactional context, how the changes that you make are
committed to the database depends on whether
autocommit is enabled or
|autocommit status||How changes are committed and rolled back|
Changes are automatically committed whenever the operation completes successfully.
If an operation reports any error, the changes are automatically rolled back.
Changes are only committed when you explicitly issue a
Changes are rolled back when you explicitly issue a
Autocommit is enabled by default. You typically disable
when you want a block of operations to be committed atomically (all at
once) instead of committing changes to the database after each
You can turn
autocommit on and off by issuing the
autocommit on or
autocommit off commands at the
For more information, see these topics in the Command Line Reference section of this book:
A Simple Transaction Example
Here is a simple example. Enter the following commands to see commit and rollback in action:
splice> create table myTbl (i int); splice> autocommit off; - commits must be made explicitly splice> insert into myTbl values 1,2,3; - inserted but not visible to others splice> commit; - now committed to the database splice> select * from myTbl; - verify table contents splice> insert into myTbl values 4,5; - insert more datasplice> select * from myTbl; - verify table contents splice> rollback; - roll back latest insertions splice> select * from myTbl; - and verify again ...
You can turn
autocommit back on by issuing the command: autocommit on;
Splice Machine supports the JDBC 3.0 Savepoint API, which adds methods for setting, releasing, and rolling back to savepoints within a transaction. Savepoints give you additional control over transactions by allowing you to define logical rollback points within a transaction, which effectively allows you to specify sub-transactions (also known as nested transactions).
You can specify multiple savepoints within a transaction. Savepoints are very useful when processing large transactions: you can implement error recovery schemes that allow you to rollback part of a transaction without having to abort the entire transaction.
You can use these commands to work with Savepoints:
- create a savepoint with the
- release a savepoint with the
- roll a transaction back to an earlier savepoint with the
rollback to savepointcommand
First we’ll create a table, turn autocommit off, and insert some data into the table. We then create a savepoint, and verify the contents of our table:
splice> CREATE TABLE myTbl(i int); 0 rows inserted/updated/deleted splice> AUTOCOMMIT OFF; splice> INSERT INTO myTbl VALUES 1,2,3; 3 rows inserted/updated/deleted splice> SAVEPOINT savept1; 0 rows inserted/updated/deleted splice> SELECT * FROM myTbl; I ----------- 1 2 3 3 rows selected
Next we add new values to the table and again verify its contents:
splice> INSERT INTO myTbl VALUES 4,5; 2 rows inserted/updated/deleted splice> SELECT * FROM myTbl; I ----------- 1 2 3 4 55 rows selected
Now we roll back to our savepoint, and verify that the rollback worked:
splice> ROLLBACK TO SAVEPOINT savept1; 0 rows inserted/updated/deleted splice> SELECT * FROM myTbl; I ----------- 1 2 3 3 rows selected
And finally, we commit the transaction:
Using Rollback Versus Rollback to Savepoint
There’s one important distinction you should be aware of between rolling back to a savepoint versus rolling back the entire transaction:
- When you perform a
rollback, Splice Machine aborts the entire transaction and creates a new transaction,
- When you perform a
rollback to savepoint, Splice Machine rolls back part of the changes, but does not create a new transaction.
Remember that this distinction also holds in a multi-tenant environment. In other words:
- If two users are making modifications to the same table in separate
transactions, and one user does a
rollback, all changes made by that user prior to that rollback are no longer in the database.
- Similarly, if two users are making modifications to the same table in
separate transactions, and one user does a
rollback to savepoint, all changes made by that user since the savepoint was established are no longer in the database.