Spring reactive transactions atomicity violation
Spring has supported transactions in reactive flows since version 5.2 M2, so this combination (transactions plus reactive streams) is still fairly new. Imperative transactions have been around for years; they are well understood and no serious caveats are expected. But it turns out that in the current Spring versions (namely, version 5.2.6), reactive transactions may sometimes violate the atomicity requirement!
TL;DR: skip to Pragmatic recommendations.
Below I explain why the atomicity violation can happen, but first a few words about the differences between the classic (imperative) case and the reactive one.
Imperative transaction management flow
In the classic (imperative) world, here is what wrapping business code in a transaction looks like:
startTransaction();
try {
    businessLogic();
    commitTransaction();
} catch (Throwable e) {
    handleException(e);
}
Classic transactional code has only two terminal states:
- it either ‘completes normally’, in which case the transaction gets committed;
- or an exception is thrown, in which case the outcome depends on the exception: some exceptions cause a commit, but most of the time a rollback happens.
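The two terminal states can be spelled out in a small sketch. `FakeTx`, `ImperativeTemplate` and the `commitOn` predicate are hypothetical names invented for this illustration, not Spring APIs; the predicate only loosely mimics the idea behind rules such as `@Transactional(noRollbackFor = ...)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical in-memory stand-in for a transaction; not a Spring class.
class FakeTx {
    final List<String> log = new ArrayList<>();
    void begin()    { log.add("begin"); }
    void commit()   { log.add("commit"); }
    void rollback() { log.add("rollback"); }
}

class ImperativeTemplate {
    // Runs businessLogic inside tx; commitOn tells which exceptions still lead to a commit.
    static <T> T runInTransaction(FakeTx tx, Supplier<T> businessLogic,
                                  Predicate<Throwable> commitOn) {
        tx.begin();
        T result;
        try {
            result = businessLogic.get();
        } catch (RuntimeException e) {
            // Terminal state 2: an exception was thrown; the outcome depends on it.
            if (commitOn.test(e)) {
                tx.commit();
            } else {
                tx.rollback();
            }
            throw e;
        }
        // Terminal state 1: normal completion.
        tx.commit();
        return result;
    }
}
```

Whatever happens, control leaves `runInTransaction` through exactly one of these two exits, so the commit-or-rollback decision always has the information it needs.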
Reactive transaction management flow
In the reactive case, schematically, it looks like this:
Publisher<T> businessLogicPublisher = businessLogic();
Publisher<T> transactionalPublisher = new Transactionally(businessLogicPublisher, transactionManager);
transactionalPublisher.subscribe(...);
When the flow is activated (i.e. a subscription is made on the final publisher), a Subscription object is created.
The subscription has three terminal states:
- complete, signalled via Subscriber#onComplete(). This is the analogue of the ‘completes normally’ state from the imperative flow; in this case the transaction gets committed.
- error, signalled via Subscriber#onError(Throwable). This is the analogue of the ‘an exception is thrown’ state from the imperative flow; the same handling logic is executed, so the transaction may be committed or rolled back, depending on the Throwable that is available when handling this signal.
- cancelled, signalled via Subscription#cancel(). This is issued when the subscription gets cancelled, and this is where the problem has its roots.
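As an illustration, here is a minimal sketch of a wrapper like the schematic `Transactionally` above. It is written against the JDK's `java.util.concurrent.Flow` interfaces (which mirror the Reactive Streams ones) rather than against Reactor or Spring. The class body and the `txLog` field are assumptions made for this example; this is not how Spring implements it. The log simply records what a transaction manager would be told at each terminal state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

// Hypothetical wrapper (not a Spring class): logs the terminal state of the subscription.
class Transactionally<T> implements Flow.Publisher<T> {
    private final Flow.Publisher<T> source;
    final List<String> txLog = new ArrayList<>();

    Transactionally(Flow.Publisher<T> source) { this.source = source; }

    @Override
    public void subscribe(Flow.Subscriber<? super T> downstream) {
        txLog.add("begin");
        source.subscribe(new Flow.Subscriber<T>() {
            @Override public void onSubscribe(Flow.Subscription upstream) {
                // Hand the downstream a subscription that lets us observe cancel().
                downstream.onSubscribe(new Flow.Subscription() {
                    @Override public void request(long n) { upstream.request(n); }
                    @Override public void cancel() {
                        // cancel() takes no arguments, so the reason is unknown here.
                        txLog.add("cancel");
                        upstream.cancel();
                    }
                });
            }
            @Override public void onNext(T item) { downstream.onNext(item); }
            @Override public void onError(Throwable t) {
                // Commit or rollback can be decided by inspecting t.
                txLog.add("error: " + t.getMessage());
                downstream.onError(t);
            }
            @Override public void onComplete() {
                // Normal completion: the transaction would be committed.
                txLog.add("complete");
                downstream.onComplete();
            }
        });
    }
}
```

Note that only the `cancel` branch has nothing to base a decision on: `onError` receives a Throwable and `onComplete` is unambiguous.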
Why is cancellation a problem?
The thing is that a cancellation can happen for different reasons.
- It can be a part of the ‘normal flow’. For example, some Reactor operators, like Flux.take(n), Flux.next(), Mono.from(Publisher), cancel the upstream when they have received as many elements as they need. Such cancellations are ‘benign’ and should be treated in the same way as a successful completion.
- It can be due to an exception thrown somewhere inside the reactive pipeline. For example, Flux.concatMap() cancels the upstream when it gets an ‘onError()’ signal.
- A cancellation may be requested by some code external to the reactive pipeline (for example, it may be a consequence of an application shutdown).
In the first case, the current transaction must be committed on cancel. In the second and third, it must be rolled back.
Worse, a cancellation signal cannot carry any information along with it. According to the Reactor developers, the Reactor context cannot be used to pass such information either. So it does not seem possible to distinguish these cases and make the right decision: whether the current transaction needs to be committed or rolled back on cancel.
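The following sketch shows the ambiguity using plain `java.util.concurrent.Flow` types from the JDK. `CancelDemo`, `RecordingSource`, `benign()` and `external()` are hypothetical names made up for this example. One subscriber behaves roughly like `take(1)` and cancels once it has its element; in the other scenario, code outside the pipeline cancels while more data is still expected. The source records everything it learns about each cancellation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

class CancelDemo {
    // Emits 1, 2, 3 on demand and records every cancel() it receives.
    static class RecordingSource implements Flow.Publisher<Integer> {
        final List<String> seen = new ArrayList<>();
        @Override public void subscribe(Flow.Subscriber<? super Integer> s) {
            s.onSubscribe(new Flow.Subscription() {
                int next = 1;
                boolean stopped;
                @Override public void request(long n) {
                    for (long i = 0; i < n && !stopped && next <= 3; i++) s.onNext(next++);
                    if (!stopped && next > 3) { stopped = true; s.onComplete(); }
                }
                @Override public void cancel() {
                    stopped = true;
                    seen.add("cancel");   // this is all the upstream ever learns
                }
            });
        }
    }

    // Benign cancellation: like take(1), cancel as soon as one element has arrived.
    static List<String> benign() {
        RecordingSource source = new RecordingSource();
        source.subscribe(new Flow.Subscriber<Integer>() {
            Flow.Subscription sub;
            @Override public void onSubscribe(Flow.Subscription s) { sub = s; s.request(Long.MAX_VALUE); }
            @Override public void onNext(Integer item) { sub.cancel(); }
            @Override public void onError(Throwable t) { }
            @Override public void onComplete() { }
        });
        return source.seen;
    }

    // Abnormal cancellation: external code gives up mid-stream (e.g. on shutdown).
    static List<String> external() {
        RecordingSource source = new RecordingSource();
        Flow.Subscription[] handle = new Flow.Subscription[1];
        source.subscribe(new Flow.Subscriber<Integer>() {
            @Override public void onSubscribe(Flow.Subscription s) { handle[0] = s; s.request(1); }
            @Override public void onNext(Integer item) { }
            @Override public void onError(Throwable t) { }
            @Override public void onComplete() { }
        });
        handle[0].cancel();
        return source.seen;
    }
}
```

In both scenarios the source records the same single entry, so from the upstream's point of view the two cases are identical.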
How it works now in Spring
Spring 5.2.6 commits on cancel. The choice was made deliberately to make operators like Flux.take(n) usable with transactions.
This creates a possibility of a partial (i.e. non-atomic) commit if a cancel arrives at a bad time!
Demonstrating the problem
Given the following ‘business method’ (very artificial, crafted specifically to get a 100% reproduction rate)
@Transactional
public Mono<Void> savePair(String collection, CountDownLatch latch) {
    return Mono.defer(() -> {
        Boot left = new Boot();
        left.setKind("left");
        Boot right = new Boot();
        right.setKind("right");
        return mongoOperations.insert(left, collection)
                // signaling to the test that the first insert has been done and the subscription can be cancelled
                .then(Mono.fromRunnable(latch::countDown))
                // do not proceed to the second insert ever
                .then(Mono.fromRunnable(this::blockForever))
                .then(mongoOperations.insert(right, collection))
                .then();
    });
}
the following test
@Test
void cancelShouldNotLeadToPartialCommit() throws InterruptedException {
    // latch is used to make sure that we cancel the subscription only after the first insert has been done
    CountDownLatch latch = new CountDownLatch(1);
    Disposable disposable = bootService.savePair(collection, latch).subscribe();
    // wait for the first insert to be executed
    latch.await();
    // now cancel the reactive pipeline
    disposable.dispose();
    // Now see what we have in the DB. Atomicity requires that we either see 0 or 2 documents.
    List<Boot> boots = mongoOperations.findAll(Boot.class, collection).collectList().block();
    assertEquals(0, boots.size());
}
fails because it sees exactly 1 record in the Mongo collection.
So what’s the big deal?
Well, if a transaction may be unreliable under ‘some rare circumstances’, it means that the transactions are not reliable. We usually use transactions when we need a 100% guarantee that our commits will be atomic so that we can count on data staying consistent. Spring reactive transactions currently cannot provide such a guarantee.
What can be done?
The Spring team is working on it, but right now the safest approach is to patch spring-tx by flipping the logic from ‘commit-on-cancel’ to ‘rollback-on-cancel’.
Here is a (straightforward) example of how it can be done:
https://github.com/rpuch/spring-framework/commit/95c2872c0c3a8bebec06b413001148b28bc78f2a
This fixes TransactionalOperator and the declarative @Transactional-based cases.
If you use the transaction mechanisms specific to a spring-data module, you need to address them as well. For example, if ReactiveMongoOperations.inTransaction() is in use, you need to change the following code in ReactiveMongoTemplate.inTransaction(Publisher)
return Flux.usingWhen(Mono.just(session), //
        s -> ReactiveMongoTemplate.this.withSession(action, s), //
        ClientSession::commitTransaction, //
        (sess, err) -> sess.abortTransaction(), //
        ClientSession::commitTransaction) //
        .doFinally(signalType -> doFinally.accept(session));
to use ClientSession::abortTransaction instead of ClientSession::commitTransaction in the asyncCancel parameter.
The unpleasant consequences
Unfortunately, the fix is not free. It restores the transactional guarantees in the case of unexpected cancellations. But, in return, it makes expected cancellations roll transactions back as well. Here is an example:
@Transactional
public Flux<Shoe> savePairAndReturnFlux(String collection) {
    return Flux.defer(() -> {
        Shoe left = new Shoe("left");
        Shoe right = new Shoe("right");
        return mongoOperations.insert(List.of(left, right), collection)
                .thenMany(Flux.just(left, right));
    });
}
The following code
shoeService.savePairAndReturnFlux(collection)
        .take(1)
        .blockLast();
will roll the transaction back, so nothing will be stored in the database.
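The trade-off between the two policies can be summarized with a toy model. Everything here (`CancelPolicyDemo`, `OnCancel`, `run`) is hypothetical and uses neither Spring nor MongoDB: writes are buffered in a pending list and reach the ‘database’ only on commit, and the subscription is assumed to be cancelled after a given number of writes.

```java
import java.util.ArrayList;
import java.util.List;

class CancelPolicyDemo {
    enum OnCancel { COMMIT, ROLLBACK }

    // Returns the contents of the toy database after a cancel that arrives
    // once writesBeforeCancel of the two writes have been performed.
    static List<String> run(OnCancel policy, int writesBeforeCancel) {
        List<String> database = new ArrayList<>();
        List<String> pending = new ArrayList<>();
        String[] writes = {"left", "right"};
        for (int i = 0; i < writesBeforeCancel && i < writes.length; i++) {
            pending.add(writes[i]);   // buffered inside the transaction
        }
        // The cancel arrives here; the policy alone decides the outcome.
        if (policy == OnCancel.COMMIT) {
            database.addAll(pending);
        }
        return database;
    }
}
```

With `COMMIT`, a cancel after one write leaves a single record (the partial commit), while a cancel after both writes keeps both. With `ROLLBACK`, the database stays empty in both cases, including the benign one where all the work had already been done.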
Pragmatic recommendations
- If you absolutely need operations like Flux.take(n) downstream from your transactional publishers, and you never make more than one write in your transactional code, it is OK to proceed with the current unpatched Spring versions, as partial commits are never going to bite you. But please note that (at least, currently) the Spring team is going to flip the logic to ‘rollback-on-cancel’ in Spring 5.3, so it is better to be prepared anyway.
- If you absolutely need operations like Flux.take(n) downstream from your transactional publishers, and you sometimes have more than one write in your transactional code, you are in trouble. You are forced to use the currently standard ‘commit-on-cancel’, but you are vulnerable to atomicity violations (partial commits) on cancel.
- If you do not need ‘routinely cancelling’ operations like Flux.take(n) downstream from transactional publishers, switch to a patched Spring version (with the ‘rollback-on-cancel’ policy) and then switch to Spring 5.3 when it is available (it will have the same ‘rollback-on-cancel’ policy).
Conclusion
Reactive transactions in the current Spring version (5.2.6) leave open the possibility of non-atomic commits, but in many cases it is possible to fix the problem by using a patched spring-tx jar.
References
- Spring Reactive Transactions introduction
- https://github.com/spring-projects/spring-framework/issues/25091 The bug report on Spring Framework’s GitHub page
- A Stack Overflow question about the problem
- https://github.com/rpuch/spring-commit-on-cancel-problems A GitHub repository with a test that demonstrates the problem
- https://github.com/rpuch/spring-framework/commit/95c2872c0c3a8bebec06b413001148b28bc78f2a A commit switching to ‘rollback-on-cancel’