Tuesday, December 31, 2013

Transaction isolation with Spring and Hibernate

Recently I had to investigate whether stale data may be passed across transaction boundaries in our application. The application is Spring based and uses JPA with Hibernate to access the database. The answer to that question is not that interesting, but to give it I had to understand how the whole thing is set up and configured. There are quite a few resources that address something like this, but with Spring the usual approach has recently gone from using Hibernate directly to using Hibernate through JPA. While there are quite a few books and tutorials on this, none of them also explain how to set the isolation level and what happens when you use the optimistic locking approach.
When talking about the I in the ACID principle, one has to consider three things that can happen:
  • Dirty reads: One transaction reads data that has not yet been committed by another transaction.
  • Non-repeatable reads: One transaction reads data, another transaction updates this data and commits it, then the first transaction reads the “same” data again with a different result. A special case of this is the lost update: if the first transaction now writes the data based on what it originally read, the update of the second transaction is overwritten and lost.
  • Phantom reads: One transaction reads a set of data. Another transaction then inserts or deletes data (this would also include some forms of update) in such a way that when the first transaction reads the data again, the number of results is different.
This list is ordered: if phantom reads are prevented by the transaction, dirty reads and non-repeatable reads cannot occur either, while on the other hand preventing dirty reads does not prevent non-repeatable reads.
JDBC defines four different isolation levels as constants on java.sql.Connection, following the SQL standard. Each level draws the line between these phenomena at a different point:
  • TRANSACTION_READ_UNCOMMITTED: Allows dirty reads, non-repeatable reads, and phantom reads to occur.
  • TRANSACTION_READ_COMMITTED: Prevents dirty reads, but non-repeatable reads and phantom reads may occur.
  • TRANSACTION_REPEATABLE_READ: Prevents dirty and non-repeatable reads, but phantom reads may occur.
  • TRANSACTION_SERIALIZABLE: Prevents all three: dirty reads, non-repeatable reads, and phantom reads.
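The list above can be sketched as a small lookup in plain Java. This is only an illustration: the class name IsolationLevels and the method permitted are made up for this post, while the four constants are the real ones from java.sql.Connection.

```java
import java.sql.Connection;

// Hypothetical helper class, not part of Spring, JPA or Hibernate.
class IsolationLevels {

    /** Returns the read phenomena that the given JDBC isolation level still permits. */
    static String permitted(int level) {
        switch (level) {
            case Connection.TRANSACTION_READ_UNCOMMITTED:
                return "dirty reads, non-repeatable reads, phantom reads";
            case Connection.TRANSACTION_READ_COMMITTED:
                return "non-repeatable reads, phantom reads";
            case Connection.TRANSACTION_REPEATABLE_READ:
                return "phantom reads";
            case Connection.TRANSACTION_SERIALIZABLE:
                return "none";
            default:
                throw new IllegalArgumentException("Unknown isolation level: " + level);
        }
    }

    public static void main(String[] args) {
        // Ordered from the weakest to the strongest level.
        int[] levels = {
            Connection.TRANSACTION_READ_UNCOMMITTED,
            Connection.TRANSACTION_READ_COMMITTED,
            Connection.TRANSACTION_REPEATABLE_READ,
            Connection.TRANSACTION_SERIALIZABLE
        };
        for (int level : levels) {
            System.out.println(level + " still permits: " + permitted(level));
        }
    }
}
```

Each stronger level permits a strict subset of what the weaker one permits, which is the ordering described above.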
That is the technical basis, but how exactly is this applied in a concrete context? When using Spring, the isolation level can be configured as part of the transactional configuration. This is described at length in “Spring in Action”, third edition, by Craig Walls and in the Spring Framework documentation.
Our application is largely configured using annotations, so I was used to seeing the occasional @Transactional annotation. However, this annotation was at most accompanied by a specification of the propagation of the transaction. The isolation level was not defined anywhere, neither in the annotations nor in XML. As Spring generally follows convention over configuration, it is safe to assume that there are default values set, but what exactly is the default?
Spring defines an additional isolation level: DEFAULT. This is not an alias for one of the other defined levels, but simply states that the default level of the underlying data store is to be used. As I read this, that could mean either Hibernate or the Oracle database. As it turns out, it is the default level of the database itself. For Oracle this is read committed. Oracle also provides the isolation levels read-only and serializable.
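To check what “the default of the database” actually is, one can ask the JDBC driver. The following is a minimal sketch using only java.sql. The class name DefaultIsolationProbe and its methods are made up for illustration, and obtaining the Connection (for example from the DataSource) is assumed and not shown.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

// Hypothetical helper class; only the java.sql calls are real API.
class DefaultIsolationProbe {

    /** Translates a java.sql.Connection isolation constant into a readable name. */
    static String levelName(int level) {
        switch (level) {
            case Connection.TRANSACTION_NONE:             return "NONE";
            case Connection.TRANSACTION_READ_UNCOMMITTED: return "READ_UNCOMMITTED";
            case Connection.TRANSACTION_READ_COMMITTED:   return "READ_COMMITTED";
            case Connection.TRANSACTION_REPEATABLE_READ:  return "REPEATABLE_READ";
            case Connection.TRANSACTION_SERIALIZABLE:     return "SERIALIZABLE";
            default:                                      return "UNKNOWN(" + level + ")";
        }
    }

    /** Asks the driver for its default level and which of the four levels it supports. */
    static String report(Connection connection) throws SQLException {
        DatabaseMetaData meta = connection.getMetaData();
        StringBuilder text = new StringBuilder();
        text.append("default=").append(levelName(meta.getDefaultTransactionIsolation()));
        int[] candidates = {
            Connection.TRANSACTION_READ_UNCOMMITTED,
            Connection.TRANSACTION_READ_COMMITTED,
            Connection.TRANSACTION_REPEATABLE_READ,
            Connection.TRANSACTION_SERIALIZABLE
        };
        for (int level : candidates) {
            text.append(", ").append(levelName(level)).append('=')
                .append(meta.supportsTransactionIsolationLevel(level));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Without a database at hand, only the name mapping can be shown here.
        System.out.println(levelName(Connection.TRANSACTION_READ_COMMITTED));
    }
}
```

Calling report(connection) against the real database is a quick way to confirm both the default and the levels the driver claims to support, instead of relying on documentation alone.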
Setting the isolation level to serializable would mean that on any read the data (individual rows or the whole table) is locked until the transaction completes. In most cases this is not a good idea, because the number of reads is far greater than the number of updates, so this can be a real damper on performance. For our application TRANSACTION_REPEATABLE_READ would be the ideal choice: viewed data may be updated concurrently, but it is perfectly normal that two queries do not return the same number of results. As Oracle does not support this level, I had to investigate in another direction.
This is exactly where Hibernate comes into play with its caching. When data is read, Hibernate puts it into the cache. If the data was modified by a different transaction in the meantime, Hibernate recognizes this. Using the isolation level serializable results in a pessimistic locking strategy: the data is locked against any access by another transaction. The opposite is an optimistic locking strategy. This strategy follows the saying: “Hope for the best, but prepare for the worst!” That means we hope that no issues will arise from concurrent modification of data by different transactions, but it might happen and we are prepared to handle the resulting optimistic locking exceptions. One point should be mentioned here: these exceptions are raised by the persistence provider (Hibernate), not by Spring. Normally it is good practice to let Spring translate such provider-specific exceptions into its own neutral runtime exceptions. This would be done by annotating the DAO classes with @Repository, which we did not do, so we deal with the Hibernate exceptions directly.
Now how exactly is an application configured to use optimistic locking? There are two approaches:
  • Configuration in XML
  • Extension of the database schema with special fields and annotations
Configuration in XML means in the persistence.xml or wherever you configured your Hibernate mapping. To every class element you add the attribute optimistic-lock="all". See also Hibernate Optimistic Locking without Version (or Timestamp). Locking with a version or timestamp is the second approach, and the one we have chosen in our application. In the XML approach you have to configure something for every class element. Here you have to do the equivalent in the code: a version or timestamp field has to be added to every table, and this field then has to be annotated in the entity class. Examples can be seen here and here (Optimistic vs. pessimistic).
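What the version field buys you can be sketched without any Hibernate at all. With a version column, the update statement carries the version that was read in its WHERE clause and increments it on success; if no row matches, the data was changed in the meantime. The following plain Java simulation mimics just that check. All class names (Account, AccountTable, StaleVersionException) are hypothetical stand-ins; in the real application the field would be annotated with JPA's @Version and the provider would raise its own exception.

```java
import java.util.HashMap;
import java.util.Map;

class OptimisticLockingDemo {
    public static void main(String[] args) {
        AccountTable table = new AccountTable();
        table.insert(1L, "alice");

        // Two "transactions" read the same row at version 0.
        Account readByTx1 = table.read(1L);
        Account readByTx2 = table.read(1L);

        // The first writer wins and bumps the version to 1.
        table.update(readByTx1.withOwner("bob"));
        try {
            // The second writer still carries version 0 and is rejected.
            table.update(readByTx2.withOwner("carol"));
        } catch (StaleVersionException e) {
            System.out.println("Second update rejected: " + e.getMessage());
        }
    }
}

/** Immutable snapshot of a row, including the version it was read at. */
final class Account {
    final long id;
    final String owner;
    final long version;

    Account(long id, String owner, long version) {
        this.id = id;
        this.owner = owner;
        this.version = version;
    }

    /** Changes the payload only; the version keeps marking what was originally read. */
    Account withOwner(String newOwner) {
        return new Account(id, newOwner, version);
    }
}

/** Hypothetical stand-in for the exception a persistence provider would raise. */
class StaleVersionException extends RuntimeException {
    StaleVersionException(String message) {
        super(message);
    }
}

/** In-memory stand-in for a table with a version column. */
class AccountTable {
    private final Map<Long, Account> rows = new HashMap<>();

    synchronized void insert(long id, String owner) {
        rows.put(id, new Account(id, owner, 0L));
    }

    synchronized Account read(long id) {
        return rows.get(id);
    }

    /** Mimics: UPDATE ... SET version = version + 1 WHERE id = ? AND version = ? */
    synchronized void update(Account changed) {
        Account current = rows.get(changed.id);
        if (current == null || current.version != changed.version) {
            throw new StaleVersionException("Row " + changed.id + " was modified concurrently");
        }
        rows.put(changed.id, new Account(changed.id, changed.owner, changed.version + 1));
    }
}
```

No lock is held between read and update. The conflict is only detected at write time, which is why the calling code has to be prepared for the exception, for example by reloading the data and asking the user to repeat the change.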
Now let’s look at the configuration of the transaction handling in Spring. The Spring Framework handles transactions using AOP. This means that you have to declare the namespaces for aop and tx:
  xmlns:aop="http://www.springframework.org/schema/aop"
  xmlns:tx="http://www.springframework.org/schema/tx"
...
xsi:schemaLocation="
  ...
  http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-3.0.xsd
  http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-3.0.xsd
  ...
Let’s first have a look at the advice, which defines what methods should be handled how:
<tx:annotation-driven />
 
<tx:advice id="txAdvice" transaction-manager="transactionManager">
  <tx:attributes>
    <tx:method name="get*" propagation="REQUIRED" read-only="true" rollback-for="Exception" />
    <tx:method name="*" propagation="REQUIRED" rollback-for="Exception" />
  </tx:attributes>
</tx:advice>
The “annotation-driven” element defines that the transaction configuration is handled with annotations. The advice references a transaction manager (in this case the reference could also be omitted, as Spring assumes by default that there is a transaction manager defined with the id “transactionManager”). This advice sets the default propagation to REQUIRED and the rollback behavior to roll back on any Exception. For methods whose names start with get, it additionally specifies that the transaction is read-only.
Now let’s take a look at the pointcut, which defines where this transactional advice should be applied:
<aop:pointcut id="repository_methods"
    expression="execution(* (@ch.sahits.spring.MyRepository ch.sahits.spring.dataaccess*)+.*(..))"/>
This matches all methods of types in the package ch.sahits.spring.dataaccess that are annotated with @MyRepository, which is an extension of @Component.
Now let’s put this together:
<aop:config>
  <aop:pointcut id="repository_methods"
      expression="execution(* (@ch.sahits.spring.MyRepository ch.sahits.spring.dataaccess*)+.*(..))"/>
  <aop:advisor advice-ref="txAdvice" pointcut-ref="repository_methods" />
</aop:config>
There is more to configure, starting with the transaction manager. The transaction manager obviously needs the DataSource. I omit the configuration of that one, as there are enough examples for just that out there. The transaction manager is defined as a JpaTransactionManager. As I mentioned earlier, we are using JPA as a facade for Hibernate:
<bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
  <property name="entityManagerFactory" ref="emFactory" />
</bean>
Next the definition of the emFactory:
<bean id="emFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
  <property name="persistenceXmlLocation" value="classpath:META-INF/da-oracle-persistence.xml" />
</bean>
This bit of configuration connects JPA with the persistence.xml mentioned earlier. The last bit is just that XML:
<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0" xmlns="http://java.sun.com/xml/ns/persistence">
 
  <persistence-unit name="myPersistence" transaction-type="RESOURCE_LOCAL">
    <provider>org.hibernate.ejb.HibernatePersistence</provider>
    <non-jta-data-source>java:comp/env/jdbc/MyDataSource</non-jta-data-source>
 
    
 
    <properties>
      <property name="hibernate.dialect" value="org.hibernate.dialect.Oracle10gDialect" />
      <property name="hibernate.show_sql" value="false" />
      <property name="hibernate.jdbc.batch_size" value="100" />
      <property name="hibernate.order_inserts" value="true" />
      <property name="hibernate.order_updates" value="true" />
 
      <property name="javax.persistence.validation.group.pre-persist"
          value="ch.sahits.spring.datamodel.validation.PrePersist" />
      <property name="javax.persistence.validation.group.pre-update"
          value="ch.sahits.spring.datamodel.validation.PrePersist" />
      <property name="javax.persistence.validation.group.pre-remove"
          value="" />
    </properties>
  </persistence-unit>
 
</persistence>
First Hibernate is defined as the persistence provider. Next comes the data source, which in this case is retrieved from JNDI. Compare:
<bean id="myDataSource" class="org.springframework.jndi.JndiObjectFactoryBean">
  <property name="jndiName" value="java:comp/env/jdbc/MyDataSource" />
</bean>
In the properties section the Hibernate-specific settings are made, most notably the definition of the dialect. PrePersist is a marker interface for a validation group that is applied before save/update.
The mapping does not need to be configured here, as it is handled with annotations (@Entity and @Column).