Sunday, October 16, 2011

EHCache - Write behind example

What is Write-Behind?

Write-behind is the asynchronous writing of data to the underlying database. When data is written to the cache, the cache does not write it to the database at the same time. Instead, it places the data in a queue and lets a background thread write it to the database later.

This is useful because you can now:
  1. Schedule writes to the database for a particular time
  2. Use write coalescing: if the queue holds multiple updates to the same key, only the latest one is written
  3. Batch multiple write operations
  4. Specify the number of retry attempts in case of write failure
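
The ideas in the list above can be sketched in plain Java, without Ehcache. This is only an illustration of coalescing, batching and retries under simplified assumptions: the class WriteBehindQueueSketch and the BatchWriter interface are hypothetical names, flushing is triggered by hand instead of by a background thread, and it is not how Ehcache implements its queue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hypothetical in-memory write-behind queue, not Ehcache's implementation.
class WriteBehindQueueSketch {

    /** Hypothetical stand-in for the database: receives one batch of key/value pairs per call. */
    interface BatchWriter {
        void writeBatch(Map<String, String> batch);
    }

    // Keyed by cache key, so a later write to the same key replaces the pending value (coalescing).
    private final Map<String, String> pending = new LinkedHashMap<>();
    private final BatchWriter writer;
    private final int batchSize;
    private final int retryAttempts;
    private int discardedBatches = 0;

    WriteBehindQueueSketch(BatchWriter writer, int batchSize, int retryAttempts) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.writer = writer;
        this.batchSize = batchSize;
        this.retryAttempts = retryAttempts;
    }

    /** Cache-put path: only records the write and returns immediately. */
    synchronized void enqueue(String key, String value) {
        pending.put(key, value);
    }

    /** Deferred path: drains the queue in batches and returns the number of batches written. */
    synchronized int flush() {
        int written = 0;
        Map<String, String> batch = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : pending.entrySet()) {
            batch.put(entry.getKey(), entry.getValue());
            if (batch.size() == batchSize) {
                if (writeWithRetry(batch)) {
                    written++;
                }
                batch = new LinkedHashMap<>();
            }
        }
        if (!batch.isEmpty() && writeWithRetry(batch)) {
            written++;
        }
        pending.clear();
        return written;
    }

    /** Tries the write once plus up to retryAttempts more times, then gives up on the batch. */
    private boolean writeWithRetry(Map<String, String> batch) {
        for (int attempt = 0; attempt <= retryAttempts; attempt++) {
            try {
                writer.writeBatch(batch);
                return true;
            } catch (RuntimeException e) {
                // fall through and try again
            }
        }
        discardedBatches++;
        return false;
    }

    synchronized int getDiscardedBatches() {
        return discardedBatches;
    }

    public static void main(String[] args) {
        List<Map<String, String>> database = new ArrayList<>();
        WriteBehindQueueSketch queue = new WriteBehindQueueSketch(database::add, 2, 1);
        queue.enqueue("1", "Value");
        queue.enqueue("2", "Value");
        queue.enqueue("1", "Value1"); // coalesced with the first write to key "1"
        queue.enqueue("3", "Value");
        int batches = queue.flush();
        // prints: 2 batches: [{1=Value1, 2=Value}, {3=Value}]
        System.out.println(batches + " batches: " + database);
    }
}
```

Four puts reach the stand-in database as two batched writes, and key "1" is written only with its latest value.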

To use write-behind, you first need to implement the CacheWriter interface:
import java.util.Collection;

import net.sf.ehcache.CacheEntry;
import net.sf.ehcache.CacheException;
import net.sf.ehcache.Ehcache;
import net.sf.ehcache.Element;
import net.sf.ehcache.writer.CacheWriter;
import net.sf.ehcache.writer.writebehind.operations.SingleOperationType;

/*
This class handles writing to the database or your backend persistence storage
*/
public class EhcacheWriteBehindClass implements CacheWriter {

    @Override
    public CacheWriter clone(Ehcache cache) throws CloneNotSupportedException {
        throw new CloneNotSupportedException("EhcacheWriteBehindClass cannot be cloned!");
    }

    @Override
    public void delete(CacheEntry entry) throws CacheException {
        // Invoked for removeWithWriter(); typically you would delete from your database here
    }

    @Override
    public void deleteAll(Collection<CacheEntry> entries) throws CacheException {
        // Batched variant of delete()
    }

    @Override
    public void dispose() throws CacheException {
        // You can close database connections here
    }

    @Override
    public void init() {
        // You can initialize the database here
    }

    @Override
    public void write(Element element) throws CacheException {
        // Typically you would write to your database here
        System.out.println("Write : Key is " + element.getKey());
        System.out.println("Write : Value is " + element.getValue());
    }

    @Override
    public void writeAll(Collection<Element> elements) throws CacheException {
        // Batched variant of write(), used when write batching is enabled
        System.out.println("Write All");
    }

    @Override
    public void throwAway(Element element, SingleOperationType operationType,
            RuntimeException exception) {
        // Invoked when an element could not be written even after all retry attempts
    }
}

This class is instantiated by the CacheWriterFactory:

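import java.util.Properties;
import net.sf.ehcache.Ehcache;
import net.sf.ehcache.writer.CacheWriter;
import net.sf.ehcache.writer.CacheWriterFactory;

public class WriteBehindClassFactory extends CacheWriterFactory {

    @Override
    public CacheWriter createCacheWriter(Ehcache cache, Properties properties) {
        return new EhcacheWriteBehindClass();
    }
}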

Now register the factory in the ehcache.xml as follows:
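
A minimal sketch of such a registration, assuming the Ehcache 2.x cacheWriter element. The package name com.example is hypothetical, and the sizing, delay, batch and retry values are illustrative assumptions rather than recommendations:

```xml
<ehcache>
    <cache name="writeBehindCache"
           maxElementsInMemory="1000"
           eternal="false"
           overflowToDisk="false">
        <cacheWriter writeMode="write-behind"
                     maxWriteDelay="5"
                     writeCoalescing="true"
                     writeBatching="true"
                     writeBatchSize="20"
                     retryAttempts="2"
                     retryAttemptDelaySeconds="2">
            <!-- hypothetical package: use the fully qualified name of your own factory -->
            <cacheWriterFactory class="com.example.WriteBehindClassFactory"/>
        </cacheWriter>
    </cache>
</ehcache>
```

With writeBatching="true", Ehcache hands queued elements to writeAll rather than write. Set it to "false" if you want to see the per-element output from write.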

To use this write-behind functionality, your class would look like this:
 
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class EhcacheWriteBehindTest {

    public static void main(String[] args) throws Exception {
        // pass in the number of objects you want to generate, default is 100
        int numberOfObjects = args.length == 0 ? 100 : Integer.parseInt(args[0]);
        System.out.println(numberOfObjects);
        // create the CacheManager
        CacheManager cacheManager = CacheManager.getInstance();
        // get a handle on the Cache - "writeBehindCache" is the name of a cache in the ehcache.xml file
        Cache myCache = cacheManager.getCache("writeBehindCache");

        // iterate through numberOfObjects and use the loop index as the key, value does not matter at this time
        for (int i = 0; i < numberOfObjects; i++) {
            String key = Integer.toString(i);
            if (!checkInCache(key, myCache)) {
                // when putting in the cache, it is as an Element, the key and the value must be serializable
                myCache.putWithWriter(new Element(key, "Value"));
                System.out.println(key + " NOT in cache!!!");
            } else {
                System.out.println("Put with writer ... value1");
                // note, we use the putWithWriter method and not the put method
                myCache.putWithWriter(new Element(key, "Value1"));
            }
        }

        // keep the JVM alive so the write-behind thread has time to drain the queue
        while (true) {
            Thread.sleep(1000);
        }
    }

    // check to see if the key is in the cache
    private static boolean checkInCache(String key, Cache myCache) throws Exception {
        Element element = myCache.get(key);
        boolean returnValue = false;
        if (element != null) {
            System.out.println(key + " is in the cache!!!");
            returnValue = true;
        }
        return returnValue;
    }
}

That's it! For a detailed explanation of the configurations involved, have a look at this.

The limitation of this approach is that if your JVM goes down, the write-behind queue is lost. To avoid this, you can use clustered Terracotta, which relies on the Terracotta Server Array. In this case the queue is maintained at the Terracotta Server Array, which provides HA features. If one client JVM goes down, the changes it put into the write-behind queue can still be picked up by threads in other clustered JVMs, so they are applied to the database without any data loss.


Terracotta Server Array is an enterprise feature and needs very little configuration. You can download a trial version from here.


The only change you need to make in this app to make it clustered is in the ehcache.xml. Your ehcache.xml would now look like this:
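
A minimal sketch of a clustered configuration, assuming the Ehcache 2.x terracottaConfig and terracotta elements. The package name com.example is hypothetical and the cache sizing and writer attributes are illustrative assumptions:

```xml
<ehcache>
    <!-- address of the Terracotta Server Array -->
    <terracottaConfig url="localhost:9510"/>

    <cache name="writeBehindCache"
           maxElementsInMemory="1000"
           eternal="false">
        <!-- marks this cache as clustered -->
        <terracotta/>
        <cacheWriter writeMode="write-behind"
                     maxWriteDelay="5"
                     writeCoalescing="true">
            <!-- hypothetical package: use the fully qualified name of your own factory -->
            <cacheWriterFactory class="com.example.WriteBehindClassFactory"/>
        </cacheWriter>
    </cache>
</ehcache>
```

Compared with a standalone configuration, the additions are the terracottaConfig element at the top level and the terracotta element inside the cache.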

terracottaConfig url="localhost:9510" points to the host and port where your Terracotta Server Array runs.

9 comments:

  1. Why is write-behind not ordered in Ehcache? Please suggest a way to make the writes ordered in a clustered environment. We are running our application in the cloud and were wondering whether Terracotta is capable of running ordered write-behinds in the cloud.

  2. When putting elements to the cache with write-behind, does it have to use myCache.putWithWriter(new Element(key, "Value")) or put would work too?

  3. @Ankita: On a single node, write-behind order should be preserved. If two nodes modify the same data, however, a node whose value has already been overwritten by another node may get to write to the underlying database only after the other (more up-to-date) value has been written. To work around this you would need some versioning logic, so that every time you update an element in the cache the version number increases.
    More details here:
    http://ehcache.org/documentation/apis/write-through-caching#node-time-synchronisation

    @Joff: Yes you would need to use putWithWriter to use write-behind

  4. Sourabh Ghose, I have another question. When doing write-behind, how do I delete what is in the cache after it has been processed through the write method in the CacheWriter implementation?

  5. @Joff: Write-behind is used only to replicate asynchronously to another datastore. It will not remove that element from the cache. You have to remove the element in the application code.

  6. So, Sourabh Ghose, can I use putWithWriter and removeWithWriter, and remove the element from the cache in the delete method of the CacheWriter? I noticed that the delete method in the CacheWriter does not get invoked like write does.

  7. This comment has been removed by the author.

  8. If I am both adding new entries and updating old ones, how do I manage that at the CacheWriter level, given that we have only one method in the CacheWriter API? My writer implementation tries to insert or update data in the DB. Please suggest a way out of this.

    Replies
    1. @Mohit - I am not sure I understand you correctly. The write(Element e) method in the CacheWriter will manage both insert and updates.
      There are several other options to write to an underlying DB -
      1. Use Write through caching to update the cache and DB synchronously
      2. Run a Transaction (http://ehcache.org/documentation/apis/transactions) and update the cache and DB within the transaction boundary. With this you get all transactional guarantees, but of course pay the price for performance.
      3. Check write-behind properties (http://ehcache.org/documentation/apis/write-through-caching#configuration) there is a property called writeCoalescing which when set, will consider only the latest write to a particular key within the queue.
