Monday, December 30, 2013

I am developing a Java web application that bases its behavior on large XML configuration files that are loaded from a web service. As these files are not actually required until a particular section of the application is accessed, they are loaded lazily. When one of these files is required, a query is sent to the web service to retrieve the corresponding file. As some of the configuration files are likely to be used much, much more often than others, I'd like to set up some kind of caching (with maybe a 1 hour expiration time) to avoid requesting the same file over and over.
The files returned by the web service are the same for all users across all sessions. I do not use JSP, JSF or any other fancy framework, just plain servlets.
My question is, what is considered a best practice to implement such a global, static cache within a Java web application? Is a singleton class appropriate, or will there be weird behaviors due to the J2EE containers? Should I expose something somewhere through JNDI? What shall I do so that my cache doesn't get screwed in clustered environments (it's OK, but not necessary, to have one cache per clustered server)?
Given the information above, would it be a correct implementation to put an object responsible for caching as a ServletContext attribute?
Note: I do not want to load all of them at startup and be done with it, because:
1) that would overload the web service whenever my application starts up
2) the files might change while my application is running, so I would have to requery them anyway
3) I would still need a globally accessible cache, so my question still holds
Update: Using a caching proxy (such as Squid) may be a good idea, but each request to the web service will send a rather large XML query in the POST data, which may be different each time. Only the web application really knows that two different calls to the web service are actually equivalent.
Thanks for your help

4 Answers

Your question contains several separate questions, so let's take them one at a time. ServletContext is a good place to store a handle to your cache. The price is that you get one cache per server instance, which should not be a problem. If you want to register the cache in a wider scope, consider registering it in JNDI.
Now for the caching itself. Basically, you are retrieving XML via a web service. If you access this web service over HTTP, you can install a simple HTTP proxy server on your side that handles caching of the XML. The next step is to cache the objects resolved from the XML in some sort of local object cache. This cache can exist per server without any problem, and EHCache will do a perfect job here. The processing chain then looks like this: client -> HTTP request -> servlet -> look into the local cache -> if not cached, ask the HTTP proxy (XML files) -> the proxy does its job (HTTP to the web service).
Pros:
  • Local cache per server instance, which contains only objects from requested xmls
  • One HTTP proxy running on the same hardware as our webapp.
  • Possibility to scale webapp without adding new http proxies for xml files.
Cons:
  • One more layer of infrastructure
  • +1 point of failure (http proxy)
  • More complicated deployment
Update: don't forget to always send an HTTP HEAD request to the proxy to ensure that the cache is up to date.
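The "look into local cache, if not cached fetch it" step of the chain above can be sketched in plain Java. This is only a minimal illustration, not a replacement for EHCache: the class name ExpiringCache, its methods, and the injectable clock are all assumptions made for the example, and the loader stands in for whatever code actually calls the proxy or web service.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical per-server look-aside cache with a fixed time-to-live. */
final class ExpiringCache<K, V> {

    /** A cached value together with the instant at which it stops being valid. */
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so that expiry can be tested

    ExpiringCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    ExpiringCache(long ttlMillis) {
        this(ttlMillis, System::currentTimeMillis);
    }

    /**
     * Returns the cached value for key, calling loader only when the entry
     * is missing or expired. compute() runs atomically per key, so concurrent
     * requests for the same key trigger a single load.
     */
    V get(K key, Function<? super K, ? extends V> loader) {
        final long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) -> {
            if (old != null && now < old.expiresAtMillis) {
                return old; // still fresh: keep the existing entry
            }
            return new Entry<V>(loader.apply(k), now + ttlMillis);
        });
        return entry.value;
    }

    /** Drops one entry so that the next get() reloads it. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

One instance with a TTL of 3,600,000 ms (the 1 hour mentioned in the question) could be stored as a ServletContext attribute and called as cache.get(name, n -> fetchFromWebService(n)), where fetchFromWebService is a hypothetical method. Note that the loader runs while the map entry is locked, so a slow web service call makes other requests for the same key wait; that is one of the details a library such as EHCache is better placed to handle.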
Here's an example of caching with EhCache. This code is used in several projects to implement ad hoc caching.
1) Put your cache in the global context. (Don't forget to register the listener in web.xml.)
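import javax.servlet.ServletContext;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;

public class InitializationListener implements ServletContextListener {
    @Override
    public void contextInitialized(ServletContextEvent sce) {
        ServletContext ctx = sce.getServletContext();
        CacheManager singletonManager = CacheManager.create();
        // Arguments: name, max elements in memory, overflowToDisk, eternal, timeToLive (s), timeToIdle (s).
        // eternal must be false, otherwise the two timeouts are ignored.
        Cache memoryOnlyCache = new Cache("dbCache", 100, false, false, 86400, 86400);
        singletonManager.addCache(memoryOnlyCache);
        Cache cache = singletonManager.getCache("dbCache");
        // Publish the cache so that every servlet in this web application can reach it.
        ctx.setAttribute("dbCache", cache);
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        // Release cache resources when the web application is stopped or redeployed.
        CacheManager.getInstance().shutdown();
    }
}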
2) Retrieve the cache instance when you need it, e.g. from a servlet:
Cache cache = (Cache) getServletContext().getAttribute("dbCache");
3) Query the cache just before you do an expensive operation.
        Element e = cache.get(key);
        if (e != null) {
            result = e.getObjectValue(); // cache hit: reuse the stored object
        } else {
            // Cache miss: write code to create the object you need to cache
            // (assign it to result), then store it in the cache.
            Element resultCacheElement = new Element(key, result);
            cache.put(resultCacheElement);
        }
4) Also don't forget to invalidate cached objects when appropriate.
You can find more samples here
Option #1: Use an Open Source Caching Library Such as EHCache
Don't implement your own cache when there are a number of good open source alternatives that you can drop in and start using. Implementing your own cache is much more complex than most people realize, and if you don't know exactly what you are doing wrt threading, you'll easily end up reinventing the wheel and re-solving some very difficult problems.
I'd recommend EHCache; it is under an Apache license. You'll want to take a look at the EHCache code samples.
Option #2: Use Squid
An even easier solution to your problem would be to use Squid... Put Squid in between the process that requests the data to be cached and the system serving the request: http://www.squid-cache.org/
 
Thank you, but that doesn't really answer my question. If I decide to use EHCache, how do I set it up within a java ee container so that it works properly across sessions, various classloader mess and in clustered environments? –  LordOfThePigs Mar 31 '09 at 5:32
 
I'm not convinced that Squid would be the right solution. It seems awfully clever, however I don't believe that web services will respect "if modified" –  monksy Nov 1 '09 at 10:12