Tuesday, October 29, 2013

Take Control of Class Loading in Java

By Jeff Hanson
Jun 1, 2006

Java's class loading framework is powerful and flexible. It allows applications to access class libraries without linking to static "include" files. Instead, it loads archive files containing library classes and resources from designated locations, such as directories and network locations defined by the CLASSPATH environment variable. The system resolves run-time references to classes and resources dynamically, which simplifies updates and version releases. Nevertheless, each library has its own set of dependencies, and it is up to developers and deployment personnel to make sure that their applications reference the correct versions. Sadly, the combination of the default class-loading system and specific dependencies can and does lead to bugs, system crashes, and worse.

This article proposes a container framework for class loading intended to resolve these issues.

The Java Classpath
Java relies on the CLASSPATH environment variable or property to designate the path that the runtime searches for classes and other resources as they are needed. You define the CLASSPATH property by setting the CLASSPATH environment variable or by using the Java command-line option -classpath.

A Java runtime typically finds and loads classes in the following order:

  1. Classes in the list of bootstrap classes: These are classes that embody the Java platform, such as the classes in rt.jar.
  2. Classes that appear in the list of extension classes: These classes use the Extension Mechanism Framework to extend the Java platform, with archive files (.jar, .zip, etc.) located in the /lib/ext directory of the runtime environment.
  3. User classes: These are classes that do not use the extension mechanism architecture and are identified using the -classpath command-line option or the CLASSPATH environment variable.
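
The search order above can be observed by walking a classloader's parent chain. The following is a minimal sketch rather than anything from the original article: the class name LoaderChain is hypothetical, and it assumes a HotSpot-style runtime in which the bootstrap loader is represented by null. On Java 9 and later, rt.jar and the lib/ext mechanism no longer exist, so the middle of the chain shows a platform loader instead of an extension loader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the article's framework
class LoaderChain {

    /** Walks from the given loader up through its parents, collecting each loader's class name. */
    static List<String> describe(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        // Assumption: getParent() returns null once the bootstrap loader is reached
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        // The loader for user classes is listed first, the bootstrap loader last
        for (String name : describe(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
        // Core classes such as String report a null loader, meaning bootstrap
        System.out.println(String.class.getClassLoader());
    }
}
```

A class is looked up from the end of this chain toward the front, which is why a bootstrap class always wins over a user class of the same name.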
Archives and the Classpath
A .jar or .zip archive file can include a manifest file containing entries that provide archive information, set archive properties, and so on. The manifest can also extend the classpath by including an entry named Class-Path, which contains a list of archives and directories. JDK 1.3 introduced the Class-Path manifest entry for specifying optional jars and directories that load if needed. Here's an example Class-Path entry:

   Class-Path: mystuff/utils.jar 
      mystuff/logging.jar mylib/ 
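
To see how such an entry is interpreted, the sketch below parses a manifest with the standard java.util.jar.Manifest class and splits the Class-Path value into its entries. The class and method names (ManifestClassPath, parse, classPathEntries) are illustrative assumptions; only the Manifest and Attributes calls come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper for illustration only
class ManifestClassPath {

    /** Parses manifest text; the text must end with a line terminator. */
    static Manifest parse(String text) {
        try {
            return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the space-separated entries of the Class-Path attribute, or an empty list. */
    static List<String> classPathEntries(Manifest manifest) {
        String value = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        if (value == null || value.trim().isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(value.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // A continuation line starts with one space, which the parser strips
        String text = "Manifest-Version: 1.0\n"
                    + "Class-Path: mystuff/utils.jar\n"
                    + "  mystuff/logging.jar mylib/\n";
        System.out.println(classPathEntries(parse(text)));
    }
}
```

Entries are resolved relative to the location of the archive that contains the manifest, and an entry ending in a slash names a directory.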

Java provides an extensible model for designating the list of locations and files from which to load classes. However, some problems can arise, such as when a different version of a library exists on the classpath than an executing class expects.

Classpath Version Conflicts
The runtime identity of a class in Java is defined by its fully-qualified name (the package name prepended to the class name, sometimes known as the FQN) combined with the identity of the classloader that loaded the class. Thus, a class loaded by multiple classloaders is regarded by the Java runtime as a separate entity for each loader. This means that the runtime can hold multiple versions of the same class at any given time. This is a very powerful and flexible feature; however, the side effects can be confusing to a developer if it is not used carefully.

Imagine that you're developing an enterprise application that accesses data from multiple sources with similar semantics, such as a file system and a database. Many systems of this type expose a data-access layer with data access objects (DAOs) that abstract the similar data sources. Now, imagine that you load a new version of a database DAO with a slightly different API to meet the demands of a new feature of a DAO client, but you still need the old DAO for other clients that are not ready for the new API. In typical runtime environments, the new DAO will simply replace the old version, and all new instances will be created from the new version. However, if the update takes place without stopping the runtime environment (hot-loading), any already-existing instances of the old DAO will reside in memory alongside instances of the new DAO as they are created. This is confusing at best. Even worse is the danger of a DAO client expecting to create an instance of the old version of the DAO but actually getting an instance of the new version with the altered API.

To ensure stability and safety, calling code must be able to designate the exact version of a class that it intends to use. You can address this by creating a class-loading, component-container model and using some simple class-loading techniques.

Archives and Components
Because archive files (jar files, zip files, etc.) are so tightly coupled with the Java class-loading mechanism and deployment tools, they are a natural candidate to employ as vessels for self-defining components. The success of a Java component packaged and deployed within an archive depends on:

  • Developers being able to specify which version of a component to instantiate
  • Loading the correct version of the component's ancillary classes based on information found in the same jar file as the component
This gives complete control to developers and consumers of the component as to which version of each component is actually created and used.

In the following sections, I will discuss the concept of defining components and component namespaces by the archive into which they are stored.

Sharing Ancillary Resources
 
Figure 1. Using Multiple Classloaders: Because of the way Java's naming convention works, using different classloaders defines different namespaces.

One of the biggest problems with shared libraries under Java's standard classloaders is that all classes are loaded into a single namespace. This makes it very difficult to use different versions of the same library at the same time. What you need is the ability for a component to define its own namespace into which the component and all of its ancillary libraries are loaded.

Because the runtime identity of a class in Java is defined by the class's fully-qualified name and the ID of its classloader, a namespace already exists for each classloader. Therefore, you can use the classloader to build a component container that defines a namespace for a component and its dependencies.

For example, if I have a class named "com.jeffhanson.components.HelloWorld" for which I want to run two versions, the solution is to create an instance of one version of the HelloWorld class with one classloader and create the other version of the HelloWorld class with another classloader. Figure 1 illustrates this concept.

As I'll demonstrate in this article, instantiating a class through two different classloaders effectively creates a virtual namespace for each loader. On its own, however, this technique only creates multiple instances of the same version of the class.
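
The effect can be reproduced with a short experiment. The sketch below is an illustration under stated assumptions, not the article's code: it assumes a full JDK so that javax.tools is available, it compiles a trivial HelloWorld class in the default package (rather than com.jeffhanson.components) into a temporary directory, and the names NamespaceDemo, compileHelloWorld, and loadTwice are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical demo class for illustration only
class NamespaceDemo {

    /** Compiles an empty HelloWorld class into a fresh temp directory and returns the directory. */
    static Path compileHelloWorld() {
        try {
            Path root = Files.createTempDirectory("namespace-demo");
            Path source = root.resolve("HelloWorld.java");
            Files.write(source, "public class HelloWorld { }".getBytes(StandardCharsets.UTF_8));
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a bare JRE
            if (compiler == null
                    || compiler.run(null, null, null, "-d", root.toString(), source.toString()) != 0) {
                throw new IllegalStateException("could not compile the demo class");
            }
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads HelloWorld from the same directory through two separate loaders. */
    static Class<?>[] loadTwice(Path root) {
        try {
            URL[] urls = { root.toUri().toURL() };
            // Null parent: only the bootstrap classes and the URLs are searched.
            // The loaders are left open for brevity; real code should close them.
            Class<?> first = new URLClassLoader(urls, null).loadClass("HelloWorld");
            Class<?> second = new URLClassLoader(urls, null).loadClass("HelloWorld");
            return new Class<?>[] { first, second };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?>[] pair = loadTwice(compileHelloWorld());
        System.out.println("same name:  " + pair[0].getName().equals(pair[1].getName()));
        System.out.println("same class: " + (pair[0] == pair[1]));
    }
}
```

Both Class objects carry the name HelloWorld, yet the runtime treats them as unrelated types, so an instance created from one cannot be cast to the other.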

To facilitate loading and instantiating multiple versions of the same class, I will demonstrate, in the following sections, a component-container framework that will build on the classloader namespace mechanism to allow loading different versions of the same class.

Exploiting Classloader Namespaces
You can implement the component container framework concept as a container entity that is responsible for loading components defined in jar or zip archives and the ancillary classes needed by the components. The goals for this framework are to:

  1. Allow developers to specify which version of a component to instantiate.
  2. Load the correct ancillary classes for each component based on information found in the same jar file as the component.
  3. Share ancillary classes and archives across components.
You'll need a configuration file to define components and their corresponding ancillary files, as illustrated in the following examples:

   
   
      
   HelloWorldComponentV1.jar
      log4j-1.2.12.jar
      concurrent-1.3.4.jar
         
      
   
Compare the elements in the preceding example with the following example. The only change is the value of the component-archive element. This element value defines the name of the archive that contains each versioned component.

   
   
      
   HelloWorldComponentV2.jar
      log4j-1.2.12.jar
      concurrent-1.3.4.jar
         
      
   
To be sure the framework loads classes only from the specified locations, you must create a new ClassLoader that extends URLClassLoader. Give it no parent classloader and override the loadClass method so that lookups do not propagate to the default classloader parent and end up loading the class from the standard classpath. Doing that restricts class-searching to the bootstrap classes and the URLs supplied to the classloader, and lets you supply specific jar file locations from which the classloader will load components.

The following code illustrates the component classloading mechanism:

   package com.jeffhanson.components;
   
   import java.net.URL;
   import java.net.URLClassLoader;
   
   public class RestrictedURLClassLoader 
      extends URLClassLoader
   {
     public RestrictedURLClassLoader(
        URL[] urls)
     {
        // A null parent keeps lookups away from
        // the standard classpath
        super(urls, null);
     }
   
     public Class<?> loadClass(String name)
        throws ClassNotFoundException
     {
        // Searches the bootstrap classes, then
        // the supplied URLs
        Class<?> cls = super.loadClass(name);
        if (cls == null)
        {
           throw new ClassNotFoundException(
              "Restricted ClassLoader"
              + " is unable to find class: " + name);
        }
        return cls;
     }
   }

Figure 2. Component Container Framework Class Relationships: The figure illustrates the relationships between the classes of the component container framework.

The restricted classloader is used by the component container to load components and any specified ancillary classes.
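
The isolation that a null parent provides can be checked directly. The sketch below uses a plain URLClassLoader with a null parent, which is what the restricted loader's constructor sets up; the class name IsolationCheck and the method canLoad are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical check for illustration only
class IsolationCheck {

    /** Reports whether a loader with a null parent and the given URLs can resolve the class. */
    static boolean canLoad(URL[] urls, String className) {
        try (URLClassLoader restricted = new URLClassLoader(urls, null)) {
            restricted.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            // A miss is signalled by an exception, not by a null return value
            return false;
        } catch (IOException e) {
            // Raised only if closing the loader fails
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URL[] none = new URL[0];
        // Bootstrap classes stay visible even without a parent loader
        System.out.println(canLoad(none, "java.lang.String"));
        // Classes on the application classpath, such as this one, do not
        System.out.println(canLoad(none, IsolationCheck.class.getName()));
    }
}
```

Because the standard loadClass implementation throws ClassNotFoundException on a miss, the null check in the listing above acts as a defensive guard rather than the main isolation mechanism.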

The component container uses the context classloader of the current thread to find the URL for the component. Then this URL is fed to the restricted classloader and used to instantiate the component. The component class is then cached by the component container for subsequent calls. Listing 1 shows the code for the component container, and Figure 2 illustrates the relationships between the classes of the component container framework.
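
Listing 1 is not reproduced here, so the following is only a rough sketch of the flow just described, written under assumptions: the class SimpleComponentContainer and its methods locate, createComponent, and cachedClassCount are hypothetical names, and a plain URLClassLoader with a null parent stands in for the restricted classloader.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical container sketch, not the article's Listing 1
class SimpleComponentContainer {

    private final ClassLoader loader;
    private final Map<String, Class<?>> cache = new ConcurrentHashMap<>();

    SimpleComponentContainer(URL[] componentUrls) {
        // Null parent: lookups are limited to bootstrap classes and the supplied URLs
        this.loader = new URLClassLoader(componentUrls, null);
    }

    /** Resolves an archive name to a URL via the thread's context classloader; null if absent. */
    static URL locate(String archiveName) {
        return Thread.currentThread().getContextClassLoader().getResource(archiveName);
    }

    /** Loads the class on first use, reuses the cached Class afterwards, and returns a new instance. */
    Object createComponent(String className) {
        try {
            Class<?> cls = cache.get(className);
            if (cls == null) {
                cls = loader.loadClass(className);
                cache.put(className, cls);
            }
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unable to create component: " + className, e);
        }
    }

    /** Number of distinct component classes loaded so far. */
    int cachedClassCount() {
        return cache.size();
    }
}
```

A caller would typically pass the result of locate for the component archive and each ancillary archive to the constructor, then call createComponent with the component's class name.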

Loading Specific Class Versions
Now, you can use the container and restricted classloader to load components containing versioned classes from specified archives.

Listing 2 shows how to instantiate instances of the component container and initialize them with the names of the configuration files for two different versions of a HelloWorld component. Each component version is then loaded and instantiated using the createComponent method of the ComponentContainer class.

 
Figure 3. Component Sequence: The figure shows the sequence the component container framework follows to create a component.

Calls to each instantiated component object produce results from the expected versions of each component.
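
Listing 2 is not shown here, but the behavior can be approximated end to end. The sketch below makes several assumptions: it needs a full JDK for javax.tools, it generates two versions of a hypothetical demo.HelloWorld class in separate temporary directories instead of using the HelloWorldComponentV1.jar and HelloWorldComponentV2.jar archives, and the names VersionedLoadingDemo, compileVersion, and greetFrom are illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical demo, standing in for the article's Listing 2
class VersionedLoadingDemo {

    /** Compiles one version of demo.HelloWorld into its own temp directory (greeting must not contain quotes). */
    static Path compileVersion(String greeting) {
        try {
            Path root = Files.createTempDirectory("component");
            Path source = root.resolve("demo").resolve("HelloWorld.java");
            Files.createDirectories(source.getParent());
            String code = "package demo;\n"
                + "public class HelloWorld {\n"
                + "    public String greet() { return \"" + greeting + "\"; }\n"
                + "}\n";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a bare JRE
            if (compiler == null) throw new IllegalStateException("a JDK is required");
            int status = compiler.run(null, null, null, "-d", root.toString(), source.toString());
            if (status != 0) throw new IllegalStateException("compilation failed");
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads demo.HelloWorld from the given directory in an isolated loader and calls greet(). */
    static String greetFrom(Path root) {
        try (URLClassLoader loader = new URLClassLoader(new URL[] { root.toUri().toURL() }, null)) {
            Class<?> cls = loader.loadClass("demo.HelloWorld");
            Object component = cls.getDeclaredConstructor().newInstance();
            Method greet = cls.getMethod("greet");
            return (String) greet.invoke(component);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Path v1 = compileVersion("Hello from V1");
        Path v2 = compileVersion("Hello from V2");
        // Same class name, two loaders, two different behaviors in one JVM
        System.out.println(greetFrom(v1));
        System.out.println(greetFrom(v2));
    }
}
```

The component is invoked reflectively because the calling code and the component live in different classloader namespaces; a shared interface loaded by a common parent would be the usual way to avoid reflection.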

The sequence diagram in Figure 3 illustrates the steps taken by the framework to load and create a component.

Notice that calls to the RestrictedURLClassLoader class terminate before reaching an instance of the default classloader, which restricts class-searching to the URLs supplied to the RestrictedURLClassLoader instance.

You've now seen how to build a class-loading component container framework that provides a self-contained context in which to define, version, and create Java components. Exploiting Java's class-loading capabilities in this manner restricts class loading to specified locations, letting you load different versions of classes simultaneously and create and use them in the same running JVM.




