Monday, July 16, 2012


Create dynamic applications with javax.tools

Understanding and applying javax.tools.JavaCompiler for building dynamic applications
David J. Biesack (David.Biesack@sas.com), Principal Systems Developer, SAS Institute, Inc.
Summary:  Many of today's applications require dynamic capabilities, such as enabling users to supply an abstract form of computation that extends an application's static capabilities. The javax.tools package, added to Java™ Platform, Standard Edition 6 (Java SE) as a standard API for compiling Java source, is a superb way to achieve this goal. This article provides an overview of the major classes in the package, demonstrates how to use them to create a facade for compiling Java source from Java Strings instead of files, and then uses this facade to build an interactive plotting application.
Date:  11 Dec 2007
Level:  Intermediate

The javax.tools package, added to Java SE 6 as a standard API for compiling Java source, lets you add dynamic capabilities to extend static applications. This article provides an overview of the major classes in the package and demonstrates how to use them to create a façade for compiling Java source from Java Strings, StringBuffers, or other CharSequences, instead of files. It then uses this façade to build an interactive plotting application that lets the user express a numeric function y = f(x) using any valid numeric Java expression. Finally, it discusses the possible security risks associated with dynamic source compilation and ways to deal with those risks.
The idea of extending applications by compiling and loading Java extensions isn't new, and several existing frameworks support this capability. JavaServer Pages (JSP) technology in Java Platform, Enterprise Edition (Java EE) is a widely known example of a dynamic framework that generates and compiles Java classes. The JSP translator transforms .jsp files into Java servlets, using intermediate source-code files that the JSP engine then compiles and loads into the Java EE servlet container. The compilation is often performed by directly invoking the javac compiler, which requires an installed Java Development Kit (JDK), or by calling com.sun.tools.javac.Main, which can be found in Sun's tools.jar. Sun's licensing allows tools.jar to be redistributed with the full Java Runtime Environment (JRE). Other ways to implement such dynamic capabilities include using an existing dynamic scripting language (such as JavaScript or Groovy) that integrates with the application's implementation language (see Resources) or writing a domain-specific language and associated language interpreter or compiler.
Other frameworks (such as NetBeans and Eclipse) allow extensions that developers code directly in the Java language, but such systems require external static compilation and source and binary management of the Java code and its artifacts. Apache Commons JCI provides a mechanism to compile and load Java classes into a running application. Janino and Javassist also provide similar dynamic capabilities, although Janino is limited to pre-Java 1.4 language constructs, and Javassist works not at the source-code level but at a Java class abstraction level. (See Resources for links to these projects.) However, because Java developers are already adept at writing in the Java language, a system that lets you simply generate Java source code on the fly and then compile and load it promises the shortest learning curve and the most flexibility.
Using javax.tools has the following advantages:
  • It is an approved extension of Java SE, which means it's a standard API developed through the Java Community Process (as JSR 199). The com.sun.tools.javac.Main API is specifically not part of the documented Java platform API and isn't necessarily available in other vendors' JDKs or guaranteed to have the same API in future releases of the Sun JDK.
  • You use what you know: Java source, not bytecodes. You can create correct Java classes by generating valid Java source without needing to worry about learning the more intricate rules of valid bytecode or a new object model of classes, methods, statements, and expressions.
  • It simplifies, and standardizes on, one supported mechanism for code generation and loading without limiting you to file-based source.
  • It's portable across different vendor implementations of the JDK Version 6 and above, both current and future.
  • It uses a validated version of the Java compiler.
  • Unlike interpreter-based systems, your loaded classes benefit from all the JRE's runtime optimizations.
To understand the javax.tools package, it's helpful to review Java compilation concepts and how the package implements them. The javax.tools package provides abstractions for all of these concepts in a general way that lets you provide the source code from alternate source objects rather than requiring the source to be located in the file system.
Compiling Java source requires the following components:
  • A classpath, from which the compiler can resolve library classes. The compiler classpath is typically composed of an ordered list of file system directories and archive files (JAR or ZIP files) that contain previously compiled .class files. The classpath is implemented by a JavaFileManager that manages multiple source and class JavaFileObject instances and the ClassLoader passed to the JavaFileManager constructor. A JavaFileObject is a FileObject, specialized with one of the JavaFileObject.Kind enumerated types useful to the compiler:
    • SOURCE
    • CLASS
    • HTML
    • OTHER
    Each source file provides an openInputStream() method to access the source as an InputStream.
  • javac options, which are passed as an Iterable<String>.
  • Source files: one or more .java source files to compile. JavaFileManager provides an abstract file system that maps source and output file names to JavaFileObject instances. (Here, file means an association between a unique name and a sequence of bytes. The client doesn't need to use an actual file system.) In this article's example, a JavaFileManager manages mappings between class names and the CharSequence instances containing the Java source to compile. A JavaFileManager.Location has a name and a flag that indicates whether it is an output location. ForwardingJavaFileManager implements the Chain of Responsibility pattern (see Resources), allowing file managers to be chained together, just as a classpath and source paths chain JARs and directories together. If a Java class isn't found in the chain's first element, the lookup is delegated to the rest of the items in the chain.
  • Output directories, where the compiler writes generated .class files. Acting as a collection of output class files, the JavaFileManager also stores JavaFileObject instances representing compiled CLASS files.
  • The compiler itself. The JavaCompiler creates JavaCompiler.CompilationTask objects that compile source from JavaFileObject SOURCE objects in the JavaFileManager, creating new output JavaFileObject CLASS files and Diagnostics (warnings and errors). The static ToolProvider.getSystemJavaCompiler() method returns the compiler instance.
  • Compiler warnings and errors, which are implemented with Diagnostic and DiagnosticListener. A Diagnostic is a single warning or compile error emitted by the compiler. A Diagnostic specifies:
    • Kind (ERROR, WARNING, MANDATORY_WARNING, NOTE, or OTHER)
    • A source location (including a line and column number)
    • A message
    A client provides a DiagnosticListener to the compiler, through which the compiler passes diagnostics back to the client. DiagnosticCollector is a simple DiagnosticListener implementation.
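The pieces in this list can be seen working together in a small, self-contained sketch. It is not the article's façade: the class name DiagnosticsDemo, the StringSource helper, the string:/// URI scheme, and the use of a temporary output directory are all assumptions made for illustration. It compiles one source held in a String and returns whatever Diagnostic objects the compiler reported to a DiagnosticCollector.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

public class DiagnosticsDemo {

   /** Hypothetical helper: a SOURCE file object whose content is a String. */
   static final class StringSource extends SimpleJavaFileObject {
      private final String code;

      StringSource(String className, String code) {
         super(URI.create("string:///" + className.replace('.', '/')
               + Kind.SOURCE.extension), Kind.SOURCE);
         this.code = code;
      }

      @Override
      public CharSequence getCharContent(boolean ignoreEncodingErrors) {
         return code;
      }
   }

   /** Compiles one class and returns the diagnostics the compiler emitted. */
   static List<Diagnostic<? extends JavaFileObject>> check(String className,
         String code) {
      JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
      if (compiler == null) {
         throw new IllegalStateException("No system Java compiler available");
      }
      DiagnosticCollector<JavaFileObject> collector =
            new DiagnosticCollector<JavaFileObject>();
      try {
         // Assumption: send any generated .class files to a scratch directory.
         String outDir = Files.createTempDirectory("diagnostics-demo").toString();
         StandardJavaFileManager fileManager =
               compiler.getStandardFileManager(collector, null, null);
         try {
            compiler.getTask(null, fileManager, collector,
                  Arrays.asList("-d", outDir), null,
                  Collections.singletonList(new StringSource(className, code)))
                  .call();
         } finally {
            fileManager.close();
         }
      } catch (IOException e) {
         throw new IllegalStateException(e);
      }
      return collector.getDiagnostics();
   }

   public static void main(String[] args) {
      // The second line assigns a String to an int, so javac reports an error.
      String bad = "public class Bad {\n  int x = \"oops\";\n}\n";
      for (Diagnostic<? extends JavaFileObject> d : check("Bad", bad)) {
         System.out.println(d.getKind() + " at line " + d.getLineNumber()
               + ": " + d.getMessage(null));
      }
   }
}
```

Each Diagnostic carries the kind, source position, and message described above, so a client can decide how to present errors instead of parsing javac's console output.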
Figure 1 maps the javac concepts to their implementations in javax.tools:

Figure 1. How javac concepts map to javax.tools interfaces
With these concepts in mind, you'll now see how to implement a façade for compiling CharSequences.
In this section, I'll construct a façade for javax.tools.JavaCompiler. The javaxtools.compiler.CharSequenceCompiler class (see Download) can compile Java source stored in arbitrary java.lang.CharSequence objects (such as String, StringBuffer, and StringBuilder), returning a Class. CharSequenceCompiler has the following API:
  • public CharSequenceCompiler(ClassLoader loader, Iterable<String> options) : This constructor accepts a ClassLoader that is passed to the Java compiler, allowing it to resolve dependent classes. The Iterable options allow the client to pass additional compiler options that correspond to the javac options.
  • public Map<String, Class<T>> compile(Map<String, CharSequence> classes, final DiagnosticCollector<JavaFileObject> diagnostics) throws CharSequenceCompilerException, ClassCastException : This is the general compilation method that supports compiling multiple sources together. Note that the Java compiler must handle cyclic graphs of classes, such as A.java depending on B.java, B.java depending on C.java, and C.java depending on A.java. The first argument to this method is a Map whose keys are fully qualified class names and whose corresponding values are CharSequences containing the source for that class. For example:
    • "mypackage.A" ⇒ "package mypackage; public class A { ... }";
    • "mypackage.B" ⇒ "package mypackage; class B extends A implements C { ... }";
    • "mypackage.C" ⇒ "package mypackage; interface C { ... }"
    The compiler adds Diagnostics to the DiagnosticCollector. The generic type parameter T is the primary type that you wish to cast the class to. compile() is overloaded with another method that takes a single class name and CharSequence to compile.
  • public ClassLoader getClassLoader() : This method returns the class loader that the compiler assembles when generating .class files, so that you can load other classes or resources from it.
  • public Class<T> loadClass(final String qualifiedClassName) throws ClassNotFoundException : Because the compile() method can define multiple classes (including public nested classes), this method allows these auxiliary classes to be loaded.
To support this CharSequenceCompiler API, I implement the javax.tools interfaces with the classes JavaFileObjectImpl (for storing the CharSequence sources and CLASS output emitted by the compiler) and FileManagerImpl (which maps names to JavaFileObjectImpl instances to manage both the source sequences and the bytecode emitted from the compiler).
JavaFileObjectImpl, shown in Listing 1, implements JavaFileObject and holds a CharSequence source (for SOURCE) or a ByteArrayOutputStream byteCode (for CLASS files). The key method is CharSequence getCharContent(final boolean ignoreEncodingErrors), through which the compiler obtains the source text. See Download for the complete source for all the code examples.

Listing 1. JavaFileObjectImpl (partial source listing)
                
final class JavaFileObjectImpl extends SimpleJavaFileObject {
   private final CharSequence source;

   JavaFileObjectImpl(final String baseName, final CharSequence source) {
      super(CharSequenceCompiler.toURI(baseName + ".java"), Kind.SOURCE);
      this.source = source;
   }
   @Override
   public CharSequence getCharContent(final boolean ignoreEncodingErrors)
         throws UnsupportedOperationException {
      if (source == null)
         throw new UnsupportedOperationException("getCharContent()");
      return source;
   }
}
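Listing 1 shows only the SOURCE half of the file object. The CLASS half, which the text describes as a ByteArrayOutputStream that receives the compiler's output, might look roughly like the following sketch. The class name ByteCodeFileObject, the mem:/// URI scheme, and the getByteCode() accessor are assumptions for illustration and are not taken from the downloadable source.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.URI;
import javax.tools.SimpleJavaFileObject;

/** Sketch of a CLASS file object that keeps compiler output in memory. */
public class ByteCodeFileObject extends SimpleJavaFileObject {
   // Receives the bytes the compiler writes for this class.
   private final ByteArrayOutputStream byteCode = new ByteArrayOutputStream();

   public ByteCodeFileObject(String qualifiedClassName) {
      // Assumption: an arbitrary "mem" scheme; only the path needs to be unique.
      super(URI.create("mem:///" + qualifiedClassName.replace('.', '/')
            + Kind.CLASS.extension), Kind.CLASS);
   }

   /** The compiler calls this to obtain a stream for the generated bytecode. */
   @Override
   public OutputStream openOutputStream() {
      return byteCode;
   }

   /** A class loader can later pass these bytes to defineClass(). */
   public byte[] getByteCode() {
      return byteCode.toByteArray();
   }
}
```

A file manager would return an instance like this from getJavaFileForOutput(), so the compiler never touches the file system when it emits a class.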

FileManagerImpl (see Listing 2) extends ForwardingJavaFileManager to map qualified class names to JavaFileObjectImpl instances:

Listing 2. FileManagerImpl (partial source listing)
                
final class FileManagerImpl extends ForwardingJavaFileManager<JavaFileManager> {
   private final ClassLoaderImpl classLoader;
   private final Map<URI, JavaFileObject> fileObjects
           = new HashMap<URI, JavaFileObject>();

   public FileManagerImpl(JavaFileManager fileManager, ClassLoaderImpl classLoader) {
      super(fileManager);
      this.classLoader = classLoader;
   }

   @Override
   public FileObject getFileForInput(Location location, String packageName,
         String relativeName) throws IOException {
      FileObject o = fileObjects.get(uri(location, packageName, relativeName));
      if (o != null)
         return o;
      return super.getFileForInput(location, packageName, relativeName);
   }

   public void putFileForInput(StandardLocation location, String packageName,
         String relativeName, JavaFileObject file) {
      fileObjects.put(uri(location, packageName, relativeName), file);
   }
}

If ToolProvider.getSystemJavaCompiler() can't create a JavaCompiler

The ToolProvider.getSystemJavaCompiler() method can return null if tools.jar is not in the application's classpath. The CharSequenceCompiler class detects this possible configuration problem and throws an exception with a recommendation for fixing the problem. Note that Sun's licensing allows tools.jar to be redistributed with the JRE.
With these support classes, I can now define the CharSequenceCompiler. It's constructed with a runtime ClassLoader and compiler options. It uses ToolProvider.getSystemJavaCompiler() to get the JavaCompiler instance, then instantiates a FileManagerImpl that forwards to the compiler's standard file manager.
The compile() method iterates over the input map, constructing a JavaFileObjectImpl from each name/CharSequence and adding it to the JavaFileManager so the JavaCompiler finds it when calling the file manager's getFileForInput() method. The compile() method then creates a JavaCompiler.CompilationTask instance and runs it. Failures are thrown as a CharSequenceCompilerException. Then, for each source passed to the compile() method, the resulting Class is loaded and placed in the result Map.
The class loader associated with the CharSequenceCompiler (see Listing 3) is a ClassLoaderImpl instance that looks up the bytecode for a class in the FileManagerImpl instance, returning the .class files created by the compiler:

Listing 3. CharSequenceCompiler (partial source listing)
                
public class CharSequenceCompiler<T> {
   private final ClassLoaderImpl classLoader;
   private final JavaCompiler compiler;
   private final List<String> options;
   private DiagnosticCollector<JavaFileObject> diagnostics;
   private final FileManagerImpl javaFileManager;

   public CharSequenceCompiler(ClassLoader loader, Iterable<String> options) {
      compiler = ToolProvider.getSystemJavaCompiler();
      if (compiler == null) {
         throw new IllegalStateException(
               "Cannot find the system Java compiler. "
               + "Check that your class path includes tools.jar");
      }
      classLoader = new ClassLoaderImpl(loader);
      diagnostics = new DiagnosticCollector<JavaFileObject>();
      final JavaFileManager fileManager = compiler.getStandardFileManager(diagnostics,
            null, null);
      javaFileManager = new FileManagerImpl(fileManager, classLoader);
      this.options = new ArrayList<String>();
      if (options != null) {
         for (String option : options) {
            this.options.add(option);
         }
      }
   }

   public synchronized Map<String, Class<T>>
         compile(final Map<String, CharSequence> classes,
                 final DiagnosticCollector<JavaFileObject> diagnosticsList)
         throws CharSequenceCompilerException, ClassCastException {
      List<JavaFileObject> sources = new ArrayList<JavaFileObject>();
      for (Entry<String, CharSequence> entry : classes.entrySet()) {
         String qualifiedClassName = entry.getKey();
         CharSequence javaSource = entry.getValue();
         if (javaSource != null) {
            final int dotPos = qualifiedClassName.lastIndexOf('.');
            final String className = dotPos == -1
                  ? qualifiedClassName
                  : qualifiedClassName.substring(dotPos + 1);
            final String packageName = dotPos == -1
                  ? ""
                  : qualifiedClassName.substring(0, dotPos);
            final JavaFileObjectImpl source =
                  new JavaFileObjectImpl(className, javaSource);
            sources.add(source);
            javaFileManager.putFileForInput(StandardLocation.SOURCE_PATH, packageName,
                  className + ".java", source);
         }
      }
      final CompilationTask task = compiler.getTask(null, javaFileManager, diagnostics,
                                                    options, null, sources);
      final Boolean result = task.call();
      if (result == null || !result.booleanValue()) {
         throw new CharSequenceCompilerException("Compilation failed.", 
                                                 classes.keySet(), diagnostics);
      }
      try {
         Map<String, Class<T>> compiled =
               new HashMap<String, Class<T>>();
         for (Entry<String, CharSequence> entry : classes.entrySet()) {
            String qualifiedClassName = entry.getKey();
            final Class<T> newClass = loadClass(qualifiedClassName);
            compiled.put(qualifiedClassName, newClass);
         }
         return compiled;
      } catch (ClassNotFoundException e) {
         throw new CharSequenceCompilerException(classes.keySet(), e, diagnostics);
      } catch (IllegalArgumentException e) {
         throw new CharSequenceCompilerException(classes.keySet(), e, diagnostics);
      } catch (SecurityException e) {
         throw new CharSequenceCompilerException(classes.keySet(), e, diagnostics);
      }
   }
}
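The listings above don't show ClassLoaderImpl, the class loader that turns the compiler's in-memory output into loadable classes. The following sketch illustrates the general idea under stated assumptions. It is not the article's implementation: InMemoryCompileSketch, MemoryClassLoader, and compileAndLoad() are hypothetical names, and for brevity the bytecode map lives in the loader instead of in a separate file manager class.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class InMemoryCompileSketch {

   /** Defines classes from bytecode captured during compilation. */
   static final class MemoryClassLoader extends ClassLoader {
      final Map<String, ByteArrayOutputStream> byteCode =
            new HashMap<String, ByteArrayOutputStream>();

      MemoryClassLoader(ClassLoader parent) {
         super(parent);
      }

      @Override
      protected Class<?> findClass(String name) throws ClassNotFoundException {
         ByteArrayOutputStream out = byteCode.get(name);
         if (out == null) {
            throw new ClassNotFoundException(name);
         }
         byte[] bytes = out.toByteArray();
         return defineClass(name, bytes, 0, bytes.length);
      }
   }

   /** Compiles one class from a String and loads it without touching disk. */
   static Class<?> compileAndLoad(String className, final String source)
         throws ClassNotFoundException {
      JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
      if (compiler == null) {
         throw new IllegalStateException("No system Java compiler available");
      }
      final MemoryClassLoader loader =
            new MemoryClassLoader(InMemoryCompileSketch.class.getClassLoader());

      // SOURCE side: hand the compiler the text through getCharContent().
      JavaFileObject unit = new SimpleJavaFileObject(
            URI.create("string:///" + className.replace('.', '/') + ".java"),
            JavaFileObject.Kind.SOURCE) {
         @Override
         public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return source;
         }
      };

      // CLASS side: forward everything except output, which is kept in memory.
      JavaFileManager fileManager = new ForwardingJavaFileManager<JavaFileManager>(
            compiler.getStandardFileManager(null, null, null)) {
         @Override
         public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location,
               String name, JavaFileObject.Kind kind, FileObject sibling) {
            final ByteArrayOutputStream out = new ByteArrayOutputStream();
            loader.byteCode.put(name, out);
            return new SimpleJavaFileObject(
                  URI.create("mem:///" + name.replace('.', '/') + kind.extension),
                  kind) {
               @Override
               public OutputStream openOutputStream() {
                  return out;
               }
            };
         }
      };

      Boolean ok = compiler.getTask(null, fileManager, null, null, null,
            Collections.singletonList(unit)).call();
      if (!Boolean.TRUE.equals(ok)) {
         throw new IllegalArgumentException("Compilation failed for " + className);
      }
      return loader.loadClass(className);
   }
}
```

Because the loader delegates to its parent first, generated classes can refer to interfaces and classes that the host application already loaded, which is what lets the application call the result through a known type.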

Now that I have a simple API for compiling source, I'll put it in action by creating a function-plotting application, written in Swing. Figure 2 shows the application plotting the x * sin(x) * cos(x) function:

Figure 2. A dynamic application using the javaxtools.compiler package
Plotter Swing application screenshot
The application uses the Function interface defined in Listing 4:

Listing 4. Function interface
                
package javaxtools.compiler.examples.plotter;
public interface Function {
   double f(double x);
}

The application provides a text field in which the user can enter a Java expression that returns a double value based on an implicitly declared double x input parameter. The application inserts that expression text into the code template shown in Listing 5, at the location marked $expression. It also generates a unique class name each time, replacing $className in the template. The package name is also a template variable.

Listing 5. Function template
                
package $packageName;
import static java.lang.Math.*;
public class $className
             implements javaxtools.compiler.examples.plotter.Function {
  public double f(double x) { 
    return ($expression) ; 
  }
} 

The application fills in the template via fillTemplate(packageName, className, expr), which returns a String object it then compiles using the CharSequenceCompiler. Exceptions or compiler diagnostics are passed to the log() method or written directly into the scrollable errors component in the application.
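The downloadable source contains the real fillTemplate(); the sketch below only illustrates the substitution step and assumes plain literal replacement of the three placeholders from Listing 5. The class name TemplateSketch is hypothetical.

```java
public class TemplateSketch {
   // The template from Listing 5, held as a String constant.
   static final String TEMPLATE =
         "package $packageName;\n"
       + "import static java.lang.Math.*;\n"
       + "public class $className\n"
       + "             implements javaxtools.compiler.examples.plotter.Function {\n"
       + "  public double f(double x) {\n"
       + "    return ($expression) ;\n"
       + "  }\n"
       + "}\n";

   /**
    * Fills in the placeholders. String.replace() matches literally, so the
    * '$' characters need no escaping. The user's expression is substituted
    * last so that its text is never rescanned for other placeholders.
    */
   static String fillTemplate(String packageName, String className,
         String expression) {
      return TEMPLATE
            .replace("$packageName", packageName)
            .replace("$className", className)
            .replace("$expression", expression);
   }

   public static void main(String[] args) {
      System.out.println(fillTemplate("plotter.generated", "Fx_0", "x * sin(x)"));
   }
}
```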
The newFunction() method shown in Listing 6 returns an object that implements the Function interface (see the source template in Listing 5):

Listing 6. Plotter's Function newFunction(String expr) method
                
Function newFunction(final String expr) {
   errors.setText("");
   try {
      // generate semi-secure unique package and class names
      final String packageName = PACKAGE_NAME + digits();
      final String className = "Fx_" + (classNameSuffix++) + digits();
      final String qName = packageName + '.' + className;
      // generate the source class as String
      final String source = fillTemplate(packageName, className, expr);
      // compile the generated Java source
      final DiagnosticCollector<JavaFileObject> errs =
            new DiagnosticCollector<JavaFileObject>();
      Class<Function> compiledFunction = stringCompiler.compile(qName, source, errs,
            new Class<?>[] { Function.class });
      log(errs);
      return compiledFunction.newInstance();
   } catch (CharSequenceCompilerException e) {
      log(e.getDiagnostics());
   } catch (InstantiationException e) {
      errors.setText(e.getMessage());
   } catch (IllegalAccessException e) {
      errors.setText(e.getMessage());
   } catch (IOException e) {
      errors.setText(e.getMessage());
   }
   return NULL_FUNCTION;
}

You'll typically generate source classes that extend a known base class or implement a specific interface, so that you can cast the instances to a known type and invoke its methods through a type-safe API. Note that the Function interface is used as the generic type parameter T when instantiating the CharSequenceCompiler. This allows the compiledFunction likewise to be typed as Class<Function> and compiledFunction.newInstance() to return a Function instance without requiring casts.
Once it has dynamically generated a Function instance, the application uses it to generate y values for a range of x values and then plot the (x,y) values using the open source JFreeChart API (see Resources). The full source of the Swing application is available in the downloadable source in the javaxtools.compiler.examples.plotter package.
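Producing the (x,y) values is a plain loop over the Function. The sketch below assumes evenly spaced samples and leaves out the JFreeChart plotting. SamplingSketch and sample() are hypothetical names, and the nested Function interface simply mirrors Listing 4 so that the example stands alone.

```java
public class SamplingSketch {
   /** Mirrors the Function interface from Listing 4. */
   interface Function {
      double f(double x);
   }

   /**
    * Evaluates fn at evenly spaced x values from xMin to xMax inclusive.
    * Each row of the result holds one {x, y} pair.
    */
   static double[][] sample(Function fn, double xMin, double xMax, int points) {
      if (points < 2) {
         throw new IllegalArgumentException("need at least two points");
      }
      double[][] xy = new double[points][2];
      double step = (xMax - xMin) / (points - 1);
      for (int i = 0; i < points; i++) {
         double x = xMin + i * step;
         xy[i][0] = x;
         xy[i][1] = fn.f(x);
      }
      return xy;
   }
}
```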
This application's source code generation needs are quite modest. Other applications will benefit from a more sophisticated source template facility, such as Apache Velocity (see Resources).
An application that allows arbitrary Java source code to be entered by the user has some inherent security risks. Analogous to SQL injection (see Resources), a system that allows a user or other agent to supply raw Java source for code generation can be exploited maliciously. For example, in the Plotter application presented here, a valid Java expression can contain anonymous nested classes that access system resources, spawn threads for denial-of-service attacks, or perform other exploits. This exploit can be termed Java injection. Such applications should not be deployed in an insecure location where untrusted users can access them (such as on a Java EE server as a servlet, or as an applet). Instead, most clients of javax.tools should restrict the user input and translate user requests into secure source code.
Strategies for preserving security when using this package include:
  • Use a custom SecurityManager or ClassLoader that prevents loading of anonymous classes or other classes not under your direct control.
  • Use a source-code scanner or other preprocessor that discards input that uses questionable code constructs. For example, the Plotter can use a java.io.StreamTokenizer and discard input that includes a { (left brace) character, effectively preventing the declaration of anonymous or nested classes.
  • Using the javax.tools API, the JavaFileManager can discard writing of any CLASS file that's unexpected. For example, when compiling a specific class, the JavaFileManager can throw a SecurityException for any other calls to store unexpected class files and allow only generated package and class names that the user can't guess or spoof. This is the strategy used by the Plotter's newFunction method.
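The source-scanner strategy from the list above could be sketched with java.io.StreamTokenizer as follows. This is an illustrative filter, not the Plotter's code and not a complete sandbox. The class name ExpressionFilter and the isAcceptable() method are hypothetical. The sketch deliberately goes one step further than the text and also rejects backslashes, on the assumption that a numeric expression never needs one, because a Unicode escape such as \u007B is read by the Java compiler as a left brace.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class ExpressionFilter {

   /**
    * Returns false if the expression contains a left brace or a backslash
    * anywhere, including inside what looks like a comment or string literal.
    */
   static boolean isAcceptable(String expression) {
      StreamTokenizer tokens = new StreamTokenizer(new StringReader(expression));
      // By default StreamTokenizer treats '/' as a comment character and
      // skips quoted strings. Make these ordinary so nothing is hidden.
      tokens.ordinaryChar('/');
      tokens.ordinaryChar('"');
      tokens.ordinaryChar('\'');
      try {
         while (tokens.nextToken() != StreamTokenizer.TT_EOF) {
            if (tokens.ttype == '{' || tokens.ttype == '\\') {
               return false;
            }
         }
      } catch (IOException e) {
         // Not expected when reading from a String.
         throw new IllegalStateException(e);
      }
      return true;
   }

   public static void main(String[] args) {
      System.out.println(isAcceptable("x * sin(x) / 2"));
      System.out.println(isAcceptable("new Object(){}.hashCode()"));
   }
}
```

Resetting the comment and quote characters matters: with the tokenizer's defaults, everything after a division operator would be skipped as a comment, and a brace could slip through unnoticed.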
I've explained the concepts and significant interfaces of the javax.tools package and shown a façade for compiling Java stored in Strings or other CharSequences, then used that library class to develop a sample application that plots an arbitrary f(x) function. The many other highly useful applications of this technique include:
  • Generating binary file readers/writers from a data-description language.
  • Generating format translators, similar to the Java Architecture for XML Binding (JAXB) or persistence frameworks.
  • Implementing domain-specific language interpreters by performing source-to-Java language translation followed by Java source compilation and loading, as is done for JSP.
  • Implementing rules engines.
  • Whatever your imagination calls to mind.
The next time your application-development needs call for dynamic behavior, explore the variety and flexibility that javax.tools provides.

Download
Description                     Name           Size     Download method
Sample code for this article    j-jcomp.zip    166KB    HTTP