Tuesday, January 10, 2012

How Servlet Containers Work

by Budi Kurniawan
05/14/2003

Editor's Note: This article and the previous one in this series, "How Web Servers Work," are excerpts from the book How Tomcat Works, a tutorial on the internal workings of Tomcat. If you have not done so, please read the previous article first; it gives you some useful background information. In this article, you'll learn how to build two servlet containers. The applications accompanying this article can be downloaded. If you are interested, other parts of the book are available for download for a limited period of time from the author's web site.

This article explains how a simple servlet container works. There are two servlet container applications presented; the first one is made as simple as possible and the second is a refinement of the first. The sole reason I do not try to make the first container perfect is to keep it simple. More sophisticated servlet containers, including Tomcat 4 and 5, are discussed in other chapters of How Tomcat Works.

Both servlet containers can process simple servlets, as well as static resources. You can use PrimitiveServlet, located in the webroot/ directory, to test these containers. More complex servlets are beyond the capability of these containers, but you can learn how to build more sophisticated servlet containers in the How Tomcat Works book.

The classes for both applications are part of the ex02.pyrmont package. To understand how the applications work, you need to be familiar with the javax.servlet.Servlet interface. To refresh your memory, the first section of this article discusses this interface. After that, you'll learn what a servlet container has to do to serve a servlet.

The javax.servlet.Servlet Interface

Servlet programming is made possible through the classes and interfaces of two packages: javax.servlet and javax.servlet.http. Of these classes and interfaces, the javax.servlet.Servlet interface is the most important interface. All servlets must implement this interface or extend a class that does.

The Servlet interface has five methods, whose signatures are as follows:

· public void init(ServletConfig config) throws ServletException

· public void service(ServletRequest request, ServletResponse response) throws ServletException, java.io.IOException

· public void destroy()

· public ServletConfig getServletConfig()

· public java.lang.String getServletInfo()

The init, service, and destroy methods are the servlet's lifecycle methods. The servlet container calls init once, after it has instantiated the servlet class, to indicate to the servlet that it is being placed into service. The init method must complete successfully before the servlet can receive any requests. A servlet programmer can override this method to write initialization code that needs to run only once, such as loading a database driver or initializing values. Otherwise, the method body is normally left empty.

The service method is then called by the servlet container to allow the servlet to respond to a request. The servlet container passes a javax.servlet.ServletRequest object and a javax.servlet.ServletResponse object. The ServletRequest object contains the client's HTTP request information and the ServletResponse encapsulates the servlet's response. These two objects enable you to write custom code that determines how the servlet services the client request.

The servlet container calls the destroy method before removing a servlet instance from service. This normally happens when the servlet container is shut down or when the servlet container needs some free memory. This method is called only after all threads within the servlet's service method have exited or after a timeout period has passed. After the servlet container calls destroy, it will not call the service method again on this servlet. The destroy method gives the servlet an opportunity to clean up any resources that are being held (for example, memory, file handles, and threads) and make sure that any persistent state is synchronized with the servlet's current state in memory.

Listing 2.1 contains the code for a servlet named PrimitiveServlet, a very simple servlet that you can use to test the servlet container applications in this article. The PrimitiveServlet class implements javax.servlet.Servlet (as all servlets must) and provides implementations for all five servlet methods. What it does is very simple: each time any of the init, service, or destroy methods is called, the servlet writes the method's name to the console. The code in the service method also obtains the java.io.PrintWriter object from the ServletResponse object and sends strings to the browser.

Listing 2.1. PrimitiveServlet.java

import javax.servlet.*;
import java.io.IOException;
import java.io.PrintWriter;

public class PrimitiveServlet implements Servlet {

    public void init(ServletConfig config) throws ServletException {
        System.out.println("init");
    }

    public void service(ServletRequest request, ServletResponse response)
        throws ServletException, IOException {
        System.out.println("from service");
        PrintWriter out = response.getWriter();
        out.println("Hello. Roses are red.");
        out.print("Violets are blue.");
    }

    public void destroy() {
        System.out.println("destroy");
    }

    public String getServletInfo() {
        return null;
    }

    public ServletConfig getServletConfig() {
        return null;
    }
}

Application 1

Now, let's look at servlet programming from a servlet container's perspective. In a nutshell, a fully functional servlet container does the following for each HTTP request for a servlet:

  • When the servlet is called for the first time, load the servlet class and call its init method (once only).
  • For each request, construct an instance of javax.servlet.ServletRequest and an instance of javax.servlet.ServletResponse.
  • Invoke the servlet's service method, passing the ServletRequest and ServletResponse objects.
  • When the servlet class is shut down, call the servlet's destroy method and unload the servlet class.
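The four steps above can be sketched without servlet.jar. The following is only an illustration under stated assumptions: MiniServlet is a hypothetical stand-in for javax.servlet.Servlet, and LifecycleSketch, handle, shutdown, and recording are invented names, not part of the applications accompanying this article.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for javax.servlet.Servlet, so the sketch compiles on its own.
interface MiniServlet {
    void init();
    void service(String request, StringBuilder response);
    void destroy();
}

// Hypothetical container core: load on first use, init once,
// service on every request, destroy at shutdown.
class LifecycleSketch {
    private final Map<String, MiniServlet> loaded = new HashMap<>();
    final List<String> log = new ArrayList<>();

    String handle(String name, Supplier<MiniServlet> factory, String request) {
        MiniServlet servlet = loaded.get(name);
        if (servlet == null) {
            // First request for this servlet: create it and call init exactly once.
            servlet = factory.get();
            servlet.init();
            loaded.put(name, servlet);
        }
        // Every request gets a fresh response object.
        StringBuilder response = new StringBuilder();
        servlet.service(request, response);
        return response.toString();
    }

    void shutdown() {
        for (MiniServlet servlet : loaded.values()) {
            servlet.destroy();
        }
        loaded.clear();
    }

    // A servlet that records each lifecycle call in the log above.
    MiniServlet recording() {
        return new MiniServlet() {
            public void init() { log.add("init"); }
            public void service(String request, StringBuilder response) {
                log.add("service");
                response.append("Hello from ").append(request);
            }
            public void destroy() { log.add("destroy"); }
        };
    }
}
```

Calling handle twice for the same name and then shutdown leaves the log with one init, two service entries, and one destroy, which is the ordering the list describes.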

A production servlet container does much more than this. The first container in this article does less: it is not fully functional, so it can only run very simple servlets, and it does not call the servlet's init and destroy methods. Instead, it does the following:

  • Wait for an HTTP request.
  • Construct a ServletRequest object and a ServletResponse object.
  • If the request is for a static resource, invoke the process method of the StaticResourceProcessor instance, passing the ServletRequest and ServletResponse objects.
  • If the request is for a servlet, load the servlet class and invoke its service method, passing the ServletRequest and ServletResponse objects. Note that in this servlet container, the servlet class is loaded every time the servlet is requested.

In the first application, the servlet container consists of six classes:

  • HttpServer1
  • Request
  • Response
  • StaticResourceProcessor
  • ServletProcessor1
  • Constants

Just like the application in the previous article, the entry point of this application (the static main method) is in the server class, here HttpServer1. The main method creates an instance of HttpServer1 and calls its await method. As the name implies, await waits for HTTP requests, creates a Request object and a Response object for each one, and dispatches to either a StaticResourceProcessor instance or a ServletProcessor1 instance, depending on whether the request is for a static resource or a servlet.

The Constants class contains the static final WEB_ROOT that is referenced from other classes. WEB_ROOT indicates the location of PrimitiveServlet and the static resources this container can serve.
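A constants holder of this kind can be very small. The sketch below is an assumption based on the description later in this article that WEB_ROOT points to the webroot/ directory under the working directory; the class name ConstantsSketch is hypothetical, and the real Constants class may differ.

```java
import java.io.File;

// Hypothetical constants holder. The value is an assumption: the
// "webroot" directory under the JVM's working directory (user.dir).
final class ConstantsSketch {
    static final String WEB_ROOT =
        System.getProperty("user.dir") + File.separator + "webroot";

    // No instances are needed; the class only carries constants.
    private ConstantsSketch() {}
}
```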

The HttpServer1 instance keeps waiting for HTTP requests until it receives a shutdown command. Issue the shutdown command the same way you did in the previous article.

Each of the classes in the application is discussed in the following sections.



The HttpServer1 Class

The HttpServer1 class in this application is similar to the HttpServer class in the simple web server application in the previous article. However, in this application, the HttpServer1 can serve both static resources and servlets. To request a static resource, use a URL in the following format:

http://machineName:port/staticResource

This is exactly how you requested a static resource in the web server application in the previous article. To request a servlet, you use the following URL:

http://machineName:port/servlet/servletClass

Therefore, if you are using a browser locally to request a servlet called PrimitiveServlet, enter the following URL in the browser's address or URL box:

http://localhost:8080/servlet/PrimitiveServlet

The class' await method, given in Listing 2.2, waits for HTTP requests until a shutdown command is issued. It is similar to the await method discussed in the previous article.

Listing 2.2. The HttpServer1 class' await method

public void await() {
    ServerSocket serverSocket = null;
    int port = 8080;
    try {
        serverSocket = new ServerSocket(port, 1,
            InetAddress.getByName("127.0.0.1"));
    }
    catch (IOException e) {
        e.printStackTrace();
        System.exit(1);
    }

    // Loop waiting for a request
    while (!shutdown) {
        Socket socket = null;
        InputStream input = null;
        OutputStream output = null;
        try {
            socket = serverSocket.accept();
            input = socket.getInputStream();
            output = socket.getOutputStream();

            // create Request object and parse
            Request request = new Request(input);
            request.parse();

            // create Response object
            Response response = new Response(output);
            response.setRequest(request);

            // check if this is a request for a servlet or a static resource
            // a request for a servlet begins with "/servlet/"
            if (request.getUri().startsWith("/servlet/")) {
                ServletProcessor1 processor = new ServletProcessor1();
                processor.process(request, response);
            }
            else {
                StaticResourceProcessor processor =
                    new StaticResourceProcessor();
                processor.process(request, response);
            }

            // Close the socket
            socket.close();

            // check if the previous URI is a shutdown command
            shutdown = request.getUri().equals(SHUTDOWN_COMMAND);
        }
        catch (Exception e) {
            e.printStackTrace();
            System.exit(1);
        }
    }
}

The difference between the await method in Listing 2.2 and the one in the previous article is that in Listing 2.2, the request can be dispatched to either a StaticResourceProcessor or a ServletProcessor1. The request is forwarded to the latter if the URI begins with the string "/servlet/". Otherwise, the request is passed to the StaticResourceProcessor instance.
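Listing 2.2 tests the start of the URI with startsWith, so the position of "/servlet/" matters. A minimal sketch of that decision follows; DispatchSketch and choose are hypothetical names, and the method returns a label instead of invoking a processor.

```java
class DispatchSketch {
    static final String SERVLET_PREFIX = "/servlet/";

    // Returns which kind of processor should handle the URI.
    // Only a URI that begins with the prefix is treated as a servlet request.
    static String choose(String uri) {
        return uri.startsWith(SERVLET_PREFIX) ? "servlet" : "static";
    }
}
```

With this check, /servlet/PrimitiveServlet goes to the servlet processor, while /index.html and a path such as /docs/servlet/readme.html are treated as static resources.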

The Request Class

A servlet's service method accepts a javax.servlet.ServletRequest instance and a javax.servlet.ServletResponse instance from the servlet container. The container therefore must construct a ServletRequest object and a ServletResponse object to pass to the service method of the servlet being served.

The ex02.pyrmont.Request class represents a request object to pass to the service method. As such, it must implement the javax.servlet.ServletRequest interface. This class has to provide implementations for all methods in the interface. However, we'd like to make it very simple and implement only a few of the methods. To compile the Request class, you'll need to provide blank implementations for those methods. If you look at the Request class, you can see that all methods whose signatures return an object instance return a null, such as the following:

...

public Object getAttribute(String attribute) {
    return null;
}

public Enumeration getAttributeNames() {
    return null;
}

public String getRealPath(String path) {
    return null;
}

...

In addition, the Request class still has the parse and the getUri methods, which were discussed in the previous article.

The Response Class

The Response class implements javax.servlet.ServletResponse. As such, the class must provide implementations for all of the methods in the interface. Similar to the Request class, I leave the implementations of all methods "blank," except the getWriter method.

public PrintWriter getWriter() {
    // autoflush is true, println() will flush,
    // but print() will not.
    writer = new PrintWriter(output, true);
    return writer;
}

The second argument to the PrintWriter class' constructor is a boolean indicating whether or not autoflush is enabled. Passing true as the second argument makes any call to a println method flush the output. However, a print call does not flush the output. Therefore, if a call to a print method happens to be the last line in a servlet's service method, that output stays in the writer's buffer and is not sent to the browser. This imperfection will be fixed in the later applications.
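The flushing behavior can be observed without a browser by writing to an in-memory stream. This is a small sketch, not part of the article's applications; AutoFlushSketch and observe are hypothetical names, and a ByteArrayOutputStream stands in for the socket's output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

class AutoFlushSketch {
    // Returns the number of bytes that reached the underlying stream
    // after println, after print, and after an explicit flush.
    static int[] observe() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(sink, true); // autoflush enabled

        writer.println("Hello. Roses are red."); // flushed because of autoflush
        int afterPrintln = sink.size();

        writer.print("Violets are blue.");       // buffered, not flushed
        int afterPrint = sink.size();

        writer.flush();                          // now the second string arrives
        int afterFlush = sink.size();

        return new int[] { afterPrintln, afterPrint, afterFlush };
    }
}
```

The second count equals the first, and only the third is larger, which matches the symptom described above: the text written with print never reaches the client unless something flushes the writer.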

The Response class still has the sendStaticResource method discussed in the previous article.

The StaticResourceProcessor Class

The StaticResourceProcessor class is used to serve requests for static resources. Its only method is process, as shown in Listing 2.3.

Listing 2.3. The StaticResourceProcessor class' process method

public void process(Request request, Response response) {
    try {
        response.sendStaticResource();
    }
    catch (IOException e) {
        e.printStackTrace();
    }
}

The process method receives two arguments: a Request instance and a Response instance. It simply calls the sendStaticResource method of the Response class.



The ServletProcessor1 Class

The ServletProcessor1 class processes HTTP requests for servlets. It is surprisingly simple, consisting only of the process method. This method accepts two arguments: a Request instance and a Response instance, which implement javax.servlet.ServletRequest and javax.servlet.ServletResponse respectively. The process method constructs a java.net.URLClassLoader object and uses it to load the servlet class file. Upon obtaining a Class object from the class loader, the process method creates an instance of the servlet and calls its service method.

The process method is given in Listing 2.4.

Listing 2.4. The ServletProcessor1 class' process method

public void process(Request request, Response response) {
    String uri = request.getUri();
    String servletName = uri.substring(uri.lastIndexOf("/") + 1);
    URLClassLoader loader = null;

    try {
        // create a URLClassLoader
        URLStreamHandler streamHandler = null;
        URL[] urls = new URL[1];
        File classPath = new File(Constants.WEB_ROOT);
        String repository = (new URL("file", null,
            classPath.getCanonicalPath() + File.separator)).toString();
        urls[0] = new URL(null, repository, streamHandler);
        loader = new URLClassLoader(urls);
    }
    catch (IOException e) {
        System.out.println(e.toString());
    }

    Class myClass = null;
    try {
        myClass = loader.loadClass(servletName);
    }
    catch (ClassNotFoundException e) {
        System.out.println(e.toString());
    }

    Servlet servlet = null;
    try {
        servlet = (Servlet) myClass.newInstance();
        servlet.service((ServletRequest) request, (ServletResponse) response);
    }
    catch (Exception e) {
        System.out.println(e.toString());
    }
    catch (Throwable e) {
        System.out.println(e.toString());
    }
}

The process method accepts two arguments: a Request instance and a Response instance. From the Request, it obtains the URI by calling the getUri method:

String uri = request.getUri();

Remember that the URI is in the following format:

/servlet/servletName

where servletName is the name of the servlet class.

To load the servlet class, we need to know the servlet name from the URI, which we do using the next line of the process method:

String servletName = uri.substring(uri.lastIndexOf("/") + 1);
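To see what this expression yields for different inputs, here is the same substring logic wrapped in a small helper. ServletNameSketch and servletName are hypothetical names used only for this illustration.

```java
class ServletNameSketch {
    // Everything after the last "/" is taken as the servlet class name.
    static String servletName(String uri) {
        return uri.substring(uri.lastIndexOf("/") + 1);
    }
}
```

For /servlet/PrimitiveServlet the helper returns PrimitiveServlet. It does no validation: /servlet/ yields an empty string, and if the URI carried a query string, as in /servlet/PrimitiveServlet?x=1, the result would be PrimitiveServlet?x=1, which is not a loadable class name. A more complete container would need to handle such cases.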

Next, the process method loads the servlet. To do this, you need to create a class loader and tell this class loader the class' location. This servlet container directs the class loader to look in the directory pointed to by Constants.WEB_ROOT. WEB_ROOT points to the webroot/ directory under the working directory.

To load a servlet, use the java.net.URLClassLoader class, which is an indirect child class of java.lang.ClassLoader. Once you have an instance of the URLClassLoader class, use its loadClass method to load a servlet class. Instantiating the URLClassLoader class is straightforward. This class has several constructors, the simplest one being:

public URLClassLoader(URL[] urls);

where urls is an array of java.net.URL objects pointing to the locations on which searches will be conducted when loading a class. Any URL that ends with a / is assumed to refer to a directory. Otherwise, the URL is assumed to refer to a .jar file, which will be downloaded and opened as needed.
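The trailing-slash rule can be checked with a throwaway directory. The sketch below is an illustration under assumptions: RepositorySketch and findsResourceInDirectory are hypothetical names, the URL is built with File.toURI().toURL() instead of the constructor used in this article, and the loader is asked for a plain file with getResource so that no compiled servlet class is needed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class RepositorySketch {
    // Creates a temporary directory, uses it as the only repository of a
    // URLClassLoader, and checks that the loader can find a file inside it.
    static boolean findsResourceInDirectory() {
        try {
            Path dir = Files.createTempDirectory("webroot");
            Path file = dir.resolve("hello.txt");
            Files.write(file, "hi".getBytes(StandardCharsets.UTF_8));

            // For an existing directory, File.toURI() produces a URI ending in "/".
            URL repository = dir.toFile().toURI().toURL();

            try (URLClassLoader loader = new URLClassLoader(new URL[] { repository })) {
                return repository.toString().endsWith("/")
                    && loader.getResource("hello.txt") != null;
            } finally {
                // Clean up the temporary file and directory.
                Files.deleteIfExists(file);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the URL ends with a slash, the loader treats it as a directory and searches it directly, which is the same way the container's loader searches webroot/ for servlet classes.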

In a servlet container, the location where a class loader can find servlet classes is called a repository.

In our application, there is only one location in which the class loader must look — the webroot/ directory under the working directory. Therefore, we start by creating an array of a single URL. The URL class provides several constructors, so there are many ways to construct a URL object. For this application, I used the same constructor used in another class in Tomcat. The constructor has the following signature:

public URL(URL context, String spec, URLStreamHandler handler)
    throws MalformedURLException

You can use this constructor by passing a specification for the second argument and null for both the first and the third arguments. However, there is another constructor that accepts three arguments:

public URL(String protocol, String host, String file)
    throws MalformedURLException

Thus, the compiler will not know which constructor you mean if you write the following:

new URL(null, aString, null);

You can get around this by telling the compiler the type of the third argument, like this:

URLStreamHandler streamHandler = null;
new URL(null, aString, streamHandler);
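A cast on the null literal resolves the ambiguity just as well as a typed variable. The following sketch shows that variant; TypedNullSketch and repositoryUrl are hypothetical names. Note that recent JDK releases mark the URL constructors as deprecated, so the compiler may print a deprecation note for this code.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLStreamHandler;

class TypedNullSketch {
    static URL repositoryUrl(String spec) {
        try {
            // new URL(null, spec, null) would not compile: both three-argument
            // constructors accept null in the first and third positions.
            // Casting the third argument selects the
            // (URL, String, URLStreamHandler) constructor.
            return new URL(null, spec, (URLStreamHandler) null);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Passing a spec such as file:/tmp/webroot/ (an example path, not one used by the applications) produces a URL whose protocol is file and whose path is /tmp/webroot/.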

For the second argument, pass the String containing the repository (the directory where servlet classes can be found). Create it with this code:

String repository = (new URL("file", null,
    classPath.getCanonicalPath() + File.separator)).toString();

Combining everything, here is the part of the process method that constructs the right URLClassLoader instance:

// create a URLClassLoader
URLStreamHandler streamHandler = null;
URL[] urls = new URL[1];
File classPath = new File(Constants.WEB_ROOT);
String repository = (new URL("file", null,
    classPath.getCanonicalPath() + File.separator)).toString();
urls[0] = new URL(null, repository, streamHandler);
loader = new URLClassLoader(urls);

The code that forms a repository comes from the createClassLoader method in org.apache.catalina.startup.ClassLoaderFactory, and the code for forming the URL is taken from the addRepository method in the org.apache.catalina.loader.StandardClassLoader class. However, you don't have to worry about these classes at this stage.

Having a class loader, you can load a servlet class using the loadClass method:

Class myClass = null;
try {
    myClass = loader.loadClass(servletName);
}
catch (ClassNotFoundException e) {
    System.out.println(e.toString());
}

Next, the process method creates an instance of the servlet class loaded, downcasts it to javax.servlet.Servlet, and invokes the servlet's service method:

Servlet servlet = null;
try {
    servlet = (Servlet) myClass.newInstance();
    servlet.service((ServletRequest) request, (ServletResponse) response);
}
catch (Exception e) {
    System.out.println(e.toString());
}
catch (Throwable e) {
    System.out.println(e.toString());
}
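The load, instantiate, and cast sequence works the same way for any class with a no-argument constructor. The sketch below uses java.util.List in place of Servlet so that it runs without servlet.jar; InstantiateSketch and newList are hypothetical names. It also uses getDeclaredConstructor().newInstance(), because Class.newInstance has been deprecated since Java 9.

```java
import java.util.List;

class InstantiateSketch {
    // Loads a class by name, creates an instance through its no-argument
    // constructor, and casts the instance to the interface the caller expects.
    static List<?> newList(String className) {
        try {
            Class<?> loaded = Class.forName(className);
            Object instance = loaded.getDeclaredConstructor().newInstance();
            // Throws ClassCastException if the class does not implement List,
            // just as the cast to Servlet fails for a class that is not a servlet.
            return (List<?>) instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }
}
```

Calling newList with java.util.ArrayList returns an empty list, while java.lang.String can be instantiated but fails the cast. A container faces the same situation when the requested class exists but does not implement Servlet.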

Compiling and Running the Application

To compile the application, type the following from the working directory:

javac -d . -classpath ./lib/servlet.jar src/ex02/pyrmont/*.java

To run the application on Windows, type the following command from the working directory:

java -classpath ./lib/servlet.jar;./ ex02.pyrmont.HttpServer1

On Linux, use a colon to separate the class path entries:

java -classpath ./lib/servlet.jar:./ ex02.pyrmont.HttpServer1

To test the application, type the following in the URL or address box of your browser:

http://localhost:8080/index.html

or

http://localhost:8080/servlet/PrimitiveServlet

You will see the following text in your browser:

Hello. Roses are red.

Note that you can't see the second string (Violets are blue) because only the first string is flushed to the browser. The applications accompanying the later chapters of the How Tomcat Works book show you how to fix this problem.



Application 2

There is a serious problem in the first application. In the ServletProcessor1 class' process method, we upcast the instance of ex02.pyrmont.Request to javax.servlet.ServletRequest, passing it as the first argument to the servlet's service method. We also upcast the instance of ex02.pyrmont.Response to javax.servlet.ServletResponse and pass it as the second argument to the servlet's service method.

try {
    servlet = (Servlet) myClass.newInstance();
    servlet.service((ServletRequest) request, (ServletResponse) response);
}

This compromises security. Servlet programmers who know the internal workings of our servlet container can downcast the ServletRequest and ServletResponse instances back to Request and Response and call their public methods. Having a Request instance, they can call its parse method. Having a Response instance, they can call its sendStaticResource method.

You cannot make the parse and sendStaticResource methods private, because they will be called from other classes in the ex02.pyrmont package. However, these two methods are not supposed to be available from inside of a servlet. One solution is to give both Request and Response classes a default access modifier, so that they cannot be used from outside the ex02.pyrmont package. However, there is a more elegant solution: using facade classes.

In this second application, we add two facade classes: RequestFacade and ResponseFacade. The RequestFacade class implements the ServletRequest interface and is instantiated by passing a Request instance, which it assigns to a ServletRequest object reference in its constructor. The implementation of each method in the ServletRequest interface invokes the corresponding method of the Request object, but the ServletRequest reference itself is private and cannot be accessed from outside the class. Instead of upcasting the Request object to ServletRequest and passing it to the service method, we construct a RequestFacade object and pass it to the service method. The servlet programmer can still downcast the ServletRequest instance back to RequestFacade; however, they can only access the methods available in the ServletRequest interface. Now, the parse method is safe, and ResponseFacade protects sendStaticResource in the same way.

Listing 2.5 shows the incomplete RequestFacade class.

Listing 2.5. The RequestFacade class

package ex02.pyrmont;

import java.util.Enumeration;
import javax.servlet.ServletRequest;

public class RequestFacade implements ServletRequest {

    private ServletRequest request = null;

    public RequestFacade(Request request) {
        this.request = request;
    }

    /* implementation of the ServletRequest */
    public Object getAttribute(String attribute) {
        return request.getAttribute(attribute);
    }

    public Enumeration getAttributeNames() {
        return request.getAttributeNames();
    }

    ...
}

Notice the constructor of RequestFacade. It accepts a Request object but immediately assigns it to the private request field, which is declared as ServletRequest. Notice also that each method in the RequestFacade class invokes the corresponding method of the wrapped ServletRequest object.

The same applies to the ResponseFacade class.
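The effect of the facade can be demonstrated with a few stand-in types. Everything in the sketch below is hypothetical and simplified: RequestView plays the role of ServletRequest, InternalRequest plays the role of Request, RequestViewFacade plays the role of RequestFacade, and canReachParse simulates servlet code that tries to get at the container's internals.

```java
// Hypothetical stand-in for javax.servlet.ServletRequest.
interface RequestView {
    String getUri();
}

// Stand-in for the container's Request: it exposes a public method
// (parse) that servlet code is not supposed to call.
class InternalRequest implements RequestView {
    private final String uri;
    boolean parsed = false;

    InternalRequest(String uri) { this.uri = uri; }
    public String getUri() { return uri; }
    public void parse() { parsed = true; }
}

// Stand-in for RequestFacade: forwards interface methods only and keeps
// the wrapped object in a private field.
class RequestViewFacade implements RequestView {
    private final RequestView request;

    RequestViewFacade(InternalRequest request) { this.request = request; }
    public String getUri() { return request.getUri(); }
}

class FacadeSketch {
    // Simulates servlet code that attempts to downcast what it was given.
    static boolean canReachParse(RequestView view) {
        if (view instanceof InternalRequest) {
            ((InternalRequest) view).parse();
            return true;
        }
        return false;
    }
}
```

When the servlet receives the InternalRequest directly, the downcast succeeds and parse is reachable. When it receives the facade, the downcast is impossible, yet getUri still returns the wrapped object's value.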

Here are the classes used in Application 2:

  • HttpServer2
  • Request
  • Response
  • RequestFacade
  • ResponseFacade
  • StaticResourceProcessor
  • ServletProcessor2
  • Constants

The HttpServer2 class is similar to HttpServer1, except that it uses ServletProcessor2 in its await method, instead of ServletProcessor1:

if (request.getUri().startsWith("/servlet/")) {
    ServletProcessor2 processor = new ServletProcessor2();
    processor.process(request, response);
}
else {
    ...
}

The ServletProcessor2 class is similar to ServletProcessor1, except in the following part of the process method:

Servlet servlet = null;
RequestFacade requestFacade = new RequestFacade(request);
ResponseFacade responseFacade = new ResponseFacade(response);
try {
    servlet = (Servlet) myClass.newInstance();
    servlet.service((ServletRequest) requestFacade,
        (ServletResponse) responseFacade);
}

Compiling and Running the Application

To compile the application, type the following from the working directory.

javac -d . -classpath ./lib/servlet.jar src/ex02/pyrmont/*.java

To run the application on Windows, type the following command from the working directory:

java -classpath ./lib/servlet.jar;./ ex02.pyrmont.HttpServer2

On Linux, use a colon to separate the class path entries:

java -classpath ./lib/servlet.jar:./ ex02.pyrmont.HttpServer2

You can use the same URLs as in Application 1 and should see the same result.

Summary

This article discussed a simple servlet container that can be used to serve static resources, as well as to process servlets as simple as PrimitiveServlet. It also provided background information on the javax.servlet.Servlet interface.
