Sunday, October 13, 2013

Finding and Fixing Memory Leaks in Java Programs

Derek Rayside, Lucy Mendel & Daniel Jackson

Why are memory leaks a problem?
Memory leaks can cause programs to slow down or crash, and can also limit the size of the input that the program can process in a given memory space.
Memory leaks are notoriously hard to find and fix, because it is difficult to discern meaningful patterns in the program's heap, and because it is difficult to correlate memory consumption to lines of source code. We use an ownership inference to tackle the first problem, and an interactive hierarchical matrix browser to tackle the second. We call our technique object ownership profiling.
What is a memory leak?
A memory leak is storage is allocated but that the program no longer needs. The most common mechanical criteria used to determine if a program still needs some bit of storage is whether that memory can be reached from the current call stack or global variables — that is, whether that memory is root reachable. Storage that is allocated but not root reachable is clearly a leak: there is no possible way for the program to use that storage again. Automatic storage reclamation, or garbage collection, releases storage that is allocated but not root reachable, and thereby precludes these kinds of leaks.
Programs written in garbage collected languages, such as Java, can still have memory leaks: storage that is root reachable, but no longer needed by the program. One way to charaterise this kind of leak is storage that is no longer observably reachable: ie, if this storage were released, it would have no observable effect on the program, even though it is technically root reachable.
Figure 1 shows a Venn diagram of the memory reachability hierarchy we have just described. There are two kinds of leaks illustrated by this diagram: storage that is allocated but not root reachable, and storage that is root reachable but not observably reachable. In this work we focus on the second kind of leak.
storage reachability hierarchy
Figure 1: memory reachability hierarchy
Finding Leaks
There are three phases in our leak-finding analysis: identify potentially leaked objects, infer a hierarchy of objects, and rank nodes in the hierarchy for programmer inspection.
We identify potentially leaked objects by instrumenting the program to record every time the program uses that object, and when the garbage collector releases the object. During this "drag" time the object is a potential leak. Observing an object in this way under-approximates whether it is observably reachable, because if the program had happened to take another branch, it might have used that object. However, this under-approximation is useful in diagnosing leaks.
A program execution may contain millions of objects, so simply knowing which objects are potential leaks and for how long is not enough: the analysis must also provide a way to structure and rank objects. We use ownership inference to infer a hierarchical structure on the objects in a program trace. Potential leaks are aggregated up this hierarchy — similarly to how file sizes are aggregated up the directory structure when profiling disk usage. In the third place, our technique ranks ownership nodes in two ways: according to the integral of leaked space over time, and whether new incoming pointers were created to them after they stopped being active ("zombie references").
Fixing Leaks
Fixing leaks can be difficult. Even if the programmer knows what objects are leaked, determining how to change the source code to prevent that leak in future is often non-obvious. We use a hierarchical matrix browser for the programmer to explore the interactions between the potential leak and the other objects in the program execution. These interactions include all method calls (which our dynamic analysis also records), in addition to field operations. This hierarchical matrix browser is illustrated below in Figure 3.
Case Study: Finding and Fixing a Leak in an IDE
Here we present a memory leak that our technique found in the Alloy Analyzer (version 3). The Alloy Analyzer is an IDE for the Alloy language; it's main components are a translator from Alloy to SAT, an external SAT solver, and a visualizer. The Alloy Analyzer is written in Java, with the exception of the external SAT solver, which is third party code written in C. We are looking for memory leaks in the Java code.
Figure 2a shows a Java heap profile of the Alloy Analyzer v3 on a workload of two translations, before the memory leak was fixed. Figure 2b shows the profile on the same workload after the leak was fixed. Notice that the memory allocated for the first translation is now released before the second translation begins, and consequently the peak memory requirement is reduced from 45MB to 30MB. These heap profiles were captured with Sun's JConsole tool that ships with JDK6. These plots verify, independently of our tools, that we did indeed fix leaks in the Alloy Analyzer v3.
before profileafter profile
Figure 2a: profile before fixing leaks (peak 45MB)Figure 2b: profile after fixing leaks (peak 30MB)
Figure 2: Java heap profiles of Alloy Analyzer v3, captured with Sun's JConsole tool
As an example, we will explain how one of these leaks was found and fixed with our tools. (Discussion of the other leaks is elided here for space considerations.)
Figure 3 shows a subset of the ownership profile captured by our tool. Each plot in Figure 3 represents the memory profile of an ownership node, and the arrows between nodes indicate the ownership hierarchy. The red (top) lines on the plots represent allocated objects, and the green (bottom) lines on the plots represent active objects (ie, objects that were observed to have been reached). A gap between the two lines indicates objects that were allocated but not observably reachable — a potential leak.
In Figure 3 we see, on the far right, that there are some large int[] and char[] arrays that are leaking: they are kept around long after the program has stopped using them. Without further information it would be difficult to fix these leaks: we'd have to look for all of the int and char arrays in the source code, and try to guess which ones were large and leaky. However, the ownership inference analysis further tells us that these leaking arrays are owned by an ASCII_Charstream object. Should we fix these leaks by modifying the code of ASCII_Charstream? No: the analysis tells us that the real problem is in the Specification class, which has azombie reference to an AlloyParser that in turn owns the ASCII_Charstream. A zombie reference is a pointer to an object that is no longer observably reachable. The zombie reference means that the leaked object is still root reachable, so the garbage collector does not collect it.
Figure 3: ownership profile for Alloy Analyzer v3 (with leak)
Now that we know where a potential leak is, we use the hierarchical matrix browser to show the interactions between the (potentially) leaked objects and the rest of the objects, as pictured in Figure 4. First we see that the leaked AlloyParser objects interact only with the Specification objects: they do not interact with anything else, which means our changes can be localized to the Specification class (or one of its superclasses). Mousing over the dots in Figure 4 (when using the interactive tool, not on this web page) reveals that it is the Specification.parser field that is holding on to the AlloyParser object. A simple textual search reveals the few statements mentioning this field that need to be examined and changed.
interactive hierarchical matrix
browser
Figure 4: interactive hierarchical matrix browser
This example illustrates the power and utility of our analyses: only a few lines of code needed to be changed to fix the leak, but those lines were nowhere near the observable symptom (leaking int and char arrays).
Conclusion
Object ownership profiling is an effective and intuitive technique for finding and fixing memory leaks in Java programs: it both identifies the leaking objects, and places in the source code that may need to be modified to fix the leak — places that may be far away from, and only indirectly related to, the leaked objects. We are conducting further empirical research on object ownership profiling, including investigating different bases for determining object ownership. We are also implementing engineering enhancements and exploring improvements to the user interface.
Acknowledgements
This research was supported, in part, by grants 0086154 (Design Conformant Software) and 6895566 (Safety Mechanisms for Medical Software) from the ITR program of the National Science Foundation.
References:
Derek Rayside, Lucy Mendel, and Daniel Jackson. A Dynamic Analysis for Revealing Object Ownership and Sharing. Fourth International Workshop on Dynamic Analysis (WODA 2006), associated with the International Conference on Software Engineering (ICSE'06). Shanghai, China. May, 2006. Best paper award (as voted by workshop participants). (PDF)
Derek Rayside, Lucy Mendel, Robert Seater, and Daniel Jackson. An Analysis and Visualization for Revealing Object Sharing. Eclipse Technology Exchange (ETX) Workshop at OOPSLA'05. (PDF)
David G. Clarke, John M. Potter, and James Noble. Ownership types for flexible alias protection. OOPSLA'98.
James Noble, Jan Vitek, and John M. Potter. Flexible Alias Protection. ECOOP'98.

No comments: