Decompilation Dava
Program-Transformation.Org: The Program Transformation Wiki
McGill's "Dava" Java Decompiler
The Sable group at McGill University, under the leadership of Professor Laurie Hendren, are working on a framework called Soot for analysing and optimising Java bytecode. An offshoot of this work, though not described directly on the Soot page, is the Dava decompiler. This is mostly the work of Jerome Miecznikowski, whose master's thesis is available.
A paper published at the Decompilation Workshop at WCRE 2001 describes the restructuring algorithm used. See also their tech report, Decompiling Java Bytecode: Problems, Traps and Pitfalls. Also of interest to decompilation researchers is their paper on types: Efficient Inference of Static Types for Java Bytecode.
You can get a working version (with a little effort) from the website. See DecompilationDavaTest for tests.
Making Dava
There is a version of Dava in the latest release of Soot, but it is undocumented and predates some enhancements mentioned in their paper. According to the author, the new version is ready; it just needs integration into a current version of Soot. This integrated version was supposed to appear in early 2003.
These instructions and tests relate to the old version.
Extremely brief installation instructions can be found in the file INSTALL on the web page of the main author, Jerome Miecznikowski, which was at http://www.sable.mcgill.ca/~jerome/public (still accessible here from archive.org). So to use Dava, you just install Soot and use some special command line options. You can download the full source version (5MB), or the version with just the classes directory (2MB). Note that the Jasmin needed is a modification of the standard Jasmin "assembler for Java", so make sure you use the one provided.
The main problem I had in getting it going was setting the CLASSPATH environment variable correctly. This worked for me: 1) the current directory (.), 2) the path to the rt.jar file (include rt.jar itself in this path), 3) the path to the Soot classes directory, and 4) the path to the jasmin classes directory.
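The four entries above can be joined with a small shell function. This is only a sketch: the function name dava_classpath and the example directories in the comments are hypothetical, so substitute the locations on your own machine.

```shell
#!/bin/sh
# Join the four CLASSPATH entries in the order listed above.
# All three arguments are placeholders for local paths (assumptions).
dava_classpath() {
    rt_jar=$1          # full path to rt.jar, including the file name
    soot_classes=$2    # Soot classes directory
    jasmin_classes=$3  # classes directory of the Jasmin supplied with Soot
    # The leading "." is the current directory.
    printf '%s\n' ".:${rt_jar}:${soot_classes}:${jasmin_classes}"
}

# Example use (hypothetical paths):
#   CLASSPATH=$(dava_classpath /usr/java/jre/lib/rt.jar "$HOME/soot/classes" "$HOME/jasmin/classes")
#   export CLASSPATH
```

Keeping the entries as separate arguments makes it easier to spot which one is wrong when Soot cannot find a class.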
To use Dava, after the CLASSPATH is set up as above, cd to the directory with the top level .class file to be decompiled, and run
% java soot.Main --dava foo
or
% java soot.Main --dava --app foo
as per the INSTALL instructions. Note: if you get warnings about phantom library classes, you probably need to get your CLASSPATH right. There is more about the rt.jar file in the Soot tutorial. Note: you should not need to unjar the rt.jar file.
I found it convenient to set up a shell program to set the classpath and add the soot.Main --dava part automatically. Mine looks like this (all on one line):
java -Xmx128M -cp .:/usr/java/jre/lib/rt.jar:
/home/44/emmerik/soot-1.2.4/soot/classes/:
/home/44/emmerik/soot-1.2.4/jasmin/classes:. soot.Main --dava "$@"
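Since warnings about phantom library classes usually point to a CLASSPATH problem, it can help to confirm that every entry exists before starting Soot. The function below is a minimal sketch of such a check; the name check_classpath is hypothetical and it is not part of Soot or Dava. It only tests that each entry is present on disk, not that it contains the right classes.

```shell
#!/bin/sh
# Report every entry of a colon-separated classpath that does not exist.
# Returns non-zero if at least one entry is missing.
check_classpath() {
    missing=0
    old_ifs=$IFS
    IFS=:                      # split the argument on colons
    for entry in $1; do
        if [ -n "$entry" ] && [ ! -e "$entry" ]; then
            printf 'missing classpath entry: %s\n' "$entry" >&2
            missing=1
        fi
    done
    IFS=$old_ifs               # restore normal word splitting
    return $missing
}

# Example use before running Dava:
#   check_classpath "$CLASSPATH" && java soot.Main --dava foo
```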
About Dava
As you can read in the various papers, in particular Decompiling Java Bytecode: Problems, Traps and Pitfalls, not all Java decompilers are created equal. In particular, four of the most popular ones completely failed when faced with optimised bytecode (Soot can be used as an optimiser, among its many other uses). Also, Dava is tested on bytecode generated from relatively exotic languages, such as Haskell, Eiffel, ML, Ada, and even Fortran. A good Java decompiler should be able to produce correct, compilable code for any verifiable bytecode, and they claim that Dava comes close to this (certainly a lot closer than the other decompilers they tested, including two commercial decompilers). However, the more recent open source decompiler JODE seems to fare better than the decompilers they tested, and a little better than Dava as well.
The main problems that good decompilers solve are:
- finding the types of local variables (including references to objects),
- creating "stack variables" as needed, and
- sorting out the control flow issues associated with gotos, breaks, and the like, so that these can be implemented in pure Java.
There are other problems, associated with exceptions, class literals, package and class resolution, etc., which they have solved along the way.
CategoryDecompilation