Reverse Engineering Compiler
Program-Transformation.Org: The Program Transformation Wiki
http://www.backerstreet.com/rec/rec.htm
REC, a Reverse Engineering Compiler by Giampiero Caprino (caprino@netcom.com), is a portable decompiler that accepts a variety of input binary formats and produces a C-like representation as output. Source processors supported at present include the Intel 386, Motorola 68000, MIPS R3000 and Motorola PowerPC; version 1.6 also supports SPARC. The binary file formats accepted include Unix ELF, Unix COFF, Windows PE, Linux and SunOS a.out, PlayStation PS-X, and raw binary data.
REC started as a workbench for testing compiler-related algorithms. Its author, Giampiero Caprino, has worked on the project on and off for several years and develops it on Linux. The REC source is not in the public domain, but binaries for several popular operating systems are freely downloadable from the website at http://www.backerstreet.com/rec/rec.htm.
Some of the features of REC include:
- Recognizes dynamically and statically linked functions like printf,
- Recovers control structure information correctly,
- Gets arguments to variable length routines such as printf correct,
- Uses the debugging information produced by the compiler's -g option, hence recovering correct names for all variables and routines,
- For each routine provides the name (if known), argument size, local variables size, and saved register size,
- Generates good code.
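To make the point about variable-length routines concrete: one way a decompiler could decide how many arguments a printf call takes is to inspect the format string when it is a known constant. The sketch below illustrates that idea only. It is not taken from REC, the function name count_printf_args is hypothetical, and positional specifiers such as %1$d are deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of REC: counts the arguments that a
   printf-style format string consumes after the format itself.
   Simplification: positional specifiers (%1$d) are not supported. */
size_t count_printf_args(const char *fmt)
{
    assert(fmt != NULL);

    size_t count = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')
            continue;               /* "%%" is a literal percent sign */

        /* Skip flags, width, precision and length modifiers.
           A '*' width or precision consumes one int argument. */
        while (*p != '\0' && strchr("-+ #0123456789.*hlLqjzt", *p) != NULL) {
            if (*p == '*')
                count++;
            p++;
        }
        if (*p == '\0')
            break;                  /* truncated specifier at end of string */
        count++;                    /* the conversion character itself */
    }
    return count;
}
```

On a stack-based calling convention, the count multiplied by the slot size gives a first estimate of how many pushed values belong to the call; wide arguments such as double would need extra care.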
As of version 1.4a, the user interface supports HTML and REC includes an HTTP proxy server. Users can therefore decompile programs remotely over the internet.
As of version 1.4a, minor bugs that could be easily fixed include:
- Easily confused by nulls on Solaris; these are rarely used on Linux
- Return types are not correct; a value is returned in eax even for void routines
- Does not handle global static arrays of initialized strings
- A pattern for gcc's signed divide by 2 is needed to avoid emitting an expression that includes an arithmetic right shift by 31 bits followed by a logical right shift of 31 bits
- Arrays are left as expressions at present; they are not processed into array notation
- Uses globals for register variables instead of allocating a local variable
- Registers are still available on the output; in many cases they can be removed
- Gets confused by call $+8 instructions, which are common in position independent code
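The signed divide-by-2 item in the list above can be illustrated in C. The sketch below writes out the shift sequence literally, as a decompiler without the pattern might emit it, next to the expression the pattern should collapse to. The function names are hypothetical, and the code assumes that >> on a negative signed integer is an arithmetic shift, which is implementation-defined in C but holds for gcc on x86.

```c
#include <assert.h>
#include <stdint.h>

/* The shift sequence written out literally (hypothetical name). */
int32_t div2_shift_idiom(int32_t x)
{
    /* Assumption: >> on a negative int32_t is an arithmetic shift. */
    assert((INT32_C(-1) >> 1) == -1);

    int32_t sign = x >> 31;                /* arithmetic shift: 0 or -1 */
    uint32_t bias = (uint32_t)sign >> 31;  /* logical shift: 0 or 1 */
    return (x + (int32_t)bias) >> 1;       /* bias makes the result round toward zero */
}

/* What a recognised pattern should collapse the sequence to. */
int32_t div2_source(int32_t x)
{
    return x / 2;
}
```

The bias is needed because an arithmetic shift alone rounds toward negative infinity, while C's signed division rounds toward zero; adding 1 to negative inputs before the final shift reconciles the two.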
Overall, REC is a great attempt at a portable decompiler. It needs some data flow and type recovery analyses to produce better code. Development seems to have stopped in September 2000 with version 1.6. For some tests, see DecompilerRecTest.
Note also a serious problem: REC assumes that the stack pointer does not change between the prologue and epilogue of a procedure. This assumption does not hold for programs compiled with the "omit frame pointer" optimisation (or equivalent), including many Windows programs compiled with MSVC. For more details, see the paper Using a Decompiler for Real-World Source Recovery.
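To see why the stack pointer assumption matters, consider code without a frame pointer: locals are addressed relative to the stack pointer, so the same local has a different raw offset after every push or pop. The sketch below uses a deliberately simplified, hypothetical instruction model (not REC's internal representation, and assuming 4-byte pushes on a downward-growing stack) to show how tracking the stack pointer delta maps those raw offsets back to one stable location.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model for illustration only. */
enum op_kind { OP_PUSH, OP_POP, OP_SUB_SP, OP_ADD_SP, OP_LOAD_SP_REL };

struct insn {
    enum op_kind kind;
    int operand;  /* byte count for SUB_SP/ADD_SP, raw offset for LOAD_SP_REL */
};

/* Walks a straight-line instruction sequence while tracking how far the
   stack pointer has moved since procedure entry. For every sp-relative
   load, stores the offset relative to the entry-time stack pointer in
   out[] and returns the number of loads seen. */
size_t normalize_sp_offsets(const struct insn *code, size_t n, int *out)
{
    assert(n == 0 || (code != NULL && out != NULL));

    int delta = 0;  /* current sp minus sp at entry; negative as the stack grows */
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        switch (code[i].kind) {
        case OP_PUSH:        delta -= 4; break;
        case OP_POP:         delta += 4; break;
        case OP_SUB_SP:      delta -= code[i].operand; break;
        case OP_ADD_SP:      delta += code[i].operand; break;
        case OP_LOAD_SP_REL: out[count++] = delta + code[i].operand; break;
        }
    }
    return count;
}
```

For example, a load from offset 4 after reserving 8 bytes and a load from offset 8 after one further push both normalize to -4, so they refer to the same local. A decompiler that treats the stack pointer as fixed would report two different variables.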
Readers attempting to evaluate REC should note that you need -gstabs (not just -g, as stated in the REC manual) when creating object files to be used with the files: option of a .cmd file.