Update: A Sneak Peek of the Source

2014-04-03 | Dagger Team

Since our last post, we have received lots of comments asking why we haven't released the source code yet. The simple answer is that we didn't believe it was ready. As a matter of fact, we still don't. Long story short, we haven't had much time to work on the project; the last few months were pretty hectic for us grad students. But enough excuses! Now that we're back at work on the project, we've decided to open up the source code, starting now.

There is obviously a tremendous amount of work left to do; we got to a very good stage last summer, with a good part of the LLVM testsuite running under our binary->IR->JIT dynamic binary translator. During development, lots of problems and corner cases came up; we'll start documenting them on this website. As always, contributions/questions are very welcome! We don't want to burden the LLVM mailing lists with our still-unmerged work, so you can contact us at dagger repzret.org.

Again, our intention is to have something useful and good enough to be merged into LLVM. So our work lives on a fork of the LLVM git repository, which we rebase onto upstream every once in a while. But enough talk, get the code:

git clone http://repzret.org/git/dagger.git
cd dagger
mkdir build
cd build
cmake ..
make

Right now, only X86-64 is supported, and the best-working object file format is Mach-O. Remember, this isn't a "release", so plenty of things might be missing, both small (configure) and big (building on other architectures).

The easiest thing to try out is to run llvm-dc on a YAML MCModule; for that, see the DC testcases. Another interesting use is to run the static translator, llvm-dec, on a full Mach-O binary. Finally, the ultimate step is to run the dynamic translator, DYN, on a Mach-O executable: that is done by setting the DYLD_INSERT_LIBRARIES environment variable to the path of the DYN.dylib file in the build/bin directory. Depending on the version of clang, the C runtime, the C++ standard library, etc., there might be problems. For instance, we're aware of a limitation (lack of support for weak internally defined function symbols) that seems to be triggered by recent versions of the OS X toolchain; we're working on it.
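
For the dynamic translator, a minimal sketch of the invocation might look like the following. Note that the variable dyld actually reads is spelled DYLD_INSERT_LIBRARIES (plural). The helper name run_under_dyn and the DYN_LIB default are our own illustrative assumptions, not part of the project; the default path assumes you are sitting in the build directory created above.

```shell
# Assumption: run from the build directory, so DYN.dylib is under ./bin.
# Override DYN_LIB if your checkout is laid out differently.
DYN_LIB="${DYN_LIB:-$PWD/bin/DYN.dylib}"

# Hypothetical helper: run a command with DYN inserted by dyld.
# The variable is set only for the child process, so the calling
# shell (and everything else you run from it) is left untouched.
run_under_dyn() {
    DYLD_INSERT_LIBRARIES="$DYN_LIB" "$@"
}

# Example use (the executable name is a placeholder):
#   run_under_dyn ./my_macho_executable
```

Scoping the variable to a single command, rather than exporting it, keeps the translator from being injected into unrelated processes started from the same shell.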

The good news is, after a pretty long hiatus, we're back on the project!