If you're working with LLVM and run into a bug, we definitely want to know about it. This document describes what you can do to increase the odds of getting it fixed quickly.
At a minimum, you need to do two things. First, decide whether the bug crashes the compiler (or an LLVM pass) or whether the compiler is miscompiling the program. Then, based on the type of bug, follow the instructions in the corresponding section to narrow it down so that the person who fixes it can find the problem more easily.
Once you have a reduced test-case, go to the LLVM Bug Tracking System, select the category in which the bug falls, and fill out the form with the necessary details. The bug description should contain the following information:
Thanks for helping us make LLVM better!
More often than not, bugs in the compiler cause it to crash - often due to an assertion failure of some sort. If you are running opt directly, and something crashes, jump to the section on bugs in LLVM passes. Otherwise, the most important piece of the puzzle is to figure out if it is the GCC-based front-end that is buggy or if it's one of the LLVM tools that has problems.
To figure out which program is crashing (the front-end, gccas, or gccld), re-run the llvm-gcc command line that crashed, but add the -v option. The compiler will print the commands it executes as it goes, and the output should end by telling you which of cc1/cc1plus, gccas, or gccld crashed.
If the problem is in the front-end, you should re-run the same llvm-gcc command that resulted in the crash, but add the -save-temps option. The compiler will crash again, but it will leave behind a foo.i file (containing preprocessed C source code) and possibly foo.s (containing LLVM assembly code), for each compiled foo.c file. Send us the foo.i file, along with a brief description of the error it caused. A tool that might help you reduce a front-end testcase to a more manageable size is delta.
If the crash occurs in the gccas stage of compilation, compile your test-case to a .s file with the -save-temps option to llvm-gcc. Then run:
gccas -debug-pass=Arguments < /dev/null -o - > /dev/null
... which will print a list of arguments, indicating the list of passes that gccas runs. Once you have the input file and the list of passes, go to the section on debugging bugs in LLVM passes.
If the crash occurs in the gccld stage of compilation, gather all of the .o bytecode files and libraries that are being linked together (the "llvm-gcc -v" output should include the full list of objects linked). Then run:
llvm-as < /dev/null > null.bc
gccld -debug-pass=Arguments null.bc
... which will print a list of arguments, indicating the list of passes that gccld runs. Once you have the input files and the list of passes, go to the section on debugging bugs in LLVM passes.
At this point, you should have some number of LLVM assembly files or bytecode files and a list of passes which crash when run on the specified input. In order to reduce the list of passes (which is probably large) and the input to something tractable, use the bugpoint tool as follows:
bugpoint <input files> <list of passes>
bugpoint will print a bunch of output as it reduces the test-case, but it should eventually print something like this:
...
Emitted bytecode to 'bugpoint-reduced-simplified.bc'
*** You can reproduce the problem with: opt bugpoint-reduced-simplified.bc -licm
Once you complete this, please send the LLVM bytecode file and the command line to reproduce the problem to the llvmbugs mailing list.
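When a bugpoint run is saved to a log, the reproduction line shown above can be pulled out mechanically before sending the report. The following is a minimal sketch, not part of LLVM: repro_command is a hypothetical helper name, and it assumes the log contains a line with exactly the "*** You can reproduce the problem with:" prefix shown in the sample output.

```shell
# repro_command (hypothetical helper, not an LLVM tool):
# read a saved bugpoint log on stdin and print the command that follows
# the "*** You can reproduce the problem with: " prefix.
# If the prefix appears more than once, the last occurrence wins.
repro_command() {
    sed -n 's/^\*\*\* You can reproduce the problem with: //p' | tail -n 1
}

# Example use (assumes the bugpoint output was saved to bugpoint.log):
#   repro_command < bugpoint.log
```

If the log contains no such line, the helper prints nothing, which usually means bugpoint did not finish reducing the test-case.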
A miscompilation occurs when a pass does not correctly transform a program, thus producing errors that are only noticed during execution. This is different from producing invalid LLVM code (i.e., code not in SSA form, using values before defining them, etc.) which the verifier will check for after a pass finishes its run.
If it looks like the LLVM compiler is miscompiling a program, first make sure the program does not rely on undefined behavior. In particular, check whether the program runs cleanly under valgrind, purify, or another memory checker. Many of the "LLVM bugs" that we have chased down turned out to be bugs in the program being compiled, not in LLVM.
Once you determine that the program itself is not buggy, you should choose which code generator you wish to compile the program with (e.g. C backend, the JIT, or LLC) and optionally a series of LLVM passes to run. For example:
bugpoint -run-cbe [... optzn passes ...] file-to-test.bc --args -- [program arguments]
bugpoint will try to narrow down your list of passes to the one pass that causes an error, and simplify the bytecode file as much as it can to assist you. It will print a message letting you know how to reproduce the resulting error.
Just as you can debug incorrect compilation by misbehaving passes, you can use bugpoint to debug incorrect code generation by either LLC or the JIT. In this case, bugpoint tries to narrow the code down to a function that is miscompiled by one or the other method. Because the entire program must be run to check correctness, bugpoint compiles the code it deems unaffected with the C backend and then links in the shared object it generates.
To debug the JIT:
bugpoint -run-jit -output=[correct output file] [bytecode file] \
         --tool-args -- [arguments to pass to lli] \
         --args -- [program arguments]
Similarly, to debug the LLC, one would run:
bugpoint -run-llc -output=[correct output file] [bytecode file] \
         --tool-args -- [arguments to pass to llc] \
         --args -- [program arguments]
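Because the JIT and LLC invocations differ only in the -run-jit or -run-llc option, the command line can be assembled by a small wrapper. This is a sketch under stated assumptions: bugpoint_cmd is a hypothetical helper name, it only prints the command rather than running it, and it leaves out the --tool-args section, so add that by hand if lli or llc needs extra arguments.

```shell
# bugpoint_cmd (hypothetical helper, not an LLVM tool):
# print, without running, the bugpoint command for one code generator.
# Usage: bugpoint_cmd jit|llc CORRECT_OUTPUT_FILE BYTECODE_FILE [PROGRAM_ARGS...]
# The printed line is for display; arguments containing spaces are not re-quoted.
bugpoint_cmd() {
    if [ "$#" -lt 3 ]; then
        echo "usage: bugpoint_cmd jit|llc OUTPUT BYTECODE [ARGS...]" >&2
        return 1
    fi
    case "$1" in
        jit|llc) ;;
        *) echo "unknown code generator: $1" >&2; return 1 ;;
    esac
    mode=$1
    output=$2
    bytecode=$3
    shift 3
    echo "bugpoint -run-$mode -output=$output $bytecode --args -- $*"
}

# Example use (file names are placeholders):
#   bugpoint_cmd jit good.out prog.bc 10 20
```

Printing the command first makes it easy to check the placeholders before starting what can be a long bugpoint run.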
Special note: if you are debugging MultiSource or SPEC tests that already exist in the llvm/test hierarchy, there is an easier way to debug the JIT, LLC, and CBE, using the pre-written Makefile targets, which will pass the program options specified in the Makefiles:
cd llvm/test/../../program
make bugpoint-jit
At the end of a successful bugpoint run, you will be presented with two bytecode files: a safe file which can be compiled with the C backend and the test file which either LLC or the JIT mis-codegenerates, and thus causes the error.
To reproduce the error that bugpoint found, it is sufficient to do the following:
Regenerate the shared object from the safe bytecode file:
llc -march=c safe.bc -o safe.c
gcc -shared safe.c -o safe.so
If debugging LLC, compile the test bytecode to native code and link it with the shared object:
llc test.bc -o test.s -f
gcc test.s safe.so -o test.llc
./test.llc [program options]
If debugging the JIT, load the shared object and supply the test bytecode:
lli -load=safe.so test.bc [program options]
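The reproduction steps above can be collected into one script. The sketch below is illustrative only: run and reproduce are hypothetical helper names, the file names safe.bc and test.bc are assumed to match the two bytecode files bugpoint produced, and setting DRY_RUN prints each command instead of executing it.

```shell
# run (hypothetical helper): echo a command, then execute it unless DRY_RUN is set.
run() {
    echo "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

# reproduce (hypothetical helper): replay the steps listed above.
# Usage: reproduce llc|jit [program options...]
# Assumes safe.bc and test.bc are in the current directory.
reproduce() {
    case "${1:-}" in
        llc|jit) mode=$1; shift ;;
        *) echo "usage: reproduce llc|jit [program options...]" >&2; return 1 ;;
    esac

    # Regenerate the shared object from the safe bytecode file.
    run llc -march=c safe.bc -o safe.c || return 1
    run gcc -shared safe.c -o safe.so || return 1

    if [ "$mode" = llc ]; then
        # Compile the test bytecode natively and link with the shared object.
        run llc test.bc -o test.s -f || return 1
        run gcc test.s safe.so -o test.llc || return 1
        run ./test.llc "$@"
    else
        # Load the shared object into the JIT and supply the test bytecode.
        run lli -load=safe.so test.bc "$@"
    fi
}

# Example use: preview the commands without running them.
#   DRY_RUN=1; reproduce jit [program options]
```

Each step stops the script on failure, so a crash in llc or gcc is not mistaken for the miscompilation being reproduced.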