A Direct stack map location records the address of frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to a stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address if an alloca, while IndirectMemRefOp
handle register spills.
For example:
entry:
%a = alloca i64...
llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)
Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195712 91177308-0d34-0410-b5e6-96231b3b80d8
The dumper was only dumping one pubtypes set and it was /always/ dumping
one pubtypes set even when there were zero sets. Now that the dumper
correctly dumps zero, one, or many sets, we can update this test case to
test for the absolute absence of a set rather than a bogus/accidental
zero-valued set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195706 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by Mikulas Patocka. I added the test. I checked that for cpu names that
gas knows about, it also doesn't generate nopl.
The modified cpus:
i686 - there are i686-class CPUs that don't have nopl: Via c3, Transmeta
Crusoe, Microsoft VirtualBox - see
https://bbs.archlinux.org/viewtopic.php?pid=775414
k6, k6-2, k6-3, winchip-c6, winchip2 - these are 586-class CPUs
via c3 c3-2 - see https://bugs.archlinux.org/task/19733 as a proof that
Via c3 and c3-Nehemiah don't have nopl
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195679 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes a bug in the assembler that was causing bad code to
be emitted. When switching modes in an assembly file (e.g. arm to
thumb mode) we would always emit the opcode from the original mode.
Consider this small example:
$ cat align.s
.code 16
foo:
add r0, r0
.align 3
add r0, r0
$ llvm-mc -triple armv7-none-linux align.s -filetype=obj -o t.o
$ llvm-objdump -triple thumbv7 -d t.o
Disassembly of section .text:
foo:
0: 00 44 add r0, r0
2: 00 f0 20 e3 blx #4195904
6: 00 00 movs r0, r0
8: 00 44 add r0, r0
This shows that we have actually emitted an arm nop (e320f000)
instead of a thumb nop. Unfortunately, this encodes to a thumb
branch which causes bad things to happen when compiling assembly
code with align directives.
The fix is to notify the ARMAsmBackend when we switch mode. The
MCMachOStreamer was already doing this correctly. This patch makes
the same change for the MCElfStreamer.
There is still a bug in the way nops are emitted for alignment
because the MCAlignment fragment does not store the correct mode.
The ARMAsmBackend will emit nops for the last mode it knew about. In
the example above, we still generate an arm nop if we add a `.code
32` to the end of the file.
PR18019
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195677 91177308-0d34-0410-b5e6-96231b3b80d8
These should not use COMDATs. GNU as uses .bss for .lcomm and section 0 for
.comm.
Given
static int a;
int b;
MSVC puts both in .bss. This patch then puts both .comm and .lcomm on .bss. With
this change we agree with gas on .lcomm, are much closer on .comm and clang-cl
matches msvc on the above example.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195654 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Moved the requirement for SelectionDAG::getConstant() to return legally
typed nodes slightly earlier. There were two optional DAGCombine passes
that were missed out and were required to produce type-legal DAGs.
Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant().
This provides support for both promoted and expanded vector types whereas the
previous code only supported promoted vector types.
Fixes a "Type for zero vector elements is not legal" assertion detected by
an llvm-stress generated test.
Reviewers: resistor
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2251
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195635 91177308-0d34-0410-b5e6-96231b3b80d8
to what is needed for constant islands. The prescan method for Mips16 constant
islands will eventually go away. It is only temporary and should be done
earlier when the instructions are first created or from the DAG. If we keep
it here we need to handle better the situation where constant islands
is called multiple times since don't want to prescan more than once.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195569 91177308-0d34-0410-b5e6-96231b3b80d8
I had to move some code and I moved a declaration forward past it's first use
in the function but by nutty coincidence there was another variable of the same
name and type and with completely unrelated function that was declared globally
in the class so no compilation error ensued.
It required some unusual conditions for it to even matter. Caused test
case casts.c in test-suite to fail during compilation with a duplicate
symbol error. I would have noticed it during final code review for this port.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195565 91177308-0d34-0410-b5e6-96231b3b80d8
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.
Make tests more robust by removing hard-coded metadata numbers in CHECK lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195535 91177308-0d34-0410-b5e6-96231b3b80d8
We were ignoring the ordered/onordered bits and also the signed/unsigned
bits of condition codes when lowering the DAG to MachineInstrs.
NOTE: This is a candidate for the 3.4 branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195514 91177308-0d34-0410-b5e6-96231b3b80d8
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195504 91177308-0d34-0410-b5e6-96231b3b80d8
Utilizing the 8 and 16 bit comparison instructions, even when an input can
be folded into the comparison instruction itself, is typically not worth it.
There are too many partial register stalls as a result, leading to significant
slowdowns. By always performing comparisons on at least 32-bit
registers, performance of the calculation chain leading to the
comparison improves. Continue to use the smaller comparisons when
minimizing size, as that allows better folding of loads into the
comparison instructions.
rdar://15386341
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195496 91177308-0d34-0410-b5e6-96231b3b80d8
If the beginning of the loop was also the entry block
of the function, branches were inserted to the entry block
which isn't allowed. If this occurs, create a new dummy
function entry block that branches to the start of the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195493 91177308-0d34-0410-b5e6-96231b3b80d8
Improvements over r195317:
- Set/restore EnableFastISel flag instead of just running FastISel within
SelectAllBasicBlocks; the flag is checked in various places, and
FastISel won't run properly if those places don't do the right thing.
- Test looks for normal ISel versus FastISel behavior, and not
something more subtle that doesn't work everywhere.
Based on work by Andrea Di Biagio.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195491 91177308-0d34-0410-b5e6-96231b3b80d8
The fix is simply to use CurI instead of I when handling aliases to
avoid accessing a invalid iterator.
original message:
Convert linkonce* to weak* instead of strong.
Also refactor the logic into a helper function. This is an important improve
on mingw where the linker complains about mixed weak and strong symbols.
Converting to weak ensures that the symbol is not dropped, but keeps in a
comdat, making the linker happy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195477 91177308-0d34-0410-b5e6-96231b3b80d8
- When simplifying the mask generation for BLEND, check whether that mask is
also consumed by other non-BLEND insns. If true, skip that simplification.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195476 91177308-0d34-0410-b5e6-96231b3b80d8
I've no idea why I decided to handle TMxx differently from all the other
high/low logic operations, but it was a stupid thing to do. The high
registers aren't available as separate 32-bit registers on z10,
so subreg_h32 can't be used on a GR64 there.
I've normally been testing with z196 and with -O3 and so hadn't noticed
this until now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195473 91177308-0d34-0410-b5e6-96231b3b80d8
Also refactor the logic into a helper function. This is an important improvement
on mingw where the linker complains about mixed weak and strong symbols.
Converting to weak ensures that the symbol is not dropped, but keeps in a
comdat, making the linker happy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195470 91177308-0d34-0410-b5e6-96231b3b80d8
e.g. "%tmp = load <2 x i64>* %ptr" can't be selected.
"%tmp = bitcast i64 %in to <2 x i32>" can't be selected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195424 91177308-0d34-0410-b5e6-96231b3b80d8
The legalizer can now do this type of expansion for more
type combinations without loading and storing to and
from the stack.
NOTE: This is a candidate for the 3.4 branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195398 91177308-0d34-0410-b5e6-96231b3b80d8
section use the form DW_FORM_data4 whilst in Dwarf 4 and later they
use the form DW_FORM_sec_offset.
This patch updates the places where such attributes are generated to
use the appropriate form depending on the Dwarf version. The DIE entries
affected have the following tags:
DW_AT_stmt_list, DW_AT_ranges, DW_AT_location, DW_AT_GNU_pubnames,
DW_AT_GNU_pubtypes, DW_AT_GNU_addr_base, DW_AT_GNU_ranges_base
It also adds a hidden command line option "--dwarf-version=<uint>"
to llc which allows the version of Dwarf to be generated to override
what is specified in the metadata; this makes it possible to update
existing tests to check the debugging information generated for both
Dwarf 4 (the default) and Dwarf 3 using the same metadata.
Patch (slightly modified) by Keith Walker!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195391 91177308-0d34-0410-b5e6-96231b3b80d8
AMD's processors family K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD's processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size.
It might be beneficial to disable this folding for some of the Intel's processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195383 91177308-0d34-0410-b5e6-96231b3b80d8