are targeting an ELF platform. Only fold gs-relative (and fs-relative) loads
if it is actually sensible to do so for the target platform.
This fixes PR13438.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160687 91177308-0d34-0410-b5e6-96231b3b80d8
might be deliberate "one time" leaks, so that leak checkers can find them.
This is a reapply of r160602 with the fix that this time I'm committing the
code I thought I was committing last time; the I->eraseFromParent() goes
*after* the break out of the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160664 91177308-0d34-0410-b5e6-96231b3b80d8
that do not support it (X86 does not lower select_cc).
PR: 13428
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160619 91177308-0d34-0410-b5e6-96231b3b80d8
r160529 that was subsequently reverted. The fix was to not call
GV->eraseFromParent() right before the caller does the same. The existing
testcases already caught this bug if run under valgrind.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160602 91177308-0d34-0410-b5e6-96231b3b80d8
This pass no longer requires that the global pointer value be saved to the
stack or register since it uses bal instruction to compute branch distance.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160601 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure we do not emit index computations with NSW flags so that we don't get an undef value if the GEP overflows.
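As a rough sketch of the intent (not the actual change; the helper and the names Builder, Base and Scaled are made up for illustration, and the header path varies between LLVM versions):
#include "llvm/IR/IRBuilder.h"
using namespace llvm;
// Build the index arithmetic with a plain add: if the scaled index overflows,
// it wraps, whereas an nsw add would let the GEP index become undef/poison.
static Value *emitIndex(IRBuilder<> &Builder, Value *Base, Value *Scaled) {
  return Builder.CreateAdd(Base, Scaled);   // deliberately not CreateNSWAdd
}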
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160589 91177308-0d34-0410-b5e6-96231b3b80d8
LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.
This fixes PR13414.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160575 91177308-0d34-0410-b5e6-96231b3b80d8
CI's name, and then used the StringRef pointing at its old name. I'm
fixing it by storing the name in a std::string, and hoisting the
renaming logic to happen always. This is nicer anyways as it will allow
the upgraded IR to have the same names as the input IR in more cases.
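A minimal sketch of the pattern in question (illustrative names and header path, not the exact code):
#include "llvm/IR/Instructions.h"
#include <string>
using namespace llvm;
// Copy the name into owned storage before renaming: a StringRef returned by
// getName() points into storage that the rename invalidates.
static void renameForUpgrade(CallInst *CI) {
  std::string Name = CI->getName().str();
  CI->setName(Name + ".old");
  // later code uses Name rather than a stale StringRef
}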
Another bug found by AddressSanitizer. Woot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160572 91177308-0d34-0410-b5e6-96231b3b80d8
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved anyway because the phi-use register is
live in the critical edge.
Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This removes the necessary
copies from the loop, which is still an improvement over injecting the
copies into the loop.
The test case demonstrates the improvement. Before:
LBB0_1:
cmpb $0, (%rdx)
leaq 1(%rdx), %rdx
movl %esi, %eax
je LBB0_1
After:
LBB0_1:
cmpb $0, (%rdx)
leaq 1(%rdx), %rdx
je LBB0_1
movl %esi, %eax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160571 91177308-0d34-0410-b5e6-96231b3b80d8
GetBestDestForJumpOnUndef() assumes there is at least 1 successor, which isn't
true if the block ends in an indirect branch with no successors. Fix this by
bailing out earlier in this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160546 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a bunch of make check failures of the form:
Unknown Architecture Version.
UNREACHABLE executed at ../lib/Target/Hexagon/HexagonSubtarget.cpp:60!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160518 91177308-0d34-0410-b5e6-96231b3b80d8
It is optimal at least up to 7 bits (I've tested all such cases).
This change to truncate() allows a little simplification to the multiplication code,
and it also makes multiplication optimal :)
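For example, a small usage sketch (the header path differs between LLVM versions):
#include "llvm/IR/ConstantRange.h"
#include "llvm/ADT/APInt.h"
using namespace llvm;
// Truncate an 8-bit range to 7 bits, then multiply two 7-bit ranges; per the
// note above, the resulting ranges should be as tight as the operation allows.
void constantRangeExample() {
  ConstantRange CR(APInt(8, 100), APInt(8, 200));  // the range [100, 200)
  ConstantRange Trunc = CR.truncate(7);
  ConstantRange Prod = Trunc.multiply(ConstantRange(APInt(7, 2), APInt(7, 5)));
  (void)Prod;
}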
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160512 91177308-0d34-0410-b5e6-96231b3b80d8
(instead of basenames) from DWARF. Use this behavior in the llvm-dwarfdump tool.
Reviewed by Benjamin Kramer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160496 91177308-0d34-0410-b5e6-96231b3b80d8
Updated OptimizeCompare in peephole to remove a redundant cmp against zero.
We only remove the Compare if CF and OF are not used.
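For illustration (not taken from the patch), in C code like the following the subtraction already sets ZF, and the equality test reads neither CF nor OF, so the separate cmp against zero can be dropped:
int f(int a, int b) {
  int t = a - b;   // the sub sets EFLAGS, including ZF
  if (t == 0)      // equality needs only ZF; no extra cmp against zero required
    return 1;
  return t;
}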
rdar://11855129
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160454 91177308-0d34-0410-b5e6-96231b3b80d8
when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.
These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
is fixed.
Patch by Andy Zhang and Tyler Nowicki!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160451 91177308-0d34-0410-b5e6-96231b3b80d8
LiveIntervals due to the two-addr pass generating bogus MI code.
The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address form code. Thus, the assertion only fired much
later in the pipeline.
The fix is to hoist the transformation logic up a layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.
As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]
Thanks to Jakob for helpful hints on the way here, and the review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160443 91177308-0d34-0410-b5e6-96231b3b80d8
load source operand is used by multiple nodes. The v2i64 broadcast was emulated
by shuffling the two lower i32 elements to the upper two.
We had a bug in the immediate used for the broadcast.
Replacing 0 with 0x44.
0x44 means [01|00|01|00] which corresponds to the correct lane.
Patch by Michael Kuperstein.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160430 91177308-0d34-0410-b5e6-96231b3b80d8
Print the high order register of a double word register operand.
In 32 bit mode, a 64 bit double word integer will be represented
by 2 32 bit registers. This modifier causes the high order register
to be used in the asm expression. It is useful if you are using
doubles in assembler and want to control the register-to-variable
relationships.
This patch also fixes a related bug in a previous patch:
case 'D': // Second part of a double word register operand
case 'L': // Low order register of a double word register operand
case 'M': // High order register of a double word register operand
I got 'D' and 'M' confused. The second part of a double word operand
will only match 'M' for one of the endiannesses. I had 'L' and 'D'
as the opposite twins, when 'L' and 'M' are.
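A hedged usage sketch (hypothetical function and variable names) for 32-bit Mips, where a 64-bit value occupies a register pair:
unsigned int high_word(unsigned long long v) {
  unsigned int hi;
  __asm__("move %0, %M1" : "=r"(hi) : "r"(v));  // %M1: high-order register of v
  return hi;
}
Using %L1 instead would substitute the low-order register of the pair.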
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160429 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR13371: indvars pass incorrectly substitutes 'undef' values.
I do not like this fix. It's needed until/unless the meaning of undef
changes. It attempts to be complete according to the IR spec, but I
don't have much confidence in the implementation given the difficulty
of testing undefined behavior. Worse, this invalidates some of my
hard-fought work on indvars and LSR to optimize pointer induction
variables. It results in benchmark regressions, which I'll track
internally. On x86_64 (no LTO) I see:
-3% huffbench
-3% 400.perlbench
-8% fhourstones
My only suggestion for recovering is to change the meaning of
undef. If we could trust an arbitrary instruction to produce some
real value that can be manipulated (e.g. incremented) according to
non-undef rules, then this case could be easily handled with SCEV.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160421 91177308-0d34-0410-b5e6-96231b3b80d8
intrinsics. The second instruction(s) to be handled are the vector versions
of count set bits (ctpop).
The changes here are to clang so that it generates a target independent
vector ctpop when it sees an ARM dependent vector bits set count. The changes
in llvm are to match the target independent vector ctpop and in
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM
dependent vector pop counts with target-independent ctpops. There are also
changes to an existing test case in llvm for ARM vector count instructions and
to a test for the bitcode upgrade.
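As an illustration of the source-level pattern affected (assuming clang with NEON enabled):
#include <arm_neon.h>
// With this change the builtin below is lowered to the generic llvm.ctpop
// intrinsic on <8 x i8> instead of an ARM-specific vcnt intrinsic.
uint8x8_t popcount_bytes(uint8x8_t v) {
  return vcnt_u8(v);
}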
<rdar://problem/11892519>
There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160410 91177308-0d34-0410-b5e6-96231b3b80d8
To fetch a subprogram name we should not only inspect the DIE for this subprogram, but optionally inspect
its specification, or its abstract origin (even if there is no inlining), or even the specification of an abstract origin.
Reviewed by Benjamin Kramer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160365 91177308-0d34-0410-b5e6-96231b3b80d8
When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160357 91177308-0d34-0410-b5e6-96231b3b80d8
large immediates. Add dag combine logic to recover in case the large
immediate doesn't fit in the cmp immediate operand field.
int foo(unsigned long l) {
return (l >> 47) == 1;
}
we produce
%shr.mask = and i64 %l, -140737488355328
%cmp = icmp eq i64 %shr.mask, 140737488355328
%conv = zext i1 %cmp to i32
ret i32 %conv
which codegens to
movq $0xffff800000000000,%rax
andq %rdi,%rax
movq $0x0000800000000000,%rcx
cmpq %rcx,%rax
sete %al
movzbl %al,%eax
ret
TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.
Based on a patch by Eli Friedman.
PR10328
rdar://9758774
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160346 91177308-0d34-0410-b5e6-96231b3b80d8
This places limits on CollectSubexprs to constrain the number of
reassociation possibilities. It limits the recursion depth and skips
over chains of nested recurrences outside the current loop.
Fixes PR13361, although the underlying SCEV behavior is still potentially bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160340 91177308-0d34-0410-b5e6-96231b3b80d8
uint32_t hi(uint64_t res)
{
uint32_t hi = res >> 32;
return !hi;
}
llvm IR looks like this:
define i32 @hi(i64 %res) nounwind uwtable ssp {
entry:
%lnot = icmp ult i64 %res, 4294967296
%lnot.ext = zext i1 %lnot to i32
ret i32 %lnot.ext
}
The optimizer has optimized away the right shift and truncate but the resulting
constant is too large to fit in the 32-bit immediate field. The x86
code is worse as a result:
movabsq $4294967296, %rax ## imm = 0x100000000
cmpq %rax, %rdi
sbbl %eax, %eax
andl $1, %eax
This patch teaches the x86 lowering code to handle ult against a large immediate
with trailing zeros. It will issue a right shift and a truncate followed by
a comparison against a shifted immediate.
shrq $32, %rdi
testl %edi, %edi
sete %al
movzbl %al, %eax
It also handles a ugt comparison against a large immediate with trailing bits
set, i.e. X > 0x0ffffffff -> (X >> 32) >= 1.
rdar://11866926
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160312 91177308-0d34-0410-b5e6-96231b3b80d8
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160305 91177308-0d34-0410-b5e6-96231b3b80d8
Make it always return APInts with the same bitwidth for the same ConstantRange bitwidth to simplify clients.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160294 91177308-0d34-0410-b5e6-96231b3b80d8
Mips shift instructions DSLL, DSRL and DSRA are transformed into
DSLL32, DSRL32 and DSRA32 respectively if the shift amount is between
32 and 63.
Here is a description of DSLL:
Purpose: Doubleword Shift Left Logical Plus 32
To execute a left-shift of a doubleword by a fixed amount--32 to 63 bits
Description: GPR[rd] <- GPR[rt] << (sa+32)
The 64-bit doubleword contents of GPR rt are shifted left, inserting
zeros into the emptied bits; the result is placed in
GPR rd. The bit-shift amount in the range 0 to 31 is specified by sa.
This patch implements the direct object output of these instructions.
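For illustration, a 64-bit shift by a constant in the 32-63 range, such as the following compiled for Mips64, uses the *32 form (here dsll32 with sa = 40 - 32 = 8):
unsigned long long shift_left_40(unsigned long long x) {
  return x << 40;   // emitted as dsll32 with a shift amount field of 8
}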
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160277 91177308-0d34-0410-b5e6-96231b3b80d8
undef virtual register. The problem is that ProcessImplicitDefs removes the
definition of the register and marks all uses as undef. If we lose the undef
marker then we get a register which has no def and is not marked as undef. The
live interval analysis does not collect information for these virtual
registers and we crash in later passes.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160260 91177308-0d34-0410-b5e6-96231b3b80d8
It turns out that ASan relied on the at-the-end block insertion order to
(purely by happenstance) disable some LLVM optimizations, which in turn
start firing when the ordering is made more "normal". These
optimizations in turn merge many of the instrumentation reporting calls
which breaks the return address based error reporting in ASan.
We're looking at several different options for fixing this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160256 91177308-0d34-0410-b5e6-96231b3b80d8
This is particularly useful to the backend code generators which try to
process things in the incoming function order.
Also, cleanup some uses of IRBuilder to be a bit simpler and more clear.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160254 91177308-0d34-0410-b5e6-96231b3b80d8
It is intended to fix PR11468.
Old prologue and epilogue looked like this:
push %rbp
mov %rsp, %rbp
and $alignment, %rsp
push %r14
push %r15
...
pop %r15
pop %r14
mov %rbp, %rsp
pop %rbp
The problem was how to reference the locations of callee-saved registers in exception handling:
the locations of the callee-saved registers had to be re-calculated to account for the stack alignment
operation. It would take some effort to implement this in LLVM, as currently MachineLocation can only
have the form "Register + Offset". Function prologue and epilogue are now changed to:
push %rbp
mov %rsp, %rbp
push %r14
push %r15
and $alignment, %rsp
...
lea -$size_of_saved_registers(%rbp), %rsp
pop %r15
pop %r14
pop %rbp
Reviewed by Chad Rosier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160248 91177308-0d34-0410-b5e6-96231b3b80d8
the move of *Builder classes into the Core library.
No uses of this builder in Clang or DragonEgg that I could find.
If there is a desire to have an IR-building-support library that
contains all of these builders, that can be easily added, but currently
it seems likely that these add no real overhead to VMCore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160243 91177308-0d34-0410-b5e6-96231b3b80d8
IRBuilder, DIBuilder, etc.
This is the proper layering as MDBuilder can't be used (or implemented)
without the Core Metadata representation.
Patches to Clang and Dragonegg coming up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160237 91177308-0d34-0410-b5e6-96231b3b80d8
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.
PR12782.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160230 91177308-0d34-0410-b5e6-96231b3b80d8
Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160229 91177308-0d34-0410-b5e6-96231b3b80d8
The notable fix is to look at any dependencies attached to the kill
instruction (or other instructions between MI and the kill) where the
dependencies are specific to the register in question.
The old code implicitly handled this by rejecting the transform if *any*
other uses were found within the block, but after the start point. The
new code directly finds the kill, and has to re-use the existing
dependency scan to check for non-kill uses.
This was caught by self-host, but I found the bug via inspection and use
of absurd assert scaffolding to compute the kills in two ways and
compare them. So I have no useful testcase for this other than
"bootstrap". I'd work harder to reduce a test case if this particular
code were likely to live for a long time.
Thanks to Benjamin Kramer for reviewing the fix itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160228 91177308-0d34-0410-b5e6-96231b3b80d8
Catch uses of undefined physregs that haven't been added to basic block
live-in lists. Run the verifier to pinpoint the problem.
Also run the verifier when a virtual register use is not jointly
dominated by defs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160207 91177308-0d34-0410-b5e6-96231b3b80d8
All SCEV expressions used by LSR formulae must be safe to
expand, i.e. they may not contain UDiv unless we can prove a nonzero
denominator.
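An illustrative C example of the hazard (not the PR testcase):
void fill(int *a, unsigned n, unsigned d) {
  if (d != 0)
    for (unsigned i = 0; i < n / d; ++i)  // the trip count contains a udiv by d
      a[i] = 0;
}
Expanding the n / d expression at a point not guarded by the d != 0 check would introduce a possible division by zero.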
Fixes PR11356: LSR hoists UDiv.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160205 91177308-0d34-0410-b5e6-96231b3b80d8
This allows SCEVExpander to run on the IV expressions.
This codifies an assumption made by LSR to complete the fix for
PR11356, but I haven't been able to generate a separate unit test for
this part. I'm adding it as an extra safety check.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160204 91177308-0d34-0410-b5e6-96231b3b80d8
intrinsics with target-independent intrinsics. The first instruction(s) to be
handled are the vector versions of count leading zeros (ctlz).
The changes here are to clang so that it generates a target independent
vector ctlz when it sees an ARM dependent vector ctlz. The changes in llvm
are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp
to update any existing bc files containing ARM dependent vector ctlzs with
target-independent ctlzs. There are also changes to an existing test case in
llvm for ARM vector count instructions and a new test for the bitcode upgrade.
<rdar://problem/11831778>
There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160200 91177308-0d34-0410-b5e6-96231b3b80d8
removes the largest scaling problem in the test cases from PR13225 when
ASan is switched to insert basic blocks in the natural CFG order.
It may also solve some scaling problems for more normal code with large
numbers of basic blocks and variables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160194 91177308-0d34-0410-b5e6-96231b3b80d8
Call instructions are no longer required to be variadic, and
variable_ops should only be used for instructions that encode a variable
number of arguments, like the ARM stm/ldm instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160189 91177308-0d34-0410-b5e6-96231b3b80d8
Function argument registers are added to the call SDNode, but
InstrEmitter now knows how to make those operands implicit, and the call
instruction doesn't have to be variadic.
Explicit register operands should only be those that are encoded in the
instruction; implicit register operands are for extra dependencies like
call arguments and return values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160188 91177308-0d34-0410-b5e6-96231b3b80d8
is used in cases where global symbols are
directly represented in the GOT and we use an
offset into the global offset table.
This patch adds direct object support for R_MIPS_GOT_DISP.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160183 91177308-0d34-0410-b5e6-96231b3b80d8
When dumping the DAG for a fatal 'Cannot select' back-end error, also
provide the name of the function the construct is in. Useful when dealing
with large testcases, as the next step is to llvm-extract the function
in question to get a small(er) testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160152 91177308-0d34-0410-b5e6-96231b3b80d8
the input vector, it can be bigger (this is helpful for powerpc where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC). However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160116 91177308-0d34-0410-b5e6-96231b3b80d8
%shr = lshr i64 %key, 3
%0 = load i64* %val, align 8
%sub = add i64 %0, -1
%and = and i64 %sub, %shr
ret i64 %and
to:
%shr = lshr i64 %key, 3
%0 = load i64* %val, align 8
%sub = add i64 %0, 2305843009213693951
%and = and i64 %sub, %shr
ret i64 %and
The demanded bit optimization is actually a pessimization because add -1 would
be codegen'ed as a sub 1. Teach the demanded constant shrinking optimization
to check for a negated constant to make sure it is actually reducing the width
of the constant.
rdar://11793464
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160101 91177308-0d34-0410-b5e6-96231b3b80d8
def Pat<...>;
Results in a 'record name is not a string!' diagnostic. Not the best,
but the lack of location information moves it from not very helpful
into completely useless. We're in the Record class when throwing the
error, so just add the location info directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160098 91177308-0d34-0410-b5e6-96231b3b80d8