at phis. This enables us to eliminate a lot of pointless zexts during the DAGCombine
phase. This fixes <rdar://problem/8760114>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126170 91177308-0d34-0410-b5e6-96231b3b80d8
In other words, do not keep track of argument's location. The debugger (gdb) is not prepared to see line table entries for arguments. For the debugger, "second" line table entry marks beginning of function body.
This requires some coordination with debugger to get this working.
- The debugger needs to be aware of prolog_end attribute attached with line table entries.
- The compiler needs to accurately mark prolog_end in line table entries (at -O0 and at -O1+)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126155 91177308-0d34-0410-b5e6-96231b3b80d8
"dllimport" function must not be GlobalVariable, but Function. It is enough to check with GlobalValue.
test/CodeGen/X86/dll-linkage.ll is updated to check llc -O0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126110 91177308-0d34-0410-b5e6-96231b3b80d8
of a constant had a minor typo introduced when copying it from the book, which
caused it to favor negative approximations over positive approximations in many
cases. Positive approximations require fewer operations beyond the multiplication.
In the case of division by 3, we still generate code that is a single instruction
larger than GCC's code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126097 91177308-0d34-0410-b5e6-96231b3b80d8
test for that. With this change, test/CodeGen/X86/codegen-dce.ll no longer finds
any instructions to DCE, so delete the test.
Also renamed J and JP to I and IP in RecursivelyDeleteDeadPHINode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126088 91177308-0d34-0410-b5e6-96231b3b80d8
We usually catch this kind of optimization through InstSimplify's distributive
magic, but or doesn't distribute over xor in general.
"A | ~(A | B) -> A | ~B" hits 24 times on gcc.c.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126081 91177308-0d34-0410-b5e6-96231b3b80d8
The DAGCombiner folds the zext into complex load instructions. This patch
prevents this optimization on vectors since none of the supported targets
knows how to perform load+vector_zext in one instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126080 91177308-0d34-0410-b5e6-96231b3b80d8
constant, including globals. This makes us generate much more "pretty" pattern
globals as well because it doesn't break it down to an array of bytes all the
time.
This enables us to handle stores of relocatable globals. This kicks in about
48 times in 254.gap, giving us stuff like this:
@.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*] [%struct.TypHeader* (%struct.TypHeader*, %struct
.TypHeader*)* @IsFalse, %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)* @IsFalse], align 16
...
call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*]* @.memset_pattern40 to i8*
), i64 %tmp75) nounwind
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126044 91177308-0d34-0410-b5e6-96231b3b80d8
unsplatable values into memset_pattern16 when it is available
(recent darwins). This transforms lots of strided loop stores
of ints for example, like 5 in vpr:
Formed memset: call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25)
from store to: {%3,+,4}<%11> at: store i32 3, i32* %scevgep, align 4, !tbaa !4
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126040 91177308-0d34-0410-b5e6-96231b3b80d8
taken (and used!). This prevents merging the blocks (invalidating
the block addresses) in a case like this:
#define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
void foo() {
printf("%p\n", _THIS_IP_);
printf("%p\n", _THIS_IP_);
printf("%p\n", _THIS_IP_);
}
which fixes PR4151.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125829 91177308-0d34-0410-b5e6-96231b3b80d8
No one uses *-mingw64. mingw-w64 is represented as {i686|x86_64}-w64-mingw32. In llvm side, i686 and x64 can be treated as similar way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125747 91177308-0d34-0410-b5e6-96231b3b80d8
variations (some of these were already present so I unified the code). Spotted by my
auto-simplifier as occurring a lot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125734 91177308-0d34-0410-b5e6-96231b3b80d8
transformation if we can't legally create a build vector of the correct
type. Check that we can make the transformation first, and add a TODO to
refactor this code with similar cases.
Fixes: PR9223 and rdar://9000350
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125631 91177308-0d34-0410-b5e6-96231b3b80d8
Machine instruction range consisting of only DBG_VALUE MIs only contributes consecutive labels in assembly output, which is harmless, and empty scope entry in DebugInfo, which confuses debugger tools.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125577 91177308-0d34-0410-b5e6-96231b3b80d8
The i64_buildvector test in this file relies on the alignment of i64 and
f64 types being the same, which is true for Darwin but not AAPCS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125525 91177308-0d34-0410-b5e6-96231b3b80d8
- Add custom operand matching for imod and iflags.
- Rename SplitMnemonicAndCC to SplitMnemonic since it splits more than CC
from mnemonic.
- While adding ".w" as an operand, don't change "Head" to avoid passing the
wrong mnemonic to ParseOperand.
- Add asm parser tests.
- Add disassembler tests just to make sure it can catch all cps versions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125489 91177308-0d34-0410-b5e6-96231b3b80d8
have their low bits set to zero. This allows us to optimize
out explicit stack alignment code like in stack-align.ll:test4 when
it is redundant.
Doing this causes the code generator to start turning FI+cst into
FI|cst all over the place, which is general goodness (that is the
canonical form) except that various pieces of the code generator
don't handle OR aggressively. Fix this by introducing a new
SelectionDAG::isBaseWithConstantOffset predicate, and using it
in places that are looking for ADD(X,CST). The ARM backend in
particular was missing a lot of addressing mode folding opportunities
around OR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125470 91177308-0d34-0410-b5e6-96231b3b80d8
plus some variations of this. According to my auto-simplifier this occurs a lot
but usually in combination with max/min idioms. Because max/min aren't handled
yet this unfortunately doesn't have much effect in the testsuite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125462 91177308-0d34-0410-b5e6-96231b3b80d8
It caused a crash in MultiSource/Benchmarks/Bullet.
Opt hit an assertion with "opt -std-compile-opts" because
Constant::getAllOnesValue doesn't know how to handle floats.
This patch added a test to reproduce the problem and a check that the
destination vector is of integer type.
Thank you Benjamin!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125459 91177308-0d34-0410-b5e6-96231b3b80d8
the shift amounts are in a suitably wide type so that
we don't generate out of range constant shift amounts.
This fixes PR9028.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125458 91177308-0d34-0410-b5e6-96231b3b80d8
is narrower than the shift register. Doing an anyext provides undefined bits in
the top part of the register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125457 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a FIXME in scev-aa.ll (allowing a new no-alias result) and
generally makes things more precise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125449 91177308-0d34-0410-b5e6-96231b3b80d8
These are just FXSAVE and FXRSTOR with REX.W prefixes. These versions use
64-bit pointer values instead of 32-bit pointer values in the memory map they
dump and restore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125446 91177308-0d34-0410-b5e6-96231b3b80d8
The DAGCombiner created illegal BUILD_VECTOR operations.
The patch added a check that either illegal operations are
allowed or that the created operation is legal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125435 91177308-0d34-0410-b5e6-96231b3b80d8
unsigned overflow (e.g. "gep P, -1"), and while they can have
signed wrap in theoretical situations, modelling an AddRec as
not having signed wrap is going enough for any case we can
think of today. In the future if this isn't enough, we can
revisit this. Modeling them as having NUW isn't causing any
known problems either FWIW.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125410 91177308-0d34-0410-b5e6-96231b3b80d8
This
define float @foo(float %x, float %y) nounwind readnone {
entry:
%0 = tail call float @copysignf(float %x, float %y) nounwind readnone
ret float %0
}
Was compiled to:
vmov s0, r1
bic r0, r0, #-2147483648
vmov s1, r0
vcmpe.f32 s0, #0
vmrs apsr_nzcv, fpscr
it lt
vneglt.f32 s1, s1
vmov r0, s1
bx lr
This fails to copy the sign of -0.0f because it's lost during the float to int
conversion. Also, it's sub-optimal when the inputs are in GPR registers.
Now it uses integer and + or operations when it's profitable. And it's correct!
lsrs r1, r1, #31
bfi r0, r1, #31, #1
bx lr
rdar://8984306
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125357 91177308-0d34-0410-b5e6-96231b3b80d8
gep to explicit addressing, we know that none of the intermediate
computation overflows.
This could use review: it seems that the shifts certainly wouldn't
overflow, but could the intermediate adds overflow if there is a
negative index?
Previously the testcase would instcombine to:
define i1 @test(i64 %i) {
%p1.idx.mask = and i64 %i, 4611686018427387903
%cmp = icmp eq i64 %p1.idx.mask, 1000
ret i1 %cmp
}
now we get:
define i1 @test(i64 %i) {
%cmp = icmp eq i64 %i, 1000
ret i1 %cmp
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125271 91177308-0d34-0410-b5e6-96231b3b80d8
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.
Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck. I believe that this takes care of a bunch of related
code quality issues attached to PR8862.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125267 91177308-0d34-0410-b5e6-96231b3b80d8
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts. For example, these (which
are the same except the first is 'exact' sdiv:
define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
%A = sdiv exact i64 %X, -5 ; X/-5 == 0 --> x == 0
%B = icmp eq i64 %A, 0
ret i1 %B
}
define i1 @sdiv_icmp4(i64 %X) nounwind {
%A = sdiv i64 %X, -5 ; X/-5 == 0 --> x == 0
%B = icmp eq i64 %A, 0
ret i1 %B
}
compile down to:
define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
%1 = icmp eq i64 %X, 0
ret i1 %1
}
define i1 @sdiv_icmp4(i64 %X) nounwind {
%X.off = add i64 %X, 4
%1 = icmp ult i64 %X.off, 9
ret i1 %1
}
This happens when you do something like:
(ptr1-ptr2) == 42
where the pointers are pointers to non-unit types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125266 91177308-0d34-0410-b5e6-96231b3b80d8
When matching operands for a candidate opcode match in the auto-generated
AsmMatcher, check each operand against the expected operand match class.
Previously, operands were classified independently of the opcode being
handled, which led to difficulties when operand match classes were
more complicated than simple subclass relationships.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125245 91177308-0d34-0410-b5e6-96231b3b80d8
could end up removing a different function than we intended because it was
functionally equivalent, then end up with a comparison of a function against
itself in the next round of comparisons (the one in the function set and the
one on the deferred list). To fix this, I introduce a choice in the form of
comparison for ComparableFunctions, either normal or "pointer only" used to
find exact Function*'s in lookups.
Also add some debugging statements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125180 91177308-0d34-0410-b5e6-96231b3b80d8
The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using
post-increment versions, but all the rest of the NEON load/store instructions
should be handled now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125014 91177308-0d34-0410-b5e6-96231b3b80d8
failures with relocations.
The code committed is a first cut at compatibility for emitted relocations in
ELF .o.
Why do this? because existing ARM tools like emitting relocs symbols as
explicit relocations, not as section-offset relocs.
Result is that with these changes,
1) relocs are now substantially identical what to gcc outputs.
2) larger apps (including many spec2k tests) compile, cross-link, and pass
Added reminder fixme to tests for future conversion to .s form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124996 91177308-0d34-0410-b5e6-96231b3b80d8
Unified EmitTextAttribute for both Asm and Obj emission (.cpu only)
Added necessary cortex-A8 related attrs for codegen compat tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124995 91177308-0d34-0410-b5e6-96231b3b80d8
(yes, this is different from R_ARM_CALL)
- Adds a new method getARMBranchTargetOpValue() which handles the
necessary distinction between the conditional and unconditional br/bl
needed for ARM/ELF
At least for ARM mode, the needed fixup for conditional versus unconditional
br/bl is identical, but the ARM docs and existing ARM tools expect this
reloc type...
Added a few FIXME's for future naming fixups in ARMInstrInfo.td
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124895 91177308-0d34-0410-b5e6-96231b3b80d8
auto-simplifier). This has a big impact on Ada code, but not much else.
Unfortunately the impact is mostly negative! This is due to PR9004 (aka
SCCP failing to resolve conditional branch conditions in the destination
blocks of the branch), in which simple correlated expressions are not
resolved but complicated ones are, so simplifying has a bad effect!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124788 91177308-0d34-0410-b5e6-96231b3b80d8