argument. The generated alloca has to have at least the alignment of the
byval, if not, the client may be making assumptions that the new alloca won't
satisfy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122234 91177308-0d34-0410-b5e6-96231b3b80d8
the same as setcc. Optimize ADDC(0,0,FLAGS) -> SET_CARRY(FLAGS). This is
a step towards finishing off PR5443. In the testcase in that bug we now get:
movq %rdi, %rax
addq %rsi, %rax
sbbq %rcx, %rcx
testb $1, %cl
setne %dl
ret
instead of:
movq %rdi, %rax
addq %rsi, %rax
movl $0, %ecx
adcq $0, %rcx
testq %rcx, %rcx
setne %dl
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122219 91177308-0d34-0410-b5e6-96231b3b80d8
doesn't, match it back to setb.
On a 64-bit version of the testcase before we'd get:
movq %rdi, %rax
addq %rsi, %rax
sbbb %dl, %dl
andb $1, %dl
ret
now we get:
movq %rdi, %rax
addq %rsi, %rax
setb %dl
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122217 91177308-0d34-0410-b5e6-96231b3b80d8
consistently by moving it out of lowering into dag combine.
Add some missing patterns for matching away extended versions of setcc_c.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122201 91177308-0d34-0410-b5e6-96231b3b80d8
This resolves a README entry and technically resolves PR4916,
but we still get poor code for the testcase in that PR because
GVN isn't CSE'ing uadd with add, filed as PR8817.
Previously we got:
_test7: ## @test7
addq %rsi, %rdi
cmpq %rdi, %rsi
movl $42, %eax
cmovaq %rsi, %rax
ret
Now we get:
_test7: ## @test7
addq %rsi, %rdi
movl $42, %eax
cmovbq %rsi, %rax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122182 91177308-0d34-0410-b5e6-96231b3b80d8
sadd formed is half the size of the original type. We can
now compile this into a sadd.i8:
unsigned char X(char a, char b) {
int res = a+b;
if ((unsigned )(res+128) > 255U)
abort();
return res;
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122178 91177308-0d34-0410-b5e6-96231b3b80d8
checking to see if the high bits of the original add result were dead.
Inserting a smaller add and zexting back to that size is not good enough.
This is likely to be the fix for 8816.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122177 91177308-0d34-0410-b5e6-96231b3b80d8
isel is *required* to split the edge. PHI values get evaluated
on the edge, not in their predecessor block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122170 91177308-0d34-0410-b5e6-96231b3b80d8
which have trapping constant exprs in them due to PHI nodes.
Eliminating them can cause the constant expr to be evalutated
on new paths if the input edges are critical.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122164 91177308-0d34-0410-b5e6-96231b3b80d8
It turns out that ppc backend has really weird interdependencies
over different hooks and all stuff is fragile wrt small changes.
This should fix PR8749
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122155 91177308-0d34-0410-b5e6-96231b3b80d8
may be called. If the entry block is empty, the insertion point iterator will be
the "end()" value. Calling ->getParent() on it (among others) causes problems.
Modify materializeFrameBaseRegister to take the machine basic block and insert
the frame base register at the beginning of that block. (It's very similar to
what the code does all ready. The only difference is that it will always insert
at the beginning of the entry block instead of after a previous materialization
of the frame base register. I doubt that that matters here.)
<rdar://problem/8782198>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122104 91177308-0d34-0410-b5e6-96231b3b80d8
BUILD_VECTOR operands where the element type is not legal. I had previously
changed this code to insert TRUNCATE operations, but that was just wrong.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122102 91177308-0d34-0410-b5e6-96231b3b80d8
on the DragonEgg self-host bot. Unfortunately, the testcase is pretty messy and doesn't reduce well due to
interactions with other parts of InstCombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122072 91177308-0d34-0410-b5e6-96231b3b80d8
a null endptr argument, because they may write to errno.
This fixes a seflhost miscompile observed on Linux targets when TBAA
was enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122014 91177308-0d34-0410-b5e6-96231b3b80d8
dragonegg self-host buildbot. Original commit message:
Add an InstCombine transform to recognize instances of manual overflow-safe addition
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics. This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121965 91177308-0d34-0410-b5e6-96231b3b80d8
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics. This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.
Fixes <rdar://problem/8558713>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121905 91177308-0d34-0410-b5e6-96231b3b80d8
Clang is now providing intrinsics for these and so we need to support them
in the backend. Radar 8068427.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121902 91177308-0d34-0410-b5e6-96231b3b80d8
and "save_volatiles" correctly. This completes the custom calling convention
functionality changes for the MBlaze backend that were started in 121888.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121891 91177308-0d34-0410-b5e6-96231b3b80d8
When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121859 91177308-0d34-0410-b5e6-96231b3b80d8
With this we don't need the EffectiveSize field anymore. Without that field
LayoutFragment only updates offsets and we don't need to invalidate the
current fragment when it is relaxed (only the ones following it).
This is also a very small improvement in the accuracy of the layout info as
we now use the after relaxation size immediately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121857 91177308-0d34-0410-b5e6-96231b3b80d8
Since we now don't update addresses so early, we might relax a bit more than
we need to. This is simillar to the issue in PR8467.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121856 91177308-0d34-0410-b5e6-96231b3b80d8
update the condition codes. These come from my test generator and are just
the ones that MC currently assembles correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121830 91177308-0d34-0410-b5e6-96231b3b80d8
regB = move RCX
regA = op regB, regC
RAX = move regA
where both regB and regC are killed. If regB is constrainted to non-compatible
physical registers but regC is not constrainted at all, then it's better to
commute the instruction.
movl %edi, %eax
shlq $32, %rcx
leaq (%rcx,%rax), %rax
=>
movl %edi, %eax
shlq $32, %rcx
orq %rcx, %rax
rdar://8762995
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121793 91177308-0d34-0410-b5e6-96231b3b80d8
which is simpler than finding a place to insert in BB.
- Don't perform the 'if condition hoisting' xform on certain
i1 PHIs, as it interferes with switch formation.
This re-fixes "example 7", without breaking the world hopefully.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121764 91177308-0d34-0410-b5e6-96231b3b80d8
first, it can kick in on blocks whose conditions have been
folded to a constant, even though one of the edges will be
trivially folded.
second, it doesn't clean up the "if diamond" that it just
eliminated away. This is a problem because other simplifycfg
xforms kick in depending on the order of block visitation,
causing pointless work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121762 91177308-0d34-0410-b5e6-96231b3b80d8
Use the same COPY_TO_REGCLASS approach as for the 2-register *_sfp instructions.
This change made a big difference in the code generated for the
CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing
a fine job, but some instructions that were previously moved outside the loop
are not moved now. It's using fewer VFP registers now, which is generally
a good thing, so I think the estimates for register pressure changed and that
affected the LICM behavior. Since that isn't obviously wrong, I've just
changed the test file. This completes the work for Radar 8711675.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121730 91177308-0d34-0410-b5e6-96231b3b80d8
when the wider type is legal. This allows us to compile:
define zeroext i16 @test1(i16 zeroext %x) nounwind {
entry:
%div = udiv i16 %x, 33
ret i16 %div
}
into:
test1: # @test1
movzwl 4(%esp), %eax
imull $63551, %eax, %eax # imm = 0xF83F
shrl $21, %eax
ret
instead of:
test1: # @test1
movw $-1985, %ax # imm = 0xFFFFFFFFFFFFF83F
mulw 4(%esp)
andl $65504, %edx # imm = 0xFFE0
movl %edx, %eax
shrl $5, %eax
ret
Implementing rdar://8760399 and example #4 from:
http://blog.regehr.org/archives/320
We should implement the same thing for [su]mul_hilo, but I don't
have immediate plans to do this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121696 91177308-0d34-0410-b5e6-96231b3b80d8
when simplifying, allowing them to be eagerly turned into switches. This
is the last step required to get "Example 7" from this blog post:
http://blog.regehr.org/archives/320
On X86, we now generate this machine code, which (to my eye) seems better
than the ICC generated code:
_crud: ## @crud
## BB#0: ## %entry
cmpb $33, %dil
jb LBB0_4
## BB#1: ## %switch.early.test
addb $-34, %dil
cmpb $58, %dil
ja LBB0_3
## BB#2: ## %switch.early.test
movzbl %dil, %eax
movabsq $288230376537592865, %rcx ## imm = 0x400000017001421
btq %rax, %rcx
jb LBB0_4
LBB0_3: ## %lor.rhs
xorl %eax, %eax
ret
LBB0_4: ## %lor.end
movl $1, %eax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121690 91177308-0d34-0410-b5e6-96231b3b80d8
class A<bit a, bits<3> x, bits<3> y> {
bits<3> z;
let z = !if(a, x, y);
}
The variable z will get the value of x when 'a' is 1 and 'y' when a is '0'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121666 91177308-0d34-0410-b5e6-96231b3b80d8
(x & 2^n) ? 2^m+C : C
we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the
select to a shift and apply the offset afterwards.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121609 91177308-0d34-0410-b5e6-96231b3b80d8
Alignments smaller than the total size of the memory being loaded or stored,
unless the alignment is 8 bytes, are not allowed. Add tests for this, too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121506 91177308-0d34-0410-b5e6-96231b3b80d8
cmake/modules/AddLLVM.cmake: Add empty "phony" target in add_llvm_loadable_module() even if loadable module were not supported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121455 91177308-0d34-0410-b5e6-96231b3b80d8
the condition codes. Where the ones that do have an 's' suffix and the ones
that don't don't have the suffix. The trick is if MatchInstructionImpl() fails
we try again after adding a CCOut operand with the correct value and removing
the 's' if present. Four simple test cases added for now, lots more to come.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121401 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise, a plain str/ldr should be used instead. Make sure we account for
that in prologue/epilogue code generation.
rdar://8745460
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121391 91177308-0d34-0410-b5e6-96231b3b80d8
the output to the correct register. Fixes a hidden problem uncovered
by the last patch where we'd try to DAG combine our MVT::Other node
oddly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121358 91177308-0d34-0410-b5e6-96231b3b80d8
Added test to check bl __aeabi_read_tp gets emitted properly for ELF/ASM
as well as ELF/OBJ (including fixup)
Also added support for ELF::R_ARM_TLS_IE32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121312 91177308-0d34-0410-b5e6-96231b3b80d8
vpush instructions to save / restore VFP / NEON registers like this:
vpush {d8,d10,d11}
vpop {d8,d10,d11}
vpush and vpop do not allow gaps in the register list.
rdar://8728956
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121197 91177308-0d34-0410-b5e6-96231b3b80d8
(if available) as we go so that we get simple constantexprs not insane ones.
This fixes the failure of clang/test/CodeGenCXX/virtual-base-ctor.cpp
that the previous iteration of this patch had.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121111 91177308-0d34-0410-b5e6-96231b3b80d8
as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc
bootstrap on darwin10 using darwin9's assembler and linker.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121006 91177308-0d34-0410-b5e6-96231b3b80d8
memcpy's like:
memcpy(A, B)
memcpy(A, C)
we cannot delete the first memcpy as dead if A and C might be aliases.
If so, we actually get:
memcpy(A, B)
memcpy(A, A)
which is not correct to transform into:
memcpy(A, A)
This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks
Jakub!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120974 91177308-0d34-0410-b5e6-96231b3b80d8
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
additional pipeline stall. So it's frequently better to single codegen
vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
vmla + vmla is very bad. But this isn't ideal either:
vmul
vadd
vmla
Instead, we want to expand the second vmla:
vmla
vmul
vadd
Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
faster.
Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.
A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
vmla / vmls will trigger one of the special hazards.
Work in progress, only A+B are enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8
Also add asserts that the indices are valid in InsertValueInst::init(). ExtractValueInst already asserts when constructed with invalid indices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120956 91177308-0d34-0410-b5e6-96231b3b80d8