Triple class constructor. Only valid triples should now be used
inside LLVM - front-ends are now responsable for rejecting or
correcting invalid target triples. The Triple::normalize method
can be used to straighten out funky triples provided by users.
Give this a whirl through the buildbots to see if I caught all
places where triples enter LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112470 91177308-0d34-0410-b5e6-96231b3b80d8
optional modified register (instead of reg0). Along with r112461 it will make
sure that the optional define of CPSR is marked as "def" and will thus mark the
instructions using these classes (t2ANDS*) as setting the 's' flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112462 91177308-0d34-0410-b5e6-96231b3b80d8
instead of PromoteMemToReg. This allows it to stop using DF and DT,
eliminating a computation of DT and DF from clang -O3. Clang is now
down to 2 runs of DomFrontier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112457 91177308-0d34-0410-b5e6-96231b3b80d8
assertingvh so we get a violent explosion if the pointer dangles.
2) Fix AliasSetTracker::deleteValue to remove call sites with
by-pointer comparisons instead of by-alias queries. Using
findAliasSetForCallSite can cause alias sets to get merged
when they shouldn't, and can also miss alias sets when the
call is readonly.
#2 fixes PR6889, which only repros with a .c file :(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112452 91177308-0d34-0410-b5e6-96231b3b80d8
LICM correctly. When sinking an instruction, it should not add
entries for the sunk instruction to the AST, it should remove
the entry for the sunk instruction. The blocks being sunk to
are not in the loop, so their instructions shouldn't be in the
AST (yet)!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112447 91177308-0d34-0410-b5e6-96231b3b80d8
keeping them around until the pass is destroyed, keep them
around a) just when useful (not for outer loops) and b) destroy
them right after we use them. This should reduce memory use
and fixes potential bugs where a loop is deleted and another
loop gets allocated to the same address.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112446 91177308-0d34-0410-b5e6-96231b3b80d8
other filtering techniques, as those may allow it to filter
out more obviously unprofitable candidates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112441 91177308-0d34-0410-b5e6-96231b3b80d8
LSRInstance data structures up to date. This fixes some
pessimizations caused by stale data which will be exposed
in an upcoming change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112440 91177308-0d34-0410-b5e6-96231b3b80d8
all applicable addrecs before recursing on getMulExpr, instead of
recursing on getMulExpr for each one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112433 91177308-0d34-0410-b5e6-96231b3b80d8
IR add/sub operations with one or both operands sign- or zero-extended.
Auto-upgrade the old intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112416 91177308-0d34-0410-b5e6-96231b3b80d8
of the sets is volatile. We were dropping the volatile bit of the
merged in set, leading (luckily) to assertions in cases like
PR7535. I cannot produce a testcase that repros with opt, but this
is obviously correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112402 91177308-0d34-0410-b5e6-96231b3b80d8
times. This patch causes llc and llvm-mc (which both default to
verbose-asm) to print out comments after a few common shuffle
instructions which indicates the shuffle mask, e.g.:
insertps $113, %xmm3, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm3[1]
unpcklps %xmm1, %xmm0 ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
pshufd $1, %xmm1, %xmm1 ## xmm1 = xmm1[1,0,0,0]
This is carefully factored to keep the information extraction (of the
shuffle mask) separate from the printing logic. I plan to move the
extraction part out somewhere else at some point for other parts of
the x86 backend that want to introspect on the behavior of shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112387 91177308-0d34-0410-b5e6-96231b3b80d8
when the top elements of a vector are undefined. This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined. For example, on:
_Complex float f32(_Complex float A, _Complex float B) {
return A+B;
}
We used to produce (with SSE2, SSE4.1+ uses insertps):
_f32: ## @f32
movdqa %xmm0, %xmm2
addss %xmm1, %xmm2
pshufd $16, %xmm2, %xmm2
pshufd $1, %xmm1, %xmm1
pshufd $1, %xmm0, %xmm0
addss %xmm1, %xmm0
pshufd $16, %xmm0, %xmm1
movdqa %xmm2, %xmm0
unpcklps %xmm1, %xmm0
ret
We now produce:
_f32: ## @f32
movdqa %xmm0, %xmm2
addss %xmm1, %xmm2
pshufd $1, %xmm1, %xmm1
pshufd $1, %xmm0, %xmm3
addss %xmm1, %xmm3
movaps %xmm2, %xmm0
unpcklps %xmm3, %xmm0
ret
This implements rdar://8368414
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112378 91177308-0d34-0410-b5e6-96231b3b80d8
a new EltStride variable instead of reusing NumElems variable
for a non-obvious purpose. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112377 91177308-0d34-0410-b5e6-96231b3b80d8
According to the Microsoft documentation here:
http://msdn.microsoft.com/en-us/library/ms724284%28VS.85%29.aspx
this cast used in lib/System/Win32/Path.inc:
__int64 ft = *reinterpret_cast<__int64*>(&fi.ftLastWriteTime);
should not be done. The documentation says: "Do not cast a pointer to a
FILETIME structure to either a ULARGE_INTEGER* or __int64* value because
it can cause alignment faults on 64-bit Windows."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112376 91177308-0d34-0410-b5e6-96231b3b80d8
Also teach this logic how to handle target specific shuffles if
needed, this is necessary while searching recursively for zeroed
scalar elements in vector shuffle operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112348 91177308-0d34-0410-b5e6-96231b3b80d8
the special values that for ARM would be used with IB or DA modes. Fall
through and consider materializing a new base address is it would be
profitable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112329 91177308-0d34-0410-b5e6-96231b3b80d8
all the other LDM/STM instructions. This fixes asm printer crashes when
compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.
Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier. Much of the backend
was not aware of these special cases. The crashes occured when rewriting
a frameindex caused the AM5 offset field to be changed so that it did not
have a valid submode. I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON. Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112322 91177308-0d34-0410-b5e6-96231b3b80d8
A = shl x, 42
...
B = lshr ..., 38
which can be transformed into:
A = shl x, 4
...
iff we can prove that the would-be-shifted-in bits
are already zero. This eliminates two shifts in the testcase
and allows eliminate of the whole i128 chain in the real example.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112314 91177308-0d34-0410-b5e6-96231b3b80d8
framework, which is good at ripping through bitfield
operations. This generalize a bunch of the existing
xforms that instcombine does, such as
(x << c) >> c -> and
to handle intermediate logical nodes. This is useful for
ripping up the "promote to large integer" code produced by
SRoA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112304 91177308-0d34-0410-b5e6-96231b3b80d8
transformation collect all the addrecs with the same loop
add combine them at once rather than starting everything over
at the first chance.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112290 91177308-0d34-0410-b5e6-96231b3b80d8
computation can be truncated if it is fed by a sext/zext that doesn't
have to be exactly equal to the truncation result type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112285 91177308-0d34-0410-b5e6-96231b3b80d8
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:
%94 = zext i16 %93 to i32 ; <i32> [#uses=2]
%96 = lshr i32 %94, 8 ; <i32> [#uses=1]
%101 = trunc i32 %96 to i8 ; <i8> [#uses=1]
This also unblocks other xforms from happening, now clang is able to compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
pshufd $1, %xmm0, %xmm2
addss %xmm0, %xmm2
movdqa %xmm1, %xmm3
addss %xmm2, %xmm3
pshufd $1, %xmm1, %xmm0
addss %xmm3, %xmm0
ret
on x86-64, instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
This seems pretty close to optimal to me, at least without
using horizontal adds. This also triggers in lots of other
code, including SPEC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112278 91177308-0d34-0410-b5e6-96231b3b80d8
Update all the tests using those intrinsics and add support for
auto-upgrading bitcode files with the old versions of the intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112271 91177308-0d34-0410-b5e6-96231b3b80d8
return to avoid needing two calls to test for equivalence, and sort
addrecs by their degree before examining their operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112267 91177308-0d34-0410-b5e6-96231b3b80d8
virtual base registers handle this function, and more. A bit more cleanup
to do on the interface to eliminateFrameIndex() after this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112237 91177308-0d34-0410-b5e6-96231b3b80d8
still having a significant effect. It shouldn't be now that the pre-RA
virtual base reg stuff is in. Assuming that's valididated by the nightly
testers, we can simplify a lot of the PEI frame index code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112220 91177308-0d34-0410-b5e6-96231b3b80d8
fix: add a flag to MapValue and friends which indicates whether
any module-level mappings are being made. In the common case of
inlining, no module-level mappings are needed, so MapValue doesn't
need to examine non-function-local metadata, which can be very
expensive in the case of a large module with really deep metadata
(e.g. a large C++ program compiled with -g).
This flag is a little awkward; perhaps eventually it can be moved
into the ClonedCodeInfo class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112190 91177308-0d34-0410-b5e6-96231b3b80d8
comparison with 0. These two pieces of code should give identical results:
rsbs r1, r1, 0
cmp r0, r1
mov r0, #0
it ls
mov r0, #1
and:
cmn r0, r1
mov r0, #0
it ls
mov r0, #1
However, the CMN gives the *opposite* result when r1 is 0. This is because the
carry flag is set in the CMP case but not in the CMN case. In short, the CMP
instruction doesn't perform a truncate of the (logical) NOT of 0 plus the value
of r0 and the carry bit (because the "carry bit" parameter to AddWithCarry is
defined as 1 in this case, the carry flag will always be set when r0 >= 0). The
CMN instruction doesn't perform a NOT of 0 so there is never a "carry" when this
AddWithCarry is performed (because the "carry bit" parameter to AddWithCarry is
defined as 0).
The AddWithCarry in the CMP case seems to be relying upon the identity:
~x + 1 = -x
However when x is 0 and unsigned, this doesn't hold:
x = 0
~x = 0xFFFF FFFF
~x + 1 = 0x1 0000 0000
(-x = 0) != (0x1 0000 0000 = ~x + 1)
Therefore, we should disable *all* versions of CMN, especially when comparing
against zero, until we can limit when the CMN instruction is used (when we know
that the RHS is not 0) or when we have a hardware fix for this.
(See the ARM docs for the "AddWithCarry" pseudo-code.)
This is related to <rdar://problem/7569620>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112176 91177308-0d34-0410-b5e6-96231b3b80d8
with the VST4 instructions. Until after register allocation, we want to
represent sets of adjacent registers by a single super-register. These
VST4 pseudo instructions have a single QQ or QQQQ source register operand.
They get expanded to the real VST4 instructions with 4 separate D register
operands. Once this conversion is complete, we'll be able to remove the
NEONPreAllocPass and avoid some fragile and hacky code elsewhere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112108 91177308-0d34-0410-b5e6-96231b3b80d8
expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats. This
affects two places in the code: handling cross block values and handling
function return and arguments. Since vectors are already widened by
legalizetypes, this gives us much better code and unblocks x86-64 abi
and SPU abi work.
For example, this (which is a silly example of a cross-block value):
define <4 x float> @test2(<4 x float> %A) nounwind {
%B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1>
%C = fadd <2 x float> %B, %B
br label %BB
BB:
%D = fadd <2 x float> %C, %C
%E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
ret <4 x float> %E
}
Now compiles into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
addps %xmm0, %xmm0
ret
previously it compiled into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
pshufd $1, %xmm0, %xmm1
## kill: XMM0<def> XMM0<kill> XMM0<def>
insertps $0, %xmm0, %xmm0
insertps $16, %xmm1, %xmm0
addps %xmm0, %xmm0
ret
This implements rdar://8230384
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112101 91177308-0d34-0410-b5e6-96231b3b80d8
from MDValueList between each function, now that the bitcode
writer is reusing the index space for function-local metadata.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112082 91177308-0d34-0410-b5e6-96231b3b80d8
comparison that would overflow.
- The other under/overflow cases can't actually happen because the immediates
which would trigger them are legal (so we don't enter this code), but
adjusted the style to make it clear the transform is always valid.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112053 91177308-0d34-0410-b5e6-96231b3b80d8
Mark _alloca call as clobberring EFLAGS, otherwise some DCE might remove
other flags-clobberring stuff (e.g. cmp instructions) occuring after
_alloca call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112034 91177308-0d34-0410-b5e6-96231b3b80d8
any load in the default address space that completes implies that the base value that it GEP'd from
was not null.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112015 91177308-0d34-0410-b5e6-96231b3b80d8
for MDNodes, since this information is effectively implied by
the operands. This allow allows the code to avoid doing a
recursive is-it-really-function-local check in some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111995 91177308-0d34-0410-b5e6-96231b3b80d8
needed parsing for the .loc directive and saves the current info from that
into the context. The next patch will take the current loc info after an
instruction is assembled and save that info into a vector for each section for
use to build the line number tables. The patch after that will encode the info
from those vectors into the output file as the dwarf line tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111956 91177308-0d34-0410-b5e6-96231b3b80d8
For now it's still a command line option, but the interface to the generic
code doesn't need to know that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111942 91177308-0d34-0410-b5e6-96231b3b80d8
which does the same thing. This eliminates redundant code and
handles MDNodes better. MDNode linking still doesn't fully
work yet though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111941 91177308-0d34-0410-b5e6-96231b3b80d8
- Implemented by manually splicing the tokens. If this turns out to be
problematically platform specific, a more elegant solution would be to
implement some context dependent lexing support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111934 91177308-0d34-0410-b5e6-96231b3b80d8
Intended to help ease reproducing problems by increasing base register usage
after heuristics for only using the when needed are in place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111930 91177308-0d34-0410-b5e6-96231b3b80d8
that it avoids a lot of unnecessary cloning by avoiding remapping
MDNode cycles when none of the nodes in the cycle actually need to
be remapped. Also it uses the new temporary MDNode mechanism.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111922 91177308-0d34-0410-b5e6-96231b3b80d8