This patch will optimize the following cases:
sub r1, r3 | sub r1, imm
cmp r3, r1 or cmp r1, r3 | cmp r1, imm
bge L1
TO
subs r1, r3
bge L1 or ble L1
If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.
rdar: 10734411
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156599 91177308-0d34-0410-b5e6-96231b3b80d8
This patch will optimize the following cases:
sub r1, r3 | sub r1, imm
cmp r3, r1 or cmp r1, r3 | cmp r1, imm
bge L1
TO
subs r1, r3
bge L1 or ble L1
If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.
rdar: 10734411
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156550 91177308-0d34-0410-b5e6-96231b3b80d8
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).
So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.
Patch by Yiannis Tsiouris!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156328 91177308-0d34-0410-b5e6-96231b3b80d8
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.
Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.
I'm not entirely happy with the name of this flag, suggestions welcome ;)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156233 91177308-0d34-0410-b5e6-96231b3b80d8
This moves the logic for selecting a TLS model to a single place,
instead of the previous three (ARM, Mips, and X86 which already
uses this function).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156162 91177308-0d34-0410-b5e6-96231b3b80d8
for the assembler and disassembler. Which were not being set/read correctly
for offsets greater than 22 bits in some cases.
Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156118 91177308-0d34-0410-b5e6-96231b3b80d8
Expressions for movw/movt don't always have an :upper16: or :lower16:
on them and that's ok. When they don't, it's just a plain [0-65536]
immediate result, effectively the same as a :lower16: variant kind.
rdar://10550147
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155941 91177308-0d34-0410-b5e6-96231b3b80d8
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a null reference to a null l-value. Make the
ivar a pointer instead.
PR12468
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155902 91177308-0d34-0410-b5e6-96231b3b80d8
Replace some assert() calls w/ actual diagnostics. In a perfect world,
there'd be range checks on these values long before things ever reached
this code. For now, though, issuing a better-late-than-never diagnostic
is still a big improvement over assert().
rdar://11347287
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155851 91177308-0d34-0410-b5e6-96231b3b80d8
This was exposed by SingleSource/UnitTests/Vector/constpool.c.
The computed size of a basic block isn't always a multiple of its known
alignment, and that can introduce extra alignment padding after the
block.
<rdar://problem/11347135>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155845 91177308-0d34-0410-b5e6-96231b3b80d8
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal. Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155824 91177308-0d34-0410-b5e6-96231b3b80d8
The code could search past the end of the basic block when there was
already a constant pool entry after the block.
Test case with giant basic block in SingleSource/UnitTests/Vector/constpool.c
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155753 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure when parsing the Thumb1 sp+register ADD instruction that
the source and destination operands match. In thumb2, just use the
wide encoding if they don't. In Thumb1, issue a diagnostic.
rdar://11219154
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155748 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, ARMConstantIslandPass would conservatively compute the
address of an aligned basic block as:
RoundUpToAlignment(Offset + UnknownPadding)
This worked fine for the layout algorithm itself, but it could fool the
verify() function because it accounts for alignment padding twice: Once
when adding the worst case UnknownPadding, and again by rounding up the
fictional block offset. This meant that when optimizeThumb2Instructions
would shrink an instruction, the conservative distance estimate could
grow. That shouldn't be possible since the woorst case alignment padding
wss already included.
This patch drops the use of RoundUpToAlignment, and depends only on
worst case padding to compute conservative block offsets. This has the
weird effect that the computed offset for an aligned block may not be
aligned.
The important difference is that shrinking an instruction can never
cause the estimated distance between two instructions to grow. The
estimated distance is always larger than the real distance that only the
assembler knows.
<rdar://problem/11339352>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155744 91177308-0d34-0410-b5e6-96231b3b80d8
The base address for the PC-relative load is Align(PC,4), so it's the
address of the word containing the 16-bit instruction, not the address
of the instruction itself. Ugh.
rdar://11314619
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155659 91177308-0d34-0410-b5e6-96231b3b80d8
On some cores it's a bad idea for performance to mix VFP and NEON instructions
and since these patterns are NEON anyway, the NEON load should be used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155630 91177308-0d34-0410-b5e6-96231b3b80d8
the feature set of v7a. This comes about if the user specifies something like
-arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as
uxtab in this case.
rdar://11318438
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155601 91177308-0d34-0410-b5e6-96231b3b80d8
When an instruction match is found, but the subtarget features it
requires are not available (missing floating point unit, or thumb vs arm
mode, for example), issue a diagnostic that identifies what the feature
mismatch is.
rdar://11257547
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155499 91177308-0d34-0410-b5e6-96231b3b80d8
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.
This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.
This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.
The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes the all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().
It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.
It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
to avoid running into the added assertion.
Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.
Patch by Andy Zhang!
Thanks to Jakob and Anton for their reviews.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155395 91177308-0d34-0410-b5e6-96231b3b80d8