Like nearbyint, rint can be implemented on PPC using the frin instruction. The
complication comes from the fact that rint needs to set the FE_INEXACT flag
when the result does not equal the input value (and frin does not do that). As
a result, we use a custom inserter which, after the rounding, compares the
rounded value with the original, and if they differ, explicitly sets the XX bit
in the FPSCR register (which corresponds to FE_INEXACT).
Once LLVM has better modeling of the floating-point environment we should be
able to (often) eliminate this extra complexity.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178362 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions are available on the P5x (and later) and on the A2. They
implement the standard floating-point rounding operations (floor, trunc, etc.).
One caveat: frin (round to nearest) does not implement "ties to even", and so
is only enabled in fast-math mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178337 91177308-0d34-0410-b5e6-96231b3b80d8
Mips assembler supports macros that allows the OR instruction
to have an immediate parameter. This patch adds an instruction
alias that converts this macro into a Mips ORI instruction.
Contributer: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178316 91177308-0d34-0410-b5e6-96231b3b80d8
- RDRAND always clears the destination value when a random value is not
available (i.e. CF == 0). This value is truncated or zero-extended as
the false boolean value to be returned. Boolean simplification needs
to skip this 'zext' or 'trunc' node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178312 91177308-0d34-0410-b5e6-96231b3b80d8
Mips assembler allows following to be used as aliased instructions:
jal $rs for jalr $rs
jal $rd,$rd for jalr $rd,$rs
This patch provides alias definitions in td files and test cases to show the usage.
Contributer: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178304 91177308-0d34-0410-b5e6-96231b3b80d8
Compiling in 32-bit mode on a P7 would assert after 64-bit DAG combines were
added for bswap with load/store. This is because these combines are really only
valid in 64-bit mode, regardless of the CPU (and this was not being checked).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178286 91177308-0d34-0410-b5e6-96231b3b80d8
Since we handle optimizable objc_retainBlocks through strength reduction
in OptimizableIndividualCalls, we know that all code after that point
will only see non-optimizable objc_retainBlock calls. IsForwarding is
only called by functions after that point, so it is ok to just classify
objc_retainBlock as non-forwarding.
<rdar://problem/13249661>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178285 91177308-0d34-0410-b5e6-96231b3b80d8
If an objc_retainBlock has the copy_on_escape metadata attached to it
AND if the block pointer argument only escapes down the stack, we are
allowed to strength reduce the objc_retainBlock to to an objc_retain and
thus optimize it.
Current there is logic in the ARC data flow analysis to handle
this case which is complicated and involved making distinctions in
between objc_retainBlock and objc_retain in certain places and
considering them the same in others.
This patch simplifies said code by:
1. Performing the strength reduction in the initial ARC peephole
analysis (ObjCARCOpts::OptimizeIndividualCalls).
2. Changes the ARC dataflow analysis (which runs after the peephole
analysis) to consider all objc_retainBlock calls to not be optimizable
(since if the call was optimizable, we would have strength reduced it
already).
This patch leaves in the infrastructure in the ARC dataflow analysis to
handle this case, which due to 2 will just be dead code. I am doing this
on purpose to separate the removal of the old code from the testing of
the new code.
<rdar://problem/13249661>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178284 91177308-0d34-0410-b5e6-96231b3b80d8
These are 64-bit load/store with byte-swap, and available on the P7 and the A2.
Like the similar instructions for 16- and 32-bit words, these are matched in the
target DAG-combine phase against load/store-bswap pairs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178276 91177308-0d34-0410-b5e6-96231b3b80d8
PPC ISA 2.06 (P7, A2, etc.) has a popcntd instruction. Add this instruction and
tell TTI about it so that popcount-loop recognition will know about it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178233 91177308-0d34-0410-b5e6-96231b3b80d8
There were a few places where kill flags were not being set correctly, and
where 32-bit instruction variants were being used with 64-bit registers. After
r178180, this code was being triggered causing llc to assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178220 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 342d92c7a0adeabc9ab00f3f0d88d739fe7da4c7.
Turns out we're going with a different schema design to represent
DW_TAG_imported_modules so we won't need this extra field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178215 91177308-0d34-0410-b5e6-96231b3b80d8
form of call in preference to memory indirect on Atom.
In this case, the patch applies the optimization to the code for reloading
spilled registers.
The patch also includes changes to sibcall.ll and movgs.ll, which were
failing on the Atom buildbot after the first patch was applied.
This patch by Sriram Murali.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178193 91177308-0d34-0410-b5e6-96231b3b80d8
Made sure we were looking a correct section
Added Mips32/64 as an extra check
Updated llvm-objdump to generate symbolic info for Mips relocations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178190 91177308-0d34-0410-b5e6-96231b3b80d8
indirect through a memory address is to load the memory address into
a register and then call indirect through the register.
This patch implements this improvement by modifying SelectionDAG to
force a function address which is a memory reference to be loaded
into a virtual register.
Patch by Sriram Murali.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178171 91177308-0d34-0410-b5e6-96231b3b80d8
6 more piglit tests.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178145 91177308-0d34-0410-b5e6-96231b3b80d8
It seems that the Darwin PPC assembler requires r0 to be written as 0 when it
means 0 (at least in lwarx/stwcx.). Fixes PR15605.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178142 91177308-0d34-0410-b5e6-96231b3b80d8
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178127 91177308-0d34-0410-b5e6-96231b3b80d8
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178126 91177308-0d34-0410-b5e6-96231b3b80d8
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178125 91177308-0d34-0410-b5e6-96231b3b80d8
The R0 register can now be allocated because instructions
that cannot use R0 as a GPR have been appropriately marked.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178123 91177308-0d34-0410-b5e6-96231b3b80d8
Some implementation detail in the forgotten past required the link
register to be placed in the GPRC and G8RC register classes. This is
just wrong on the face of it, and causes several extra intersection
register classes to be generated. I found this was having evil
effects on instruction scheduling, by causing the wrong register class
to be consulted for register pressure decisions.
No code generation changes are expected, other than some minor changes
in instruction order. Seven tests in the test bucket required minor
tweaks to adjust to the new normal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178114 91177308-0d34-0410-b5e6-96231b3b80d8
The test was removed since I had not turned off the test during release
builds. This fails since ARC annotations support is conditionally
compiled out during release builds. I added the proper requires header
to assuage this issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178101 91177308-0d34-0410-b5e6-96231b3b80d8
This is just the basic groundwork for supporting DW_TAG_imported_module but I
wanted to commit this before pushing support further into Clang or LLVM so that
this rather churny change is isolated from the rest of the work. The major
churn here is obviously adding another field (within the common DIScope prefix)
to all DIScopes (files, classes, namespaces, lexical scopes, etc). This should
be the last big churny change needed for DW_TAG_imported_module/using directive
support/PR14606.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178099 91177308-0d34-0410-b5e6-96231b3b80d8
As Bill Schmidt pointed out to me, only on Darwin do we need to spill/restore
VRSAVE in the SjLj code. For non-Darwin, don't spill/restore VRSAVE (and I've
added some asserts to make sure that we're not).
As it turns out, we're not currently handling the Darwin case correctly (I've
added a FIXME in the test case). I've tried adding various implied register
definitions/uses to force the spill without success, so I'll need to address
this later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178096 91177308-0d34-0410-b5e6-96231b3b80d8
All Intel CPUs since Yonah look a lot alike, at least at the granularity
of the scheduling models. We can add more accurate models for
processors that aren't Sandy Bridge if required. Haswell will probably
need its own.
The Atom processor and anything based on NetBurst is completely
different. So are the non-Intel chips.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178080 91177308-0d34-0410-b5e6-96231b3b80d8
Now that the register scavenger can support multiple spill slots, and PEI can
use virtual-register-based scavenging for multiple simultaneous registers, we
can use a virtual register for the transfer register in the CR spilling code.
This should eliminate the last place (outside of the prologue/epilogue) where
we depend on the unconditional availability of the r0 register. We will soon be
able to allocate it (in a somewhat restricted sense) as a GPR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178060 91177308-0d34-0410-b5e6-96231b3b80d8
The previous algorithm could not deal properly with scavenging multiple virtual
registers because it kept only one live virtual -> physical mapping (and
iterated through operands in order). Now we don't maintain a current mapping,
but rather use replaceRegWith to completely remove the virtual register as
soon as the mapping is established.
In order to allow the register scavenger to return a physical register killed
by an instruction for definition by that same instruction, we now call
RS->forward(I) prior to eliminating virtual registers defined in I. This
requires a minor update to forward to ignore virtual registers.
These new features will be tested in forthcoming commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178058 91177308-0d34-0410-b5e6-96231b3b80d8
- 'prefetch' intrinsics are only lowered when SSE is available. On non-X86
builds, 'generic' CPU is used and stops lowering any prefetch intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178046 91177308-0d34-0410-b5e6-96231b3b80d8