types for N32 ABI. Test case will be updated after the patch that fixes
TargetLowering::getPICJumpTableRelocBase is checked in.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154036 91177308-0d34-0410-b5e6-96231b3b80d8
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.
<rdar://problem/11182914>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154033 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154011 91177308-0d34-0410-b5e6-96231b3b80d8
This patch allows llvm to recognize that a 64 bit object file is being produced
and that the subsequently generated ELF header has the correct information.
The test case checks for both big and little endian flavors.
Patch by Jack Carter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153889 91177308-0d34-0410-b5e6-96231b3b80d8
The 440 and A2 cores have detailed itineraries, and this allows them to be
fully used to maximize throughput.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153845 91177308-0d34-0410-b5e6-96231b3b80d8
Post-RA scheduling gives a significant performance improvement on
the embedded cores, so turn it on. Using full anti-dep. breaking is
important for FP-intensive blocks, so turn it on (just on the
embedded cores for now; this should also be good on the 970s because
post-ra scheduling is all that we have for now, but that should have
more testing first).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153843 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a full itinerary for IBM's PPC64 A2 embedded core. These
cores form the basis for the CPUs in the new IBM BG/Q supercomputer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153842 91177308-0d34-0410-b5e6-96231b3b80d8
Loads and stores can have different pipeline behavior, especially on
embedded chips. This change allows those differences to be expressed.
Except for the 440 scheduler, there are no functionality changes.
On the 440, the latency adjustment is only by one cycle, and so this
probably does not affect much. Nevertheless, it will make a larger
difference in the future and this removes a FIXME from the 440 itin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153821 91177308-0d34-0410-b5e6-96231b3b80d8
Dynamic linking on PPC64 has had problems since we had to move the top-down
hazard-detection logic post-ra. For dynamic linking to work there needs to be
a nop placed after every call. It turns out that it is really hard to guarantee
that nothing will be placed in between the call (bl) and the nop during post-ra
scheduling. Previous attempts at fixing this by placing logic inside the
hazard detector only partially worked.
This is now fixed in a different way: call+nop codegen-only instructions. As far
as CodeGen is concerned the pair is now a single instruction and cannot be split.
This solution works much better than previous attempts.
The scoreboard hazard detector is also renamed to be more generic, there is currently
no cpu-specific logic in it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153816 91177308-0d34-0410-b5e6-96231b3b80d8
ARMConstantIslandPass still has bugs where jump table compression can
cause constant pool entries to go out of range.
Add a safety margin of 2 bytes when placing constant islands, but use
the real max displacement for verification.
<rdar://problem/11156595>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153789 91177308-0d34-0410-b5e6-96231b3b80d8
The 8-bit payload is not contiguous in the opcode. Move the upper nibble
over 4 bits into the correct place.
rdar://11158641
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153780 91177308-0d34-0410-b5e6-96231b3b80d8
When an immediate is both a value [t2_]so_imm and a [t2_]so_imm_neg,
we want to use the non-negated form to make sure we prefer the normal
encoding, not the aliased encoding via the negation of, e.g., 'cmp.w'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153770 91177308-0d34-0410-b5e6-96231b3b80d8
Make the non-tied register operand names line up with what the base
class encoding handler expects.
rdar://11157236
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153766 91177308-0d34-0410-b5e6-96231b3b80d8
For 'adds r2, r2, #56' outside of an IT block, the 16-bit encoding T2
can be used for this syntax. Prefer the narrow encoding when possible.
rdar://11156277
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153759 91177308-0d34-0410-b5e6-96231b3b80d8
This pass splits basic blocks to insert constant islands, and it
doesn't recompute the live-in lists. No later passes depend on accurate
liveness information.
This fixes PR12410 where the machine code verifier was complaining.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153700 91177308-0d34-0410-b5e6-96231b3b80d8
We are sometimes allocatinog from the DPair register class which
contains odd-even pairs in addition to the Q registers.
Place the Q registers first in the DPair allocation order as they can be
copied with a single instruction. The odd-even pairs should only be
allocated as a last resort.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153699 91177308-0d34-0410-b5e6-96231b3b80d8
The CMP->CMN alias was matching for an immediate of zero when it
should only match for negative values.
rdar://11129224
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153689 91177308-0d34-0410-b5e6-96231b3b80d8
ARM recently gained DPair, DTriple, and DQuad register classes.
Update copyPhysReg() to handle copies in these register classes.
No test case, it is difficult to make the register allocator emit the
odd copies reliably. The missing DPair copy caused a failure on
partialsums in the nightly test suite.
<rdar://problem/11147997>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153686 91177308-0d34-0410-b5e6-96231b3b80d8
This is a code change to add support for changing instruction sequences of the form:
load
inc/dec of 8/16/32/64 bits
store
into the appropriate X86 inc/dec through memory instruction:
inc[qlwb] / dec[qlwb]
The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153635 91177308-0d34-0410-b5e6-96231b3b80d8
This is a code change to add support for changing instruction sequences of the form:
load
inc/dec of 8/16/32/64 bits
store
into the appropriate X86 inc/dec through memory instruction:
inc[qlwb] / dec[qlwb]
The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153617 91177308-0d34-0410-b5e6-96231b3b80d8
Some targets still mess up the liveness information, but that isn't
verified after MRI->invalidateLiveness().
The verifier can still check other useful things like register classes
and CFG, so it should be enabled after all passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153615 91177308-0d34-0410-b5e6-96231b3b80d8
When an strd instruction doesn't get the registers it wants, it can be
expanded into two str instructions. Make sure the first str doesn't kill
the base register in the case where the base and data registers are
identical:
t2STRi12 %R0<kill>, %R0, 4, pred:14, pred:%noreg
t2STRi12 %R2<kill>, %R0, 8, pred:14, pred:%noreg
<rdar://problem/11101911>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153611 91177308-0d34-0410-b5e6-96231b3b80d8
When a number of sub-register VLRDS instructions are combined into a
VLDM, preserve any super-register implicit defs. This is required to
keep the register scavenger and machine code verifier happy.
Enable machine code verification after ARMLoadStoreOptimizer.
ARM/2012-01-26-CopyPropKills.ll was failing because of this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153610 91177308-0d34-0410-b5e6-96231b3b80d8
The arm_neon intrinsics can create virtual registers from the DPair
register class which allows both even-odd and odd-even D-register pairs.
This fixes PR12389.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153603 91177308-0d34-0410-b5e6-96231b3b80d8
Revert r153519: "ARMLoadStoreOptimizer invalidates register liveness."
These patches caused miscompilations in povray by turning off branch
folding's updating of live-in lists.
It turns out the the late scheduler depends on the live-in lists, even
if it doesn't need correct kill flags.
<rdar://problem/11139228>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153593 91177308-0d34-0410-b5e6-96231b3b80d8
imposes a constraint that GOT16 referring to a local symbol or HI16 has to be
followed immediately by a matching LO16 relocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153553 91177308-0d34-0410-b5e6-96231b3b80d8
them as machine instructions. Directives ".set noat" and ".set at" are now
emitted only at the beginning and end of a function except in the case where
they are emitted to enclose .cpload with an immediate operand that doesn't fit
in 16-bit field or unaligned load/stores.
Also, make the following changes:
- Remove function isUnalignedLoadStore and use a switch-case statement to
determine whether an instruction is an unaligned load or store.
- Define helper function CreateMCInst which generates an instance of an MCInst
from an opcode and a list of operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153552 91177308-0d34-0410-b5e6-96231b3b80d8
If EmitNOAT is true, directives ".set noat" and ".set at" are emitted at the
beginning and end of a function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153528 91177308-0d34-0410-b5e6-96231b3b80d8
This pass tries to update kill flags, but there are still many bugs.
Passes after the load/store optimizer don't need accurate liveness, so
don't even try.
<rdar://problem/11101911>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153519 91177308-0d34-0410-b5e6-96231b3b80d8
MachinePointerInfo when getStore is called to create a node that stores an
argument passed in register to the stack. Without this change, the post RA
scheduler will fail to discover the dependencies between the stores
instructions and the instructions that load from a structure passed by value.
The link to the related discussion is here:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-March/048055.html
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153499 91177308-0d34-0410-b5e6-96231b3b80d8
set it in MipsMCCodeEmitter::getMachineOpValue. Assert in getMachineOpValue if
MachineOperand MO is of an unexpected type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153494 91177308-0d34-0410-b5e6-96231b3b80d8
produces a 32-bit immediate which is consumed by the use. It tries to
fold the immediate by breaking it into two parts and fold them into the
immmediate fields of two uses. e.g
movw r2, #40885
movt r3, #46540
add r0, r0, r3
=>
add.w r0, r0, #3019898880
add.w r0, r0, #30146560
;
However, this transformation is incorrect if the user produces a flag. e.g.
movw r2, #40885
movt r3, #46540
adds r0, r0, r3
=>
add.w r0, r0, #3019898880
adds.w r0, r0, #30146560
Note the adds.w may not set the carry flag even if the original sequence
would.
rdar://11116189
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153484 91177308-0d34-0410-b5e6-96231b3b80d8
The PPC64 SVR4 ABI requires integer stack arguments, and thus the var. args., that
are smaller than 64 bits be zero extended to 64 bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153373 91177308-0d34-0410-b5e6-96231b3b80d8
Code such as:
%vreg100 = setcc %vreg10, -1, SETNE
brcond %vreg10, %tgt
was being incorrectly morphed into
%vreg100 = and %vreg10, 1
brcond %vreg10, %tgt
where the 'and' instruction could be eliminated since
such logic is on 1-bit types in the PTX back-end, leaving
us with just:
brcond %vreg10, %tgt
which essentially gives us inverted branch conditions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153364 91177308-0d34-0410-b5e6-96231b3b80d8
These changes allow us to compile big endian from the command line for 32 bit
Mips targets. This patch will result in code and data actually being produced
in the correct endianess.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153153 91177308-0d34-0410-b5e6-96231b3b80d8
ARMBaseRegisterInfo::canRealignStack was checking for variable-sized objects
but not for stack adjustments around calls. Use hasReservedCallFrame() to
check for both. The hasBasePointer function was already correctly checking
both conditions, so the effect of this was that a base pointer would be used
without checking whether the base pointer register could be reserved. I don't
have a small testcase for this.
<rdar://problem/11075906>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153110 91177308-0d34-0410-b5e6-96231b3b80d8
ARMFrameLowering::hasReservedCallFrame is already checking for variable
sized objects, so there's no point in checking it twice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153109 91177308-0d34-0410-b5e6-96231b3b80d8
This results in things such as
vmovups 16(%rdi), %xmm0
vinsertf128 $1, %xmm0, %ymm0, %ymm0
to be combined to
vinsertf128 $1, 16(%rdi), %ymm0, %ymm0
rdar://11076953
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153092 91177308-0d34-0410-b5e6-96231b3b80d8
X86InstrCompiler.td.
It also adds –mcpu-generic to the legalize-shift-64.ll test so the test
will pass if run on an Intel Atom CPU, which would otherwise
produce an instruction schedule which differs from that which the test expects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153033 91177308-0d34-0410-b5e6-96231b3b80d8
fast-isel before emitting code. If the program bails after code was emitted,
then it could lead to the stack being adjusted more than once (two
CALLSEQ_BEGINs emitted) but being adjuste back only once after the call. This
leads to general badness and gnashing of teeth.
<rdar://problem/11050630>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152959 91177308-0d34-0410-b5e6-96231b3b80d8
It's not a good style idea, as the registers will be laid down in memory in
numerical order, not the order they're in the list, but it's legal. vldm/vstm
are stricter.
rdar://11064740
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152943 91177308-0d34-0410-b5e6-96231b3b80d8
This results in things such as
vmovaps -96(%rbx), %xmm1
vinsertf128 $1, %xmm1, %ymm0, %ymm0
to be combined to
vinsertf128 $1, -96(%rbx), %ymm0, %ymm0
rdar://10643481
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152762 91177308-0d34-0410-b5e6-96231b3b80d8
instruction's destination operand like it does for the source operand.
Also fix a typo in the comment for X86AsmParser::isSrcOp().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152654 91177308-0d34-0410-b5e6-96231b3b80d8