Currently, we canonicalize shuffles that produce a result larger than
their operands with:
shuffle(concat(v1, undef), concat(v2, undef))
->
shuffle(concat(v1, v2), undef)
because we can access quad vectors (see PerformVECTOR_SHUFFLECombine).
This is useful in the general case, but there are special cases where
native shuffles produce larger results: the two-result ops.
We can look through the concat when lowering them:
shuffle(concat(v1, v2), undef)
->
concat(VZIP(v1, v2):0, :1)
This lets us generate the native shuffles instead of scalarizing to
dozens of VMOVs.
Differential Revision: http://reviews.llvm.org/D10424
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240118 91177308-0d34-0410-b5e6-96231b3b80d8
The personality routine currently lives in the LandingPadInst.
This isn't desirable because:
- All LandingPadInsts in the same function must have the same
personality routine. This means that each LandingPadInst beyond the
first has an operand which produces no additional information.
- There is ongoing work to introduce EH IR constructs other than
LandingPadInst. Moving the personality routine off of any one
particular Instruction and onto the parent function seems a lot better
than have N different places a personality function can sneak onto an
exceptional function.
Differential Revision: http://reviews.llvm.org/D10429
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239940 91177308-0d34-0410-b5e6-96231b3b80d8
ARMTargetParser::getFPUFeatures should disable fp16 whenever it
disables vfp4, as otherwise something like -mcpu=cortex-a7 -mfpu=none
leaves us with fp16 enabled (though the only effect that will have is
a wrong build attribute).
Differential Revision: http://reviews.llvm.org/D10397
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239599 91177308-0d34-0410-b5e6-96231b3b80d8
that was resetting it.
Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis.
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.
Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.
rdar://problem/13752163
Differential Revision: http://reviews.llvm.org/D10099
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239427 91177308-0d34-0410-b5e6-96231b3b80d8
on a per-function basis.
Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.
rdar://problem/20542263
Differential Revision: http://reviews.llvm.org/D8717
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239325 91177308-0d34-0410-b5e6-96231b3b80d8
These are added mainly for the benefit of clang, but this also means that they
are now allowed in .fpu directives and we emit the correct .fpu directive when
single-precision-only is used.
Differential Revision: http://reviews.llvm.org/D10238
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239151 91177308-0d34-0410-b5e6-96231b3b80d8
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.
Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.
Differential Revision: http://reviews.llvm.org/D9694
Recommiting after the revert in r238821, the buildbot still failed with
the patch removed so there seems to be another reason for the breakage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238935 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot.
Since self-hosting issues with Clang are hard to investigate, I'm taking the
liberty to revert now, so we can investigate it offline.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238821 91177308-0d34-0410-b5e6-96231b3b80d8
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.
Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.
Differential Revision: http://reviews.llvm.org/D9694
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238795 91177308-0d34-0410-b5e6-96231b3b80d8
The original version didn't properly account for the base register
being modified before the final jump, so caused miscompilations in
Chromium and LLVM. I've fixed this and tested with an LLVM self-host
(I don't have the means to build & test Chromium).
The general idea remains the same: in pathological cases jump tables
can be too far away from the instructions referencing them (like other
constants) so they need to be movable.
Should fix PR23627.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238680 91177308-0d34-0410-b5e6-96231b3b80d8
This is part of the work to remove TargetMachine::resetTargetOptions.
In this patch, instead of updating global variable NoFramePointerElim in
resetTargetOptions, its use in DisableFramePointerElim is replaced with a call
to TargetFrameLowering::noFramePointerElim. This function determines on a
per-function basis if frame pointer elimination should be disabled.
There is no change in functionality except that cl:opt option "disable-fp-elim"
can now override function attribute "no-frame-pointer-elim".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238080 91177308-0d34-0410-b5e6-96231b3b80d8
Ideally this is going to be and LLVM IR pass (shared, among others
with AArch64), but for the time being just enable it if consumers
ask us for optimization and not unconditionally.
Discussed with Tim Northover on IRC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237837 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, they were forced to immediately follow the actual branch
instruction. This was usually OK (the LEAs actually accessing them got emitted
nearby, and weren't usually separated much afterwards). Unfortunately, a
sufficiently nasty phi elimination dumps many instructions right before the
basic block terminator, and this can increase the range too much.
This patch frees them up to be placed as usual by the constant islands pass,
and consequently has to slightly modify the form of TBB/TBH tables to refer to
a PC-relative label at the final jump. The other jump table formats were
already position-independent.
rdar://20813304
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237590 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements LLVM support for the ACLE special register intrinsics in
section 10.1, __arm_{w,r}sr{,p,64}.
This patch is intended to lower the read/write_register instrinsics, used to
implement the special register intrinsics in the clang patch for special
register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific
instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor
registers in AArch32 and AArch64. This is done by inspecting the register
string passed to the intrinsic and then lowering to the appropriate
instruction.
Patch by Luke Cheeseman.
Differential Revision: http://reviews.llvm.org/D9699
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237579 91177308-0d34-0410-b5e6-96231b3b80d8
Other targets probably should as well. Since r237161, compiler-rt has
both, but I don't see why anything other than gnueabi would use a
gnueabi naming scheme.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237324 91177308-0d34-0410-b5e6-96231b3b80d8
DEBUG_VALUE nodes do not take part in code generation. Ignore them when
performing KILL updates. Addresses PR23486.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237211 91177308-0d34-0410-b5e6-96231b3b80d8
AEABI defines aligned variants of memcpy etc. that can be faster than
the default version due to not having to do alignment checks. When
emitting target code for these functions make use of these aligned
variants if possible. Also convert memset to memclr if possible.
Differential Revision: http://reviews.llvm.org/D8060
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237127 91177308-0d34-0410-b5e6-96231b3b80d8
to use the information in the module rather than TargetOptions.
We've had and clang has used the use-soft-float attribute for some
time now so have the backends set a subtarget feature based on
a particular function now that subtargets are created based on
functions and function attributes.
For the one middle end soft float check go ahead and create
an overloadable TargetLowering::useSoftFloat function that
just checks the TargetSubtargetInfo in all cases.
Also remove the command line option that hard codes whether or
not soft-float is set by using the attribute for all of the
target specific test cases - for the generic just go ahead and
add the attribute in the one case that showed up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237079 91177308-0d34-0410-b5e6-96231b3b80d8
When emitting something like 'add x, 1000' if we remat the 1000 then we should be able to
mark the vreg containing 1000 as killed. Given that we go bottom up in fast-isel, a later
use of 1000 will be higher up in the BB and won't kill it, or be impacted by the lower kill.
However, rematerialised constant expressions aren't generated bottom up. The local value save area
grows downwards. This means that if you remat 2 constant expressions which both use 1000 then the
first will kill it, then the second, which is *lower* in the BB will read a killed register.
This is the case in the attached test where the 2 GEPs both need to generate 'add x, 6680' for the constant offset.
Note that this commit only makes kill flag generation conservative. There's nothing else obviously wrong with
the local value save area growing downwards, and in fact it needs to for handling arbitrarily complex constant expressions.
However, it would be nice if there was a solution which would let us generate more accurate kill flags, or just kill flags completely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236922 91177308-0d34-0410-b5e6-96231b3b80d8
The code that builds the dependence graph assumes that two PseudoSourceValues
don't alias. In a tail calling function two FixedStackObjects might refer to the
same location. Worse 'immutable' fixed stack objects like function arguments are
not immutable and will be clobbered.
Change this so that a load from a FixedStackObject is not invariant in a tail
calling function and don't return a PseudoSourceValue for an instruction in tail
calling functions when building the dependence graph so that we handle function
arguments conservatively.
Fix for PR23459.
rdar://20740035
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236916 91177308-0d34-0410-b5e6-96231b3b80d8
When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register. This is done via updateValueMap,.
However, its possible that the source register we are rewriting *to* to also have uses. If those uses are after a kill of the value we are rewriting *from* then we have uses after a kill and the verifier fails.
This code checks for the case where the to register is also used, and if so it clears all kill on the from register. This is conservative, but better that always clearing kills on the from register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236897 91177308-0d34-0410-b5e6-96231b3b80d8
If we duplicate an instruction then we must also clear kill flags on any uses we rewrite.
Otherwise we might be killing a register which was used in other BBs.
For example, here the entry BB ended up with these instructions, the ADD having been tail duplicated.
%vreg24<def> = t2ADDri %vreg10<kill>, 1, pred:14, pred:%noreg, opt:%noreg; GPRnopc:%vreg24 rGPR:%vreg10
%vreg22<def> = COPY %vreg10; GPR:%vreg22 rGPR:%vreg10
The copy here is inserted after the add and so needs vreg10 to be live.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236782 91177308-0d34-0410-b5e6-96231b3b80d8
We had code such as this:
r2 = ...
t2Bcc
label1:
ldr ... r2
label2;
return r2<dead, def>
The if converter was transforming this to
r2<def> = ...
return [pred] r2<dead,def>
ldr <r2, kill>
return
which fails the machine verifier because the ldr now reads from a dead def.
The fix here detects dead defs in stepForward and passes them back to the caller in the clobbers list. The caller then clears the dead flag from the def is the value is live.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236660 91177308-0d34-0410-b5e6-96231b3b80d8
If called twice in the same BB on the same constant, FastISel::fastEmit_ri_ was marking the materialized vreg as killed on each use, instead of only the last use.
Change this to only mark the last use as killed by making earlier uses check if the vreg is already used elsewhere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236650 91177308-0d34-0410-b5e6-96231b3b80d8
With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add.
However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it."
This commit disables SelectBinaryFPOp for any vector types. There's already a FIXME to try handle neon. Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 || VT == MVT::i64'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236609 91177308-0d34-0410-b5e6-96231b3b80d8
Since r234249, i1 are sext instead of zext; because of that, doing
"CMP rN, #0; IT EQ/NE" isn't correct anymore.
"TST #1" is the conservatively correct alternative - the tradeoff being
that it doesn't have a 16-bit encoding -, so use that instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236569 91177308-0d34-0410-b5e6-96231b3b80d8
Note, this is a recommit of r236515 after fixing an error in r236514. The buildbot ran fast enough that it picked up r236514 prior to r236515 and threw an error. r236515 itself ran 'make check' without errors.
Original commit message follows:
A regmask (typically seen on a call) clobbers the set of registers it lists. The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.
These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier. Otherwise, uses after the if converted call could think they are reading an undefined register.
Reviewed by Matthias Braun and Quentin Colombet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236550 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit b27413cbfd78d959c18e713bfa271fb69e6b3303 (ie r236515).
This is to get the bots green while i investigate the failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236517 91177308-0d34-0410-b5e6-96231b3b80d8
A regmask (typically seen on a call) clobbers the set of registers it lists. The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.
These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier. Otherwise, uses after the if converted call could think they are reading an undefined register.
Reviewed by Matthias Braun and Quentin Colombet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236515 91177308-0d34-0410-b5e6-96231b3b80d8
If we move an instruction from one block down to a MOVC and predicate it,
then the original instruction could be moved in to a loop. In this case,
its invalid for any kill flags to remain on there.
Fails with -verfy-machineinstrs.
rdar://problem/20752113
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236290 91177308-0d34-0410-b5e6-96231b3b80d8
When commuting a thumb instruction in the size reduction pass, thumb
instructions are represented as a bundle and so some operands may be marked
as internal. The internal flag has to move with the operand when commuting.
This test is sensitive to register allocation so can't specifically check that
this error was happening, but so long as it continues to pass with -verify then
hopefully its still ok.
rdar://problem/20752113
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236282 91177308-0d34-0410-b5e6-96231b3b80d8
The expansion for t2ABS was always setting the kill flag on the rsb instruction.
It should instead only be set on rsb if it was set on the original ABS instruction.
rdar://problem/20752113
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236272 91177308-0d34-0410-b5e6-96231b3b80d8
temporary.
Because of that:
1. The machine verifier was complaining on such code.
2. The generate code worked just because the thumb reduction size pass fixed the
opcode.
rdar://problem/20749824
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236247 91177308-0d34-0410-b5e6-96231b3b80d8
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`. The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.
Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one. It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs. YMMV of
course.
Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py. I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three. It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).
Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236120 91177308-0d34-0410-b5e6-96231b3b80d8
We were trying to look through COPY instructions, but only to the next
instruction in a BB and incorrectly anyway. The cases where that would actually
be a good idea are rare enough (and not even tested!) that it's not worth
trying to get right.
rdar://20721342
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236050 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, the code would try to put a fall-through case last,
even if that meant moving a case with much higher branch weight
further down the chain.
Ordering by branch weight is most important, putting a fall-through
block last is secondary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235942 91177308-0d34-0410-b5e6-96231b3b80d8