Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output.
The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as.
All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler.
Reviewers: rafael
Reviewed By: rafael
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2686
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201237 91177308-0d34-0410-b5e6-96231b3b80d8
* CPRCs may be allocated to co-processor registers or the stack – they may never be allocated to core registers
* When a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable
The difference is only noticeable in rare cases where there are a large number of floating point arguments (e.g.
7 doubles + additional float, double arguments). Although it's probably still better to avoid vmov as it can cause
stalls in some older ARM cores. The other, more subtle benefit, is to minimize difference between the various
calling conventions.
rdar://16039676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201193 91177308-0d34-0410-b5e6-96231b3b80d8
Similarly to the vshrn instructions, these are simple zext/sext + trunc
operations. Using normal LLVM IR should allow for better code, and more sharing
with the AArch64 backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201093 91177308-0d34-0410-b5e6-96231b3b80d8
For A- and R-class processors, r12 is not normally callee-saved, but is for
interrupt handlers. See AAPCS, 5.3.1.1, "Use of IP by the linker".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201089 91177308-0d34-0410-b5e6-96231b3b80d8
vshrn is just the combination of a right shift and a truncate (and the limits
on the immediate value actually mean the signedness of the shift doesn't
matter). Using that representation allows us to get rid of an ARM-specific
intrinsic, share more code with AArch64 and hopefully get better code out of
the mid-end optimisers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201085 91177308-0d34-0410-b5e6-96231b3b80d8
According to the AAPCS, when a CPRC is allocated to the stack, all other
VFP registers should be marked as unavailable.
I have also modified the rules for allocating non-CPRCs to the stack, to make
it more explicit that all GPRs must be made unavailable. I cannot think of a
case where the old version would produce incorrect answers, so there is no test
for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200970 91177308-0d34-0410-b5e6-96231b3b80d8
Fix a bug triggered in IfConverterTriangle when CvtBB has multiple predecessors
by getting the weights before removing a successor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200958 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes the ldr-pseudo implementation to work when used in
inline assembly. The fix is to move arm assembler constant pools
from the ARMAsmParser class to the ARMTargetStreamer class.
Previously we kept the assembler generated constant pools in the
ARMAsmParser object. This does not work for inline assembly because
a new parser object is created for each blob of inline assembly.
This patch moves the constant pools to the ARMTargetStreamer class
so that the constant pool will remain alive for the entire code
generation process.
An ARMTargetStreamer class is now required for the arm backend.
There was no existing implementation for MachO, only Asm and ELF.
Instead of creating an empty MachO subclass, we decided to make the
ARMTargetStreamer a non-abstract class and provide default
(llvm_unreachable) implementations for the non constant-pool related
methods.
Differential Revision: http://llvm-reviews.chandlerc.com/D2638
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200777 91177308-0d34-0410-b5e6-96231b3b80d8
There was an extremely confusing proliferation of LLVM intrinsics to implement
the vacge & vacgt instructions. This combines them all into two polymorphic
intrinsics, shared across both backends.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200768 91177308-0d34-0410-b5e6-96231b3b80d8
Missing braces on if meant we inserted both ARM and Thumb load for a litpool
entry. This didn't end well.
rdar://problem/15959157
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200752 91177308-0d34-0410-b5e6-96231b3b80d8
A bunch of test cases needed to be cleaned up for this, many my fault -
when implementid imported modules I updated test cases by simply
duplicating the prior metadata field - which wasn't always the empty
metadata entry.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200731 91177308-0d34-0410-b5e6-96231b3b80d8
Some of the SHA instructions take a scalar i32 as one argument (largely because
they work on 160-bit hash fragments). This wasn't reflected in the IR
previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x
i32> respectively) which was ugly.
This makes all the affected intrinsics take a uniform "i32", allowing them to
become non-polymorphic at the same time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200706 91177308-0d34-0410-b5e6-96231b3b80d8
This changes the PrologueEpilogInserter and LocalStackSlotAllocation passes to
follow the extended stack layout rules for sspstrong and sspreq.
The sspstrong layout rules are:
1. Large arrays and structures containing large arrays (>= ssp-buffer-size)
are closest to the stack protector.
2. Small arrays and structures containing small arrays (< ssp-buffer-size) are
2nd closest to the protector.
3. Variables that have had their address taken are 3rd closest to the
protector.
Differential Revision: http://llvm-reviews.chandlerc.com/D2546
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200601 91177308-0d34-0410-b5e6-96231b3b80d8
When converting from "or + br" to two branches, or converting from
"and + br" to two branches, we correctly update the edge weights of
the two branches.
The previous attempt at r200431 was reverted at r200434 because of
two testing case failures. I modified my patch a little, but forgot
to re-run "make check-all".
Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of
the patch's impact on branch probability which causes changes in
spill placement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200502 91177308-0d34-0410-b5e6-96231b3b80d8
This commit only handles IfConvertTriangle. To update edge weights
of a successor, one interface is added to MachineBasicBlock:
/// Set successor weight of a given iterator.
setSuccWeight(succ_iterator I, uint32_t weight)
An existing testing case test/CodeGen/Thumb2/v8_IT_5.ll is updated,
since we now correctly update the edge weights, the cold block
is placed at the end of the function and we jump to the cold block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200428 91177308-0d34-0410-b5e6-96231b3b80d8
After all hard work to implement the EHABI and with the test-suite
passing, it's time to turn it on by default and allow users to
disable it as a work-around while we fix the eventual bugs that show
up.
This commit also remove the -arm-enable-ehabi-descriptors, since we
want the tables to be printed every time the EHABI is turned on
for non-Darwin ARM targets.
Although MCJIT EHABI is not working yet (needs linking with the right
libraries), this commit also fixes some relocations on MCJIT regarding
the EH tables/lib calls, and update some tests to avoid using EH tables
when none are needed.
The EH tests in the test-suite that were previously disabled on ARM
now pass with these changes, so a follow-up commit on the test-suite
will re-enable them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200388 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This commit gives an address mode to the PLD instruction. We
were getting an assertion failure in the frame lowering code
because we had code that was doing a pld of a stack allocated
address. The frame lowering was checking the address mode and
then asserting because pld had none defined.
This commit fixes pld for arm mode. There was a previous fix for
thumb mode in a separate commit. The commit for thumb mode
added a test in a separate file because it would otherwise fail
for arm. This commit moves the thumb test back into the prefetch.ll
file and adds the corresponding arm test.
Differential Revision: http://llvm-reviews.chandlerc.com/D2622
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200248 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r200058 and adds the using directive for
ARMTargetTransformInfo to silence two g++ overload warnings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200062 91177308-0d34-0410-b5e6-96231b3b80d8
This commit caused -Woverloaded-virtual warnings. The two new
TargetTransformInfo::getIntImmCost functions were only added to the superclass,
and to the X86 subclass. The other targets were not updated, and the
warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was
hiding the two new getIntImmCost variants.
We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost"
to the various subclasses, or turning it off, but I suspect that it's wrong to
leave the functions unimplemnted in those targets. The default implementations
return TCC_Free, which I don't think is right e.g. for ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200058 91177308-0d34-0410-b5e6-96231b3b80d8
Retry commit r200022 with a fix for the build bot errors. Constant expressions
have (unlike instructions) module scope use lists and therefore may have users
in different functions. The fix is to simply ignore these out-of-function uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200034 91177308-0d34-0410-b5e6-96231b3b80d8
This pass identifies expensive constants to hoist and coalesces them to
better prepare it for SelectionDAG-based code generation. This works around the
limitations of the basic-block-at-a-time approach.
First it scans all instructions for integer constants and calculates its
cost. If the constant can be folded into the instruction (the cost is
TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't
consider it expensive and leave it alone. This is the default behavior and
the default implementation of getIntImmCost will always return TCC_Free.
If the cost is more than TCC_BASIC, then the integer constant can't be folded
into the instruction and it might be beneficial to hoist the constant.
Similar constants are coalesced to reduce register pressure and
materialization code.
When a constant is hoisted, it is also hidden behind a bitcast to force it to
be live-out of the basic block. Otherwise the constant would be just
duplicated and each basic block would have its own copy in the SelectionDAG.
The SelectionDAG recognizes such constants as opaque and doesn't perform
certain transformations on them, which would create a new expensive constant.
This optimization is only applied to integer constants in instructions and
simple (this means not nested) constant cast experessions. For example:
%0 = load i64* inttoptr (i64 big_constant to i64*)
Reviewed by Eric
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200022 91177308-0d34-0410-b5e6-96231b3b80d8
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200018 91177308-0d34-0410-b5e6-96231b3b80d8
r200011 remove the special codepaths in MC for inline asm, so we can now test
all the logic with just llc + llvm-mc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200013 91177308-0d34-0410-b5e6-96231b3b80d8
Originally, BLX was passed as operand #0 in MachineInstr and as operand
#2 in MCInst. But now, it's operand #2 in both cases.
This patch also removes unnecessary FileCheck in the test case added by r199127.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199928 91177308-0d34-0410-b5e6-96231b3b80d8
With constant-sharing, litpool loads consume 4 + N*2 bytes of code, but
movw/movt pairs consume 8*N. This means litpools are better than movw/movt even
with just one use. Other materialisation strategies can still be better though,
so the logic is a little odd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199891 91177308-0d34-0410-b5e6-96231b3b80d8
This patch restores the ARM mode if the user's inline assembly
does not. In the object streamer, it ensures that instructions
following the inline assembly are encoded correctly and that
correct mapping symbols are emitted. For the asm streamer, it
emits a .arm or .thumb directive.
This patch does not ensure that the inline assembly contains
the ADR instruction to switch modes at runtime.
The problem we need to solve is code like this:
int foo(int a, int b) {
int r = a + b;
asm volatile(
".align 2 \n"
".arm \n"
"add r0,r0,r0 \n"
: : "r"(r));
return r+1;
}
If we compile this function in thumb mode then the inline assembly
will switch to arm mode. We need to make sure that we switch back to
thumb mode after emitting the inline assembly or we will incorrectly
encode the instructions that follow (i.e. the assembly instructions
for return r+1).
Based on patch by David Peixotto
Change-Id: Ib57f6d2d78a22afad5de8693fba6230ff56ba48b
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199818 91177308-0d34-0410-b5e6-96231b3b80d8
When expanding neon pseudo stores, it may miss the implicit uses of sub
regs, which may cause post RA scheduler reorder instructions that
breakes anti dependency.
For example:
VST1d64QPseudo %R0<kill>, 16, %Q9_Q10, pred:14, pred:%noreg
will be expanded to
VST1d64Q %R0<kill>, 16, %D18, pred:14, pred:%noreg;
An instruction that defines %D20 may be scheduled before the store by
mistake.
This patches adds implicit uses for such case. For the example above, it
emits:
VST1d64Q %R0<kill>, 8, %D18, pred:14, pred:%noreg, %Q9_Q10<imp-use>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199282 91177308-0d34-0410-b5e6-96231b3b80d8
The changes caused by folding an sp-adjustment into a "pop" previously
disrupted the forward search for the final real instruction in a
terminating block. This switches to a backward search (skipping debug
instrs).
This fixes PR18399.
Patch by Zhaoshi.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199266 91177308-0d34-0410-b5e6-96231b3b80d8
Parse tag names as well as expressions. The former is part of the
specification, the latter is for improved compatibility with the GNU assembler.
Fix attribute value handling to be comformant to the specification.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198662 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM backend has been using most of the MachO related subtarget
checks almost interchangeably, and since the only target it's had to
run on has been IOS (which is all three of MachO, Darwin and IOS) it's
worked out OK so far.
But we'd like to support embedded targets under the "*-*-none-macho"
triple, which means everything starts falling apart and inconsistent
behaviours emerge.
This patch should pick a reasonably sensible set of behaviours for the
new triple (and any others that come along, with luck). Some choices
were debatable (notably FP == r7 or r11), but we can revisit those
later when deficiencies become apparent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198617 91177308-0d34-0410-b5e6-96231b3b80d8
Longer term, we want to move users to "*-*-*-macho" for embedded work, but for
now people are relying on the last thing we told them, which is unfortunately
"*-*-darwin-eabi".
rdar://problem/15703934
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198602 91177308-0d34-0410-b5e6-96231b3b80d8
This patch makes it possible to select the ABI with -mattr. It will be used to
forward clang's -target-abi option to llvm's CodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198304 91177308-0d34-0410-b5e6-96231b3b80d8
Schedule more conservatively to account for stalls on floating point
resources and latency. Use the AGU resource to model latency stalls
since it's shared between FP and LD/ST instructions. This might not be
completely accurate but should work well in practice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198125 91177308-0d34-0410-b5e6-96231b3b80d8
This changes the MachineFrameInfo API to use the new SSPLayoutKind information
produced by the StackProtector pass (instead of a boolean flag) and updates a
few pass dependencies (to preserve the SSP analysis).
The stack layout follows the same approach used prior to this change - i.e.,
only LargeArray stack objects will be placed near the canary and everything
else will be laid out normally. After this change, structures containing large
arrays will also be placed near the canary - a case previously missed by the
old implementation.
Out of tree targets will need to update their usage of
MachineFrameInfo::CreateStackObject to remove the MayNeedSP argument.
The next patch will implement the rules for sspstrong and sspreq. The end goal
is to support ssp-strong stack layout rules.
WIP.
Differential Revision: http://llvm-reviews.chandlerc.com/D2158
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197653 91177308-0d34-0410-b5e6-96231b3b80d8
Given vsel_cc, op1, op2, since vsel has no LE/LT, to generate vsel for
such selection, it needs to inverse cc and swap op1 and op2. To inverse
cc, both L/G and E bits should be flipped.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197615 91177308-0d34-0410-b5e6-96231b3b80d8
Clang sets the float-abi target option manually, but no longer
annotates each function with its ABI. This can lead to confusing
mistmatch between "clang -emit-llvm | llc" and normal clang
invocations.
Besides which, gnueabihf actually *is* hard-float. Defaulting to soft
was just perverse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197554 91177308-0d34-0410-b5e6-96231b3b80d8
This reapplies r197438 and fixes the link-time circular dependency between
IR and Support. The fix consists in moving the diagnostic support into IR.
The patch adds a new LLVMContext::diagnose that can be used to communicate to
the front-end, if any, that something of interest happened.
The diagnostics are supported by a new abstraction, the DiagnosticInfo class.
The base class contains the following information:
- The kind of the report: What this is about.
- The severity of the report: How bad this is.
This patch also adds 2 classes:
- DiagnosticInfoInlineAsm: For inline asm reporting. Basically, this diagnostic
will be used to switch to the new diagnostic API for LLVMContext::emitError.
- DiagnosticStackSize: For stack size reporting. Comes as a replacement of the
hard coded warning in PEI.
This patch also features dynamic diagnostic identifiers. In other words plugins
can use this infrastructure for their own diagnostics (for more details, see
getNextAvailablePluginDiagnosticKind).
This patch introduces a new DiagnosticHandlerTy and a new DiagnosticContext in
the LLVMContext that should be set by the front-end to be able to map these
diagnostics in its own system.
http://llvm-reviews.chandlerc.com/D2376
<rdar://problem/15515174>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197508 91177308-0d34-0410-b5e6-96231b3b80d8
The patch adds a new LLVMContext::diagnose that can be used to communicate to
the front-end, if any, that something of interest happened.
The diagnostics are supported by a new abstraction, the DiagnosticInfo class.
The base class contains the following information:
- The kind of the report: What this is about.
- The severity of the report: How bad this is.
This patch also adds 2 classes:
- DiagnosticInfoInlineAsm: For inline asm reporting. Basically, this diagnostic
will be used to switch to the new diagnostic API for LLVMContext::emitError.
- DiagnosticStackSize: For stack size reporting. Comes as a replacement of the
hard coded warning in PEI.
This patch also features dynamic diagnostic identifiers. In other words plugins
can use this infrastructure for their own diagnostics (for more details, see
getNextAvailablePluginDiagnosticKind).
This patch introduces a new DiagnosticHandlerTy and a new DiagnosticContext in
the LLVMContext that should be set by the front-end to be able to map these
diagnostics in its own system.
http://llvm-reviews.chandlerc.com/D2376
<rdar://problem/15515174>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197438 91177308-0d34-0410-b5e6-96231b3b80d8
The tests were no longer using fast-isel at all (MachO needs an "ios" rather
than "darwin" triple at the moment and Linux needs ARM mode). Once that was
corrected, the verifier complained about a t2ADDri created for the alloca.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197046 91177308-0d34-0410-b5e6-96231b3b80d8
When trying to eliminate an "sub sp, sp, #N" instruction by folding
it into an existing push/pop using dummy registers, we need to account
for the fact that this might affect precisely how "fp" gets set in the
prologue.
We were attempting this, but assuming that *whenever* we performed a
fold it would make a difference. This is false, for example, in:
push {r4, r7, lr}
add fp, sp, #4
vpush {d8}
sub sp, sp, #8
we can fold the "sub" into the "vpush", forming "vpush {d7, d8}".
However, in that case the "add fp" instruction mustn't change, which
we were getting wrong before.
Should fix PR18160.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196725 91177308-0d34-0410-b5e6-96231b3b80d8
The current peephole optimizing for compare inst assumes an instr that
uses CPSR has an MO for ARM Cond code.However, for VSEL instructions
(vseqeq, vselgt, vselgt, vselvs), there is no such operand nor do
they support the modification of Cond Code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196588 91177308-0d34-0410-b5e6-96231b3b80d8
We were trying to fold the stack adjustment into the wrong instruction in the
situation where the entire basic-block was epilogue code. Really, it can only
ever be valid to do the folding precisely where the "add sp, ..." would be
placed so there's no need for a separate iterator to track that.
Should fix PR18136.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196493 91177308-0d34-0410-b5e6-96231b3b80d8
ARM symbol variants are written with parens instead of @ like this:
.word __GLOBAL_I_a(target1)
This commit adds support for parsing these symbol variants in
expressions. We introduce a new flag to MCAsmInfo that indicates the
parser should use parens to parse the symbol variant. The expression
parser is modified to look for symbol variants using parens instead
of @ when the corresponding MCAsmInfo flag is true.
The MCAsmInfo parens flag is enabled only for ARM on ELF.
By adding this flag to MCAsmInfo, we are able to get rid of
redundant ARM-specific symbol variants and use the generic variants
instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new
UseParensForSymbolVariant attribute in MCAsmInfo to correctly print
the symbol variants for arm.
To achive this we need to keep a handle to the MCAsmInfo in the
MCSymbolRefExpr class that we can check when printing the symbol
variant.
Updated Tests:
Changed case of symbol variant to match the generic kind.
test/CodeGen/ARM/tls-models.ll
test/CodeGen/ARM/tls1.ll
test/CodeGen/ARM/tls2.ll
test/CodeGen/Thumb2/tls1.ll
test/CodeGen/Thumb2/tls2.ll
PR18080
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196424 91177308-0d34-0410-b5e6-96231b3b80d8
These are used by MachO only at the moment, and (much like the existing
MOVW/MOVT set) work around the fact that the labels used in the actual
instructions often contain PC-dependent components, which means that repeatedly
materialising the same global can't be CSEed.
With small modifications, it could be adapted to how ELF finds the address of
_GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196090 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, we clobbered callee-saved registers when folding an "add
sp, #N" into a "pop {rD, ...}" instruction. This change checks whether
a register we're going to add to the "pop" could actually be live
outside the function before doing so and should fix the issue.
This should fix PR18081.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196046 91177308-0d34-0410-b5e6-96231b3b80d8
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.
Make tests more robust by removing hard-coded metadata numbers in CHECK lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195535 91177308-0d34-0410-b5e6-96231b3b80d8
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195504 91177308-0d34-0410-b5e6-96231b3b80d8
Stop folding constant adds into GEP when the type size doesn't match.
Otherwise, the adds' operands are effectively being promoted, changing the
conditions of an overflow. Results are different when:
sext(a) + sext(b) != sext(a + b)
Problem originally found on x86-64, but also fixed issues with ARM and PPC,
which used similar code.
<rdar://problem/15292280>
Patch by Duncan Exon Smith!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194840 91177308-0d34-0410-b5e6-96231b3b80d8
We used to perform an invalid operation on an MVT and crash, which wasn't much
fun.
Patch by Oliver Stannard.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194714 91177308-0d34-0410-b5e6-96231b3b80d8
In ELF and COFF an alias is just another offset in a section. There is no way
to represent an alias to something in another file.
In MachO, the spec has the N_INDR type which should allow for exactly that, but
is not currently implemented. Given that it is specified but not implemented,
we error in codegen to avoid miscompiling but don't reject aliases to
declarations in the verifier to leave the option open of implementing it.
In the past we have used alias to declarations as a way of implementing
weakref, which is why it exists in some old tests which this patch updates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194705 91177308-0d34-0410-b5e6-96231b3b80d8
By default, the behavior of IT block generation will be determinated
dynamically base on the arch (armv8 vs armv7). This patch adds backend
options: -arm-restrict-it and -arm-no-restrict-it. The former one
restricts the generation of IT blocks (the same behavior as thumbv8) for
both arches. The later one allows the generation of legacy IT block (the
same behavior as ARMv7 Thumb2) for both arches.
Clang will support -mrestrict-it and -mno-restrict-it, which is
compatible with GCC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194592 91177308-0d34-0410-b5e6-96231b3b80d8
isPhysRegUsed if the unwind information is required.
Indeed, the runtime may need a correct stack to be able to unwind the call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194271 91177308-0d34-0410-b5e6-96231b3b80d8
ARM prologues usually look like:
push {r7, lr}
sub sp, sp, #4
If code size is extremely important, this can be optimised to the single
instruction:
push {r6, r7, lr}
where we don't actually care about the contents of r6, but pushing it subtracts
4 from sp as a side effect.
This should implement such a conversion, predicated on the "minsize" function
attribute (-Oz) since I've yet to find any code it actually makes faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194264 91177308-0d34-0410-b5e6-96231b3b80d8
Add a Virtualization ARM subtarget feature along with adding proper build
attribute emission for Tag_Virtualization_use (encodes Virtualization and
TrustZone) and Tag_MPextension_use.
Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to
something that is more maintainable. This changes the focus of this
testcase away from testing CPU defaults (which is tested elsewhere), onto
specifically testing that attributes are encoded correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193859 91177308-0d34-0410-b5e6-96231b3b80d8
Fix Tag_ABI_HardFP_use build attribute to handle single precision FP,
replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add
some tests for Tag_ABI_VFP_args.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193856 91177308-0d34-0410-b5e6-96231b3b80d8
This commit allows the ARM integrated assembler to parse
and assemble the code with .eabi_attribute, .cpu, and
.fpu directives.
To implement the feature, this commit moves the code from
AttrEmitter to ARMTargetStreamers, and several new test
cases related to cortex-m4, cortex-r5, and cortex-a15 are
added.
Besides, this commit also change the Subtarget->isFPOnlySP()
to Subtarget->hasD16() to match the usage of .fpu directive.
This commit changes the test cases:
* Several .eabi_attribute directives in
2010-09-29-mc-asm-header-test.ll are removed because the .fpu
directive already cover the functionality.
* In the Cortex-A15 test case, the value for
Tag_Advanced_SIMD_arch has be changed from 1 to 2,
which is more precise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193524 91177308-0d34-0410-b5e6-96231b3b80d8
There's a barrier instruction so that should still be used, but most actual
atomic operations are going to need a platform decision on the correct
behaviour (either nop if single-threaded or OS-support otherwise).
rdar://problem/15287210
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193399 91177308-0d34-0410-b5e6-96231b3b80d8
ARM processors without ldrex/strex need to be able to make libcalls for all
atomic operations, including the newer min/max versions.
The alternative would probably be expanding these operations in terms of
cmpxchg (as x86 does always), but in the configurations where this matters
code-size tends to be paramount so the libcall is more desirable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193398 91177308-0d34-0410-b5e6-96231b3b80d8
Only use them if the subtarget has ARM mode, as these routines are implemented
as ARM code.
rdar://15302004
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193381 91177308-0d34-0410-b5e6-96231b3b80d8
The compiler-rt functions __adddf3vfp and so on exist purely to allow Thumb1
code to make use of VFP instructions by switching back to ARM mode, they make
no sense for M-class processors which don't even have an ARM mode.
Given that justification, in practice this is a platform ABI decision so the
actual check is based on that rather than CPU features.
rdar://problem/15302004
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193327 91177308-0d34-0410-b5e6-96231b3b80d8
This commit implements the correct lowering of the
COPY_STRUCT_BYVAL_I32 pseudo-instruction for thumb1 targets.
Previously, the lowering of COPY_STRUCT_BYVAL_I32 generated the
post-increment forms of ldr/ldrh/ldrb instructions. Thumb1 does not
have the post-increment form of these instructions so the generated
assembly contained invalid instructions.
Passing the generated assembly to gcc caused it to complain with an
error like this:
Error: cannot honor width suffix -- `ldrb r3,[r0],#1'
and the integrated assembler would generate an object file with an
invalid instruction encoding.
This commit contains a small test case that demonstrates the problem
with thumb1 targets as well as an expanded test case that more
throughly tests the lowering of byval struct passing for arm,
thumb1, and thumb2 targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192916 91177308-0d34-0410-b5e6-96231b3b80d8
Per original comment, the intention of this loop
is to go ahead and break the critical edge
(in order to sink this instruction) if there's
reason to believe doing so might "unblock" the
sinking of additional instructions that define
registers used by this one. The idea is that if
we have a few instructions to sink "together"
breaking the edge might be worthwhile.
This commit makes a few small changes
to help better realize this goal:
First, modify the loop to ignore registers
defined by this instruction. We don't
sink definitions of physical registers,
and sinking an SSA definition isn't
going to unblock an upstream instruction.
Second, ignore uses of physical registers.
Instructions that define physical registers are
rejected for sinking, and so moving this one
won't enable moving any defining instructions.
As an added bonus, while virtual register
use-def chains are generally small due
to SSA goodness, iteration over the uses
and definitions (used by hasOneNonDBGUse)
for physical registers like EFLAGS
can be rather expensive in practice.
(This is the original reason for looking at this)
Finally, to keep things simple continue
to only consider this trick for registers that
have a single use (via hasOneNonDBGUse),
but to avoid spuriously breaking critical edges
only do so if the definition resides
in the same MBB and therefore this one directly
blocks it from being sunk as well.
If sinking them together is meant to be,
let the iterative nature of this pass
sink the definition into this block first.
Update tests to accomodate this change,
add new testcase where sinking avoids pipeline stalls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192608 91177308-0d34-0410-b5e6-96231b3b80d8
When if converting something like:
true:
... = R0<kill>
false:
... = R0<kill>
then the instructions of the true block must not have a <kill> flag
anymore, as the instruction of the false block follow and do still read
the R0 value.
Specifically this patch determines the set of register live-in in the
false block (possibly after simulating the liveness changes of the
duplicated instructions). Each of these live-in registers mustn't be
killed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192482 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts r192454
Apparently FileCheck isn't as smart as I though and does not enforce a
topological order between variable defs+uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192472 91177308-0d34-0410-b5e6-96231b3b80d8
When we had a sequence like:
s1 = VLDRS [r0, 1], Q0<imp-def>
s3 = VLDRS [r0, 2], Q0<imp-use,kill>, Q0<imp-def>
s0 = VLDRS [r0, 0], Q0<imp-use,kill>, Q0<imp-def>
s2 = VLDRS [r0, 4], Q0<imp-use,kill>, Q0<imp-def>
we were gathering the {s0, s1} loads below the s3 load. This is fine,
but confused the verifier since now the s3 load had Q0<imp-use> with
no definition above it.
This should mark such uses <undef> as well. The liveness structure at
the beginning and end of the block is unaffected, and the true sN
definitions should prevent any dodgy reorderings being introduced
elsewhere.
rdar://problem/15124449
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192344 91177308-0d34-0410-b5e6-96231b3b80d8
from struct byval to registers.
We used to pass 0 which means the alignment of PtrVT. Even when the alignment
of the struct is smaller than 4, the LOADs would have alignment of 4, and
further optimizations could combine the LOADs into a ldm, which would
cause crash.
The fix is to pass the alignment of the struct byval.
rdar://problem/15144402
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192126 91177308-0d34-0410-b5e6-96231b3b80d8
Bitcasting everything to i8* won't work. Autoupgrade the old
intrinsic declarations to use the new mangling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192117 91177308-0d34-0410-b5e6-96231b3b80d8
optimizeSelect folds (predicated) copy instructions, it must not ignore
the original register class of the operand when replacing the register
with the copies dest register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191963 91177308-0d34-0410-b5e6-96231b3b80d8
The jump doesn't really kill the registers, the following call does but
we never get back anyway.
This avoids some verify-machineinstrs problems when TAILJUMPs are
if-converted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191962 91177308-0d34-0410-b5e6-96231b3b80d8
Copy over the whole register machine operand instead of creating a new one
with an incomplete set of flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191961 91177308-0d34-0410-b5e6-96231b3b80d8
This function-attribute modifies the callee-saved register list and function
epilogue (specifically the return instruction) so that a routine is suitable
for use as an interrupt-handler of the specified type without disrupting
user-mode applications.
rdar://problem/14207019
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191766 91177308-0d34-0410-b5e6-96231b3b80d8
Generally, it is desirable to distribute (a + b) * c to a*c + b*c for
ARM with VMLx forwarding, where a, b and c are vectors.
However, for (a + b)*(a + b), distribution will result in one extra
instruction.
With distribution:
x = a + b (add)
y = a * x (mul)
z = y + b * y (mla)
Without distribution:
x = a + b (add)
z = x * x (mul)
This patch checks if a mul is a square of add/sub. If yes, skip
distribution.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191410 91177308-0d34-0410-b5e6-96231b3b80d8
PEI inserts a save/restore sequence for the link register, according to the
information it gets from the MachineRegisterInfo.
MachineRegisterInfo is populated by the VirtRegMap pass.
This pass was not aware of noreturn calls and was registering the definitions of
these calls the same way as regular operations.
Modify VirtRegPass so that it does not set the isPhysRegUsed information for
registers only defined by noreturn calls.
The rational is that a noreturn call is the "last instruction" of the program
(if it returns the behavior is undefined), so everything that is defined by it
cannot be used and will not interfere with anything else. Therefore, it is
pointless to account for then.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191349 91177308-0d34-0410-b5e6-96231b3b80d8
This is being disabled because it is no longer needed for
performance. It is only used by postRAscheduler which is also planned
for removal, and it is implemented with an out-dated view of register
liveness. It consideres aliases instead of register units, assumes
valid kill flags, and assumes implicit uses on partial register
defs. Kill flags and implicit operands are error prone and impossible
to verify. We should gradually eliminate dependence on them in the
postRA phases.
Targets that still benefit from this should move to the MI
scheduler. If that doesn't solve the problem, then we should add a
hook to regalloc to optimize reload placement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191348 91177308-0d34-0410-b5e6-96231b3b80d8
When a truncate node defines a legal vector type but uses an illegal
vector type, the legalization process was splitting the vector until
<1 x vector> type, but then it was failing to scalarize the node because
it did not know how to handle TRUNCATE.
<rdar://problem/14989896>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190830 91177308-0d34-0410-b5e6-96231b3b80d8
IT blocks can only be one instruction lonf, and can only contain a subset of
the 16 instructions.
Patch by Artyom Skrobov!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190309 91177308-0d34-0410-b5e6-96231b3b80d8
Field 2 of DIType (Context), field 9 of DIDerivedType (TypeDerivedFrom),
field 12 of DICompositeType (ContainingType), fields 2, 7, 12 of DISubprogram
(Context, Type, ContainingType).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190205 91177308-0d34-0410-b5e6-96231b3b80d8
field of DICompositeType.
This will help the follow-on patch of using DITypeRef for containing-type field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190187 91177308-0d34-0410-b5e6-96231b3b80d8
Solution is not sufficient to prevent 'mov pc, lr' being emitted for jump table code.
Test case doesn't trigger the added functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190047 91177308-0d34-0410-b5e6-96231b3b80d8
This improves code generation for jump tables by avoiding the emission of "mov pc, lr" which could fool the processor into believing this is a return from a function causing mispredicts. The code generation logic for jump tables uses ADR to materialize the address of the jump target.
Patch by Daniel Stewart!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190043 91177308-0d34-0410-b5e6-96231b3b80d8
'Force' values in registers using the calling convention. Now, we only depend on
the calling convention and that the allocator performs copy coalescing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189985 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r189648.
Fixes for the previously failing clang-side arm_neon_intrinsics test
cases will be checked in separately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189841 91177308-0d34-0410-b5e6-96231b3b80d8
What we really want is to enable Swift by default for *v7s triples (and there already seems to be some logic which attempts to do that). In that case the iOS version doesn't matter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189763 91177308-0d34-0410-b5e6-96231b3b80d8
In addition to recognizing when the multiply's second argument is
coming from an explicit VDUPLANE, also look for a plain scalar
f32 reference and reference it via the corresponding vector
lane.
rdar://14870054
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189619 91177308-0d34-0410-b5e6-96231b3b80d8
Clang is now generating cleaner IR, so this removes the old variants which
should be completely unused.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189481 91177308-0d34-0410-b5e6-96231b3b80d8
The vqdmlal and vqdmlls instructions are really just a fused pair consisting of
a vqdmull.sN and a vqadd.sN. This adds patterns to LLVM so that we can switch
Clang's CodeGen over to generating these instead of the special vqdmlal
intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189480 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions aren't particularly complicated and it's well worth having
patterns for some reasonably useful LLVM IR that will match them. Soon we
should be able to switch Clang over to producing this natural version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189335 91177308-0d34-0410-b5e6-96231b3b80d8
DICompositeType will have an identifier field at position 14. For now, the
field is set to null in DIBuilder.
For DICompositeTypes where the template argument field (the 13th field)
was optional, modify DIBuilder to make sure the template argument field is set.
Now DICompositeType has 15 fields.
Update DIBuilder to use NULL instead of "i32 0" for null value of a MDNode.
Update verifier to check that DICompositeType has 15 fields and the last
field is null or a MDString.
Update testing cases to include an extra field for DICompositeType.
The identifier field will be used by type uniquing so a front end can
genearte a DICompositeType with a unique identifer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189282 91177308-0d34-0410-b5e6-96231b3b80d8
Now that fast-isel is in better shape, we can enable the machine
verifier for these tests, too.
rdar://12594152
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189275 91177308-0d34-0410-b5e6-96231b3b80d8
Get the register class right for the TST instruction. This keeps the
machine verifier happy, enabling us to turn it on for another test.
rdar://12594152
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189274 91177308-0d34-0410-b5e6-96231b3b80d8
Constant pool and global value reference instructions need more
restricted register classes than plain GPR.
rdar://12594152
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189270 91177308-0d34-0410-b5e6-96231b3b80d8
A single metadata will not span multiple lines. This also helps me with
my script to automatic update the testing cases.
A debug info testing case should have a llvm.dbg.cu.
Do not use hard-coded id for debug nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189033 91177308-0d34-0410-b5e6-96231b3b80d8
This uses the ARMcmov pattern that Tim cleaned up in r188995.
Thanks to Simon Tatham for his floating point help!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189024 91177308-0d34-0410-b5e6-96231b3b80d8
The function call to external function should come with PLT relocation
type if the PIC relocation model is used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189002 91177308-0d34-0410-b5e6-96231b3b80d8
Back in the mists of time (2008), it seems TableGen couldn't handle the
patterns necessary to match ARM's CMOV node that we convert select operations
to, so we wrote a lot of fairly hairy C++ to do it for us.
TableGen can deal with it now: there were a few minor differences to CodeGen
(see tests), but nothing obviously worse that I could see, so we should
probably address anything that *does* come up in a localised manner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188995 91177308-0d34-0410-b5e6-96231b3b80d8
The code for 'Q' and 'R' operand modifiers needs to look through tied
operands to discover the register class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188990 91177308-0d34-0410-b5e6-96231b3b80d8
Update testcase to be more careful about checking register
values. While regexes are general goodness for these sorts of
testcases, in this example, the registers are constrained by
the calling convention, so we can and should check their
explicit values.
rdar://14779513
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188819 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we used a const-pool load for virtually all 64-bit floating values.
Actually, we can get quite a few common values (including 0.0, 1.0) via "vmov"
instructions of one stripe or another.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188773 91177308-0d34-0410-b5e6-96231b3b80d8
When patching inlineasm nodes to use GPRPair for 64-bit values, we
were dropping the information that two operands were tied, which
effectively broke the live-interval of vregs affected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188643 91177308-0d34-0410-b5e6-96231b3b80d8
Properly constrain the operand register class for instructions used
in [sz]ext expansion. Update more tests to use the verifier now that
we're getting the register classes correct.
rdar://12594152
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188594 91177308-0d34-0410-b5e6-96231b3b80d8
Teach the generic instruction selection helper functions to constrain
the register classes of their input operands. For non-physical register
references, the generic code needs to be careful not to mess that up
when replacing references to result registers. As the comment indicates
for MachineRegisterInfo::replaceRegWith(), it's important to call
constrainRegClass() first.
rdar://12594152
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188593 91177308-0d34-0410-b5e6-96231b3b80d8
Lots of machine verifier errors result from using a plain GPR regclass
for incoming argument copies. A more restrictive rGPR class is more
appropriate since it more accurately represents what's happening, plus
it lines up better with isel later on so the verifier is happier.
Reduces the number of ARM fast-isel tests not running with the verifier
enabled by over half.
rdar://12594152
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188592 91177308-0d34-0410-b5e6-96231b3b80d8
This unbreaks PIC with fast isel on ELF targets (PR16717). The output matches
what GCC and SDag do for PIC but may not cover all of the many flavors of PIC
that exist.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188551 91177308-0d34-0410-b5e6-96231b3b80d8
- Instead of setting the suffixes in a bunch of places, just set one master
list in the top-level config. We now only modify the suffix list in a few
suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).
- Aside from removing the need for a bunch of lit.local.cfg files, this enables
4 tests that were inadvertently being skipped (one in
Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and
CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been
XFAILED).
- This commit also fixes a bunch of config files to use config.root instead of
older copy-pasted code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188513 91177308-0d34-0410-b5e6-96231b3b80d8
When determining if two different loads are from the same base address,
this patch allows one load to use a t2LDRi8 address mode and another to
use a t2LDRi12 address mode. The current implementation is very
conservative and this allows the case of differing Thumb2 byte loads to
be considered. Allowing these differing modes instead of forcing the exact
same opcode is useful for situations where one opcodes loads from a base
address+1 and a second opcode loads for a base address-1.
Patch by Daniel Stewart.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188385 91177308-0d34-0410-b5e6-96231b3b80d8
A common idiom is to use zero and all-ones as sentinal values and to
check for both in a single conditional ("x != 0 && x != (unsigned)-1").
That generates code, for i32, like:
testl %edi, %edi
setne %al
cmpl $-1, %edi
setne %cl
andb %al, %cl
With this transform, we generate the simpler:
incl %edi
cmpl $1, %edi
seta %al
Similar improvements for other integer sizes and on other platforms. In
general, combining the two setcc instructions into one is better.
rdar://14689217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188315 91177308-0d34-0410-b5e6-96231b3b80d8
Various tests had sprung up over the years which had --check-prefix=ABC on the
RUN line, but "CHECK-ABC:" later on. This happened to work before, but was
strictly incorrect. FileCheck is getting stricter soon though.
Patch by Ron Ofir.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188173 91177308-0d34-0410-b5e6-96231b3b80d8
the type exists.
Fix up cases where we weren't checking for optional types and add
an assert to addType to make sure we catch this in the future.
Fix up a testcase that was using the tag for DW_TAG_array_type
when it meant DW_TAG_enumeration_type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187963 91177308-0d34-0410-b5e6-96231b3b80d8
Also remove checking of llvm.dbg.sp since it is not used in generating dwarf.
Current state of Finder:
DebugInfoFinder tries to list all debug info MDNodes used in a module. To
list debug info MDNodes used by an instruction, DebugInfoFinder provides
processDeclare, processValue and processLocation to handle DbgDeclareInst,
DbgValueInst and DbgLoc attached to instructions. processModule will go
through all DICompileUnits in llvm.dbg.cu and list debug info MDNodes
used by the CUs.
TODO:
1> Finder has a list of CUs, SPs, Types, Scopes and global variables. We
need to add a list of variables that are used by DbgDeclareInst and
DbgValueInst.
2> MDString fields should be null or isa<MDString> and MDNode fields should be
null or isa<MDNode>. We currently use empty string or int 0 to represent null.
3> Go though Verify functions and make sure that they check field types.
4> Clean up existing testing cases to remove llvm.dbg.sp and make sure each
testing case has a llvm.dbg.cu.
Re-apply r187609 with fix to pass ocaml binding. vmcore.ml generates a debug
location with scope being metadata !{}, in verifier we treat this as a null
scope.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187812 91177308-0d34-0410-b5e6-96231b3b80d8
Function attributes are the future! So just query whether we want to realign the
stack directly from the function instead of through a random target options
structure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187618 91177308-0d34-0410-b5e6-96231b3b80d8
Also remove checking of llvm.dbg.sp since it is not used in generating dwarf.
Current state of Finder:
DebugInfoFinder tries to list all debug info MDNodes used in a module. To
list debug info MDNodes used by an instruction, DebugInfoFinder provides
processDeclare, processValue and processLocation to handle DbgDeclareInst,
DbgValueInst and DbgLoc attached to instructions. processModule will go
through all DICompileUnits in llvm.dbg.cu and list debug info MDNodes
used by the CUs.
TODO:
1> Finder has a list of CUs, SPs, Types, Scopes and global variables. We
need to add a list of variables that are used by DbgDeclareInst and
DbgValueInst.
2> MDString fields should be null or isa<MDString> and MDNode fields should be
null or isa<MDNode>. We currently use empty string or int 0 to represent null.
3> Go though Verify functions and make sure that they check field types.
4> Clean up existing testing cases to remove llvm.dbg.sp and make sure each
testing case has a llvm.dbg.cu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187609 91177308-0d34-0410-b5e6-96231b3b80d8
When simplifying a (or (and B A) (and C ~A)) to a (VBSL A B C) ensure that the
bitwidth of the second operands to both ands match before comparing the negation
of the values.
Split the check of the value of the second operands to the ands. Move the cast
and variable declaration slightly higher to make it slightly easier to follow.
Bug-Id: 16700
Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187404 91177308-0d34-0410-b5e6-96231b3b80d8
This patch prevents the following combine when the input vector is used more
than once.
insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx
=>
build_vector elt0, ..., NewEltIdx, ..., eltN
The reasons are:
- Building a vector may be expensive, so try to reuse the existing part of a
vector instead of creating a new one (think big vectors).
- elt0 to eltN now have two users instead of one. This may prevent some other
optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187396 91177308-0d34-0410-b5e6-96231b3b80d8
Also always add DIType, DISubprogram and DIGlobalVariable to the list
in DebugInfoFinder without checking them, so we can verify them later
on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187285 91177308-0d34-0410-b5e6-96231b3b80d8
We used to call Verify before adding DICompileUnit to the list, and now we
remove the check and always add DICompileUnit to the list in DebugInfoFinder,
so we can verify them later on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187237 91177308-0d34-0410-b5e6-96231b3b80d8
The previous change to local live range allocation also suppressed
eviction of local ranges. In rare cases, this could result in more
expensive register choices. This commit actually revives a feature
that I added long ago: check if live ranges can be reassigned before
eviction. But now it only happens in rare cases of evicting a local
live range because another local live range wants a cheaper register.
The benefit is improved code size for some benchmarks on x86 and armv7.
I measured no significant compile time increase and performance
changes are noise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187140 91177308-0d34-0410-b5e6-96231b3b80d8
Also avoid locals evicting locals just because they want a cheaper register.
Problem: MI Sched knows exactly how many registers we have and assumes
they can be colored. In cases where we have large blocks, usually from
unrolled loops, greedy coloring fails. This is a source of
"regressions" from the MI Scheduler on x86. I noticed this issue on
x86 where we have long chains of two-address defs in the same live
range. It's easy to see this in matrix multiplication benchmarks like
IRSmk and even the unit test misched-matmul.ll.
A fundamental difference between the LLVM register allocator and
conventional graph coloring is that in our model a live range can't
discover its neighbors, it can only verify its neighbors. That's why
we initially went for greedy coloring and added eviction to deal with
the hard cases. However, for singly defined and two-address live
ranges, we can optimally color without visiting neighbors simply by
processing the live ranges in instruction order.
Other beneficial side effects:
It is much easier to understand and debug regalloc for large blocks
when the live ranges are allocated in order. Yes, global allocation is
still very confusing, but it's nice to be able to comprehend what
happened locally.
Heuristics could be added to bias register assignment based on
instruction locality (think late register pairing, banks...).
Intuituvely this will make some test cases that are on the threshold
of register pressure more stable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187139 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure the context and type fields are MDNodes. We will generate
verification errors if those fields are non-empty strings.
Fix testing cases to make them pass the verifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187106 91177308-0d34-0410-b5e6-96231b3b80d8