For example, ".byte 256" would previously assert() when emitting an object
file. Now it generates a diagnostic that the literal value is out of range.
rdar://9686950
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134069 91177308-0d34-0410-b5e6-96231b3b80d8
Drop the FpMov instructions, use plain COPY instead.
Drop the FpSET/GET instruction for accessing fixed stack positions.
Instead use normal COPY to/from ST registers around inline assembly, and
provide a single new FpPOP_RETVAL instruction that can access the return
value(s) from a call. This is still necessary since you cannot tell from
the CALL instruction alone if it returns anything on the FP stack. Teach
fast isel to use this.
This provides a much more robust way of handling fixed stack registers -
we can tolerate arbitrary FP stack instructions inserted around calls
and inline assembly. Live range splitting could sometimes break x87 code
by inserting spill code in unfortunate places.
As a bonus we handle floating point inline assembly correctly now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134018 91177308-0d34-0410-b5e6-96231b3b80d8
opening single quote with no closing single quote, and with {} quotes
"inside" of it. This broke some of our tools that scrape test cases.
Also, while here, make the test actually assert what the comment says it
asserts. This was essentially authored by Nick Lewycky, and merely typed
in by myself. Let me know if this is still missing the mark, but the
previous test only succeeded due to the improper quoting preventing
*anything* from matching the grep -- it had a '4(%...)' sequence in the
output!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133980 91177308-0d34-0410-b5e6-96231b3b80d8
When the destination operand is the same as the first source register
operand for arithmetic instructions, the destination operand may be omitted.
For example, the following two instructions are equivalent:
and r1, #ff
and r1, r1, #ff
rdar://9672867
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133973 91177308-0d34-0410-b5e6-96231b3b80d8
Correctly parse the forms of the Thumb mov-immediate instruction:
1. 8-bit immediate 0-255.
2. 12-bit shifted-immediate.
The 16-bit immediate "movw" form is also legal with just a "mov" mnemonic,
but is not yet supported. More parser logic necessary there due to fixups.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133966 91177308-0d34-0410-b5e6-96231b3b80d8
Add aliases for the vpush/vpop mnemonics to the VFP load/store multiple
writeback instructions w/ SP as the base pointer.
rdar://9683231
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133932 91177308-0d34-0410-b5e6-96231b3b80d8
When the destination operand is the same as the first source register
operand for arithmetic instructions, the destination operand may be omitted.
For example, the following two instructions are equivalent:
sub r2, r2, #6
sub r2, #6
rdar://9682597
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133925 91177308-0d34-0410-b5e6-96231b3b80d8
Also fix some of the tests that were actually testing wrong behavior -
An input operand in {st} is only popped by the inline asm when {st} is
also in the clobber list.
The original bug reports all had ~{st} clobbers as they should.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133916 91177308-0d34-0410-b5e6-96231b3b80d8
alloca that only holds a copy of a global and we're going to replace the users
of the alloca with that global, just nuke the lifetime intrinsics. Part of
PR10121.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133905 91177308-0d34-0410-b5e6-96231b3b80d8
The .b8 operations in PTX are far more limiting than I first thought. The mov operation isn't even supported, so there's no way of converting a .pred value into a .b8 without going via .b16, which is
not sensible. An improved implementation needs to use the fact that loads and stores automatically extend and truncate to implement support for EXTLOAD and TRUNCSTORE in order to correctly support
boolean values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133873 91177308-0d34-0410-b5e6-96231b3b80d8
The i8 type is required for boolean values, but can only use ld, st and mov instructions. The i1 type continues to be used for predicates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133814 91177308-0d34-0410-b5e6-96231b3b80d8
instructions can be used to match combinations of multiply/divide and VCVT
(between floating-point and integer, Advanced SIMD). Basically the VCVT
immediate operand that specifies the number of fraction bits corresponds to a
floating-point multiply or divide by the corresponding power of 2.
For example, VCVT (floating-point to fixed-point, Advanced SIMD) can replace a
combination of VMUL and VCVT (floating-point to integer) as follows:
Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>):
vmul.f32 d16, d17, d16
vcvt.s32.f32 d16, d16
becomes:
vcvt.s32.f32 d16, d16, #3
Similarly, VCVT (fixed-point to floating-point, Advanced SIMD) can replace a
combinations of VCVT (integer to floating-point) and VDIV as follows:
Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>):
vcvt.f32.s32 d16, d16
vdiv.f32 d16, d17, d16
becomes:
vcvt.f32.s32 d16, d16, #3
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133813 91177308-0d34-0410-b5e6-96231b3b80d8
enables SelectionDAG::getLoad at MipsISelLowering.cpp:1914 to return a
pre-existing node instead of redundantly create a new node every time it is
called.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133811 91177308-0d34-0410-b5e6-96231b3b80d8
parameters if SM >= 2.0
- Update test cases to be more robust against register allocation changes
- Bump up the number of registers to 128 per type
- Include Python script to re-generate register file with any number of
registers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133736 91177308-0d34-0410-b5e6-96231b3b80d8
Take #2. Don't piggyback on the existing config.build_mode. Instead,
define a new lit feature for each build feature we need (currently
just "asserts"). Teach both autoconf'd and cmake'd Makefiles to define
this feature within test/lit.site.cfg. This doesn't require any lit
harness changes and should be more robust across build systems.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133664 91177308-0d34-0410-b5e6-96231b3b80d8
to emit "movd" across the board to continue supporting a Darwin assembler bug.
This is the reincarnation of r133452.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133565 91177308-0d34-0410-b5e6-96231b3b80d8
1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8)
=> (bswap x) >> 16
2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8))
=> (rotl (bswap x) 16)
This allows us to eliminate most of the def : Pat patterns for ARM rev16
revsh instructions. It catches many more cases for ARM and x86.
rdar://9609108
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133503 91177308-0d34-0410-b5e6-96231b3b80d8
ops.
This is a rewrite of the IV simplification algorithm used by
-disable-iv-rewrite. To avoid perturbing the default mode, I
temporarily split the driver and created SimplifyIVUsersNoRewrite. The
idea is to avoid doing opcode/pattern matching inside
IndVarSimplify. SCEV already does it. We want to optimize with the
full generality of SCEV, but optimize def-use chains top down on-demand rather
than rewriting the entire expression bottom-up. This was easy to do
for operations that SCEV can prove are identity function. So we're now
eliminating bitmasks and zero extends this way.
A result of this rewrite is that indvars -disable-iv-rewrite no longer
requires IVUsers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133502 91177308-0d34-0410-b5e6-96231b3b80d8
source vector type is to be split while the target vector is to be promoted.
(eg: <4 x i64> -> <4 x i8> )
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133424 91177308-0d34-0410-b5e6-96231b3b80d8
In cases such as the attached test, where the case value for a switch
destination is used in a phi node that follows the destination, it
might be better to replace that value with the condition value of the
switch, so that more blocks can be folded away with
TryToSimplifyUncondBranchFromEmptyBlock because there are less
conflicts in the phi node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133344 91177308-0d34-0410-b5e6-96231b3b80d8
type's bitwidth matches the (allocated) size of the alloca. This severely
pessimizes vector scalar replacement when the only vector type being used is
something like <3 x float> on x86 or ARM whose allocated size matches a
<4 x float>.
I hope to fix some of the flawed assumptions about allocated size throughout
scalar replacement and reenable this in most cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133338 91177308-0d34-0410-b5e6-96231b3b80d8
for pre-2.9 bitcode files. We keep x86 unaligned loads, movnt, crc32, and the
target indep prefetch change.
As usual, updating the testsuite is a PITA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133337 91177308-0d34-0410-b5e6-96231b3b80d8
range without a libcall to a new mulo<mode> libcall
that we'd have to create.
Finishes the rest of rdar://9090077 and rdar://9210061
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133318 91177308-0d34-0410-b5e6-96231b3b80d8
* rounding modes for fp add, mul, sub now use .rn
* float -> int rounding correctly uses .rzi not .rni
* 32bit fdiv for sm13 uses div.rn (instead of div.approx)
* 32bit fdiv for sm10 now uses div (instead of div.approx)
Approx is not IEEE 754 compatible (and should be optionally set by a flag to the backend instead). The .rn rounding modifier is the PTX default anyway, but it's better to be explicit.
All these modifiers should be available by using __fmul_rz functions for example, but support will need to be added for this in the backend.
Patch by Dan Bailey
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133253 91177308-0d34-0410-b5e6-96231b3b80d8
In Thumb mode we cannot handle GPR virtual registers, even though some
instructions can. When isel is lowering a CopyFromReg, it should limit
itself to subclasses of getRegClassFor(VT).
<rdar://problem/9624323>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133210 91177308-0d34-0410-b5e6-96231b3b80d8
REQUIRES: Asserts
REQUIRES: Debug
This required chaining test configuration properties. It seems like a
generally good thing to do.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133131 91177308-0d34-0410-b5e6-96231b3b80d8
accumulator forwarding. Specifically (from SVN log entry):
Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier
accumulator forwarding:
vadd d3, d0, d1
vmul d3, d3, d2
=>
vmul d3, d0, d2
vmla d3, d1, d2
Make sure it catches cases where operand 1 is add/fadd/sub/fsub, which was
intended in the original revision.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133127 91177308-0d34-0410-b5e6-96231b3b80d8
optimizations when emitting calls to the function; instead those calls may
use faster relocations which require the function to be immediately resolved
upon loading the dynamic object featuring the call. This is useful when it
is known that the function will be called frequently and pervasively and
therefore there is no merit in delaying binding of the function.
Currently only implemented for x86-64, where it turns into a call through
the global offset table.
Patch by Dan Gohman, who assures me that he's going to add LangRef documentation
for this once it's committed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133080 91177308-0d34-0410-b5e6-96231b3b80d8
Note that this actually changes code generation, and someone who
understands this target better should check the changes.
- R12Q is now allocatable. I think it was omitted from the allocation
order by mistake since it isn't reserved. It as apparently used as a
GOT pointer sometimes, and it should probably be reserved if that is
the case.
- The GR64 registers are allocated in a different order now. The
register allocator will automatically put the CSRs last. There were
other changes to the order that may have been significant.
The test fix is because r0 and r1 swapped places in the allocation order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133067 91177308-0d34-0410-b5e6-96231b3b80d8
the bits being cleared by the AND are not demanded by the BFI.
The previous BFI dag combine rule was actually incorrect (or used to be
correct until BFI representation changed).
rdar://9609030
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133034 91177308-0d34-0410-b5e6-96231b3b80d8
converted to add x,x if x is a undef. add undef, undef does not guarantee
that the resulting low order bit is zero.
Fixes <rdar://problem/9453156> and <rdar://problem/9487392>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133022 91177308-0d34-0410-b5e6-96231b3b80d8
types (with power of two types such as 8,16,32 .. 512).
Fix a bug in the integer promotion of bitcast nodes. Enable integer expanding
only if the target of the conversion is an integer (when the type action is
scalarize).
Add handling to the legalization of vector load/store in cases where the saved
vector is integer-promoted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132985 91177308-0d34-0410-b5e6-96231b3b80d8
cache prefetch and now that the info from "prefetch" to "ARMPreload" is present,
only add a testcase for PLI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132978 91177308-0d34-0410-b5e6-96231b3b80d8
sharp all or nothing transition when one extra predecessor was added. Now
we still test first ones for merging.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132974 91177308-0d34-0410-b5e6-96231b3b80d8
might overflow. Re-typing the alloca to a larger type (e.g. double)
hoists a shift into the alloca, potentially exposing overflow in the
expression. rdar://problem/9265821
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132926 91177308-0d34-0410-b5e6-96231b3b80d8
In particular, don't spill dirty registers only to satisfy a hint. It is
not worth it.
The attached test case provides an example where the fast allocator
would spill a register when other registers are available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132900 91177308-0d34-0410-b5e6-96231b3b80d8
we try to branch to them.
Before we were creating successor lists with duplicated entries. Fixing that
found a bug in isBlockOnlyReachableByFallthrough that would causes it to
return the wrong answer for
-----------
...
jne foo
jmp bar
foo:
----------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132882 91177308-0d34-0410-b5e6-96231b3b80d8