Commit Graph

71441 Commits

Author SHA1 Message Date
Matt Arsenault
ba86db191d R600: Implement enableClusterLoads()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213831 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 02:10:17 +00:00
Kevin Qin
2daff76c05 [AArch64] Fix a bug generating incorrect instruction when building small vector.
This bug is introduced by r211144. The element of operand may be
smaller than the element of result, but previous commit can
only handle the contrary condition. This commit is to handle this
scenario and generate optimized codes like ZIP1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213830 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 02:05:42 +00:00
Jiangning Liu
1bc34d71b7 [AArch64] Disable some optimization cases for type conversion from sint to fp, because those optimization cases are micro-architecture dependent and only make sense for Cyclone. A new predicate Cyclone is introduced in .td file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 01:29:59 +00:00
Filipe Cabecinhas
d28cfd150b Fixed PR20411 - bug in getINSERTPS()
When we had a vector_shuffle where we had an input from each vector, we
could miscompile it because we were assuming the input from V2 wouldn't
be moved from where it was on the vector.

Added a test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213826 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 01:28:21 +00:00
Manman Ren
4cae9cb034 SimplifyCFG: fix a bug in switch to table conversion
We use gep to access the global array "switch.table", and the table index
should be treated as unsigned. When the highest bit is 1, this commit
zero-extends the index to an integer type with larger size.

For a switch on i2, we used to generate:
%switch.tableidx = sub i2 %0, -2
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i2 %switch.tableidx

It is incorrect when %switch.tableidx is 2 or 3. The fix is to generate
%switch.tableidx = sub i2 %0, -2
%switch.tableidx.zext = zext i2 %switch.tableidx to i3
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i3 %switch.tableidx.zext

rdar://17735071


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213815 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 23:13:23 +00:00
Rafael Espindola
48b8c128df Fix the build when building with only the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213814 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 22:54:28 +00:00
Eric Christopher
0bebd31dce Fix indenting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213811 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 22:34:13 +00:00
Eric Christopher
ba08e46435 Reorganize and simplify local variables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213809 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 22:27:10 +00:00
Rafael Espindola
34c5b5b952 Finish inverting the MC -> Object dependency.
There were still some disassembler bits in lib/MC, but their use of Object
was only visible in the includes they used, not in the symbols.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213808 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 22:26:07 +00:00
Eric Christopher
3322b7eef1 Remove the query for TargetMachine and TargetInstrInfo since we're
already inside TargetInstrInfo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213806 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 22:12:03 +00:00
David Blaikie
ccd1035ad4 ArgPromo+DebugInfo: Handle updating debug info over multiple applications of argument promotion.
While the subprogram map cache used by Dead Argument Elimination works
there, I made a mistake when reusing it for Argument Promotion in
r212128 because ArgPromo may transform functions more than once whereas
DAE transforms each function only once, removing all the dead arguments
in one go.

To address this, ensure that the map is updated after each argument
promotion.

In retrospect it might be a little wasteful to create a map of all
subprograms when only handling a single CGSCC, but the alternative is
walking the debug info for each function in the CGSCC that gets updated.
It's not clear to me what the right tradeoff is there, but since the
current tradeoff seems to be working OK (and the code to keep things
updated is very cheap), let's stick with that for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213805 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 22:09:29 +00:00
Jim Grosbach
adb6a3649e [X86,AArch64] Extend vcmp w/ unary op combine to work w/ more constants.
The transform to constant fold unary operations with an AND across a
vector comparison applies when the constant is not a splat of a scalar
as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213800 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:41:43 +00:00
Jim Grosbach
4070037e3c X86: restrict combine to when type sizes are safe.
The folding of unary operations through a vector compare and mask operation
is only safe if the unary operation result is of the same size as its input.
For example, it's not safe for [su]itofp from v4i32 to v4f64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:41:38 +00:00
Jim Grosbach
df48c93ef0 DAG: fp->int conversion for non-splat constants.
Constant fold the lanes of the input constant build_vector individually
so we correctly handle when the vector elements are not all the same
constant value.

PR20394

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213798 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:41:31 +00:00
Justin Holewinski
ce64168930 [NVPTX] Silence a GCC warning found by the buildbots
The cast to NVPTXTargetLowering was missing a 'const', but let's
just access the right pointer through the subtarget anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213793 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:23:47 +00:00
Mark Heffernan
d55c7c7f42 Do not add unroll disable metadata after unrolling pass for loops with #pragma clang loop unroll(full).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213789 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:05:44 +00:00
Juergen Ributzka
4fa6ecc26f [FastISel][AArch64] Fix return type in FastLowerCall.
I used the wrong method to obtain the return type inside FinishCall. This fix
simply uses the return type from FastLowerCall, which we already determined to
be a valid type.

Reduced test case from Chad. Thanks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213788 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:03:13 +00:00
Justin Holewinski
03f160f9d3 [NVPTX] mul.wide generation works for any smaller integer source types, not just the next smaller power of two
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 18:46:03 +00:00
Saleem Abdulrasool
e2c63ff3c9 AsmParser: remove deprecated LLIR support
linker_private and linker_private_weak were deprecated in 3.5.  Remove support
for them now that the 3.5 branch has been created.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213777 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 18:09:31 +00:00
Saleem Abdulrasool
58ba15dbe6 ExecutionEngine: remove a stray semicolon
Detected via GCC 4.8 [-Wpedantic].

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 18:09:28 +00:00
Justin Holewinski
6b50a7d179 [NVPTX] Make sure we do not generate MULWIDE ISD nodes when optimizations are disabled
With optimizations disabled, we disable the isel patterns for mul.wide; but we
were still generating MULWIDE ISD nodes.  Now, we only try to generate MULWIDE
ISD nodes in DAGCombine if the optimization level is not zero.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213773 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 17:40:45 +00:00
Mark Heffernan
e8d7ebcd5a In unroll pragma syntax and loop hint metadata, change "enable" forms to a new form using the string "full".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213772 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 17:31:37 +00:00
Chad Rosier
67c325e9f0 [AArch64] Lower sdiv x, pow2 using add + select + shift.
The target-independent DAGcombiner will generate:
asr w1, X, #31 w1 = splat sign bit.
add X, X, w1, lsr #28 X = X + 0 or pow2-1
asr w0, X, asr #4 w0 = X/pow2

However, the add + shifts is expensive, so generate:
add w0, X, 15 w0 = X + pow2-1
cmp X, wzr X - 0
csel X, w0, X, lt X = (X < 0) ? X + pow2-1 : X;
asr w0, X, asr 4 w0 = X/pow2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213758 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 14:57:52 +00:00
Robert Khasanov
3922da8ae8 [SKX] Enabling mask instructions: encoding, lowering
KMOVB, KMOVW, KMOVD, KMOVQ, KNOTB, KNOTW, KNOTD, KNOTQ

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213757 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 14:49:42 +00:00
Tim Northover
afb1938c39 ARM: spot SBFX-compatbile code expressed with sign_extend_inreg
We were assuming all SBFX-like operations would have the shl/asr form, but
often when the field being extracted is an i8 or i16, we end up with a
SIGN_EXTEND_INREG acting on a shift instead. Simple enough to check for though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213754 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 13:59:12 +00:00
Tim Northover
52aa80b068 ARM: add patterns for [su]xta[bh] from just a shift.
Although the final shifter operand is a rotate, this actually only matters for
the half-word extends when the amount == 24. Otherwise folding a shift in is
just as good.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213753 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 13:59:07 +00:00
James Molloy
ed10064699 Enable partial libcall inlining for all targets by default.
This pass attempts to speculatively use a sqrt instruction if one exists on the target, falling back to a libcall if the target instruction returned NaN.

This was enabled for MIPS and System-Z, but is well guarded and is good for most targets - GCC does this for (that I've checked) X86, ARM and AArch64.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213752 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 13:33:00 +00:00
Tilmann Scheller
3b867c9c8e [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STRB instructions.
The ARM ARM prohibits STRB instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STRB instructions with unpredictable behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213750 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 13:03:47 +00:00
Tim Northover
8b6257629a AArch64: remove "arm64_be" support in favour of "aarch64_be".
There really is no arm64_be: it was a useful fiction to test big-endian support
while both backends existed in parallel, but now the only platform that uses
the name (iOS) doesn't have a big-endian variant, let alone one called
"arm64_be".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213748 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 12:58:11 +00:00
Tilmann Scheller
0f7b2e9db0 [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STR instructions.
The ARM ARM prohibits STR instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STR instructions with unpredictable behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213745 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 12:38:17 +00:00
Tim Northover
9231148e69 AArch64: remove arm64 triple enumerator.
Having both Triple::arm64 and Triple::aarch64 is extremely confusing, and
invites bugs where only one is checked. In reality, the only legitimate
difference between the two (arm64 usually means iOS) is also present in the OS
part of the triple and that's what should be checked.

We still parse the "arm64" triple, just canonicalise it to Triple::aarch64, so
there aren't any LLVM-side test changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213743 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 12:32:47 +00:00
Andrea Di Biagio
bf0fb36d72 Revert r211771. It was: "[X86] Improve the selection of SSE3/AVX addsub instructions".
This chang fully reverts r211771.
That revision added a canonicalization rule which has the potential to causes a
combine-cycle in the target-independent canonicalizing DAG combine.

The plan is to move the logic that forms target specific addsub nodes as part of
the lowering of shuffles.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213736 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 11:20:24 +00:00
Tilmann Scheller
fbf7b85869 [ARM] Add earlyclobber constraint to pre/post-indexed ARM STRH instructions.
The post-indexed instructions were missing the constraint, causing unpredictable STRH instructions to be emitted.

The earlyclobber constraint on the pre-indexed STR instructions is not strictly necessary, as the instruction selection for pre-indexed STR instructions goes through an additional layer of pseudo instructions which have the constraint defined, however it doesn't hurt to specify the constraint directly on the pre-indexed instructions as well, since at some point someone might create instances of them programmatically and then the constraint is definitely needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213729 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 08:12:51 +00:00
Chandler Carruth
a6425604c2 [SDAG] Make the DAGCombine worklist not grow endlessly due to duplicate
insertions.

The old behavior could cause arbitrarily bad memory usage in the DAG
combiner if there was heavy traffic of adding nodes already on the
worklist to it. This commit switches the DAG combine worklist to work
the same way as the instcombine worklist where we null-out removed
entries and only add new entries to the worklist. My measurements of
codegen time shows slight improvement. The memory utilization is
unsurprisingly dominated by other factors (the IR and DAG itself
I suspect).

This change results in subtle, frustrating churn in the particular order
in which DAG combines are applied which causes a number of minor
regressions where we fail to match a pattern previously matched by
accident. AFAICT, all of these should be using AddToWorklist to directly
or should be written in a less brittle way. None of the changes seem
drastically bad, and a few of the changes seem distinctly better.

A major change required to make this work is to significantly harden the
way in which the DAG combiner handle nodes which become dead
(zero-uses). Previously, we relied on the ability to "priority-bump"
them on the combine worklist to achieve recursive deletion of these
nodes and ensure that the frontier of remaining live nodes all were
added to the worklist. Instead, I've introduced a routine to just
implement that precise logic with no indirection. It is a significantly
simpler operation than that of the combiner worklist proper. I suspect
this will also fix some other problems with the combiner.

I think the x86 changes are really minor and uninteresting, but the
avx512 change at least is hiding a "regression" (despite the test case
being just noise, not testing some performance invariant) that might be
looked into. Not sure if any of the others impact specific "important"
code paths, but they didn't look terribly interesting to me, or the
changes were really minor. The consensus in review is to fix any
regressions that show up after the fact here.

Thanks to the other reviewers for checking the output on other
architectures. There is a specific regression on ARM that Tim already
has a fix prepped to commit.

Differential Revision: http://reviews.llvm.org/D4616

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213727 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 07:08:53 +00:00
Nick Lewycky
1bda2d9386 We may visit a call that uses an alloca multiple times in callUsesLocalStack, sometimes with IsNocapture true and sometimes with IsNocapture false. We accidentally skipped work we needed to do in the IsNocapture=false case if we were called with IsNocapture=true the first time. Fixes PR20405!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213726 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 06:24:49 +00:00
NAKAMURA Takumi
a63fc16e5f RuntimeDyldMachOAArch64.h: Fix a warning. [-Wunused-variable]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213710 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 00:17:44 +00:00
Lang Hames
8c3156eb82 [MCJIT] Make stub_addr functionality in RuntimeDyldChecker work in release mode.
There's no reason to restrict this particular piece of RuntimeDyldChecker
functionality to +Asserts builds.

This should fix failures in MachO_x86-64_PIC_relocations.s on release bots.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213708 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 23:50:51 +00:00
Lang Hames
f072ab78ee [MCJIT] Teach RuntimeDyldChecker to handle underscores at the start of symbols.
RuntimeDyldChecker had been testing isalpha(Expr[0]) to recognise symbol tokens,
and throwing unrecognized token errors when it hit symbols with leading
underscores. This fixes that.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213706 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 23:17:21 +00:00
Juergen Ributzka
7edf396977 [FastIsel][AArch64] Add support for the FastLowerCall and FastLowerIntrinsicCall target-hooks.
This commit modifies the existing call lowering functions to be used as the
FastLowerCall and FastLowerIntrinsicCall target-hooks instead.

This enables patchpoint intrinsic lowering for AArch64.

This fixes <rdar://problem/17733076>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213704 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 23:14:58 +00:00
Lang Hames
fa8abbd9be [MCJIT] Improve stub_addr file-not-found diagnostic to help track down a
buildbot failure.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213701 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 23:07:52 +00:00
Lang Hames
daf061cf05 [MCJIT] Refactor and add stub inspection to the RuntimeDyldChecker framework.
This patch introduces a 'stub_addr' builtin that can be used to find the address
of the stub for a given (<file>, <section>, <symbol>) tuple. This address can be
used both to verify the contents of stubs (by loading from the returned address)
and to verify references to stubs (by comparing against the returned address).

Example (1) - Verifying stub contents:

Load 8 bytes (assuming a 64-bit target) from the stub for 'x' in the __text
section of f.o, and compare that value against the addres of 'x'.

# rtdyld-check: *{8}(stub_addr(f.o, __text, x) = x

Example (2) - Verifying references to stubs:

Decode the immediate of the instruction at label 'l', and verify that it's
equal to the offset from the next instruction's PC to the stub for 'y' in the
__text section of f.o (i.e. it's the correct PC-rel difference).

# rtdyld-check: decode_operand(l, 4) = stub_addr(f.o, __text, y) - next_pc(l)
l:
        movq    y@GOTPCREL(%rip), %rax

Since stub inspection requires cooperation with RuntimeDyldImpl this patch
pimpl-ifies RuntimeDyldChecker. Its implementation is moved in to a new class,
RuntimeDyldCheckerImpl, that has access to the definition of RuntimeDyldImpl.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213698 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 22:47:39 +00:00
Juergen Ributzka
d3e2d81592 Appease the buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213694 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 22:02:19 +00:00
Juergen Ributzka
04e8cc79cd [RuntimeDyld][MachO][AArch64] Add a helper function for encoding addends in instructions.
Factor out the addend encoding into a helper function and simplify the
processRelocationRef.

Also add a few simple rtdyld tests. More tests to come once GOTs can be tested too.

Related to <rdar://problem/17768539>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213689 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 21:42:55 +00:00
Juergen Ributzka
6f2f090b06 [RuntimeDyld][MachO][AArch64] Implement the decodeAddend method.
This adds the required functionality to decode the immediate encoded in an
instruction that is referenced in a relocation entry.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213688 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 21:42:51 +00:00
Juergen Ributzka
214c554b64 [RuntimeDyld][MachO][AArch64] Add assertion to check for duplicate addend definition.
In MachO for AArch64 it is possible to have an explicit addend defined by
the ARM64_RELOC_ADDEND relocation or having an addend encoded within the
instruction. Only one of them are allowed per relocation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213687 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 21:42:49 +00:00
Juergen Ributzka
5b50a3c769 [RuntimeDyld] Change the return type of decodeAddend to match the storage type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213686 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 21:42:46 +00:00
Suyog Sarda
c9ea25fc51 This patch implements optimization as mentioned in PR19753: Optimize comparisons with "ashr/lshr exact" of a constanst.
It handles the errors which were seen in PR19958 where wrong code was being emitted due to earlier patch.
Added code for lshr as well as non-exact right shifts.

It implements : 
(icmp eq/ne (ashr/lshr const2, A), const1)" ->
(icmp eq/ne A, Log2(const2/const1)) ->
(icmp eq/ne A, Log2(const2) - Log2(const1))

Differential Revision: http://reviews.llvm.org/D4068
 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213678 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 19:19:36 +00:00
Suyog Sarda
3326ee444a Added InstCombine transform for pattern "(A & B) ^ (A ^ B) -> (A | B)"
Patch idea by Ankit Jain !

Differential Revision: http://reviews.llvm.org/D4618



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213677 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 18:30:54 +00:00
Suyog Sarda
1a1b1f708d Added InstCombine Transform for patterns:
"((~A & B) | A) -> (A | B)" and "((A & B) | ~A) -> (~A | B)"

Original Patch credit to Ankit Jain !!

Differential Revision: http://reviews.llvm.org/D4591



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213676 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 18:09:41 +00:00
Alexey Samsonov
f969d5b86b [ASan] Fix comments about __sanitizer_cov function
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 17:46:09 +00:00
Hal Finkel
b3b2aac5be Make use of the align parameter attribute for all pointer arguments
We previously supported the align attribute on all (pointer) parameters, but we
only used it for byval parameters. However, it is completely consistent at the
IR level to treat 'align n' on all pointer parameters as an alignment
assumption on the pointer, and now we wll. Specifically, this causes
computeKnownBits to use the align attribute on all pointer parameters, not just
byval parameters. I've also added an explicit parameter attribute test for this
to test/Bitcode/attributes.ll.

And I've updated the LangRef to document the align parameter attribute (as it
turns out, it was not documented at all previously, although the byval
documentation mentioned that it could be used).

There are (at least) two benefits to doing this:
 - It allows enhancing alignment based on the pointer alignment after inlining callees.
 - It allows simplification of pointer arithmetic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213670 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 16:58:55 +00:00
Tim Northover
50f2f1434c X86: drop relocations on __eh_frame sections globally.
Without this, we produce non-extern relocations when targeting older OS X
versions that ld64 can't cope with in the particular context of __eh_frame
sections (who'd want generic relocation-processing anyway?).

This means that an updated linker (ld64 from Xcode 3.2.6 or later) may be
needed when targeting such platforms with a modern version of LLVM, but this is
probably the case anyway and a reasonable requirement.

PR20212, rdar://problem/17544795

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213665 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 15:47:09 +00:00
Suyog Sarda
578c74e35d This patch implements transform for pattern "(A | B) ^ (~A) -> (A | ~B)".
Patch Credit to Ankit Jain !!

Differential Revision: http://reviews.llvm.org/D4588



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213662 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 15:37:39 +00:00
Sasa Stankovic
f8b83e39c5 [mips] Fix two patterns that select i32's (for MIPS32r6) / i64's (for MIPS64r6)
from setne comparison with an i32.

The patterns that are fixed:
  * (select (i32 (setne i32, immZExt16)), i32, i32) (for MIPS32r6)
  * (select (i32 (setne i32, immZExt16)), i64, i64) (for MIPS64r6)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213653 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 13:36:02 +00:00
Elena Demikhovsky
bf348c4e46 AVX-512: Fixed intrinsic of VSQRTPS/PD instructions.
I set number and types of parameters according to GCC intrinsics.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 11:07:31 +00:00
Sanjay Patel
8e80aa5d5f fixed typo in comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213614 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 04:57:06 +00:00
Chandler Carruth
3f1ae7f58c [SDAG] Refactor the code for inserting a newly allocated SDNode into the
DAG into a helper function.

This adds a trip through the (very minimal) verification logic in
a bunch of places that were missing it, but shouldn't have any other
impact outside of refactoring. I'm hoping to use this to do more clever
things when DAG nodes are inserted into the graph.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213612 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 04:07:55 +00:00
Chandler Carruth
3c249a5a67 [SDAG] Remove a giant pile of asserts that may have helped track down
a bug in 2010 when they were added but are adding no value today.

In fact, they are utter lies. NodeAllocator is used to allocate almost
all of these node types. I don't know what we were trying to assert
here, and the docs don't give any answer. Until we once again stumble
upon a bug needing help, let's clear the path for improvements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213610 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 04:03:22 +00:00
Mark Heffernan
bc7f1aba2d Rename metadata llvm.loop.vectorize.unroll to llvm.loop.vectorize.interleave.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213588 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 23:11:03 +00:00
Hal Finkel
9da65a8644 Match semantics of PointerMayBeCapturedBefore to its name by default
As it turns out, the capture tracker named CaptureBefore used by AA, and now
available via the PointerMayBeCapturedBefore function, would have been
more-aptly named CapturedBeforeOrAt, because it considers captures at the
instruction provided. This is not always what one wants, and it is difficult to
get the strictly-before behavior given only the current interface. This adds an
additional parameter which controls whether or not you want to include
captures at the provided instruction. The default is not to include the
instruction provided, so that 'Before' matches its name.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213582 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 21:30:22 +00:00
David Blaikie
529165298e Revert "Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information."
This reverts commit r212649 while I investigate/reduce/etc PR20367.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213581 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 20:45:59 +00:00
Saleem Abdulrasool
7ed655da8c R600: silence GCC warning
GCC believes it may be possible to not return a value from the switch:
  lib/Target/R600/SIRegisterInfo.cpp:187:1: warning: control reaches end of non-void function [-Wreturn-type]

Add an unreachable label to indicate that this is not possible and still permit
switch coverage checking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213572 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:52:00 +00:00
Tom Stellard
163d8ce61f R600/SI: Refactor VOP3 instruction definitions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213571 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:44:29 +00:00
Tom Stellard
3ee2c33655 R600/SI: Separate encoding and operand definitions into their own classes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213570 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:44:28 +00:00
Logan Chien
8c4cf40507 Replace the result usages while legalizing cmpxchg.
We should update the usages to all of the results;
otherwise, we might get assertion failure or SEGV during
the type legalization of ATOMIC_CMP_SWAP_WITH_SUCCESS
with two or more illegal types.

For example, in the following sequence, both i8 and i1
might be illegal in some target, e.g. armv5, mipsel, mips64el,

    %0 = cmpxchg i8* %ptr, i8 %desire, i8 %new monotonic monotonic
    %1 = extractvalue { i8, i1 } %0, 1

Since both i8 and i1 should be legalized, the corresponding
ATOMIC_CMP_SWAP_WITH_SUCCESS dag will be checked/replaced/updated
twice.

If we don't update the usage to *ALL* of the results in the
first round, the DAG for extractvalue might be processed earlier.
The GetPromotedInteger() will result in assertion failure,
because its operand (i.e. the success bit of cmpxchg) is not
promoted beforehand.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213569 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:33:44 +00:00
Tom Stellard
d7858afe79 R600/SI: Initailize encoding fields of unused VOP3 modifiers to 0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213564 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:12:40 +00:00
Tom Stellard
0794af86a1 R600/SI: Initialize unused VOP3 sources to 0 instead of SIOperand.ZERO
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213563 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:12:37 +00:00
Duncan P. N. Exon Smith
facdfc6781 Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges."
This reverts commit r213474 (and r213475), which causes a miscompile on
a stage2 LTO build.  I'll reply on the list in a moment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213562 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:06:51 +00:00
Tom Stellard
9787e8c76b R600/SI: Add instruction shrinking pass
This pass converts 64-bit instructions to 32-bit when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213561 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 16:55:33 +00:00
Tom Stellard
df99a7f5dc R600/SI: VOPC instructions explicitly define VCC
Therefore we don't need to add it to the implict defs list.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213558 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 16:27:24 +00:00
David Blaikie
95689f0845 Correct the ownership passing semantics of object::createBinary and make them explicit in the type system.
createBinary documented that it destroyed the parameter in error cases,
though by observation it does not. By passing the unique_ptr by value
rather than lvalue reference, callers are now explicit about passing
ownership and the function implements the documented contract. Remove
the explicit documentation, since now the behavior cannot be anything
other than what was documented, so it's redundant.

Also drops a unique_ptr::release in llvm-nm that was always run on a
null unique_ptr anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213557 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 16:26:24 +00:00
David Blaikie
d8fa9295c8 Remove unnecessary use of unique_ptr::release() used to construct another unique_ptr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213556 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 16:23:21 +00:00
David Blaikie
0c466c88d4 Remove unused variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213554 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 16:13:24 +00:00
Tom Stellard
05388f25d7 R600/SI: Clean up some of the unused REGISTER_{LOAD,STORE} code
There are a few more cleanups to do, but I ran into some problems
with ext loads and trunc stores, when I tried to change some of the
vector loads and stores from custom to legal, so I wasn't able to
get rid of everything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213552 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:45:06 +00:00
Tom Stellard
3280804237 R600/SI: Use scratch memory for large private arrays
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213551 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:45:01 +00:00
Tom Stellard
c912b101d2 R600/SI: Specify wavefront size for SI and CI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213550 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:44:58 +00:00
Tom Stellard
59b8363f8a R600/SI: Remove vaddr operand from BUFFER_LOAD_*_OFFSET instructions
This operand is never used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213549 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:44:55 +00:00
Daniel Sanders
2479def756 [mips] Do not emit '.module fp=...' unless we really need to.
We now emit this value when we need to contradict the default value. This
restores support for binutils 2.24.

When a suitable binutils has been released we can resume unconditionally
emitting .module directives. This is preferable to omitting the .module
directives since the .module directives protect against, for example,
accidentally assembling FP32 code with -mfp64 and producing an unusuable object.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213548 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:25:24 +00:00
Robert Khasanov
aac33cfc08 [SKX] Enabling SKX target and AVX512BW, AVX512DQ, AVX512VL features.
Enabling HasAVX512{DQ,BW,VL} predicates.
Adding VK2, VK4, VK32, VK64 masked register classes.
Adding new types (v64i8, v32i16) to VR512.
Extending calling conventions for new types (v64i8, v32i16)

Patch by Zinovy Nis <zinovy.y.nis@intel.com>
Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:54:21 +00:00
Tom Stellard
b664d47cb0 R600/SI: Store constant initializer data in constant memory
This implements a solution for constant initializers suggested
by Vadim Girlin, where we store the data after the shader code
and then use the S_GETPC instruction to compute its address.

This saves use the trouble of creating a new buffer for constant data
and then having to pass the pointer to the kernel via user SGPRs or the
input buffer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213530 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:14 +00:00
Tom Stellard
b97240dba8 R600/SI: Add isCFDepth0 Predicate to SALU addc pattern
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213529 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:12 +00:00
Tom Stellard
54a2540fee R600/SI: Use VALU for i1 XOR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213528 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:10 +00:00
Tom Stellard
ad9769a118 R600/SI: Use a custom encoding method for simm16 in SOPP branch instructions
This allows us to explicitly define the type of fixup that is needed,
so we can distinguish this from future fixup types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213527 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:08 +00:00
Tom Stellard
efb733cbba R600/SI: Rename SOPP operands to match the encoding fields
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213526 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:05 +00:00
Daniel Sanders
6816d66d99 [mips] Add MipsOptionRecord abstraction and use it to implement .reginfo/.MIPS.options
This abstraction allows us to support the various records that can be placed in
the .MIPS.options section in the future. We currently use it to record register
usage information (the ODK_REGINFO record in our ELF64 spec).

Each .MIPS.options record should subclass MipsOptionRecord and provide an
implementation of EmitMipsOptionRecord.

Patch by Matheus Almeida and Toma Tabacu



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213522 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 13:30:55 +00:00
Hal Finkel
8db585afaa Move the CapturesBefore tracker from AA into CaptureTracking
There were two generally-useful CaptureTracker classes defined in LLVM: the
simple tracker defined in CaptureTracking (and made available via the
PointerMayBeCaptured utility function), and the CapturesBefore tracker
available only inside of AA. This change moves the CapturesBefore tracker into
CaptureTracking, generalizes it slightly (by adding a ReturnCaptures
parameter), and makes it generally available via a PointerMayBeCapturedBefore
utility function.

This logic will be needed, for example, to perform noalias function parameter
attribute inference.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213519 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 13:15:48 +00:00
Aaron Ballman
bb9fd2cbd4 Fixing an MSVC conversion warning about implicitly converting the shift results to 64-bits. No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213515 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 12:31:43 +00:00
Hal Finkel
43b125912e Move isIdentifiedFunctionLocal from BasicAA to AA
The ability to identify function locals will exist outside of BasicAA (for
example, logic for inferring noalias function arguments will need this), so
make this concept generally accessible without code duplication.

No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213514 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 12:27:23 +00:00
Daniel Sanders
34e658840e [mips] Try to fix the test/ExecutionEngine tests on a MIPS host.
Fix a dangerous default case that caused MipsCodeEmitter to discard pseudo
instructions it didn't recognize. It will now call llvm_unreachable() for
unrecognized pseudo's and explicitly handles PseudoReturn, PseudoReturn64,
PseudoIndirectBranch, PseudoIndirectBranch64, CFI_INSTRUCTION, IMPLICIT_DEF,
and KILL.

There may be other pseudos that need handling but this was enough for the
ExecutionEngine tests to pass on my test system.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213513 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 12:25:34 +00:00
Daniel Sanders
8ecdc6d34f [mips] Do not emit '.module [no]oddspreg' unless we really need to.
We now emit this directive when we need to contradict the default value (e.g.
-mno-odd-spreg is given) or an option changed the default value (e.g. -mfpxx
is given).

This restores support for the currently available head of binutils. However,
at this point binutils 2.24 is still not sufficient since it does not support
'.module fp=...'.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213511 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 10:45:47 +00:00
Tim Northover
f8d927f22b CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext
This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc,
and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation
path is now uniform, regardless of the input IR:

  fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall
  fpext -> FP16_TO_FP (if f16 illegal) -> libcall

Each target should be able to select the version that best matches its
operations and not be required to duplicate patterns for both fptrunc
and FP_TO_FP16 (for example).

As a result we can remove some redundant AArch64 patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213507 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 09:13:56 +00:00
Chandler Carruth
d4cde8670c [SDAG,cleanup] Switch the DAG combiner over to use the spelling
'Worklist' consistently rather than a deeply confusing mixture of
'WorkList' and 'Worklist'.

Notably, the very 'WorkList' of the DAG combiner was exposed to target
specific DAG combines under an interface 'AddToWorklist' which was
implemented by in turn calling 'AddToWorkList' in the combiner. This has
sent me circling with the wrong case in grep one too many times.

I chose to normalize on 'Worklist' because that one won the grep-vote
for llvm/lib/... by a hundered hits or so, and it is used in places
relatively "canonical" such as InstCombine's Worklist. Let's all jsut
pick this casing, whether "correct", "good", or "bad" and be
consistent...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213506 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 08:56:44 +00:00
Chandler Carruth
0c991ec57a [SDAG] Rather than using a narrow test against the one dummy node on the
stack, filter all handle nodes from the DAG combiner worklist.

This will also handle cases where other handle nodes might be
(erroneously) added to the worklist and then cause bugs and explosions
when deleted. For example, when running the legalizer within the DAG
combiner, there are times when other handle nodes are used and can end
up here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213505 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 08:32:31 +00:00
Andrea Di Biagio
19e39c6f12 [DAGCombiner] Improve the shuffle-vector folding logic.
Canonicalize shuffles according to rules:
 *  shuffle(A, shuffle(A, B)) -> shuffle(shuffle(A,B), A)
 *  shuffle(B, shuffle(A, B)) -> shuffle(shuffle(A,B), B)
 *  shuffle(B, shuffle(A, Undef)) -> shuffle(shuffle(A, Undef), B)

This patch helps identifying more shuffle pairs that could be combined reusing
the already existing rules in the DAGCombiner.

Added new test 'combine-vec-shuffle-5.ll' to verify that the canonicalized
shuffles are now folded into a single shuffle node by the DAGCombiner.
Added more test cases to 'combine-vec-shuffle-4.ll'.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213504 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 07:30:54 +00:00
Andrea Di Biagio
3d1975d44b [DAG] Refactor some logic. No functional change.
This patch removes function 'CommuteVectorShuffle' from X86ISelLowering.cpp
and moves its logic into SelectionDAG.cpp as method 'getCommutedVectorShuffles'.
This refactoring is in preperation of an upcoming change to the DAGCombiner.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213503 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 07:28:51 +00:00
Gerolf Hoflehner
10437f66fa Fix for regression: [Bug 20369] wrong code at -O3 on x86_64-linux-gnu in 64-bit mode
Prevents hoisting of loads above stores and sinking of stores below loads
in MergedLoadStoreMotion.cpp (rdar://15991737)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213497 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 03:02:46 +00:00
Ulrich Weigand
d4542a8cdc [PowerPC] ELFv2 aggregate passing support
This patch adds infrastructure support for passing array types
directly.  These can be used by the front-end to pass aggregate
types (coerced to an appropriate array type).  The details of the
array type being used inform the back-end about ABI-relevant
properties.  Specifically, the array element type encodes:
- whether the parameter should be passed in FPRs, VRs, or just
  GPRs/stack slots  (for float / vector / integer element types,
  respectively)
- what the alignment requirements of the parameter are when passed in
  GPRs/stack slots  (8 for float / 16 for vector / the element type
  size for integer element types) -- this corresponds to the
  "byval align" field

Using the infrastructure provided by this patch, a companion patch
to clang will enable two features:
- In the ELFv2 ABI, pass (and return) "homogeneous" floating-point
  or vector aggregates in FPRs and VRs (this is similar to the ARM
  homogeneous aggregate ABI)
- As an optimization for both ELFv1 and ELFv2 ABIs, pass aggregates
  that fit fully in registers without using the "byval" mechanism

The patch uses the functionArgumentNeedsConsecutiveRegisters callback
to encode that special treatment is required for all directly-passed
array types.  The isInConsecutiveRegs / isInConsecutiveRegsLast bits set
as a results are then used to implement the required size and alignment
rules in CalculateStackSlotSize / CalculateStackSlotAlignment etc.

As a related change, the ABI routines have to be modified to support
passing floating-point types in GPRs.  This is necessary because with
homogeneous aggregates of 4-byte float type we can now run out of FPRs
*before* we run out of the 64-byte argument save area that is shadowed
by GPRs.  Any extra floating-point arguments that no longer fit in FPRs
must now be passed in GPRs until we run out of those too.

Note that there was already code to pass floating-point arguments in
GPRs used with vararg parameters, which was done by writing the argument
out to the argument save area first and then reloading into GPRs.  The
patch re-implements this, however, in favor of code packing float arguments
directly via extension/truncation, BITCAST, and BUILD_PAIR operations.

This is required to support the ELFv2 ABI, since we cannot unconditionally
write to the argument save area (which the caller might not have allocated).
The change does, however, affect ELFv1 varags routines too; but even here
the overall effect should be advantageous: Instead of loading the argument
into the FPR, then storing the argument to the stack slot, and finally
reloading the argument from the stack slot into a GPR, the new code now
just loads the argument into the FPR, and subsequently loads the argument
into the GPR (via BITCAST).  That BITCAST might imply a save/reload from
a stack temporary (in which case we're no worse than before); but it
might be implemented more efficiently in some cases.

The final part of the patch enables up to 8 FPRs and VRs for argument
return in PPCCallingConv.td; this is required to support returning
ELFv2 homogeneous aggregates.  (Note that this doesn't affect other ABIs
since LLVM wil only look for which register to use if the parameter is
marked as "direct" return anyway.)

Reviewed by Hal Finkel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213493 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 00:13:26 +00:00
Ulrich Weigand
970c019d02 [PowerPC] ELFv2 explicit CFI for CR fields
This is a minor improvement in the ELFv2 ABI.   In ELFv1, DWARF CFI
would represent a saved CR word (holding CR fields CR2, CR3, and CR4)
using just a single CFI record refering to CR2.   In ELFv2 instead,
each of the CR fields is represented by its own CFI record.  The
advantage is that the compiler can now chose to save just a single
(or two) CR fields instead of all of them, if those are the only ones
that actually need saving.  That can lead to more efficient code using
mf(o)crf instead of the (slow) mfcr instruction.

Note that this patch does not (yet) implement this more efficient
code generation, but it does implement the part that is required to
be ABI compliant: creating multiple CFI records if multiple CR fields
are saved.

Reviewed by Hal Finkel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 00:03:18 +00:00
Ulrich Weigand
68b292b026 [PowerPC] ELFv2 dynamic loader support
This patch enables the new ELFv2 ABI in the runtime dynamic loader.
The loader has to implement the following features:
- In the ELFv2 ABI, do not look up a function descriptor in .opd, but
  instead use the local entry point when resolving a direct call.
- Update the TOC restore code to use the new TOC slot linkage area
  offset.
- Create PLT stubs appropriate for the ELFv2 ABI.

Note that this patch also adds common-code changes. These are necessary
because the loader must check the newly added ELF flags: the e_flags
header bits encoding the ABI version, and the st_other symbol table
entry bits encoding the local entry point offset.  There is currently
no way to access these, so I've added ObjectFile::getPlatformFlags and
SymbolRef::getOther accessors.

Reviewed by Hal Finkel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213491 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-20 23:53:14 +00:00
Ulrich Weigand
7fc5011e8d [PowerPC] ELFv2 stack space reduction
The ELFv2 ABI reduces the amount of stack required to implement an
ABI-compliant function call in two ways:
* the "linkage area" is reduced from 48 bytes to 32 bytes by
  eliminating two unused doublewords
* the 64-byte "parameter save area" is now optional and need not be
  present in certain cases (it remains mandatory in functions with
  variable arguments, and functions that have any parameter that is
  passed on the stack)

The following patch implements this required changes:
- reducing the linkage area, and associated relocation of the TOC save
  slot, in getLinkageSize / getTOCSaveOffset (this requires updating all
  callers of these routines to pass in the isELFv2ABI flag).
- (partially) handling the case where the parameter save are is optional

This latter part requires some extra explanation:  Currently, we still
always allocate the parameter save area when *calling* a function.
That is certainly always compliant with the ABI, but may cause code to
allocate stack unnecessarily.  This can be addressed by a follow-on
optimization patch.

On the *callee* side, in LowerFormalArguments, we *must* track
correctly whether the ABI guarantees that the caller has allocated
the parameter save area for our use, and the patch does so. However,
there is one complication: the code that handles incoming "byval"
arguments will currently *always* write to the parameter save area,
because it has to force incoming register arguments to the stack since
it must return an *address* to implement the byval semantics.

To fix this, the patch changes the LowerFormalArguments code to write
arguments to a freshly allocated stack slot on the function's own stack
frame instead of the argument save area in those cases where that area
is not present.

Reviewed by Hal Finkel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213490 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-20 23:43:15 +00:00