Commit Graph

7426 Commits

Author SHA1 Message Date
Akira Hatanaka
497204a94b [mips] Set isAsCheapAsAMove flag on ADDiu and DADDiu, which enables
re-materialization of immediate loads.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167153 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 18:37:55 +00:00
Akira Hatanaka
39fa606992 Test case for r167039. Check that tail-call optimization is disabled for
mips16.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167139 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 17:25:23 +00:00
Reed Kotler
9441125d63 Implement ADJCALLSTACKUP and ADJCALLSTACKDOWN
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167107 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 05:21:10 +00:00
Bill Schmidt
42d43351b2 This patch addresses an ABI compatibility issue with empty aggregate
parameters.  Examples of these are:

  struct { } a;
  union { } b[256];
  int a[0];

An empty aggregate has an address, although dereferencing that address is
pointless.  When passed as a parameter, an empty aggregate does not consume
a protocol register, nor does it consume a doubleword in the parameter save
area.  Passing an empty aggregate by reference passes an address just as
for any other aggregate.  Returning an empty aggregate uses GPR3 as a hidden
address of the return value location, just as for any other aggregate.

The patch modifies PPCTargetLowering::LowerFormalArguments_64SVR4 and
PPCTargetLowering::LowerCall_64SVR4 to properly skip empty aggregate
parameters passed by value.  The handling of return values and by-reference
parameters was already correct.

Built on powerpc64-unknown-linux-gnu and tested with no new regressions.
A test case is included to test proper handling of empty aggregate
parameters on both sides of the function call protocol.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167090 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 01:15:05 +00:00
Manman Ren
dfd0b9b460 X86 SSE: update rsqrtss and rcpss to use two source operands and
the first source operand is tied to the destination operand.

This is to accurately model the corresponding instructions where the upper
bits are unmodified.

rdar://12558838
PR14221


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 23:53:59 +00:00
Manman Ren
4c74a956b2 X86 MMX: optimize transfer from mmx to i32
We used to generate a store (movq) + a load.
Now we use movd.

rdar://9946746


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167056 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 22:15:38 +00:00
Akira Hatanaka
2f34d754d0 [mips] Allow tail-call optimization for vararg functions and functions which
use the caller's stack.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167048 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 20:16:31 +00:00
Adhemerval Zanella
c83b5dc625 PowerPC: Expand FSRQT for vector types
This patch expands FSQRT for floating point vector types when altivec is
used.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167034 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 18:29:42 +00:00
Quentin Colombet
9a419f656e Change ForceSizeOpt attribute into MinSize attribute
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167020 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 16:32:52 +00:00
Adhemerval Zanella
5f41fd685b PowerPC: More support for Altivec compare operations
This patch adds more support for vector type comparisons using altivec.
It adds correct support for v16i8, v8i16, v4i32, and v4f32 vector
types for comparison operators ==, !=, >, >=, <, and <=.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167015 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 13:50:19 +00:00
Hans Wennborg
04d7d13d30 Use TargetTransformInfo to control switch-to-lookup table transformation
When the switch-to-lookup tables transform landed in SimplifyCFG, it
was pointed out that this could be inappropriate for some targets.
Since there was no way at the time for the pass to know anything about
the target, an awkward reverse-transform was added in CodeGenPrepare
that turned lookup tables back into switches for some targets.

This patch uses the new TargetTransformInfo to determine if a
switch should be transformed, and removes
CodeGenPrepare::ConvertLoadToSwitch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167011 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 11:23:25 +00:00
Reed Kotler
c09856b535 Change mips16 delay slot jumps to non delay slot forms by default.
We will make them delay slot forms if there is something that can be
placed in the delay slot during a separate pass. Mips16 extended instructions
cannot be placed in delay slots.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166990 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 00:54:49 +00:00
Jakub Staszak
a24262a0f5 Re-commit r166971. I reverted it to quickly, when buildbots didn't have a chance
to test it with chapni's fix (-mattr=+avx).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166985 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 00:01:57 +00:00
Jakub Staszak
c1ed096b6b Revert r166971. It causes buildbot failure. To be investigated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166979 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 23:13:50 +00:00
NAKAMURA Takumi
926dd447f1 llvm/test/CodeGen/X86/vec_shuffle-30.ll: Try to unbreak builds - assuming +avx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166974 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 22:45:18 +00:00
Jakub Staszak
6d317824a5 Allow to fold vector load if there is more than one bitcast, so in the case:
%0 = load <8 x i16>* %dest
%1 = shufflevector <8 x i16> %0, <8 x i16> %in,
      <8 x i32> < i32 0, i32 1, i32 2, i32 3, i32 13, i32 undef, i32 14, i32 14>
store <8 x i16> %1, <8 x i16>* %dest

We get:
  vmovlpd (%eax), %xmm0, %xmm0

instead of:
  vmovaps (%eax), %xmm1
  vmovsd  %xmm1, %xmm0, %xmm0

No extra test-case is added. I just fixed the existing one
(also it uses FileCheck now).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166971 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 21:56:35 +00:00
Bill Schmidt
e6c56433de This patch solves a problem with passing varargs parameters under the PPC64
ELF ABI.

A varargs parameter consisting of a single-precision floating-point value,
or of a single-element aggregate containing a single-precision floating-point
value, must be passed in the low-order (rightmost) four bytes of the
doubleword stack slot reserved for that parameter.  If there are GPR protocol
registers remaining, the parameter must also be mirrored in the low-order
four bytes of the reserved GPR.

Prior to this patch, such parameters were being passed in the high-order
four bytes of the stack slot and the mirrored GPR.

The patch adds a new test case to verify the correct code generation.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166968 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 21:18:16 +00:00
Reed Kotler
576b1dbbef Implement patterns for extloadi8 and extloadi16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166960 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 19:39:04 +00:00
Ulrich Weigand
e669c930a6 In various places throughout the code generator, there were special
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.

Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166958 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 18:35:49 +00:00
Chad Rosier
53e216b304 Remove redundant test case from r166949, per Eli's suggestion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166953 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 18:18:26 +00:00
Chad Rosier
2fbc239e4f [ms-inline asm] Add support for the [] operator. Essentially, [expr1][expr2] is
equivalent to [expr1 + expr2].  See test cases for more examples.
rdar://12470392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166949 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 18:01:54 +00:00
Michael Liao
2a2263e744 Fix PR14204
- Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166947 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 17:57:12 +00:00
Jakob Stoklund Olesen
573303e62a Completely disallow partial copies in adjustCopiesBackFrom().
Partial copies can show up even when CoalescerPair.isPartial() returns
false. For example:

   %vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31

Such a partial-partial copy is not good enough for the transformation
adjustCopiesBackFrom() needs to do.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166944 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 17:51:52 +00:00
Ulrich Weigand
78dab643e0 Allow i32/i64 for 'f' constraint on PowerPC.
This fixes PR12757.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166943 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 17:49:34 +00:00
Reed Kotler
8834a20d5d Expand all atomic ops for mips16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166935 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 16:16:54 +00:00
Preston Gurd
a836563e32 This patch addresses a problem with the Post RA scheduler generating an
incorrect instruction sequence due to it not being aware that an
inline assembly instruction may reference memory.

This patch fixes the problem by causing the scheduler to always assume that any
inline assembly code instruction could access memory. This is necessary because
the internal representation of the inline instruction does not include
any information about memory accesses.
 
This should fix PR13504.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166929 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 15:01:23 +00:00
Bill Schmidt
01d013ec04 This patch adds alignment information for long double to the 64-bit PowerPC
ELF subtarget.

The existing logic is used as a fallback to avoid any changes to the Darwin
ABI.  PPC64 ELF now has two possible data layout strings: one for FreeBSD,
which requires 8-byte alignment, and a default string that requires
16-byte alignment.

I've added a test for PPC64 Linux to verify the 16-byte alignment.  If
somebody wants to add a separate test for FreeBSD, that would be great.

Note that there is a companion patch to update the alignment information
in Clang, which I am committing now as well.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166928 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 14:59:36 +00:00
Reed Kotler
3a9f4568fb Implement brind operator for mips16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166903 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-28 23:08:07 +00:00
Reed Kotler
f99998a2b0 This patch is for the implementation of mips16 complex pattern addr16.
Previously mips16 was sharing the pattern addr which is used for mips32
and mips64. This had a number of problems:
1) Storing and loading byte and halfword quantities for mips16 has particular
problems due to the primarily non mips16 nature of SP. When we must
load/store byte/halfword stack objects in a function, we must create a mips16
alias register for SP. This functionality is tested in stchar.ll.
2) We need to have an FP register under certain conditions (such as 
dynamically sized alloca). We use mips16 register S0 for this purpose.
In this case, we also use this register when accessing frame objects so this
issue also affects the complex pattern addr16. This functionality is
tested in alloca16.ll.

The Mips16InstrInfo.td has been updated to use addr16 instead of addr.

The complex pattern C++ function for addr has been copied to addr16 and
updated to reflect the above issues.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166897 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-28 06:02:37 +00:00
Jakob Stoklund Olesen
163f67f4d9 Never attempt to join an early-clobber def with a regular kill.
This fixes PR14194.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166880 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 17:41:27 +00:00
Quentin Colombet
80acd97266 [code size][ARM] Emit regular call instructions instead of the move, branch sequence
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166854 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 01:10:17 +00:00
Reed Kotler
7797e8f901 Implement MipsHi for mips16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166852 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 00:57:14 +00:00
Akira Hatanaka
21a9a98b77 [mips] Do not tail-call optimize vararg functions or functions with byval
arguments.

This is rather conservative and should be fixed later to be more aggressive.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166851 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 00:56:56 +00:00
Akira Hatanaka
4618e0b574 [mips] Make sure FuncArg doesn't advance when OrigArgIndex is the same as in the
previous iteration.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166850 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 00:44:39 +00:00
Jakob Stoklund Olesen
17f42e02a1 Revert r163298 "Optimize codegen for VSETLNi{8,16,32} operating on Q registers."
Keep the integer_insertelement test case, the new coalescer can handle
this kind of lane insertion without help from pseudo-instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166835 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 23:39:46 +00:00
Reed Kotler
25424154f4 implement mips16 tls global addr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166827 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 22:57:32 +00:00
Jakob Stoklund Olesen
cd275f5687 Add GPRPair Register class to ARM.
Some instructions in ARM require 2 even-odd paired GPRs. This
patch adds support for such register class.

Patch by Weiming Zhao!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166816 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 21:29:15 +00:00
Reed Kotler
a81be80b0e Implement carry for subtract/add for mips16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166755 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 04:46:26 +00:00
Reed Kotler
28a6214b59 implement large (>16 bit) constant loading.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166749 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 03:09:34 +00:00
Reed Kotler
30675b12fd fix test setgek.ll so that it will not give false "make check"
failure in some cases


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166747 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 01:29:42 +00:00
Reed Kotler
4c5a6dab17 implement mips16 patterns for select nodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 21:33:30 +00:00
Michael Liao
8d7cd1d8fc Add test for ATOM ISA SSSE3
- Remove SSE4.1 feature in other ATOM-based test cases



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166699 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 17:50:05 +00:00
Bill Schmidt
37900c5dcb This patch addresses a PPC64 ELF issue with passing parameters consisting of
structs having size 3, 5, 6, or 7.  Such a struct must be passed and received
as right-justified within its register or memory slot.  The problem is only
present for structs that are passed in registers.

Previously, as part of a patch handling all structs of size less than 8, I
added logic to rotate the incoming register so that the struct was left-
justified prior to storing the whole register.  This was incorrect because
the address of the parameter had already been adjusted earlier to point to
the right-adjusted value in the storage slot.  Essentially I had accidentally
accounted for the right-adjustment twice.

In this patch, I removed the incorrect logic and reorganized the code to make
the flow clearer.

The removal of the rotates changes the expected code generation, so test case
structsinregs.ll has been modified to reflect this.  I also added a new test
case, jaggedstructs.ll, to demonstrate that structs of these sizes can now
be properly received and passed.

I've built and tested the code on powerpc64-unknown-linux-gnu with no new
regressions.  I also ran the GCC compatibility test suite and verified that
earlier problems with these structs are now resolved, with no new regressions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166680 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 13:38:09 +00:00
Elena Demikhovsky
c0cd72204d The test avx-intel-ocl.ll failed. I can't reproduce on any of my machines. I added -mcpu flag, may be it will fix the problem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166669 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 08:38:42 +00:00
Chad Rosier
b3009eec47 [ms-inline asm] Add back-end test case for r166632. Make sure we emit the
correct .s output as well as get the correct encoding by the integrated
assembler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 23:10:28 +00:00
Evan Cheng
d258eb3ec5 Fix a miscompilation caused by a typo. When turning a adde with negative value
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.

rdar://12559385


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166613 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 19:53:01 +00:00
Elena Demikhovsky
3575222175 Special calling conventions for Intel OpenCL built-in library.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166566 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 14:46:16 +00:00
Michael Liao
1a5cc710ee Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166546 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 04:14:18 +00:00
Michael Liao
991b6a22b6 Add custom conversion from v2u32 to v2f32 in 32-bit mode
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
  v2f32 is added to improve the efficiency of the code generated.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166545 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 04:09:32 +00:00
Akira Hatanaka
2ef5bd3ba6 [mips] Make sure sret argument is returned in register V0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166539 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 02:10:54 +00:00
Rafael Espindola
847a9c6d77 Change x86_fastcallcc to require inreg markers. This allows it to known
the difference from "int x" (which should go in registers and
"struct y {int x;}" (which should not).

Clang will be updated in the next patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166536 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 01:58:48 +00:00
Michael Liao
0787274b70 Fix PR14161
- Check index being extracted to be constant 0 before simplfiying.
  Otherwise, retain the original sequence.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166504 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 21:40:15 +00:00
Michael Liao
d9d09600ee Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166486 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 17:34:00 +00:00
Reed Kotler
293e5d0e0a implement setXX patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166459 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 01:35:48 +00:00
Bill Wendling
8f47fc8f00 When a block ends in an indirect branch, add its successors to the machine basic block.
The CFG of the machine function needs to know that the targets of the indirect
branch are successors to the indirect branch.
<rdar://problem/12529625>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166448 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 23:30:04 +00:00
Akira Hatanaka
30580cea43 [mips] Use 64-bit registers to return an sret pointer if target ABI is N64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166344 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 22:11:40 +00:00
Akira Hatanaka
2b861be96e [mips] Add code to do tail call optimization.
Currently, it is enabled only if option "enable-mips-tail-calls" is given and
all of the callee's arguments are passed in registers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166342 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 21:47:33 +00:00
Shuxin Yang
970755e519 This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap()
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().

  The X86 backend is already able to handle debugtrap(). This patch is to:
  1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
  2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
     make the __builtin_debugtrap() "available" to all existing ports without the hassle of
     changing their code.
  3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
     __builtin_trap() will be expanded into the function call of the specified trap function.
    This behavior may need change in the future.

  The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166300 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 20:11:16 +00:00
Michael Liao
facace808c Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86
- If INSERT_VECTOR_ELT is supported (above SSE2, either by custom
  sequence of legal insn), transform BUILD_VECTOR into SHUFFLE +
  INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few
  (so far 1) elements being inserted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166288 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 17:15:18 +00:00
Stepan Dyatkovskiy
0d3c8d5d16 ARM:
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 08:23:06 +00:00
Sebastian Pop
83ba06afa8 Clear unknown mem ops when merging stack slots (pr14090)
When merging stack slots, if StackColoring::remapInstructions gets a
value back from GetUnderlyingObject that it does not know about or is
not itself a stack slot, clear the memory operand in case it aliases
the merged slot. This prevents the introduction of incorrect aliasing
information.

Author:    Matthew Curtis <mcurtis@codeaurora.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166216 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 19:53:48 +00:00
Nadav Rotem
1c5bf3f429 In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other.
rdar://12513091



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 18:06:48 +00:00
Ulrich Weigand
6c28a7eec8 This patch fixes failures in the SingleSource/Regression/C/uint64_to_float
test case on PowerPC caused by rounding errors when converting from a 64-bit
integer to a single-precision floating point. The reason for this are
double-rounding effects, since on PowerPC we have to convert to an
intermediate double-precision value first, which gets rounded to the
final single-precision result.

The patch fixes the problem by preparing the 64-bit integer so that the
first conversion step to double-precision will always be exact, and the
final rounding step will result in the correctly-rounded single-precision
result.  The generated code sequence is equivalent to what GCC would generate.

When -enable-unsafe-fp-math is in effect, that extra effort is omitted
and we accept possible rounding errors (just like GCC does as well).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166178 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 13:16:11 +00:00
Michael Liao
07edaf3801 Revert part of r166049 back and enable test case in r166125.
- Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid
  when '...' are all 'undef's.
- r166125 relies on this transformation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166155 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 23:45:54 +00:00
Michael Liao
6e0c2b36d6 Disable extract-concat test case temporarily
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166141 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 23:08:19 +00:00
Michael Liao
4031e9018b Revert r166049
- In general, it's unsafe for this transformation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166135 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 22:41:15 +00:00
Reed Kotler
95a2bb4cdf Add conditional branch instructions and their patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166134 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 22:29:54 +00:00
Michael Liao
13429e224c Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i
- If the extracted vector has the same type of all vectored being concatenated
  together, it should be simplified directly into v_i, where i is the index of
  the element being extracted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166125 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 20:48:33 +00:00
Anton Korobeynikov
e4b33a115b Fix fallout from RegInfo => FrameLowering refactoring on MSP430.
Patch by Job Noorman!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166108 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 17:37:11 +00:00
Michael Liao
281ae5abf5 Fix setjmp on models with non-Small code model nor non-Static relocation model
- MBB address is only valid as an immediate value in Small & Static
  code/relocation models. On other models, LEA is needed to load IP address of
  the restore MBB.
- A minor fix of MBB in MC lowering is added as well to enable target
  relocation flag being propagated into MC.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166084 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 02:22:27 +00:00
Jakob Stoklund Olesen
320db3f805 Avoid rematerializing a redef immediately after the old def.
PR14098 contains an example where we would rematerialize a MOV8ri
immediately after the original instruction:

  %vreg7:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7
  %vreg22:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7

Besides being pointless, it is also wrong since the original instruction
only redefines part of the register, and the value read by the new
instruction is wrong.

The problem was the LiveRangeEdit::allUsesAvailableAt() didn't
special-case OrigIdx == UseIdx and found the wrong SSA value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166068 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 22:51:58 +00:00
Jakob Stoklund Olesen
cdcdfd2cab Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit"
A fix for PR14098, including the test case is in the next commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 22:51:55 +00:00
Michael Liao
272ea03239 Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166049 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 19:38:35 +00:00
Rafael Espindola
6f7cccd2e2 Switch back to the old coalescer for now to fix the 32 bit bit
llvm+clang+compiler-rt bootstrap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166046 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 19:34:06 +00:00
Bill Schmidt
7a6cb15a92 This patch addresses PR13949.
For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8
bytes are to be passed in the low-order bits ("right-adjusted") of the
doubleword register or memory slot assigned to them.  A previous patch
addressed this for aggregates passed in registers.  However, small
aggregates passed in the overflow portion of the parameter save area are
still being passed left-adjusted.

The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the
caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on
the callee side.  The main fix on the callee side simply extends
existing logic for 1- and 2-byte objects to 1- through 7-byte objects,
and correcting a constant left over from 32-bit code.  There is also a
fix to a bogus calculation of the offset to the following argument in
the parameter save area.

On the caller side, again a constant left over from 32-bit code is
fixed.  Additionally, some code for 1, 2, and 4-byte objects is
duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only.  The
LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to
handle both ABIs, and I propose to separate this into two functions in a
future patch, at which time the duplication can be removed.

The patch adds a new test (structsinmem.ll) to demonstrate correct
passing of structures of all seven sizes.  Eight dummy parameters are
used to force these structures to be in the overflow portion of the
parameter save area.

As a side effect, this corrects the case when aggregates passed in
registers are saved into the first eight doublewords of the parameter
save area:  Previously they were stored left-justified, and now are
properly stored right-justified.  This requires changing the expected
output of existing test case structsinregs.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166022 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 13:30:53 +00:00
Stepan Dyatkovskiy
b52ba9f8a8 Issue:
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.

If we took AAPCS reference, we can found the next statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.

Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.

Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.

P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 07:16:47 +00:00
NAKAMURA Takumi
e26874556b Reapply r165661, Patch by Shuxin Yang <shuxin.llvm@gmail.com>.
Original message:

The attached is the fix to radar://11663049. The optimization can be outlined by following rules:

   (select (x != c), e, c) -> select (x != c), e, x),
   (select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.

 The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.

  While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.

  The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".

Original message since r165661:

My previous change has a bug: I negated the condition code of a CMOV, and go ahead creating a new CMOV using the *ORIGINAL* condition code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166017 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 06:28:34 +00:00
Rafael Espindola
0cead1cfef Fix the cpu name and add -verify-machineinstrs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166003 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 01:13:06 +00:00
Andrew Trick
27c28cef11 misched: Added handleMove support for updating all kill flags, not just for allocatable regs.
This is a medium term workaround until we have a more robust solution
in the form of a register liveness utility for postRA passes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166001 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 00:22:51 +00:00
Michael Liao
6c0e04c823 Add __builtin_setjmp/_longjmp supprt in X86 backend
- Besides used in SjLj exception handling, __builtin_setjmp/__longjmp is also
  used as a light-weight replacement of setjmp/longjmp which are used to
  implementation continuation, user-level threading, and etc. The support added
  in this patch ONLY addresses this usage and is NOT intended to support SjLj
  exception handling as zero-cost DWARF exception handling is used by default
  in X86.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 22:39:43 +00:00
Jim Grosbach
64ba635209 ARM: v1i64 and v2i64 VBSL intrinsic support.
rdar://12502028

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165981 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 21:23:40 +00:00
Andrew Trick
874c0a6ec7 Check output of the misched unit tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 20:33:14 +00:00
Rafael Espindola
32522201a8 Add a cpu to try to fix the atom builder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 19:25:43 +00:00
Rafael Espindola
3da6f0392a Add testcase for pr14088.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 19:00:10 +00:00
Andrew Trick
bb20b24224 misched tests: add a triple to speculatively fix windows builders.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165952 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 18:21:08 +00:00
Andrew Trick
1e94e98b0e misched: ILP scheduler for experimental heuristics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165950 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 18:02:27 +00:00
Silviu Baranga
bb1078ea13 Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165929 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 09:41:32 +00:00
Jakob Stoklund Olesen
d86296a4ae Drop <def,dead> flags when merging into an unused lane.
The new coalescer can merge a dead def into an unused lane of an
otherwise live vector register.

Clear the <dead> flag when that happens since the flag refers to the
full virtual register which is still live after the partial dead def.

This fixes PR14079.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165877 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 17:26:47 +00:00
Jakob Stoklund Olesen
af89690760 Allow for loops in LiveIntervals::pruneValue().
It is possible that the live range of the value being pruned loops back
into the kill MBB where the search started. When that happens, make sure
that the beginning of KillMBB is also pruned.

Instead of starting a DFS at KillMBB and skipping the root of the
search, start a DFS at each KillMBB successor, and allow the search to
loop back to KillMBB.

This fixes PR14078.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165872 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 16:15:31 +00:00
Benjamin Kramer
f8b65aaf39 X86: Fix accidentally swapped operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165871 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 12:50:19 +00:00
Benjamin Kramer
444dccecfc X86: Promote i8 cmov when both operands are coming from truncates of the same width.
X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this
level is often not a good idea because it's too late for many optimizations to
kick in. This solution doesn't add any extensions (truncs are free) and tries
to avoid introducing partial register stalls by filtering direct copyfromregs.

I'm seeing a ~10% speedup on reading a random .png file with libpng15 via
graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165868 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 10:39:49 +00:00
Manman Ren
e6c3cc8dc5 ARM: tail-call inside a function where part of a byval argument is on caller's
local frame causes problem.

For example:
void f(StructToPass s) {
  g(&s, sizeof(s));
}
will cause problem with tail-call since part of s is passed via registers and
saved in f's local frame. When g tries to access s, part of s may be corrupted
since f's local frame is popped out before the tail-call.

The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach, if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
tail-call.

rdar://12442472


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165853 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 23:39:43 +00:00
Jakob Stoklund Olesen
2bbb07d13c Fix buildbots: -misched=shuffle is only available in +Asserts builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165846 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 23:01:33 +00:00
Jim Grosbach
4346fa9437 ARM: Mark VSELECT as 'expand'.
The backend already pattern matches to form VBSL when it can. We may want to
teach it to use the vbsl intrinsics at some point to prevent machine licm from
mucking with this, but using the Expand is completely correct.

http://llvm.org/bugs/show_bug.cgi?id=13831
http://llvm.org/bugs/show_bug.cgi?id=13961

Patch by Peter Couperus <peter.couperus@st.com>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165845 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 22:59:21 +00:00
Jakob Stoklund Olesen
ad5e969ba9 Use a transposed algorithm for handleMove().
Completely update one interval at a time instead of collecting live
range fragments to be updated. This avoids building data structures,
except for a single SmallPtrSet of updated intervals.

Also share code between handleMove() and handleMoveIntoBundle().

Add support for moving dead defs across other live values in the
interval. The MI scheduler can do that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 21:31:57 +00:00
Jakob Stoklund Olesen
795f951c6d Fix coalescing with IMPLICIT_DEF values.
PHIElimination inserts IMPLICIT_DEF instructions to guarantee that all
PHI predecessors have a live-out value. These IMPLICIT_DEF values are
not considered to be real interference when coalescing virtual
registers:

  %vreg1 = IMPLICIT_DEF
  %vreg2 = MOV32r0

When joining %vreg1 and %vreg2, the IMPLICIT_DEF instruction and its
value number should simply be erased since the %vreg2 value number now
provides a live-out value for the PHI predecesor block.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165813 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 18:03:04 +00:00
NAKAMURA Takumi
20ce6e6562 llvm/test/CodeGen/PowerPC/2012-10-12-bitcast.ll: Try to fix failure on non-ppc hosts, to add -mattr=+altivec.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165803 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 16:01:08 +00:00
Ulrich Weigand
7bbb9c7b4a Fix big-endian codegen bug in DAGTypeLegalizer::ExpandRes_BITCAST
On PowerPC, a bitcast of <16 x i8> to i128 may run through a code
path in ExpandRes_BITCAST that attempts to do an intermediate
bitcast to a <4 x i32> vector, and then construct the Hi and Lo parts
of the resulting i128 by pairing up two of those i32 vector elements
each.  The code already recognizes that on a big-endian system, the
first two vector elements form the Hi part, and the final two vector
elements form the Lo part (vice-versa from the little-endian situation).

However, we also need to take endianness into account when forming each
of those separate pairs:  on a big-endian system, vector element 0 is
the *high* part of the pair making up the Hi part of the result, and
vector element 1 is the low part of the pair.  The code currently always
uses vector element 0 as the low part and vector element 1 as the high
part, as is appropriate for little-endian platforms only.

This patch fixes this by swapping the vector elements as they are
paired up as appropriate.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165802 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 15:42:58 +00:00
Reed Kotler
7d90d4d709 Div, Rem int/unsigned int
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165783 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 02:01:09 +00:00
Evan Cheng
d36696c4e0 Legalizer optimize a pair of div / mod to a call to divrem libcall if they are
not legal. However, it should use a div instruction + mul + sub if divide is
legal. The rem legalization code was missing a check and incorrectly uses a
divrem libcall even when div is legal.

rdar://12481395


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165778 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 01:15:47 +00:00