Commit Graph

8210 Commits

Author SHA1 Message Date
David Majnemer
526f3ed7da Remove a recently redundant transform from X86ISelLowering.
X86ISelLowering has support to treat:
(icmp ne (and (xor %flags, -1), (shl 1, flag)), 0)

as if it were actually:
(icmp eq (and %flags, (shl 1, flag)), 0)

However, r179386 has code at the InstCombine level to handle this.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181145 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-05 02:00:10 +00:00
Tim Northover
79c1c092df AArch64: support literal pool access in large memory model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181120 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-04 16:54:07 +00:00
Tim Northover
cd1b09b25b AArch64: support large code model for jump-tables
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181119 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-04 16:54:00 +00:00
Tim Northover
b2efdde06c AArch64: implement support for blockaddress in large code model
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181118 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-04 16:53:53 +00:00
Tim Northover
45db92038b AArch64: implement large code model access to global variables.
The MOVZ/MOVK instruction sequence may not be the most efficient (a
literal-pool load could be better) but adding that would require
reinstating the ConstantIslands pass.

For now the sequence is correct, and that's enough. Beware, as of
commit GNU ld does not appear to support the relocations needed for
this. Its primary purpose (for now) will be to support JITed code,
since in that case there is no guarantee of where your code will end
up in memory relative to external symbols it references.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181117 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-04 16:53:46 +00:00
Reed Kotler
2bb955a693 Remove some uneeded pseudos in the presence of the naked function attribute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181072 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-03 23:17:24 +00:00
Akira Hatanaka
a2b2200ff8 [mips] Split the DSP control register and define one register for each field of
its fields.

This removes false dependencies between DSP instructions which access different
fields of the the control register. Implicit register operands are added to
instructions RDDSP and WRDSP after instruction selection, depending on the
value of the mask operand.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181041 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-03 18:37:49 +00:00
Tom Stellard
19301d5d12 R600: Expand vector or, shl, srl, and xor nodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181035 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-03 17:21:31 +00:00
Tom Stellard
83f0a5a5e8 R600: Add pattern for SHA-256 Ma function
This can be optimized using the BFI_INT instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181033 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-03 17:21:20 +00:00
Akira Hatanaka
99ad6ac65e [mips] Handle reading, writing or copying of ccond field of DSP control
register.

- Define pseudo instructions which store or load ccond field of the DSP
  control register.
- Emit the pseudos in MipsSEInstrInfo::storeRegToStack and loadRegFromStack.
- Expand the pseudos before callee-scan save.
- Emit instructions RDDSP or WRDSP to copy between ccond field and GPRs. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180969 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-02 23:07:05 +00:00
Vincent Lejeune
5ed88013e8 R600: Signed literals are 64bits wide
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180960 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-02 21:53:03 +00:00
Vincent Lejeune
152ebee8f3 R600: If previous bundle is dot4, PV valid chan is always X
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180959 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-02 21:52:55 +00:00
Vincent Lejeune
e117646f1f R600: Add a test to check that use_kill is emitted
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180958 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-02 21:52:46 +00:00
Vincent Lejeune
92f24d403f R600: Prettier asmPrint of Alu
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180956 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-02 21:52:30 +00:00
Pranav Bhandarkar
02d937d864 Hexagon - Add peephole optimizations for zero extends.
* lib/Target/Hexagon/HexagonInstrInfo.td: Add patterns to combine a
	sequence of a pair of i32->i64 extensions followed by a "bitwise or"
	into COMBINE_rr.
	* lib/Target/Hexagon/HexagonPeephole.cpp: Copy propagate Rx in the
	instruction Rp = COMBINE_Ir_V4(0, Rx) to the uses of Rp:subreg_loreg.
	* test/CodeGen/Hexagon/union-1.ll: New test.
	* test/CodeGen/Hexagon/combine_ir.ll: Fix test.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180946 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-02 20:22:51 +00:00
Manman Ren
436849be6a TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180935 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-02 18:11:35 +00:00
Michael Liao
149b2a8b92 Rewrite X86 codegen regression test with FileCheck
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180910 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-02 06:20:42 +00:00
Michael Liao
2bab197bc8 Avoid generating tempfile(s) never used
As DejaGNU is deprecated, it seems pipe-jam issue doesn't exist any more.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180892 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01 22:46:50 +00:00
Bill Wendling
f18a32eb12 Revert r180737. The companion patch was reverted, and this is not relevant right now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180889 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01 22:32:08 +00:00
Nadav Rotem
b2ed5fac06 Optimize away nop CONCAT_VECTOR nodes.
Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector.

rdar://13402653
PR15866



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180871 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01 19:18:51 +00:00
Rafael Espindola
dc0981d3e0 Put VMOVPQIto64rr in the VRPDI class.
Patch by Joshua Magee.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180842 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01 13:00:16 +00:00
Michael Liao
38d32da0f1 Forget remove the tempfile argument
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180838 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01 05:45:57 +00:00
Michael Liao
9ed0a1b065 More rewrites of x86 codegen regression tests with FileCheck
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180837 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01 05:34:30 +00:00
Akira Hatanaka
c147c1b994 [mips] Fix handling of instructions which copy to/from accumulator registers.
Expand copy instructions between two accumulator registers before callee-saved
scan is done. Handle copies between integer GPR and hi/lo registers in
MipsSEInstrInfo::copyPhysReg. Delete pseudo-copy instructions that are not
needed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180827 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 23:22:09 +00:00
Stephen Lin
3484da9479 Only pass 'returned' to target-specific lowering code when the value of entire register is guaranteed to be preserved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180825 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 22:49:28 +00:00
Akira Hatanaka
cd6c57917d [mips] Instruction selection patterns for DSP-ASE vector select and compare
instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180820 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 22:37:26 +00:00
Adrian Prantl
86a87d9ba1 Temporarily revert "Change the informal convention of DBG_VALUE so that we can express a"
because it breaks some buildbots.

This reverts commit 180816.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180819 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 22:35:14 +00:00
Adrian Prantl
a2b56692c8 Change the informal convention of DBG_VALUE so that we can express a
register-indirect address with an offset of 0.
It used to be that a DBG_VALUE is a register-indirect value if the offset
(operand 1) is nonzero. The new convention is that a DBG_VALUE is
register-indirect if the first operand is a register and the second
operand is an immediate. For plain registers use the combination reg, reg.

rdar://problem/13658587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180816 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 22:16:46 +00:00
Hal Finkel
db31bd31d6 LocalStackSlotAllocation improvements
First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created.

Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway.

Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll

Jim has okayed this off-list.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180799 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 20:04:37 +00:00
Manman Ren
2dc50d3067 TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180796 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 17:52:57 +00:00
Vincent Lejeune
1872730c5a R600: fix loop-address.ll test
Texture cache is now used when shader type is not specified

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180785 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 12:47:56 +00:00
Michael Liao
8838410a75 Rewrite X86 codegen regression test with FileCheck
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180776 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 07:51:08 +00:00
Vincent Lejeune
2c836f84db R600: use native for alu
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180761 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 00:14:38 +00:00
Vincent Lejeune
631591e6f3 R600: Add FetchInst bit to instruction defs to denote vertex/tex instructions
v2[Vincent Lejeune]: Split FetchInst into usesTextureCache/usesVertexCache

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180755 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 00:13:39 +00:00
Michael Liao
7d8ea50b93 Rewrite test in FileCheck instead of grep in X86 codegen
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180754 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30 00:13:38 +00:00
Manman Ren
ff6222ef2f TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180745 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-29 22:58:55 +00:00
Bill Wendling
ad96c64355 Duplicate a testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180744 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-29 22:42:47 +00:00
Michael Liao
26ebdd1e7c Rewrite some tests with FileCHeck in X86 codegen
- Revise previous patches of the same purpose by fixing
  *) grep <PA> | not grep <PB> semantically is not the same as
     CHECK: <PA>{{^<PB>.*$}} as the former will check all occurrences of <PA>
     while the later only check the first match. As the result, CHECK needs
     putting in all place where <PA> occurs.
  *) grep <PA> | count <N> needs a final CHECK-NOT of the same pattern.
     (As 'CHECK-<N>' is proposed for discussion, converting 'grep | count <N>'
      where N > 1 is postponed.)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180742 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-29 22:41:29 +00:00
Tom Stellard
d8b2da1136 R600: Use correct CF_END instruction on Northern Island GPUs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180735 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-29 22:23:58 +00:00
Tom Stellard
015f586bc9 R600: Fix encoding of CF_END_{EG, R600} instructions
The EOP bit was not being encoded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180734 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-29 22:23:54 +00:00
Rafael Espindola
5b0ce3570c Make all darwin ppc stubs local.
This fixes pr15763.
Patch by David Fang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180657 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-27 00:43:16 +00:00
Benjamin Kramer
fdfdd4cf82 Make CHECK lines a bit less strict so they also match code generated for win64.
Hopefully brings the windows buildbots back to life.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180630 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-26 21:04:21 +00:00
Tom Stellard
99d8179a9b R600: Initialize AMDGPUMachineFunction::ShaderType to ShaderType::COMPUTE
We need to intialize this to something and since clang does not set
the shader type attribute and clang is used only for compute shaders,
initializing it to COMPUTE seems like the best choice.

Reviewed-by: Christian König <christian.koenig@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180620 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-26 18:32:24 +00:00
Benjamin Kramer
4e8590c45d ARM/NEON: Pattern match vector integer abs to vabs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180604 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-26 15:00:57 +00:00
Benjamin Kramer
753981784f X86: Now that we have a canonical form for vector integer abs, match it into pabs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180600 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-26 12:05:21 +00:00
Benjamin Kramer
6242fda42a DAGCombiner: Canonicalize vector integer abs in the same way we do it for scalars.
This already helps SSE2 x86 a lot because it lacks an efficient way to
represent a vector select. The long term goal is to enable the backend to match
a canonicalized pattern into a single instruction (e.g. vabs or pabs).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180597 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-26 09:19:19 +00:00
Preston Gurd
d6ac8e9a03 This patch adds the X86FixupLEAs pass, which will reduce instruction
latency for certain models of the Intel Atom family, by converting
instructions into their equivalent LEA instructions, when it is both
useful and possible to do so.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180573 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-25 20:29:37 +00:00
Chad Rosier
5df2e16ba1 [inline asm] Add a test case for r180226. The specific issue is that the inline
assembly is requesting a 64-bit register, which is invalid for i386.
rdar://13731657


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180445 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-25 17:10:21 +00:00
Silviu Baranga
02066838b5 Fix constant folding for one lane vector types. Constant folding one lane vector types not returns a vector instead of a scalar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180254 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-25 09:32:33 +00:00
Tom Stellard
87cba4a4c1 R600: Use SHT_PROGBITS for the .AMDGPU.config section
The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
will not parse sections that are marked SHT_NULL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180230 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-24 23:56:14 +00:00
Andrew Trick
e38afe1e33 MI Sched: eliminate local vreg copies.
For now, we just reschedule instructions that use the copied vregs and
let regalloc elliminate it. I would really like to eliminate the
copies on-the-fly during scheduling, but we need a complete
implementation of repairIntervalsInRange() first.

The general strategy is for the register coalescer to eliminate as
many global copies as possible and shrink live ranges to be
extended-basic-block local. The coalescer should not have to worry
about resolving local copies (e.g. it shouldn't attemp to reorder
instructions). The scheduler is a much better place to deal with local
interference. The coalescer side of this equation needs work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180193 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-24 15:54:43 +00:00
Jyotsna Verma
42ba77db53 Hexagon: Use multiclass for combine and STri[bhwd]_shl_V4 instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180145 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23 21:17:40 +00:00
Stephen Lin
81fef0267b Add more tests for r179925 to verify correct handling of signext/zeroext; strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180138 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23 19:42:25 +00:00
Jyotsna Verma
47089c91ae Hexagon: Remove assembler mapped instruction definitions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180133 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23 19:15:55 +00:00
Vincent Lejeune
2a74639bc7 R600: Use .AMDGPU.config section to emit stacksize
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180124 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23 17:34:12 +00:00
Vincent Lejeune
7a28d8afa7 R600: Add CF_END
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180123 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23 17:34:00 +00:00
Jyotsna Verma
3d7b39e7d4 Hexagon: Remove duplicate instructions to handle global/immediate values
for absolute/absolute-set addressing modes.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180120 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23 17:11:46 +00:00
Rafael Espindola
2848ff01e2 Move test from grep to FileCheck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180092 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23 12:03:27 +00:00
Akira Hatanaka
d597263b94 [mips] In performDSPShiftCombine, check that all elements in the vector are
shifted by the same amount and the shift amount is smaller than the element
size.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180039 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 19:58:23 +00:00
Stephen Lin
2fdbbe307d Extra paranoid test for r179925 (verify that tail calls are not generated to 'this'-returning constructors of objects with different 'this' pointers than the caller)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180032 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 17:23:49 +00:00
Stepan Dyatkovskiy
78e3c90419 Fix for 5.5 Parameter Passing --> Stage C:
-- C.4 and C.5 statements, when NSAA is not equal to SP.
 -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a
    variadic procedure.

Before this patch "NSAA != 0" means "don't use GPRs anymore ". But there are
some exceptions in AAPCS.
1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated
   CPRCs would be sent to stack, while non CPRCs may be still allocated in GRPs.
2. Check that for VA functions all params uses GPRs and then stack.
   No exceptions, no CPRCs here.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180011 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 13:06:52 +00:00
Arnaud A. de Grandmaison
d9e70873f3 Cleanup: test source files do not need to be executable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180003 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 08:02:43 +00:00
David Blaikie
c462db6d66 Revert "Revert "PR14606: debug info imported_module support""
This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll

I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even
though the debug info was clearly invalid on all of them, but this ought to fix
it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179996 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 06:12:31 +00:00
Jim Grosbach
0cb1019e9c Legalize vector truncates by parts rather than just splitting.
Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result time, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.

Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again.  For example on ARM,
To perform a "%res = v8i8 trunc v8i32 %in" we transform to:
  %inlo = v4i32 extract_subvector %in, 0
  %inhi = v4i32 extract_subvector %in, 4
  %lo16 = v4i16 trunc v4i32 %inlo
  %hi16 = v4i16 trunc v4i32 %inhi
  %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
  %res = v8i8 trunc v8i16 %in16

This allows instruction selection to generate three VMOVN instructions
instead of a sequences of moves, stores and loads.

Update the ARMTargetTransformInfo to take this improved legalization
into account.

Consider the simplified IR:

define <16 x i8> @test1(<16 x i32>* %ap) {
  %a = load <16 x i32>* %ap
  %tmp = trunc <16 x i32> %a to <16 x i8>
  ret <16 x i8> %tmp
}

define <8 x i8> @test2(<8 x i32>* %ap) {
  %a = load <8 x i32>* %ap
  %tmp = trunc <8 x i32> %a to <8 x i8>
  ret <8 x i8> %tmp
}

Previously, we would generate the truly hideous:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #20
	bic	sp, sp, #7
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d24, d25}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	vld1.64	{d18, d19}, [r2:128]
	add	r1, r0, #16
	vmovn.i32	d22, q8
	vld1.64	{d16, d17}, [r1:128]
	vmovn.i32	d20, q9
	vmovn.i32	d18, q12
	vmov.u16	r0, d22[3]
	strb	r0, [sp, #15]
	vmov.u16	r0, d22[2]
	strb	r0, [sp, #14]
	vmov.u16	r0, d22[1]
	strb	r0, [sp, #13]
	vmov.u16	r0, d22[0]
	vmovn.i32	d16, q8
	strb	r0, [sp, #12]
	vmov.u16	r0, d20[3]
	strb	r0, [sp, #11]
	vmov.u16	r0, d20[2]
	strb	r0, [sp, #10]
	vmov.u16	r0, d20[1]
	strb	r0, [sp, #9]
	vmov.u16	r0, d20[0]
	strb	r0, [sp, #8]
	vmov.u16	r0, d18[3]
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	vldmia	sp, {d16, d17}
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	mov	sp, r7
	pop	{r7}
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #12
	bic	sp, sp, #7
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d20, d21}, [r0:128]
	vmovn.i32	d18, q8
	vmov.u16	r0, d18[3]
	vmovn.i32	d16, q10
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	ldm	sp, {r0, r1}
	mov	sp, r7
	pop	{r7}
	bx	lr

Now, however, we generate the much more straightforward:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d20, d21}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	add	r1, r0, #16
	vld1.64	{d18, d19}, [r2:128]
	vld1.64	{d22, d23}, [r1:128]
	vmovn.i32	d17, q8
	vmovn.i32	d16, q9
	vmovn.i32	d18, q10
	vmovn.i32	d19, q11
	vmovn.i16	d17, q8
	vmovn.i16	d16, q9
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d18, d19}, [r0:128]
	vmovn.i32	d16, q8
	vmovn.i32	d17, q9
	vmovn.i16	d16, q8
	vmov	r0, r1, d16
	bx	lr

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179989 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 23:47:41 +00:00
Jim Grosbach
5eabdf2601 ARM: Split out cost model vcvt testcases.
They had a separate RUN line already, so may as well be in a separate file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179988 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 23:47:37 +00:00
Jakob Stoklund Olesen
ddb14ce76c Passing arguments to varags functions under the SPARC v9 ABI.
Arguments after the fixed arguments never use the floating point
registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179987 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 21:36:49 +00:00
Jakob Stoklund Olesen
2c6b5a8d33 Fix the SETHIimm pattern for 64-bit code.
Don't ignore the high 32 bits of the immediate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179985 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 21:18:03 +00:00
Tim Northover
c3a93013bc ARM: fix part of test which actually needed an asserts build
This should fix a buildbot failure that occurred after r179977.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179978 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 12:20:19 +00:00
Tim Northover
4cc1407b84 ARM: Use ldrd/strd to spill 64-bit pairs when available.
This allows common sp-offsets to be part of the instruction and is
probably faster on modern CPUs too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179977 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 11:57:07 +00:00
Bill Wendling
d868af77df Remove tbaa metadata.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179970 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 01:38:25 +00:00
Jakob Stoklund Olesen
da8768b2dd Compile varargs functions for SPARCv9.
With a little help from the frontend, it looks like the standard va_*
intrinsics can do the job.

Also clean up an old bitcast hack in LowerVAARG that dealt with
unaligned double loads. Load SDNodes can specify an alignment now.

Still missing: Calling varargs functions with float arguments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179961 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 22:49:16 +00:00
Tim Northover
335dd0d1a6 ARM: don't add FrameIndex offset for LDMIA (has no immediate)
Previously, when spilling 64-bit paired registers, an LDMIA with both
a FrameIndex and an offset was produced. This kind of instruction
shouldn't exist, and the extra operand was being confused with the
predicate, causing aborts later on.

This removes the invalid 0-offset from the instruction being
produced.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179956 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 19:31:00 +00:00
Stephen Lin
514d514bb2 Minor renaming of tests (for consistency with an in-development patch)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179954 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 16:21:26 +00:00
Benjamin Kramer
d9f82b73eb Don't litter .s files in test directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179937 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 10:43:40 +00:00
Stephen Lin
456ca048af Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179925 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 05:14:40 +00:00
Stephen Lin
5c34e08b9f Allow tail call opportunity detection through nested and/or multiple iterations of extractelement/insertelement indirection
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179924 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 04:27:51 +00:00
Akira Hatanaka
97a62bf2a4 [mips] Instruction selection patterns for DSP-ASE vector shifts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179906 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 23:21:32 +00:00
Hal Finkel
87c1e42be7 Fix PPC optimizeCompareInstr swapped-sub argument handling
When matching a compare with a subtract where the arguments of the compare are
swapped w.r.t. the arguments of the subtract, we need to negate the predicates
(or CR bit indices) of the users. This, however, is not the same as inverting
the predicate (negating LT -> GT, but inverting LT -> GE, for example). The ARM
backend seems to do this correctly, but when I adapted the code for the PPC
backend, I introduced an error in this logic.

Comparison optimization is now enabled again by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179899 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 22:08:38 +00:00
Anton Korobeynikov
8caffc1e75 Do not mangle in MS-way the globals with magic \001 in the name.
Based on the patch by David Nadlinger!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179889 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 21:20:56 +00:00
Bill Wendling
a317eb8229 Make test slightly more readable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179888 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 21:14:59 +00:00
Bill Wendling
bb418038e1 Add a testcase to make sure we generate the proper compact unwind section for a function that cannot produce a compact unwind encoding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179887 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 21:07:11 +00:00
Eric Christopher
41201ed06f Revert "PR14606: debug info imported_module support"
This reverts commit r179836 as it seems to have caused test failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179840 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 07:47:16 +00:00
David Blaikie
bcb81360a2 PR14606: debug info imported_module support
Adding another CU-wide list, in this case of imported_modules (since they
should be relatively rare, it seemed better to add a list where each element
had a "context" value, rather than add a (usually empty) list to every scope).
This takes care of DW_TAG_imported_module, but to fully address PR14606 we'll
need to expand this to cover DW_TAG_imported_declaration too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179836 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 06:57:04 +00:00
Tom Stellard
48b809e6e5 R600: Add pattern for the BFI_INT instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179830 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 02:11:06 +00:00
Tom Stellard
3abd23bac5 R600: Reorganize lit tests and document how they should be organized
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179828 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 02:10:53 +00:00
Hal Finkel
4029c3feed Disable PPC comparison optimization by default
This seems to cause a stage-2 LLVM compile failure (by crashing TableGen); do
I'm disabling this for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179807 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 22:54:25 +00:00
Hal Finkel
860c08cad5 Implement optimizeCompareInstr for PPC
Many PPC instructions have a so-called 'record form' which stores to a specific
condition register the result of comparing the result of the instruction with
zero (always as a signed comparison). For integer operations on PPC64, this is
always a 64-bit comparison.

This implementation is derived from the implementation in the ARM backend;
there are some differences because PPC condition registers are allocatable
virtual registers (although the record forms always use a specific one), and we
look for a matching subtraction instruction after the compare (but before the
first use) in addition to before it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179802 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 22:15:08 +00:00
Benjamin Kramer
fcba22decb X86: Add an SSE2 lowering for 64 bit compares when pcmpgtq (SSE4.2) isn't available.
This pattern started popping up in vectorized min/max reductions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179797 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 21:37:45 +00:00
Derek Schuff
2061dcf0e4 Allow misaligned stores in x86 fast-isel.
In X86FastISel::X86SelectStore(), improperly aligned stores are rejected and
handled by the DAG-based ISel.  However, X86FastISel::X86SelectLoad() makes
no such requirement.  There doesn't appear to be an x86 architectural
correctness issue with allowing potentially unaligned store instructions.
This patch removes this restriction.

Patch by Jim Stichnot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179774 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 17:41:08 +00:00
Hao Liu
d050e96133 Fix for PR14824, An ARM Load/Store Optimization bug
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179751 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 09:11:08 +00:00
Eli Bendersky
50125482d3 This patch teaches x86 fast-isel to generate the native div/idiv instructions
for the sdiv/srem/udiv/urem bitcode instructions.  This is done for the i8,
i16, and i32 types, as well as i64 for the x86_64 target.

Patch by Jim Stichnoth



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179715 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-17 20:10:13 +00:00
Vincent Lejeune
26ebd7aafc R600: Make Export Instruction not duplicable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179686 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-17 15:17:39 +00:00
Richard Osborne
13a16284a5 [XCore] Extend test to check positve offsets are folded into addresses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179621 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16 20:05:52 +00:00
Richard Osborne
db51e31527 [XCore] Give test more generic name.
I intend to extend the test with more offset folding checks


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179620 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16 19:56:55 +00:00
Richard Osborne
b509b65240 [XCore] Convert a couple of tests to FileCheck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179619 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16 19:41:19 +00:00
Logan Chien
532854d7ab Implement ARM unwind opcode assembler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179591 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16 12:02:21 +00:00
Jakob Stoklund Olesen
ad36608499 Add 64-bit multiply and divide instructions for SPARC v9.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179582 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16 02:57:02 +00:00
Tom Stellard
9a256300f8 R600/SI: Emit config values in register value pairs.
Instead of emitting config values in a predefined order, the code
emitter will now emit a 32-bit register index followed by the 32-bit
config value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179546 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 17:51:35 +00:00
Tom Stellard
bf1efe6421 R600/SI: Emit configuration value in the .AMDGPU.config ELF section
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179545 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 17:51:30 +00:00
Tom Stellard
3a63bf27c5 R600: Emit ELF formatted code rather than raw ISA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179544 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 17:51:21 +00:00