Commit Graph

16514 Commits

Author SHA1 Message Date
Chandler Carruth
8bc3434e68 Teach the 'opt' tool about '-Os' and '-Oz', corresponding to the Clang
options, to enable easier testing of the innards of LLVM that are
enabled by such optimization strategies.

Note that this doesn't provide the (much needed) function attribute
support for -Oz (as opposed to -Os), but still seems like a positive
step to better test the logic that Clang currently relies on.

Patch by Patrik Hägglund.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156913 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-16 08:32:49 +00:00
Evan Cheng
6100366c2f Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156896 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-16 01:54:27 +00:00
Jakob Stoklund Olesen
83b3a29334 Enable sub-sub-register copy coalescing.
It is now possible to coalesce weird skewed sub-register copies by
picking a super-register class larger than both original registers. The
included test case produces code like this:

  vld2.32 {d16, d17, d18, d19}, [r0]!
  vst2.32 {d18, d19, d20, d21}, [r0]

We still perform interference checking as if it were a normal full copy
join, so this is still quite conservative. In particular, the f1 and f2
functions in the included test case still have remaining copies because
of false interference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156878 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-15 23:31:35 +00:00
Kevin Enderby
e545c4e45d Add a test case for r156840, a fix to llvm-objdump when disassembling using
-macho to disassemble the last symbol to the end of the section.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156850 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-15 20:20:50 +00:00
Sirish Pande
0031af6615 Enable all Hexagon tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-15 16:13:12 +00:00
David Majnemer
ac78266674 Teach SimplifyLibCalls about stpcpy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156815 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-15 11:46:21 +00:00
Jakob Stoklund Olesen
4d10829e12 Fix PR12821.
RAFast must add an <imp-def> operand when it is rewriting a sub-register
def that isn't a read-modify-write.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156777 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-14 21:10:25 +00:00
Chad Rosier
3a884f5c17 Move the capture analysis from MemoryDependencyAnalysis to a more general place
so that it can be reused in MemCpyOptimizer.  This analysis is needed to remove
an unnecessary memcpy when returning a struct into a local variable.
rdar://11341081
PR12686


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156776 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-14 20:35:04 +00:00
Brendon Cahoon
5262abb268 Revert 156634 upon request until code improvement changes are made.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156775 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-14 19:35:42 +00:00
Dan Gohman
a6063c6e29 Rename @llvm.debugger to @llvm.debugtrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156774 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-14 18:58:10 +00:00
Rafael Espindola
2ec304c0bf Add support for the .rept directive. Patch by Vladmir Sorokin. I added support
for nesting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156714 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 16:31:10 +00:00
Benjamin Kramer
6a80f9da8c ELF: Add support for the asm .version directive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156712 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 14:30:47 +00:00
Benjamin Kramer
bc3b27ccd9 AsmParser: Add support for the .purgem directive.
Based on a patch by Team PaX.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156709 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 11:21:46 +00:00
Benjamin Kramer
e14a3c5084 AsmParser: ignore the .extern directive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156707 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 11:18:59 +00:00
Benjamin Kramer
dec06ef431 AsmParser: Add support for .ifc and .ifnc directives.
Based on a patch from PaX Team.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156706 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 11:18:51 +00:00
Benjamin Kramer
a3dd0eb93c AsmParser: Add support for .ifb and .ifnb directives.
Based on a patch from PaX Team.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156705 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 11:18:42 +00:00
Stepan Dyatkovskiy
1cce5bf8ef Recommited r156374 with critical fixes in BitcodeReader/Writer:
Ordinary patch for PR1255.
Added new case-ranges orientated methods for adding/removing cases in SwitchInst. After this patch cases will internally representated as ConstantArray-s instead of ConstantInt, externally cases wrapped within the ConstantRangesSet object.
Old methods of SwitchInst are also works well, but marked as deprecated. So on this stage we have no side effects except that I added support for case ranges in BitcodeReader/Writer, of course test for Bitcode is also added. Old "switch" format is also supported.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156704 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 10:48:17 +00:00
Jay Foad
b7454fd9df Teach Function::hasAddressTaken that BlockAddress doesn't really take
the address of a function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156703 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 08:30:16 +00:00
Sirish Pande
b33857040f Support for Hexagon feature, New Value Jump.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 05:10:30 +00:00
Akira Hatanaka
1da1cdf504 Fix test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156697 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 03:25:16 +00:00
Akira Hatanaka
4147e4d054 Make the following changes in MipsAsmPrinter.cpp:
- Remove code which lowers pseudo SETGP01.
- Fix LowerSETGP01. The first two of the three instructions that are emitted to
  initialize the global pointer register now use register $2.
- Stop emitting .cpload directive.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156689 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 00:48:43 +00:00
Akira Hatanaka
27ba61df9f Insert instructions to the entry basic block which initializes the global
pointer register. 


This is the first of the series of patches which clean up the way global pointer
register is used. The patches will make the following improvements:

- Make $gp an allocatable temporary register rather than reserving it.
- Use a virtual register as the global pointer register and let the register
  allocator decide which register to assign to it or whether spill/reloads are
  needed.
- Make sure $gp is valid at the entry of a called function, which is necessary
  for functions using lazy binding.
- Remove the need for emitting .cprestore and .cpload directives.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156671 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 00:17:17 +00:00
Akira Hatanaka
3011a33a63 Do not replace operands of pseudo instructions with register $zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156663 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 23:22:18 +00:00
Akira Hatanaka
f5b1e2802f Use regular expression to match register names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156656 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 23:00:40 +00:00
Chad Rosier
226ddf5278 [fast-isel] Add support for selecting @llvm.trap().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156646 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 21:33:49 +00:00
Brendon Cahoon
6d532d8860 Hexagon constant extender support.
Patch by Jyotsna Verma.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156634 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 19:56:59 +00:00
Chad Rosier
2b3b335f2d [fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156632 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 19:40:25 +00:00
Chad Rosier
2a2e9d54e9 [fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg
retval.  Hoists check before emitting the call to avoid unnecessary work.
rdar://11430407
PR12796


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156628 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 18:51:55 +00:00
Nuno Lopes
12c807873a objectsize: add a few more tests and fix a bug
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156625 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 18:25:29 +00:00
Hans Wennborg
12447575dc Fix test/CodeGen/X86/tls-pie.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156612 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 10:19:54 +00:00
Hans Wennborg
228756c744 Implement initial-exec TLS model for 32-bit PIC x86
This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong
code here (see the update to test/CodeGen/X86/tls-pie.ll).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156611 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 10:11:01 +00:00
Silviu Baranga
169e9ba2b2 Added the missing bit definition for the 4th bit of the STR (post reg) instruction. It is now set to 0. The patch also sets the unpredictable mask for SEL and SXTB-type instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156609 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 09:28:27 +00:00
Silviu Baranga
ca3cd419a5 Fixed the LLVM ARM v7 assembler and instruction printer for 8-bit immediate offset addressing. The assembler and instruction printer were not properly handeling the #-0 immediate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156608 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 09:10:54 +00:00
Eli Friedman
5b6dfee28e Fix a minor logic mistake transforming compares in instcombine. PR12514.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156600 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 01:32:59 +00:00
Manman Ren
247c5ab07c ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156599 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 01:30:47 +00:00
Dan Gohman
d4347e1af9 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156593 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 00:19:32 +00:00
Nuno Lopes
9d236f909c objectsize: add support for GEPs with non-constant indexes
add an additional parameter to InstCombiner::EmitGEPOffset() to force it to *not* emit operations with NUW flag

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156585 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 23:17:35 +00:00
Eric Christopher
05b7a50210 Add support for the 'X' inline asm operand modifier.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156577 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 21:48:22 +00:00
Sirish Pande
7517bbc91a Hexagon V5 FP Support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156568 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 20:20:25 +00:00
Dan Gohman
b401e3bd16 Teach DeadStoreElimination to eliminate exit-block stores with phi addresses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156558 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 18:57:38 +00:00
Manman Ren
fe65d98dad Revert: 156550 "ARM: peephole optimization to remove cmp instruction"
This commit broke an external linux bot and gave a compile-time warning.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156556 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 18:49:43 +00:00
Nuno Lopes
e54874471c teach DSE and isInstructionTriviallyDead() about calloc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156553 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 17:14:00 +00:00
Joel Jones
9777f61bfe formatting change: strip debug info from test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156551 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 16:55:31 +00:00
Manman Ren
8ae4f062e4 ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156550 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 16:48:21 +00:00
Joel Jones
9df72a93ef Fix a problem with incomplete equality testing of PHINodes in
Instruction::IsIdenticalToWhenDefined.

This manifested itself when inlining two calls to the same function.  The 
inlined function had a switch statement that returned one of a set of 
global variables.  Without this modification, the two phi instructions that 
chose values from the branches of the switch instruction inlined from the 
callee were considered equivalent and jump-threading replaced a load for the 
first switch value with a phi selecting from the second switch, thereby 
producing incorrect code.

This patch has been tested with "make check-all", "lnt runteste nt", and 
llvm self-hosted, and on the original program that had this problem, 
wireshark.

<rdar://problem/11025519>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156548 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 15:59:41 +00:00
Nadav Rotem
b210651654 AVX2: Add an additional broadcast idiom.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156540 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 12:39:13 +00:00
Nadav Rotem
5fc2187a02 Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program.
Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users.

Fix PR11900.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156539 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 12:22:05 +00:00
Dan Gohman
95b8cf1f11 Fix the objc_storeStrong recognizer to stop before walking off the
end of a basic block if there's no store.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156520 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 23:08:33 +00:00
Nuno Lopes
e3305b1750 objectsize:
refactor code a bit to enable future changes to support run-time information
add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156515 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 21:30:57 +00:00
Danil Malyshev
fd39afb5e1 Added a regress test for the bug #9964 before close it.
This bug was fixed by Jim Grosbach in #138879, thanks Jim!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156505 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 19:07:04 +00:00
Nuno Lopes
30759542aa change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept.
This commit only adds the parameter. Code taking advantage of it will follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156473 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 15:52:43 +00:00
Filipe Cabecinhas
7305295538 Fixed a typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156471 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 14:43:50 +00:00
Akira Hatanaka
2b409b65d4 Add another peephole pattern for conditional moves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156460 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 02:29:29 +00:00
Akira Hatanaka
56ec9f2c09 Make register FP allocatable if the compiled function does not have dynamic
allocas.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156458 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 01:38:13 +00:00
Akira Hatanaka
a284acb8a7 Expand 64-bit shifts if target ABI is O32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156457 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 00:55:21 +00:00
Dan Gohman
4670dace66 Fix objc_storeStrong pattern matching to catch a potential use of the
old value after the store but before it is released.
This fixes rdar:/11116986.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156442 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 23:34:08 +00:00
Eric Christopher
501207676c Handle OpDeref in case it comes in as a register operand.
Part of rdar://11352000

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156405 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 18:56:00 +00:00
Daniel Dunbar
f468fd805a Revert r156393, "[tests] Remove some remaining DejaGNU related cruft.", this
patch wasn't ready yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156395 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 18:26:07 +00:00
Daniel Dunbar
b3593a60ce [tests] Remove some remaining DejaGNU related cruft.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156393 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 18:11:49 +00:00
Duncan Sands
a337010989 Calling ReassociateExpression recursively is extremely dangerous since it will
replace the operands of expressions with only one use with undef and generate
a new expression for the original without using RAUW to update the original.
Thus any copies of the original expression held in a vector may end up
referring to some bogus value - and using a ValueHandle won't help since there
is no RAUW.  There is already a mechanism for getting the effect of recursion
non-recursively: adding the value to be recursed on to RedoInsts.  But it wasn't
being used systematically.  Have various places where recursion had snuck in at
some point use the RedoInsts mechanism instead.  Fixes PR12169.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156379 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 12:16:05 +00:00
Stepan Dyatkovskiy
1f9838347f Rejected r156374: Ordinary PR1255 patch. Due to clang-x86_64-debian-fnt buildbot failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156377 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 08:33:21 +00:00
Craig Topper
189bce48c7 Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156375 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 06:58:15 +00:00
Stepan Dyatkovskiy
85a4406959 Ordinary patch for PR1255.
Added new case-ranges orientated methods for adding/removing cases in SwitchInst. After this patch cases will internally representated as ConstantArray-s instead of ConstantInt, externally cases wrapped within the ConstantRangesSet object.
Old methods of SwitchInst are also works well, but marked as deprecated. So on this stage we have no side effects except that I added support for case ranges in BitcodeReader/Writer, of course test for Bitcode is also added. Old "switch" format is also supported.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156374 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 06:36:08 +00:00
Owen Anderson
713e953118 Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156324 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 20:51:25 +00:00
Owen Anderson
423f19f2da Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. This allows for greater CSE opportunities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156323 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 20:47:23 +00:00
Chad Rosier
42726835e3 Fix a regression from r147481. This combine should only happen if there is a
single use.
rdar://11360370


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 18:47:44 +00:00
Manman Ren
ed57984483 X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax

In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;

rdar: 10961709


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 18:06:23 +00:00
Eric Christopher
4adbefebd2 Add support for the 'l' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156294 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 06:25:15 +00:00
Eric Christopher
1d5a392e2c Add support for the 'c' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156293 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 06:25:10 +00:00
Eric Christopher
54412a789a Add support for the 'P' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156292 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 06:25:02 +00:00
Eric Christopher
1ce2034e43 Add support for the 'O' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156285 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:46:48 +00:00
Eric Christopher
60cfc7908e Add support for the 'N' inline asm constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156284 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:46:43 +00:00
Eric Christopher
5ac47bba83 Add support for the 'L' inline asm constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:46:37 +00:00
Eric Christopher
f49f846eec Add support for the inline asm constraint 'K'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156282 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:46:29 +00:00
Craig Topper
5f9cccc509 Add SSE4A MOVNTSS/MOVNTSD instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156281 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:36:19 +00:00
Eric Christopher
e5076d484b Support the 'J' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156280 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 03:13:42 +00:00
Eric Christopher
50ab03954e Add support for the 'I' inline asm constraint. Also add tests
from the previous 2 patches.

Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156279 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 03:13:32 +00:00
Benjamin Kramer
77c4ef8a47 Switch the select to branch transformation on by default.
The primitive conservative heuristic seems to give a slight overall
improvement while not regressing stuff. Make it available to wider
testing. If you notice any speed regressions (or significant code
size regressions) let me know!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156258 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-06 14:25:16 +00:00
Benjamin Kramer
59957500f9 CodeGenPrepare: Add a transform to turn selects into branches in some cases.
This came up when a change in block placement formed a cmov and slowed down a
hot loop by 50%:

	ucomisd	(%rdi), %xmm0
	cmovbel	%edx, %esi

cmov is a really bad choice in this context because it doesn't get branch
prediction. If we emit it as a branch, an out-of-order CPU can do a better job
(if the branch is predicted right) and avoid waiting for the slow load+compare
instruction to finish. Of course it won't help if the branch is unpredictable,
but those are really rare in practice.

This patch uses a dumb conservative heuristic, it turns all cmovs that have one
use and a direct memory operand into branches. cmovs usually save some code
size, so we disable the transform in -Os mode. In-Order architectures are
unlikely to benefit as well, those are included in the
"predictableSelectIsExpensive" flag.

It would be better to reuse branch probability info here, but BPI doesn't
support select instructions currently. It would make sense to use the same
heuristics as the if-converter pass, which does the opposite direction of this
transform.


Test suite shows a small improvement here and there on corei7-level machines,
but the actual results depend a lot on the used microarchitecture. The
transformation is currently disabled by default and available by passing the
-enable-cgp-select2branch flag to the code generator.

Thanks to Chandler for the initial test case to him and Evan Cheng for providing
me with comments and test-suite numbers that were more stable than mine :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156234 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-05 12:49:22 +00:00
Stepan Dyatkovskiy
3f71cf14b2 Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw".
Also added fix to 2011-06-13-nsw-alloca.ll test.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156231 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-05 07:09:40 +00:00
Justin Holewinski
49683f3c96 This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
The new target machines are:

nvptx (old ptx32) => 32-bit PTX
nvptx64 (old ptx64) => 64-bit PTX

The sources are based on the internal NVIDIA NVPTX back-end, and
contain more functionality than the current PTX back-end currently
provides.

NV_CONTRIB

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-04 20:18:50 +00:00
Sebastian Pop
2c7e5c714c Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits 16-bits encoding of CMN instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156195 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-04 19:53:56 +00:00
Craig Topper
f3640d7ec1 Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156156 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-04 04:44:49 +00:00
Kevin Enderby
2d524b0765 Fix issues with the ARM bl and blx thumb instructions and the J1 and J2 bits
for the assembler and disassembler.  Which were not being set/read correctly
for offsets greater than 22 bits in some cases.

Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156118 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 22:41:56 +00:00
Nuno Lopes
cb348b9b45 remove calls to calloc if the allocated memory is not used (it was already being done for malloc)
fix a few typos found by Chad in my previous commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156110 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 22:08:19 +00:00
Sirish Pande
26f61a158b Support for target dependent Hexagon VLIW packetizer.
This patch creates and optimizes packets as per Hexagon ISA rules.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156109 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 21:52:53 +00:00
Nuno Lopes
252ef566e8 add support for calloc to objectsize lowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156102 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 21:19:58 +00:00
Silviu Baranga
b422d0b65e Fixed disassembler for vstm/vldm ARM VFP instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156077 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 16:38:40 +00:00
Craig Topper
6b28d356c5 Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156059 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 07:12:59 +00:00
Evan Cheng
d99d68bcee Fix two-address pass's aggressive instruction commuting heuristics. It's meant
to catch cases like:
 %reg1024<def> = MOV r1
 %reg1025<def> = MOV r0
 %reg1026<def> = ADD %reg1024, %reg1025
 r0            = MOV %reg1026

By commuting ADD, it let coalescer eliminate all of the copies. However, there
was a bug in the heuristics where it ended up commuting the ADD in:

 %reg1024<def> = MOV r0
 %reg1025<def> = MOV 0
 %reg1026<def> = ADD %reg1024, %reg1025
 r0            = MOV %reg1026

That did no benefit but rather ensure the last MOV would not be coalesced.

rdar://11355268


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156048 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 01:45:13 +00:00
Owen Anderson
062c0a5b58 Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156029 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 22:17:40 +00:00
Owen Anderson
363e4b90c0 Teach DAG combine that multiplication by 1.0 can always be constant folded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 21:32:35 +00:00
Jim Grosbach
2727930ab4 ARM: Add missing two-operand VBIC aliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156019 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 21:11:56 +00:00
Manman Ren
e2849851b2 Revert r155853
The commit is intended to fix rdar://10961709.
But it is the root cause of PR12720.
Revert it for now.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155992 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 15:24:32 +00:00
Bill Wendling
55e7098bbc The value held in the vector may be RAUW'ed by some of the canonicalization
methods. Use a weak value handle to keep up with this.
PR12245


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155984 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 09:59:45 +00:00
Richard Barton
0a552d611e Disallow YIELD and other allocated nop hints in pre-ARMv6 architectures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155983 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 09:43:18 +00:00
Craig Topper
a9a568a79d Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155982 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 08:03:44 +00:00
Bill Wendling
95dd442041 Strip the pointer casts off of allocas so that the selection DAG can find them.
PR10799


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 22:50:45 +00:00
Jim Grosbach
54319e2a8c ARM: Add a few missing add->sub aliases w/ 'w' suffix.
Aliases for adding a negative immediate when using an explicit 'w'
suffix. E.g.,
        adds.w r2, #-16
        adds.w r2, r2, #-16
        addw r2, #-16
        addw r2, #-16
        addw r2, r2, #-16

rdar://11330769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155946 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 21:17:34 +00:00
Jim Grosbach
94b590f8fa ARM: allow vanilla expressions for movw/movt.
Expressions for movw/movt don't always have an :upper16: or :lower16:
on them and that's ok. When they don't, it's just a plain [0-65536]
immediate result, effectively the same as a :lower16: variant kind.

rdar://10550147

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155941 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 20:43:21 +00:00
Jim Grosbach
686c01854e MC: Unknown assembler directives are now hard errors.
Previously, an unsupported/unknown assembler directive issued a warning.
That's generally unsafe, and inconsistent with the behaviour of pretty
much every system assembler. Now that the MC assemblers are mature
enough to be the default on multiple targets, it's reasonable to
issue errors for these.

For target or platform directives that need to stay warnings, we
should add explicit handlers for them in, e.g., ELFAsmParser.cpp,
DarwinAsmParser.cpp, et. al., and issue the warning there.

rdar://9246275

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155926 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 18:38:27 +00:00
Manman Ren
769ea2f93f X86: optimization for max-like struct
This patch will optimize the following cases on X86
(a > b) ? (a-b) : 0
(a >= b) ? (a-b) : 0
(b < a) ? (a-b) : 0
(b <= a) ? (a-b) : 0

FROM
movl    %edi, %ecx
subl    %esi, %ecx
cmpl    %edi, %esi
movl    $0, %eax
cmovll  %ecx, %eax
TO
xorl    %eax, %eax
subl    %esi, %edi
cmovll  %eax, %edi
movl    %edi, %eax

rdar: 10734411


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155919 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 17:16:15 +00:00
Alexey Samsonov
d07d06ceef X86: Use StackRegister instead of FrameRegister in getFrameIndexReference (to generate debug info for local variables) if stack needs realignment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155917 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 15:16:06 +00:00
Jay Foad
4ab6fcaadc Regression test for PR2960.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155912 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 11:11:34 +00:00
Nick Lewycky
4056a73638 An instruction in a loop is not guaranteed to be executed just because the loop
has no exit blocks. Fixes PR12706!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155884 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 04:03:01 +00:00
Lang Hames
973f72a29a Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. Fixes
<rdar://problem/11291436>.

This is a second attempt at a fix for this, the first was r155468. Thanks
to Chandler, Bob and others for the feedback that helped me improve this.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155866 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 00:20:38 +00:00
Manman Ren
16a76519a5 X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155853 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 22:51:25 +00:00
Manman Ren
1701105022 test/CodeGen/X86/select.ll: remove spaces
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155840 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 18:54:27 +00:00
Derek Schuff
ddc693bd22 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

(this time, actually commit what was reviewed!)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155825 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 16:57:15 +00:00
Bob Wilson
ff73d8fef9 Don't introduce illegal types when creating vmull operations. <rdar://11324364>
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal.  Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 16:53:34 +00:00
Duncan Sands
5ff30e70f8 Just mark the sign bit as known zero, rather than any other irrelevant bits
known zero in the LHS.  Fixes PR12541.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155818 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 11:56:58 +00:00
Bill Wendling
bfbab99b58 Second attempt at PR12573:
Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If
the pass is *sure* that it thinks it knows what it's doing, then it may go ahead
and specify that the landing pad can have its critical edge split. The loop
unswitch pass is one of these passes. It will split the critical edges of all
edges coming from a loop to a landing pad not within the loop. Doing so will
retain important loop analysis information, such as loop simplify.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155817 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 10:44:54 +00:00
Rafael Espindola
9719cf329b Make sure HoistInsertPosition finds a position that is dominated by all
inputs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155809 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 03:53:06 +00:00
Andrew Trick
2674a4acdb Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice.
This time, also fix the caller of AddGlue to properly handle
incomplete chains. AddGlue had failure modes, but shamefully hid them
from its caller. It's luck ran out.

Fixes rdar://11314175: BuildSchedUnits assert.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155749 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-28 01:03:23 +00:00
Jim Grosbach
a9cc08f24f ARM: Thumb add(sp plus register) asm constraints.
Make sure when parsing the Thumb1 sp+register ADD instruction that
the source and destination operands match. In thumb2, just use the
wide encoding if they don't. In Thumb1, issue a diagnostic.

rdar://11219154

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155748 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 23:51:36 +00:00
Derek Schuff
f3db6b855e Revert r155745
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155746 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 23:37:41 +00:00
Derek Schuff
9dc28b0722 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155745 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 23:27:17 +00:00
Andrew Trick
0e47cfd5b6 Temporarily revert r155668: Fix the SD scheduler to avoid gluing.
This definitely caused regression with ARM -mno-thumb.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155743 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 22:55:59 +00:00
Chad Rosier
a73b6fc511 Add x86-specific DAG combine to simplify:
x == -y --> x+y == 0
 x != -y --> x+y != 0

On x86, the generated code goes from
   negl    %esi
   cmpl    %esi, %edi
   je    .LBB0_2
to
   addl    %esi, %edi
   je    .L4

This case is correctly handled for ARM with "cmn".

Patch by Manman Ren.
rdar://11245199
PR12545


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155739 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 22:33:25 +00:00
Evan Cheng
da71cca458 Make test less fragile.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155732 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 20:48:18 +00:00
Hal Finkel
e32e5440d6 Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.).
Target specific types should not be vectorized. As a practical matter,
these types are already register matched (at least in the x86 case),
and codegen does not always work correctly (at least in the ppc case,
and this is not worth fixing because ppc_fp128 is currently broken and
will probably go away soon).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155729 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 19:34:00 +00:00
Lang Hames
7787800481 Fix the order of the operands in the llvm.fma intrinsic patterns for ARM,
<rdar://problem/11325085>.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155724 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 18:51:24 +00:00
Dan Gohman
03e091f0b5 Reapply r155682, making constant folding more consistent, with a fix to work
properly with how the code handles all-undef PHI nodes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 17:50:22 +00:00
Richard Barton
04a09a461b Fix ARM assembly parsing for upper case condition codes on IT instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155720 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 17:34:01 +00:00
Benjamin Kramer
71275b129b Missed some register numbers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155706 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 12:21:46 +00:00
Benjamin Kramer
a356e9414c Update edis test for r155704.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155705 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 12:14:03 +00:00
Benjamin Kramer
17c836c4b5 X86: Don't emit conditional floating point moves on when targeting pre-pentiumpro architectures.
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.

Fixes PR6679. Patch by Christoph Erhardt!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155704 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 12:07:43 +00:00
NAKAMURA Takumi
d213ee7643 Revert r155682, "Use ConstantExpr::getExtractElement when constant-folding vectors"
It broke stage2 build. stage1/clang sometimes crashed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155699 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 07:59:20 +00:00
Kostya Serebryany
e507922779 [tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 07:31:53 +00:00
Craig Topper
76c5897eae Add mcpu to tests to prevent them from using AVX instructions on Sandy Bridge after r155618.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155696 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 07:11:58 +00:00
Evan Cheng
afb3b5ebe6 Implement a bastardized ABI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155686 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 02:11:10 +00:00
Evan Cheng
97a454317a - thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't suppport 32-bit Thumb2
instructions.
- However, it does support dmb, dsb, isb, mrs, and msr.
rdar://11331541


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155685 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 01:27:19 +00:00
Dan Gohman
97b44f9b80 Use ConstantExpr::getExtractElement when constant-folding vectors
instead of getAggregateElement. This has the advantage of being
more consistent and allowing higher-level constant folding to
procede even if an inner extract element cannot be folded.

Make ConstantFoldInstruction call ConstantFoldConstantExpression
on the instruction's operands, making it more consistent with 
ConstantFoldConstantExpression itself. This makes sure that
ConstantExprs get TargetData-aware folding before being handed
off as operands for further folding.

This causes more expressions to be folded, but due to a known
shortcoming in constant folding, this currently has the side effect
of stripping a few more nuw and inbounds flags in the non-targetdata
side of constant-fold-gep.ll. This is mostly harmless.

This fixes rdar://11324230.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155682 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 00:54:36 +00:00
Chad Rosier
c1fc5e4464 Add instcombine patterns for the following transformations:
(x & y) | (x ^ y) -> x | y 
 (x & y) + (x ^ y) -> x | y 

Patch by Manman Ren.
rdar://10770603


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155674 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-26 23:29:14 +00:00
Andrew Trick
aec9240be2 Fix the SD scheduler to avoid gluing the same node twice.
DAGCombine strangeness may result in multiple loads from the same
offset. They both may try to glue themselves to another load. We could
insist that the redundant loads glue themselves to each other, but the
beter fix is to bail out from bad gluing at the time we detect it.

Fixes rdar://11314175: BuildSchedUnits assert.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155668 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-26 21:48:25 +00:00
Tim Northover
37abe8df4a Use VLD1 in NEON extenting-load patterns instead of VLDR.
On some cores it's a bad idea for performance to mix VFP and NEON instructions
and since these patterns are NEON anyway, the NEON load should be used.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155630 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-26 08:46:29 +00:00
Chandler Carruth
464bda3a16 Teach the reassociate pass to fold chains of multiplies with repeated
elements to minimize the number of multiplies required to compute the
final result. This uses a heuristic to attempt to form near-optimal
binary exponentiation-style multiply chains. While there are some cases
it misses, it seems to at least a decent job on a very diverse range of
inputs.

Initial benchmarks show no interesting regressions, and an 8%
improvement on SPASS. Let me know if any other interesting results (in
either direction) crop up!

Credit to Richard Smith for the core algorithm, and helping code the
patch itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155616 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-26 05:30:30 +00:00
Evan Cheng
cac31de146 Specify cpu to unbreak tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155604 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-26 01:38:10 +00:00
Evan Cheng
e67a4163f5 If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume
the feature set of v7a. This comes about if the user specifies something like
-arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as
uxtab in this case.

rdar://11318438


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155601 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-26 01:13:36 +00:00
Jakob Stoklund Olesen
6962106496 Try to fix llvm-arm-linux builder with -mcpu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155589 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-25 21:22:33 +00:00
Preston Gurd
01c0dd1be3 Trivial change to make the test use -mcpu=generic so as to avoid
a failure if run on an Intel Atom with post RA instruction scheduling.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155587 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-25 21:04:54 +00:00
Chandler Carruth
3ef91f575e Actually delete now-empty file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155532 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-25 02:30:00 +00:00
Lang Hames
87aac6a877 Reverting r155468. Chris and Chandler have convinced me that it's dangerous and
in poor taste.

Talking through some alternate solutions with Chandler.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155530 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-25 02:16:54 +00:00
Akira Hatanaka
25052f4077 Do not use $gp as a dedicated global register if the target ABI is not O32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155522 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-25 01:24:52 +00:00
Jim Grosbach
14ce6fac24 ARM: improved assembler diagnostics for missing CPU features.
When an instruction match is found, but the subtarget features it
requires are not available (missing floating point unit, or thumb vs arm
mode, for example), issue a diagnostic that identifies what the feature
mismatch is.

rdar://11257547

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155499 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 22:40:08 +00:00
Nadav Rotem
80c1ea6f9b ConstantFoldSelectInstruction swapped the operands of the select.
Fix 12592. Patch by Matt Pharr.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155480 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 20:18:49 +00:00
Nadav Rotem
2003e03045 Fix the testcase. We do expect two vblendw on XMMs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155477 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 19:57:38 +00:00
Nadav Rotem
34a13bb412 Add a testcase for 155440
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155475 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 19:45:28 +00:00
Evan Cheng
ddb1420e17 MachineBasicBlock::SplitCriticalEdge() should follow LLVM IR variant and refuse to break edge to EH landing pad. rdar://11300144
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155470 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 19:06:55 +00:00
Lang Hames
1d9e68dab1 Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. This fixes
<rdar://problem/11291436>.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155468 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 18:58:36 +00:00
Chandler Carruth
7362ac7f8c Fix a crash on valid (if UB) bitcode that is produced for some global
constants in C++11 mode. I have no idea why it required such particular
circumstances to get here, the code seems clearly to rely upon unchecked
assumptions.

Specifically, when we decide to form an index into a struct type, we may
have gone through (at least one) zero-length array indexing round, which
would have left the offset un-adjusted, and thus not necessarily valid
for use when indexing the struct type.

This is just an canonicalization step, so the correct thing is to refuse
to canonicalize nonsensical GEPs of this form. Implemented, and test
case added.

Fixes PR12642. Pair debugged and coded with Richard Smith. =] I credit
him with most of the debugging, and preventing me from writing the wrong
code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155466 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 18:42:47 +00:00
Kevin Enderby
24e767d076 Add missing test cases for ARM VLD3 (single 3-element structure to all lanes)
instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155453 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 17:45:56 +00:00
Kevin Enderby
2c66edf434 Add missing test cases for ARM VLD4 (single 4-element structure to all lanes)
instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155444 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 15:55:00 +00:00
Nadav Rotem
d1a79136e3 AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions
using the pattern (vbroadcast (i32load src)). In some cases, after we generate
this pattern new users are added to the load node, which prevent the selection
of the blend pattern. This commit provides fallback patterns which perform
in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155437 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 11:07:03 +00:00
Bill Wendling
75920ad424 FileCheck-ize tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155434 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 10:45:44 +00:00
Bill Wendling
c6490d138f FileCheck-ize these tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155433 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 10:36:42 +00:00
Bill Wendling
d5cc8b81ca FileCheck-ize these tests. Harden some of them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155432 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-24 09:15:38 +00:00
Nadav Rotem
a35407705d Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the integer type is i8 (commonly used in graphics).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155397 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 21:53:37 +00:00
Preston Gurd
6a8c7bf8e7 This patch fixes a problem which arose when using the Post-RA scheduler
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes the all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().  

It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.

It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.

Patch by Andy Zhang!

Thanks to Jakob and Anton for their reviews.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155395 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 21:39:35 +00:00
Jim Grosbach
c34954d432 ARM: Add testcases for two-operand variants of VSRA/VRSRA/VSRI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155391 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 21:00:47 +00:00
Jim Grosbach
10a3933c5f Add ARM mode tests for the NEON vector shift-accumulate tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155390 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 21:00:44 +00:00
Jim Grosbach
2b8525068a Tidy up. Reformat for ease of reading.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155389 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 21:00:42 +00:00
Chandler Carruth
d410eaba04 Revert r155365, r155366, and r155367. All three of these have regression
test suite failures. The failures occur at each stage, and only get
worse, so I'm reverting all of them.

Please resubmit these patches, one at a time, after verifying that the
regression test suite passes. Never submit a patch without running the
regression test suite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155372 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 18:25:57 +00:00
Sirish Pande
15e56ad885 Hexagon V5 (floating point) support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155367 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 17:49:40 +00:00
Sirish Pande
1bfd24851e Support for Hexagon architectural feature, new value jump.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155366 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 17:49:28 +00:00
Sirish Pande
0dac3919e5 Support for Hexagon VLIW Packetizer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155365 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 17:49:20 +00:00
Jakob Stoklund Olesen
72847f3057 Reapply r155136 after fixing PR12599.
Original commit message:

Defer some shl transforms to DAGCombine.

The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155362 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 17:39:52 +00:00
Elena Demikhovsky
dd9047815c cleaned line endings in the newly added test file
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155315 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-22 13:22:48 +00:00
Chandler Carruth
a3e3481c57 Tidy up this test more:
1) Make the checked assertions a bit more precise. We really want the
   canonical forms coming out of reassociate to be exactly what is
   expected.
2) Remove other passes, and switch the test to actually directly check
   that reassociate makes the important transforms and
   canonicalizations.
3) Fold in a related test case now that we're using FileCheck. Make the
   same tidying changes to it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155311 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-22 10:11:26 +00:00
Chandler Carruth
71f8bc37f2 FileCheck-ize a test, and tidy it up a touch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155310 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-22 10:11:23 +00:00
Elena Demikhovsky
1da5867236 ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155309 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-22 09:39:03 +00:00
Nadav Rotem
db3461662e Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-21 20:08:32 +00:00
Jakob Stoklund Olesen
0b35c35efc Fix PR12599.
The X86 target is editing the selection DAG while isel is selecting
nodes following a topological ordering. When the DAG hacking triggers
CSE, nodes can be deleted and bad things happen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155257 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-20 23:36:09 +00:00
Jim Grosbach
d8b3ed8f25 ARM: Update NEON assembly two-operand aliases.
Use the new TwoOperandAliasConstraint to handle lots of the two-operand aliases
for NEON instructions. There's still more to go, but this is a good chunk of
them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155210 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-20 18:12:54 +00:00
Manuel Klimek
ee54010afe Removes json-bench from the test dependencies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155197 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-20 13:45:49 +00:00
Jakob Stoklund Olesen
eece9dc81c Revert r155136 "Defer some shl transforms to DAGCombine."
While the patch was perfect and defect free, it exposed a really nasty
bug in X86 SelectionDAG that caused an llc crash when compiling lencod.

I'll put the patch back in after fixing the SelectionDAG problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155181 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-20 00:38:45 +00:00
Jim Grosbach
181b147975 ARM some VFP tblgen'erated two-operand aliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155178 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-20 00:15:00 +00:00
Jim Grosbach
bfb3c5a50c Tidy up. Formatting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155177 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-20 00:14:57 +00:00
Dan Gohman
8b74e5afda Avoid a bug in the path count computation, preventing an infinite
loop repeatedlt making the same change. This is for rdar://11256239.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155160 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-19 21:50:46 +00:00
Joel Jones
c8969fd291 Test for the the problem with xors being changed into ands
when the set bits aren't the same for both args of the xor.
This transformation is in the function TargetLowering::SimplifyDemandedBits
in the file lib/CodeGen/SelectionDAG/TargetLowering.cpp.

I have tested this test using a previous version of llc which the defect and 
the a version of llc which does not. I got the expected fail and pass, 
respectively.

This test goes with rdar://11195364 and the check in with the fix: svn r154955


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155156 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-19 20:54:44 +00:00
Michael J. Spencer
75338097c7 Remove llvm-ld and llvm-stub (which is only used by llvm-ld).
llvm-ld is no longer useful and causes confusion and so it is being removed.

* Does not work very well on Windows because it must call a gcc like driver to
  assemble and link.
* Has lots of hard coded paths which are wrong on many systems.
* Does not understand most of ld's options.
* Can be partially replaced by llvm-link | opt | {llc | as, llc -filetype=obj} |
  ld, or fully replaced by Clang.

I know of no production use of llvm-ld, and hacking use should be
replaced by Clang's driver.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155147 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-19 19:27:54 +00:00
Jakob Stoklund Olesen
0d5fcae6cd Defer some shl transforms to DAGCombine.
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155136 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-19 16:46:26 +00:00
Jakob Stoklund Olesen
0d77b9c29c Extract the broken part of XFAILed test into its own file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155081 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-19 00:20:38 +00:00
Jakob Stoklund Olesen
f5782e2d60 FileCheckize
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155010 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-18 17:01:26 +00:00
Jakob Stoklund Olesen
377bf1acb9 Nobody likes shifty instructions, but that was a bit strong.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155009 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-18 16:44:44 +00:00
Silviu Baranga
35ee7d28a6 Added support for disassembling unpredictable swp/swpb ARM instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155004 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-18 14:18:57 +00:00
Silviu Baranga
6b9f97dd89 Fix the bahavior of the disassembler when decoding unpredictable mrs instructions on ARM. Now the diasassembler emmits warnings instead of errors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155002 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-18 14:09:07 +00:00
Silviu Baranga
fa1ebc6abe Added support for unpredictable mcrr/mcrr2/mrrc/mrrc2 ARM instruction in the disassembler. Since the upredicability conditions are complex, C++ code was added to handle them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155001 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-18 13:12:50 +00:00
Silviu Baranga
e546c4c9c3 Fixed decoding for the ARM cdp2 instruction. The restriction on the coprocessor number was removed for this instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155000 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-18 13:02:55 +00:00
Silviu Baranga
9e71231309 Add suport for unpredicatble cases of the cmp, tst, teq and cmnz ARM instructions in the disassembler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154999 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-18 12:48:43 +00:00
Joe Groff
41c3e9a326 FileCheckify, un-XFAIL SimplifyLibCalls/floor test
Fixes build on MSVC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154970 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-18 00:36:07 +00:00
Joe Groff
d15c581100 Move win32 SimplifyLibcall test under Transforms
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154967 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-18 00:07:45 +00:00
Joe Groff
d5bda5ec66 fix pr12559: mark unavailable win32 math libcalls
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154960 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-17 23:05:54 +00:00
Akira Hatanaka
ecdc9d5bb2 Add disassembler to MIPS.
Patch by Vladimir Medic. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154935 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-17 18:03:21 +00:00
Benjamin Kramer
93751c8c99 Force cmov on test so block placement doesn't shuffle the code around.
This made the test fail with -mcpu=generic (when building on a non-x86 host).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154926 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-17 13:55:23 +00:00
James Molloy
72aadc057c Fix bad EXTRACT_SUBREG in instruction selection for extending-loads on NEON.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154915 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-17 08:18:00 +00:00
Benjamin Kramer
86df062791 Revert "SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW."
This isn't right either, reverting for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154910 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-17 06:33:57 +00:00
Andrew Trick
8ca441aad3 Test cases that assume layout should use -disable-code-place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154908 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-17 06:20:42 +00:00
Kevin Enderby
c5a2a33938 Fix ARM disassembly of VLD2 (single 2-element structure to all lanes)
instructions with writebacks. And add test a case for all opcodes handed by
DecodeVLD2DupInstruction() in ARMDisassembler.cpp .


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154884 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-17 00:49:27 +00:00
Preston Gurd
8975f510c0 temporarily XFAIL this test until post RA
live-ins is properly enabled.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154882 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-17 00:21:35 +00:00
Chandler Carruth
fd2e4e65f7 Disable the atom scheduling test after r154874 broke it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154877 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 23:11:39 +00:00
Jim Grosbach
bf42f24e6e ARM two-operand forms for vhadd and vhsub instructions.
rdar://11252521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154875 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 23:00:25 +00:00
Chandler Carruth
177bea5330 Relax this test a touch to cope with different assembly variants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154870 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 22:20:48 +00:00
Chandler Carruth
f1a60c734c Fix updateTerminator to be resiliant to degenerate terminators where
both fallthrough and a conditional branch target the same successor.
Gracefully delete the conditional branch and introduce any unconditional
branch needed to reach the actual successor. This fixes memory
corruption in 2009-06-15-RegScavengerAssert.ll and possibly other tests.

Also, while I'm here fix a latent bug I spotted by inspection. I never
applied the same fundamental fix to this fallthrough successor finding
logic that I did to the logic used when there are no conditional
branches. As a consequence it would have selected landing pads had they
be aligned in just the right way here. I don't have a test case as
I spotted this by inspection, and the previous time I found this
required have of TableGen's source code to produce it. =/ I hate backend
bugs. ;]

Thanks to Jim Grosbach for helping me reason through this and reviewing
the fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154867 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 22:03:00 +00:00
Jim Grosbach
68f89a6158 MC assembly parser handling for trailing comma in macro instantiation.
A trailing comma means no argument at all (i.e., as if the comma were not
present), not an empty argument to the invokee.

rdar://11252521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154863 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 21:18:49 +00:00
Jakob Stoklund Olesen
39ac3252e8 FileCheckize these tests.
Add an extra test to ldr_post with an immediate increment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154859 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 20:56:42 +00:00
Jakob Stoklund Olesen
fbefc9125d Disable code placement for this test.
It makes it less sensitive to small changes in heuristics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154857 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 20:49:06 +00:00
Duncan Sands
2867c85a37 Remove support for the special 'fast' value for fpmath accuracy for the moment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154850 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 19:39:33 +00:00
Richard Smith
2c651fe6f4 Fix incorrect atomics codegen introduced in r154705, and extend test to catch it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154845 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 18:43:53 +00:00
Akira Hatanaka
1fbfea7b06 This patch fixes 3 problems:
1. CHECKNEXT was used instead of CHECK-NEXT which caused the line to be
   ignored which in turn hid the next 2 problems:
2. ('sh_offset', 0x{{{[0-9,a-f]+}}) had one too many leading curly braces and
   failed to do it's job of accepting all hex digits and:
3. The check for the hex values for the code instructions didn't account for
   blank separators.

Patch by Jack Carter. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154842 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 18:20:26 +00:00
Jim Grosbach
199366a6a6 ARM assembly two-operand forms for VRSHL.
rdar://11252521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154840 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 18:03:16 +00:00
Jim Grosbach
695eca66b1 Tidy up. Test formatting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154839 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 18:03:14 +00:00
Akira Hatanaka
3ef7edc77a Do not add offset in applyFixup. This has already been accounted for in Value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154838 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 18:00:19 +00:00
Jim Grosbach
705e2572b4 ARM two-operand aliases for VRHADD instructions.
rdar://11252521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154832 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 17:14:11 +00:00
Jim Grosbach
dbd6ba36e4 Tidy up. Testcase formatting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154831 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 17:14:07 +00:00
Bill Wendling
57ca13ecc4 Move to X86 directory because this fails on non-X86 platforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154825 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 16:38:48 +00:00
Duncan Sands
8883c43ddc Make it possible to indicate relaxed floating point requirements at the IR level
through the use of 'fpmath' metadata.  Currently this only provides a 'fpaccuracy'
value, which may be a number in ULPs or the keyword 'fast', however the intent is
that this will be extended with additional information about NaN's, infinities
etc later.  No optimizations have been hooked up to this so far.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154822 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 16:28:59 +00:00
Chandler Carruth
9e67db4af1 Flip the new block-placement pass to be on by default.
This is mostly to test the waters. I'd like to get results from FNT
build bots and other bots running on non-x86 platforms.

This feature has been pretty heavily tested over the last few months by
me, and it fixes several of the execution time regressions caused by the
inlining work by preventing inlining decisions from radically impacting
block layout.

I've seen very large improvements in yacr2 and ackermann benchmarks,
along with the expected noise across all of the benchmark suite whenever
code layout changes. I've analyzed all of the regressions and fixed
them, or found them to be impossible to fix. See my email to llvmdev for
more details.

I'd like for this to be in 3.1 as it complements the inliner changes,
but if any failures are showing up or anyone has concerns, it is just
a flag flip and so can be easily turned off.

I'm switching it on tonight to try and get at least one run through
various folks' performance suites in case SPEC or something else has
serious issues with it. I'll watch bots and revert if anything shows up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154816 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 13:49:17 +00:00
Chandler Carruth
0de089a4d0 Remove an overly brittle test. This test will no longer be interesting
once we start changing the block layout, so just nuke it. If anyone has
ideas about how to craft a code layout agnostic form of the test please
let me know.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154815 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 13:49:09 +00:00
Chandler Carruth
e773e8c3e5 Add a somewhat hacky heuristic to do something different from whole-loop
rotation. When there is a loop backedge which is an unconditional
branch, we will end up with a branch somewhere no matter what. Try
placing this backedge in a fallthrough position above the loop header as
that will definitely remove at least one branch from the loop iteration,
where whole loop rotation may not.

I haven't seen any benchmarks where this is important but loop-blocks.ll
tests for it, and so this will be covered when I flip the default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154812 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 13:33:36 +00:00
Richard Barton
d0c478d95f Add -disassemble support for -show-inst and -show-encode capability llvm-mc. Also refactor so all MC paraphernalia are created once for all uses as much as possible.
The test change is to account for the fact that the default disassembler behaviour has changed with regards to specifying the assembly syntax to use.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154809 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 11:32:10 +00:00
Chandler Carruth
16295fc20b Tweak the loop rotation logic to check whether the loop is naturally
laid out in a form with a fallthrough into the header and a fallthrough
out of the bottom. In that case, leave the loop alone because any
rotation will introduce unnecessary branches. If either side looks like
it will require an explicit branch, then the rotation won't add any, do
it to ensure the branch occurs outside of the loop (if possible) and
maximize the benefit of the fallthrough in the bottom.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154806 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 09:31:23 +00:00
Hal Finkel
31490baf38 Remove dead SD nodes after the combining pass. Fixes PR12201.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154786 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 03:33:22 +00:00
Chandler Carruth
70daea90af Rewrite how machine block placement handles loop rotation.
This is a complex change that resulted from a great deal of
experimentation with several different benchmarks. The one which proved
the most useful is included as a test case, but I don't know that it
captures all of the relevant changes, as I didn't have specific
regression tests for each, they were more the result of reasoning about
what the old algorithm would possibly do wrong. I'm also failing at the
moment to craft more targeted regression tests for these changes, if
anyone has ideas, it would be welcome.

The first big thing broken with the old algorithm is the idea that we
can take a basic block which has a loop-exiting successor and a looping
successor and use the looping successor as the layout top in order to
get that particular block to be the bottom of the loop after layout.
This happens to work in many cases, but not in all.

The second big thing broken was that we didn't try to select the exit
which fell into the nearest enclosing loop (to which we exit at all). As
a consequence, even if the rotation worked perfectly, it would result in
one of two bad layouts. Either the bottom of the loop would get
fallthrough, skipping across a nearer enclosing loop and thereby making
it discontiguous, or it would be forced to take an explicit jump over
the nearest enclosing loop to earch its successor. The point of the
rotation is to get fallthrough, so we need it to fallthrough to the
nearest loop it can.

The fix to the first issue is to actually layout the loop from the loop
header, and then rotate the loop such that the correct exiting edge can
be a fallthrough edge. This is actually much easier than I anticipated
because we can handle all the hard parts of finding a viable rotation
before we do the layout. We just store that, and then rotate after
layout is finished. No inner loops get split across the post-rotation
backedge because we check for them when selecting the rotation.

That fix exposed a latent problem with our exitting block selection --
we should allow the backedge to point into the middle of some inner-loop
chain as there is no real penalty to it, the whole point is that it
*won't* be a fallthrough edge. This may have blocked the rotation at all
in some cases, I have no idea and no test case as I've never seen it in
practice, it was just noticed by inspection.

Finally, all of these fixes, and studying the loops they produce,
highlighted another problem: in rotating loops like this, we sometimes
fail to align the destination of these backwards jumping edges. Fix this
by actually walking the backwards edges rather than relying on loopinfo.

This fixes regressions on heapsort if block placement is enabled as well
as lots of other cases where the previous logic would introduce an
abundance of unnecessary branches into the execution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154783 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 01:12:56 +00:00
Craig Topper
2cb1e9dc7d Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154778 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-15 22:43:31 +00:00
Nadav Rotem
f16af0a053 Fix PR12529. The Vxx family of instructions are only supported by AVX.
Use non-vex instructions for SSE4.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154770 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-15 19:36:44 +00:00
Nadav Rotem
3ab32ea49e When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154764 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-15 15:08:09 +00:00
Elena Demikhovsky
73c504af9d Added VPERM optimization for AVX2 shuffles
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154761 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-15 11:18:59 +00:00
Duncan Sands
5e5c5f8259 Rename "fpaccuracy" metadata to the more generic "fpmath". That's because I'm
thinking of generalizing it to be able to specify other freedoms beyond accuracy
(such as that NaN's don't have to be respected).  I'd like the 3.1 release (the
first one with this metadata) to have the more generic name already rather than
having to auto-upgrade it in 3.2.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154744 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-14 12:36:06 +00:00
Hal Finkel
bba23ed672 Fix an error in BBVectorize important for vectorizing pointer types.
When vectorizing pointer types it is important to realize that potential
pairs cannot be connected via the address pointer argument of a load or store.
This is because even after vectorization, the address is still a scalar because
the address of the higher half of the pair is implicit from the address of the
lower half (it need not be, and should not be, explicitly computed).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154735 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-14 07:32:50 +00:00
Hal Finkel
f3f5a1e6f7 Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154734 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-14 07:32:43 +00:00
Richard Smith
42fc29e717 Fix X86 codegen for 'atomicrmw nand' to generate *x = ~(*x & y), not *x = ~*x & y.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154705 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 22:47:00 +00:00
Hal Finkel
fc3665c875 Add support to BBVectorize for vectorizing selects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154700 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 20:45:45 +00:00
Evan Cheng
7ece9539c2 On Darwin targets, only use vfma etc. if the source use fma() intrinsic explicitly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154689 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 18:59:28 +00:00
Dan Gohman
4423477548 Consider ObjC runtime calls objc_storeWeak and others which make a copy of
their argument as "escape" points for objc_retainBlock optimization.
This fixes rdar://11229925.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154682 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 18:28:58 +00:00
Sylvestre Ledru
e92077f11e Catch the Python exception when subprocess.Popen is failing.
For example, if llc cannot be found, the full python stacktrace is displayed
and no interesting information are provided.
+ fail the process when an exception occurs



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154665 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 11:22:18 +00:00
Dan Gohman
6c189ecbe6 Use the new Use-aware dominates method to apply the objc runtime
library return value optimization for phi uses. Even when the
phi itself is not dominated, the specific use may be dominated.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154647 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 01:08:28 +00:00
Dan Gohman
511568dd1f Don't move objc_autorelease calls past autorelease pool boundaries when
optimizing autorelease calls on phi nodes with null operands.
This fixes rdar://11207070.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154642 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 00:59:57 +00:00
Sirish Pande
2f69e4cf32 Disable Hexagon test temporarily.
There is an assert at line 558 in ScheduleDAGInstrs::buildSchedGraph(AliasAnalysis *AA).
This assert needs to addressed for post RA scheduler. Until that assert is addressed,
any passes that uses post ra scheduler will fail. So, I am temporarily disabling the
hexagon tests until that fix is in.

The assert is as follows:
    assert(!MI->isTerminator() && !MI->isLabel() &&
               "Cannot schedule terminators or labels!");

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154617 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-12 21:06:54 +00:00
Preston Gurd
c68dda815e This patch improves the MCJIT runtime dynamic loader by adding new handling
of zero-initialized sections, virtual sections and common symbols
and preventing the loading of sections which are not required for
execution such as debug information.

Patch by Andy Kaylor!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154610 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-12 20:13:57 +00:00
Craig Topper
bf596c9c61 Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154580 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-12 07:23:00 +00:00
Akira Hatanaka
ed08489a71 Revert changes that were accidentally committed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154563 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 23:19:55 +00:00
Akira Hatanaka
55e0e43e4f Fix string that is being checked.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154547 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 23:11:33 +00:00
Akira Hatanaka
1cc6333161 Emit neg.s or neg.d only if -enable-no-nans-fp-math is supplied by user,
otherwise expand FNEG during legalization.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154546 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 22:59:08 +00:00
Akira Hatanaka
c12a6e6b53 Emit abs.s or abs.d only if -enable-no-nans-fp-math is supplied by user.
Invalid operation is signaled if the operand of these instructions is NaN.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154545 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 22:49:04 +00:00
Kevin Enderby
b318cc16c9 Fixed a case of ARM disassembly getting an assert on a bad encoding
of a VST instruction.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 22:40:17 +00:00
Akira Hatanaka
056c51e598 Fix bugs in lowering of FCOPYSIGN nodes.
- FCOPYSIGN nodes that have operands of different types were not handled.
- Different code was generated depending on the endianness of the target.

Additionally, code is added that emits INS and EXT instructions, if they are
supported by target (they are R2 instructions).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154540 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 22:13:04 +00:00
Jim Grosbach
1835547ec1 ARM 'vuzp.32 Dd, Dm' is a pseudo-instruction.
While there is an encoding for it in VUZP, the result of that is undefined,
so we should avoid it. Define the instruction as a pseudo for VTRN.32
instead, as the ARM ARM indicates.

rdar://11222366

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154511 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 17:40:18 +00:00
Jim Grosbach
6073b30b05 ARM 'vzip.32 Dd, Dm' is a pseudo-instruction.
While there is an encoding for it in VZIP, the result of that is undefined,
so we should avoid it. Define the instruction as a pseudo for VTRN.32
instead, as the ARM ARM indicates.

rdar://11221911

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154505 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 16:53:25 +00:00
Evan Cheng
14b4c03580 Add more fused mul+add/sub patterns. rdar://10139676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154484 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 06:59:47 +00:00
Nadav Rotem
e611378a6e Reapply 154396 after fixing a test.
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154483 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 06:40:27 +00:00
Evan Cheng
bee78fe5fc Clean up ARM fused multiply + add/sub support some more: rename some isel
predicates.
Also remove NEON2 since it's not really useful and it is confusing. If
NEON + VFP4 implies NEON2 but NEON2 doesn't imply NEON + VFP4, what does it
really mean?

rdar://10139676


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154480 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 05:33:07 +00:00
Evan Cheng
92c904539a Match (fneg (fma) to vfnma. rdar://10139676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154469 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 01:21:25 +00:00
Charles Davis
0d82fe77f2 Add retw and lretw instructions. Also, fix Intel syntax parsing for all
ret instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154468 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 01:10:53 +00:00
Evan Cheng
a0908d0a44 Merge fma.ll into fusedMAC.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154466 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 01:03:11 +00:00
Kevin Enderby
a69da35c12 Fix ARM disassembly of VLD instructions with writebacks.  And add test a case
for all opcodes handed by DecodeVLDInstruction() in ARMDisassembler.cpp .


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154459 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 00:25:40 +00:00
Jim Grosbach
a5378ebe78 ARM add missing Thumb1 two-operand aliases for shift-by-immediate.
rdar://11222742

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154457 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 00:15:16 +00:00
Evan Cheng
82509e5c62 Fix a number of problems with ARM fused multiply add/subtract instructions.
1. The new instruction itinerary entries are not properly described.
2. The asm parser can't handle vfms and vfnms.
3. There were no assembler, disassembler test cases.
4. HasNEON2 has the wrong assembler predicate.
rdar://10139676


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154456 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 00:13:00 +00:00
Jakob Stoklund Olesen
89cdaf46ec Fix test to be register assignment invariant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154453 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 00:00:24 +00:00
Owen Anderson
06886aaaeb Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at sound point.
Zap a testcase that this allows us to completely fold away.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154447 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 22:46:53 +00:00
Kostya Serebryany
cff60c1409 [tsan] two more compile-time optimizations:
- don't isntrument reads from constant globals.
Saves ~1.5% of instrumented instructions on CPU2006
(counting static instructions, not their execution).
- don't insrument reads from vtable (which is a global constant too).
Saves ~5%.

I did not measure the run-time impact of this,
but it is certainly non-negative.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154444 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 22:29:17 +00:00
Evan Cheng
3aef2ff514 Handle llvm.fma.* intrinsics. rdar://10914096
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154439 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 21:40:28 +00:00
Duncan Sands
507bb7a42f Add a comment noting that the fdiv -> fmul conversion won't generate
multiplication by a denormal, and some tests checking that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154431 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 20:35:27 +00:00
Eric Christopher
a139051654 Temporarily revert this patch to see if it brings the buildbots back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154425 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 19:33:16 +00:00
Kostya Serebryany
2076af0184 [tsan] compile-time instrumentation: do not instrument a read if
a write to the same temp follows in the same BB.
Also add stats printing.

On Spec CPU2006 this optimization saves roughly 4% of instrumented reads
(which is 3% of all instrumented accesses):
Writes            : 161216
Reads             : 446458
Reads-before-write: 18295



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154418 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 18:18:56 +00:00
Eric Christopher
18112d83e7 To ensure that we have more accurate line information for a block
don't elide the branch instruction if it's the only one in the block,
otherwise it's ok.

PR9796 and rdar://11215207

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154417 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 18:18:10 +00:00
Jim Grosbach
a23ecc2ba9 ARM fix cc_out operand handling for t2SUBrr instructions.
We were incorrectly conflating some add variants which don't have a
cc_out operand with the mirroring sub encodings, which do. Part of the
awesome non-orthogonality legacy of thumb1. Similarly, handling of
add/sub of an immediate was sometimes incorrectly removing the cc_out
operand for add/sub register variants.

rdar://11216577

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154411 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 17:31:55 +00:00
Nadav Rotem
50e64cfe6e Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154396 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 14:33:13 +00:00
Anton Korobeynikov
999821cddf Transform div to mul with reciprocal only when fp imm is legal.
This fixes PR12516 and uncovers one weird problem in legalize (workarounded)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154394 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 13:22:49 +00:00
Duncan Sands
1fd63df693 Express the number of ULPs in fpaccuracy metadata as a real rather than a
rational number, eg as 2.5 rather than 5, 2.  OK'd by Peter Collingbourne.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154387 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 08:22:43 +00:00
Andrew Trick
d9fc1ce809 Fix 12513: Loop unrolling breaks with indirect branches.
Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154386 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 05:14:42 +00:00
Evan Cheng
fa12d0df5a Add proper checks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154379 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 03:15:42 +00:00
Evan Cheng
bf010eb911 Fix a long standing tail call optimization bug. When a libcall is emitted
legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154370 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 01:51:00 +00:00
Rafael Espindola
fdb230a154 Don't try to zExt just to check if an integer constant is zero, it might
not fit in a i64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154364 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 00:16:22 +00:00
Lang Hames
23f369d1fe Test case for PR12495.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154359 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 23:58:59 +00:00
Akira Hatanaka
787c3fd385 Have TargetLowering::getPICJumpTableRelocBase return a node that points to the
GOT if jump table uses 64-bit gp-relative relocation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154341 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 20:32:12 +00:00
Chad Rosier
7f35455708 When performing a truncating store, it's possible to rearrange the data
in-register, such that we can use a single vector store rather then a 
series of scalar stores.

For func_4_8 the generated code

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vmov.u16	r0, d16[3]
	strb	r0, [r2, #3]
	vmov.u16	r0, d16[2]
	strb	r0, [r2, #2]
	vmov.u16	r0, d16[1]
	strb	r0, [r2, #1]
	vmov.u16	r0, d16[0]
	strb	r0, [r2]
	bx	lr

becomes

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vuzp.8	d16, d17
	vst1.32	{d16[0]}, [r2, :32]
	bx	lr

I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.

This

	ldrh	r0, [r0, #4]
	strh	r0, [r1]

becomes

	vldr	d16, [r0]
	vmov.u16	r0, d16[2]
	vmov.32	d16[0], r0
	vuzp.16	d16, d17
	vst1.32	{d16[0]}, [r1, :32]

PR11158
rdar://10703339


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154340 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 20:32:02 +00:00
Rafael Espindola
decbc43f72 Pattern match a setcc of boolean value with 0 as a truncate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154322 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 16:06:03 +00:00
Nadav Rotem
e80aa7c783 Lower some x86 shuffle sequences to the vblend family of instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154313 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 08:33:21 +00:00
Nadav Rotem
154819dd6f Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type.
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154310 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 07:45:58 +00:00
Chandler Carruth
ab5a55e118 Cleanup and relax a restriction on the matching of global offsets into
x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.

To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
   the match to try to fit RIP into the base register. If it fails, it
   now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
   as it did before. The reason these immediates are safe is because the
   ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
   This is the only case changed by the patch, and the primary place you
   see it is in TLS, either the win64 section offset TLS or Linux
   local-exec TLS model in a PIC compilation. Here the ABI again ensures
   that the immediates fit because we are in small mode, and any other
   operations required due to the PIC relocation model have been handled
   externally to the Wrapper node (extra loads etc are made around the
   wrapper node in ISelLowering).

I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 02:13:06 +00:00
Chandler Carruth
6916a2375a Fold 15 tiny test cases into a single file that implements the
comprehensive testing of TLS codegen for x86. Convert all of the ones
that were still using grep to use FileCheck. Remove some redundancies
between them.

Perhaps most interestingly expand the test cases so that they actually
fully list the instruction snippet being tested. TLS operations are
*very* narrowly defined, and so these seem reasonably stable. More
importantly, the existing test cases already were crazy fine grained,
expecting specific registers to be allocated. This just clarifies that
no *other* instructions are expected, and fills in some crucial gaps
that weren't being tested at all.

This will make any subsequent changes to TLS much more clear during
review.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154303 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 01:43:17 +00:00
Duncan Sands
3ef3fcfc04 Only have codegen turn fdiv by a constant into fmul by the reciprocal
when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly.  There is already an IR level transform that does
that, and it does it more carefully.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08 18:08:12 +00:00
Chandler Carruth
253933ee9e Teach LLVM about a PIE option which, when enabled on top of PIC, makes
optimizations which are valid for position independent code being linked
into a single executable, but not for such code being linked into
a shared library.

I discussed the design of this with Eric Christopher, and the decision
was to support an optional bit rather than a completely separate
relocation model. Fundamentally, this is still PIC relocation, its just
that certain optimizations are only valid under a PIC relocation model
when the resulting code won't be in a shared library. The simplest path
to here is to expose a single bit option in the TargetOptions. If folks
have different/better designs, I'm all ears. =]

I've included the first optimization based upon this: changing TLS
models to the *Exec models when PIE is enabled. This is the LLVM
component of PR12380 and is all of the hard work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154294 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08 17:51:45 +00:00
Chandler Carruth
2450eca960 Teach InstCombine to nuke a common alloca pattern -- an alloca which has
GEPs, bit casts, and stores reaching it but no other instructions. These
often show up during the iterative processing of the inliner, SROA, and
DCE. Once we hit this point, we can completely remove the alloca. These
were actually showing up in the final, fully optimized code in a bunch
of inliner tests I've been working on, and notably they show up after
LLVM finishes optimizing away all function calls involved in
hash_combine(a, b).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154285 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08 14:36:56 +00:00
Nadav Rotem
9d68b06bc5 AVX2: Build splat vectors by broadcasting a scalar from the constant pool.
Previously we used three instructions to broadcast an immediate value into a
vector register.
On Sandybridge we continue to load the broadcasted value from the constant pool.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154284 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08 12:54:54 +00:00
Bill Wendling
864737cc51 Remove old 'grep' lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08 11:53:54 +00:00
Bill Wendling
a0126afec8 FileCheckize these testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154281 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08 11:00:38 +00:00
Nadav Rotem
d16c8d0d33 1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new
shuffle node because it could introduce new shuffle nodes that were not
   supported efficiently by the target.

2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
   second shuffle reverses the transformation of the first shuffle.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154266 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 21:19:08 +00:00
Duncan Sands
961d666be4 Convert floating point division by a constant into multiplication by the
reciprocal if converting to the reciprocal is exact.  Do it even if inexact
if -ffast-math.  This substantially speeds up ac.f90 from the polyhedron
benchmarks.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154265 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 20:04:00 +00:00
Chandler Carruth
c0d18b6696 Fix ValueTracking to conclude that debug intrinsics are safe to
speculate. Without this, loop rotate (among many other places) would
suddenly stop working in the presence of debug info. I found this
looking at loop rotate, and have augmented its tests with a reduction
out of a very hot loop in yacr2 where failing to do this rotation costs
sometimes more than 10% in runtime performance, perturbing numerous
downstream optimizations.

This should have no impact on performance without debug info, but the
change in performance when debug info is enabled can be extreme. As
a consequence (and this how I got to this yak) any profiling of
performance problems should be treated with deep suspicion -- they may
have been wildly innacurate of debug info was enabled for profiling. =/
Just a heads up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154263 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 19:22:18 +00:00
Benjamin Kramer
c77764591b SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW.
Found by inspection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154262 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 17:19:26 +00:00
Sean Hunt
0fdfaafb70 Make the test for r154235 more platform-independent with a shorter
string.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154243 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 01:33:14 +00:00
Sean Hunt
3420e7f360 Output UTF-8-encoded characters as identifier characters into assembly
by default.

This is a behaviour configurable in the MCAsmInfo. I've decided to turn
it on by default in (possibly optimistic) hopes that most assemblers are
reasonably sane. If this proves a problem, switching to default seems
reasonable.

I'm not sure if this is the opportune place to test, but it seemed good
to make sure it was tested somewhere.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154235 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 00:37:53 +00:00
Akira Hatanaka
3e59b5edd6 Add lines in global-address.ll to test N32 and N64 code generation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154202 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 20:23:36 +00:00
Jakob Stoklund Olesen
70fbea7c75 Allow negative immediates in ARM and Thumb2 compares.
ARM and Thumb2 mode can use cmn instructions to compare against negative
immediates. Thumb1 mode can't.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154183 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 17:45:04 +00:00
Chandler Carruth
9ceebb7e92 Sink the collection of return instructions until after *all*
simplification has been performed. This is a bit less efficient
(requires another ilist walk of the basic blocks) but shouldn't matter
in practice. More importantly, it's just too much work to keep track of
all the various ways the return instructions can be mutated while
simplifying them. This fixes yet another crasher, reported by Daniel
Dunbar.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154179 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 17:21:31 +00:00
Chandler Carruth
be2df1675d Tweak this test to ensure the inliner did indeed fire. Thanks to Richard
Smith for pointing this out in review.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154178 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 17:21:28 +00:00
Craig Topper
f85cb768fe Test case for PR12413
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154172 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 14:38:25 +00:00
Craig Topper
9a2b6e1d7b Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154166 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 07:45:23 +00:00
Craig Topper
e45cddfa08 Add the tests that were supposed to go with r153935 that I forgot svn add
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154165 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 07:09:59 +00:00
Chandler Carruth
c0a7a1280c Actually finish this sentence in the comment the way I intended. Thanks
Matt for pointing this out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154158 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 01:19:38 +00:00
Chandler Carruth
6bbab86af9 Sink the return instruction collection until after we're done deleting
dead code, including dead return instructions in some cases. Otherwise,
we end up having a bogus poniter to a return instruction that blows up
much further down the road.

It turns out that this pattern is both simpler to code, easier to update
in the face of enhancements to the inliner cleanup, and likely cheaper
given that it won't add dead instructions to the list.

Thanks to John Regehr's numerous test cases for teasing this out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154157 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 01:11:52 +00:00
Jim Grosbach
4e53fe8dc6 ARM assembly aliases for add negative immediates using sub.
'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out.
Thumb1 aliases for adding a negative immediate to the stack pointer,
also.

rdar://11192734

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154123 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 20:57:13 +00:00
Akira Hatanaka
ba9536a3c6 Reapply test case in 154038, this time with triple to prevent the backend
from emitting gp_rel relocation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154122 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 20:44:35 +00:00
Eric Christopher
60b35f408b Patch to set is_stmt a little better for prologue lines in a function.
This enables debuggers to see what are interesting lines for a
breakpoint rather than any line that starts a function.

rdar://9852092

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154120 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 20:39:05 +00:00
Jakob Stoklund Olesen
740cd657f3 Don't break the IV update in TLI::SimplifySetCC().
LSR always tries to make the ICmp in the loop latch use the incremented
induction variable. This allows the induction variable to be kept in a
single register.

When the induction variable limit is equal to the stride,
SimplifySetCC() would break LSR's hard work by transforming:

   (icmp (add iv, stride), stride) --> (cmp iv, 0)

This forced us to use lea for the IC update, preventing the simpler
incl+cmp.

<rdar://problem/7643606>
<rdar://problem/11184260>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154119 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 20:30:20 +00:00
Dan Gohman
036ebfd874 Fix accidentally inverted logic from r152803, and make the
testcase slightly less trivial. This fixes rdar://11171718.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154118 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 20:27:21 +00:00
Silviu Baranga
1c01249191 Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154101 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 16:19:29 +00:00
Silviu Baranga
82e1bba0e4 Added support for handling unpredictable arithmetic instructions on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154100 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 16:13:15 +00:00
James Molloy
17dcaf5ef9 An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases.
Noticed while investigating PR12274!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154090 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 10:01:12 +00:00
Jim Grosbach
22378fd664 ARM assembly aliases for two-operand V[R]SHR instructions.
rdar://11189467

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154087 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 07:23:53 +00:00
Jim Grosbach
b657a90929 ARM assembly parsing for 'msr' plain 'cpsr' operand.
Plain 'cpsr' is an alias for 'cpsr_fc'.

rdar://11153753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154080 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 03:17:53 +00:00
Jakob Stoklund Olesen
9243c4f7c5 Pass the right sign to TLI->isLegalICmpImmediate.
LSR can fold three addressing modes into its ICmpZero node:

  ICmpZero BaseReg + Offset      => ICmp BaseReg, -Offset
  ICmpZero -1*ScaleReg + Offset  => ICmp ScaleReg, Offset
  ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg

The first two cases are only used if TLI->isLegalICmpImmediate() likes
the offset.

Make sure the right Offset sign is passed to this method in the second
case. The ARM version is not symmetric.

<rdar://problem/11184260>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154079 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 03:10:56 +00:00
Akira Hatanaka
56ce6b3520 Reapply 154038 without the failing test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154062 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 22:16:36 +00:00
Owen Anderson
657a4e774c Revert r154038. It was causing make check failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154054 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 21:18:58 +00:00
Akira Hatanaka
e825fb3888 Fix LowerGlobalAddress to produce instructions with the correct relocation
types for N32 ABI. Add new test case and update existing ones.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154038 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 19:02:38 +00:00
Akira Hatanaka
86a2733055 Fix LowerConstantPool to produce instructions with the correct relocation
types for N32 ABI and update test case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154034 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen
c5041cac7d Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr.
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.

<rdar://problem/11182914>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154033 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 18:23:42 +00:00
Akira Hatanaka
03d830e4f9 Fix LowerBlockAddress to produce instructions with the correct relocation
types for N32 ABI and update test case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154031 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 18:22:53 +00:00
Hongbin Zheng
99c8a5a64a Add testcase for r154007, when a function has the optsize attribute,
the loop should be unrolled according the value of OptSizeUnrollThreshold.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154014 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 13:24:40 +00:00
Rafael Espindola
26c8dcc692 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154011 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 12:51:34 +00:00
Michael J. Spencer
93210e847a Add YAML parser to Support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153977 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 23:09:22 +00:00
Pete Cooper
2ce63c7352 Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELET as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153976 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 22:57:55 +00:00
Eric Christopher
fa5b050136 Fix thinko check for number of operands to be the one that actually
might have more than 19 operands. Add a testcase to make sure I
never screw that up again.

Part of rdar://11026482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153961 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 17:55:42 +00:00
Nadav Rotem
43b32e0cff Add an additional testcase which checks ops with multiple users.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153939 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 07:39:36 +00:00
Craig Topper
769bbfd951 Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153935 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 05:20:24 +00:00
Akira Hatanaka
9dd16d41a2 Revert r153924. Delete test/MC/Disassembler/Mips and lib/Target/Mips/Disassembler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153926 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 03:01:13 +00:00
Akira Hatanaka
02365945a6 Revert r153924. There were buildbot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153925 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 02:51:09 +00:00
Akira Hatanaka
885020a7a7 MIPS disassembler support.
Patch by Vladimir Medic.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153924 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 02:20:58 +00:00
Jakob Stoklund Olesen
e3b23cde80 Allocate virtual registers in ascending order.
This is just the fallback tie-breaker ordering, the main allocation
order is still descending size.

Patch by Shamil Kurmangaleev!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153904 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 22:30:39 +00:00
Lang Hames
be9fe49b17 During two-address lowering, rescheduling an instruction does not untie
operands. Make TryInstructionTransform return false to reflect this.
Fixes PR11861.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153892 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 19:58:43 +00:00
Rafael Espindola
ce167840b2 No need to run llvm-as.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153890 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 19:44:20 +00:00
Akira Hatanaka
a551a48402 Initial 64 bit direct object support.
This patch allows llvm to recognize that a 64 bit object file is being produced
and that the subsequently generated ELF header has the correct information.

The test case checks for both big and little endian flavors.

Patch by Jack Carter.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153889 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 19:25:22 +00:00
Stepan Dyatkovskiy
aad9c3f17a Fast fix for PR12343:
http://llvm.org/bugs/show_bug.cgi?id=12343

We have not trivial way for splitting edges that are goes from indirect branch. We can do it with some tricks, but it should be additionally discussed. And it is still dangerous due to difficulty of indirect branches controlling.

Fix forbids this case for unswitching.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153879 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 17:16:45 +00:00
Silviu Baranga
50ac2e9229 Added fix in TableGen instruction decoder generation. The decoder now breaks for every leaf node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153874 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 15:20:39 +00:00
Nadav Rotem
44b5e6de8c Optimizing swizzles of complex shuffles may generate additional complex shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.






git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153864 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 07:11:12 +00:00
Hal Finkel
19aa2b5015 Enable prefetch generation on PPC64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153851 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-01 20:08:17 +00:00
Nadav Rotem
4ac9081c71 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153848 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-01 19:31:22 +00:00
Hal Finkel
4d989ac93c Add instruction itinerary for the PPC64 A2 core.
This adds a full itinerary for IBM's PPC64 A2 embedded core. These
cores form the basis for the CPUs in the new IBM BG/Q supercomputer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153842 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-01 19:22:40 +00:00
Chandler Carruth
48ec3b50e7 Add some more testing to cover the remaining two cases where
always-inlining is disabled: recursive functions and indirectbr.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153833 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-01 10:36:17 +00:00
Chandler Carruth
6052eef8bd Fix a pretty scary bug I introduced into the always inliner with
a single missing character. Somehow, this had gone untested. I've added
tests for returns-twice logic specifically with the always-inliner that
would have caught this, and fixed the bug.

Thanks to Matt for the careful review and spotting this!!! =D

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153832 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-01 10:21:05 +00:00
Chandler Carruth
0b42f9dd2f Replace four tiny tests with various uses of grep and not with a single
test and FileCheck.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153831 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-01 10:11:17 +00:00
Rafael Espindola
f10037b9ff Add a triple to the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153818 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-31 18:59:07 +00:00
Rafael Espindola
95d594cac3 Teach CodeGen's version of computeMaskedBits to understand the range metadata.
This is the CodeGen equivalent of r153747. I tested that there is not noticeable
performance difference with any combination of -O0/-O2 /-g when compiling
gcc as a single compilation unit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153817 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-31 18:14:00 +00:00
Chandler Carruth
f2286b0152 Initial commit for the rewrite of the inline cost analysis to operate
on a per-callsite walk of the called function's instructions, in
breadth-first order over the potentially reachable set of basic blocks.

This is a major shift in how inline cost analysis works to improve the
accuracy and rationality of inlining decisions. A brief outline of the
algorithm this moves to:

- Build a simplification mapping based on the callsite arguments to the
  function arguments.
- Push the entry block onto a worklist of potentially-live basic blocks.
- Pop the first block off of the *front* of the worklist (for
  breadth-first ordering) and walk its instructions using a custom
  InstVisitor.
- For each instruction's operands, re-map them based on the
  simplification mappings available for the given callsite.
- Compute any simplification possible of the instruction after
  re-mapping, and store that back int othe simplification mapping.
- Compute any bonuses, costs, or other impacts of the instruction on the
  cost metric.
- When the terminator is reached, replace any conditional value in the
  terminator with any simplifications from the mapping we have, and add
  any successors which are not proven to be dead from these
  simplifications to the worklist.
- Pop the next block off of the front of the worklist, and repeat.
- As soon as the cost of inlining exceeds the threshold for the
  callsite, stop analyzing the function in order to bound cost.

The primary goal of this algorithm is to perfectly handle dead code
paths. We do not want any code in trivially dead code paths to impact
inlining decisions. The previous metric was *extremely* flawed here, and
would always subtract the average cost of two successors of
a conditional branch when it was proven to become an unconditional
branch at the callsite. There was no handling of wildly different costs
between the two successors, which would cause inlining when the path
actually taken was too large, and no inlining when the path actually
taken was trivially simple. There was also no handling of the code
*path*, only the immediate successors. These problems vanish completely
now. See the added regression tests for the shiny new features -- we
skip recursive function calls, SROA-killing instructions, and high cost
complex CFG structures when dead at the callsite being analyzed.

Switching to this algorithm required refactoring the inline cost
interface to accept the actual threshold rather than simply returning
a single cost. The resulting interface is pretty bad, and I'm planning
to do lots of interface cleanup after this patch.

Several other refactorings fell out of this, but I've tried to minimize
them for this patch. =/ There is still more cleanup that can be done
here. Please point out anything that you see in review.

I've worked really hard to try to mirror at least the spirit of all of
the previous heuristics in the new model. It's not clear that they are
all correct any more, but I wanted to minimize the change in this single
patch, it's already a bit ridiculous. One heuristic that is *not* yet
mirrored is to allow inlining of functions with a dynamic alloca *if*
the caller has a dynamic alloca. I will add this back, but I think the
most reasonable way requires changes to the inliner itself rather than
just the cost metric, and so I've deferred this for a subsequent patch.
The test case is XFAIL-ed until then.

As mentioned in the review mail, this seems to make Clang run about 1%
to 2% faster in -O0, but makes its binary size grow by just under 4%.
I've looked into the 4% growth, and it can be fixed, but requires
changes to other parts of the inliner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153812 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-31 12:42:41 +00:00
Chandler Carruth
426d5715b1 Clean up the naming in this test. Someone pointed this out in review at
one point, and I forgot to go back and clean it up. Sorry about that. =/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153801 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-31 10:38:48 +00:00
Chandler Carruth
c3e955927f FileCheck-ize this test, and generally tidy it up prior to changing
things around.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153799 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-31 09:22:33 +00:00
Hal Finkel
6173ed95da Correctly vectorize powi.
The powi intrinsic requires special handling because it always takes a single
integer power regardless of the result type. As a result, we can vectorize
only if the powers are equal. Fixes PR12364.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153797 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-31 03:38:40 +00:00
Jim Grosbach
ad353c6303 ARM assembler should prefer non-aliases encoding of cmp.
When an immediate is both a value [t2_]so_imm and a [t2_]so_imm_neg,
we want to use the non-negated form to make sure we prefer the normal
encoding, not the aliased encoding via the negation of, e.g., 'cmp.w'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153770 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-30 19:59:02 +00:00