------------------------------------------------------------------------
r214481 | hfinkel | 2014-07-31 22:20:41 -0700 (Thu, 31 Jul 2014) | 38 lines
[PowerPC] Generate unaligned vector loads using intrinsics instead of regular loads
Altivec vector loads on PowerPC have an interesting property: They always load
from an aligned address (by rounding down the address actually provided if
necessary). In order to generate an actual unaligned load, you can generate two
load instructions, one with the original address, one offset by one vector
length, and use a special permutation to extract the bytes desired.
When this was originally implemented, I generated these two loads using regular
ISD::LOAD nodes, now marked as aligned. Unfortunately, there is a problem with
this:
The alignment of a load does not contribute to its identity, and SDNodes
are uniqued. So, imagine that we have some unaligned load, L1, that is not
aligned. The routine will create two loads, L1(aligned) and (L1+16)(aligned).
Further imagine that there had already existed a load (L1+16)(unaligned) with
the same chain operand as the load L1. When (L1+16)(aligned) is created as part
of the lowering of L1, this load *is* also the (L1+16)(unaligned) node, just
now marked as aligned (because the new alignment overwrites the old). But the
original users of (L1+16)(unaligned) now get the data intended for the
permutation yielding the data for L1, and (L1+16)(unaligned) no longer exists
to get its own permutation-based expansion. This was PR19991.
A second potential problem has to do with the MMOs on these loads, which can be
used by AA during instruction scheduling to break chain-based dependencies. If
the new "aligned" loads get the MMO from the original unaligned load, this does
not represent the fact that it will load data from below the original address.
Normally, this would not matter, but this load might be combined with another
load pair for a previous vector, and then the dependency on the otherwise-
ignored lower bytes can matter.
To fix both problems, instead of generating the necessary loads using regular
ISD::LOAD instructions, ppc_altivec_lvx intrinsics are used instead. These are
provided with MMOs with a conservative address range.
Unfortunately, I no longer have a failing test case (since PR19991 was
reported, other changes in CodeGen have forced this bug back into hiding it
again). Nevertheless, this should fix the underlying problem.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215058 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214923 | wschmidt | 2014-08-05 15:47:25 -0500 (Tue, 05 Aug 2014) | 12 lines
[PowerPC] Swap arguments and adjust shift count for vsldoi on little endian
Commits r213915 and r214718 fix recognition of shuffle masks for vmrg*
and vpku*um instructions for a little-endian target, by swapping the
input arguments. The vsldoi instruction requires similar treatment,
and also needs its shift count adjusted for little endian.
Reviewed by Ulrich Weigand.
This is a bug fix candidate for release 3.5 (and hopefully the last of
those for PowerPC).
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214926 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214865 | thomas.stellard | 2014-08-05 10:40:52 -0400 (Tue, 05 Aug 2014) | 5 lines
R600/SI: Avoid generating REGISTER_LOAD instructions.
SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214895 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214463 | thomas.stellard | 2014-07-31 20:32:28 -0400 (Thu, 31 Jul 2014) | 7 lines
R600/SI: Fix incorrect commute operation in shrink instructions pass
We were commuting the instruction by still shrinking it using the
original opcode.
NOTE: This is a candidate for the 3.5 branch.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214894 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213799 | grosbach | 2014-07-23 13:41:38 -0700 (Wed, 23 Jul 2014) | 5 lines
X86: restrict combine to when type sizes are safe.
The folding of unary operations through a vector compare and mask operation
is only safe if the unary operation result is of the same size as its input.
For example, it's not safe for [su]itofp from v4i32 to v4f64.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214841 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214800 | wschmidt | 2014-08-04 18:21:01 -0500 (Mon, 04 Aug 2014) | 13 lines
[PPC64LE] Fix wrong IR for vec_sld and vec_vsldoi
My original LE implementation of the vsldoi instruction, with its
altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect
shufflevector operations in the LLVM IR. Correct code is generated
because the back end handles the incorrect shufflevector in a
consistent manner.
This patch and a companion patch for Clang correct this problem by
removing the fixup from altivec.h and the corresponding fixup from the
PowerPC back end. Several test cases are also modified to reflect the
now-correct LLVM IR.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214821 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214721 | uweigand | 2014-08-04 09:55:26 -0500 (Mon, 04 Aug 2014) | 4 lines
[PowerPC] Add target triple to vec_urem_const.ll test case
This should hopefully fix build bots on other architectures.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214820 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214718 | uweigand | 2014-08-04 08:53:40 -0500 (Mon, 04 Aug 2014) | 12 lines
[PowerPC] Swap arguments to vpkuhum/vpkuwum on little-endian
In commit r213915, Bill fixed little-endian usage of vmrgh* and vmrgl*
by swapping the input arguments. As it turns out, the exact same fix
is also required for the vpkuhum/vpkuwum patterns.
This fixes another regression in llvmpipe when vector support is
enabled.
Reviewed by Bill Schmidt.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214819 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214716 | uweigand | 2014-08-04 08:27:12 -0500 (Mon, 04 Aug 2014) | 9 lines
[PowerPC] MULHU/MULHS are not legal for vector types
I ran into some test failures where common code changed vector division
by constant into a multiply-high operation (MULHU). But these are not
implemented by the back-end, so we failed to recognize the insn.
Fixed by marking MULHU/MULHS as Expand for vector types.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214818 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214714 | uweigand | 2014-08-04 08:13:57 -0500 (Mon, 04 Aug 2014) | 19 lines
[PowerPC] Fix and improve vector comparisons
This patch refactors code generation of vector comparisons.
This fixes a wrong code-gen bug for ISD::SETGE for floating-point types,
and improves generated code for vector comparisons in general.
Specifically, the patch moves all logic deciding how to implement vector
comparisons into getVCmpInst, which gets two extra boolean outputs
indicating to its caller whether its needs to swap the input operands
and/or negate the result of the comparison. Apart from implementing
these two modifications as directed by getVCmpInst, there is no need
to ever implement vector comparisons in any other manner; in particular,
there is never a need to perform two separate comparisons (e.g. one for
equal and one for greater-than, as code used to do before this patch).
Reviewed by Bill Schmidt.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214817 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213665 | tnorthover | 2014-07-22 08:47:09 -0700 (Tue, 22 Jul 2014) | 11 lines
X86: drop relocations on __eh_frame sections globally.
Without this, we produce non-extern relocations when targeting older OS X
versions that ld64 can't cope with in the particular context of __eh_frame
sections (who'd want generic relocation-processing anyway?).
This means that an updated linker (ld64 from Xcode 3.2.6 or later) may be
needed when targeting such platforms with a modern version of LLVM, but this is
probably the case anyway and a reasonable requirement.
PR20212, rdar://problem/17544795
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214689 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213726 | nicholas | 2014-07-22 23:24:49 -0700 (Tue, 22 Jul 2014) | 2 lines
We may visit a call that uses an alloca multiple times in callUsesLocalStack, sometimes with IsNocapture true and sometimes with IsNocapture false. We accidentally skipped work we needed to do in the IsNocapture=false case if we were called with IsNocapture=true the first time. Fixes PR20405!
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214688 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213896 | compnerd | 2014-07-24 15:09:06 -0700 (Thu, 24 Jul 2014) | 6 lines
Target: invert condition for Windows
The Microsoft ABI and MSVCRT are considered the canonical C runtime and ABI.
The long double routines are not part of this environment. However, cygwin and
MinGW both provide supplementary implementations. Change the condition to
reflect this reality.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214687 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213883 | compnerd | 2014-07-24 10:46:36 -0700 (Thu, 24 Jul 2014) | 5 lines
X86: correct library call setup for Windows itanium
This target is identical to the Windows MSVC (and follows Microsoft ABI for C).
Correct the library call setup for this target. The same set of library calls
are missing on this environment.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214686 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214423 | hfinkel | 2014-07-31 12:13:38 -0700 (Thu, 31 Jul 2014) | 9 lines
Fix ScalarEvolutionExpander when creating a PHI in a block with duplicate predecessors
It seems that when I fixed this, almost exactly a year ago, I did not quite do
it correctly. When we have duplicate block predecessors, we can indeed not have
different incoming values for the same block, but we *must* have duplicate
entries. So, instead of skipping the duplicates, we explicitly add the
duplicate incoming values.
Fixes PR20442.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214684 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214429 | willschm | 2014-07-31 12:50:53 -0700 (Thu, 31 Jul 2014) | 29 lines
Disable IsSub subregister assert. pr18663.
This is a follow-up to the activity in the bug at
http://llvm.org/bugs/show_bug.cgi?id=18663 . The underlying issue has
to do with how the KILL pseudo-instruction is handled. I defer to
Hal/Jakob/Uli for additional details and background.
This will disable the (bad?) assert, add an associated fixme comment,
and add a pair of tests.
The code change and the pr18663-2.ll test are copied from the referenced
bug. That test does not immediately fail in my environment, but I have
added the pr18663.ll test which does.
(Comment from Hal)
to provide everyone else with some context, this assert was not bad when
it was written. At that time, we only generated KILL pseudo instructions
around subregister copies. This logic, unfortunately, had its own problems.
In r199797, the relevant logic in MachineCopyPropagation was replaced to
generate KILLs for other kinds of copies too. This change in semantics broke
this now-problematic assumption in AggressiveAntiDepBreaker. The
AggressiveAntiDepBreaker really needs a proper cleanup to deal with the
change, but removing the assert (which just allows the function to return
false) is a safe conservative behavior, and should do for the time being.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214683 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214519 | rafael | 2014-08-01 07:57:05 -0700 (Fri, 01 Aug 2014) | 3 lines
Remove lto_codegen_set_attr.
It was never exported, so no functionality change.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214682 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213798 | grosbach | 2014-07-23 13:41:31 -0700 (Wed, 23 Jul 2014) | 7 lines
DAG: fp->int conversion for non-splat constants.
Constant fold the lanes of the input constant build_vector individually
so we correctly handle when the vector elements are not all the same
constant value.
PR20394
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214413 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213793 | jholewinski | 2014-07-23 16:23:47 -0400 (Wed, 23 Jul 2014) | 4 lines
[NVPTX] Silence a GCC warning found by the buildbots
The cast to NVPTXTargetLowering was missing a 'const', but let's
just access the right pointer through the subtarget anyway.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214310 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213773 | jholewinski | 2014-07-23 13:40:45 -0400 (Wed, 23 Jul 2014) | 5 lines
[NVPTX] Make sure we do not generate MULWIDE ISD nodes when optimizations are disabled
With optimizations disabled, we disable the isel patterns for mul.wide; but we
were still generating MULWIDE ISD nodes. Now, we only try to generate MULWIDE
ISD nodes in DAGCombine if the optimization level is not zero.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214309 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214287 | chandlerc | 2014-07-29 22:44:04 -0700 (Tue, 29 Jul 2014) | 9 lines
Don't manually (and forcibly) run the verifier on the entire module from
the jump instruction table pass. First, the verifier is already built
into all the tools. The test case is adapted to just run llvm-as
demonstrating that we still catch the broken module. Second, the
verifier is *extremely* slow. This was responsible for very significant
compile time regressions.
If you have deployed a Clang binary anywhere from r210280 to this
commit, you really want to re-deploy.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214288 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213884 | mren | 2014-07-24 10:57:09 -0700 (Thu, 24 Jul 2014) | 1 line
Try to fix the bots again by moving test to X86 directory.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214253 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213880 | mren | 2014-07-24 10:18:33 -0700 (Thu, 24 Jul 2014) | 1 line
Try to fix the bots. If this does not work, I am going to move it to X86 directory.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214252 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213815 | mren | 2014-07-23 16:13:23 -0700 (Wed, 23 Jul 2014) | 17 lines
SimplifyCFG: fix a bug in switch to table conversion
We use gep to access the global array "switch.table", and the table index
should be treated as unsigned. When the highest bit is 1, this commit
zero-extends the index to an integer type with larger size.
For a switch on i2, we used to generate:
%switch.tableidx = sub i2 %0, -2
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i2 %switch.tableidx
It is incorrect when %switch.tableidx is 2 or 3. The fix is to generate
%switch.tableidx = sub i2 %0, -2
%switch.tableidx.zext = zext i2 %switch.tableidx to i3
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i3 %switch.tableidx.zext
rdar://17735071
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214251 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213894 | hans | 2014-07-24 14:09:45 -0700 (Thu, 24 Jul 2014) | 4 lines
Windows: Don't wildcard expand /? or -?
Even if there's a file called c:\a, we want /? to be preserved as
an option, not expanded to a filename.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214248 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214078 | delcypher | 2014-07-28 14:36:50 +0100 (Mon, 28 Jul 2014) | 6 lines
Emit a warning if llvm_map_components_to_libraries() is used noting that its
use is deprecated in favour of llvm_map_components_to_libnames()
Although message(DEPRECATION "msg") would probably be a better fit this
does nothing if CMAKE_ERROR_DEPRECATED and CMAKE_WARNING_DEPRECATED are
both off, which is the default.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214080 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r214077 | delcypher | 2014-07-28 14:36:37 +0100 (Mon, 28 Jul 2014) | 2 lines
Document the new LLVM CMake interface for building against LLVM
libraries. With many contributions from Brad King.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214079 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213915 | wschmidt | 2014-07-24 18:55:55 -0700 (Thu, 24 Jul 2014) | 21 lines
[PATCH][PPC64LE] Correct little-endian usage of vmrgh* and vmrgl*.
Because the PowerPC vmrgh* and vmrgl* instructions have a built-in
big-endian bias, it is necessary to swap their inputs in little-endian
mode when using them to implement a vector shuffle. This was
previously missed in the vector LE implementation.
There was already logic to distinguish between unary and "normal"
vmrg* vector shuffles, so this patch extends that logic to use a third
option: "swapped" vmrg* vector shuffles that are used for little
endian in place of the "normal" ones.
I've updated the vec-shuffle-le.ll test to check for the expected
register ordering on the generated instructions.
This bug was discovered when testing the LE and ELFv2 patches for
safety if they were backported to 3.4. A different vectorization
decision was made in 3.4 than on mainline trunk, and that exposed the
problem. I've verified this fix takes care of that issue.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@213961 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r213826 | filcab | 2014-07-23 18:28:21 -0700 (Wed, 23 Jul 2014) | 7 lines
Fixed PR20411 - bug in getINSERTPS()
When we had a vector_shuffle where we had an input from each vector, we
could miscompile it because we were assuming the input from V2 wouldn't
be moved from where it was on the vector.
Added a test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@213911 91177308-0d34-0410-b5e6-96231b3b80d8