If we have two unique values for a v2i64 build vector, this will always result
in two vector loads if we expand using shuffles. Only one is necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205231 91177308-0d34-0410-b5e6-96231b3b80d8
There are two general methods for expanding a BUILD_VECTOR node:
1. Use SCALAR_TO_VECTOR on the defined scalar values and then shuffle
them together.
2. Build the vector on the stack and then load it.
Currently, we use a fixed heuristic: If there are only one or two unique
defined values, then we attempt an expansion in terms of SCALAR_TO_VECTOR and
vector shuffles (provided that the required shuffle mask is legal). Otherwise,
always expand via the stack. Even when SCALAR_TO_VECTOR is not legal, this
can still be a good idea depending on what tricks the target can play when
lowering the resulting shuffle. If the target can't do anything special,
however, and if SCALAR_TO_VECTOR is expanded via the stack, this heuristic
leads to sub-optimal code (two stack loads instead of one).
Because only the target knows whether the SCALAR_TO_VECTORs and shuffles for a
build vector of a particular type are likely to be optimial, this adds a new
TLI function: shouldExpandBuildVectorWithShuffles which takes the vector type
and the count of unique defined values. If this function returns true, then
method (1) will be used, subject to the constraint that all of the necessary
shuffles are legal (as determined by isShuffleMaskLegal). If this function
returns false, then method (2) is always used.
This commit does not enhance the current code to support expanding a
build_vector with more than two unique values using shuffles, but I'll commit
an implementation of the more-general case shortly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205230 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Highlights:
- Registers are resolved much later (by the render method).
Prior to that point, GPR32's/GPR64's are GPR's regardless of register
size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register
size or FR mode. Numeric registers can be anything.
- All registers are parsed the same way everywhere (even when handling
symbol aliasing)
- One consequence is that all registers can be specified numerically
almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing
but that can be easily resolved.
- Removes the need for the hasConsumedDollar hack
- Parenthesis and Bracket suffixes are handled generically
- Micromips instructions are parsed directly instead of going through the
standard encodings first.
- rdhwr accepts all 32 registers, and the following instructions that previously
xfailed now work:
ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d,
c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1
- Diagnostics involving registers point at the correct character (the $)
- There's only one kind of immediate in MipsOperand. LSA immediates are handled
by the predicate and renderer.
Lowlights:
- Hardcoded '$zero' in the div patterns is handled with a hack.
MipsOperand::isReg() will return true for a k_RegisterIndex token
with Index == 0 and getReg() will return ZERO for this case. Note that it
doesn't return ZERO_64 on isGP64() targets.
- I haven't cleaned up all of the now-unused functions.
Some more of the generic parser could be removed too (integers and relocs
for example).
- insve.df needed a custom decoder to handle the implicit fourth operand that
was needed to make it parse correctly. The difficulty was that the matcher
expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this.
Reviewers: matheusalmeida, vmedic
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3222
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205229 91177308-0d34-0410-b5e6-96231b3b80d8
part of an asm .symver directive as being used. This prevents referenced
functions from being internalized and deleted.
Without the patch to LTOModule.cpp, the test case will produce the error:
LLVM ERROR: A @@ version cannot be undefined.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205221 91177308-0d34-0410-b5e6-96231b3b80d8
This generalises the object file type parsing to all Windows environments. This
is used by cygwin as well as MSVC environments for MCJIT. This also makes the
triple more similar to Chandler's suggestion of a separate field for the object
file format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205219 91177308-0d34-0410-b5e6-96231b3b80d8
Now that r205212 was committed, r203483 is no longer necessary; it was a
temporary workaround that only handled a small number of the problematic cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205216 91177308-0d34-0410-b5e6-96231b3b80d8
This is a more thorough fix for the issue than r203483. An IR pass will run
before NVPTX codegen to make sure there are no invalid symbol names that can't
be consumed by the ptxas assembler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205212 91177308-0d34-0410-b5e6-96231b3b80d8
%got_hi, %got_lo, %call_hi, %call_lo, %higher, and %highest are now recognised
by MipsAsmParser::getVariantKind().
To prevent future issues with missing entries in this StringSwitch, I've added
an assertion to the default case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205200 91177308-0d34-0410-b5e6-96231b3b80d8
Unlike my previous commit, don't try to remove the corresponding VK_Mips_GOT yet
even though it shares the same assembly text since that is used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205196 91177308-0d34-0410-b5e6-96231b3b80d8
This allows allows us to replace ISD::EXTRACT_ELEMENT, which is lowered
using shifts, with ISD::EXTRACT_VECTOR_ELT, which is a no-op.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205187 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Where those ISA's are not currently supported, the test is run with the smallest
superset of that ISA.
Some instructions are valid but don't pass yet. These have been placed in the
valid-xfail.s's which will XPASS if _any_ instruction starts working.
The valid.s's do not verify the encoding yet. There are also no tests checking that instructions from neighbouring ISA's are not accepted.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3214
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205180 91177308-0d34-0410-b5e6-96231b3b80d8
When the loop vectorizer vectorizes code that uses the loop induction variable,
we often end up with IR like this:
%b1 = insertelement <2 x i32> undef, i32 %v, i32 0
%b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer
%i = add <2 x i32> %b2, <i32 2, i32 3>
If the add in this example is not legal (as is the case on PPC with VSX), it
will be scalarized, and we'll end up with a number of extract_vector_elt nodes
with the vector shuffle as the input operand, and that vector shuffle is fed by
one or more build_vector nodes. By the time that vector operations are
expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt by
looking through the vector shuffle (to make sure that no illegal operations are
created), and so the extract_vector_elt -> vector shuffle -> build_vector is
never simplified to an operand of the build vector.
By looking at build_vectors through a shuffle we fix this particular situation,
preventing a vector from being built, only to be deconstructed again (for the
scalarized add) -- an expensive proposition when this all needs to be done via
the stack. We probably want a more comprehensive fix here where we look back
recursively through any shuffles to any build_vectors or scalar_to_vectors,
etc. but that can come later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205179 91177308-0d34-0410-b5e6-96231b3b80d8
While reviewing r204163, I noticed that the MIPS16 test only checked for a .ent
directive and didn't actually check the code emitted. Fixed this and added a
check for llvm.bswap.i32 on MIPS64 at the same time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205177 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The FileHeader mapping now accepts an optional Flags sequence that accepts
the EF_<arch>_<flag> constants. When not given, Flags defaults to zero.
Reviewers: atanasyan
Reviewed By: atanasyan
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D3213
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205173 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, MinGW OS was Triple::MinGW and Cygwin was Triple::Cygwin
and now it is Triple::Win32 with Environment being GNU or Cygwin.
So,
TheTriple.getOS() == Triple::Win32
is replaced by
TheTriple.isWindowsMSVCEnvironment()
and
(TheTriple.getOS() == Triple::MinGW32 || TheTriple.getOS() == Triple::Cygwin)
is replaced by
TheTriple.isOSCygMing()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205170 91177308-0d34-0410-b5e6-96231b3b80d8
is not a pattern to lower this with clever instructions that zero the
register, so restrict the zero immediate legality special case to f64
and f32 (the only two sizes which fmov seems to directly support). Fixes
backend errors when building code such as libxml.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205161 91177308-0d34-0410-b5e6-96231b3b80d8
There is no direct AVX instruction to convert to unsigned. I have some ideas
how we may be able to do this with three vector instructions but the current
backend just bails on this to get it scalarized.
See the comment why we need to adjust the cost returned by BasicTTI.
The test is a bit roundabout (and checks assembly rather than bit code) because
I'd like it to work even if at some point we could vectorize this conversion.
Fixes <rdar://problem/16371920>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205159 91177308-0d34-0410-b5e6-96231b3b80d8
When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using
SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire
vector and then load the piece we want. This is fine in isolation, but
generating a new store (and corresponding stack slot) for each extraction ends
up producing code of poor quality. When we scalarize a vector operation (using
SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT
for each element in the vector. This used to generate one stored copy of the
vector for each element in the vector. Now we search the uses of the vector for
a suitable store before generating a new one, which results in much more
efficient scalarization code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205153 91177308-0d34-0410-b5e6-96231b3b80d8