not marked nounwind, or for all functions when -enable-eh
is set, provided the target supports Dwarf EH.
llvm-gcc generates nounwind in the right places; other FEs
will need to do so also. Given such a FE, -enable-eh should
no longer be needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49006 91177308-0d34-0410-b5e6-96231b3b80d8
In order to handle indexed nodes I had to introduce
a new constructor, and since I was there I factorized
the code in the various load constructors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48894 91177308-0d34-0410-b5e6-96231b3b80d8
nodes. This doesn't currently have much impact the generated code, but it
does produce simpler-looking SelectionDAGs, and consequently
simpler-looking ScheduleDAGs, because there are fewer spurious
dependencies.
In particular, CopyValueToVirtualRegister now uses the entry node as the
input chain dependency for new CopyToReg nodes instead of calling getRoot
and depending on the most recent memory reference.
Also, rename UnorderedChains to PendingExports and pull it up from being
a local variable in SelectionDAGISel::BuildSelectionDAG to being a
member variable of SelectionDAGISel, so that it doesn't have to be
passed around to all the places that need it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48893 91177308-0d34-0410-b5e6-96231b3b80d8
called LimitedSumOfUnscheduledPredsOfSuccs. It terminates the computation
after a given treshold is reached. This new function is always faster, but
brings real wins only on bigger test-cases.
The old function SumOfUnscheduledPredsOfSuccs is left in-place for now and therefore a warning about an unused static function is produced.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48872 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM Value/Use does and MachineRegisterInfo/MachineOperand does.
This allows constant time for all uses list maintenance operations.
The idea was suggested by Chris. Reviewed by Evan and Dan.
Patch is tested and approved by Dan.
On normal use-cases compilation speed is not affected. On very big basic
blocks there are compilation speedups in the range of 15-20% or even better.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48822 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes Bugzilla #1835 (http://llvm.org/bugs/show_bug.cgi?id=1835).
This patched is reviewed by Tanya and Dan. Dan tested and approved it.
The reason for the bad performance of the old algorithm is that it is very naive and scans every
time all nodes of the DAG in the worst case.
This patch introduces a new algorithm based on the paper "Online algorithms
for maintaining the topological order of a directed acyclic graph" by
David J.Pearce and Paul H.J.Kelly. This is the MNR algorithm. It has a
linear time worst-case and performs much better in most situations.
The paper can be found here:
http://fano.ics.uci.edu/cites/Document/Online-algorithms-for-maintaining-the-topological-order-of-a-directed-acyclic-graph.html
The main idea of the new algorithm is to compute the topological ordering of the SNodes in the
DAG and to maintain it even after DAG modifications. The topological ordering allows for very fast
node reachability checks.
Tests on very big input files with tens of thousands of instructions in a BB indicate huge
speed-ups (up to 10x compilation time improvement) compared to the old version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48817 91177308-0d34-0410-b5e6-96231b3b80d8
flags. This is needed by the new legalize types
infrastructure which wants to expand the 64 bit
constants previously used to hold the flags on
32 bit machines. There are two functional changes:
(1) in LowerArguments, if a parameter has the zext
attribute set then that is marked in the flags;
before it was being ignored; (2) PPC had some bogus
code for handling two word arguments when using the
ELF 32 ABI, which was hard to convert because of
the bogusness. As suggested by the original author
(Nicolas Geoffray), I've disabled it for the moment.
Tested with "make check" and the Ada ACATS testsuite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48640 91177308-0d34-0410-b5e6-96231b3b80d8
Use getIntPtrConstant in a couple places to shorten stuff up
Handle splitting vector shuffles with undefs in the mask
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48351 91177308-0d34-0410-b5e6-96231b3b80d8
the fcopysign expansion from LegalizeDAG to get rid of
what seems to be a bug: the use of sign extension means
that when copying the sign bit from an f32 to an f64,
the upper 32 bits of the f64 (now an i64) are set, not
just the top bit... I also generalized it to work for
any sized floating point types, and removed the bogosity:
SDOperand Mask1 = (SrcVT == MVT::f64)
? DAG.getConstantFP(BitsToDouble(1ULL << 63), SrcVT)
: DAG.getConstantFP(BitsToFloat(1U << 31), SrcVT);
Mask1 = DAG.getNode(ISD::BIT_CONVERT, SrcNVT, Mask1);
(here SrcNVT is an integer with the same size as SrcVT).
As far as I can see this takes a 1 << 63, converts to
a double, converts that to a floating point constant
then converts that to an integer constant, ending up
with... 1 << 63 as an integer constant! So I just
generate this integer constant directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48305 91177308-0d34-0410-b5e6-96231b3b80d8
getCopyToParts problem was noticed by the new
LegalizeTypes infrastructure. In order to avoid
this kind of thing in the future I've added a
check that EXTRACT_ELEMENT is only used with
integers. Once LegalizeTypes is up and running
most likely BUILD_PAIR and EXTRACT_ELEMENT can
be removed, in favour of using apints instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48294 91177308-0d34-0410-b5e6-96231b3b80d8
X86 lowering normalize vector 0 to v4i32. However DAGCombine can fold (sub x, x) -> 0 after legalization. It can create a zero vector of a type that's not expected (e.g. v8i16). We don't want to disable the optimization since leaving a (sub x, x) is really bad. Add isel patterns for other types of vector 0 to ensure correctness. It's highly unlikely to happen other than in bugpoint reduced test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48279 91177308-0d34-0410-b5e6-96231b3b80d8
and it's the result that requires expansion. This code is a little confusing
because the TargetLoweringInfo tables for [US]INT_TO_FP use the operand type
(the integer type) rather than the result type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48206 91177308-0d34-0410-b5e6-96231b3b80d8
return ValueType can depend its operands' ValueType.
This is a cosmetic change, no functionality impacted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48145 91177308-0d34-0410-b5e6-96231b3b80d8
Change insert/extract subreg instructions to be able to be used in TableGen patterns.
Use the above features to reimplement an x86-64 pseudo instruction as a pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48130 91177308-0d34-0410-b5e6-96231b3b80d8
field to 32 bits, thus enabling correct handling of ByVal
structs bigger than 0x1ffff. Abstract interface a bit.
Fixes gcc.c-torture/execute/pr23135.c and
gcc.c-torture/execute/pr28982b.c in gcc testsuite (were ICE'ing
on ppc32, quietly producing wrong code on x86-32.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48122 91177308-0d34-0410-b5e6-96231b3b80d8
they are produced by calls (which are known exact) and by cross block copies
which are known to be produced by extends.
This improves:
define double @test2() {
%tmp85 = call double asm sideeffect "fld0", "={st(0)}"()
ret double %tmp85
}
from:
_test2:
subl $20, %esp
# InlineAsm Start
fld0
# InlineAsm End
fstpl 8(%esp)
movsd 8(%esp), %xmm0
movsd %xmm0, (%esp)
fldl (%esp)
addl $20, %esp
#FP_REG_KILL
ret
to:
_test2:
# InlineAsm Start
fld0
# InlineAsm End
#FP_REG_KILL
ret
by avoiding a f64 <-> f80 trip
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48108 91177308-0d34-0410-b5e6-96231b3b80d8
an RFP register class.
Teach ScheduleDAG how to handle CopyToReg with different src/dst
reg classes.
This allows us to compile trivial inline asms that expect stuff
on the top of x87-fp stack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48107 91177308-0d34-0410-b5e6-96231b3b80d8
in different register classes, e.g. copy of ST(0) to RFP*. This gets
some really trivial inline asm working that plops things on the top of
stack (PR879)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48105 91177308-0d34-0410-b5e6-96231b3b80d8
of BUILD_VECTORS that only have two unique elements:
1. The previous code was nondeterminstic, because it walked a map in
SDOperand order, which isn't determinstic.
2. The previous code didn't handle the case when one element was undef
very well. Now we ensure that the generated shuffle mask has the
undef vector on the RHS (instead of potentially being on the LHS)
and that any elements that refer to it are themselves undef. This
allows us to compile CodeGen/X86/vec_set-9.ll into:
_test3:
movd %rdi, %xmm0
punpcklqdq %xmm0, %xmm0
ret
instead of:
_test3:
movd %rdi, %xmm1
#IMPLICIT_DEF %xmm0
punpcklqdq %xmm1, %xmm0
ret
... saving a register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48060 91177308-0d34-0410-b5e6-96231b3b80d8
_test3:
movd %rdi, %xmm1
#IMPLICIT_DEF %xmm0
punpcklqdq %xmm1, %xmm0
ret
instead of:
_test3:
#IMPLICIT_DEF %rax
movd %rax, %xmm0
movd %rdi, %xmm1
punpcklqdq %xmm1, %xmm0
ret
This is still not ideal. There is no reason to two xmm regs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48058 91177308-0d34-0410-b5e6-96231b3b80d8
except ppc long double. This allows us to shrink constant pool
entries for x86 long double constants, which in turn allows us to
use flds/fldl instead of fldt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47938 91177308-0d34-0410-b5e6-96231b3b80d8
bug in r47928 (Int64Ty is the correct type for the constant
pool entry here) and removes the asserts, now that the code
is capable of handling i128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47932 91177308-0d34-0410-b5e6-96231b3b80d8
For x86, if sse2 is available, it's not a good idea since cvtss2sd is slower than a movsd load and it prevents load folding. On x87, it's important to shrink fp constant since fldt is very expensive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47931 91177308-0d34-0410-b5e6-96231b3b80d8
The basic idea is that all these algorithms are computing the longest paths from the root node or to the exit node. Therefore the existing implementation that uses and iterative and potentially
exponential algorithm was changed to a well-known graph algorithm based on dynamic programming. It has a linear run-time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47884 91177308-0d34-0410-b5e6-96231b3b80d8
generic & x86 versions; change generic to follow x86
and improve comments. Add PPC version (not right
for non-Darwin.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47734 91177308-0d34-0410-b5e6-96231b3b80d8
same size as an int type by doing a bitconvert of
load/store of the int type (same algorithm as floating point).
This makes them work for ppc Altivec. There was some
code that purported to handle loads of (some) vectors
by splitting them into two smaller vectors, but getExtLoad
rejects subvector loads, so this could never have worked;
the patch removes it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47696 91177308-0d34-0410-b5e6-96231b3b80d8
approach taken is different to that in LegalizeDAG
when it is a question of expanding or promoting the
result type: for example, if extracting an i64 from
a <2 x i64>, when i64 needs expanding, it bitcasts
the vector to <4 x i32>, extracts the appropriate
two i32's, and uses those for the Lo and Hi parts.
Likewise, when extracting an i16 from a <4 x i16>,
and i16 needs promoting, it bitcasts the vector to
<2 x i32>, extracts the appropriate i32, twiddles
the bits if necessary, and uses that as the promoted
value. This puts more pressure on bitcast legalization,
and I've added the appropriate cases. They needed to
be added anyway since users can generate such bitcasts
too if they want to. Also, when considering various
cases (Legal, Promote, Expand, Scalarize, Split) it is
a pain that expand can correspond to Expand, Scalarize
or Split, so I've changed the LegalizeTypes enum so it
lists those different cases - now Expand only means
splitting a scalar in two.
The code produced is the same as by LegalizeDAG for
all relevant testcases, except for
2007-10-31-extractelement-i64.ll, where the code seems
to have improved (see below; can an expert please tell
me if it is better or not).
Before < vs after >.
< subl $92, %esp
< movaps %xmm0, 64(%esp)
< movaps %xmm0, (%esp)
< movl 4(%esp), %eax
< movl %eax, 28(%esp)
< movl (%esp), %eax
< movl %eax, 24(%esp)
< movq 24(%esp), %mm0
< movq %mm0, 56(%esp)
---
> subl $44, %esp
> movaps %xmm0, 16(%esp)
> pshufd $1, %xmm0, %xmm1
> movd %xmm1, 4(%esp)
> movd %xmm0, (%esp)
> movq (%esp), %mm0
> movq %mm0, 8(%esp)
< subl $92, %esp
< movaps %xmm0, 64(%esp)
< movaps %xmm0, (%esp)
< movl 12(%esp), %eax
< movl %eax, 28(%esp)
< movl 8(%esp), %eax
< movl %eax, 24(%esp)
< movq 24(%esp), %mm0
< movq %mm0, 56(%esp)
---
> subl $44, %esp
> movaps %xmm0, 16(%esp)
> pshufd $3, %xmm0, %xmm1
> movd %xmm1, 4(%esp)
> movhlps %xmm0, %xmm0
> movd %xmm0, (%esp)
> movq (%esp), %mm0
> movq %mm0, 8(%esp)
< subl $92, %esp
< movaps %xmm0, 64(%esp)
---
> subl $44, %esp
< movl 16(%esp), %eax
< movl %eax, 48(%esp)
< movl 20(%esp), %eax
< movl %eax, 52(%esp)
< movaps %xmm0, (%esp)
< movl 4(%esp), %eax
< movl %eax, 60(%esp)
< movl (%esp), %eax
< movl %eax, 56(%esp)
---
> pshufd $1, %xmm0, %xmm1
> movd %xmm1, 4(%esp)
> movd %xmm0, (%esp)
> movd %xmm1, 12(%esp)
> movd %xmm0, 8(%esp)
< subl $92, %esp
< movaps %xmm0, 64(%esp)
---
> subl $44, %esp
< movl 24(%esp), %eax
< movl %eax, 48(%esp)
< movl 28(%esp), %eax
< movl %eax, 52(%esp)
< movaps %xmm0, (%esp)
< movl 12(%esp), %eax
< movl %eax, 60(%esp)
< movl 8(%esp), %eax
< movl %eax, 56(%esp)
---
> pshufd $3, %xmm0, %xmm1
> movd %xmm1, 4(%esp)
> movhlps %xmm0, %xmm0
> movd %xmm0, (%esp)
> movd %xmm1, 12(%esp)
> movd %xmm0, 8(%esp)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47672 91177308-0d34-0410-b5e6-96231b3b80d8
operand of a VECTOR_SHUFFLE. The mask is a
vector of constant integers. The code in
LegalizeDAG doesn't bother to legalize the
mask, since it's basically just storage for
a bunch of constants, however LegalizeTypes
is more picky. The problem is that there may
not exist any legal vector-of-integers type
with a legal element type, so it is impossible
to create a legal mask! Unless of course you
cheat by creating a BUILD_VECTOR where the
operands have a different type to the element
type of the vector being built... This is
pretty ugly but works - all relevant tests in
the testsuite pass, and produce the same
assembler with and without LegalizeTypes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47670 91177308-0d34-0410-b5e6-96231b3b80d8
Change several cases in SimplifyDemandedMask that don't ever do any
simplifying to reuse the logic in ComputeMaskedBits instead of
duplicating it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47648 91177308-0d34-0410-b5e6-96231b3b80d8
CodeGen/PowerPC/illegal-element-type.ll): suppose
a node X is processed, and processing maps it to
a node Y. Then X continues to exist in the DAG,
but with no users. While processing some other
node, a new node may be created that happens to
be equal to X, and thus X will be reused rather
than a truly new node. This can cause X to
"magically reappear", and since it is in the
Processed state in will not be reprocessed, so
at the end of type legalization the illegal node
X can still be present. The solution is to replace
X with Y whenever X gets resurrected like this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47601 91177308-0d34-0410-b5e6-96231b3b80d8
after legalize. Just because a constant is legal (e.g. 0.0 in SSE)
doesn't mean that its negated value is legal (-0.0). We could make
this stronger by checking to see if the negated constant is actually
legal post negation, but it doesn't seem like a big deal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47591 91177308-0d34-0410-b5e6-96231b3b80d8
out of illegal elements (BUILD_VECTOR). Uses and beefs
up BUILD_PAIR, though it didn't really have to. Like
most of LegalizeTypes, does not support soft-float.
This cures all "make check" vector building failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47537 91177308-0d34-0410-b5e6-96231b3b80d8
early clobbers if the clobber list contains a *register* not some thing
like {memory}, {dirflag} etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47457 91177308-0d34-0410-b5e6-96231b3b80d8
any, we force sdisel to do all regalloc for an asm. This
leads to gross but correct codegen.
This fixes the rest of PR2078.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47454 91177308-0d34-0410-b5e6-96231b3b80d8
inline asms.
Fix PR2078 by marking aliases of registers used when a register is
marked used. This prevents EAX from being allocated when AX is listed
in the clobber set for the asm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47426 91177308-0d34-0410-b5e6-96231b3b80d8
and splitting extract_subvector. This fixes nine
"make check" testcases, for example
2008-02-04-ExtractSubvector.ll and (partially)
CodeGen/Generic/vector.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47384 91177308-0d34-0410-b5e6-96231b3b80d8
AddNodeIDNode does profiling for a ConstantSDNode, but so does
SelectionDAG::getConstant. This profiling should be moved to a common
static function in ConstantSDNode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47359 91177308-0d34-0410-b5e6-96231b3b80d8
- X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47290 91177308-0d34-0410-b5e6-96231b3b80d8
it actually does. Simplify CountOperands a little by reusing
ComputeMemOperandsEnd. And reword some comments for both.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47198 91177308-0d34-0410-b5e6-96231b3b80d8
tblgen will complain if a sign-extended constant does not fit into a
data type smaller than i32, e.g., i16. This causes a problem when certain
hex constants are used, such as 0xff for byte masks or immediate xor
values.
tblgen will try the sign-extended value first and, if the sign extended
value would overflow, it tries to see if the unsigned value will fit.
Consequently, a software developer can now safely incant:
(XORHIr16 R16C:$rA, 0xffff)
which is somewhat clearer and more informative than incanting:
(XORHIr16 R16C:$rA, (i16 -1))
even if the two are bitwise equivalent.
Tblgen also outputs the 64-bit unsigned constant in the generated ISel code
when getTargetConstant() is invoked.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47188 91177308-0d34-0410-b5e6-96231b3b80d8