x * 40
=>
shlq $3, %rdi
leaq (%rdi,%rdi,4), %rax
This has the added benefit of allowing more multiply to be folded into addressing mode. e.g.
a * 24 + b
=>
leaq (%rdi,%rdi,2), %rax
leaq (%rsi,%rax,8), %rax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67917 91177308-0d34-0410-b5e6-96231b3b80d8
%a = ...
%b = and i32 %a, 2
%c = srl i32 %b, 1
%d = br i32 %c,
into
%a = ...
%b = and %a, 2
%c = X86ISD::CMP %b, 0
%d = X86ISD::BRCOND %c ...
This applies only when the AND constant value has one bit set and the SRL
constant is equal to the log2 of the AND constant. The back-end is smart enough
to convert the result into a TEST/JMP sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67728 91177308-0d34-0410-b5e6-96231b3b80d8
to be returned in DL. LLVM's multiple-return-value support is
not ABI-conforming; front-ends that wish to have code emitted
that conforms to an ABI are currently expected to make
arrangements for this on their own rather than assuming that
multiple-return-values will automatically do the right thing.
This commit doesn't fundamentally change this situation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67588 91177308-0d34-0410-b5e6-96231b3b80d8
same as a normal i80 {low64, high16} rather
than its own {high64, low16}. A depressing number
of places know about this; I think I got them all.
Bitcode readers and writers convert back to the old
form to avoid breaking compatibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67562 91177308-0d34-0410-b5e6-96231b3b80d8
in selectiondag patterns. This is required for the upcoming shuffle_vector rewrite,
and as it turns out, cleans up a hack in the Alpha instruction info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67286 91177308-0d34-0410-b5e6-96231b3b80d8
not safe in general because the immediate could be an arbitrary
value that does not fit in a 32-bit pcrel displacement.
Conservatively fall back to loading the value into a register
and calling through it.
We still do the optzn on X86-32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67142 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix fabs, fneg for f32 and f64.
- Use BuildVectorSDNode.isConstantSplat, now that the functionality exists
- Continue to improve i64 constant lowering. Lower certain special constants
to the constant pool when they correspond to SPU's shufb instruction's
special mask values. This avoids the overhead of performing a shuffle on a
zero-filled vector just to get the special constant when the memory load
suffices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67067 91177308-0d34-0410-b5e6-96231b3b80d8
Incorporate Tilmann's 128-bit operation patch. Evidently, it gets the
llvm-gcc bootstrap a bit further along.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67048 91177308-0d34-0410-b5e6-96231b3b80d8
operand is a signed 32-bit immediate. Unlike with the 8-bit
signed immediate case, it isn't actually smaller to fold a
32-bit signed immediate instead of a load. In fact, it's
larger in the case of 32-bit unsigned immediates, because
they can be materialized with movl instead of movq.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67001 91177308-0d34-0410-b5e6-96231b3b80d8
ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66988 91177308-0d34-0410-b5e6-96231b3b80d8
by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66941 91177308-0d34-0410-b5e6-96231b3b80d8
codegen to the same thing as integer truncates to i8 (the top bits are
just undefined). This implements rdar://6667338
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66902 91177308-0d34-0410-b5e6-96231b3b80d8
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
for i32/i64 expressions (we could also do i16 on cpus where
i16 lea is fast, but I didn't add this). On the example, we now
generate:
_test:
movl 4(%esp), %eax
cmpl $42, (%eax)
setl %al
movzbl %al, %eax
leal 4(%eax,%eax,8), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
cmpl $41, (%eax)
movl $4, %ecx
movl $13, %eax
cmovg %ecx, %eax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66869 91177308-0d34-0410-b5e6-96231b3b80d8
operands can't both be fully folded at the same time. For example,
in the included testcase, a global variable is being added with
an add of two values. The global variable wants RIP-relative
addressing, so it can't share the address with another base
register, but it's still possible to fold the initial add.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66865 91177308-0d34-0410-b5e6-96231b3b80d8
related transformations out of target-specific dag combine into the
ARM backend. These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently
with the recently added cp constant select optimization, but is a
very general xform. For example, we now compile the second example
in const-select.ll to:
_test:
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
seta %al
movzbl %al, %eax
movl 4(%esp), %ecx
movsbl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
leal 4(%eax), %ecx
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
cmovbe %eax, %ecx
movsbl (%ecx), %eax
ret
This passes multisource and dejagnu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66779 91177308-0d34-0410-b5e6-96231b3b80d8
linkage: this linkage type only applies to declarations,
but ODR is only relevant to globals with definitions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66650 91177308-0d34-0410-b5e6-96231b3b80d8
and extern_weak_odr. These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global. In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time. This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function. If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body. The
code generators on the other hand map weak and weak_odr linkage
to the same thing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66339 91177308-0d34-0410-b5e6-96231b3b80d8
the same say the "test" instruction does in overflow cases,
so eliminating the test is only safe when those bits aren't
needed, as is the case for COND_E and COND_NE, or if it
can be proven that no overflow will occur. For now, just
restrict the optimization to COND_E and COND_NE and don't
do any overflow analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66318 91177308-0d34-0410-b5e6-96231b3b80d8
INC64_32r and INC64_16r, because these instructions are encoded
differently on x86-64. This fixes JIT regressions on x86-64 in
kimwitu++ and others.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66207 91177308-0d34-0410-b5e6-96231b3b80d8
of MachineInstr def operands must be subtracted out. This bug
was uncovered by the recent x86 EFLAGS optimization. Before
that, the only instructions that ever needed unfolding were
things like CMP32rm, where NumDefs is zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66056 91177308-0d34-0410-b5e6-96231b3b80d8
arbitrary vector sizes. Add an optional MinSplatBits parameter to specify
a minimum for the splat element size. Update the PPC target to use the
revised interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65899 91177308-0d34-0410-b5e6-96231b3b80d8
possibly for the reason suggested by the comment.
No wonder it didn't work very well. This unblocks
bootstrap with assertions on ppc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65601 91177308-0d34-0410-b5e6-96231b3b80d8
results via reference parameters.
This patch also appears to fix Evan's reported problem supplied as a
reduced bugpoint test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65426 91177308-0d34-0410-b5e6-96231b3b80d8
them are generic changes.
- Use the "fast" flag that's already being passed into the asm printers instead
of shoving it into the DwarfWriter.
- Instead of calling "MI->getParent()->getParent()" for every MI, set the
machine function when calling "runOnMachineFunction" in the asm printers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65379 91177308-0d34-0410-b5e6-96231b3b80d8
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65296 91177308-0d34-0410-b5e6-96231b3b80d8
Now we're using one gross, but quite robust hack :) (previous ones
did not work, for example, when ext_weak symbol was used deep inside
constant expression in the initializer).
The proper fix of this problem will require some quite huge asmprinter
changes and that's why was postponed. This fixes PR3629 by the way :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65230 91177308-0d34-0410-b5e6-96231b3b80d8
that has not been JIT'd yet, the callee is put on a list of pending functions
to JIT. The call is directed through a stub, which is updated with the address
of the function after it has been JIT'd. A new interface for allocating and
updating empty stubs is provided.
Add support for removing the ModuleProvider the JIT was created with, which
would otherwise invalidate the JIT's PassManager, which is initialized with the
ModuleProvider's Module.
Add support under a new ExecutionEngine flag for emitting the infomration
necessary to update Function and GlobalVariable stubs after JITing them, by
recording the address of the stub and the name of the GlobalValue. This allows
code to be copied from one address space to another, where libraries may live
at different virtual addresses, and have the stubs updated with their new
correct target addresses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64906 91177308-0d34-0410-b5e6-96231b3b80d8
(Note: Eventually, commits like this will be handled via a pre-commit hook that
does this automagically, as well as expand tabs to spaces and look for 80-col
violations.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64827 91177308-0d34-0410-b5e6-96231b3b80d8
U include/llvm/CodeGen/DebugLoc.h
U lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp
U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
Enable debug location generation at -Os. This goes with the reapplication of the
r63639 patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64715 91177308-0d34-0410-b5e6-96231b3b80d8
in inline asm as signed (what gcc does). Add partial support
for x86-specific "e" and "Z" constraints, with appropriate
signedness for printing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64400 91177308-0d34-0410-b5e6-96231b3b80d8
suprise to some callers, e.g. register coalescer. For now, add an parameter
that tells AnalyzeBranch whether it's safe to modify the mbb. A better
solution is out there, but I don't have time to deal with it right now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64124 91177308-0d34-0410-b5e6-96231b3b80d8
Many targets build placeholder nodes for special operands, e.g.
GlobalBaseReg on X86 and PPC for the PIC base. There's no
sensible way to associate debug info with these. I've left
them built with getNode calls with explicit DebugLoc::getUnknownLoc operands.
I'm not too happy about this but don't see a good improvement;
I considered adding a getPseudoOperand or something, but it
seems to me that'll just make it harder to read.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63992 91177308-0d34-0410-b5e6-96231b3b80d8
getCALLSEQ_{END,START} to permit passing no DebugLoc
there. UNDEF doesn't logically have DebugLoc; add
getUNDEF to encapsulate this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63978 91177308-0d34-0410-b5e6-96231b3b80d8
target directories themselves. This also means that VMCore no longer
needs to know about every target's list of intrinsics. Future work
will include converting the PowerPC target to this interface as an
example implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63765 91177308-0d34-0410-b5e6-96231b3b80d8
is given, override the subtarget settings and enable 64-bit support.
This restores the earlier behavior, and fixes regressions on
Non-64-bit-capable x86-32 hosts.
This isn't necessarily the best approach, but the most obvious
alternative is to require -mcpu=x86-64 or -mattr=+64bit to be used
with -march=x86-64 when the host doesn't have 64-bit support. This
makes things little more consistent, but it's less convenient, and
it has the practical drawback of requiring lots of test changes, so
I opted for the above approach for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63642 91177308-0d34-0410-b5e6-96231b3b80d8
SSE2, however it's possible to disable SSE2, and the subtarget support
code thinks that if 64-bit implies SSE2 and SSE2 is disabled then
64-bit should also be disabled. Instead, just mark all the 64-bit
subtargets as explicitly supporting SSE2.
Also, move the code that makes -march=x86-64 enable 64-bit support by
default to only apply when there is no explicit subtarget. If you
need to specify a subtarget and you want 64-bit code, you'll need to
select a subtarget that supports 64-bit code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63575 91177308-0d34-0410-b5e6-96231b3b80d8
crashes or wrong code with codegen of large integers:
eliminate the legacy getIntegerVTBitMask and
getIntegerVTSignBit methods, which returned their
value as a uint64_t, so couldn't handle huge types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63494 91177308-0d34-0410-b5e6-96231b3b80d8
dagcombines that help it match in several more cases. Add
several more cases to test/CodeGen/X86/bt.ll. This doesn't
yet include matching for BT with an immediate operand, it
just covers more register+register cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63266 91177308-0d34-0410-b5e6-96231b3b80d8
mergeable string section. I don't see any bad impact of such decision
(rather then placing it into mergeable const section, as it was before),
but at least Darwin linker won't complain anymore.
The problem in LLVM is that we don't have special type for string constants
(like gcc does). Even more, we have two separate types: ConstatArray for non-null
strings and ConstantAggregateZero for null stuff.... It's a bit weird :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63142 91177308-0d34-0410-b5e6-96231b3b80d8
Don't use the Red Zone when dynamic stack realignment is needed.
This could be implemented, but most x86-64 ABIs don't require
dynamic stack realignment so it isn't urgent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63074 91177308-0d34-0410-b5e6-96231b3b80d8
doesn't support it. The default is set to 'true', so this should not
impact any other target backends.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63058 91177308-0d34-0410-b5e6-96231b3b80d8
tidy up SDUse and related code.
- Replace the operator= member functions with a set method, like
LLVM Use has, and variants setInitial and setNode, which take
care up updating use lists, like LLVM Use's does. This simplifies
code that calls these functions.
- getSDValue() is renamed to get(), as in LLVM Use, though most
places can either use the implicit conversion to SDValue or the
convenience functions instead.
- Fix some more node vs. value terminology issues.
Also, eliminate the one remaining use of SDOperandPtr, and
SDOperandPtr itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62995 91177308-0d34-0410-b5e6-96231b3b80d8
- Rename fcmp.ll test to fcmp32.ll, start adding new double tests to fcmp64.ll
- Fix select_bits.ll test
- Capitulate to the DAGCombiner and move i64 constant loads to instruction
selection (SPUISelDAGtoDAG.cpp).
<rant>DAGCombiner will insert all kinds of 64-bit optimizations after
operation legalization occurs and now we have to do most of the work that
instruction selection should be doing twice (once to determine if v2i64
build_vector can be handled by SelectCode(), which then runs all of the
predicates a second time to select the necessary instructions.) But,
CellSPU is a good citizen.</rant>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62990 91177308-0d34-0410-b5e6-96231b3b80d8
corresponding to the "not" and "vnot" PatFrags. Use the new method
in some places where it seems appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62768 91177308-0d34-0410-b5e6-96231b3b80d8
we want to clear %ah to zero before a division, just use a
zero-extending mov to %al. This fixes PR3366.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62691 91177308-0d34-0410-b5e6-96231b3b80d8
- Ensure that (operation) legalization emits proper FDIV libcall when needed.
- Fix various bugs encountered during llvm-spu-gcc build, along with various
cleanups.
- Start supporting double precision comparisons for remaining libgcc2 build.
Discovered interesting DAGCombiner feature, which is currently solved via
custom lowering (64-bit constants are not legal on CellSPU, but DAGCombiner
insists on inserting one anyway.)
- Update README.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62664 91177308-0d34-0410-b5e6-96231b3b80d8
optimize it to a SINT_TO_FP when the sign bit is known zero. X86 isel should perform the optimization itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62504 91177308-0d34-0410-b5e6-96231b3b80d8
X86. This code:
void f() {
uint32_t x;
float y = (float)x;
}
used to be:
movl %eax, -8(%ebp)
movl [2^52 double], -4(%ebp)
movsd -8(%ebp), %xmm0
subsd [2^52 double], %xmm0
cvtsd2ss %xmm0, %xmm0
Is now:
movsd [2^52 double], %xmm0
movsd %xmm0, %xmm1
movd %ecx, %xmm2
orps %xmm2, %xmm1
subsd %xmm0, %xmm1
cvtsd2ss %xmm1, %xmm0
This is faster on X86. Note that there's an extra load of %xmm0 into %xmm1. That
will be fixed in a later coalescer fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62404 91177308-0d34-0410-b5e6-96231b3b80d8
a new toy hazard recognizier heuristic which attempts to direct the
scheduler to avoid clumping large groups of loads or stores too densely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62291 91177308-0d34-0410-b5e6-96231b3b80d8
and into the ScheduleDAGInstrs class, so that they don't get
destructed and re-constructed for each block. This fixes a
compile-time hot spot in the post-pass scheduler.
To help facilitate this, tidy and do some minor reorganization
in the scheduler constructor functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62275 91177308-0d34-0410-b5e6-96231b3b80d8
sequences in SPUDAGToDAGISel.cpp and SPU64InstrInfo.td, killing custom
DAG node types as needed.
- i64 mul is now a legal instruction, but emits an instruction sequence
that stretches tblgen and the imagination, as well as violating laws of
several small countries and most southern US states (just kidding, but
looking at a function with 80+ parameters is really weird and just plain
wrong.)
- Update tests as needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62254 91177308-0d34-0410-b5e6-96231b3b80d8
frame index. eliminateFrameIndex will replace these instructions with
(LDWSP|STWSP|LDAWSP) or (LDW|STW|LDAWF) if a frame pointer is in use.
This fixes PR 3324. Previously we used LDWSP, STWSP, LDAWSP before frame
pointer elimination. However since they were marked as implicitly using
SP they could not be rematerialised.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62238 91177308-0d34-0410-b5e6-96231b3b80d8
to Eli for pointing out that these forms don't ignore the high bits of
their index operands, and as such are not immediately suitable for use
by isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62194 91177308-0d34-0410-b5e6-96231b3b80d8
Now Users request DwarfWriter through getAnalysisUsage() instead of creating an instance of DwarfWriter object directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61955 91177308-0d34-0410-b5e6-96231b3b80d8
into their left operand, rather than their right. Do this
by commuting the operands and inverting the condition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61842 91177308-0d34-0410-b5e6-96231b3b80d8
converted to LEA64_32r in x86's convertToThreeAddress. This
replaces code like this:
movl %esi, %edi
inc %edi
with this:
lea 1(%rsi), %edi
which appears to be beneficial.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61830 91177308-0d34-0410-b5e6-96231b3b80d8
- Add preliminary support for v2i32; load/store generates the right code but
there's a lot work to be done to make this vector type operational.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61829 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix bugs 3194, 3195: i128 load/stores produce correct code (although, we
need to ensure that i128 is 16-byte aligned in real life), and 128 zero-
extends are supported.
- New td file: SPU128InstrInfo.td: this is where all new i128 support should
be put in the future.
- Continue to hammer on i64 operations and test cases; ensure that the only
remaining problem will be i64 mul.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61784 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix (brcond (setq ...)) bug, where BRNZ should have been used vice BRZ.
- Kill unused/unnecessary nodes in SPUNodes.td
- Beef out the i64operations.c test harness to use a lot of unaligned
loads, test loops and LLVM loop/basic block optimizations; run the
test harness successfully on real Cell hardware.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61664 91177308-0d34-0410-b5e6-96231b3b80d8
- Remove custom lowering for BRCOND
- Add remaining functionality for branches in SPUInstrInfo, such as branch
condition reversal and load/store folding. Updated BrCond test to reflect
branch reversal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61597 91177308-0d34-0410-b5e6-96231b3b80d8
promote from i1 all the way up to the canonical SetCC type.
In order to discover an appropriate type to use, pass
MVT::Other to getSetCCResultType. In order to be able to
do this, change getSetCCResultType to take a type as an
argument, not a value (this is also more logical).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61542 91177308-0d34-0410-b5e6-96231b3b80d8
instruction sequence and cannot ordinarily be simplified by DAGcombine
into the various target description files or SPUDAGToDAGISel.cpp.
This makes some 64-bit operations legal.
- Eliminate target-dependent ISD enums.
- Update tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61508 91177308-0d34-0410-b5e6-96231b3b80d8
- Move v4i32, i32 mul into SPUInstrInfo.td, with a few more instruction
cleanups there as well.
- Make SMUL_LOHI, UMUL_LOHI competely illegal for Cell SPU, to better
assist Chris to see the problem in bug 3101.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61464 91177308-0d34-0410-b5e6-96231b3b80d8
DAGcombine's ability to find reasons to remove truncates when they were not
needed. Consequently, the CellSPU backend would produce correct, but _really
slow and horrible_, code.
Replaced with instruction sequences that do the equivalent truncation in
SPUInstrInfo.td.
- Re-examine how unaligned loads and stores work. Generated unaligned
load code has been tested on the CellSPU hardware; see the i32operations.c
and i64operations.c in CodeGen/CellSPU/useful-harnesses. (While they may be
toy test code, it does prove that some real world code does compile
correctly.)
- Fix truncating stores in bug 3193 (note: unpack_df.ll will still make llc
fault because i64 ult is not yet implemented.)
- Added i64 eq and neq for setcc and select/setcc; started new instruction
information file for them in SPU64InstrInfo.td. Additional i64 operations
should be added to this file and not to SPUInstrInfo.td.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61447 91177308-0d34-0410-b5e6-96231b3b80d8
This removes all the _8, _16, _32, and _64 opcodes and replaces each
group with an unsuffixed opcode. The MemoryVT field of the AtomicSDNode
is now used to carry the size information. In tablegen, the size-specific
opcodes are replaced by size-independent opcodes that utilize the
ability to compose them with predicates.
This shrinks the per-opcode tables and makes the code that handles
atomics much more concise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61389 91177308-0d34-0410-b5e6-96231b3b80d8
constant shift count that doesn't fit in the shift instruction's
immediate field. This fixes PR3242.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61281 91177308-0d34-0410-b5e6-96231b3b80d8
that have i32 immediates so that they get selected first. This
currently only matters in the JIT, as assemblers will
automatically use the smallest encoding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61250 91177308-0d34-0410-b5e6-96231b3b80d8
The EH_frame and .eh symbols are now private, except for darwin9 and earlier.
The patch also fixes the definition of PrivateGlobalPrefix on pcc linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61242 91177308-0d34-0410-b5e6-96231b3b80d8
and the RegisterScavenger not to expect traditional liveness
techniques are applicable to these registers, since we don't fully
modify the effects of push and pop after stackification.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61179 91177308-0d34-0410-b5e6-96231b3b80d8
which are identical to the original patterns.
- Change the multiply with overflow so that we distinguish between signed and
unsigned multiplication. Currently, unsigned multiplication with overflow
isn't working!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60963 91177308-0d34-0410-b5e6-96231b3b80d8
ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace
the intrinsic with an ISD::SADDO node. Then custom lower that into an
X86ISD::ADD node with a associated SETCC that checks the correct condition code
(overflow or carry). Then that gets lowered into the correct X86::ADDOvf
instruction.
Similar for SUB and MUL instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60915 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix bug 3185, with misc other cleanups.
- Needed to implement SPUInstrInfo::InsertBranch(). CAUTION: Not sure what
gets or needs to get passed to InsertBranch() to insert a conditional
branch. This will abort for now until a good test case shows up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60811 91177308-0d34-0410-b5e6-96231b3b80d8
overflow/carry from the "arithmetic with overflow" intrinsics. It searches the
machine basic block from bottom to top to find the SETO/SETC instruction that is
its conditional. If an instruction modifies EFLAGS before it reaches the
SETO/SETC instruction, then it defaults to the normal instruction emission.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60807 91177308-0d34-0410-b5e6-96231b3b80d8
target-independent way of determining overflow on multiplication. It's very
tricky. Patch by Zoltan Varga!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60800 91177308-0d34-0410-b5e6-96231b3b80d8
- Change default scheduling preference to list-burr, which produces somewhat
better code than the default. Could also use list-tdrr, but need to ask
dev list about the appropriate handy mnemonic before commiting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60738 91177308-0d34-0410-b5e6-96231b3b80d8
complete. For instance, it lowers the common case into this less-than-optimal
code:
addl %ecx, %eax
seto %cl
testb %cl, %cl
jne LBB1_2 ## overflow
instead of:
addl %ecx, %eax
jo LBB1_2 ## overflow
That will come in a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60737 91177308-0d34-0410-b5e6-96231b3b80d8
and use it in x86 address mode folding. Also, make
getRegForValue return 0 for illegal types even if it has a
ValueMap for them, because Argument values are put in the
ValueMap. This fixes PR3181.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60696 91177308-0d34-0410-b5e6-96231b3b80d8
loops when they can be subsumed into addressing modes.
Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60608 91177308-0d34-0410-b5e6-96231b3b80d8
1. GlobalBaseReg may have been spilled.
2. It may not be live at the use.
3. Spiller doesn't know this is happening so it won't prevent GlobalBaseReg from being spilled later (That by itself is a nasty hack. It's needed because we don't insert the reload until later).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60595 91177308-0d34-0410-b5e6-96231b3b80d8
- Add v4f32, v2f64 to LowerVECTOR_SHUFFLE
- Look for vector rotate in shuffle elements, generate a vector rotate
instead of a full-blown shuffle when opportunity presents itself.
- Generate larger test harness and fix a few interesting but obscure bugs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60552 91177308-0d34-0410-b5e6-96231b3b80d8
- First patch from Nehal Desai, a new contributor at Aerospace. Nehal's patch
fixes sign/zero/any-extending loads for integers and floating point. Example
code, compiled w/o debugging or optimization where he first noticed the bug:
int main(void) {
float a = 99.0;
printf("%d\n", a);
return 0;
}
Verified that this code actually works on a Cell SPU.
Changes by Scott Michel:
- Fix bug in the value type list constructed by SPUISD::LDRESULT to include
both the load result's result and chain, not just the chain alone.
- Simplify LowerLOAD and remove extraneous and unnecessary chains.
- Remove unused SPUISD pseudo instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60526 91177308-0d34-0410-b5e6-96231b3b80d8
the frame reference. This will help post-RA scheduling determine
that spills to distinct stack slots are independent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60486 91177308-0d34-0410-b5e6-96231b3b80d8
foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.
Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60461 91177308-0d34-0410-b5e6-96231b3b80d8
is set but mayLoad is not set. Fix all the problems this turned up.
Change code to not use isSimpleLoad instead of mayLoad unless it
really wants isSimpleLoad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60459 91177308-0d34-0410-b5e6-96231b3b80d8
delegates to the regular x86-32 convention which handles byval, but only
after it handles a few cases, and it's necessary to handle byval before
handling those cases. This fixes PR3122 (and rdar://6400815), llvm-gcc
miscompiling LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60453 91177308-0d34-0410-b5e6-96231b3b80d8
- Incorporate Tilmann Scheller's ISD::TRUNCATE custom lowering patch
- Update SPU calling convention info, even if it's not used yet (but can be
at some point or another)
- Ensure that any-extended f32 loads are custom lowered, especially when
they're promoted for use in printf.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60438 91177308-0d34-0410-b5e6-96231b3b80d8
constant. If X is a constant, then this is folded elsewhere.
- Added a note to Target/README.txt to indicate that we'd like to implement
this when we're able.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60399 91177308-0d34-0410-b5e6-96231b3b80d8
- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C,
depending on the type of ADDO requested.
- LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or
COND_C set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60388 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
(add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60358 91177308-0d34-0410-b5e6-96231b3b80d8
MERGE_VALUES node with only one operand, so get
rid of special code that only existed to handle
that possibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60349 91177308-0d34-0410-b5e6-96231b3b80d8
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60348 91177308-0d34-0410-b5e6-96231b3b80d8
the conditional for the BRCOND statement. For instance, it will generate:
addl %eax, %ecx
jo LOF
instead of
addl %eax, %ecx
; About 10 instructions to compare the signs of LHS, RHS, and sum.
jl LOF
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60123 91177308-0d34-0410-b5e6-96231b3b80d8