- Fix fabs, fneg for f32 and f64.
- Use BuildVectorSDNode.isConstantSplat, now that the functionality exists
- Continue to improve i64 constant lowering. Lower certain special constants
to the constant pool when they correspond to SPU's shufb instruction's
special mask values. This avoids the overhead of performing a shuffle on a
zero-filled vector just to get the special constant when the memory load
suffices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67067 91177308-0d34-0410-b5e6-96231b3b80d8
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65296 91177308-0d34-0410-b5e6-96231b3b80d8
- Rename fcmp.ll test to fcmp32.ll, start adding new double tests to fcmp64.ll
- Fix select_bits.ll test
- Capitulate to the DAGCombiner and move i64 constant loads to instruction
selection (SPUISelDAGtoDAG.cpp).
<rant>DAGCombiner will insert all kinds of 64-bit optimizations after
operation legalization occurs and now we have to do most of the work that
instruction selection should be doing twice (once to determine if v2i64
build_vector can be handled by SelectCode(), which then runs all of the
predicates a second time to select the necessary instructions.) But,
CellSPU is a good citizen.</rant>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62990 91177308-0d34-0410-b5e6-96231b3b80d8
- Ensure that (operation) legalization emits proper FDIV libcall when needed.
- Fix various bugs encountered during llvm-spu-gcc build, along with various
cleanups.
- Start supporting double precision comparisons for remaining libgcc2 build.
Discovered interesting DAGCombiner feature, which is currently solved via
custom lowering (64-bit constants are not legal on CellSPU, but DAGCombiner
insists on inserting one anyway.)
- Update README.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62664 91177308-0d34-0410-b5e6-96231b3b80d8
and into the ScheduleDAGInstrs class, so that they don't get
destructed and re-constructed for each block. This fixes a
compile-time hot spot in the post-pass scheduler.
To help facilitate this, tidy and do some minor reorganization
in the scheduler constructor functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62275 91177308-0d34-0410-b5e6-96231b3b80d8
sequences in SPUDAGToDAGISel.cpp and SPU64InstrInfo.td, killing custom
DAG node types as needed.
- i64 mul is now a legal instruction, but emits an instruction sequence
that stretches tblgen and the imagination, as well as violating laws of
several small countries and most southern US states (just kidding, but
looking at a function with 80+ parameters is really weird and just plain
wrong.)
- Update tests as needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62254 91177308-0d34-0410-b5e6-96231b3b80d8
instruction sequence and cannot ordinarily be simplified by DAGcombine
into the various target description files or SPUDAGToDAGISel.cpp.
This makes some 64-bit operations legal.
- Eliminate target-dependent ISD enums.
- Update tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61508 91177308-0d34-0410-b5e6-96231b3b80d8
DAGcombine's ability to find reasons to remove truncates when they were not
needed. Consequently, the CellSPU backend would produce correct, but _really
slow and horrible_, code.
Replaced with instruction sequences that do the equivalent truncation in
SPUInstrInfo.td.
- Re-examine how unaligned loads and stores work. Generated unaligned
load code has been tested on the CellSPU hardware; see the i32operations.c
and i64operations.c in CodeGen/CellSPU/useful-harnesses. (While they may be
toy test code, it does prove that some real world code does compile
correctly.)
- Fix truncating stores in bug 3193 (note: unpack_df.ll will still make llc
fault because i64 ult is not yet implemented.)
- Added i64 eq and neq for setcc and select/setcc; started new instruction
information file for them in SPU64InstrInfo.td. Additional i64 operations
should be added to this file and not to SPUInstrInfo.td.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61447 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix bug 3185, with misc other cleanups.
- Needed to implement SPUInstrInfo::InsertBranch(). CAUTION: Not sure what
gets or needs to get passed to InsertBranch() to insert a conditional
branch. This will abort for now until a good test case shows up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60811 91177308-0d34-0410-b5e6-96231b3b80d8
- First patch from Nehal Desai, a new contributor at Aerospace. Nehal's patch
fixes sign/zero/any-extending loads for integers and floating point. Example
code, compiled w/o debugging or optimization where he first noticed the bug:
int main(void) {
float a = 99.0;
printf("%d\n", a);
return 0;
}
Verified that this code actually works on a Cell SPU.
Changes by Scott Michel:
- Fix bug in the value type list constructed by SPUISD::LDRESULT to include
both the load result's result and chain, not just the chain alone.
- Simplify LowerLOAD and remove extraneous and unnecessary chains.
- Remove unused SPUISD pseudo instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60526 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
(add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60358 91177308-0d34-0410-b5e6-96231b3b80d8
(a) Remove conditionally removed code in SelectXAddr. Basically, hope for the
best that the A-form and D-form address predicates catch everything before
the code decides to emit a X-form address.
(b) Expand vector store test cases to include the usual suspects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60034 91177308-0d34-0410-b5e6-96231b3b80d8
priority function. Instead, just iterate over the AllNodes list, which is
already in topological order. This eliminates a fair amount of bookkeeping,
and speeds up the isel phase by about 15% on many testcases.
The impact on most targets is that AddToISelQueue calls can be simply removed.
In the x86 target, there are two additional notable changes.
The rule-bending AND+SHIFT optimization in MatchAddress that creates new
pre-isel nodes during isel is now a little more verbose, but more robust.
Instead of either creating an invalid DAG or creating an invalid topological
sort, as it has historically done, it can now just insert the new nodes into
the node list at a position where they will be consistent with the topological
ordering.
Also, the address-matching code has logic that checked to see if a node was
"already selected". However, when a node is selected, it has all its uses
taken away via ReplaceAllUsesWith or equivalent, so it won't recieve any
further visits from MatchAddress. This code is now removed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58748 91177308-0d34-0410-b5e6-96231b3b80d8
flag. Then in a debugger developers can set breakpoints at these calls
to see waht is about to be selected and what the resulting subgraph
looks like. This really helps when debugging instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58278 91177308-0d34-0410-b5e6-96231b3b80d8
with ConstantInt. This led to fixing a bug in TargetLowering.cpp
using getValue instead of getAPIntValue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@56159 91177308-0d34-0410-b5e6-96231b3b80d8
process up to a higher level. This allows FastISel to leverage
more of SelectionDAGISel's infastructure, such as updating Machine
PHI nodes.
Also, implement transitioning from SDISel back to FastISel in
the middle of a block, so it's now possible to go back and
forth. This allows FastISel to hand individual CallInsts and other
complicated things off to SDISel to handle, while handling the rest
of the block itself.
To help support this, reorganize the SelectionDAG class so that it
is allocated once and reused throughout a function, instead of
being completely reallocated for each block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@55219 91177308-0d34-0410-b5e6-96231b3b80d8
replacement of multiple values. This is slightly more efficient
than doing multiple ReplaceAllUsesOfValueWith calls, and theoretically
could be optimized even further. However, an important property of this
new function is that it handles the case where the source value set and
destination value set overlap. This makes it feasible for isel to use
SelectNodeTo in many very common cases, which is advantageous because
SelectNodeTo avoids a temporary node and it doesn't require CSEMap
updates for users of values that don't change position.
Revamp MorphNodeTo, which is what does all the work of SelectNodeTo, to
handle operand lists more efficiently, and to correctly handle a number
of corner cases to which its new wider use exposes it.
This commit also includes a change to the encoding of post-isel opcodes
in SDNodes; now instead of being sandwiched between the target-independent
pre-isel opcodes and the target-dependent pre-isel opcodes, post-isel
opcodes are now represented as negative values. This makes it possible
to test if an opcode is pre-isel or post-isel without having to know
the size of the current target's post-isel instruction set.
These changes speed up llc overall by 3% and reduce memory usage by 10%
on the InstructionCombining.cpp testcase with -fast and -regalloc=local.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53728 91177308-0d34-0410-b5e6-96231b3b80d8
of apint codegen failure is the DAG combiner doing
the wrong thing because it was comparing MVT's using
< rather than comparing the number of bits. Removing
the < method makes this mistake impossible to commit.
Instead, add helper methods for comparing bits and use
them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52098 91177308-0d34-0410-b5e6-96231b3b80d8
and better control the abstraction. Rename the type
to MVT. To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits(). Use VT.getSimpleVT()
to extract a MVT::SimpleValueType for use in switch
statements (you will get an assert failure if VT is
an extended value type - these shouldn't exist after
type legalization).
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52044 91177308-0d34-0410-b5e6-96231b3b80d8
several things that were neither in an anonymous namespace nor static
but not intended to be global.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51017 91177308-0d34-0410-b5e6-96231b3b80d8
fixes are target-specific lowering of frame indices, fix constants generated
for the FSMBI instruction, and fixing SPUTargetLowering::computeMaskedBitsFor-
TargetNode().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50462 91177308-0d34-0410-b5e6-96231b3b80d8
Fix bugs encountered, mostly due to range matching for immediates;
the CellSPU's 10-bit immediates are sign extended, covering a
larger range of unsigned values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48575 91177308-0d34-0410-b5e6-96231b3b80d8
for CellSPU modifications:
- SPUInstrInfo.td refactoring: "multiclass" really is _your_ friend.
- Other improvements based on refactoring effort in SPUISelLowering.cpp,
esp. in SPUISelLowering::PerformDAGCombine(), where zero amount shifts and
rotates are now eliminiated, other scalar-to-vector-to-scalar silliness
is also eliminated.
- 64-bit operations are being implemented, _muldi3.c gcc runtime now
compiles and generates the right code. More work still needs to be done.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47532 91177308-0d34-0410-b5e6-96231b3b80d8
Added ISD::DECLARE node type to represent llvm.dbg.declare intrinsic. Now the intrinsic calls are lowered into a SDNode and lives on through out the codegen passes.
For now, since all the debugging information recording is done at isel time, when a ISD::DECLARE node is selected, it has the side effect of also recording the variable. This is a short term solution that should be fixed in time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46659 91177308-0d34-0410-b5e6-96231b3b80d8
- Expand tabs... (poss 80-col violations, will get them later...)
- Consolidate logic for SelectDFormAddr and SelectDForm2Addr into a single
function, simplifying maintenance. Also reduced custom instruction
generation for SPUvecinsert/INSERT_MASK.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46544 91177308-0d34-0410-b5e6-96231b3b80d8
only two addressing mode nodes, SPUaform and SPUindirect (vice the
three previous ones, SPUaform, SPUdform and SPUxform). This improves
code somewhat because we now avoid using reg+reg addressing when
it can be avoided. It also simplifies the address selection logic,
which was the main point for doing this.
Also, for various global variables that would be loaded using SPU's
A-form addressing, prefer D-form offs[reg] addressing, keeping the
base in a register if the variable is used more than once.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46483 91177308-0d34-0410-b5e6-96231b3b80d8
Fixed CellSPU's A-form (local store) address mode, so that all globals,
externals, constant pool and jump table symbols are now wrapped within
a SPUISD::AFormAddr pseudo-instruction. This now identifies all local
store memory addresses, although it requires a bit of legerdemain during
instruction selection to properly select loads to and stores from local
store, properly generating "LQA" instructions.
Also added mul_ops.ll test harness for exercising integer multiplication.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46142 91177308-0d34-0410-b5e6-96231b3b80d8