Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
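For illustration, a minimal sketch of a call using the new form (the size, alignment, and isvolatile operand values here are arbitrary):
declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind
define void @copy64(i8* %dst, i8* %src) nounwind {
entry:
  ; p0i8 encodes "i8* in address space 0"; the trailing i1 is the new isVolatile flag
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst, i8* %src, i32 64, i32 4, i1 false)
  ret void
}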
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100304 91177308-0d34-0410-b5e6-96231b3b80d8
Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100191 91177308-0d34-0410-b5e6-96231b3b80d8
- Do not try to infer GV alignment unless its type is sized. It's not possible to infer alignment if it has an opaque type.
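A small sketch of the distinction (names are illustrative):
%opaque = type opaque
@g = external global %opaque   ; unsized (opaque) type: alignment cannot be inferred
@h = external global i32       ; sized type: a natural alignment can be inferred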
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100118 91177308-0d34-0410-b5e6-96231b3b80d8
1. Makes it possible to lower with floating point loads and stores.
2. Avoid unaligned loads / stores unless unaligned access is fast.
3. Fix a memcpy lowering logic bug related to when to optimize a
load from a constant string into a constant.
4. Adjust x86 memcpy lowering threshold to make it more sane.
5. Fix x86 target hook so it uses vector and floating point memory
ops more effectively.
rdar://7774704
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100090 91177308-0d34-0410-b5e6-96231b3b80d8
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
An update of LangRef will occur in a subsequent checkin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99928 91177308-0d34-0410-b5e6-96231b3b80d8
--- Reverse-merging r99440 into '.':
U test/MC/AsmParser/X86/x86_32-bit_cat.s
U test/MC/AsmParser/X86/x86_32-encoding.s
U include/llvm/IntrinsicsX86.td
U include/llvm/CodeGen/SelectionDAGNodes.h
U lib/Target/X86/X86InstrSSE.td
U lib/Target/X86/X86ISelLowering.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99450 91177308-0d34-0410-b5e6-96231b3b80d8
We cannot use a normal call here since it has extra unmodelled side
effects (it changes the stack pointer). This should fix PR5292.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97884 91177308-0d34-0410-b5e6-96231b3b80d8
don't alias it in the MMX .td file with a different width;
split it into two X86ISD opcodes. This fixes an x86 testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96859 91177308-0d34-0410-b5e6-96231b3b80d8
even when -tailcallopt is not specified and it does not require changing the ABI.
The first case is the most trivial one: perform tail call optimization when both
the caller and the callee do not return values and the callee does not take
any input arguments.
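A minimal sketch of that trivial case (function names are illustrative):
declare void @callee() nounwind
define void @caller() nounwind {
entry:
  ; no return value and no arguments: eligible even without -tailcallopt
  tail call void @callee()
  ret void
}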
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94664 91177308-0d34-0410-b5e6-96231b3b80d8
Target-independent isel should always pass along the "tail call" property. Change
the target hook LowerCall's parameter "isTailCall" into a reference. If the target
decides it's impossible to honor the tail call request, it should set isTailCall
to false to make target-independent isel happy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94626 91177308-0d34-0410-b5e6-96231b3b80d8
which is more convenient, and change getPICJumpTableRelocBaseExpr
to take a MachineFunction to match.
Next, move the X86 code that creates a PICBase symbol to
X86TargetLowering::getPICBaseSymbol from
X86MCInstLower::GetPICBaseSymbol, which was an asmprinter-specific
library. This eliminates a 'gross hack' and allows us to
implement X86ISelLowering::getPICJumpTableRelocBaseExpr which now
calls it.
This in turn allows us to eliminate the
X86AsmPrinter::printPICJumpTableSetLabel method, which was the
only overload of printPICJumpTableSetLabel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94526 91177308-0d34-0410-b5e6-96231b3b80d8
jump table entry kind, instead of overloading
AsmPrinter::printPICJumpTableEntry.
This has a pretty horrible and inefficient FIXME around how @GOTOFF
is currently smashed into the MCSymbol name, but otherwise this is
much cleaner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94516 91177308-0d34-0410-b5e6-96231b3b80d8
1. rename the movhp patfrag to movlhps, since that's what it actually matches
2. eliminate the bogus movhps load and store patterns; they were incorrect. The load transforms are already handled (correctly) by shufps/unpack.
3. revert a recent test change to its correct form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86415 91177308-0d34-0410-b5e6-96231b3b80d8
- Allocate MachineMemOperands and MachineMemOperand lists in MachineFunctions.
This eliminates MachineInstr's std::list member and allows the data to be
created by isel and live for the remainder of codegen, avoiding a lot of
copying and unnecessary translation. This also shrinks MemSDNode.
- Delete MemOperandSDNode. Introduce MachineSDNode which has dedicated
fields for MachineMemOperands.
- Change MemSDNode to have a MachineMemOperand member instead of its own
fields with the same information. This introduces some redundancy, but
it's more consistent with what MachineInstr will eventually want.
- Ignore alignment when searching for redundant loads for CSE, but remember
the greatest alignment.
Target-specific code which previously used MemOperandSDNodes with generic
SDNodes now uses MemIntrinsicSDNodes, with opcodes in a designated range
so that the SelectionDAG framework knows that MachineMemOperand information
is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82794 91177308-0d34-0410-b5e6-96231b3b80d8
on x86, to avoid explicit test instructions. A few existing tests changed
due to arbitrary register allocation differences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82263 91177308-0d34-0410-b5e6-96231b3b80d8
encodings.
- Make some of the values emitted by the FDEs dependent upon the pointer
size. This is in line with how GCC does things. And it has the benefit of
working for Darwin in 64-bit mode now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80428 91177308-0d34-0410-b5e6-96231b3b80d8
Add patterns and instruction encoding information.
Add custom lowering to deal with the hardwired return register of
uncertain type (xmm0).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@79377 91177308-0d34-0410-b5e6-96231b3b80d8
support unaligned mem access only for certain types. (Should it be size
instead?)
ARMv7 supports unaligned access for i16 and i32; some v6 variants support it
as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@79127 91177308-0d34-0410-b5e6-96231b3b80d8
the register save area if %al is 0. This avoids touching XMM
registers when they aren't actually used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@79061 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of awkwardly encoding calling-convention information with ISD::CALL,
ISD::FORMAL_ARGUMENTS, ISD::RET, and ISD::ARG_FLAGS nodes, TargetLowering
provides three virtual functions for targets to override:
LowerFormalArguments, LowerCall, and LowerReturn, which replace the custom
lowering done on the special nodes. They provide the same information, but
in a more immediately usable format.
This also reworks much of the target-independent tail call logic. The
decision of whether or not to perform a tail call is now cleanly split
between the target-independent portion and the target-dependent portion
in IsEligibleForTailCallOptimization.
This also synchronizes all in-tree targets, to help enable future
refactoring and feature work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78142 91177308-0d34-0410-b5e6-96231b3b80d8
When the return value is not used (i.e., we only care about the value in memory), x86 does not have to produce the old value to implement these; instead, it can use add, sub, inc, and dec instructions with the "lock" prefix.
This is currently implemented with a bit of an instruction selection trick. The issue is that the target-independent pattern produces one output and a chain, and we want to map it into one that just outputs a chain. The current trick is to select it into a merge_values with the first definition being an implicit_def. The proper solution is to add new ISD opcodes for the no-output variant; the DAG combiner can then transform the node before it gets to target node selection.
Problem #2 is that we are adding a whole bunch of x86 atomic instructions when in fact these instructions are identical to the non-lock versions. We need a way to add target-specific information to target nodes and have it carried over to machine instructions. The asm printer (or JIT) can then use this information to add the "lock" prefix.
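For illustration, a sketch of the IR pattern this targets, assuming the llvm.atomic.load.add intrinsic of this period (names illustrative):
declare i32 @llvm.atomic.load.add.i32.p0i32(i32* nocapture, i32) nounwind
define void @inc(i32* %p) nounwind {
entry:
  ; the returned old value is dead, so a plain "lock add" suffices instead of "lock xadd"
  %old = call i32 @llvm.atomic.load.add.i32.p0i32(i32* %p, i32 1)
  ret void
}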
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77582 91177308-0d34-0410-b5e6-96231b3b80d8
have the alignment be calculated up front, and have the back-ends obey whatever
alignment is decided upon.
This enables future work such as precise no-op placement and the like.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74564 91177308-0d34-0410-b5e6-96231b3b80d8
Update code generator to use this attribute and remove NoImplicitFloat target option.
Update llc to set this attribute when -no-implicit-float command line option is used.
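A hedged sketch of the attribute as it appears in IR (function and body are illustrative):
; codegen must not use SSE/FP registers implicitly (e.g. for widening this memset)
define void @clear(i8* %p) noimplicitfloat nounwind {
entry:
  call void @llvm.memset.i32(i8* %p, i8 0, i32 64, i32 16)
  ret void
}
declare void @llvm.memset.i32(i8* nocapture, i8, i32, i32) nounwind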
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72959 91177308-0d34-0410-b5e6-96231b3b80d8
ADDC/ADDE use MVT::i1 (later, whatever it gets legalized to)
instead of MVT::Flag. Remove CARRY_FALSE in favor of 0; adjust
all target-independent code to use this format.
Most targets will still produce a Flag-setting target-dependent
version when selection is done. X86 is converted to use i32
instead, which means TableGen needs to produce different code
in xxxGenDAGISel.inc. This keys off the new supportsHasI1 bit
in xxxInstrInfo, currently set only for X86; in principle this
is temporary and should go away when all other targets have
been converted. All relevant X86 instruction patterns are
modified to represent setting and using EFLAGS explicitly. The
same can be done on other targets.
The immediate behavior change is that an ADC/ADD pair is no
longer tightly coupled in the X86 scheduler; they can be
separated by instructions that don't clobber the flags (MOV).
I will soon add some peephole optimizations based on using
other instructions that set the flags to feed into ADC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72707 91177308-0d34-0410-b5e6-96231b3b80d8
e.g.
orl $65536, 8(%rax)
=>
orb $1, 10(%rax)
Since narrowing is not always a win (e.g., i32 -> i16 is a loss on x86), the DAG combiner consults the target before performing the optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72507 91177308-0d34-0410-b5e6-96231b3b80d8
systems instead of attempting to promote them to a 64-bit SINT_TO_FP or
FP_TO_SINT. This is in preparation for removing the type legalization
code from LegalizeDAG: once type legalization is gone from LegalizeDAG,
it won't be able to handle the i64 operand/result correctly.
This isn't quite ideal, but I don't think any other operation for any
target ends up in this situation, so treating this case specially seems
reasonable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72324 91177308-0d34-0410-b5e6-96231b3b80d8
PR2957
ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask. A value of -1 represents UNDEF.
In addition to eliminating the creation of illegal BUILD_VECTORS just to
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.
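For reference, a shuffle whose mask includes an undef lane (which the node now represents internally as -1):
define <4 x i32> @interleave(<4 x i32> %a, <4 x i32> %b) nounwind {
entry:
  %s = shufflevector <4 x i32> %a, <4 x i32> %b,
                     <4 x i32> <i32 0, i32 4, i32 1, i32 undef>
  ret <4 x i32> %s
}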
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@70225 91177308-0d34-0410-b5e6-96231b3b80d8
ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask. A value of -1 represents UNDEF.
In addition to eliminating the creation of illegal BUILD_VECTORS just to
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.
A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69952 91177308-0d34-0410-b5e6-96231b3b80d8
with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.
This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.
Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.
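Expressed at the IR level (the actual transform happens on the DAG), a hedged sketch of the kind of rewrite meant here:
define i64 @masked_add(i64 %x, i64 %y) nounwind {
entry:
  %sum = add i64 %x, %y
  %low = and i64 %sum, 4294967295   ; only the low 32 bits are demanded
  ret i64 %low
}
; where trunc/zext are free, the add can be narrowed to:
;   %xt  = trunc i64 %x to i32
;   %yt  = trunc i64 %y to i32
;   %st  = add i32 %xt, %yt
;   %low = zext i32 %st to i64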
Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.
Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.
Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurs with the new
instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68576 91177308-0d34-0410-b5e6-96231b3b80d8
builds.
--- Reverse-merging (from foreign repository) r68552 into '.':
U test/CodeGen/X86/tls8.ll
U test/CodeGen/X86/tls10.ll
U test/CodeGen/X86/tls2.ll
U test/CodeGen/X86/tls6.ll
U lib/Target/X86/X86Instr64bit.td
U lib/Target/X86/X86InstrSSE.td
U lib/Target/X86/X86InstrInfo.td
U lib/Target/X86/X86RegisterInfo.cpp
U lib/Target/X86/X86ISelLowering.cpp
U lib/Target/X86/X86CodeEmitter.cpp
U lib/Target/X86/X86FastISel.cpp
U lib/Target/X86/X86InstrInfo.h
U lib/Target/X86/X86ISelDAGToDAG.cpp
U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp
U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h
U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h
U lib/Target/X86/X86ISelLowering.h
U lib/Target/X86/X86InstrInfo.cpp
U lib/Target/X86/X86InstrBuilder.h
U lib/Target/X86/X86RegisterInfo.td
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68560 91177308-0d34-0410-b5e6-96231b3b80d8
This introduces a small regression on the generated code
quality in the case we are just computing addresses, not
loading values.
Will work on it and on X86-64 support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68552 91177308-0d34-0410-b5e6-96231b3b80d8
the same way the "test" instruction does in overflow cases,
so eliminating the test is only safe when those bits aren't
needed, as is the case for COND_E and COND_NE, or if it
can be proven that no overflow will occur. For now, just
restrict the optimization to COND_E and COND_NE and don't
do any overflow analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66318 91177308-0d34-0410-b5e6-96231b3b80d8
X86. This code:
#include <stdint.h>
float f(uint32_t x) {
  return (float)x;
}
used to be:
movl %eax, -8(%ebp)
movl [2^52 double], -4(%ebp)
movsd -8(%ebp), %xmm0
subsd [2^52 double], %xmm0
cvtsd2ss %xmm0, %xmm0
Is now:
movsd [2^52 double], %xmm0
movsd %xmm0, %xmm1
movd %ecx, %xmm2
orps %xmm2, %xmm1
subsd %xmm0, %xmm1
cvtsd2ss %xmm1, %xmm0
This is faster on X86. Note that there's an extra copy of %xmm0 into %xmm1. That
will be fixed in a later coalescer fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62404 91177308-0d34-0410-b5e6-96231b3b80d8
promote from i1 all the way up to the canonical SetCC type.
In order to discover an appropriate type to use, pass
MVT::Other to getSetCCResultType. In order to be able to
do this, change getSetCCResultType to take a type as an
argument, not a value (this is also more logical).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61542 91177308-0d34-0410-b5e6-96231b3b80d8
which are identical to the original patterns.
- Change the multiply with overflow so that we distinguish between signed and
unsigned multiplication. Currently, unsigned multiplication with overflow
isn't working!
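For reference, a sketch of the unsigned variant this distinguishes (usage is illustrative):
declare { i32, i1 } @llvm.umul.with.overflow.i32(i32, i32) nounwind
define i1 @umul_overflows(i32 %a, i32 %b) nounwind {
entry:
  %res = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
  %ov  = extractvalue { i32, i1 } %res, 1   ; i1 overflow bit
  ret i1 %ov
}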
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60963 91177308-0d34-0410-b5e6-96231b3b80d8
ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace
the intrinsic with an ISD::SADDO node. Then custom lower that into an
X86ISD::ADD node with an associated SETCC that checks the correct condition code
(overflow or carry). Then that gets lowered into the correct X86::ADDOvf
instruction.
Similar for SUB and MUL instructions.
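A sketch of the IR this lowering path starts from (illustrative):
declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32) nounwind
define i32 @checked_add(i32 %a, i32 %b) nounwind {
entry:
  %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
  %sum = extractvalue { i32, i1 } %res, 0
  %ov  = extractvalue { i32, i1 } %res, 1
  ; the i1 overflow bit becomes a SETCC on the flags produced by X86ISD::ADD
  br i1 %ov, label %overflow, label %normal
normal:
  ret i32 %sum
overflow:
  ret i32 0
}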
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60915 91177308-0d34-0410-b5e6-96231b3b80d8
target-independent way of determining overflow on multiplication. It's very
tricky. Patch by Zoltan Varga!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60800 91177308-0d34-0410-b5e6-96231b3b80d8