it is highly specific to the object file that will be generated in the end,
this introduces a new TargetLoweringObjectFile interface that is implemented
for each of ELF/MachO/COFF/Alpha/PIC16 and XCore.
Though still is still a brutal and ugly refactoring, this is a major step
towards goodness.
This patch also:
1. fixes a bunch of dangling pointer problems in the PIC16 backend.
2. disables the TargetLowering copy ctor which PIC16 was accidentally using.
3. gets us closer to xcore having its own crazy target section flags and
pic16 not having to shadow sections with its own objects.
4. fixes wierdness where ELF targets would set CStringSection but not
CStringSection_. Factor the code better.
5. fixes some bugs in string lowering on ELF targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77294 91177308-0d34-0410-b5e6-96231b3b80d8
to support vector arguments and scalar arguments correctly. Update
lowering and fix comment to refer to pinsr* instead of insertps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76921 91177308-0d34-0410-b5e6-96231b3b80d8
be useful, and it's currently unused. (Some issues: it isn't actually
rich enough to capture the semantics on many architectures, and
semantics can vary depending on the type being shifted.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76633 91177308-0d34-0410-b5e6-96231b3b80d8
flags set properly. (hasMemory is clearly irrelevant
when matching 'i', I don't understand what this was
supposed to be doing.)
gcc.apple/asm-block-25.c (test passed before by
accident, but generated code was wrong)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76503 91177308-0d34-0410-b5e6-96231b3b80d8
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75640 91177308-0d34-0410-b5e6-96231b3b80d8
Basically, using:
lea symbol(%rip), %rax
is not valid in -static mode, because the current RIP may not be
within 32-bits of "symbol" when an app is built partially pic and
partially static. The fix for this is to compile it to:
lea symbol, %rax
It would be better to codegen this as:
movq $symbol, %rax
but that will come next.
The hard part of fixing this bug was fixing abi-isel, which was actively
testing for the wrong behavior. Also, the RUN lines are completely impossible
to understand what they are testing. To help with this, convert the -static
x86-64 codegen tests to use filecheck. This is much more stable and makes it
more clear what the codegen is expected to be.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75382 91177308-0d34-0410-b5e6-96231b3b80d8
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75379 91177308-0d34-0410-b5e6-96231b3b80d8
is just a trivial wrapper around "ClassifyGlobalReference", which
stole a ton of logic from LowerGlobalAddress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75237 91177308-0d34-0410-b5e6-96231b3b80d8
With the SVR4 ABI on PowerPC, vector arguments for vararg calls are passed differently depending on whether they are a fixed or a variable argument. Variable vector arguments always go into memory, fixed vector arguments are put
into vector registers. If there are no free vector registers available, fixed vector arguments are put on the stack.
The NumFixedArgs attribute allows to decide for an argument in a vararg call whether it belongs to the fixed or variable portion of the parameter list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74764 91177308-0d34-0410-b5e6-96231b3b80d8
have the alignment be calculated up front, and have the back-ends obey whatever
alignment is decided upon.
This allows for future work that would allow for precise no-op placement and the
like.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74564 91177308-0d34-0410-b5e6-96231b3b80d8
fence-atomic-fence down to just the atomic op. This is possible thanks to
X86's relatively strong memory model, which guarantees that locked instructions
(which are used to implement atomics) are implicit fences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74435 91177308-0d34-0410-b5e6-96231b3b80d8
implementation primarily differs from the former in that the asmprinter
doesn't make a zillion decisions about whether or not something will be
RIP relative or not. Instead, those decisions are made by isel lowering
and propagated through to the asm printer. To achieve this, we:
1. Represent RIP relative addresses by setting the base of the X86 addr
mode to X86::RIP.
2. When ISel Lowering decides that it is safe to use RIP, it lowers to
X86ISD::WrapperRIP. When it is unsafe to use RIP, it lowers to
X86ISD::Wrapper as before.
3. This removes isRIPRel from X86ISelAddressMode, representing it with
a basereg of RIP instead.
4. The addressing mode matching logic in isel is greatly simplified.
5. The asmprinter is greatly simplified, notably the "NotRIPRel" predicate
passed through various printoperand routines is gone now.
6. The various symbol printing routines in asmprinter now no longer infer
when to emit (%rip), they just print the symbol.
I think this is a big improvement over the previous situation. It does have
two small caveats though: 1. I implemented a horrible "no-rip" modifier for
the inline asm "P" constraint modifier. This is a short term hack, there is
a much better, but more involved, solution. 2. I had to xfail an
-aggressive-remat testcase because it isn't handling the use of RIP in the
constant-pool reading instruction. This specific test is easy to fix without
-aggressive-remat, which I intend to do next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74372 91177308-0d34-0410-b5e6-96231b3b80d8
out of sync with regular cc.
The only difference between the tail call cc and the normal
cc was that one parameter register - R9 - was reserved for
calling functions through a function pointer. After time the
tail call cc has gotten out of sync with the regular cc.
We can use R11 which is also caller saved but not used as
parameter register for potential function pointers and
remove the special tail call cc on x86-64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73233 91177308-0d34-0410-b5e6-96231b3b80d8
on x86 to handle more cases. Fix a bug in said code that would cause it
to read past the end of an object. Rewrite the code in
SelectionDAGLegalize::ExpandBUILD_VECTOR to be a bit more general.
Remove PerformBuildVectorCombine, which is no longer necessary with
these changes. In addition to simplifying the code, with this change,
we can now catch a few more cases of consecutive loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73012 91177308-0d34-0410-b5e6-96231b3b80d8
nodes for vectors with an i16 element type. Add an optimization for
building a vector which is all zeros/undef except for the bottom
element, where the bottom element is an i8 or i16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72988 91177308-0d34-0410-b5e6-96231b3b80d8
Update code generator to use this attribute and remove NoImplicitFloat target option.
Update llc to set this attribute when -no-implicit-float command line option is used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72959 91177308-0d34-0410-b5e6-96231b3b80d8
build vectors with i64 elements will only appear on 32b x86 before legalize.
Since vector widening occurs during legalize, and produces i64 build_vector
elements, the dag combiner is never run on these before legalize splits them
into 32b elements.
Teach the build_vector dag combine in x86 back end to recognize consecutive
loads producing the low part of the vector.
Convert the two uses of TLI's consecutive load recognizer to pass LoadSDNodes
since that was required implicitly.
Add a testcase for the transform.
Old:
subl $28, %esp
movl 32(%esp), %eax
movl 4(%eax), %ecx
movl %ecx, 4(%esp)
movl (%eax), %eax
movl %eax, (%esp)
movaps (%esp), %xmm0
pmovzxwd %xmm0, %xmm0
movl 36(%esp), %eax
movaps %xmm0, (%eax)
addl $28, %esp
ret
New:
movl 4(%esp), %eax
pmovzxwd (%eax), %xmm0
movl 8(%esp), %eax
movaps %xmm0, (%eax)
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72957 91177308-0d34-0410-b5e6-96231b3b80d8
ADDC/ADDE use MVT::i1 (later, whatever it gets legalized to)
instead of MVT::Flag. Remove CARRY_FALSE in favor of 0; adjust
all target-independent code to use this format.
Most targets will still produce a Flag-setting target-dependent
version when selection is done. X86 is converted to use i32
instead, which means TableGen needs to produce different code
in xxxGenDAGISel.inc. This keys off the new supportsHasI1 bit
in xxxInstrInfo, currently set only for X86; in principle this
is temporary and should go away when all other targets have
been converted. All relevant X86 instruction patterns are
modified to represent setting and using EFLAGS explicitly. The
same can be done on other targets.
The immediate behavior change is that an ADC/ADD pair are no
longer tightly coupled in the X86 scheduler; they can be
separated by instructions that don't clobber the flags (MOV).
I will soon add some peephole optimizations based on using
other instructions that set the flags to feed into ADC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72707 91177308-0d34-0410-b5e6-96231b3b80d8
e.g.
orl $65536, 8(%rax)
=>
orb $1, 10(%rax)
Since narrowing is not always a win, e.g. i32 -> i16 is a loss on x86, dag combiner consults with the target before performing the optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72507 91177308-0d34-0410-b5e6-96231b3b80d8
FP_TO_XINT. Necessary for some cleanups I'm working on. Updated
from the previous version (r72431) to fix a bug and make some things a
bit clearer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72445 91177308-0d34-0410-b5e6-96231b3b80d8
systems instead of attempting to promote them to a 64-bit SINT_TO_FP or
FP_TO_SINT. This is in preparation for removing the type legalization
code from LegalizeDAG: once type legalization is gone from LegalizeDAG,
it won't be able to handle the i64 operand/result correctly.
This isn't quite ideal, but I don't think any other operation for any
target ends up in this situation, so treating this case specially seems
reasonable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72324 91177308-0d34-0410-b5e6-96231b3b80d8
PR2957
ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask. A value of -1 represents UNDEF.
In addition to eliminating the creation of illegal BUILD_VECTORS just to
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@70225 91177308-0d34-0410-b5e6-96231b3b80d8
ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask. A value of -1 represents UNDEF.
In addition to eliminating the creation of illegal BUILD_VECTORS just to
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.
A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69952 91177308-0d34-0410-b5e6-96231b3b80d8
in the MachineFunction class, renaming it to addLiveIn for consistency with
the same method in MachineBasicBlock. Thanks for Anton for suggesting this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69615 91177308-0d34-0410-b5e6-96231b3b80d8
leaq foo@TLSGD(%rip), %rdi
as part of the instruction sequence. Using a register other than %rdi and then
copying it to %rdi is not valid.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69350 91177308-0d34-0410-b5e6-96231b3b80d8
with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.
This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.
Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.
Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.
Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.
Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurrs with the new
instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68576 91177308-0d34-0410-b5e6-96231b3b80d8
builds.
--- Reverse-merging (from foreign repository) r68552 into '.':
U test/CodeGen/X86/tls8.ll
U test/CodeGen/X86/tls10.ll
U test/CodeGen/X86/tls2.ll
U test/CodeGen/X86/tls6.ll
U lib/Target/X86/X86Instr64bit.td
U lib/Target/X86/X86InstrSSE.td
U lib/Target/X86/X86InstrInfo.td
U lib/Target/X86/X86RegisterInfo.cpp
U lib/Target/X86/X86ISelLowering.cpp
U lib/Target/X86/X86CodeEmitter.cpp
U lib/Target/X86/X86FastISel.cpp
U lib/Target/X86/X86InstrInfo.h
U lib/Target/X86/X86ISelDAGToDAG.cpp
U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp
U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h
U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h
U lib/Target/X86/X86ISelLowering.h
U lib/Target/X86/X86InstrInfo.cpp
U lib/Target/X86/X86InstrBuilder.h
U lib/Target/X86/X86RegisterInfo.td
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68560 91177308-0d34-0410-b5e6-96231b3b80d8
This introduces a small regression on the generated code
quality in the case we are just computing addresses, not
loading values.
Will work on it and on X86-64 support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68552 91177308-0d34-0410-b5e6-96231b3b80d8
x * 40
=>
shlq $3, %rdi
leaq (%rdi,%rdi,4), %rax
This has the added benefit of allowing more multiply to be folded into addressing mode. e.g.
a * 24 + b
=>
leaq (%rdi,%rdi,2), %rax
leaq (%rsi,%rax,8), %rax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67917 91177308-0d34-0410-b5e6-96231b3b80d8
%a = ...
%b = and i32 %a, 2
%c = srl i32 %b, 1
%d = br i32 %c,
into
%a = ...
%b = and %a, 2
%c = X86ISD::CMP %b, 0
%d = X86ISD::BRCOND %c ...
This applies only when the AND constant value has one bit set and the SRL
constant is equal to the log2 of the AND constant. The back-end is smart enough
to convert the result into a TEST/JMP sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67728 91177308-0d34-0410-b5e6-96231b3b80d8
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
for i32/i64 expressions (we could also do i16 on cpus where
i16 lea is fast, but I didn't add this). On the example, we now
generate:
_test:
movl 4(%esp), %eax
cmpl $42, (%eax)
setl %al
movzbl %al, %eax
leal 4(%eax,%eax,8), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
cmpl $41, (%eax)
movl $4, %ecx
movl $13, %eax
cmovg %ecx, %eax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66869 91177308-0d34-0410-b5e6-96231b3b80d8
related transformations out of target-specific dag combine into the
ARM backend. These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently
with the recently added cp constant select optimization, but is a
very general xform. For example, we now compile the second example
in const-select.ll to:
_test:
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
seta %al
movzbl %al, %eax
movl 4(%esp), %ecx
movsbl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
leal 4(%eax), %ecx
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
cmovbe %eax, %ecx
movsbl (%ecx), %eax
ret
This passes multisource and dejagnu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66779 91177308-0d34-0410-b5e6-96231b3b80d8
the same say the "test" instruction does in overflow cases,
so eliminating the test is only safe when those bits aren't
needed, as is the case for COND_E and COND_NE, or if it
can be proven that no overflow will occur. For now, just
restrict the optimization to COND_E and COND_NE and don't
do any overflow analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66318 91177308-0d34-0410-b5e6-96231b3b80d8
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65296 91177308-0d34-0410-b5e6-96231b3b80d8
(Note: Eventually, commits like this will be handled via a pre-commit hook that
does this automagically, as well as expand tabs to spaces and look for 80-col
violations.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64827 91177308-0d34-0410-b5e6-96231b3b80d8
in inline asm as signed (what gcc does). Add partial support
for x86-specific "e" and "Z" constraints, with appropriate
signedness for printing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64400 91177308-0d34-0410-b5e6-96231b3b80d8
Many targets build placeholder nodes for special operands, e.g.
GlobalBaseReg on X86 and PPC for the PIC base. There's no
sensible way to associate debug info with these. I've left
them built with getNode calls with explicit DebugLoc::getUnknownLoc operands.
I'm not too happy about this but don't see a good improvement;
I considered adding a getPseudoOperand or something, but it
seems to me that'll just make it harder to read.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63992 91177308-0d34-0410-b5e6-96231b3b80d8
getCALLSEQ_{END,START} to permit passing no DebugLoc
there. UNDEF doesn't logically have DebugLoc; add
getUNDEF to encapsulate this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63978 91177308-0d34-0410-b5e6-96231b3b80d8
crashes or wrong code with codegen of large integers:
eliminate the legacy getIntegerVTBitMask and
getIntegerVTSignBit methods, which returned their
value as a uint64_t, so couldn't handle huge types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63494 91177308-0d34-0410-b5e6-96231b3b80d8
dagcombines that help it match in several more cases. Add
several more cases to test/CodeGen/X86/bt.ll. This doesn't
yet include matching for BT with an immediate operand, it
just covers more register+register cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63266 91177308-0d34-0410-b5e6-96231b3b80d8
tidy up SDUse and related code.
- Replace the operator= member functions with a set method, like
LLVM Use has, and variants setInitial and setNode, which take
care up updating use lists, like LLVM Use's does. This simplifies
code that calls these functions.
- getSDValue() is renamed to get(), as in LLVM Use, though most
places can either use the implicit conversion to SDValue or the
convenience functions instead.
- Fix some more node vs. value terminology issues.
Also, eliminate the one remaining use of SDOperandPtr, and
SDOperandPtr itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62995 91177308-0d34-0410-b5e6-96231b3b80d8
corresponding to the "not" and "vnot" PatFrags. Use the new method
in some places where it seems appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62768 91177308-0d34-0410-b5e6-96231b3b80d8
optimize it to a SINT_TO_FP when the sign bit is known zero. X86 isel should perform the optimization itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62504 91177308-0d34-0410-b5e6-96231b3b80d8
X86. This code:
void f() {
uint32_t x;
float y = (float)x;
}
used to be:
movl %eax, -8(%ebp)
movl [2^52 double], -4(%ebp)
movsd -8(%ebp), %xmm0
subsd [2^52 double], %xmm0
cvtsd2ss %xmm0, %xmm0
Is now:
movsd [2^52 double], %xmm0
movsd %xmm0, %xmm1
movd %ecx, %xmm2
orps %xmm2, %xmm1
subsd %xmm0, %xmm1
cvtsd2ss %xmm1, %xmm0
This is faster on X86. Note that there's an extra load of %xmm0 into %xmm1. That
will be fixed in a later coalescer fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62404 91177308-0d34-0410-b5e6-96231b3b80d8
promote from i1 all the way up to the canonical SetCC type.
In order to discover an appropriate type to use, pass
MVT::Other to getSetCCResultType. In order to be able to
do this, change getSetCCResultType to take a type as an
argument, not a value (this is also more logical).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61542 91177308-0d34-0410-b5e6-96231b3b80d8
This removes all the _8, _16, _32, and _64 opcodes and replaces each
group with an unsuffixed opcode. The MemoryVT field of the AtomicSDNode
is now used to carry the size information. In tablegen, the size-specific
opcodes are replaced by size-independent opcodes that utilize the
ability to compose them with predicates.
This shrinks the per-opcode tables and makes the code that handles
atomics much more concise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61389 91177308-0d34-0410-b5e6-96231b3b80d8
which are identical to the original patterns.
- Change the multiply with overflow so that we distinguish between signed and
unsigned multiplication. Currently, unsigned multiplication with overflow
isn't working!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60963 91177308-0d34-0410-b5e6-96231b3b80d8
ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace
the intrinsic with an ISD::SADDO node. Then custom lower that into an
X86ISD::ADD node with a associated SETCC that checks the correct condition code
(overflow or carry). Then that gets lowered into the correct X86::ADDOvf
instruction.
Similar for SUB and MUL instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60915 91177308-0d34-0410-b5e6-96231b3b80d8
target-independent way of determining overflow on multiplication. It's very
tricky. Patch by Zoltan Varga!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60800 91177308-0d34-0410-b5e6-96231b3b80d8
loops when they can be subsumed into addressing modes.
Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60608 91177308-0d34-0410-b5e6-96231b3b80d8
- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C,
depending on the type of ADDO requested.
- LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or
COND_C set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60388 91177308-0d34-0410-b5e6-96231b3b80d8
MERGE_VALUES node with only one operand, so get
rid of special code that only existed to handle
that possibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60349 91177308-0d34-0410-b5e6-96231b3b80d8
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60348 91177308-0d34-0410-b5e6-96231b3b80d8
the conditional for the BRCOND statement. For instance, it will generate:
addl %eax, %ecx
jo LOF
instead of
addl %eax, %ecx
; About 10 instructions to compare the signs of LHS, RHS, and sum.
jl LOF
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60123 91177308-0d34-0410-b5e6-96231b3b80d8
- Mark "add with overflow" as having a custom lowering for X86. Give it a null
lowering representation for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@59971 91177308-0d34-0410-b5e6-96231b3b80d8
(actually, code already all worked, only the comment
changed). Use this to implement 'A' constraint on x86.
Fixes PR 1779.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@59266 91177308-0d34-0410-b5e6-96231b3b80d8
a memset using 16-byte XMM stores, but where the stack realignment code
didn't work. Until it does (PR2962) disable use of xmm regs in memcpy
and memset formation for linux and other targets with insufficiently
aligned stacks.
This is part of PR2888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58317 91177308-0d34-0410-b5e6-96231b3b80d8
LHS is a foldable load, then LHS and RHS are swapped
and SetCCOpcode is changed to SETUGT. But the later
code is expecting operands to be the wrong way round
for SETUGT, but they are not in this case, resulting
in an inverted compare. The solution is to move the
load normalization before the correction for SETUGT.
This bug was tickled by LegalizeTypes which happened
to legalize the testcase slightly differently to
LegalizeDAG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58092 91177308-0d34-0410-b5e6-96231b3b80d8
assume that i64 has been turned into a BUILD_PAIR
node (when called from LegalizeTypes this hasn't
happened yet) and don't use a vector shuffle mask
with an illegal element type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57972 91177308-0d34-0410-b5e6-96231b3b80d8
The same one Apple gcc uses, faster. Also gets the
extreme case in gcc.c-torture/execute/ieee/rbug.c
correct which we weren't before; this is not
sufficient to get the test to pass though, there
is another bug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57926 91177308-0d34-0410-b5e6-96231b3b80d8
in the 32-bit signed offset field of addresses. Even though this
may be intended, some linkers refuse to relocate code where the
relocated address computation overflows.
Also, fix the sign-extension of constant offsets to use the
actual pointer size, rather than the size of the GlobalAddress
node, which may be different, for example on x86-64 where MVT::i32
is used when the address is being fit into the 32-bit displacement
field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57885 91177308-0d34-0410-b5e6-96231b3b80d8
Where previously LLVM might emit code like this:
ucomisd %xmm1, %xmm0
setne %al
setp %cl
orb %al, %cl
jne .LBB4_2
it now emits this:
ucomisd %xmm1, %xmm0
jne .LBB4_2
jp .LBB4_2
It has fewer instructions and uses fewer registers, but it does
have more branches. And in the case that this code is followed by
a non-fallthrough edge, it may be followed by a jmp instruction,
resulting in three branch instructions in sequence. Some effort
is made to avoid this situation.
To achieve this, X86ISelLowering.cpp now recognizes FCMP_OEQ and
FCMP_UNE in lowered form, and replace them with code that emits
two branches, except in the case where it would require converting
a fall-through edge to an explicit branch.
Also, X86InstrInfo.cpp's branch analysis and transform code now
knows now to handle blocks with multiple conditional branches. It
uses loops instead of having fixed checks for up to two
instructions. It can now analyze and transform code generated
from FCMP_OEQ and FCMP_UNE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57873 91177308-0d34-0410-b5e6-96231b3b80d8
LowerOperation if it doesn't know what else to do.
This methods should probably be factorized some,
but this is good enough for the moment. Have
LowerATOMIC_BINARY_64 use EXTRACT_ELEMENT rather
than assuming the operand is a BUILD_PAIR (if it
is then getNode will automagically simplify the
EXTRACT_ELEMENT). This way LowerATOMIC_BINARY_64
usable from LegalizeTypes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57831 91177308-0d34-0410-b5e6-96231b3b80d8
and add a TargetLowering hook for it to use to determine when this
is legal (i.e. not in PIC mode, etc.)
This allows instruction selection to emit folded constant offsets
in more cases, such as the included testcase, eliminating the need
for explicit arithmetic instructions.
This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp
that attempted to achieve the same effect, but wasn't as effective.
Also, fix handling of offsets in GlobalAddressSDNodes in several
places, including changing GlobalAddressSDNode's offset from
int to int64_t.
The Mips, Alpha, Sparc, and CellSPU targets appear to be
unaware of GlobalAddress offsets currently, so set the hook to
false on those targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57748 91177308-0d34-0410-b5e6-96231b3b80d8
in 32-bit mode instead of assigning a register pair. This has nothing to
do with PR2356, but I happened to notice it while working on it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57704 91177308-0d34-0410-b5e6-96231b3b80d8
i.e. conditions that cannot be checked with a single instruction. For example,
SETONE and SETUEQ on x86.
- Teach legalizer to implement *illegal* setcc as a and / or of a number of
legal setcc nodes. For now, only implement FP conditions. e.g. SETONE is
implemented as SETO & SETNE, SETUEQ is SETUO | SETEQ.
- Move x86 target over.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57542 91177308-0d34-0410-b5e6-96231b3b80d8
- Move the EH landing-pad code and adjust it so that it works
with FastISel as well as with SDISel.
- Add FastISel support for @llvm.eh.exception and
@llvm.eh.selector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57539 91177308-0d34-0410-b5e6-96231b3b80d8
parameters instead of raw Constants. This prevents the constants from
being selected by the isel pass, fixing PR2735.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57385 91177308-0d34-0410-b5e6-96231b3b80d8
`-fno-builtin' flag. Currently, it's used to replace "memset" with "_bzero"
instead of "__bzero" on Darwin10+. This arguably violates the meaning of this
flag, but is currently sufficient. The meaning of this flag should become more
specific over time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@56885 91177308-0d34-0410-b5e6-96231b3b80d8
its size). Adjust various lowering functions to
pass this info through from CallInst. Use it to
implement sseregparm returns on X86. Remove
X86_ssecall calling convention.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@56677 91177308-0d34-0410-b5e6-96231b3b80d8
s/ParamAttr/Attribute/g
s/PAList/AttrList/g
s/FnAttributeWithIndex/AttributeWithIndex/g
s/FnAttr/Attribute/g
This sets the stage
- to implement function notes as function attributes and
- to distinguish between function attributes and return value attributes.
This requires corresponding changes in llvm-gcc and clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@56622 91177308-0d34-0410-b5e6-96231b3b80d8