The registers are placed into the saved registers list in the reverse order,
which is why the original loop was written to loop backwards.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148064 91177308-0d34-0410-b5e6-96231b3b80d8
lc: X86ISelLowering.cpp:6480: llvm::SDValue llvm::X86TargetLowering::LowerVECTOR_SHUFFLE(llvm::SDValue, llvm::SelectionDAG&) const: Assertion `V1.getOpcode() != ISD::UNDEF&& "Op 1 of shuffle should not be undef"' failed.
Added a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148044 91177308-0d34-0410-b5e6-96231b3b80d8
Uses the pvArbitrary slot of the TIB, which is reserved for applications. We
only support frames with a static size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148040 91177308-0d34-0410-b5e6-96231b3b80d8
Restore the (obviously wrong) behavior from before r147938 without relying on
undefined behavior. Add a fat FIXME note.
This should fix nightly tester failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148030 91177308-0d34-0410-b5e6-96231b3b80d8
In att style asm syntax memory operand size is derived from suffix attached with mnemonic. In intel style asm syntax it is part of memory operand hence predicate method check is required to select appropriate instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148006 91177308-0d34-0410-b5e6-96231b3b80d8
same pattern. We already had this pattern is a few places, but others
tried to make a rough approximation of an actual DAG structure. As not
everywhere went to this trouble, nothing could rely on this being done.
In fact, I've checked all references to these node Ids, and the ones
that are using the topo-sort properties are actually satisfied with
a strict-weak-ordering. The requirement appears to be that Use >= Def.
I've added a big blurb of comments to this bit of the transform to
clarify why the order is so important for the next reader of the code.
I'm starting with this change as it is very small, and trivially
reverted if something breaks or the >= above really does need to be >.
If that proves the case, we can hide the problem by reverting this
patch, but the problem exists elsewhere as well, and so a more
comprehensive solution will be needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148001 91177308-0d34-0410-b5e6-96231b3b80d8
This uses TLS slot 90, which actually belongs to JavaScriptCore. We only support
frames with static size
Patch by Brian Anderson.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147960 91177308-0d34-0410-b5e6-96231b3b80d8
hoped this would revive one of the llvm-gcc selfhost build bots, but it
didn't so it doesn't appear that my transform is the culprit.
If anyone else is seeing failures, please let me know!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147957 91177308-0d34-0410-b5e6-96231b3b80d8
strange build bot failures that look like a miscompile into an infloop.
I'll investigate this tomorrow, but I'd both like to know whether my
patch is the culprit, and get the bots back to green.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147945 91177308-0d34-0410-b5e6-96231b3b80d8
factor the differences that were hiding in one of them into its other
caller, the SRL handling code. No change in behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147940 91177308-0d34-0410-b5e6-96231b3b80d8
mask+shift pairs at the beginning of the ISD::AND case block, and then
hoist the final pattern into a helper function, simplifying and
reflowing it appropriately. This should have no observable behavior
change, but several simplifications fell out of this such as directly
computing the new mask constant, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147939 91177308-0d34-0410-b5e6-96231b3b80d8
extracts and scaled addressing modes into its own helper function. No
functionality changed here, just hoisting and layout fixes falling out
of that hoisting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147937 91177308-0d34-0410-b5e6-96231b3b80d8
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:
unsigned x = my_accelerator_table[input >> 11];
Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):
*(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));
The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.
In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147936 91177308-0d34-0410-b5e6-96231b3b80d8
Allow LDRD to be formed from pairs with different LDR encodings. This was the original intention of the pass. Somewhere along the way, the LDR opcodes were refined which broke the optimization. We really don't care what the original opcodes are as long as they both map to the same LDRD and the immediate still fits.
Fixes rdar://10435045 ARMLoadStoreOptimization cannot handle mixed LDRi8/LDRi12
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147922 91177308-0d34-0410-b5e6-96231b3b80d8
Add a test that checks the stack alignment of a simple function for
Darwin, Linux and NetBSD for 32bit and 64bit mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147888 91177308-0d34-0410-b5e6-96231b3b80d8
This function runs after all constant islands have been placed, and may
shrink some instructions to their 2-byte forms. This can actually cause
some constant pool entries to move out of range because of growing
alignment padding.
Treat instructions that may be shrunk the same as inline asm - they
erode the known alignment bits.
Also reinstate an old assertion in verify(). It is correct now that
basic block offsets include alignments.
Add a single large test case that will hopefully exercise many parts of
the constant island pass.
<rdar://problem/10670199>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147885 91177308-0d34-0410-b5e6-96231b3b80d8
As the comment around 7746 says, it's better to use the x87 extended precision
here than SSE. And the generic code doesn't know how to do that. It also regains
the speed lost for the uint64_to_float.c testcase.
<rdar://problem/10669858>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147869 91177308-0d34-0410-b5e6-96231b3b80d8
of several newly un-defaulted switches. This also helps optimizers
(including LLVM's) recognize that every case is covered, and we should
assume as much.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147861 91177308-0d34-0410-b5e6-96231b3b80d8
On Thumb, the displacement computation hardware uses the address of the
current instruction rouned down to a multiple of 4. Include this
rounding in the UserOffset we compute for each instruction.
When inline asm is present, the instruction alignment may not be known.
Constrain the maximum displacement instead in that case.
This makes it possible for CreateNewWater() and OffsetIsInRange() to
agree about the valid displacements. When they disagree, infinite
looping happens.
As always, test cases for this stuff are insane.
<rdar://problem/10660175>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147825 91177308-0d34-0410-b5e6-96231b3b80d8
The pass is prone to looping, and it is better to crash than loop
forever, even in a -Asserts build.
<rdar://problem/10660175>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147806 91177308-0d34-0410-b5e6-96231b3b80d8
AsmParser holds info specific to target parser.
AsmParserVariant holds info specific to asm variants supported by the target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147787 91177308-0d34-0410-b5e6-96231b3b80d8
this substraction will result in small negative numbers at worst which
become very large positive numbers on assignment and are thus caught by
the <=4 check on the next line. The >0 check clearly intended to catch
these as negative numbers.
Spotted by inspection, and impossible to trigger given the shift widths
that can be used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147773 91177308-0d34-0410-b5e6-96231b3b80d8
This enables basic local CSE, giving us 20% smaller code for
consumer-typeset in -O0 builds.
<rdar://problem/10658692>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147720 91177308-0d34-0410-b5e6-96231b3b80d8
file error checking. Use that to error on an unfinished cfi_startproc.
The error is not nice, but is already better than a segmentation fault.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147717 91177308-0d34-0410-b5e6-96231b3b80d8
exposed with an upcoming change will would delete the copy to return register
because there is no use! It's amazing anything works.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147715 91177308-0d34-0410-b5e6-96231b3b80d8
This eliminates a lot of constant pool entries for -O0 builds of code
with many global variable accesses.
This speeds up -O0 codegen of consumer-typeset by 2x because the
constant island pass no longer has to look at thousands of constant pool
entries.
<rdar://problem/10629774>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147712 91177308-0d34-0410-b5e6-96231b3b80d8
Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)
Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147601 91177308-0d34-0410-b5e6-96231b3b80d8
This small bit of ASM code is sufficient to do what the old algorithm did:
movq %rax, %xmm0
punpckldq (c0), %xmm0 // c0: (uint4){ 0x43300000U, 0x45300000U, 0U, 0U }
subpd (c1), %xmm0 // c1: (double2){ 0x1.0p52, 0x1.0p52 * 0x1.0p32 }
#ifdef __SSE3__
haddpd %xmm0, %xmm0
#else
pshufd $0x4e, %xmm0, %xmm1
addpd %xmm1, %xmm0
#endif
It's arguably faster. One caveat, the 'haddpd' instruction isn't very fast on
all processors.
<rdar://problem/7719814>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147593 91177308-0d34-0410-b5e6-96231b3b80d8
Now that canRealignStack() understands frozen reserved registers, it is
safe to use it for aligned spill instructions.
It will only return true if the registers reserved at the beginning of
register allocation allow for dynamic stack realignment.
<rdar://problem/10625436>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147579 91177308-0d34-0410-b5e6-96231b3b80d8
Once register allocation has started the reserved registers are frozen.
Fix the ARM canRealignStack() hook to respect the frozen register state.
Now the hook returns false if register allocation was started with frame
pointer elimination enabled.
It also returns false if register allocation started without a reserved
base pointer, and stack realignment would require a base pointer. This
bug was breaking oggenc on armv6.
No test case, an upcoming patch will use this functionality to realign
the stack for spill slots when possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147578 91177308-0d34-0410-b5e6-96231b3b80d8
versions derive from them.
- JALR64 is not needed since N64 does not emit jal.
- Add template parameter to BranchLink that sets the rt field.
- Fix the set of temporary registers for O32 and N64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147518 91177308-0d34-0410-b5e6-96231b3b80d8
(x > y) ? x : y
=>
(x >= y) ? x : y
So for something like
(x - y) > 0 : (x - y) ? 0
It will be
(x - y) >= 0 : (x - y) ? 0
This makes is possible to test sign-bit and eliminate a comparison against
zero. e.g.
subl %esi, %edi
testl %edi, %edi
movl $0, %eax
cmovgl %edi, %eax
=>
xorl %eax, %eax
subl %esi, $edi
cmovsl %eax, %edi
rdar://10633221
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147512 91177308-0d34-0410-b5e6-96231b3b80d8
This patch caused a miscompilation of oggenc because a frame pointer was
suddenly needed halfway through register allocation.
<rdar://problem/10625436>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147487 91177308-0d34-0410-b5e6-96231b3b80d8