the generated FastISel. X86 doesn't need to generate code to match ADD16ri8
since ADD16ri will do just fine. This is a small codesize win in the generated
instruction selector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129692 91177308-0d34-0410-b5e6-96231b3b80d8
simplifying them and exposing more information to tblgen. It would be nice
if other target authors adopted this as well, particularly arm since it has fastisel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129676 91177308-0d34-0410-b5e6-96231b3b80d8
kind of predicate: one that is specific to imm nodes. The predicate function
specified here just checks an int64_t directly instead of messing around with
SDNode's. The virtue of this is that it means that fastisel and other things
can reason about these predicates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129675 91177308-0d34-0410-b5e6-96231b3b80d8
structure and fix some fixmes. We now have a TreePredicateFn class
that handles all of the decoding of these things. This is an internal
cleanup that has no impact on the code generated by tblgen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129670 91177308-0d34-0410-b5e6-96231b3b80d8
2. implement rdar://9289501 - fast isel should fold trivial multiplies to shifts
3. teach tblgen to handle shift immediates that are different sizes than the
shifted operands, eliminating some code from the X86 fast isel backend.
4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function
instead of FastEmit_ri to simplify code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129666 91177308-0d34-0410-b5e6-96231b3b80d8
when we have a global variable base an an index. Instead, just give up on
folding the global variable.
Before we'd geenrate:
_test: ## @test
## BB#0:
movq _rtx_length@GOTPCREL(%rip), %rax
leaq (%rax), %rax
addq %rdi, %rax
movzbl (%rax), %eax
ret
now we generate:
_test: ## @test
## BB#0:
movq _rtx_length@GOTPCREL(%rip), %rax
movzbl (%rax,%rdi), %eax
ret
The difference is even more significant when there is a scale
involved.
This fixes rdar://9289558 - total fail with addr mode formation at -O0/x86-64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129664 91177308-0d34-0410-b5e6-96231b3b80d8
less trivial things) into a dummy lea. Before we generated:
_test: ## @test
movq _G@GOTPCREL(%rip), %rax
leaq (%rax), %rax
ret
now we produce:
_test: ## @test
movq _G@GOTPCREL(%rip), %rax
ret
This is part of rdar://9289558
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129662 91177308-0d34-0410-b5e6-96231b3b80d8
Change ELF systems to use CFI for producing the EH tables. This reduces the
size of the clang binary in Debug builds from 690MB to 679MB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129571 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have a first-class way to represent unaligned loads, the unaligned
load intrinsics are superfluous.
First part of <rdar://problem/8460511>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129401 91177308-0d34-0410-b5e6-96231b3b80d8
with the newer, cleaner model. It uses the IAPrinter class to hold the
information that is needed to match an instruction with its alias. This also
takes into account the available features of the platform.
There is one bit of ugliness. The way the logic determines if a pattern is
unique is O(N**2), which is gross. But in reality, the number of items it's
checking against isn't large. So while it's N**2, it shouldn't be a massive time
sink.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129110 91177308-0d34-0410-b5e6-96231b3b80d8
Define most shift masks incrementally to reduce the redundant
hard-coding. Introduce new shift for the VEX flags to replace the
magic constant 32 in various places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128822 91177308-0d34-0410-b5e6-96231b3b80d8
I'm backing this out for the second time. It was supposed to be fixed by r128164, but the mingw self-host must be defeating the fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128181 91177308-0d34-0410-b5e6-96231b3b80d8
the alias of an InstAlias instead of the thing being aliased. Because we need to
know the features that are valid for an InstAlias.
This is part of a work-in-progress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127986 91177308-0d34-0410-b5e6-96231b3b80d8
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
switch(x) {
case 1: return f1();
case 2: return f2();
case 3: return f3();
case 4: return f4();
case 5: return f5();
case 6: return f6();
}
}
=>
LBB0_2: ## %sw.bb
callq _f1
popq %rbp
ret
LBB0_3: ## %sw.bb1
callq _f2
popq %rbp
ret
LBB0_4: ## %sw.bb3
callq _f3
popq %rbp
ret
This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:
sw.bb7: ; preds = %entry
%call8 = tail call i32 @f5() nounwind
br label %return
sw.bb9: ; preds = %entry
%call10 = tail call i32 @f6() nounwind
br label %return
return:
%retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
ret i32 %retval.0
This allows codegen to generate better code like this:
LBB0_2: ## %sw.bb
jmp _f1 ## TAILCALL
LBB0_3: ## %sw.bb1
jmp _f2 ## TAILCALL
LBB0_4: ## %sw.bb3
jmp _f3 ## TAILCALL
rdar://9147433
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127953 91177308-0d34-0410-b5e6-96231b3b80d8
not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127951 91177308-0d34-0410-b5e6-96231b3b80d8
comparisons on x86. Essentially, the way this works is that SUB+SBB sets
the relevant flags the same way a double-width CMP would.
This is a substantial improvement over the generic lowering in LLVM. The output
is also shorter than the gcc-generated output; I haven't done any detailed
benchmarking, though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127852 91177308-0d34-0410-b5e6-96231b3b80d8
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.
This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127766 91177308-0d34-0410-b5e6-96231b3b80d8
in the instruction tables and fixed a few bugs that
were causing decode conflicts. Rudimentary tests
are coming up in the next patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127646 91177308-0d34-0410-b5e6-96231b3b80d8