of lea. It is better for code size (and presumably efficiency) to use:
movl $foo, %eax
rather than:
leal foo, eax
Both give a nice zero extending "move immediate" instruction, the former is just
smaller. Note that global addresses should be handled different by the x86
backend, but I chose to follow the style already in place and add more fixme's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75403 91177308-0d34-0410-b5e6-96231b3b80d8
Basically, using:
lea symbol(%rip), %rax
is not valid in -static mode, because the current RIP may not be
within 32-bits of "symbol" when an app is built partially pic and
partially static. The fix for this is to compile it to:
lea symbol, %rax
It would be better to codegen this as:
movq $symbol, %rax
but that will come next.
The hard part of fixing this bug was fixing abi-isel, which was actively
testing for the wrong behavior. Also, the RUN lines are completely impossible
to understand what they are testing. To help with this, convert the -static
x86-64 codegen tests to use filecheck. This is much more stable and makes it
more clear what the codegen is expected to be.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75382 91177308-0d34-0410-b5e6-96231b3b80d8
A side-effect of this change is asm printer is now using unified assembly. There are some minor clean ups and fixes as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75359 91177308-0d34-0410-b5e6-96231b3b80d8
of loops. Add several new functions to for working with ScalarEvolution's
add-hoc value-range analysis functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75252 91177308-0d34-0410-b5e6-96231b3b80d8
value. Adjust other code to deal with that correctly. Make
DAGTypeLegalizer::PromoteIntRes_EXTRACT_VECTOR_ELT take advantage of
this new flexibility to simplify the code and make it deal with unusual
vectors (like <4 x i1>) correctly. Fixes PR3037.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75176 91177308-0d34-0410-b5e6-96231b3b80d8
registers based on dynamic conditions. For example, X86 EBP/RBP, when used as
frame register has to be spilled in the first fixed object. It should inform
PEI this so it doesn't get allocated another stack object. Also, it should not
be spilled as other callee-saved registers but rather its spilling and restoring
are being handled by emitPrologue and emitEpilogue. Avoid spilling it twice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75116 91177308-0d34-0410-b5e6-96231b3b80d8
as an (index,bool) pair. The bool flag records whether the kill is a
PHI kill or not. This code will be used to enable splitting of live
intervals containing PHI-kills.
A slight change to live interval weights introduced an extra spill
into lsr-code-insertion (outside the critical sections). The test
condition has been updated to reflect this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75097 91177308-0d34-0410-b5e6-96231b3b80d8
* remove some old code that was needed when we'd put ESP in the scale instead of
the base of some instructions.
* Fix a bug with the P modifier in inline asm that caused us to drop it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75077 91177308-0d34-0410-b5e6-96231b3b80d8
VSETCC must define all bits, which is different than it was documented
to before. Since all targets that implement VSETCC already have this
behavior, and we don't optimize based on this, just change the
documentation. We now get nice code for vec_compare.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74978 91177308-0d34-0410-b5e6-96231b3b80d8
finishes off enough support for vector compares to get the icmp/fcmp
version of 2008-07-23-VSetCC.ll passing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74961 91177308-0d34-0410-b5e6-96231b3b80d8
than a wider one, before trying to compare their contents which will crash
if their sizes are different.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74792 91177308-0d34-0410-b5e6-96231b3b80d8
we could do this, doing so requires adjusting the demanded mask and the code isn't
doing that yet. This fixes PR4495
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74699 91177308-0d34-0410-b5e6-96231b3b80d8
Note, isUndef marker must be placed even on implicit_def def operand or else the scavenger will not ignore it. This is necessary because -O0 path does not use liveintervalanalysis, it treats implicit_def just like any other def.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74601 91177308-0d34-0410-b5e6-96231b3b80d8
allowed to be undefined when the expression is seen, we cannot enforce the
same-section requirement until the entire assembly file has been seen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74565 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid unnecessary duplication of operand 0 of X86::FpSET_ST0_80. This duplication would
cause one register to remain on the stack at the function return.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74534 91177308-0d34-0410-b5e6-96231b3b80d8
The register allocator, when it allocates a register to a virtual register defined by an implicit_def, can allocate any physical register without worrying about overlapping live ranges. It should mark all of operands of the said virtual register so later passes will do the right thing.
This is not the best solution. But it should be a lot less fragile to having the scavenger try to track what is defined by implicit_def.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74518 91177308-0d34-0410-b5e6-96231b3b80d8
an individual exhaustive evaluation reflects only the exit value
implied by an individual exit, which may differ from the actual
exit value of the loop if there are other exits. This fixes PR4477.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74447 91177308-0d34-0410-b5e6-96231b3b80d8
Not sure I understand how the temp register gets used,
but this fixes a bug and introduces no regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74446 91177308-0d34-0410-b5e6-96231b3b80d8
After much back and forth, I decided to deviate from ARM design and split LDR into 4 instructions (r + imm12, r + imm8, r + r << imm12, constantpool). The advantage of this is 1) it follows the latest ARM technical manual, and 2) makes it easier to reduce the width of the instruction later. The down side is this creates more inconsistency between the two sub-targets. We should split ARM LDR instruction in a similar fashion later. I've added a README entry for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74420 91177308-0d34-0410-b5e6-96231b3b80d8
when one of them can be converted to a trivial icmp and conditional
branch.
This addresses what is essentially a phase ordering problem.
SimplifyCFG knows how to do this transformation, but it doesn't do so
if the primary block has any instructions in it other than an icmp and
a branch. In the given testcase, the block contains other instructions,
however they are loop-invariant and can be hoisted. SimplifyCFG doesn't
have LoopInfo though, so it can't hoist them. And, it's important that
the blocks be merged before LoopRotation, as it doesn't support
multiple-exit loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74396 91177308-0d34-0410-b5e6-96231b3b80d8
inserted to replace that value must dominate all of of the basic
blocks associated with the uses of the value in the PHI, not just
one of them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74376 91177308-0d34-0410-b5e6-96231b3b80d8
implementation primarily differs from the former in that the asmprinter
doesn't make a zillion decisions about whether or not something will be
RIP relative or not. Instead, those decisions are made by isel lowering
and propagated through to the asm printer. To achieve this, we:
1. Represent RIP relative addresses by setting the base of the X86 addr
mode to X86::RIP.
2. When ISel Lowering decides that it is safe to use RIP, it lowers to
X86ISD::WrapperRIP. When it is unsafe to use RIP, it lowers to
X86ISD::Wrapper as before.
3. This removes isRIPRel from X86ISelAddressMode, representing it with
a basereg of RIP instead.
4. The addressing mode matching logic in isel is greatly simplified.
5. The asmprinter is greatly simplified, notably the "NotRIPRel" predicate
passed through various printoperand routines is gone now.
6. The various symbol printing routines in asmprinter now no longer infer
when to emit (%rip), they just print the symbol.
I think this is a big improvement over the previous situation. It does have
two small caveats though: 1. I implemented a horrible "no-rip" modifier for
the inline asm "P" constraint modifier. This is a short term hack, there is
a much better, but more involved, solution. 2. I had to xfail an
-aggressive-remat testcase because it isn't handling the use of RIP in the
constant-pool reading instruction. This specific test is easy to fix without
-aggressive-remat, which I intend to do next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74372 91177308-0d34-0410-b5e6-96231b3b80d8
Also, added a pattern for the thumb-2 MOV of shifted immediate since that can encode immediates not encodable by the 16-bit immediate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74288 91177308-0d34-0410-b5e6-96231b3b80d8
test suite. Remove documentation for --with-f2c, which
is no longer supported. Remove information about obtaining
tcl/expect, which ship with Mac OS X by default since
10.4.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74271 91177308-0d34-0410-b5e6-96231b3b80d8
- Includes some DG tests in test/MC/AsmParser, which are rather primitive since
we don't have a -verify mode yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74139 91177308-0d34-0410-b5e6-96231b3b80d8
computations in loops with multiple exits.
Adjust the testcase for PR4436 so that the relevant portion isn't
optimized away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74073 91177308-0d34-0410-b5e6-96231b3b80d8
terminator, instead of after the last phi. This fixes a bug
exposed by ScalarEvolution analyzing more kinds of loops.
This fixes PR4436.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74072 91177308-0d34-0410-b5e6-96231b3b80d8
trip counts in more cases.
Generalize ScalarEvolution's isLoopGuardedByCond code to recognize
And and Or conditions, splitting the code out into an
isNecessaryCond helper function so that it can evaluate Ands and Ors
recursively, and make SCEVExpander be much more aggressive about
hoisting instructions out of loops.
test/CodeGen/X86/pr3495.ll has an additional instruction now, but
it appears to be due to an arbitrary register allocation difference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74048 91177308-0d34-0410-b5e6-96231b3b80d8
generating LLVM IR; it is correct in the code as written
to use 8-byte-aligned operations to copy Key in bar. Formerly
the gcc inliner was run, now it isn't. I don't think it's
possible to preserve this as a pure FE test. Adding -O2 lets
the llvm optimizers get rid of the 8-byte-aligned stores, at least.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73981 91177308-0d34-0410-b5e6-96231b3b80d8
generated code was apparently doing stores directly into the return value
aggregate; now, it's doing a copy from a compiler-generated static object.
That object is initialized using [4 x i8] which breaks the test. I believe
this change preserves the original point of the test.
Of course it would be better for the code to do stores directly into the
return aggregate, but that is not what happens at -O0; the llvm optimizers
seem to do that on x86 but not on ppc32, possibly because of the explicit
padding (which is unavoidable). I think it must have been being done by
a gcc optimizer pass before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73972 91177308-0d34-0410-b5e6-96231b3b80d8
std::pair<double, float*>
is 16 bytes on darwin-powerpc, but not always.
See testcase for full weirdness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73874 91177308-0d34-0410-b5e6-96231b3b80d8
blocks, and also exit blocks with multiple conditions (combined
with (bitwise) ands and ors). It's often infeasible to compute an
exact trip count in such cases, but a useful upper bound can often
be found.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73866 91177308-0d34-0410-b5e6-96231b3b80d8
a global with that gets printed with the :mem modifier. All operands to lea's
should be handled with the lea32mem operand kind, and this allows the TLS stuff
to do this. There are several better ways to do this, but I went for the minimal
change since I can't really test this (beyond make check).
This also makes the use of EBX explicit in the operand list in the 32-bit,
instead of implicit in the instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73834 91177308-0d34-0410-b5e6-96231b3b80d8
SCEVUnknowns with identical Instructions to be equal. This allows
it to analze cases such as the attached testcase, where the front-end
has cloned the loop controlling expression. Along with r73805, this
lets IndVarSimplify eliminate all the sign-extend casts in the
loop in the attached testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73807 91177308-0d34-0410-b5e6-96231b3b80d8
expression in IVUsers, because in the case of a use of a non-linear
addrec outside of a loop, this causes the addrec to be evaluated as
a linear addrec.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73774 91177308-0d34-0410-b5e6-96231b3b80d8
as if they were multiple uses of the same instruction. This interacts
well with the existing loadpre that j-t does to open up many new jump
threads earlier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73768 91177308-0d34-0410-b5e6-96231b3b80d8
while experimenting. I'm reasonably sure this is correct, but please
tell me if these instructions have some strange property which makes this
change unsafe.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73746 91177308-0d34-0410-b5e6-96231b3b80d8
as signed max tests. Along with r73717, this helps CodeGen avoid
emitting code for a maximum operation for this class of loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73718 91177308-0d34-0410-b5e6-96231b3b80d8
casted induction variables in cases where the cast
isn't foldable. It ended up being a pessimization in
many cases. This could be fixed, but it would require
a bunch of complicated code in IVUsers' clients. The
advantages of this approach aren't visible enough to
justify it at this time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73706 91177308-0d34-0410-b5e6-96231b3b80d8
If C is a single bit and the and gets analyzed as a truncate and
zero-extend, the xor can be represnted as an add.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73664 91177308-0d34-0410-b5e6-96231b3b80d8
move loads back past a check that the load address
is valid, see new testcase. The test that went
in with 72661 has exactly this case, except that
the conditional it's moving past is checking
something else; I've settled for changing that
test to reference a global, not a pointer. It
may be possible to scan all the tests you pass and
make sure none of them are checking any component
of the address, but it's not trivial and I'm not
trying to do that here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73632 91177308-0d34-0410-b5e6-96231b3b80d8
that gets recognized with a SCEVZeroExtendExpr must be an And
with a low-bits mask. With r73540, this is no longer the case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73594 91177308-0d34-0410-b5e6-96231b3b80d8
obscuring what would otherwise be a low-bits mask. Use ComputeMaskedBits
to compute what ShrinkDemandedConstant knew about to reconstruct a
low-bits mask value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73540 91177308-0d34-0410-b5e6-96231b3b80d8
(this is the case when we have thumb vararg function with single
callee-saved register, which is handled separately).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73529 91177308-0d34-0410-b5e6-96231b3b80d8
TurnCopyIntoImpDef turns a copy into implicit_def and remove the val# defined by it. This causes an scavenger assertion later if the def reaches other blocks. Disable the transformation if the value live interval extends beyond its def block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73478 91177308-0d34-0410-b5e6-96231b3b80d8
support for x86, and UMULO/SMULO for many architectures, including PPC
(PR4201), ARM, and Cell. The resulting expansion isn't perfect, but it's
not bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73477 91177308-0d34-0410-b5e6-96231b3b80d8
failures.
To support this, add some utility functions to Type to help support
vector/scalar-independent code. Change ConstantInt::get and
ConstantFP::get to support vector types, and add an overload to
ConstantInt::get that uses a static IntegerType type, for
convenience.
Introduce a new getConstant method for ScalarEvolution, to simplify
common use cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73431 91177308-0d34-0410-b5e6-96231b3b80d8