not unrolling loops that contain calls that would be better off getting inlined. This mostly
comes up when an interleaved devirtualization pass has devirtualized a call which the inliner
will inline on a future pass. Thus, rather than blocking all loops containing calls, add
a metric for "inline candidate calls" and block loops containing those instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113535 91177308-0d34-0410-b5e6-96231b3b80d8
instruction in the class would be decoded to. Or zero if the number of
uOPs must be determined dynamically.
This will be used to determine the cost-effectiveness of predicating a
micro-coded instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113513 91177308-0d34-0410-b5e6-96231b3b80d8
don't use any InlineCostAnalyzer state, and are useful for other clients who don't necessarily want to use
all of InlineCostAnalyzer's logic, some of which is fairly inlining-specific.
No intended functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113499 91177308-0d34-0410-b5e6-96231b3b80d8
the VST pseudos. The VLD/VST scheduling still needs work (see pr6722), but
at least we shouldn't confuse the loads with the stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113473 91177308-0d34-0410-b5e6-96231b3b80d8
uses MMX, even if it also uses other things) from InstrSSE
into InstrMMX. No (intended) functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113462 91177308-0d34-0410-b5e6-96231b3b80d8
operand from the pseudo instruction to the new instruction as an implicit use.
This will preserve any other flags (e.g., kill) on the operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113456 91177308-0d34-0410-b5e6-96231b3b80d8
for integer and fp constants. Implement todo to use vfp3 instructions
to materialize easy constants if we can.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113453 91177308-0d34-0410-b5e6-96231b3b80d8
For VLD3/VLD4 with double-spaced registers, add the implicit use of the
super register for both the instruction loading the even registers and the
instruction loading the odd registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113452 91177308-0d34-0410-b5e6-96231b3b80d8
unrolling threshold to the optimize-for-size threshold. Basically, for loops containing calls, unrolling
can still be profitable as long as the loop is REALLY small.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113439 91177308-0d34-0410-b5e6-96231b3b80d8
Omission of memory form of PI2PD is intentional; this
does not use an MMX register and does not put the chip
into MMX mode (PI2PS, oddly enough, does).
Operands of PI2PS follow the gcc builtin, not Intel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113388 91177308-0d34-0410-b5e6-96231b3b80d8
modules are instantiated in them. If the context is deleted, all of its owned
modules are also deleted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113374 91177308-0d34-0410-b5e6-96231b3b80d8
nodes to emit shuffles and don't do isel mask matching anymore.
- Add the selection of the remaining shuffle opcode (movddup)
- Introduce two new functions to "recognize" where we may get
potential folds and add several comments to them explaining why
they are not yet in the desidered shape.
- Add more patterns to fallback the case where we select
a specific shuffle opcode as if it could fold a load, but it
can't, so remap to a valid instruction.
- Add a couple of FIXMEs to address in the following days once
there's a good solution to the current folding problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113369 91177308-0d34-0410-b5e6-96231b3b80d8
option to disable base pointer usage, pay attention to it when deciding
if we can realign (if no base pointer and VLAs, we can't).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113366 91177308-0d34-0410-b5e6-96231b3b80d8
AliasAnalysis, and some code for implementing the new query on top of
existing implementations by making standard alias and getModRefInfo
queries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113329 91177308-0d34-0410-b5e6-96231b3b80d8
The threshold value of 50 is arbitrary, and I chose it simply by analogy to the inlining thresholds, where
the baseline unrolling threshold is slightly smaller than the baseline inlining threshold. This could
undoubtedly use some tuning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113306 91177308-0d34-0410-b5e6-96231b3b80d8
LiveIntervals already adds <imp-def> operands for super-registers when a subreg
def defines the whole register. Thus, it is not necessary to do it again when
rewriting.
In fact, the super-register imp-defs caused miscompilations because the late
scheduler couldn't see that the super-register was read.
We still add super-reg <imp-use,kill> operands when rewriting virtuals to
physicals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113299 91177308-0d34-0410-b5e6-96231b3b80d8
register must be one of the destination registers for the load. Otherwise,
the tLDM instruction will write-back to the base register, which isn't what's
desired (otherwise, we'd have a t2LDM_UPD instead).
rdar://8394087
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113297 91177308-0d34-0410-b5e6-96231b3b80d8
switch to using a ManagedStatic for the global PassRegistry instead of a
ManagedCleanup, and fix a destruction ordering bug this exposed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113283 91177308-0d34-0410-b5e6-96231b3b80d8
turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have
to delete the original sqrt as well. Not doing so causes us to do
two sqrt's when building with -fmath-errno (the default on linux).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113260 91177308-0d34-0410-b5e6-96231b3b80d8
Switch from isWeakForLinker to mayBeOverridden which is more accurate.
Add more statistics and debugging info. Add comments. Move static function
outside anonymous namespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113190 91177308-0d34-0410-b5e6-96231b3b80d8
Fix zeroExtend and signExtend to support empty sets, and to return the smallest
possible result set which contains the extension of each element in their
inputs. For example zext i8 [100, 10) to i16 is now [0, 256), not i16 [100, 10)
which contains 63446 members.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113187 91177308-0d34-0410-b5e6-96231b3b80d8
always be disambiguated as sldtw. sldtw and sldtq with
a mem operands have the same effect, but sldtw is more
compact. Force it to sldtw, resolving rdar://8017530
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113186 91177308-0d34-0410-b5e6-96231b3b80d8
of a mneumonic, report operand errors with better location
info. For example, we now report:
t.s:6:14: error: invalid operand for instruction
cwtl $1
^
but we fail for common cases like:
t.s:11:4: error: invalid operand for instruction
addl $1, $1
^
because we don't know if this is supposed to be the reg/imm or imm/reg
form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113178 91177308-0d34-0410-b5e6-96231b3b80d8
failed because a subtarget feature was not enabled. Use this to
remove a bunch of hacks from the X86AsmParser for rejecting things
like popfl in 64-bit mode. Previously these hacks weren't needed,
but were important to get a message better than "invalid instruction"
when used in the wrong mode.
This also fixes bugs where pushal would not be rejected correctly in
32-bit mode (just pusha).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113166 91177308-0d34-0410-b5e6-96231b3b80d8
into the middle of the class, and rework how the different sections of
the generated file are conditionally included for simplicity.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113163 91177308-0d34-0410-b5e6-96231b3b80d8
in the duplicated block instead of duplicating them.
Duplicating them into the end of the loop and the preheader
means that we got a phi node in the header of the loop,
which prevented LICM from hoisting them. GVN would
usually come around later and merge the duplicated
instructions so we'd get reasonable output... except that
anything dependent on the shoulda-been-hoisted value can't
be hoisted. In PR5319 (which this fixes), a memory value
didn't get promoted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113134 91177308-0d34-0410-b5e6-96231b3b80d8
Loop::hasLoopInvariantOperands method. Remove
a useless and confusing Loop::isLoopInvariant(Instruction)
method, which didn't do what you thought it did.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113133 91177308-0d34-0410-b5e6-96231b3b80d8
strong functions first to make sure they're the canonical definitions and then
do a second pass looking only for weak functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113104 91177308-0d34-0410-b5e6-96231b3b80d8
Since mem2reg isn't run at -O0, we get a ton of reloads from the stack,
for example, before, this code:
int foo(int x, int y, int z) {
return x+y+z;
}
used to compile into:
_foo: ## @foo
subq $12, %rsp
movl %edi, 8(%rsp)
movl %esi, 4(%rsp)
movl %edx, (%rsp)
movl 8(%rsp), %edx
movl 4(%rsp), %esi
addl %edx, %esi
movl (%rsp), %edx
addl %esi, %edx
movl %edx, %eax
addq $12, %rsp
ret
Now we produce:
_foo: ## @foo
subq $12, %rsp
movl %edi, 8(%rsp)
movl %esi, 4(%rsp)
movl %edx, (%rsp)
movl 8(%rsp), %edx
addl 4(%rsp), %edx ## Folded load
addl (%rsp), %edx ## Folded load
movl %edx, %eax
addq $12, %rsp
ret
Fewer instructions and less register use = faster compiles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113102 91177308-0d34-0410-b5e6-96231b3b80d8
Clobber ranges are no longer used when joining physical registers.
Instead, all aliases are checked for interference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113084 91177308-0d34-0410-b5e6-96231b3b80d8
location is being re-stored to the memory location. We would get
a dangling pointer from the SSAUpdate data structure and miss a
use. This fixes PR8068
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113042 91177308-0d34-0410-b5e6-96231b3b80d8
checking each standalone condition and decide whether emit target
specific nodes or remove the condition if it's already matched before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113031 91177308-0d34-0410-b5e6-96231b3b80d8
invertible. ScalarEvolution's folding routines don't always succeed
in canonicalizing equal expressions to a single canonical form, and
this can cause these asserts to fail, even though there's no actual
correctness problem. This fixes PR8066.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113021 91177308-0d34-0410-b5e6-96231b3b80d8
"Use target specific nodes instead of relying in unpckl and
unpckh pattern fragments during isel time. Also place a
depth limit in getShuffleScalarElt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113020 91177308-0d34-0410-b5e6-96231b3b80d8
overload UserInInstr. Explicitly check Allocatable. The early exit in the
condition will mean the performance impact of the extra test should be
minimal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113016 91177308-0d34-0410-b5e6-96231b3b80d8
mask pattern fragment", which depends on r112934, which introduced some infinite
loop and select failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112998 91177308-0d34-0410-b5e6-96231b3b80d8
solve the root problem, but it corrects the bug in the code I added to
support legalizing in the case where the non-extended type is also legal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112997 91177308-0d34-0410-b5e6-96231b3b80d8
"For ARM stack frames that utilize variable sized objects and have either
large local stack areas or require dynamic stack realignment, allocate a
base register via which to access the local frame. This allows efficient
access to frame indices not accessible via the FP (either due to being out
of range or due to dynamic realignment) or the SP (due to variable sized
object allocation). In particular, this greatly improves efficiency of access
to spill slots in Thumb functions which contain VLAs."
r112986 fixed a latent bug exposed by the above.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112989 91177308-0d34-0410-b5e6-96231b3b80d8
slot.
Teach it to also check for early clobbered aliases, and early clobber operands
following the current operand.
This fixes the miscompilation in PR8044 where EC registers eax and ecx were
being used for inputs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112988 91177308-0d34-0410-b5e6-96231b3b80d8
alignment should be performed. Otherwise dynamic realignment may trigger
when the register allocator has already used the frame pointer as a general
purpose register. That is, we need to make sure that the list of reserved
registers doesn't change after register allocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112986 91177308-0d34-0410-b5e6-96231b3b80d8
instructions prior to regalloc. Since it's getting a little close to
the 2.8 branch deadline, I'll have to leave the rest of the instructions
handled by the NEONPreAllocPass for now, but I didn't want to leave half
of the VLD instructions converted and the other half not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112983 91177308-0d34-0410-b5e6-96231b3b80d8
Original commit message:
Use the SSAUpdator to turn calls to eh.exception that are not in a
landing pad into uses of registers rather than loads from a stack
slot. Doesn't touch the 'orrible hack code - Bill needs to persuade
me harder :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112952 91177308-0d34-0410-b5e6-96231b3b80d8
vabd intrinsic and add and/or zext operations. In the case of vaba, this
also avoids the need for a DAG combine pattern to combine vabd with add.
Update tests. Auto-upgrade the old intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112941 91177308-0d34-0410-b5e6-96231b3b80d8
- Teach getShuffleScalarElt how to handle more target
specific nodes, so the DAGCombine can make use of it.
- Add another hack to avoid the node update problem
during legalization. More description on the comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112934 91177308-0d34-0410-b5e6-96231b3b80d8
Remove #uses comments from functions: they we're padded out to column 50
and were potentially confusing for externally visible functions.
going further, remove the "<i8**> [#uses=3]" comments entirely. They
add a lot of noise, confuse people about what the IR is, and don't add
any particular value. When the types are long it makes it really really
hard to read IR.
If someone is interested in this sort of thing, the right way to do this
is to implement an AsmAnnotationWriter that produces the same output, and
add a flag to llvm-dis (only) to produce this output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112899 91177308-0d34-0410-b5e6-96231b3b80d8
and were potentially confusing for externally visible functions.
going further, remove the "<i8**> [#uses=3]" comments entirely. They
add a lot of noise, confuse people about what the IR is, and don't add
any particular value. When the types are long it makes it really really
hard to read IR.
If someone is interested in this sort of thing, the right way to do this
is to implement an AsmAnnotationWriter that produces the same output, and
add a flag to llvm-dis (only) to produce this output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112894 91177308-0d34-0410-b5e6-96231b3b80d8
I wasn't able to convince myself that all GetMainExecutable
implementations always return absolute paths; this prevents
unexpected behavior in case they ever don't.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112888 91177308-0d34-0410-b5e6-96231b3b80d8
large local stack areas or require dynamic stack realignment, allocate a
base register via which to access the local frame. This allows efficient
access to frame indices not accessible via the FP (either due to being out
of range or due to dynamic realignment) or the SP (due to variable sized
object allocation). In particular, this greatly improves efficiency of access
to spill slots in Thumb functions which contain VLAs.
rdar://7352504
rdar://8374540
rdar://8355680
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112883 91177308-0d34-0410-b5e6-96231b3b80d8
capacity and remove the workaround in SmallVector<T,0>. There are some
theoretical benefits to a N->2N+1 growth policy anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112870 91177308-0d34-0410-b5e6-96231b3b80d8
there are clearly no stores between the load and the store. This fixes
this miscompile reported as PR7833.
This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is
safe, but awkward to prove safe. Move it to X86's README.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112861 91177308-0d34-0410-b5e6-96231b3b80d8