This folding happens as early as possible for performance reasons, and to make sure it isn't foiled by other transforms (e.g. forming FMAs).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163519 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we checked if the register is def'd in a block via the def/use list a
nd walked the list of kills to check if the register is killed in a block. Both
of these checks can be made much cheaper by walking the block first and
recording all defs and kills.
This reduces the compile time of the test case from PR13651 from 40s to 15s at
-O2. The compile time is still dominated by LV updating but now the main culprit
is SparseBitVector's slowness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163478 91177308-0d34-0410-b5e6-96231b3b80d8
For some reason .lcomm uses byte alignment and .comm log2 alignment so we can't
use the same setting for both. Fix this by reintroducing the LCOMM enum.
I verified this against mingw's gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163420 91177308-0d34-0410-b5e6-96231b3b80d8
- Darwin lied about not supporting .lcomm and turned it into zerofill in the
asm parser. Push the zerofill-conversion down into macho-specific code.
- This makes the tri-state LCOMMType enum superfluous, there are no targets
without .lcomm.
- Do proper error reporting when trying to use .lcomm with alignment on a target
that doesn't support it.
- .comm and .lcomm alignment was parsed in bytes on COFF, should be power of 2.
- Fixes PR13755 (.lcomm crashes on ELF).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163395 91177308-0d34-0410-b5e6-96231b3b80d8
The RegisterCoalescer understands overlapping live ranges where one
register is defined as a copy of the other. With this change, register
allocators using LiveRegMatrix can do the same, at least for copies
between physical and virtual registers.
When a physreg is defined by a copy from a virtreg, allow those live
ranges to overlap:
%CL<def> = COPY %vreg11:sub_8bit; GR32_ABCD:%vreg11
%vreg13<def,tied1> = SAR32rCL %vreg13<tied0>, %CL<imp-use,kill>
We can assign %vreg11 to %ECX, overlapping the live range of %CL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163336 91177308-0d34-0410-b5e6-96231b3b80d8
We will soon allow virtual register live ranges to overlap regunit live
ranges when the physreg is defined as a copy of the virtreg:
%EAX = COPY %vreg5
FOO %vreg5
BAR %EAX<kill>
There is no real interference since %vreg5 and %EAX have the same value
where they overlap.
This patch prevents addKillFlags from adding virtreg kill flags to FOO
where the assigned physreg is overlapping the virtual register live
range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163335 91177308-0d34-0410-b5e6-96231b3b80d8
Kill flags are difficult to maintain, and liveness queries are better
handled by live intervals.
Kill flags are reinserted after register allocation by addKillFlags().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163334 91177308-0d34-0410-b5e6-96231b3b80d8
Implicit uses can be dynamically tied to defs. This will soon be used
for predicated instructions on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163177 91177308-0d34-0410-b5e6-96231b3b80d8
The MachineOperand::TiedTo field was maintained, but not used.
This patch enables it in isRegTiedToDefOperand() and
isRegTiedToUseOperand() which are the actual functions use by the
register allocator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163153 91177308-0d34-0410-b5e6-96231b3b80d8
After much agonizing, use a full 4 bits of precious MachineOperand space
to encode this. This uses existing padding, and doesn't grow
MachineOperand beyond its current 32 bytes.
This allows tied defs among the first 15 operands on a normal
instruction, just like the current MCInstrDesc constraint encoding.
Inline assembly needs to be able to tie more than the first 15 operands,
and gets special treatment.
Tied uses can appear beyond 15 operands, as long as they are tied to a
def that's in range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163151 91177308-0d34-0410-b5e6-96231b3b80d8
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder, or both.
Patch by Tyler Nowicki!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163150 91177308-0d34-0410-b5e6-96231b3b80d8
No test case unfortunately as i couldn't find a target which fit all
the conditions needed to hit this code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163075 91177308-0d34-0410-b5e6-96231b3b80d8
Manage tied operands entirely internally to MachineInstr. This makes it
possible to change the representation of tied operands, as I will do
shortly.
The constraint that tied uses and defs must be in the same order was too
restrictive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163021 91177308-0d34-0410-b5e6-96231b3b80d8
I was too optimistic, inline asm can have tied operands that don't
follow the def order.
Fixes PR13742.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162998 91177308-0d34-0410-b5e6-96231b3b80d8
because it does not support CMOV of vectors. To implement this efficientlyi, we broadcast the condition bit and use a sequence of NAND-OR
to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2).
rdar://12201387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162926 91177308-0d34-0410-b5e6-96231b3b80d8
When a MachineInstr is constructed, its implicit operands are added
first, then the explicit operands are inserted before the implicits.
MCInstrDesc has oprand flags like early clobber and operand ties that
apply to the explicit operands.
Don't look at those flags when the implicit operands are first added in
the explicit operands's positions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162910 91177308-0d34-0410-b5e6-96231b3b80d8
When there are multiple tied use-def pairs on an inline asm instruction,
the tied uses must appear in the same order as the defs.
It is possible to write an LLVM IR inline asm instruction that breaks
this constraint, but there is no reason for a front end to emit the
operands out of order.
The gnu inline asm syntax specifies tied operands as a single read/write
constraint "+r", so ouf of order operands are not possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162878 91177308-0d34-0410-b5e6-96231b3b80d8
For normal instructions, isTied() is set automatically by addOperand(),
based on MCInstrDesc, but inline asm has tied operands outside the
descriptor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162869 91177308-0d34-0410-b5e6-96231b3b80d8
Ordered memory operations are more constrained than volatile loads and
stores because they must be ordered with respect to all other memory
operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162861 91177308-0d34-0410-b5e6-96231b3b80d8
It is technically allowed to move a normal load across a volatile load,
but probably not a good idea.
It is not allowed to move a load across an atomic load with
Ordering > Monotonic, and we model those with MOVolatile as well.
I recently removed the mayStore flag from atomic load instructions, so
they don't need a pseudo-opcode. This patch makes up for the difference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162857 91177308-0d34-0410-b5e6-96231b3b80d8
The operands on an INLINEASM machine instruction are divided into groups
headed by immediate flag operands. Verify this structure.
Extract verifyTiedOperands(), and only call it for non-inlineasm
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162849 91177308-0d34-0410-b5e6-96231b3b80d8
WHen running with -verify-machineinstrs, check that tied operands come
in matching use/def pairs, and that they are consistent with MCInstrDesc
when it applies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162816 91177308-0d34-0410-b5e6-96231b3b80d8
The isTied bit is set automatically when a tied use is added and
MCInstrDesc indicates a tied operand. The tie is broken when one of the
tied operands is removed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162814 91177308-0d34-0410-b5e6-96231b3b80d8
While in SSA form, a MachineInstr can have pairs of tied defs and uses.
The tied operands are used to represent read-modify-write operands that
must be assigned the same physical register.
Previously, tied operand pairs were computed from fixed MCInstrDesc
fields, or by using black magic on inline assembly instructions.
The isTied flag makes it possible to add tied operands to any
instruction while getting rid of (some of) the inlineasm magic.
Tied operands on normal instructions are needed to represent predicated
individual instructions in SSA form. An extra <tied,imp-use> operand is
required to represent the output value when the instruction predicate is
false.
Adding a predicate to:
%vreg0<def> = ADD %vreg1, %vreg2
Will look like:
%vreg0<tied,def> = ADD %vreg1, %vreg2, pred:3, %vreg7<tied,imp-use>
The virtual register %vreg7 is the value given to %vreg0 when the
predicate is false. It will be assigned the same physreg as %vreg0.
This commit adds the isTied flag and sets it based on MCInstrDesc when
building an instruction. The flag is not used for anything yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162774 91177308-0d34-0410-b5e6-96231b3b80d8
Register operands are manipulated by a lot of target-independent code,
and it is not always possible to preserve target flags. That means it is
not safe to use target flags on register operands.
None of the targets in the tree are using register operand target flags.
External targets should be using immediate operands to annotate
instructions with operand modifiers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162770 91177308-0d34-0410-b5e6-96231b3b80d8
These extra flags are not required to properly order the atomic
load/store instructions. SelectionDAGBuilder chains atomics as if they
were volatile, and SelectionDAG::getAtomic() sets the isVolatile bit on
the memory operands of all atomic operations.
The volatile bit is enough to order atomic loads and stores during and
after SelectionDAG.
This means we set mayLoad on atomic_load, mayStore on atomic_store, and
mayLoad+mayStore on the remaining atomic read-modify-write operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162733 91177308-0d34-0410-b5e6-96231b3b80d8
In SelectionDAGLegalize::ExpandLegalINT_TO_FP, expand INT_TO_FP nodes without
using any f64 operations if f64 is not a legal type.
Patch by Stefan Kristiansson.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162728 91177308-0d34-0410-b5e6-96231b3b80d8
It is legal to have a register node as an explicit operand, it shouldn't
be counted as an implicit use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162591 91177308-0d34-0410-b5e6-96231b3b80d8
the case of multiple edges from one block to another.
A simple example is a switch statement with multiple values to the same
destination. The definition of an edge is modified from a pair of blocks to
a pair of PredBlock and an index into the successors.
Also set the weight correctly when building SelectionDAG from LLVM IR,
especially when converting a Switch.
IntegersSubsetMapping is updated to calculate the weight for each cluster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162572 91177308-0d34-0410-b5e6-96231b3b80d8
output (we're emitting a specification already and the information
isn't changing) and we're not in old gdb compat mode.
Saves 1% on the debug information for a build of llvm.
Fixes rdar://11043421
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162493 91177308-0d34-0410-b5e6-96231b3b80d8
The logic for recomputing latency based on a ScheduleDAG edge was
shady. This bypasses the problem by requiring the client to provide
operand indices. This ensures consistent use of the machine model's
API.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162420 91177308-0d34-0410-b5e6-96231b3b80d8
Based on CR feedback from r162301 and Craig Topper's refactoring in r162347
here are a few other places that could use the same API (& in one instance drop
a Function.h dependency).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162367 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG's 'init' has not been called when the SelectionDAGBuilder is
constructed (in SelectionDAGISel's constructor), so this was previously always
initialized with 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162333 91177308-0d34-0410-b5e6-96231b3b80d8
Even looking at the revision history I couldn't quite piece together why this
cast was ever written in the first place, but I assume it was because of some
change in the inheritance, perhaps this function was reimplemented in a
derived type & this caller was meant to get the base version (& it wasn't
virtual)?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162301 91177308-0d34-0410-b5e6-96231b3b80d8
The getSumForBlock function was quadratic in the number of successors
because getSuccWeight would perform a linear search for an already known
iterator.
This patch was originally committed as r161460, but reverted again
because of assertion failures. Now that duplicate Machine CFG edges have
been eliminated, this works properly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162233 91177308-0d34-0410-b5e6-96231b3b80d8
IR that hasn't been through SimplifyCFG can look like this:
br i1 %b, label %r, label %r
Make sure we don't create duplicate Machine CFG edges in this case.
Fix the machine code verifier to accept conditional branches with a
single CFG edge.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162230 91177308-0d34-0410-b5e6-96231b3b80d8
The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it
consists purely of get_vector_elts from one or two source vectors. If
so, it either makes a concat_vectors node or a shufflevector node.
However, it doesn't check the element type width of the underlying
vector, so if you have this sequence:
Node0: v4i16 = ...
Node1: i32 = extract_vector_elt Node0
Node2: i32 = extract_vector_elt Node0
Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...
It will attempt to:
Node0: v4i16 = ...
NewNode1: v16i8 = concat_vectors Node0, ...
Where this is actually invalid because the element width is completely
different. This causes an assertion failure on DAG legalization stage.
Fix:
If output item type of BUILD_VECTOR differs from input item type.
Make concat_vectors based on input element type and then bitcast it to the output vector type. So the case described above will transformed to:
Node0: v4i16 = ...
NewNode1: v8i16 = concat_vectors Node0, ...
NewNode2: v16i8 = bitcast NewNode1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162195 91177308-0d34-0410-b5e6-96231b3b80d8
make it more consistent with its intended semantics.
The `linker_private_weak_def_auto' linkage type was meant to automatically hide
globals which never had their addresses taken. It has nothing to do with the
`linker_private' linkage type, which outputs the symbols with a `l' (ell) prefix
among other things.
The intended semantic is more like the `linkonce_odr' linkage type.
Change the name of the linkage type to `linkonce_odr_auto_hide'. And therefore
changing the semantics so that it produces the correct output for the linker.
Note: The old linkage name `linker_private_weak_def_auto' will still parse but
is not a synonym for `linkonce_odr_auto_hide'. This should be removed in 4.0.
<rdar://problem/11754934>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162114 91177308-0d34-0410-b5e6-96231b3b80d8