This patch will optimize the following
movq %rdi, %rax
subq %rsi, %rax
cmovsq %rsi, %rdi
movq %rdi, %rax
to
cmpq %rsi, %rdi
cmovsq %rsi, %rdi
movq %rdi, %rax
Perform this optimization if the actual result of SUB is not used.
rdar: 11540023
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157755 91177308-0d34-0410-b5e6-96231b3b80d8
Reg-units are named after their root registers, and most units have a
single root, so they simply print as 'AL', 'XMM0', etc. The rare dual
root reg-units print as FPSCR~FPSCR_NZCV, FP0~ST7, ...
The printing piggybacks on the existing register name tables, so no
extra const data space is required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157754 91177308-0d34-0410-b5e6-96231b3b80d8
Also add subclasses MCSubRegIterator, MCSuperRegIterator, and
MCRegAliasIterator.
These iterators provide an abstract interface to the MCRegisterInfo
register lists so the internal representation can be changed without
changing all clients.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157695 91177308-0d34-0410-b5e6-96231b3b80d8
The register unit lists are typically much shorter than the register
overlap lists, and the backing table for register units has better cache
locality because it is smaller.
This makes llc about 0.5% faster. The regsOverlap() function isn't that hot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157651 91177308-0d34-0410-b5e6-96231b3b80d8
to pass around a struct instead of a large set of individual values. This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.
NV_CONTRIB
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157479 91177308-0d34-0410-b5e6-96231b3b80d8
The Hazard checker implements in-order contraints, or interlocked
resources. Ready instructions with hazards do not enter the available
queue and are not visible to other heuristics.
The major code change is the addition of SchedBoundary to encapsulate
the state at the top or bottom of the schedule, including both a
pending and available queue.
The scheduler now counts cycles in sync with the hazard checker. These
are minimum cycle counts based on known hazards.
Targets with no itinerary (x86_64) currently remain at cycle 0. To fix
this, we need to provide some maximum issue width for all targets. We
also need to add the concept of expected latency vs. minimum latency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157427 91177308-0d34-0410-b5e6-96231b3b80d8
Many targets always use the same bitwise encoding value for physical
registers in all (or most) instructions. Add this mapping to the
.td files and TableGen'erate the information and expose an accessor
in MCRegisterInfo.
patch by Tom Stellard.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156829 91177308-0d34-0410-b5e6-96231b3b80d8
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).
So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.
Patch by Yiannis Tsiouris!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156328 91177308-0d34-0410-b5e6-96231b3b80d8
This function is a generalization of getMatchingSuperRegClass() to the
symmetric case where both sides are using a sub-register index. It will
find a super-register class and sub-register indexes that make this
diagram commute:
PreA
SuperRC ----------> RCA
| |
| |
PreB | | SubA
| |
| |
V V
RCB ----------> SubRC
SubB
This can be used to coalesce copies like:
%vreg1:sub16 = COPY %vreg2:sub16; GR64:%vreg1, GR32: %vreg2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156317 91177308-0d34-0410-b5e6-96231b3b80d8
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.
Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.
I'm not entirely happy with the name of this flag, suggestions welcome ;)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156233 91177308-0d34-0410-b5e6-96231b3b80d8
This manually enumerated list of super-register classes has been
superceeded by the automatically computed super-register class masks
available through SuperRegClassIterator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156151 91177308-0d34-0410-b5e6-96231b3b80d8
The masks returned by SuperRegClassIterator are computed automatically
by TableGen. This is better than depending on the manually specified
SuperRegClasses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156147 91177308-0d34-0410-b5e6-96231b3b80d8
This iterator class provides a more abstract interface to the (Idx,
Mask) lists of super-registers for a register class. The layout of the
tables shouldn't be exposed to clients.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156144 91177308-0d34-0410-b5e6-96231b3b80d8
This is a pointer into one of the tables used by
getMatchingSuperRegClass(). It makes it possible to use a shared
implementation of that function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156121 91177308-0d34-0410-b5e6-96231b3b80d8
Some targets have no sub-registers at all. Use the TargetRegisterInfo
versions of composeSubRegIndices(), getSubClassWithSubReg(), and
getMatchingSuperRegClass() for those targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156075 91177308-0d34-0410-b5e6-96231b3b80d8
The ensures that virtual registers always belong to an allocatable class.
If your target attempts to create a vreg for an operand that has no
allocatable register subclass, you will crash quickly.
This ensures that targets define register classes as intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156046 91177308-0d34-0410-b5e6-96231b3b80d8
When an instruction match is found, but the subtarget features it
requires are not available (missing floating point unit, or thumb vs arm
mode, for example), issue a diagnostic that identifies what the feature
mismatch is.
rdar://11257547
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155499 91177308-0d34-0410-b5e6-96231b3b80d8
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.
This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.
This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.
The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes the all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().
It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.
It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
to avoid running into the added assertion.
Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.
Patch by Andy Zhang!
Thanks to Jakob and Anton for their reviews.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155395 91177308-0d34-0410-b5e6-96231b3b80d8
Assembly matchers for instructions with a two-operand form. ARM is full
of these, for example:
add {Rd}, Rn, Rm // Rd is optional and is the same as Rn if omitted.
The property TwoOperandAliasConstraint on the instruction definition controls
when, and if, an alias will be formed. No explicit InstAlias definitions
are required.
rdar://11255754
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155172 91177308-0d34-0410-b5e6-96231b3b80d8
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154960 91177308-0d34-0410-b5e6-96231b3b80d8
legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.
PR12419
rdar://9770785
rdar://11195178
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154370 91177308-0d34-0410-b5e6-96231b3b80d8
optimizations which are valid for position independent code being linked
into a single executable, but not for such code being linked into
a shared library.
I discussed the design of this with Eric Christopher, and the decision
was to support an optional bit rather than a completely separate
relocation model. Fundamentally, this is still PIC relocation, its just
that certain optimizations are only valid under a PIC relocation model
when the resulting code won't be in a shared library. The simplest path
to here is to expose a single bit option in the TargetOptions. If folks
have different/better designs, I'm all ears. =]
I've included the first optimization based upon this: changing TLS
models to the *Exec models when PIE is enabled. This is the LLVM
component of PR12380 and is all of the hard work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154294 91177308-0d34-0410-b5e6-96231b3b80d8
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154292 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154011 91177308-0d34-0410-b5e6-96231b3b80d8
Allows us to de-virtualize the function and provides access to it in
the instruction printer, which is useful for handling composite
physical registers (e.g., ARM register lists).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151815 91177308-0d34-0410-b5e6-96231b3b80d8