pass into the SelectionDAG itself rather than snooping on the
implementation of that pass as exposed by the TargetMachine. This
removes the last direct client of the ScalarTargetTransformInfo class
outside of the TTI pass implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171625 91177308-0d34-0410-b5e6-96231b3b80d8
DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two
mistakes:
1. It was checking the legality of scalar INT_TO_FP nodes and then generating
vector nodes.
2. It was passing the result value type to
TargetLoweringInfo::getOperationAction() when it should have been
passing the value type of the first operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171420 91177308-0d34-0410-b5e6-96231b3b80d8
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171366 91177308-0d34-0410-b5e6-96231b3b80d8
utils/sort_includes.py script.
Most of these are updating the new R600 target and fixing up a few
regressions that have creeped in since the last time I sorted the
includes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171362 91177308-0d34-0410-b5e6-96231b3b80d8
directly.
This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171253 91177308-0d34-0410-b5e6-96231b3b80d8
This is supposed to be a mechanical change with no functional effects.
InstrEmitter can generate all types of MachineOperands which revealed
that MachineInstrBuilder was missing a few methods, added by this patch.
Besides providing a context pointer to MI::addOperand(),
MachineInstrBuilder seems like a better fit for this code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170712 91177308-0d34-0410-b5e6-96231b3b80d8
bitwidth op back to the original size. If we reduce ANDs then this can cause
an endless loop. This patch changes the ZEXT to ANY_EXTEND if the demanded bits
are equal or smaller than the size of the reduced operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170505 91177308-0d34-0410-b5e6-96231b3b80d8
A register can be associated with several distinct register classes.
For example, on PPC, the floating point registers are each associated with
both F4RC (which holds f32) and F8RC (which holds f64). As a result, this code
would fail when provided with a floating point register and an f64 operand
because it would happen to find the register in the F4RC class first and
return that. From the F4RC class, SDAG would extract f32 as the register
type and then assert because of the invalid implied conversion between
the f64 value and the f32 register.
Instead, search all register classes. If a register class containing the
the requested register has the requested type, then return that register
class. Otherwise, as before, return the first register class found that
contains the requested register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170436 91177308-0d34-0410-b5e6-96231b3b80d8
TargetLowering::getRegClassFor).
Some isSimple() guards were missing, or getSimpleVT() were hoisted too
far, resulting in asserts on valid LLVM assembly input.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170336 91177308-0d34-0410-b5e6-96231b3b80d8
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.
This is the first, in a series of patches.
This is the second attempt. In the first attempt (r169837), a few
getSimpleVT() were hoisted too far, detected by bootstrap failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170104 91177308-0d34-0410-b5e6-96231b3b80d8
load / store pair. It's not legal to use a wider load than the size of
the remaining bytes if it's the first pair of load / store.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170018 91177308-0d34-0410-b5e6-96231b3b80d8
mention the inline memcpy / memset expansion code is a mess?
This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset.
The first indicates whether it is expanding a memset or a memcpy / memmove.
The later is whether the memset is a memset of zero. It's totally possible
(likely even) that targets may want to do different things for memcpy and
memset of zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169959 91177308-0d34-0410-b5e6-96231b3b80d8
Also added more comments to explain why it is generally ok to return true.
- Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to
be true for loaded source (memcpy) or zero constants (memset). The poor name
choice is probably some kind of legacy issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169954 91177308-0d34-0410-b5e6-96231b3b80d8
ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined
term for something like integer immediate materialization. It is always possible
to materialize an integer immediate. Whether to use it for memcpy expansion is
more a "cost" conceern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169929 91177308-0d34-0410-b5e6-96231b3b80d8
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.
This is the first, in a series of patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169837 91177308-0d34-0410-b5e6-96231b3b80d8
try to reduce the width of this load, and would end up transforming:
(truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
(truncate (zextload i32 <ptr+4> as i64) to i32)
We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zext's the
results. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimize the code down to the proper single instruction:
movswl 6(...),%eax
Which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:
movl 6(...), %eax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169802 91177308-0d34-0410-b5e6-96231b3b80d8
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles. Testing with the external/internal nightly
test suite reveals no change in compile time performance. Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures. All tests were performed on my x86 machine.
I'll monitor our arm testers to ensure no regressions occur there.
In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally. While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.
Part of rdar://12553082
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169796 91177308-0d34-0410-b5e6-96231b3b80d8
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
if it's not possible to materialize an integer immediate with a single
instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
Also increase the threshold to something reasonable (8 for memset, 4 pairs
for memcpy).
This significantly improves Dhrystone, up to 50% on ARM iOS devices.
rdar://12760078
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
understand target implementation of any_extend / extload, just generate
zero_extend in place of any_extend for liveouts when the target knows the
zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz).
rdar://12771555
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169536 91177308-0d34-0410-b5e6-96231b3b80d8
check if loads that happen in between stores alias with the first store in the
chain, only with the second store onwards.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169516 91177308-0d34-0410-b5e6-96231b3b80d8
missed in the first pass because the script didn't yet handle include
guards.
Note that the script is now able to handle all of these headers without
manual edits. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169224 91177308-0d34-0410-b5e6-96231b3b80d8
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169131 91177308-0d34-0410-b5e6-96231b3b80d8
If we need to split the operand of a VSELECT, it must be the mask operand. We
split the entire VSELECT operand with EXTRACT_SUBVECTOR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168883 91177308-0d34-0410-b5e6-96231b3b80d8
For some targets, it is desirable to prefer scalarizing <N x i1> instead of promoting to a larger legal type, such as <N x i32>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168882 91177308-0d34-0410-b5e6-96231b3b80d8
This allows me to begin enabling (or backing out) misched by default
for one subtarget at a time. To run misched we typically want to:
- Disable SelectionDAG scheduling (use the source order scheduler)
- Enable more aggressive coalescing (until we decide to always run the coalescer this way)
- Enable MachineScheduler pass itself.
Disabling PostRA sched may follow for some subtargets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167826 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for weak DAG edges to the general scheduling
infrastructure in preparation for MachineScheduler support for
heuristics based on weak edges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167738 91177308-0d34-0410-b5e6-96231b3b80d8
InputArg in r165616.
This will enable us to get the actual type for both InputArg and OutputArg.
rdar://9932559
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167265 91177308-0d34-0410-b5e6-96231b3b80d8
r165941: Resubmit the changes to llvm core to update the functions to
support different pointer sizes on a per address space basis.
Despite this commit log, this change primarily changed stuff outside of
VMCore, and those changes do not carry any tests for correctness (or
even plausibility), and we have consistently found questionable or flat
out incorrect cases in these changes. Most of them are probably correct,
but we need to devise a system that makes it more clear when we have
handled the address space concerns correctly, and ideally each pass that
gets updated would receive an accompanying test case that exercises that
pass specificaly w.r.t. alternate address spaces.
However, from this commit, I have retained the new C API entry points.
Those were an orthogonal change that probably should have been split
apart, but they seem entirely good.
In several places the changes were very obvious cleanups with no actual
multiple address space code added; these I have not reverted when
I spotted them.
In a few other places there were merge conflicts due to a cleaner
solution being implemented later, often not using address spaces at all.
In those cases, I've preserved the new code which isn't address space
dependent.
This is part of my ongoing effort to clean out the partial address space
code which carries high risk and low test coverage, and not likely to be
finished before the 3.2 release looms closer. Duncan and I would both
like to see the above issues addressed before we return to these
changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167222 91177308-0d34-0410-b5e6-96231b3b80d8
getIntPtrType support for multiple address spaces via a pointer type,
and also introduced a crasher bug in the constant folder reported in
PR14233.
These commits also contained several problems that should really be
addressed before they are re-committed. I have avoided reverting various
cleanups to the DataLayout APIs that are reasonable to have moving
forward in order to reduce the amount of churn, and minimize the number
of commits that were reverted. I've also manually updated merge
conflicts and manually arranged for the getIntPtrType function to stay
in DataLayout and to be defined in a plausible way after this revert.
Thanks to Duncan for working through this exact strategy with me, and
Nick Lewycky for tracking down the really annoying crasher this
triggered. (Test case to follow in its own commit.)
After discussing with Duncan extensively, and based on a note from
Micah, I'm going to continue to back out some more of the more
problematic patches in this series in order to ensure we go into the
LLVM 3.2 branch with a reasonable story here. I'll send a note to
llvmdev explaining what's going on and why.
Summary of reverted revisions:
r166634: Fix a compiler warning with an unused variable.
r166607: Add some cleanup to the DataLayout changes requested by
Chandler.
r166596: Revert "Back out r166591, not sure why this made it through
since I cancelled the command. Bleh, sorry about this!
r166591: Delete a directory that wasn't supposed to be checked in yet.
r166578: Add in support for getIntPtrType to get the pointer type based
on the address space.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167221 91177308-0d34-0410-b5e6-96231b3b80d8
the MachineInstr MayLoad/MayLoad flags are based on the tablegen implementation.
For inline assembly, however, we need to compute these based on the constraints.
Revert r166929 as this is no longer needed, but leave the test case in place.
rdar://12033048 and PR13504
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167040 91177308-0d34-0410-b5e6-96231b3b80d8
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.
Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166958 91177308-0d34-0410-b5e6-96231b3b80d8
- If more than 1 elemennts are defined and target supports the vectorized
conversion, use the vectorized one instead to reduce the strength on
conversion operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166546 91177308-0d34-0410-b5e6-96231b3b80d8
(The change at Clang side was committed in r166345)
2. Cosmetic change in order to conform to coding standards.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166350 91177308-0d34-0410-b5e6-96231b3b80d8
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().
The X86 backend is already able to handle debugtrap(). This patch is to:
1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
make the __builtin_debugtrap() "available" to all existing ports without the hassle of
changing their code.
3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
__builtin_trap() will be expanded into the function call of the specified trap function.
This behavior may need change in the future.
The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166300 91177308-0d34-0410-b5e6-96231b3b80d8
- Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid
when '...' are all 'undef's.
- r166125 relies on this transformation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166155 91177308-0d34-0410-b5e6-96231b3b80d8
- If the extracted vector has the same type of all vectored being concatenated
together, it should be simplified directly into v_i, where i is the index of
the element being extracted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166125 91177308-0d34-0410-b5e6-96231b3b80d8
any scheduling heuristics nor does it build up any scheduling data structure
that other heuristics use. It essentially linearize by doing a DFA walk but
it does handle glues correctly.
IMPORTANT: it probably can't handle all the physical register dependencies so
it's not suitable for x86. It also doesn't deal with dbg_value nodes right now
so it's definitely is still WIP.
rdar://12474515
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166122 91177308-0d34-0410-b5e6-96231b3b80d8
Also provide an MRI::getReservedRegs() function to access the frozen
register set, and isReserved() and isAllocatable() methods to test
individual registers.
The various implementations of TRI::getReservedRegs() are quite
complicated, and many passes need to look at the reserved register set.
This patch makes it possible for these passes to use the cached copy in
MRI, avoiding a lot of malloc traffic and repeated calculations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165982 91177308-0d34-0410-b5e6-96231b3b80d8
On PowerPC, a bitcast of <16 x i8> to i128 may run through a code
path in ExpandRes_BITCAST that attempts to do an intermediate
bitcast to a <4 x i32> vector, and then construct the Hi and Lo parts
of the resulting i128 by pairing up two of those i32 vector elements
each. The code already recognizes that on a big-endian system, the
first two vector elements form the Hi part, and the final two vector
elements form the Lo part (vice-versa from the little-endian situation).
However, we also need to take endianness into account when forming each
of those separate pairs: on a big-endian system, vector element 0 is
the *high* part of the pair making up the Hi part of the result, and
vector element 1 is the low part of the pair. The code currently always
uses vector element 0 as the low part and vector element 1 as the high
part, as is appropriate for little-endian platforms only.
This patch fixes this by swapping the vector elements as they are
paired up as appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165802 91177308-0d34-0410-b5e6-96231b3b80d8
not legal. However, it should use a div instruction + mul + sub if divide is
legal. The rem legalization code was missing a check and incorrectly uses a
divrem libcall even when div is legal.
rdar://12481395
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165778 91177308-0d34-0410-b5e6-96231b3b80d8
The minimum set of required instructions is ISD::AND, ISD::OR, ISD::SETO(or ISD::SETOEQ) and ISD::SETUO(or ISD::SETUNE). Everything is expanded into one of two patterns:
Pattern 1: (LHS CC1 RHS) Opc (LHS CC2 RHS)
Pattern 2: (LHS CC1 LHS) Opc (RHS CC2 RHS)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165655 91177308-0d34-0410-b5e6-96231b3b80d8
- Due to the current matching vector elements constraints in ISD::FP_EXTEND,
rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening
to convert it into a target-specific X86ISD::VFPEXT to work around this
constraints. This patch also reverts a previous attempt to fix this issue by
recovering the scalarized ISD::FP_EXTEND pattern and thus significantly
reduces the overhead of supporting non-power-2 vector FP extend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165625 91177308-0d34-0410-b5e6-96231b3b80d8
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.
Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.
Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165616 91177308-0d34-0410-b5e6-96231b3b80d8
The next step is to update the optimizers to allow them to optimize the different address spaces with this information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165505 91177308-0d34-0410-b5e6-96231b3b80d8
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165488 91177308-0d34-0410-b5e6-96231b3b80d8
This class is used by LSR and a number of places in the codegen.
This is the first step in de-coupling LSR from TLI, and creating
a new interface in between them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165455 91177308-0d34-0410-b5e6-96231b3b80d8
multiple stores with a single load. We create the wide loads and stores (and their chains)
before we remove the scalar loads and stores and fix the DAG chain. We attempted to merge
loads with a different chain. When that happened, the assumption that it is safe to RAUW
broke and a cycle was introduced.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165148 91177308-0d34-0410-b5e6-96231b3b80d8
is not profitable in many cases because modern processors perform multiple stores
in parallel and merging stores prior to merging requires extra work. We handle two main cases:
1. Store of multiple consecutive constants:
q->a = 3;
q->4 = 5;
In this case we store a single legal wide integer.
2. Store of multiple consecutive loads:
int a = p->a;
int b = p->b;
q->a = a;
q->b = b;
In this case we load/store either ilegal vector registers or legal wide integer registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165125 91177308-0d34-0410-b5e6-96231b3b80d8
the add/sub case since in the case of multiplication you also have to check that
the operation in the larger type did not overflow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165017 91177308-0d34-0410-b5e6-96231b3b80d8