If a function isn't actually in a CU's subprogram list in the debug info
metadata, ignore all the DebugLocs and don't try to build scopes, track
variables, etc.
While this is possibly a minor optimization, it's also a correctness fix
for an incoming patch that will add assertions to LexicalScopes and the
debug info verifier to ensure that all scope chains lead to debug info
for the current function.
Fix up a few test cases that had broken/incomplete debug info that could
violate this constraint.
Add a test case where this occurs by design (inlining a
debug-info-having function in an attribute nodebug function - we want
this to work because /if/ the nodebug function is then inlined into a
debug-info-having function, it should be fine (and will work fine - we
just stitch the scopes up as usual), but should the inlining not happen
we need to not assert fail either).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212203 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commits r212189 and r212190.
While this pass was accidentally disabled (until r212073), r205437
slipped in a use of `auto` that should have been `auto&`.
This fixes PR20188.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212201 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r212109, which reverted r212088.
However, disable the assert as it's not necessary for correctness. There are
several corner cases that the assert needed to handle better for in-order
scheduling, but none of them are incorrect scheduler behavior. The assert is
mainly there to collect good unit tests like this and ensure that the
target-independent scheduler is working as expected with the various machine
models.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212187 91177308-0d34-0410-b5e6-96231b3b80d8
CombineTo doesn't allow replacing a node with itself so this would crash if the
combined shuffle is the same as the input shuffle.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212181 91177308-0d34-0410-b5e6-96231b3b80d8
After Alexey Volkov, I'm adding the same property for KNL, that prefers ADD/SUB instead of INC/DEC.
Added a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212178 91177308-0d34-0410-b5e6-96231b3b80d8
On targets without cmpxchg16b or cmpxchg8b, the borderline atomic
operations were slipping through the gaps.
X86AtomicExpand.cpp was delegating to ISelLowering. Generic
ISelLowering was delegating to X86ISelLowering and X86ISelLowering was
asserting. The correct behaviour is to expand to a libcall, preferably
in generic ISelLowering.
This can be achieved by X86ISelLowering deciding it doesn't want the
faff after all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212134 91177308-0d34-0410-b5e6-96231b3b80d8
The logic for expanding atomics that aren't natively supported in
terms of cmpxchg loops is much simpler to express at the IR level. It
also allows the normal optimisations and CodeGen improvements to help
out with atomics, instead of using a limited set of possible
instructions..
rdar://problem/13496295
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212119 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r212088, which is causing a number of spec
failures. Will provide reduced test cases shortly.
PR20057
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212109 91177308-0d34-0410-b5e6-96231b3b80d8
AArch64AddressTypePromotion was doing nothing because it was using the
old semantics of `Use` and `uses()`, when it really wanted to get at the
`users()`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212073 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for a new builtin instruction called
__builtin_ia32_rdpmc.
Builtin '__builtin_ia32_rdpmc' is defined as a 'GCC builtin'; on X86, it can
be used to read performance monitoring counters. It takes as input the index
of the performance counter to read, and returns the value of the specified
performance counter as a 64-bit number.
Calls to this new builtin will map to instruction RDPMC.
The index in input to the builtin call is moved to register %ECX. The result
of the builtin call is the value of the specified performance counter (RDPMC
would return that quantity in registers RDX:RAX).
This patch:
- Adds builtin int_x86_rdpmc as a GCCBuiltin;
- Adds a new x86 DAG node called 'RDPMC_DAG';
- Teaches how to lower this new builtin;
- Adds an ISel pattern to select instruction RDPMC;
- Fixes the definition of instruction RDPMC adding %RAX and %RDX as
implicit definitions, and adding %ECX as implicit use;
- Adds a LLVM test to verify that the new builtin is correctly selected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212049 91177308-0d34-0410-b5e6-96231b3b80d8
The combine for mul x, pow2 +/- 1 is unchanged. Test cases for
both combines as well as mul x, pow2 have been added as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212044 91177308-0d34-0410-b5e6-96231b3b80d8
lowering for v16i8.
ASan and some bots caught this bug with existing test cases. Fixing it
even fixed a miscompile with one of the test cases. I'm still a bit
suspicious of this test case as I've not taken a proper amount of time
to think about it, but the fix here is strict goodness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211976 91177308-0d34-0410-b5e6-96231b3b80d8
These show up really frequently, not the least with actual splats. =] We
lowered these quite badly before. The new code path tries to widen i8
shuffles to i16 shuffles in a splat-like way. There are still some
inefficiencies in our i16 splat logic though, so we aren't really done
here.
Also, for certain patterns (bit of a gather-and-splat) we still
generate pretty silly code, and I've left a fixme for addressing it.
However, I'm not actually worried about this code pattern as much. The
old shuffle lowering generates a 29 instruction monstrosity for it that
should execute much more slowly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211974 91177308-0d34-0410-b5e6-96231b3b80d8
This was generated while trying to debug a test, it shouldn't have been
checked in.
Thanks to Alexander Kornienko for spotting this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211973 91177308-0d34-0410-b5e6-96231b3b80d8
lowering.
For maximum irony, I had already discovered this bug, diagnosed it, and
left FIXMEs about it in the test cases. =[ I just failed to go back over
those until after i had reduced a bootstrap miscompile down to a single
TU, stared at the assembly for an hour, and figured out the bug. Again.
Oh well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211955 91177308-0d34-0410-b5e6-96231b3b80d8
The address space of the pointer must be global (1) for these intrinsics. There must also be alignment metadata attached to the intrinsic calls, e.g.
%val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0!0 = metadata !{i32 4}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211939 91177308-0d34-0410-b5e6-96231b3b80d8
This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211935 91177308-0d34-0410-b5e6-96231b3b80d8
a bootstrap.
I managed to mis-remember how PACKUS worked on x86, and was using undef
for the high bytes instead of zero. The fix is fairly obvious.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211922 91177308-0d34-0410-b5e6-96231b3b80d8
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.
COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.
This functionality is only representable in COFF and ELF, Mach-O has no
similar mechanism.
Differential Revision: http://reviews.llvm.org/D4178
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211920 91177308-0d34-0410-b5e6-96231b3b80d8
I've run into a bug where current LLVM at -O0 (with fast-isel)
generated invalid code like:
ld 0, 20936(1) # 8-byte Folded Reload
stw 12, 10348(0)
stw 12, 10344(0)
The underlying vreg had been introduced as base register by the
Local Stack Slot Allocation pass. That register was constrained
to G8RC by PPCRegisterInfo::materializeFrameBaseRegister to match
the ADDI instruction used to set it, but it was *not* constrained
to G8RC_NOX0 to fit the *use* of the register in an address.
That should have happened in PPCRegisterInfo::resolveFrameIndex.
This patch adds an appropriate constrainRegClass call.
Reviewed by Hal Finkel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211897 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This allows it to fold pshufd instructions across intervening
half-shuffles and other noise. This pattern actually shows up in the
generic lowering tests, but I've also added direct tests using
intrinsics to make sure that the specific desired functionality is
working even if the lowering stuff changes in the future.
Differential Revision: http://reviews.llvm.org/D4292
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211892 91177308-0d34-0410-b5e6-96231b3b80d8
half-shuffles, even looking through intervening instructions in a chain.
Summary:
This doesn't happen to show up with any test cases I've found for the current
shuffle lowering, but previous attempts would benefit from this and it seems
generally useful. I've tested it directly using intrinsics, which also shows
that it will work with hand vectorized code as well.
Note that even though pshufd isn't directly used in these tests, it gets
exercised because we combine some of the half shuffles into a pshufd
first, and then merge them.
Differential Revision: http://reviews.llvm.org/D4291
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211890 91177308-0d34-0410-b5e6-96231b3b80d8
trivially redundant.
This fixes several cases in the new vector shuffle lowering algorithm
which would generate redundant shuffle instructions for the sake of
simplicity.
I'm also deleting a testcase which was somewhat ridiculous. It was
checking for a bug in 2007 about incorrectly transforming shuffles by
looking for the string "-86" in the output of a pretty substantial
function. This test case doesn't seem to have any value at this point.
Differential Revision: http://reviews.llvm.org/D4240
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211889 91177308-0d34-0410-b5e6-96231b3b80d8
x86 backend.
This sketches out a new code path for vector lowering, hidden behind an
off-by-default flag while it is under development. The fundamental idea
behind the new code path is to aggressively break down the problem space
in ways that ease selecting the odd set of instructions available on
x86, and carefully avoid scalarizing code even when forced to use older
ISAs. Notably, this starts off restricting itself to SSE2 and implements
the complete vector shuffle and blend space for 128-bit vectors in SSE2
without scalarizing. The plan is to layer on top of this ISA extensions
where we can bail out of the complex SSE2 lowering and opt for
a cheaper, specialized instruction (or set of instructions). It also
needs to be generalized to AVX and AVX512 vector widths.
Currently, this does a decent but not perfect job for SSE2. There are
some specific shortcomings that I plan to address:
- We need a peephole combine to fold together shuffles where possible.
There are cases where a previous shuffle could be modified slightly to
arrange for elements to be in the correct position and a later shuffle
eliminated. Doing this eagerly added quite a bit of complexity, and
so my plan is to combine away these redundancies afterward.
- There are a lot more clever ways to use unpck and pack that need to be
added. This is essential for real world shuffles as it turns out...
Once SSE2 is polished a bit I should be able to get interesting numbers
on performance improvements on benchmarks conducive to vectorization.
All of this will be off by default until it is functionally equivalent
of course.
Differential Revision: http://reviews.llvm.org/D4225
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211888 91177308-0d34-0410-b5e6-96231b3b80d8
Fixe for Bug 20057 - Assertion failied in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"'
Thanks to Chad for the test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211865 91177308-0d34-0410-b5e6-96231b3b80d8
There is no need to calculate the liveness information for stackmaps. The
liveness information is still available for the patchpoint intrinsic and
that is also the intended usage model.
Related to <rdar://problem/17473725>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211816 91177308-0d34-0410-b5e6-96231b3b80d8