Since we visit the whole list of subprograms for each CU at module
start, this is clearly true - don't test for the case, just assert it.
A few old test cases seemed to have incomplete subprogram lists, but any
attempt to reproduce them shows full subprogram lists that even include
entities that have been completely inlined and the out of line
definition removed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209178 91177308-0d34-0410-b5e6-96231b3b80d8
When I refactored this in r208636 I accidentally caused this to be added
multiple times to each abstract subprogram (not accounting for the
deduplicating effect of the InlinedSubprogramDIEs set).
This got better in r208798 when the abstract definitions got the
attribute added to them at construction time, but still had the
redundant copies introduced in r208636.
This commit removes those excess DW_AT_inlines and relies solely on the
insertion in r208798.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209166 91177308-0d34-0410-b5e6-96231b3b80d8
The check in DwarfDebug::constructScopeDIE was meant to consider inlined
subroutines as any non-top-level scope that was a subprogram. Instead of
checking "not top level scope" it was checking if the /subprogram's/
scope was non-top-level.
Fix this and beef up a test case to demonstrate some of the missing
inlined_subroutines are no longer missing.
In the course of fixing this I also found that r208748 (with this fix)
found one /extra/ inlined_subroutine in concrete_out_of_line.ll due to
two inlined_subroutines having the same inlinedAt location. The previous
implementation was collapsing these into a single inlined subroutine.
I'm not sure what the original code was that created this .ll file so
I'm not sure if this actually happens in practice today. Since we
deliberately include column information to disambiguate two calls on the
same line, that may've addressed this bug in the frontend, but it's good
to know that workaround isn't necessary for this particular case
anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209165 91177308-0d34-0410-b5e6-96231b3b80d8
Currently the X86 backend doesn't support types larger than i128 very well. For
example an i192 multiply will assert in codegen when the 2nd argument is a constant and the constant got hoisted.
This fix changes the cost model to never hoist constants for types larger than
i128. Once the codegen issues have been resolved, the cost model can be updated
to allow also larger types.
This is related to <rdar://problem/16954938>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209162 91177308-0d34-0410-b5e6-96231b3b80d8
Instructions TZCNT (requires BMI1) and LZCNT (requires LZCNT), always
provide the operand size as output if the input operand is zero.
We can take advantage of this knowledge during instruction selection
stage in order to simplify a few corner case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209159 91177308-0d34-0410-b5e6-96231b3b80d8
so that llvm-size will total up all the sections in the Berkeley format. This
allows for rough categorizations for Mach-O sections. And allows the total of
llvm-size’s Berkeley and System V formats to be the same.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209158 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When inserting an element that's coming from a vector load or a broadcast
of a vector (or scalar) load, combine the load into the insertps
instruction.
Added PerformINSERTPSCombine for the case where we need to fix the load
(load of a vector + insertps with a non-zero CountS).
Added patterns for the broadcasts.
Also added tests for SSE4.1, AVX, and AVX2.
Reviewers: delena, nadav, craig.topper
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3581
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209156 91177308-0d34-0410-b5e6-96231b3b80d8
It turns out that not all the world is x86-64. Who knew?
I'll get to work on a more appropriate test case for this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209155 91177308-0d34-0410-b5e6-96231b3b80d8
For GOT relocations the addend should modify the offset to the
GOT entry, not the value of the entry itself. Teach RuntimeDyldMachO
to do The Right Thing here.
Fixes <rdar://problem/16961886>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209154 91177308-0d34-0410-b5e6-96231b3b80d8
The cost model conservatively assumes that it will always get scalarized and
that's about as good as we can get with the generic TTI; reasoning whether a
shuffle with an efficient lowering is available is hard. We can override that
conservative estimate for some targets in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209125 91177308-0d34-0410-b5e6-96231b3b80d8
- On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though.
- On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal.
- On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209123 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than create a series of function calls to setup the library calls, create
a table with the information and just use the table to drive the configuration
of the library calls. This makes it easier to both inspect the list as well as
to modify it. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209089 91177308-0d34-0410-b5e6-96231b3b80d8
While there make getOption return a const reference so we don't have to put it
on the stack when calling methods on it. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209088 91177308-0d34-0410-b5e6-96231b3b80d8
Windows on ARM uses R11 for the frame pointer even though the environment is a
pure Thumb-2, thumb-only environment. Replicate this behaviour to improve
Windows ABI compatibility. This register is used for fast stack walking, and
thus is part of the Windows ABI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209085 91177308-0d34-0410-b5e6-96231b3b80d8
Use the ARMBaseRegisterInfo to query the frame register. The base register info
is aware of the frame register that is used for the frame pointer. Use that to
determine the frame register rather than duplicating the knowledge. Although,
the code path is slightly different in that it may return SP, that can only
occur if the frame pointer has been omitted in the machine function, which is
supposed to contain the desired value in that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209084 91177308-0d34-0410-b5e6-96231b3b80d8
This is mostly a mechanical change changing all the call sites to the newer
chained-function construction pattern. This removes the horrible 15-parameter
constructor for the CallLoweringInfo in favour of setting properties of the call
via chained functions. No functional change beyond the removal of the old
constructors are intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209082 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than introducing an auxiliary CallLoweringInfoBuilder, add the methods to
do chained function construction directly to CallLoweringInfo. This reduces the
monstrous 15-parameter constructor into a series of simpler (for some definition
of simpler) functions that control particular aspects of the call. The old
interfaces can be completely removed once callers are moved to the new chained
constructor pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209081 91177308-0d34-0410-b5e6-96231b3b80d8
This is a preliminary step to help ease the construction of CallLoweringInfo.
Changing the construction to a chained function pattern requires that the
parameter be nullable. However, rather than copying the vector, save a pointer
rather than the reference to permit a late binding of the arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209080 91177308-0d34-0410-b5e6-96231b3b80d8