contract (and be much more useful). It now provides exactly the
post-order traversal a caller might need to perform on newly formed
SCCs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207410 91177308-0d34-0410-b5e6-96231b3b80d8
mapping from a basic block to an incoming value, either for removal or
just lookup, is linear in the number of predecessors, and we were doing
this for every entry in the 'Preds' list which is in many cases almost
all of them!
Unfortunately, the fixes are quite ugly. PHI nodes just don't make this
operation easy. The efficient way to fix this is to have a clever
'remove_if' operation on PHI nodes that lets us do a single pass over
all the incoming values of the original PHI node, extracting the ones we
care about. Then we could quickly construct the new phi node from this
list. This would remove the remaining underlying quadratic movement of
unrelated incoming values and the need for silly backwards looping to
"minimize" how often we hit the quadratic case.
This is the last obvious fix for PR19499. It shaves another 20% off the
compile time for me, and while UpdatePHINodes remains in the profile,
most of the time is now stemming from the well known inefficiencies of
LVI and jump threading.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207409 91177308-0d34-0410-b5e6-96231b3b80d8
by avoiding inlining massive switches merely because they have no
instructions in them. These switches still show up where we fail to form
lookup tables, and in those cases they are actually going to cause
a very significant code size hit anyways, so inlining them is not the
right call. The right way to fix any performance regressions stemming
from this is to enhance the switch-to-lookup-table logic to fire in more
places.
This makes PR19499 about 5x less bad. It uncovers a second compile time
problem in that test case that is unrelated (surprisingly!).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207403 91177308-0d34-0410-b5e6-96231b3b80d8
each line. This is particularly nice for tracking which run of
a particular pass over a particular function was slow.
This also required making the TimeValue string much more useful. First,
there is a standard format for writing out a date and time. Let's use
that rather than strings that would have to be parsed. Second, actually
output the nanosecond resolution that timevalue claims to have.
This is proving useful working on PR19499, so I figured it would be
generally useful to commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207385 91177308-0d34-0410-b5e6-96231b3b80d8
Only the object streamers need to track if a symbol should be marked thumb or
not. This ports the ELF case. The COFF case is not ported since it is currently
not working for some other reason (I will report a bug).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207366 91177308-0d34-0410-b5e6-96231b3b80d8
This restores the previous behaviour of just assuming that if you dont specify a
valid triple that you really meant the default triple with an ELF object file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207349 91177308-0d34-0410-b5e6-96231b3b80d8
Introduce support for WoA PE/COFF object file emission from LLVM. Add the new
target specific PE/COFF Streamer (ARMWinCOFFStreamer) that handles the ARM
specific behaviour of PE/COFF object emission. ARM exception information is not
yet emitted and is a TODO item.
The ARM specific object writer (ARMWinCOFFObjectWriter) handles the ARM specific
relocation handling in conjunction with the WinCOFFObjectWriter in the MC layer.
The MC layer needs to be updated to deal with the relocation adjustments.
Branch relocations are adjusted by 4 bytes (unlikely their ELF counterparts).
Minor tweaks to switch multiple conditional checks into equivalent switch
statements. The ObjectFileInfo is updated to relax the object file setup for
Windows COFF. Move the architecture checks into an assertion. Windows COFF is
currently only supported on x86, x86_64, and ARM (thumb). Rather than
defaulting to ELF, we will refuse to generate an object file. This is better
though as you do not get an (arbitrary) object file which is different from the
request.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207345 91177308-0d34-0410-b5e6-96231b3b80d8
This introduces a target specific streamer, X86WinCOFFStreamer, which handles
the target specific behaviour (e.g. WinEH). This is mostly to ensure that
differences between ARM and X86 remain disjoint and do not accidentally cross
boundaries. This is the final staging change for enabling object emission for
Windows on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207344 91177308-0d34-0410-b5e6-96231b3b80d8
This is in preparation for promoting WinCOFFStreamer to a base class which will
be shared by the X86 and ARM specific target COFF streamers. Also add a new
getOrCreateSymbolData interface (like MCELFStreamer) for the ARM COFF Streamer.
This makes the COFFStreamer more similar to the ELFStreamer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207343 91177308-0d34-0410-b5e6-96231b3b80d8
Stylistic changes to prepare for splitting up the COFFStreamer into target
specific streamers. Tweak some assertion messages. No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207342 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, the integrated assembler is the only choice for assembling Windows on
ARM binaries. IAS supports the .file <filename> directive which emits the file
symbol into the resulting object binary. Mark the GNU COFF information to
indicate support for this feature.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207341 91177308-0d34-0410-b5e6-96231b3b80d8
API requirements much more obvious.
The key here is that there are two totally different use cases for
mutating the graph. Prior to doing any SCC formation, it is very easy to
mutate the graph. There may be users that want to do small tweaks here,
and then use the already-built graph for their SCC-based operations.
This method remains on the graph itself and is documented carefully as
being cheap but unavailable once SCCs are formed.
Once SCCs are formed, and there is some in-flight DFS building them, we
have to be much more careful in how we mutate the graph. These mutation
operations are sunk onto the SCCs themselves, which both simplifies
things (the code was already there!) and helps make it obvious that
these interfaces are only applicable within that context. The other
primary constraint is that the edge being mutated is actually related to
the SCC on which we call the method. This helps make it obvious that you
cannot arbitrarily mutate some other SCC.
I've tried to write much more complete documentation for the interesting
mutation API -- intra-SCC edge removal. Currently one aspect of this
documentation is a lie (the result list of SCCs) but we also don't even
have tests for that API. =[ I'm going to add tests and fix it to match
the documentation next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207339 91177308-0d34-0410-b5e6-96231b3b80d8
Since there's no way to ensure the type unit in the .dwo and the type
unit skeleton in the .o are correlated, this cannot work.
This implementation is a bit inefficient for a few reasons, called out
in comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207323 91177308-0d34-0410-b5e6-96231b3b80d8
Sinking addition of the declaration attribute down to where the
signature is added. So that if the signature is not added neither is the
declaration attribute (this will come in handy when aborting type unit
construction to instead emit the type into the CU directly in some
cases)
Pull out type unit identifier hashing just to simplify the function a
little, it'll be getting longer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207321 91177308-0d34-0410-b5e6-96231b3b80d8
This gets us pretty code for divs of i16 vectors. Turn the existing
intrinsics into the corresponding nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207317 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise the legalizer would just scalarize everything. Support for
mulhi in the targets isn't that great yet so on most targets we get
exactly the same scalarized output. Add a test for x86 vector udiv.
I had to disable the mulhi nodes on ARM because there aren't any patterns
for it. As far as I know ARM has instructions for getting the high part of
a multiply so this should be fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207315 91177308-0d34-0410-b5e6-96231b3b80d8
them, just skip over any DFS-numbered nodes when finding the next root
of a DFS. This allows the entry set to just be a vector as we populate
it from a uniqued source. It also removes the possibility for a linear
scan of the entry set to actually do the removal which can make things
go quadratic if we get unlucky.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207312 91177308-0d34-0410-b5e6-96231b3b80d8
the DFS stack for leaves in the call graph. As mentioned in my previous
commit, this is particularly interesting for graphs which have high fan
out but low connectivity resulting in many leaves. For such graphs, this
can remove a large % of the DFS stack traffic even though it doesn't
make the stack much smaller.
It's a bit easier to formulate this for the full algorithm because that
one stops completely for each SCC. For example, I was able to directly
eliminate the "Recurse" boolean used to continue an outer loop from the
inner loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207311 91177308-0d34-0410-b5e6-96231b3b80d8
makes working through the worklist much cleaner, and makes it possible
to avoid the 'bool-to-continue-the-outer-loop' hack. Not a huge
difference, but I think this is approaching as polished as I can make
it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207310 91177308-0d34-0410-b5e6-96231b3b80d8
more than 1 instruction. The caller need to be aware of this
and adjust instruction iterators accordingly.
rdar://16679376
Repaired r207302.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207309 91177308-0d34-0410-b5e6-96231b3b80d8
processed in the DFS out of the stack completely. Keep it exclusively in
a variable. Re-shuffle some code structure to make this easier. This can
have a very dramatic effect in some cases because call graphs tend to
look like a high fan-out spanning tree. As a consequence, there are
a large number of leaf nodes in the graph, and this technique causes
leaf nodes to never even go into the stack. While this only reduces the
max depth by 1, it may cause the total number of round trips through the
stack to drop by a lot.
Now, most of this isn't really relevant for the incremental version. =]
But I wanted to prototype it first here as this variant is in ways more
complex. As long as I can get the code factored well here, I'll next
make the primary walk look the same. There are several refactorings this
exposes I think.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207306 91177308-0d34-0410-b5e6-96231b3b80d8
graph in any way because we don't track edges in the SCC graph, just
nodes. This also lets us add a nice assert about the invariant that
we're working on at least a certain number of nodes within the SCC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207305 91177308-0d34-0410-b5e6-96231b3b80d8
The included test case would return the incorrect results, because the expansion
of an shift with a constant shift amount of 0 would generate undefined behavior.
This is because ExpandShiftByConstant assumes that all shifts by constants with
a value of 0 have already been optimized away. This doesn't happen for opaque
constants and usually this isn't a problem, because opaque constants won't take
this code path - they are not supposed to. In the case that the opaque constant
has to be expanded by the legalizer, the legalizer would drop the opaque flag.
In this case we hit the limitations of ExpandShiftByConstant and create incorrect
code.
This commit fixes the legalizer by not dropping the opaque flag when expanding
opaque constants and adding an assertion to ExpandShiftByConstant to catch this
not supported case in the future.
This fixes <rdar://problem/16718472>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207304 91177308-0d34-0410-b5e6-96231b3b80d8
more than 1 instruction. The caller need to be aware of this
and adjust instruction iterators accordingly.
rdar://16679376
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207302 91177308-0d34-0410-b5e6-96231b3b80d8
Scaling factors are not free on X86 because every "complex" addressing mode
breaks the related instruction into 2 allocations instead of 1.
<rdar://problem/16730541>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207301 91177308-0d34-0410-b5e6-96231b3b80d8
a helper function. Also factor the other two places where we did the
same thing into the helper function. =] Much cleaner this way. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207300 91177308-0d34-0410-b5e6-96231b3b80d8
right intrinsics.
A packed logical shift right with a shift count bigger than or equal to the
element size always produces a zero vector. In all other cases, it can be
safely replaced by a 'lshr' instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207299 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower
certain shufflevectors to an insertps instruction:
When most of the shufflevector result's elements come from one vector (and
keep their index), and one element comes from another vector or a memory
operand.
Added tests for insertps optimizations on shufflevector.
Added support and tests for v4i32 vector optimization.
Reviewers: nadav
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3475
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207291 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, irreducible backedges were ignored. With this commit,
irreducible SCCs are discovered on the fly, and modelled as loops with
multiple headers.
This approximation specifies the headers of irreducible sub-SCCs as its
entry blocks and all nodes that are targets of a backedge within it
(excluding backedges within true sub-loops). Block frequency
calculations act as if we insert a new block that intercepts all the
edges to the headers. All backedges and entries to the irreducible SCC
point to this imaginary block. This imaginary block has an edge (with
even probability) to each header block.
The result is now reasonable enough that I've added a number of
testcases for irreducible control flow. I've outlined in
`BlockFrequencyInfoImpl.h` ways to improve the approximation.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207286 91177308-0d34-0410-b5e6-96231b3b80d8
This also avoids the need for subtly side-effecting calls to manifest
strings in the string table at the point where items are added to the
accelerator tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207281 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for an -mattr option to the gold plugin and to llvm-lto. This
allows the caller to specify details of the subtarget architecture, like +aes,
or +ssse3 on x86. Note that this requires a change to the include/llvm-c/lto.h
interface: it adds a function lto_codegen_set_attr and it increments the
version of the interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207279 91177308-0d34-0410-b5e6-96231b3b80d8
Pulls out some more code from some of the rather monolithic DWARF
classes. Unlike the address table, the string table won't move up into
DwarfDebug - each DWARF file has its own string table (but there can be
only one address table).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207277 91177308-0d34-0410-b5e6-96231b3b80d8
Consider this use from the new testcase:
LSR Use: Kind=ICmpZero, Offsets={0}, widest fixup type: i32
reg({1000,+,-1}<nw><%for.body>)
-3003 + reg({3,+,3}<nw><%for.body>)
-1001 + reg({1,+,1}<nuw><nsw><%for.body>)
-1000 + reg({0,+,1}<nw><%for.body>)
-3000 + reg({0,+,3}<nuw><%for.body>)
reg({-1000,+,1}<nw><%for.body>)
reg({-3000,+,3}<nsw><%for.body>)
This is the last use we consider for a solution in SolveRecurse, so CurRegs is
a large set. (CurRegs is the set of registers that are needed by the
previously visited uses in the in-progress solution.)
ReqRegs is {
{3,+,3}<nw><%for.body>,
{1,+,1}<nuw><nsw><%for.body>
}
This is the intersection of the regs used by any of the formulas for the
current use and CurRegs.
Now, the code requires a formula to contain *all* these regs (the comment is
simply wrong), otherwise the formula is immediately disqualified. Obviously,
no formula for this use contains two regs so they will all get disqualified.
The fix modifies the check to allow the formula in this case. The idea is
that neither of these formulae is introducing any new registers which is the
point of this early pruning as far as I understand.
In terms of set arithmetic, we now allow formulas whose used regs are a subset
of the required regs not just the other way around.
There are few more loops in the test-suite that are now successfully LSRed. I
have benchmarked those and found very minimal change.
Fixes <rdar://problem/13965777>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207271 91177308-0d34-0410-b5e6-96231b3b80d8
buildbot - do not insert debug intrinsics before phi nodes.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207269 91177308-0d34-0410-b5e6-96231b3b80d8
This should reduce the chance of memory leaks like those fixed in
r207240.
There's still some unclear ownership of DIEs happening in DwarfDebug.
Pushing unique_ptr and references through more APIs should help expose
the cases where ownership is a bit fuzzy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207263 91177308-0d34-0410-b5e6-96231b3b80d8
Makes some more cases (the unit tests, specifically), lexically
compatible with a change to unique_ptr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207261 91177308-0d34-0410-b5e6-96231b3b80d8
Since this doesn't return ownership (the DIE has been added to the
specified parent already) nor return null, just return by reference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207259 91177308-0d34-0410-b5e6-96231b3b80d8
Move a lot of the loop-related logic that was sprinkled around the code
into `LoopData`.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207258 91177308-0d34-0410-b5e6-96231b3b80d8
This'll make changing to unique_ptr ownership of DIEs easier since the
usages will now have '*' on them making them textually compatible
between unique_ptr and raw pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207253 91177308-0d34-0410-b5e6-96231b3b80d8
It's fishy to be changing the `std::vector<>` owned by the iterator, and
no one actual does it, so I'm going to remove the ability in a
subsequent commit. First, update the users.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207252 91177308-0d34-0410-b5e6-96231b3b80d8
This intrinsic is no longer needed with the new @llvm.arm.hint(i32) intrinsic
which provides a generic, extensible manner for adding hint instructions. This
functionality can now be represented as @llvm.arm.hint(i32 5).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207246 91177308-0d34-0410-b5e6-96231b3b80d8
override the default cold threshold.
When we use command line argument to set the inline threshold, the default
cold threshold will not be used. This is in line with how we use
OptSizeThreshold. When we want a higher threshold for all functions, we
do not have to set both inline threshold and cold threshold.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207245 91177308-0d34-0410-b5e6-96231b3b80d8
Introduce the llvm.arm.hint(i32) intrinsic that can be used to inject hints into
the instruction stream. This is particularly useful for generating IR from a
compiler where the user may inject an intrinsic (e.g. __yield). These are then
pattern substituted into the correct instruction which already existed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207242 91177308-0d34-0410-b5e6-96231b3b80d8
Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location
of the dbg.value. This gets rid of tons of redundant variable DIEs in
subscopes.
rdar://problem/14874886, rdar://problem/16679936
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207236 91177308-0d34-0410-b5e6-96231b3b80d8
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207235 91177308-0d34-0410-b5e6-96231b3b80d8
There's no need for local symbols to go through the GOT, in fact it seems GNU ld is not even emitting GOT entries for local symbols and will error out when trying to resolve a GOT relocation for a local symbol.
This bug triggers when bootstrapping clang on AArch64 Linux with -fPIC and the ARM64 backend. The AArch64 backend is not affected.
With this commit it's now possible to bootstrap clang on AArch64 Linux with the ARM64 backend (-fPIC, -O3).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207226 91177308-0d34-0410-b5e6-96231b3b80d8
SCCMap to test for nodes that have been re-added to the root SCC rather
than a set vector. We already have done the SCCMap lookup, we juts need
to test it in two different ways. In turn, do most of the processing of
these nodes as they go into the root SCC rather than lazily. This
simplifies the final loop to just stitch the root SCC into its
children's parent sets. No functionlatiy changed.
However, this makes a few things painfully obvious, which was my intent.
=] There is tons of repeated code introduced here and elsewhere. I'm
splitting the refactoring of that code into helpers from this change so
its clear that this is the change which switches the datastructures used
around, and the other is a pure factoring & deduplication of code
change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207217 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is a supplement of implementing predicate of FP, enabling aarch64 backend
no-fp tests on arm64 target for verification. During this, one bug is exposed and
fixed by this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207215 91177308-0d34-0410-b5e6-96231b3b80d8
remove the nodes in the SCC from the SCC map entirely prior to the DFS
walk. This allows the SCC map to represent both the state of
not-yet-re-added-to-an-SCC and added-back-to-this-SCC independently. The
first is being missing from the SCC map, the second is mapping back to
'this'. In a subsequent commit, I'm going to use this property to
simplify the new node list for this SCC.
In theory, I think this also makes the contract for orphaning a node
from the graph slightly less confusing. Now it is also orphaned from the
SCC graph. Still, this isn't quite right either, and so I'm not adding
test cases here. I'll add test cases for the behavior of orphaning nodes
when the code *actually* supports it. The change here is mostly
incidental, my goal is simplifying the algorithm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207213 91177308-0d34-0410-b5e6-96231b3b80d8
child from the worklist, wait until we actually need to pop another
element off of the worklist and skip over any that were already visited
by the DFS. This also enables swapping the nodes of the SCC into the
worklist. No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207212 91177308-0d34-0410-b5e6-96231b3b80d8
thing, just mucking up the code. I feel bad that I even wrote this loop.
Very sorry. The diff is huge because of the indent change, but I promise
all this is doing is realizing that the outer two loops were actually
the exact same loops, and we didn't need two of them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207202 91177308-0d34-0410-b5e6-96231b3b80d8
factored into a more reasonable form, replace the tail call with
a simple outer-loop continuation. It's sad that C++ makes this so
awkward to write, but it seems more direct and clear than the tail call
at this point.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207201 91177308-0d34-0410-b5e6-96231b3b80d8
Change the object streamer selection to a switch from a series of if conditions.
Rather than defaulting to ELF, require that an ELF format is requested. The
Windows/!ELF is maintained as MachO would have been selected first and will
still provide a MachO format. Add an assertion that if COFF is requested that
the target platform is Windows as only WinCOFF object emission is currently
supported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207200 91177308-0d34-0410-b5e6-96231b3b80d8
Remove the concepts of "forward" and "general" mass distributions, which
was wrong. The split might have made sense in an early version of the
algorithm, but it's definitely wrong now.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207195 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than scaling loop headers and then scaling all the loop members
by the header frequency, scale `LoopData::Scale` itself, and scale the
loop members by it. It's much more obvious what's going on this way,
and doesn't cost any extra multiplies.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207189 91177308-0d34-0410-b5e6-96231b3b80d8
Make `getPackagedNode()` a member function of
`BlockFrequencyInfoImplBase` so that it's available for templated code.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207183 91177308-0d34-0410-b5e6-96231b3b80d8
As pointed out by David Blaikie in code review, a `std::list<T>` is
simpler than a `std::vector<std::unique_ptr<T>>`. Another option is a
`std::deque<T>` (which allocates in chunks), but I'd like to leave open
the option of inserting in the middle of the sequence for handling
irreducible control flow on the fly.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207177 91177308-0d34-0410-b5e6-96231b3b80d8
When fixing the symbols in each compressed section we were iterating
over all symbols for each compressed section. In extreme cases this
could snowball severely (5min uncompressed -> 35min compressed) due to
iterating over all symbols for each compressed section (large numbers of
compressed sections can be generated by DWARF type units).
To address this, build a map of the symbols in each section ahead of
time, and access that map if a section is being compressed. This brings
compile time for the aforementioned example down to ~6 minutes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207167 91177308-0d34-0410-b5e6-96231b3b80d8
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207165 91177308-0d34-0410-b5e6-96231b3b80d8
condition into an obviously infinite loop with an assert about the
degenerate condition. No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207147 91177308-0d34-0410-b5e6-96231b3b80d8
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur. It also comes with convervative IR
verification rules that ensure that tail call optimization is possible.
Reviewers: nicholas
Differential Revision: http://llvm-reviews.chandlerc.com/D3240
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207143 91177308-0d34-0410-b5e6-96231b3b80d8
of the dbg.value. This gets rid of tons of redundant variable DIEs in
subscopes.
rdar://problem/14874886, rdar://problem/16679936
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207135 91177308-0d34-0410-b5e6-96231b3b80d8
rather than by adding an overload and hoping that it's declared before the code
that calls it. (In a modules build, it isn't.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207133 91177308-0d34-0410-b5e6-96231b3b80d8
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine-intrinsics testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207130 91177308-0d34-0410-b5e6-96231b3b80d8
This patch:
- Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and
'int_x86_rdtscp') as GCCBuiltin intrinsics;
- Teaches the backend how to lower the two new builtins;
- Introduces a common function to lower READCYCLECOUNTER dag nodes
and the two new rdtsc/rdtscp intrinsics;
- Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll'
correctly verifies that both READCYCLECOUNTER and the two new intrinsics
work fine for both 64bit and 32bit Subtargets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207127 91177308-0d34-0410-b5e6-96231b3b80d8
I discovered this const-hole while attempting to coalesnce the Symbol
and SymbolMap data structures. There's some pending issues with that,
but I figured this change was easy to flush early.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207124 91177308-0d34-0410-b5e6-96231b3b80d8
Leak identified by LSan and reported by Kostya Serebryany.
Let's get a bit experimental here... in theory our minimum compiler
versions support unordered_map.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207118 91177308-0d34-0410-b5e6-96231b3b80d8
This matches ARM64 behaviour, which I think is clearer. It also puts all the
churn from that difference into one easily ignored commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207116 91177308-0d34-0410-b5e6-96231b3b80d8
These can have different relocations in ELF. In particular both:
b.eq global
ldr x0, global
are valid, giving different relocations. The only possible way to distinguish
them is via a different fixup, so the operands had to be separated throughout
the backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207105 91177308-0d34-0410-b5e6-96231b3b80d8
ARM64 was not producing pure BFI instructions for bitfield insertion
operations, unlike AArch64. The approach had to be a little different (in
ISelDAGToDAG rather than ISelLowering), and the outcomes aren't identical but
hopefully this gives it similar power.
This should address PR19424.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207102 91177308-0d34-0410-b5e6-96231b3b80d8
algorithm here: http://dl.acm.org/citation.cfm?id=177301.
The idea of isolating the roots has even more relevance when using the
stack not just to implement the DFS but also to implement the recursive
step. Because we use it for the recursive step, to isolate the roots we
need to maintain two stacks: one for our recursive DFS walk, and another
of the nodes that have been walked. The nice thing is that the latter
will be half the size. It also fixes a complete hack where we scanned
backwards over the stack to find the next potential-root to continue
processing. Now that is always the top of the DFS stack.
While this is a really nice improvement already (IMO) it further opens
the door for two important simplifications:
1) De-duplicating some of the code across the two different walks. I've
actually made the duplication a bit worse in some senses with this
patch because the two are starting to converge.
2) Dramatically simplifying the loop structures of both walks.
I wanted to do those separately as they'll be essentially *just* CFG
restructuring. This patch on the other hand actually uses different
datastructures to implement the algorithm itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207098 91177308-0d34-0410-b5e6-96231b3b80d8
applied prior to pushing a node onto the DFSStack. This is the first
step toward avoiding the stack entirely for leaf nodes. It also
simplifies things a bit and I think is pointing the way toward factoring
some more of the shared logic out of the two implementations.
It is also making it more obvious how to restructure the loops
themselves to be a bit easier to read (although no different in terms of
functionality).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207095 91177308-0d34-0410-b5e6-96231b3b80d8
a SmallPtrSet. Currently, there is no need for stable iteration in this
dimension, and I now thing there won't need to be going forward.
If this is ever re-introduced in any form, it needs to not be
a SetVector based solution because removal cannot be linear. There will
be many SCCs with large numbers of parents. When encountering these, the
incremental SCC update for intra-SCC edge removal was quadratic due to
linear removal (kind of).
I'm really hoping we can avoid having an ordering property here at all
though...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207091 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to compile
return (mask & 0x8 ? a : b);
into
testb $8, %dil
cmovnel %edx, %esi
instead of
andl $8, %edi
shrl $3, %edi
cmovnel %edx, %esi
which we formed previously because dag combiner canonicalizes setcc of and into shift.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207088 91177308-0d34-0410-b5e6-96231b3b80d8
Added support for bytes replication feature, so it could be GAS compatible.
E.g. instructions below:
"vmov.i32 d0, 0xffffffff"
"vmvn.i32 d0, 0xabababab"
"vmov.i32 d0, 0xabababab"
"vmov.i16 d0, 0xabab"
are incorrect, but we could deal with such cases.
For first one we should emit:
"vmov.i8 d0, 0xff"
For second one ("vmvn"):
"vmov.i8 d0, 0x54"
For last two instructions it should emit:
"vmov.i8 d0, 0xab"
P.S.: In ARMAsmParser.cpp I have also fixed few nearby style issues in old code.
Just for keeping method bodies in harmony with themselves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207080 91177308-0d34-0410-b5e6-96231b3b80d8
This excludes avx512 as I don't have hardware to verify. It excludes _dq
variants because they are represented in the IR as <{2,4} x i64> when it's
actually a byte shift of the entire i{128,265}.
This also excludes _dq_bs as they aren't at all supported by the backend.
There are also no corresponding instructions in the ISA. I have no idea why
they exist...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207058 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Since the upper 64 bits of the destination register are undefined when
performing this operation, we can substitute it and let the optimizer
figure out that only a copy is needed.
Also added range merging, if an instruction copies a range that can be
merged with a previous copied range.
Added test cases for both optimizations.
Reviewers: grosbach, nadav
CC: llvm-commits
Differential Revision: http://reviews.llvm.org/D3357
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207055 91177308-0d34-0410-b5e6-96231b3b80d8
than functions. So far, this access pattern is *much* more common. It
seems likely that any user of this interface is going to have nodes at
the point that they are querying the SCCs.
No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207045 91177308-0d34-0410-b5e6-96231b3b80d8
values rather than an expensive dense map query to test whether children
have already been popped into an SCC. This matches the incremental SCC
building code. I've also included the assert that I put there but
updated both of their text.
No functionality changed here.
I still don't have any great ideas for sharing the code between the two
implementations, but I may try a brute-force approach to factoring it at
some point.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207042 91177308-0d34-0410-b5e6-96231b3b80d8
GCOV provides an option to prepend output file names with the source
file name, to disambiguate between covered data that's included from
multiple sources. Add a flag to llvm-cov that does the same.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207035 91177308-0d34-0410-b5e6-96231b3b80d8
Emit the flag to indicate to the assembler that a section contains data if there
is pre-populated data present.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207028 91177308-0d34-0410-b5e6-96231b3b80d8
There's only ever one address pool, not one per DWARF output file, so
let's just have one.
(similar refactoring of the string pool to come soon)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207026 91177308-0d34-0410-b5e6-96231b3b80d8
ANDS does not use the same encoding scheme as other xxxS instructions (e.g.,
ADDS). Take that into account to avoid wrong peephole optimization.
<rdar://problem/16693089>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207020 91177308-0d34-0410-b5e6-96231b3b80d8
Some of these types (DwarfDebug in particular) are quite large to begin
with (and I keep forgetting whether DwarfFile is in DwarfDebug or
DwarfUnit... ) so having a few smaller files seems like goodness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207010 91177308-0d34-0410-b5e6-96231b3b80d8
We're currently copying CounterData from InstrProfWriter into the
OnDiskHashTable, even though we don't need to, and then carelessly
leaking those copies. A const pointer is much better here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207009 91177308-0d34-0410-b5e6-96231b3b80d8
Don't replace shifts greater than the type with the maximum shift.
This isn't hit anywhere in the tests, and somewhere else is replacing
these with undef.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207000 91177308-0d34-0410-b5e6-96231b3b80d8
Pass::doInitialization is supposed to return False when it did not
change the program, not when a fatal error occurs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206975 91177308-0d34-0410-b5e6-96231b3b80d8
For now it contains a single flag, SanitizeAddress, which enables
AddressSanitizer instrumentation of inline assembly.
Patch by Yuri Gorshenin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206971 91177308-0d34-0410-b5e6-96231b3b80d8
This implements the core functionality necessary to remove an edge from
the call graph and correctly update both the basic graph and the SCC
structure. As part of that it has to run a tiny (in number of nodes)
Tarjan-style DFS walk of an SCC being mutated to compute newly formed
SCCs, etc.
This is *very rough* and a WIP. I have a bunch of FIXMEs for code
cleanup that will reduce the boilerplate in this change substantially.
I also have a bunch of simplifications to various parts of both
algorithms that I want to make, but first I'd like to have a more
holistic picture. Ideally, I'd also like more testing. I'll probably add
quite a few more unit tests as I go here to cover the various different
aspects and corner cases of removing edges from the graph.
Still, this is, so far, successfully updating the SCC graph in-place
without disrupting the identity established for the existing SCCs even
when we do challenging things like delete the critical edge that made an
SCC cycle at all and have to reform things as a tree of smaller SCCs.
Getting this to work is really critical for the new pass manager as it
is going to associate significant state with the SCC instance and needs
it to be stable. That is also the motivation behind the return of the
newly formed SCCs. Eventually, I'll wire this all the way up to the
public API so that the pass manager can use it to correctly re-enqueue
newly formed SCCs into a fresh postorder traversal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206968 91177308-0d34-0410-b5e6-96231b3b80d8
up the stack finishing the exploration of each entries children before
we're finished in addition to accounting for their low-links. Added
a unittest that really hammers home the need for this with interlocking
cycles that would each appear distinct otherwise and crash or compute
the wrong result. As part of this, nuke a stale fixme and bring the rest
of the implementation still more closely in line with the original
algorithm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206966 91177308-0d34-0410-b5e6-96231b3b80d8
resisted this for too long. Just with the basic testing here I was able
to exercise the analysis in more detail and sift out both type signature
bugs in the API and a bug in the DFS numbering. All of these are fixed
here as well.
The unittests will be much more important for the mutation support where
it is necessary to craft minimal mutations and then inspect the state of
the graph. There is just no way to do that with a standard FileCheck
test. However, unittesting these kinds of analyses is really quite easy,
especially as they're designed with the new pass manager where there is
essentially no infrastructure required to rig up the core logic and
exercise it at an API level.
As a minor aside about the DFS numbering bug, the DFS numbering used in
LCG is a bit unusual. Rather than numbering from 0, we number from 1,
and use 0 as the sentinel "unvisited" state. Other implementations often
use '-1' for this, but I find it easier to deal with 0 and it shouldn't
make any real difference provided someone doesn't write silly bugs like
forgetting to actually initialize the DFS numbering. Oops. ;]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206954 91177308-0d34-0410-b5e6-96231b3b80d8
AArch64 has feature predicates for NEON, FP and CRYPTO instructions.
This allows the compiler to generate code without using FP, NEON
or CRYPTO instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206949 91177308-0d34-0410-b5e6-96231b3b80d8
into a helper function. I plan to re-use it for doing incremental
DFS-based updates to the SCCs when we mutate the call graph.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206948 91177308-0d34-0410-b5e6-96231b3b80d8
the Callee list. This is going to be quite important to prevent removal
from going quadratic. No functionality changed at this point, this is
one of the refactoring patches I've broken out of my initial work toward
mutation updates of the call graph.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206938 91177308-0d34-0410-b5e6-96231b3b80d8
This prompted me to push references through most of DwarfDebug. Sorry
for the churn.
Honestly it's a bit silly that we're passing around units all over the
place like that anyway and I think it's mostly due to the DIE attribute
adding utility functions being utilities in DwarfUnit. I should have
another go at moving them out of DwarfUnit...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206925 91177308-0d34-0410-b5e6-96231b3b80d8
from places like MCCodeEmitter() in the MC backend when the
MCContext is const.
I was going to use this in my change for r206669 but Jim convinced
me to use an assert there. But this still is a good tweak.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206923 91177308-0d34-0410-b5e6-96231b3b80d8
So Chandler - how about those range algorithms? (would really love a
dereferencing range adapter for this sort of stuff)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206921 91177308-0d34-0410-b5e6-96231b3b80d8
In the case where the constant comes from a cloned cast instruction, the
materialization code has to go before the cloned cast instruction.
This commit fixes the method that finds the materialization insertion point
by making it aware of this case.
This fixes <rdar://problem/15532441>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206913 91177308-0d34-0410-b5e6-96231b3b80d8
diagnostic that includes location information.
Currently if one has this assembly:
.quad (0x1234 + (4 * SOME_VALUE))
where SOME_VALUE is undefined ones gets the less than
useful error message with no location information:
% clang -c x.s
clang -cc1as: fatal error: error in backend: expected relocatable expression
With this fix one now gets a more useful error message
with location information:
% clang -c x.s
x.s:5:8: error: expected relocatable expression
.quad (0x1234 + (4 * SOME_VALUE))
^
To do this I plumbed the SMLoc through the MCObjectStreamer
EmitValue() and EmitValueImpl() interfaces so it could be used
when creating the MCFixup.
rdar://12391022
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206906 91177308-0d34-0410-b5e6-96231b3b80d8
Evidently tablegen doesn't infer this from the HasBMI2 predicate on the BZHI
instructions. This should fix the recent bot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206885 91177308-0d34-0410-b5e6-96231b3b80d8
The point of these calls is to allow Thumb-1 code to make use of the VFP unit
to perform its operations. This is not desirable with -msoft-float, since most
of the reasons you'd want that apply equally to the runtime library.
rdar://problem/13766161
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206874 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r206780.
This commit was regressing gdb.opt/inline-locals.exp in the GDB 7.5 test
suite. Reverting until I can fix the issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206867 91177308-0d34-0410-b5e6-96231b3b80d8
Don't introduce new operations on an illegal sub 32-bit type.
Do the operations on a 32-bit value, and then use a truncating store.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206864 91177308-0d34-0410-b5e6-96231b3b80d8
The branch that skips irreducible backedges was only active when
propagating mass at the top-level. In particular, when propagating mass
through a loop recognized by `LoopInfo` with irreducible control flow
inside, irreducible backedges would not be skipped.
Not sure where that idea came from, but the result was that mass was
lost until after loop exit. Added a testcase that covers this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206860 91177308-0d34-0410-b5e6-96231b3b80d8
Store pointers directly to loops inside the nodes. This could have been
done without changing the type stored in `std::vector<>`. However,
rather than computing the number of loops before constructing them
(which `LoopInfo` doesn't provide directly), I've switched to a
`vector<unique_ptr<LoopData>>`.
This adds some heap overhead, but the number of loops is typically
small.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206857 91177308-0d34-0410-b5e6-96231b3b80d8
This was implicitly with copy assignment before, which fails to actually
clear `std::vector<>`'s heap storage. Move assignment would work, but
since MSVC can't imply those anyway, explicitly `clear()`-ing members
makes more sense.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206856 91177308-0d34-0410-b5e6-96231b3b80d8
definition below all the header #include lines. This updates most of the
miscellaneous other lib/... directories. A few left though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206845 91177308-0d34-0410-b5e6-96231b3b80d8
definition below all of the header #include lines, lib/Transforms/...
edition.
This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.
Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206844 91177308-0d34-0410-b5e6-96231b3b80d8
definition below all the header #include lines, lib/Analysis/...
edition.
This one has a bit extra as there were *other* #define's before #include
lines in addition to DEBUG_TYPE. I've sunk all of them as a block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206843 91177308-0d34-0410-b5e6-96231b3b80d8
of a '.inc' file before including actual headers. In this case we had
both duplicated a header's include and were including a standard header.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206840 91177308-0d34-0410-b5e6-96231b3b80d8
system headers above the includes of generated '.inc' files that
actually contain code. In a few targets this was already done pretty
consistently, but it wasn't done *really* consistently anywhere. It is
strictly cleaner IMO and necessary in a bunch of places where the
DEBUG_TYPE is referenced from the generated code. Consistency with the
necessary places trumps. Hopefully the build bots are OK with the
movement of intrin.h...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206838 91177308-0d34-0410-b5e6-96231b3b80d8
define below all header includes in the lib/CodeGen/... tree. While the
current modules implementation doesn't check for this kind of ODR
violation yet, it is likely to grow support for it in the future. It
also removes one layer of macro pollution across all the included
headers.
Other sub-trees will follow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206837 91177308-0d34-0410-b5e6-96231b3b80d8
while checking candidate for bit field extract.
Otherwise the value may not fit in uint64_t and this will trigger an
assertion.
This fixes PR19503.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206834 91177308-0d34-0410-b5e6-96231b3b80d8
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.
This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:
- Header files that need to provide a DEBUG_TYPE for some inline code
can do so by defining the macro before their inline code and undef-ing
it afterward so the macro does not escape.
- We no longer have rampant ODR violations due to including headers with
different DEBUG_TYPE definitions. This may be mostly an academic
violation today, but with modules these types of violations are easy
to check for and potentially very relevant.
Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.
The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206822 91177308-0d34-0410-b5e6-96231b3b80d8
The comment claimed that the register class information wasn't available
in the assembly parser, but that's not really true. It's just annoying to
get to. Replace the helper functions with references to the auto-generated
information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206802 91177308-0d34-0410-b5e6-96231b3b80d8
With a constant mask a vpermil* is just a shufflevector. This patch implements
that simplification. This allows us to produce denser code. It should also
allow more folding down the line.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206801 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure only general purpose registers are valid for offset regs and
that 32-bit regs are only valid for sxtw and uxtw extends.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206799 91177308-0d34-0410-b5e6-96231b3b80d8
The canonical form for the extended addressing mode (e.g.,
"[x1, w2, uxtw #3]" is for the MCInst to have the second register be the
full 64-bit GPR64 register class. The instruction printer cleans up
the output for display to show the 32-bit register instead, per the
specification.
This simplifies 205893 now that the aliasing is handled in the printer
in 206495 so that the codegen path and the disassembler path give the
same MCInst form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206797 91177308-0d34-0410-b5e6-96231b3b80d8
The rationale for this artificial dependency seems to have been lost to the
ravages of time, it is covered by no regression tests, and has no impact on
test-suite performance numbers on either x86 or PPC.
For the test suite, on both x86 and PPC, I ran the test suite 10 times (both as
a baseline and with this change), and found no statistically-significant
changes. For PPC, I used a P7 box. For x86, I used an Intel Xeon E5430. Both
with -O3 -mcpu=native.
This was discussed on-list back in January, but I've not had a chance to run
the performance tests until today.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206795 91177308-0d34-0410-b5e6-96231b3b80d8
This avoids copying the container by simply deleting until empty.
While I'd rather move to a stricter ownership semantic (unique_ptr),
SmallPtrSet can't cope with unique_ptr and the ownership semantics here
are a bit incestuous (Module sort of owns itself, but sort of doesn't
(if the LLVMContext is destroyed before the Module, then it deregisters
itself from the context... )).
Ideally Modules would be given to the context, or possibly an
emplace-like function to construct them there. Modules then shouldn't be
destroyed by LLVM API clients, but by interacting with the owner
(LLVMContext) directly (but even then, passing a Module* to LLVMContext
doesn't provide an easy way to destroy the Module, since the set would
be over unique_ptrs and you'd need a heterogenous lookup function which
SmallPtrSet doesn't have either).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206794 91177308-0d34-0410-b5e6-96231b3b80d8
With this MC is able to handle _GLOBAL_OFFSET_TABLE_ in 64 bit mode, which is
needed for medium and large code models.
This fixes pr19470.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206793 91177308-0d34-0410-b5e6-96231b3b80d8
The -tailcallelim pass should be checking if byval or inalloca args can
be captured before marking calls as tail calls. This was the real root
cause of PR7272.
With a better fix in place, revert the inliner change from r105255. The
test case it introduced still passes and has been moved to
test/Transforms/Inline/byval-tail-call.ll.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3403
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206789 91177308-0d34-0410-b5e6-96231b3b80d8
Requires switching some vectors to lists to maintain pointer validity.
These could be changed to forward_lists (singly linked) with a bit more
work - I've left comments to that effect.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206780 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The INSERTPS pattern fragment was called insrtps (mising 'e'), which
would make it harder to grep for the patterns related to this instruction.
Renaming it to use the proper instruction name.
Reviewers: nadav
CC: llvm-commits
Differential Revision: http://reviews.llvm.org/D3443
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206779 91177308-0d34-0410-b5e6-96231b3b80d8
header files and into the cpp files.
These files will require more touches as the header files actually use
DEBUG(). Eventually, I'll have to introduce a matched #define and #undef
of DEBUG_TYPE for the header files, but that comes as step N of many to
clean all of this up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206777 91177308-0d34-0410-b5e6-96231b3b80d8
various .cpp files. This macro is inherently non-modular, and it wasn't
even needed in this header file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206775 91177308-0d34-0410-b5e6-96231b3b80d8
Change `PositiveFloat` to `UnsignedFloat`, and fix some of the comments
to indicate that it's disappearing eventually.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206771 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r206707, reapplying r206704. The preceding commit
to CalcSpillWeights should have sorted out the failing buildbots.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206766 91177308-0d34-0410-b5e6-96231b3b80d8
It can be reverted a few days later, after X86Disassembler.d is updated not to contain "X86Disassembler.c".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206758 91177308-0d34-0410-b5e6-96231b3b80d8
We normally don't drop functions from the C API's, but in this case I think we
can:
* The old implementation of getFileOffset was fairly broken
* The introduction of LLVMGetSymbolFileOffset was itself a C api breaking
change as it removed LLVMGetSymbolOffset.
* It is an incredibly specialized use case. The only reason MCJIT needs it is
because of its odd position of being a dynamic linker of .o files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206750 91177308-0d34-0410-b5e6-96231b3b80d8