O(N*log(N)). The idea is to introduce a total ordering over the set of functions.
That allows us to build a binary tree and perform function look-up in O(log(N)) time.
What this patch does: it introduces a total ordering among Type instances,
effectively an upgrade of the existing isEquivalentType. The comparison works
as follows:
0. Coerce pointers in address space 0 to integers.
1. If the left and right types are equal (the same Type* value), return 0
(meaning equal).
2. If the types are of different kinds (different type IDs), return the result
of comparing the type IDs as numbers.
3. If the types are vectors or integers, return the result of comparing their
pointers (cast to numbers); these types are uniqued, so pointer identity
implies type identity.
4. Check whether the type ID belongs to the following group:
* Void
* Float
* Double
* X86_FP80
* FP128
* PPC_FP128
* Label
* Metadata
If so, return 0.
5. If the left and right types are pointers, return the result of comparing
their address spaces (as numbers).
6. If the types are complex (e.g. structs, arrays, functions), expand both
LEFT and RIGHT and compare their element types the same way. If we get
Res != 0 at some stage, return it; otherwise return 0.
7. For all other cases, use llvm_unreachable.
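For illustration, a minimal C++ sketch of such a comparison (hypothetical
names; the committed code lives in the pass being modified and differs in
detail):

  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/IR/Type.h"
  #include "llvm/Support/ErrorHandling.h"
  #include <cstdint>

  static int cmpNumbers(uint64_t L, uint64_t R) {
    return L < R ? -1 : (L == R ? 0 : 1);
  }

  static int cmpTypes(llvm::Type *L, llvm::Type *R) {
    if (L == R)
      return 0;                                          // rule 1
    if (L->getTypeID() != R->getTypeID())
      return cmpNumbers(L->getTypeID(), R->getTypeID()); // rule 2
    switch (L->getTypeID()) {
    case llvm::Type::IntegerTyID:                        // rule 3 (vectors
      // are handled similarly): these types are uniqued, so pointer values
      // give an arbitrary but consistent total order within a context.
      return cmpNumbers((uintptr_t)L, (uintptr_t)R);
    case llvm::Type::VoidTyID:                           // rule 4
    case llvm::Type::FloatTyID:
    case llvm::Type::DoubleTyID:
    case llvm::Type::X86_FP80TyID:
    case llvm::Type::FP128TyID:
    case llvm::Type::PPC_FP128TyID:
    case llvm::Type::LabelTyID:
    case llvm::Type::MetadataTyID:
      return 0;
    case llvm::Type::PointerTyID:                        // rule 5
      return cmpNumbers(L->getPointerAddressSpace(),
                        R->getPointerAddressSpace());
    case llvm::Type::StructTyID: {                       // rule 6: recurse
      auto *SL = llvm::cast<llvm::StructType>(L);
      auto *SR = llvm::cast<llvm::StructType>(R);
      if (int Res = cmpNumbers(SL->getNumElements(), SR->getNumElements()))
        return Res;
      for (unsigned I = 0, E = SL->getNumElements(); I != E; ++I)
        if (int Res = cmpTypes(SL->getElementType(I), SR->getElementType(I)))
          return Res;
      return 0;
    }
    default:
      llvm_unreachable("unhandled type kind");           // rule 7
    }
  }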
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203788 91177308-0d34-0410-b5e6-96231b3b80d8
convenient it is to imagine a world where this works, that is not C++, as
was pointed out in review. The standard even goes to some lengths to
preclude any attempt at this, for better or worse. Maybe better. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203775 91177308-0d34-0410-b5e6-96231b3b80d8
Only one instruction pair needed changing: SMULH & UMULH. The previous
code worked, but MC was doing extra work treating Ra as a valid
operand (which then got completely overwritten in MCCodeEmitter).
No behaviour change, so no tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203772 91177308-0d34-0410-b5e6-96231b3b80d8
VSX is an ISA extension supported on the POWER7 and later cores that enhances
floating-point vector and scalar capabilities. Among other things, this adds
<2 x double> support and generally helps to reduce register pressure.
The interesting part of this ISA feature is the register configuration: there
are 64 new 128-bit vector registers, the 32 of which are super-registers of the
existing 32 scalar floating-point registers, and the second 32 of which overlap
with the 32 Altivec vector registers. This makes things like vector insertion
and extraction tricky: this can be free but only if we force a restriction to
the right register subclass when needed. A new "minipass" PPCVSXCopy takes care
of this (although it could do a more-optimal job of it; see the comment about
unnecessary copies below).
Please note that, currently, VSX is not enabled by default when targeting
anything because it is not yet ready for that. The assembler and disassembler
are fully implemented and tested. However:
- CodeGen support causes miscompiles; test-suite runtime failures:
MultiSource/Benchmarks/FreeBench/distray/distray
MultiSource/Benchmarks/McCat/08-main/main
MultiSource/Benchmarks/Olden/voronoi/voronoi
MultiSource/Benchmarks/mafft/pairlocalalign
MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
SingleSource/Benchmarks/CoyoteBench/almabench
SingleSource/Benchmarks/Misc/matmul_f64_4x4
- The lowering currently falls back to using Altivec instructions far more
than it should. Worse, there are some things that are scalarized through the
stack that shouldn't be.
- A lot of unnecessary copies make it past the optimizers, and this needs to
be fixed.
- Many more regression tests are needed.
Normally, I'd fix these things prior to committing, but there are some
students and other contributors who would like to work on this, and so it makes
sense to move this development process upstream where it can be subject to the
regular code-review procedures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203768 91177308-0d34-0410-b5e6-96231b3b80d8
There are currently two schemes for mapping instruction operands to
instruction-format variables for generating the instruction encoders and
decoders for the assembler and disassembler respectively: a) to map by name and
b) to map by position.
In the long run, we'd like to remove the position-based scheme and use only
name-based mapping. Unfortunately, the name-based scheme currently cannot deal
with complex operands (those with suboperands), and so we currently must use
the position-based scheme for those. On the other hand, the position-based
scheme cannot deal with (register) variables that are split into multiple
ranges. An upcoming commit to the PowerPC backend (adding VSX support) will
require this capability. While we could teach the position-based scheme to
handle that, since we'd like to move away from the position-based mapping
generally, it seems silly to teach it new tricks now. What makes more sense is
to allow for partial transitioning: use the name-based mapping when possible,
and only use the position-based scheme when necessary.
Now the problem is that mixing the two sensibly was not possible: the
position-based mapping would map based on position, but would not skip those
variables that were mapped by name. Instead, the two sets of assignments would
overlap. However, I cannot simply change the current behavior, because there
are some backends that rely on it [mistakenly, I think, but I'll send a message
to llvmdev about that]. So I've added a new TableGen bit variable,
noNamedPositionallyEncodedOperands, which can be used to cause the
position-based mapping to skip variables mapped by name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203767 91177308-0d34-0410-b5e6-96231b3b80d8
Support was added to the IAS to actually parse and handle complex SO
(shifter-operand) expressions. However, the object file lowering was not
updated to compensate for the fact that the shift operand may be an absolute
expression. When trying to assemble to an object file, the lowering would fail
while succeeding when emitting pure assembly. Add an appropriate test.
The test case is inspired by the test case provided by Jiangning Liu who also
brought the issue to light.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203762 91177308-0d34-0410-b5e6-96231b3b80d8
Add the Windows COFF ARM object file magic. This enables the LLVM tools to
interact with COFF object files for Windows on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203761 91177308-0d34-0410-b5e6-96231b3b80d8
for use with C++11 range-based for-loops.
The gist of phase 1 is to remove the skipInstruction() and skipBundle()
methods from these iterators, instead splitting each iterator into a version
that walks operands, a version that walks instructions, and a version that
walks bundles. This has the result of making some "clever" loops in lib/CodeGen
more verbose, but also makes their iterator invalidation characteristics much
more obvious to the casual reader. (Making them concise again in the future is a
good motivating case for a pre-incrementing range adapter!)
Phase 2 of this undertaking will consist of removing the getOperand() method,
and changing operator*() of the operand-walker to return a MachineOperand&. At
that point, it should be possible to add range views for them that work as one
might expect.
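As a hedged illustration of where this is headed, a phase-2-style operand walk
might read as follows (the range view is hypothetical at this point):

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/CodeGen/MachineInstr.h"

  // Collect the registers an instruction defines, using a range view over
  // operands rather than a skipInstruction()-style iterator.
  void collectDefs(llvm::MachineInstr &MI,
                   llvm::SmallVectorImpl<unsigned> &Defs) {
    for (llvm::MachineOperand &MO : MI.operands()) // hypothetical range view
      if (MO.isReg() && MO.isDef())
        Defs.push_back(MO.getReg());
  }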
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203757 91177308-0d34-0410-b5e6-96231b3b80d8
This makes the mapping consistent with other CU->X mappings in the
MCContext, helping pave the way to refactor all these values into a
single data structure per CU and thus a single map.
I haven't renamed the data structure as that would make the patch churn
even higher (the MCLineSection name no longer makes sense, as this
structure now contains lines for multiple sections covered by a single
CU, rather than lines for a single section in multiple CUs) and further
refactorings will follow that may remove this type entirely.
For convenience, I also gave the MCLineSection value semantics so we
didn't have to do the lazy construction, manual delete, etc.
(& for those playing at home, refactoring the line printing into a
single data structure will eventually allow that data structure to be
reused to own the debug_line.dwo line table used for type unit file name
resolution)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203726 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler voiced some concern with checking this in without some
discussion first. Reverting for now.
This reverts r203703, r203704, r203708, and r203709.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203723 91177308-0d34-0410-b5e6-96231b3b80d8
Extend what's currently done for shifts, because the HW performs this masking
implicitly:
  (rotl:i32 x, (and y, 31)) -> (rotl:i32 x, y)
I use the newly factored-out multiclass that so far supported only shifts.
For testing I extended my testcase for the new rotation idiom.
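For reference, the sort of source these patterns target (an illustrative
example, not the committed testcase):

  // The explicit '& 31' is redundant because x86 rotate instructions mask
  // the count themselves; with this pattern the 'and' is dropped.
  unsigned rotl_masked(unsigned x, unsigned y) {
    unsigned amt = y & 31;
    return (x << amt) | (x >> ((0 - amt) & 31));
  }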
<rdar://problem/15295856>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203718 91177308-0d34-0410-b5e6-96231b3b80d8
On ELF and COFF an alias is just another name for a position in the file.
There is no way to refer to a position in another file, so an alias to
undefined is meaningless.
MachO currently doesn't support aliases. The spec has an N_INDR, which when
implemented will have a different set of restrictions. Adding support for
it shouldn't be harder than any other IR extension.
For now, having the IR represent what is actually possible with current
tools makes it easier to fix the design of GlobalAlias.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203705 91177308-0d34-0410-b5e6-96231b3b80d8
This provides a library to work with the instrumentation based
profiling format that is used by clang's -fprofile-instr-* options and
by the llvm-profdata tool. This is a binary format, rather than the
textual one that's currently in use.
The tests are in the subsequent commits that use this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203703 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to generate table lookups for code such as:
  unsigned test(unsigned x) {
    switch (x) {
    case 100: return 0;
    case 101: return 1;
    case 103: return 2;
    case 105: return 3;
    case 107: return 4;
    case 109: return 5;
    case 110: return 6;
    default: return f(x);
    }
  }
Since cases 102, 104, etc. are absent and the default result f(x) is not a
constant, the lookup table has holes in those positions. We therefore guard the
table lookup with a bitmask check.
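Conceptually, the generated code behaves like this hand-written version (an
illustrative sketch, not the actual emitted IR):

  unsigned f(unsigned x); // the default handler from the example above

  unsigned test_lowered(unsigned x) {
    // Table entries for x - 100 in [0, 10]; holes hold an arbitrary value.
    static const unsigned Table[11] = {0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 6};
    // Bit i is set iff case (100 + i) exists: i = 0, 1, 3, 5, 7, 9, 10.
    const unsigned Mask = 0x6AB;
    unsigned Idx = x - 100;
    if (Idx < 11 && ((Mask >> Idx) & 1))
      return Table[Idx];
    return f(x);
  }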
Patch by Jasper Neumann!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203694 91177308-0d34-0410-b5e6-96231b3b80d8
The peephole (shift x, (and y, 31)) -> (shift x, y) is repeated for each
integer type and each shift variant.
To improve this a new multiclass is added that covers all integer types. The
shift patterns are now instantiated from this. I am planning to add new
instances for rotates as well.
No functional change intended:
* test/CodeGen/X86/shift-and.ll provides coverage
* Compared the expanded tablegen output and matched up the defs for these
Pat<>s before and after
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203685 91177308-0d34-0410-b5e6-96231b3b80d8
When printing assembly we don't have a Layout object, but we can still
try to fold some constants.
Testcase by Ulrich Weigand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203677 91177308-0d34-0410-b5e6-96231b3b80d8
There's a bit of duplicated "magic" code in opt.cpp and Clang's CodeGen that
computes the inliner threshold from opt level and size opt level.
This patch moves the code to a function that lives alongside the inliner itself,
providing a convenient overload to the inliner creation.
A separate patch can be committed to Clang to use this once it's committed to
LLVM. Standalone tools that use the inlining pass can also avoid duplicating
this code and fearing it will go out of sync.
Note: this patch also restructures the conditional logic of the computation to
be cleaner.
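For illustration, the duplicated logic being centralized looks roughly like
this (a sketch; the name and the thresholds are illustrative, not necessarily
those of the patch):

  // Map -O<n> / -Os / -Oz levels onto an inliner threshold.
  static int computeThresholdFromOptLevels(unsigned OptLevel,
                                           unsigned SizeOptLevel) {
    if (OptLevel > 2)
      return 275; // -O3: inline more aggressively
    if (SizeOptLevel == 1)
      return 75;  // -Os
    if (SizeOptLevel == 2)
      return 25;  // -Oz
    return 225;   // default
  }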
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203669 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is a white lie to workaround a widespread bug in the -mfp64
implementation.
The problem is that none of the 32-bit fpu ops mention the fact that they
clobber the upper 32-bits of the 64-bit FPR. This allows MTHC1 to be
scheduled on the wrong side of most 32-bit FPU ops, particularly MTC1.
Fixing that requires a major overhaul of the FPU implementation which can't
be done right now due to time constraints.
The testcase is SingleSource/Benchmarks/Misc/oourafft.c when given
TARGET_CFLAGS='-mips32r2 -mfp64 -mmsa'.
Also correct the comment added in r203464 to indicate that two
instructions were affected.
Reviewers: matheusalmeida, jacksprat
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3029
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203659 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Correct the match patterns and the lowerings that made the CodeGen tests pass despite the mistakes.
The original testcase that discovered the problem was SingleSource/UnitTests/SignlessType/factor.c in test-suite.
During review, we also found that some of the existing CodeGen tests were incorrect and fixed them:
* bitwise.ll: In bsel_v16i8 the IfSet/IfClear were reversed because bsel and bmnz have different operand orders and the test didn't correctly account for this. bmnz goes 'IfClear, IfSet, CondMask', while bsel goes 'CondMask, IfClear, IfSet'.
* vec.ll: In the cases where a bsel is emitted as a bmnz (they are the same operation with a different input tied to the result) the operands were in the wrong order.
* compare.ll and compare_float.ll: The bsel operand order was correct for a greater-than comparison, but a greater-than comparison instruction doesn't exist. Lowering this operation inverts the condition so the IfSet/IfClear need to be swapped to match.
The differences between BSEL, BMNZ, and BMZ and how they map to/from vselect are rather confusing. I've therefore added a note to MSA.txt to explain this in a single place in addition to the comments that explain each case.
Reviewers: matheusalmeida, jacksprat
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3028
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203657 91177308-0d34-0410-b5e6-96231b3b80d8
When the list of VFP registers to be saved was non-contiguous (so multiple
vpush/vpop instructions were needed) these were being ordered oddly, as in:
  vpush {d8, d9}
  vpush {d11}
This led to the layout in memory being [d11, d8, d9] which is ugly and doesn't
match the CFI_INSTRUCTIONs we're generating either (so Dwarf info would be
broken).
This switches the order of vpush/vpop (in both prologue and epilogue,
obviously) so that the Dwarf locations are correct again.
rdar://problem/16264856
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203655 91177308-0d34-0410-b5e6-96231b3b80d8
I could fold the callers into their one call site, but the indirection
(given how verbose choosing the section is) seemed helpful.
The use of a member function pointer is a bit "tricky", but it seems limited
enough, the call sites are simple/clean/clear, and there's only one use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203619 91177308-0d34-0410-b5e6-96231b3b80d8
Add a utility function to convert the Windows path separator to Unix-style
path separators. This is used by a subsequent change in clang to enable the use of
Windows SDK headers on Linux.
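A minimal sketch of what such a utility amounts to (hypothetical name; the
real helper lives in llvm/Support):

  #include <string>

  // Convert Windows '\\' separators to Unix-style '/'.
  std::string convertToUnixSlashes(std::string Path) {
    for (char &C : Path)
      if (C == '\\')
        C = '/';
    return Path;
  }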
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203611 91177308-0d34-0410-b5e6-96231b3b80d8
The function hasReliableSymbolDifference had exactly one use in the MachO
writer. It is also only true for X86_64. In fact, the comment refers to
"Darwin x86_64" and everything else, so this makes the code match the
comment.
If this is to be abstracted again, it should be a property of
TargetObjectWriter, like useAggressiveSymbolFolding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203605 91177308-0d34-0410-b5e6-96231b3b80d8
Before this patch, the Unix code for creating hard links was unused. The code
for creating symbolic links was implemented in lib/Support/LockFileManager.cpp
and the code for creating hard links in lib/Support/*/Path.inc.
The only use we have for these is in LockFileManager.cpp, and it can use both
soft and hard links. Just have a create_link function that creates one or the
other depending on the platform.
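Sketched interface (hedged; see llvm/Support/FileSystem.h for the real
declaration, which may use LLVM's own error_code type):

  #include "llvm/ADT/Twine.h"
  #include <system_error>

  namespace llvm { namespace sys { namespace fs {
  // Creates a hard link where the platform supports it (Unix), and a
  // soft or hard link as appropriate elsewhere; one function, the
  // platform decides.
  std::error_code create_link(const Twine &to, const Twine &from);
  } } }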
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203596 91177308-0d34-0410-b5e6-96231b3b80d8
When resolving a function call to an external routine, the dynamic
loader must patch the "nop" after the branch instruction to a load
that restores the TOC register.
Current code does that, but only with the *first* instance of a call
to any particular external routine, i.e. at the point where it also
allocates the call stub. With subsequent calls to the same routine,
current code neglects to patch in the TOC restore code. This is a
bug, and leads to corrupt TOC pointers in those cases.
Fixed by patching in restore code every time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203580 91177308-0d34-0410-b5e6-96231b3b80d8
Use the options in the ARMISelLowering to control whether tail calls are
optimised or not. Previously, this option was entirely ignored on the ARM
target and only honoured on x86.
This option is mostly useful in profiling scenarios. The default remains that
tail call optimisations will be applied.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203577 91177308-0d34-0410-b5e6-96231b3b80d8
This option is from 2010, designed to work around a linker issue on Darwin for
ARM. According to grosbach this is no longer an issue and this option can
safely be removed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203576 91177308-0d34-0410-b5e6-96231b3b80d8
Tail call optimisation was previously disabled on all targets other than
iOS5.0+. This enables the tail call optimisation on all Thumb 2 capable
platforms.
The test adjustments are to remove the IR hint "tail" from function
invocations. The tests were designed assuming that tail call optimisations
would not kick in, which no longer holds true.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203575 91177308-0d34-0410-b5e6-96231b3b80d8
After r203553, overflow intrinsics and their non-intrinsic (normal)
counterparts get hashed to the same value. This patch prevents PRE from
moving an instruction into a predecessor block, and trying to add a phi
node that gets two different types (the intrinsic result and the
non-intrinsic result), resulting in a failing assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203574 91177308-0d34-0410-b5e6-96231b3b80d8
ATOMIC_STORE operations always get here as a lowered ATOMIC_SWAP, so there's no
need for any code to handle them specially.
There should be no functionality change so no tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203567 91177308-0d34-0410-b5e6-96231b3b80d8
The syntax for "cmpxchg" should now look something like:
  cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic
where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).
rdar://problem/15996804
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203559 91177308-0d34-0410-b5e6-96231b3b80d8
When an overflow intrinsic is followed by a non-overflow instruction,
replace the latter with an extract. For example:
  %sadd = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
  %sadd3 = add i32 %a, %b
Here the add will be replaced by '%sadd3 = extractvalue { i32, i1 } %sadd, 0'.
When an overflow intrinsic follows a non-overflow instruction, a clone
of the intrinsic is inserted before the normal instruction, which makes
it the same as the previous case. Subsequent runs of GVN can then clean
up the duplicate instructions and insert the extract.
This fixes PR8817.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203553 91177308-0d34-0410-b5e6-96231b3b80d8
The official specifications state the name to be ARMNT (as per the Microsoft
Portable Executable and Common Object Format Specification v8.3).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203530 91177308-0d34-0410-b5e6-96231b3b80d8
When the MOVBE instructions are available, use them for 16-bit endian
swapping as well as for 32 and 64 bit.
The patterns were already present on the instructions, but weren't being
matched because the operation was unconditionally marked 'Expand'.
Change that to be conditional on whether the MOVBE instructions are
available. Use 'rolw' to implement the in-register version (32 and 64
bit have the dedicated 'bswap' instruction for that).
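For example (illustrative source, assuming a MOVBE-capable target such as
Intel Atom):

  #include <cstdint>

  // With MOVBE, the load+swap below becomes a single movbe instruction;
  // a swap of a value already in a register becomes 'rolw $8'.
  uint16_t load_be16(const uint16_t *p) {
    return __builtin_bswap16(*p);
  }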
Patch by Louis Gerbarg <lgg@apple.com>.
rdar://15479984
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203524 91177308-0d34-0410-b5e6-96231b3b80d8
During LTO, user-supplied definitions of C library functions often
exist. -instcombine uses Module::getOrInsertFunction() to get a handle
on library functions (e.g., @puts, when optimizing @printf).
Previously, Module::getOrInsertFunction() would rename any matching
functions with local linkage, and create a new declaration. In LTO,
this is the opposite of the desired behaviour, as it bypasses the
user-supplied version of the library function and creates a new
undefined reference which the linker often cannot resolve.
After some discussion with Rafael on the list, it looks like this is
undesired behaviour. If a consumer actually *needs* this behaviour, we
should add a new API with a more explicit name.
I added two testcases: one specifically for the -instcombine behaviour
and one for the LTO flow.
<rdar://problem/16165191>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203513 91177308-0d34-0410-b5e6-96231b3b80d8
the legalization cost must be included to get an accurate estimate of the
total cost of the scalarized vector.
The inaccurate cost triggered unprofitable SLP vectorization on
32-bit X86.
Summary:
Include legalization overhead when computing scalarization cost
Reviewers: hfinkel, nadav
CC: chandlerc, rnk, llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2992
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203509 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When the sample profiles include discriminator information,
use the discriminator values to distinguish instruction weights
in different basic blocks.
This modifies the BodySamples mapping to map <line, discriminator> pairs
to weights. Instructions on the same line but in different blocks will
use different discriminator values. This, in turn, means that the blocks
may have different weights.
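For instance, in the following (illustrative) code both stores share a source
line, and only their discriminators distinguish the two blocks:

  // 'x = 1' and 'x = 2' are on the same line but in different basic
  // blocks; <line, discriminator> pairs keep their weights separate.
  void f(bool b, int &x) { if (b) x = 1; else x = 2; }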
Other changes in this patch:
- Add tests for positive values of line offset, discriminator and samples.
- Change data types from uint32_t to unsigned and int and do additional
validation.
Reviewers: chandlerc
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2857
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203508 91177308-0d34-0410-b5e6-96231b3b80d8
Extend the error message generated by the Verifier when an intrinsic
name does not match the expected mangling to include the expected
name. Simplifies debugging.
Patch by Philip Reames!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203490 91177308-0d34-0410-b5e6-96231b3b80d8
optimize a call to an llvm intrinsic to something that involves a call to a C
library call, make sure it sets the right calling convention on the call.
e.g.
  extern double pow(double, double);
  double t(double x) {
    return pow(10, x);
  }
Compiles to something like this for AAPCS-VFP:
  define arm_aapcs_vfpcc double @t(double %x) #0 {
  entry:
    %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x)
    ret double %0
  }
  declare double @llvm.pow.f64(double, double) #1
Simplify libcall (part of instcombine) will turn the above into:
  define arm_aapcs_vfpcc double @t(double %x) #0 {
  entry:
    %__exp10 = call double @__exp10(double %x) #1
    ret double %__exp10
  }
  declare double @__exp10(double)
The pre-instcombine code works because calls to LLVM builtins are special:
instruction selection will choose the right calling convention for the call.
However, the code after instcombine is wrong: the call to __exp10 will use
the C calling convention.
I can think of 3 options to fix this.
1. Make "C" calling convention just work since the target should know what CC
is being used.
This doesn't work because each function can use different CC with the "pcs"
attribute.
2. Have Clang add the right CC keyword on the calls to LLVM builtin.
This will work but it doesn't match the LLVM IR specification which states
these are "Standard C Library Intrinsics".
3. Fix simplify libcall so the resulting calls to the C routines will have the
proper CC keyword. e.g.
%__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1
This works and is the solution I implemented here.
Both solutions #2 and #3 would work. After carefully considering the pros and
cons, I decided to implement #3 for the following reasons.
1. It doesn't change the "spec" of the intrinsics.
2. It's a self-contained fix.
There are a couple of potential downsides.
1. There could be other places in the optimizer that are broken in the same
way and are not addressed by this.
2. There could be other calling conventions that need to be propagated by
simplify-libcall that aren't handled.
But for now, this is the fix that I'm most comfortable with.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203488 91177308-0d34-0410-b5e6-96231b3b80d8
NVPTX, like the other backends, relies on generic symbol name sanitizing done by
MCSymbol. However, the ptxas assembler is more stringent and disallows some
additional characters in symbol names.
See PR19099 for more details.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203483 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is a white lie to workaround a widespread bug in the -mfp64
implementation.
The problem is that none of the 32-bit fpu ops mention the fact that they
clobber the upper 32-bits of the 64-bit FPR. This allows MFHC1 to be
scheduled on the wrong side of most 32-bit FPU ops. Fixing that requires a
major overhaul of the FPU implementation which can't be done right now due to
time constraints.
MFHC1 is one of two affected instructions. These instructions are the only
FPU instructions that don't read or write the lower 32-bits. We therefore
pretend that it reads the bottom 32-bits to artificially create a dependency and
prevent the scheduler changing the behaviour of the code.
The other instruction is MTHC1, which will be fixed once I've found a failing
test case for it.
The testcase is test-suite/SingleSource/UnitTests/Vector/simple.c when
given TARGET_CFLAGS="-mips32r2 -mfp64 -mmsa".
Reviewers: jacksprat, matheusalmeida
Reviewed By: jacksprat
Differential Revision: http://llvm-reviews.chandlerc.com/D2966
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203464 91177308-0d34-0410-b5e6-96231b3b80d8
The function was making too many assumptions about its input:
1. The NEON_VDUP optimisation was far too aggressive, assuming (I
think) that the input would always be a BUILD_VECTOR.
2. We were treating most unknown concats as legal (by returning Op
rather than SDValue()). I think only concats of pairs of vectors are
actually legal.
http://llvm.org/PR19094
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203450 91177308-0d34-0410-b5e6-96231b3b80d8
the stack of the analysis group because they are all immutable passes.
This is made clear by Craig's recent work to use override
systematically -- we weren't overriding anything for 'finalizePass'
because there is no such thing.
This is kind of a lame restriction on the API -- we can no longer push
and pop things, we just set up the stack and run. However, I'm not
invested in building some better solution on top of the existing
(terrifying) immutable pass and legacy pass manager.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203437 91177308-0d34-0410-b5e6-96231b3b80d8
Split by comma once instead of multiple times. Moving this upfront
makes the rest of the code considerably simpler.
No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203429 91177308-0d34-0410-b5e6-96231b3b80d8
it is available. Also make the move semantics sufficiently correct to
tolerate move-only passes, as the PassManagers *are* move-only passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203391 91177308-0d34-0410-b5e6-96231b3b80d8
The grammar for LLVM IR is not well specified in any document but seems
to obey the following rules:
- Attributes which have parenthesized arguments are never preceded by
commas. This form of attribute is the only one which ever has
optional arguments. However, not all of these attributes support
optional arguments: 'thread_local' supports an optional argument but
'addrspace' does not. Interestingly, 'addrspace' is documented as
being a "qualifier". What constitutes a qualifier? I cannot find a
definition.
- Some attributes use a space between the keyword and the value.
Examples of this form are 'align' and 'section'. These are always
preceded by a comma.
- Otherwise, the attribute has no argument. These attributes do not
have a preceding comma.
Sometimes an attribute goes before the instruction, between the
instruction and its type, or after its type. 'atomicrmw' has
'volatile' between the instruction and the type, while 'call' has 'tail'
preceding the instruction.
With all this in mind, it seems most consistent for 'inalloca' on an
'alloca' instruction to occur between the instruction and the
type. Unlike the current formulation, there would be no preceding
comma. The combination 'alloca inalloca' doesn't look particularly
appetizing, perhaps a better spelling of 'inalloca' is down the road.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203376 91177308-0d34-0410-b5e6-96231b3b80d8
MSVC (2012, 2013, and the 2013 Nov CTP) fails on the following code:
  int main() {
    int arr[] = {1, 2};
    for (int i : arr)
      do {} while (0);
  }
The fix is to put {} around the for loop. I've reported this to the MSVC
team.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203371 91177308-0d34-0410-b5e6-96231b3b80d8
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
detail
2) Change it to actually be a *Use* iterator rather than a *User*
iterator.
3) Add an adaptor which is a User iterator that always looks through the
Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
they wanted a use_iterator (and to explicitly dig out the User when
needed), or a user_iterator which makes the Use itself totally
opaque.
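As an illustration of the resulting API (a sketch using the adaptors added
here):

  #include "llvm/IR/User.h"
  #include "llvm/IR/Value.h"

  void walk(llvm::Value *V) {
    // Use-walk: we can see which operand slot of the user references V.
    for (llvm::Use &U : V->uses())
      (void)U.getOperandNo(); // and U.getUser() for the User itself

    // User-walk: the Use is opaque; we only see the using value.
    for (llvm::User *Usr : V->users())
      (void)Usr;
  }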
Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lines of code.
The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.
However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203364 91177308-0d34-0410-b5e6-96231b3b80d8
relevant subclasses of RuntimeDyldImpl. This allows construction of
RuntimeDyldImpl instances to be deferred until after the target architecture is
known.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203352 91177308-0d34-0410-b5e6-96231b3b80d8
This is already done for shifts. Allow it for rotations as well. E.g.:
  (rotl:i32 x, (trunc (and y, 31))) -> (rotl:i32 x, (and (trunc y), 31))
Use the newly factored-out distributeTruncateThroughAnd.
With this patch and some X86.td tweaks we should be able to remove redundant
masking of the rotation amount like in the example above. HW implicitly
performs this masking.
The testcase will be added as part of the X86 patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203316 91177308-0d34-0410-b5e6-96231b3b80d8
This is the new idiom:
  x<<(y&31) | x>>((0-y)&31)
which is recognized as:
  x ROTL (y&31)
The change refines matchRotateSub: when checking
Neg & (OpSize - 1) == (OpSize - Pos) & (OpSize - 1), if Pos has the form
Pos' & (OpSize - 1), we can simply use Pos' instead of Pos.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203315 91177308-0d34-0410-b5e6-96231b3b80d8
Slightly change the wording in the function comment. Originally it could be
misread as saying that we turned the input into two subsequent rotates.
Better connect the comment which talks about Mask and the code which used
LoBits. Renamed variable to MaskLoBits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203314 91177308-0d34-0410-b5e6-96231b3b80d8
be split and the result type widened.
When the condition of a vselect has to be split, it makes no sense to widen the
vselect and thereby widen the condition. We would end up in an endless loop of
widening (the vselect result type) and splitting (the condition mask type).
Instead, split both the condition and the vselect and widen the result.
I ran this over the test suite with i686 and mattr=+sse and saw no regressions.
Fixes PR18036.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203311 91177308-0d34-0410-b5e6-96231b3b80d8
First: refactor out the emission of entries into the .debug_loc section
into its own routine.
Second: add a new class ByteStreamer that can be used to either emit
using an AsmPrinter or hash using DIEHash the series of bytes that
would be emitted. Use this in all of the location emission routines
for the .debug_loc section.
No functional change intended outside of a few additional comments
in verbose assembly.
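Roughly, the shape of the new class (a hedged sketch; the real interface
lives alongside the AsmPrinter and may carry extra comment parameters):

  #include <cstdint>

  // One emission interface, two sinks: print via the AsmPrinter, or feed
  // the same bytes into DIEHash for hashing.
  class ByteStreamer {
  public:
    virtual ~ByteStreamer() {}
    virtual void EmitInt8(uint8_t Byte) = 0;
    virtual void EmitSLEB128(uint64_t DWord) = 0;
    virtual void EmitULEB128(uint64_t DWord) = 0;
  };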
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203304 91177308-0d34-0410-b5e6-96231b3b80d8
These are sometimes created by the shrink to boolean optimization in the
globalopt pass.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203280 91177308-0d34-0410-b5e6-96231b3b80d8
The integrated assembler now works for ppc. Since this was the last use of the
bg/p predicate and Hal says that it is now dead, drop the predicate too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203269 91177308-0d34-0410-b5e6-96231b3b80d8
This is a straightforward replacement; it makes debugging a little
easier.
This has no functional impact.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203264 91177308-0d34-0410-b5e6-96231b3b80d8
For incoming improvements to inlined functions and lexical blocks
suggested by Adrian Prantl in review of r203187.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203263 91177308-0d34-0410-b5e6-96231b3b80d8
The target was marking SIGN_EXTEND as Custom because it wanted to optimize
certain sign-extended shifts. In all other respects the extension is Legal,
so it'd be better to do the optimization in PerformDAGCombine instead.
No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203234 91177308-0d34-0410-b5e6-96231b3b80d8
This helps the instruction selector to lower an i64 * i64 -> i128
multiplication into a single instruction on targets which support it.
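Illustrative source (using the Clang/GCC __int128 extension):

  // A 64 x 64 -> 128 bit multiply that can now be selected as a single
  // widening multiply instruction on targets that have one.
  __int128 mul_wide(long long a, long long b) {
    return (__int128)a * (__int128)b;
  }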
Patch by Manuel Jacob.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203230 91177308-0d34-0410-b5e6-96231b3b80d8
Sequences of insertelement/extractelement are sometimes used to build
vectors; this code tries to put them back together into shuffles, but
could previously only produce a completely uniform shuffle type (<N x T> from
two <N x T> sources).
This should allow shuffles with different numbers of elements on the
input and output sides as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203229 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
llvm/MC/MCSectionMachO.h and llvm/Support/MachO.h both had the same
definitions for the section flags. Instead, grab the definitions out of
Support.
No functionality change.
Reviewers: grosbach, Bigcheese, rafael
Reviewed By: rafael
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2998
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203211 91177308-0d34-0410-b5e6-96231b3b80d8
The old system was fairly convoluted:
* A temporary label was created.
* A single PROLOG_LABEL was created with it.
* A few MCCFIInstructions were created with the same label.
The semantics were that the cfi instructions were mapped to the PROLOG_LABEL
via the temporary label. The output position was that of the PROLOG_LABEL.
The temporary label itself was used only for doing the mapping.
The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to
one by holding an index into the CFI instructions of this function.
I did consider removing MMI.getFrameInstructions completely and having
CFI_INSTRUCTION own an MCCFIInstruction, but MCCFIInstructions have
non-trivial constructors and destructors and are somewhat big, so this setup
is probably better.
The net result is that we don't create temporary labels that are never used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203204 91177308-0d34-0410-b5e6-96231b3b80d8
That was overly aggressive in always assuming COFF. Some
of the tests assume that they will get ELF rather than COFF even on Windows
where the default is COFF.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203176 91177308-0d34-0410-b5e6-96231b3b80d8
This is a preliminary setup change to support a renaming of Windows target
triples. Split the object file format information out of the environment into a
separate entity. Unfortunately, file format was previously treated as an
environment with an unknown OS. This is most obvious in the ARM subtarget where
the handling for macho on an arbitrary platform switches to AAPCS rather than
APCS (as per Apple's needs).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203160 91177308-0d34-0410-b5e6-96231b3b80d8
Windows always uses COFF unless Windows ELF is in use. Rather than checking if
Windows, MinGW, or Cygwin is being targeted, just check if the target OS is
Windows and that it is not an ELF environment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203159 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches the DAGCombiner how to fold a binary OR of two
shufflevectors into a single shufflevector, when possible.
The rules are:
  1. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask1)
  2. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf B, A, Mask2)
The DAGCombiner can take advantage of the fact that OR is commutative and
compute two possible shuffle masks (Mask1 and Mask2) for the resulting
shuffle node.
Before folding a dag according to either rule 1 or 2, DAGCombiner verifies
that the resulting shuffle mask is legal for the target.
The DAGCombiner first tries to fold according to rule 1; if that is not
possible, it then tries to fold according to rule 2.
If both Mask1 and Mask2 are illegal then we conservatively don't fold
the OR instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203156 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commits r203136, r203137, and r203138.
This code doesn't build on Windows. Even on Vista+, Windows requires
elevated privileges to create a symlink. Therefore we can't use
symlinks in the compiler. We'll have to find another approach.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203143 91177308-0d34-0410-b5e6-96231b3b80d8
There was a race where:
- The LockFileManager tries to own the lock file and fails.
- The other owner then releases and removes the lock file.
- The LockFileManager tries to read the owner info from the lock file but fails now.
In such a case, have the LockFileManager try to get ownership again instead
of erroring out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203138 91177308-0d34-0410-b5e6-96231b3b80d8
Hard links do not work on SMB network directories, and it causes us to fail
to build clang module files if the module cache is in such a directory.
rdar://15944959
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203137 91177308-0d34-0410-b5e6-96231b3b80d8
for the Cortex-A53 subtarget in the AArch64 backend.
This patch lays the groundwork to annotate each AArch64 instruction
(no NEON yet) with a list of SchedReadWrite types. The patch also
provides the Cortex-A53 processor resources, maps those to the default
SchedReadWrites, and provides basic latency. NEON support will be added
in a subsequent patch with proper forwarding logic.
Verification was done by setting the pre-RA scheduler to linearize to
better gauge the effect of the MIScheduler. Even without modeling the
forwarding logic, the results show a modest improvement for Cortex-A53.
Reviewers: apazos, mcrosier, atrick
Patch by Dave Estes <cestes@codeaurora.org>!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203125 91177308-0d34-0410-b5e6-96231b3b80d8
Just the simple cases for now. There were a few knock-on changes of
MachineBasicBlock *s to MachineBasicBlock &s. No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203105 91177308-0d34-0410-b5e6-96231b3b80d8
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads of various functions used by those projects and LLVM that take
OwningPtrs as parameters. This should allow out-of-tree projects some time
to move. There are also no changes to lib/Target, which should help
out-of-tree targets have time to move, if necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203083 91177308-0d34-0410-b5e6-96231b3b80d8
This is consistent with GDB ToT and reduces the number of relocations in
(type and compile) units, substantially reducing relocations and debug
size in fission + type units builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203082 91177308-0d34-0410-b5e6-96231b3b80d8
implementation already lived.
After this commit, the only IR-library headers in include/llvm/* are
ones related to the legacy pass infrastructure that I'm planning to
leave there until the new one is farther along.
The only other headers at the top level are linking and initialization
aids that aren't really libraries but just headers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203069 91177308-0d34-0410-b5e6-96231b3b80d8
The global base register cannot be r0 because it might end up as the first
argument to addi or addis. Fixes PR18316.
I don't have a small stable test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203054 91177308-0d34-0410-b5e6-96231b3b80d8
pointed to by the attribute, rather than the form as a first
step to determining how to hash the values. No functional change
intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203044 91177308-0d34-0410-b5e6-96231b3b80d8
When copying an i1 value into a GPR for a vaarg call, we need to explicitly
zero-extend the i1 value (otherwise an invalid CRBIT -> GPR copy will be
generated).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203041 91177308-0d34-0410-b5e6-96231b3b80d8
This works by moving the existing code into the DIEValue hierarchy
and using the DwarfDebug pointer off of the AsmPrinter to access
any global information we need.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203033 91177308-0d34-0410-b5e6-96231b3b80d8
This enables us to figure out where in the debug_loc section our
locations are so that we can eventually hash them. It also helps
remove some special case code in emission. No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203018 91177308-0d34-0410-b5e6-96231b3b80d8
On cores without fpcvt support, we cannot promote int_to_fp i1 operations,
because there is nothing to promote them to. The most straightforward
implementation of this uses a select to choose between the two possible
resulting floating-point values (and that's what is done here).
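In source terms, the lowering amounts to (illustrative):

  // Without fpcvt, i1 -> double is a select between the only two
  // representable results.
  double boolToDouble(bool b) { return b ? 1.0 : 0.0; }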
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203015 91177308-0d34-0410-b5e6-96231b3b80d8
Before this, llvm-mc would print it, but llc assumed that it would produce
another section-changing directive before one was needed. That assumption is
false with inline asm.
Fixes PR19049.
Another option would be to always create the section, but have the asm printer
avoid printing section changes during initialization. That would work, but
* We do use the fact that llvm-mc prints it in testing. The tests can be changed
if needed.
* A quick poll on IRC suggests that most developers prefer the implicit .text to
be printed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203001 91177308-0d34-0410-b5e6-96231b3b80d8
When using a //net/ path, we were transforming the trailing / into a '.'
when the path was just the root path and we were iterating backwards.
Forwards iteration and other kinds of root path (C:\, /) were already
correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202999 91177308-0d34-0410-b5e6-96231b3b80d8
already lived there and it is where it belongs -- this is the in-memory
debug location representation.
This is just cleanup -- Modules can actually cope with this, but that
doesn't make it right. After chatting with folks that have out-of-tree
stuff, going ahead and moving the rest of the headers seems preferable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202960 91177308-0d34-0410-b5e6-96231b3b80d8
This will allow external callers of these functions to switch over time
rather than forcing a breaking change all at once. These particular
functions were determined by building clang/lld/lldb.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202959 91177308-0d34-0410-b5e6-96231b3b80d8
to ensure we don't mess up any of the overrides. Necessary for cleaning
up the Value use iterators and enabling range-based traversing of use
lists.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202958 91177308-0d34-0410-b5e6-96231b3b80d8
Patchpoints already did this. Doing it for stackmaps is a convenience
for the runtime in the event that it needs a scratch register to
patch or to perform a runtime call thunk.
Unlike patchpoints, we just assume the AnyRegCC calling
convention. This is the only language- and target-independent calling
convention specific to stackmaps, so it makes sense, although the calling
convention is not currently used to select the scratch registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202943 91177308-0d34-0410-b5e6-96231b3b80d8
selection dag (PR19012)
In X86SelectionDagInfo::EmitTargetCodeForMemcpy we check with MachineFrameInfo
to make sure that ESI isn't used as a base pointer register before we choose to
emit rep movs (which clobbers esi).
The problem is that MachineFrameInfo wouldn't know about dynamic allocas or
inline asm that clobbers the stack pointer until SelectionDAGBuilder has
encountered them.
This patch fixes the problem by checking for such things when building the
FunctionLoweringInfo.
Differential Revision: http://llvm-reviews.chandlerc.com/D2954
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202930 91177308-0d34-0410-b5e6-96231b3b80d8