Fixes most of the test suite on Windows with clang-cl.
I'm not sure why the test suite was passing with MSVC 2013. Maybe they
changed their behavior and we are emulating their old sign extension
behavior. I think this deserves more investigation, but I want to green
the bot first.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239357 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This was a longstanding FIXME and is a necessary precursor to cases
where foldOperandImpl may have to create more than one instruction
(e.g. to constrain a register class). This is the split out NFC changes from
D6262.
Reviewers: pete, ributzka, uweigand, mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, ted, llvm-commits
Differential Revision: http://reviews.llvm.org/D10174
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239336 91177308-0d34-0410-b5e6-96231b3b80d8
on a per-function basis.
Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.
rdar://problem/20542263
Differential Revision: http://reviews.llvm.org/D8717
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239325 91177308-0d34-0410-b5e6-96231b3b80d8
The Fragment and Section, and a bool for HasFragment were all used to create
a PointerUnion. Just use a pointer union instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239324 91177308-0d34-0410-b5e6-96231b3b80d8
All of ELF, COFF and MachO now manipulate the flags in helpers so we don't need
anyone to read the flags directly, but instead via those helpers.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239317 91177308-0d34-0410-b5e6-96231b3b80d8
Also delete the now unused MCMachOSymbolFlags.h header as the only enum in there was moved to MCSymbolMachO.
Similarly to ELF and COFF, manipulating the flags is now done via helpers instead of spread
throughout the codebase.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239316 91177308-0d34-0410-b5e6-96231b3b80d8
All flags setting/getting is now done in the class with helper methods instead
of users having to get the bits in the correct order.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239314 91177308-0d34-0410-b5e6-96231b3b80d8
The flags field in MCSymbol only needs to be 16-bits on ELF and MachO.
This moves the 16-bit Type out of there so that it can be reduced in size in a future commit.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239313 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y
are writes.
Assuming we have w writes and r reads, currently this number is estimated as being
w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate
the number of checks required.
This change adds a getNumberOfChecks method to RuntimePointerCheck, which
will count the number of runtime checks needed (similar in implementation to
needsAnyChecking) and uses it to produce the correct number of runtime checks.
Test Plan:
llvm test suite
spec2k
spec2k6
Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit).
Reviewers: anemet
Reviewed By: anemet
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D10217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239295 91177308-0d34-0410-b5e6-96231b3b80d8
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
a = A[i]; // load of even element
b = A[i+1]; // load of odd element
... // operations on a, b, c, d
A[i] = c; // store of even element
A[i+1] = d; // store of odd element
}
The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
%wide.vec = load <8 x i32>, <8 x i32>* %ptr
%vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
%vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
%interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239291 91177308-0d34-0410-b5e6-96231b3b80d8
These were, originally, in a different form in lld.
They can be reused for other tools, e.g. llvm-readobj.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239231 91177308-0d34-0410-b5e6-96231b3b80d8
The new naming is (to me) much easier to understand. Here is a summary
of the new state of the world:
- '*Threshold' is the threshold for full unrolling. It is measured
against the estimated unrolled cost as computed by getUserCost in TTI
(or CodeMetrics, etc). We will exceed this threshold when unrolling
loops where unrolling exposes a significant degree of simplification
of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's
estimated dynamic execution cost which needs to be saved by unrolling
to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated
unrolling cost when the dynamic savings are expected to be high.
When actually analyzing the loop, we now produce both an estimated
unrolled cost, and an estimated rolled cost. The rolled cost is notably
a dynamic estimate based on our analysis of the expected execution of
each iteration.
While we're still working to build up the infrastructure for making
these estimates, to me it is much more clear *how* to make them better
when they have reasonably descriptive names. For example, we may want to
apply estimated (from heuristics or profiles) dynamic execution weights
to the *dynamic* cost estimates. If we start doing that, we would also
need to track the static unrolled cost and the dynamic unrolled cost, as
only the latter could reasonably be weighted by profile information.
This patch is sadly not without functionality change for the new unroll
analysis logic. Buried in the heuristic management were several things
that surprised me. For example, we never subtracted the optimized
instruction count off when comparing against the unroll heursistics!
I don't know if this just got lost somewhere along the way or what, but
with the new accounting of things, this is much easier to keep track of
and we use the post-simplification cost estimate to compare to the
thresholds, and use the dynamic cost reduction ratio to select whether
we can exceed the baseline threshold.
The old values of these flags also don't necessarily make sense. My
impression is that none of these thresholds or discounts have been tuned
yet, and so they're just arbitrary placehold numbers. As such, I've not
bothered to adjust for the fact that this is now a discount and not
a tow-tier threshold model. We need to tune all these values once the
logic is ready to be enabled.
Differential Revision: http://reviews.llvm.org/D9966
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239164 91177308-0d34-0410-b5e6-96231b3b80d8
These are added mainly for the benefit of clang, but this also means that they
are now allowed in .fpu directives and we emit the correct .fpu directive when
single-precision-only is used.
Differential Revision: http://reviews.llvm.org/D10238
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239151 91177308-0d34-0410-b5e6-96231b3b80d8
Add getFPUFeatures to TargetParser, which gets the list of subtarget features
that are enabled/disabled for each FPU, and use it when handling the .fpu
directive.
No functional change in this commit, though clang will start behaving
differently once it starts using this.
Differential Revision: http://reviews.llvm.org/D10237
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239150 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Only restoring AvailableFeatures is not enough and will lead to buggy behaviour.
For example, if we have a feature enabled and we ".set pop", the next time we try
to ".set" that feature nothing will happen because the "!(STI.getFeatureBits()[Feature])"
check will be false, because we didn't restore STI.FeatureBits.
In order to fix this, we need to make MipsAssemblerOptions remember the STI.FeatureBits
instead of the AvailableFeatures and then regenerate AvailableFeatures each time we ".set pop".
This is because, AFAIK, there is no way to convert from AvailableFeatures back to STI.FeatureBits,
but the reverse is possible by using ComputeAvailableFeatures(STI.FeatureBits).
I also moved the updating of AssemblerOptions inside the "if" statement in
setFeatureBits() and clearFeatureBits(), as there is no reason to update if
nothing changes.
Reviewers: dsanders, mkuper
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9156
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239144 91177308-0d34-0410-b5e6-96231b3b80d8
When we generate coverage data, we explicitly set each coverage map's
alignment to 8 (See InstrProfiling::lowerCoverageData), but when we
read the coverage data, we assume consecutive maps are exactly
adjacent. When we're dealing with 32 bit, maps can end on a 4 byte
boundary, causing us to think the padding is part of the next record.
Fix this by adjusting the buffer to an appropriately aligned address
between records.
This is pretty awkward to test, as it requires a binary with multiple
coverage maps to hit, so we'd need to check in multiple source files
and a binary blob as inputs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239129 91177308-0d34-0410-b5e6-96231b3b80d8
Apparently this functionality isn't used in-tree (or I would go & make
the explicit unique_ptr constructions into implicit constructions to
make them more self documenting now that clone doesn't return a raw
owning pointer anymore) but only by the Julia frontend. This isn't
ideal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239091 91177308-0d34-0410-b5e6-96231b3b80d8