Previously, BranchProbabilityInfo::calcLoopBranchHeuristics would determine the weights of basic blocks inside loops even when it didn't have enough information to estimate the branch probabilities correctly. This patch fixes the function to exit early if it doesn't see any exit edges or back edges and let the later heuristics determine the weights.
This fixes PR18705 and <rdar://problem/15991090>.
Differential Revision: http://reviews.llvm.org/D3363
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206194 91177308-0d34-0410-b5e6-96231b3b80d8
This allows correct relocations to be generated for a symbolic
address that is being adjusted by a negative constant. Since r204294,
such expressions have triggered undefined behavior when LLVM was built
without assertions.
Credit goes to Rafael for this patch; I'm submitting it on his behalf
as he is on vacation this week.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206192 91177308-0d34-0410-b5e6-96231b3b80d8
Once the auxiliary fields relating to the filename have been inspected, any
following auxiliary fields need not be visited as they have been consumed (the
following fields comprise the filepath as a single unit).
Adjust the test to catch this even if ASAN is not enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206190 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Two exceptions to this:
test/CodeGen/Mips/octeon.ll
test/CodeGen/Mips/octeon_popcnt.ll
these test extensions to MIPS64
One test is altered for MIPS-IV:
test/CodeGen/Mips/mips64countleading.ll
Tests dclo/dclz which were added in MIPS64. The MIPS-IV version tests
that dclo/dclz are not emitted.
Four tests fail and are not in this patch:
test/CodeGen/Mips/abicalls.ll
test/CodeGen/Mips/fcopysign-f32-f64.ll
test/CodeGen/Mips/fcopysign.ll
test/CodeGen/Mips/stack-alignment.ll
Depends on D3343
Reviewers: matheusalmeida, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3344
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206185 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
- Conditional moves acting on 64-bit GPR's should require MIPS-IV rather than MIPS64
- ISD::MUL, and ISD::MULH[US] should be lowered on all 64-bit ISA's
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
I've added additional testcases to cover as much of the codegen changes
affecting MIPS-IV as I can. Where I've been unable to find an existing
MIPS64 testcase that can be re-used for MIPS-IV (mainly tests covering
ISD::GlobalAddress and similar), I at least agree that MIPS-IV should
behave like MIPS64. Further testcases that are fixed by this patch will follow
in my next commit. The testcases from that commit that fail for MIPS-IV without
this patch are:
LLVM :: CodeGen/Mips/2010-07-20-Switch.ll
LLVM :: CodeGen/Mips/cmov.ll
LLVM :: CodeGen/Mips/eh-dwarf-cfa.ll
LLVM :: CodeGen/Mips/largeimmprinting.ll
LLVM :: CodeGen/Mips/longbranch.ll
LLVM :: CodeGen/Mips/mips64-f128.ll
LLVM :: CodeGen/Mips/mips64directive.ll
LLVM :: CodeGen/Mips/mips64ext.ll
LLVM :: CodeGen/Mips/mips64fpldst.ll
LLVM :: CodeGen/Mips/mips64intldst.ll
LLVM :: CodeGen/Mips/mips64load-store-left-right.ll
LLVM :: CodeGen/Mips/sint-fp-store_pattern.ll
Reviewers: dsanders
Reviewed By: dsanders
CC: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206183 91177308-0d34-0410-b5e6-96231b3b80d8
Code change is because optimizeCompareInstr didn't know how to pull the
condition code out of FCSEL instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206171 91177308-0d34-0410-b5e6-96231b3b80d8
There was one definite issue in ARM64 (the off-by-1 check for whether
a shift could be folded in) and one difference that is probably
correct: ARM64 didn't fold nodes with multiple uses into the
arithmetic operations unless optimising for code size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206168 91177308-0d34-0410-b5e6-96231b3b80d8
This transformation is only valid when being used for an EQ or NE
comparison since the flags change otherwise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206167 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previously loadImmediate() would produce MKMSK instructions with invalid
immediate values such as mkmsk r0, 9. Fix this by checking the mask size
is valid.
Reviewers: robertlytton
Reviewed By: robertlytton
CC: llvm-commits
Differential Revision: http://reviews.llvm.org/D3289
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206163 91177308-0d34-0410-b5e6-96231b3b80d8
BasicTTI::getMemoryOpCost must explicitly check for non-simple types; setting
AllowUnknown=true with TLI->getSimpleValueType is not sufficient because, for
example, non-power-of-two vector types return non-simple EVTs (not MVT::Other).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206150 91177308-0d34-0410-b5e6-96231b3b80d8
If a filename is a multiple of 18 characters, there will be no null-terminator.
This will result in an invalid access by the constructed StringRef. Add a test
case to exercise this and fix that handling. Address this same vulnerability in
llvm-readobj as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206145 91177308-0d34-0410-b5e6-96231b3b80d8
Implements the various TTI functions to enable constant hoisting on PPC. The
only significant test-suite change is this:
MultiSource/Benchmarks/VersaBench/bmm/bmm - 20% speedup
(which essentially reverses the slowdown from r206120).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206141 91177308-0d34-0410-b5e6-96231b3b80d8
If multiplication involves zero-extended arguments and the result is
compared as in the patterns:
%mul32 = trunc i64 %mul64 to i32
%zext = zext i32 %mul32 to i64
%overflow = icmp ne i64 %mul64, %zext
or
%overflow = icmp ugt i64 %mul64 , 0xffffffff
then the multiplication may be replaced by call to umul.with.overflow.
This change fixes PR4917 and PR4918.
Differential Revision: http://llvm-reviews.chandlerc.com/D2814
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206137 91177308-0d34-0410-b5e6-96231b3b80d8
We had been using the known-zero values of the operand of the or to construct
the mask for an rlwimi; this is not quite correct, but fine when the mask is
constant. When the mask is constant, then the known zeros of the operand must
be a superset of the zeros in the mask. However, when the mask is not a
constant, then there might be bits in the operand that are not known to be zero
that, at runtime, might be zero in the mask. Therefore, we check that any bits
not known to be zero *are* known to be one in the mask. Otherwise, we can't
fold the mask with the or and shift.
This was revealed as a miscompile of
MultiSource/Benchmarks/BitBench/drop3/drop3 when I started experimenting with
constant hoisting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206136 91177308-0d34-0410-b5e6-96231b3b80d8
I found this from a particular GDB test suite case of inlining
(something similar is provided as a test case) but came across a few
other related cases (other callers of the same functions, and one other
instance of the same coding mistake in a separate function).
I'm not sure what the best way to test this is (let alone to cover the
other cases I discovered), so hopefully this sufficies - open to ideas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206130 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for file auxiliary symbol entries in COFF symbol tables. A COFF
symbol table with a FILE entry is followed by sizeof(__FILE__) / 18 auxiliary
symbol records which contain the filename. Read them and form the original
filename that the record contains. Then display the name in the output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206126 91177308-0d34-0410-b5e6-96231b3b80d8
There is no need to check if we want to hoist the immediate value of an
shift instruction. Simply return TCC_Free right away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206101 91177308-0d34-0410-b5e6-96231b3b80d8
Originally the cost model would give up for large constants and just return the
maximum cost. This is not what we want for constant hoisting, because some of
these constants are large in bitwidth, but are still cheap to materialize.
This commit fixes the cost model to either return TCC_Free if the cost cannot be
determined, or accurately calculate the cost even for large constants
(bitwidth > 128).
This fixes <rdar://problem/16591573>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206100 91177308-0d34-0410-b5e6-96231b3b80d8
therefore, their declaration cannot have one DW_AT_linkage_name.
The specific instances however can and should have that attribute.
This patch reorders the code in DwarfUnit::getOrCreateSubprogramDIE()
to emit linkage names for C/Dtors.
rdar://problem/16362674.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206096 91177308-0d34-0410-b5e6-96231b3b80d8
We had disabled use of TBAA during CodeGen (even when otherwise using AA)
because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to
miss basic type punning that it should catch (and, thus, we'd fail to override
TBAA when we should).
However, when AA is in use during CodeGen, CGP now uses normal GEPs and
bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a
result, BasicAA should be able to make us do the right thing in the face of
type-punning, and it seems safe to enable use of TBAA again. self-hosting seems
fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle.
Note: We still don't update TBAA when merging stack slots, although because
BasicAA should now catch all such cases, this is no longer a blocking issue.
Nevertheless, I plan to commit code to deal with this properly in the near
future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206093 91177308-0d34-0410-b5e6-96231b3b80d8
The current memory-instruction optimization logic in CGP, which sinks parts of
the address computation that can be adsorbed by the addressing mode, does this
by explicitly converting the relevant part of the address computation into
IR-level integer operations (making use of ptrtoint and inttoptr). For most
targets this is currently not a problem, but for targets wishing to make use of
IR-level aliasing analysis during CodeGen, the use of ptrtoint/inttoptr is a
problem for two reasons:
1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr
2. In cases where type-punning was used, and BasicAA was used
to override TBAA, BasicAA may no longer do so. (this had forced us to disable
all use of TBAA in CodeGen; something which we can now enable again)
This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by
default (except for those targets that use AA during CodeGen), and so aside
from some PowerPC subtargets and SystemZ, there should be no change in
behavior. We may be able to switch completely away from the ptrtoint/inttoptr
sinking on all targets, but further testing is required.
I've doubled-up on a number of existing tests that are sensitive to the
address sinking behavior (including some store-merging tests that are
sensitive to the order of the resulting ADD operations at the SDAG level).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206092 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds patterns to generate the cls instruction ARM64. Includes tests
for 64 bit and 32 bit operands.
rdar://15611957
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206079 91177308-0d34-0410-b5e6-96231b3b80d8
fexhaustive-register-search => exhaustive-register-search
'f' is a Clang thing!
This is related to PR18747.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206075 91177308-0d34-0410-b5e6-96231b3b80d8
-fexhaustive-register-search option to allow an exhaustive search during last
chance recoloring.
This is related to PR18747
Patch by MAYUR PANDEY <mayur.p@samsung.com>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206072 91177308-0d34-0410-b5e6-96231b3b80d8
The TargetLowering::expandMUL() helper contains lowering code extracted
from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more
ISD::MUL patterns without having to use a library call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206037 91177308-0d34-0410-b5e6-96231b3b80d8
This was most likely caused by an uninitialized value and the relevant code was re-written in r205292. Reverting to see if it still fails on any of the buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206033 91177308-0d34-0410-b5e6-96231b3b80d8
The patch implements support for both relocation record formats: Elf_Rel
and Elf_Rela. It is possible to define relocation against symbol only.
Relocations against sections will be implemented later. Now yaml2obj
recognizes X86_64, MIPS and Hexagon relocation types.
Example of relocation section specification:
Sections:
- Name: .text
Type: SHT_PROGBITS
Content: "0000000000000000"
AddressAlign: 16
Flags: [SHF_ALLOC]
- Name: .rel.text
Type: SHT_REL
Info: .text
AddressAlign: 4
Relocations:
- Offset: 0x1
Symbol: glob1
Type: R_MIPS_32
- Offset: 0x2
Symbol: glob2
Type: R_MIPS_CALL16
The patch reviewed by Michael Spencer, Sean Silva, Shankar Easwaran.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206017 91177308-0d34-0410-b5e6-96231b3b80d8
This removes the -segmented-stacks command line flag in favor of a
per-function "split-stack" attribute.
Patch by Luqman Aden and Alex Crichton!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205997 91177308-0d34-0410-b5e6-96231b3b80d8
Refactored stack-protector.ll to use new-style function attributes everywhere
and eliminated unnecessary attributes.
This cleanup is in preparation for an upcoming test change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205996 91177308-0d34-0410-b5e6-96231b3b80d8
To support compressing the debug_line section that contains multiple
fragments (due, I believe, to variation in choices of line table
encoding depending on the size of instruction ranges in the actual
program code) we needed to support compressing multiple MCFragments in a
single pass.
This patch implements that behavior by mutating the post-relaxed and
relocated section to be the compressed form of its former self,
including renaming the section.
This is a more flexible (and less invasive, to a degree) implementation
that will allow for other features such as "use compression only if it's
smaller than the uncompressed data".
Compressing debug_frame would be a possible further extension to this
work, but I've left it for now. The hurdle there is alignment sections -
which might require going as far as to refactor
MCAssembler.cpp:writeFragment to handle writing to a byte buffer or an
MCObjectWriter (there's already a virtual call there, so it shouldn't
add substantial compile-time cost) which could in turn involve
refactoring MCAsmBackend::writeNopData to use that same abstraction...
which involves touching all the backends. This would remove the limited
handling of fragment writing seen in
ELFObjectWriter.cpp:getUncompressedData which would be nice - but it's
more invasive.
I did discover that I (perhaps obviously) don't need to handle
relocations when I rewrite the fragments - since the relocations have
already been applied and computed (and stored into
ELFObjectWriter::Relocations) by this stage (necessarily, because we
need to have written any immediate values or assembly-time relocations
into the data already before we compress it, which we have). The test
case doesn't necessarily cover that in detail - I can add more test
coverage if that's preferred.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205990 91177308-0d34-0410-b5e6-96231b3b80d8
To support compression for debug_line and debug_frame a different
approach is required. To simplify review, revert the old implementation
and XFAIL the test case. New implementation to follow shortly.
Reverts r205059 and r204958.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205989 91177308-0d34-0410-b5e6-96231b3b80d8