Summary:
`ComputeNumSignBits` returns incorrect results for `srem` instructions.
This change fixes the issue and adds a test case.
Reviewers: nadav, nicholas, atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8600
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233225 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds supports for the vector constant folding of TRUNCATE and FP_EXTEND instructions and tidies up the SINT_TO_FP and UINT_TO_FP instructions to match.
It also moves the vector constant folding for the FNEG and FABS instructions to use the DAG.getNode() functionality like the other unary instructions.
Differential Revision: http://reviews.llvm.org/D8593
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233224 91177308-0d34-0410-b5e6-96231b3b80d8
As dblaikie pointed out, if I stop setting `emissionKind: 2` then the
backend won't do magical things on Linux vs. Darwin. I had wrongly
assumed that there were stricter requirements on the input if we weren't
in line-tables-only mode, but apparently not.
With that knowledge, clean up this testcase a little more.
- Set `emissionKind: 1`.
- Add back checks for the weak version of @foo.
- Check more robustly that we have the right subprograms by checking
the `DW_AT_decl_file` and `DW_AT_decl_line` which now show up.
- Check the line table in isolation (since it's no longer doubling as
an indirect test for the subprogram of the weak version of @foo).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233221 91177308-0d34-0410-b5e6-96231b3b80d8
If liveranges induced by an IMPLICIT_DEF get completely covered by a
proper liverange the IMPLICIT_DEF instructions and its corresponding
definitions have to be removed from the live ranges. This has to happen
in the subregister live ranges as well (I didn't see this case earlier
because in most programs only some subregisters are covered and the
IMPLCIT_DEF won't get removed).
No testcase, I spent hours trying to create one for one of the public
targets, but ultimately failed because I couldn't manage to properly
control the placement of COPY and IMPLICIT_DEF instructions from an .ll
file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233217 91177308-0d34-0410-b5e6-96231b3b80d8
According to at least one bot [1], function prologues aren't always
empty for these functions. Skip that part of the follow-up check.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233214 91177308-0d34-0410-b5e6-96231b3b80d8
We don't have any logic to emit those tables yet, so the sdag lowering
of this intrinsic is just a stub. We can see the intrinsic in the
prepared IR, though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233209 91177308-0d34-0410-b5e6-96231b3b80d8
Rewrite the checks from r233164 that I temporarily disabled in r233165.
It turns out that the line-tables only debug info we emit from `llc` is
(intentionally) different on Linux than on Darwin. r218129 started
skipping emission of subprograms with no inlined subroutines, and
r218702 was a spiritual revert of that behaviour for Darwin.
I think we can still test this in a platform-neutral way.
- Stop checking for the possibly missing `DW_TAG_subprogram` defining
the debug info for the real version of `@foo`.
- Start checking the line tables, ensuring that the right debug info
was used to generate them (grabbing `DW_AT_low_pc` from the compile
unit).
- I changed up the line numbers used in the "weak" version so it's
easier to follow.
This should hopefully finish off PR22792.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233207 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07
(POWER8). The intrinsic support is based on GCC one [1], but currently only the
'PowerPC HTM Low Level Built-in Function' are implemented.
The HTM instructions follows the RC ones and the transaction initiation result
is set on RC0 (with exception of tcheck). Currently approach is to create a
register copy from CR0 to GPR and comapring. Although this is suboptimal, since
the branch could be taken directly by comparing the CR0 value, it generates code
correctly on both test and branch and just return value. A possible future
optimization could be elimitate the MFCR instruction to branch directly.
The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on
powerpc64 and powerpc64le.
This is send along a clang patch to enabled the builtins and option switch.
[1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html
Phabricator Review: http://reviews.llvm.org/D8247
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233204 91177308-0d34-0410-b5e6-96231b3b80d8
Some languages, such as Go, have pre-defined structure types (e.g. "string"
is essentially a pointer/length pair) or pre-defined "typedef" types
(e.g. "error" is essentially a typedef for a specific interface type).
Such types do not have associated source location, so a Go frontend would
be correct not to associate a file name with such types.
This change relaxes the DIType verifier to permit unlocated types with
these tags.
Differential Revision: http://reviews.llvm.org/D8588
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233200 91177308-0d34-0410-b5e6-96231b3b80d8
This patch allows AVX blend instructions to handle insertion into the low
element of a 256-bit vector for the appropriate data types.
For f32, instead of:
vblendps $1, %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1,2,3]
vblendps $15, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]
we get:
vblendps $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]
For f64, instead of:
vmovsd %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1]
vblendpd $3, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1],ymm0[2,3]
we get:
vblendpd $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3]
For the hardware-neglected integer data types, I left a TODO comment in the
code and added regression tests for a follow-on patch.
Differential Revision: http://reviews.llvm.org/D8609
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233199 91177308-0d34-0410-b5e6-96231b3b80d8
1. There were no CHECK-LABELs, so we could match instructions from the wrong function.
2. The use of zero operands meant multiple xor instructions could match some CHECKs.
3. The test was over-specified to need a Sandybridge CPU and Darwin triple.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233198 91177308-0d34-0410-b5e6-96231b3b80d8
This code is only compiled when LLVM_USE_INTEL_JITEVENTS, but at least we have
one buildbot where that's the case :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233197 91177308-0d34-0410-b5e6-96231b3b80d8
To complement getSplat. This is more general than the binary
decomposition method as it also handles non-pow2 splat sizes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233195 91177308-0d34-0410-b5e6-96231b3b80d8
The previous logic was to first try without relocations at all
and failing that stop on the first defined symbol.
That was inefficient and incorrect in the case part of the
expression could be simplified and another part could not
(see included test).
We now stop the evaluation when we get to a variable whose value
can change (i.e. is weak).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233187 91177308-0d34-0410-b5e6-96231b3b80d8
Added test Float2Int/float2int-optnone.ll to verify that pass Float2Int
is not run on optnone functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233183 91177308-0d34-0410-b5e6-96231b3b80d8
This ensures that we're building and testing the CompileOnDemand layer, at least
in a basic way.
Currently x86-64 only, and with limited to no library calls enabled (depending
on host platform). Patches welcome. ;)
To enable access to the lazy JIT, this patch replaces the '-use-orcmcjit' lli
option with a new option:
'-jit-kind={ mcjit | orc-mcjit | orc-lazy }'.
All regression tests are updated to use the new option, and one trivial test of
the new lazy JIT is added.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233182 91177308-0d34-0410-b5e6-96231b3b80d8
In r233009 we gained specific check-llvm-* build targets for invoking
specific parts of the test suite, but they were copying the
dependencies for check-all, rather than just listing the dependencies
for check-llvm.
This moves the creation of these targets next to the check-llvm
target, and uses that target's configuration rather than the check-all
config.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233174 91177308-0d34-0410-b5e6-96231b3b80d8
target-independent callback management.
This is a prerequisite for adding orc-based lazy-jitting to lli.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233166 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of dropping subprograms that have been overridden, just set
their function pointers to `nullptr`. This is a minor adjustment to the
stop-gap fix for PR21910 committed in r224487, and fixes the crasher
from PR22792.
The problem that r224487 put a band-aid on: how do we find the canonical
subprogram for a `Function`? Since the backend currently relies on
`DebugInfoFinder` (which does a naive in-order traversal of compile
units and picks the first subprogram) for this, r224487 tried dropping
non-canonical subprograms.
Dropping subprograms fails because the backend *also* builds up a map
from subprogram to compile unit (`DwarfDebug::SPMap`) based on the
subprogram lists. A missing subprogram causes segfaults later when an
inlined reference (such as in this testcase) is created.
Instead, just drop the `Function` pointer to `nullptr`, which nicely
mirrors what happens when an already-inlined `Function` is optimized
out. We can't really be sure that it's the same definition anyway, as
the testcase demonstrates.
This still isn't completely satisfactory. Two flaws at least that I can
think of:
- I still haven't found a straightforward way to make this symmetric
in the IR. (Interestingly, the DWARF output is already symmetric,
and I've tested for that to be sure we don't regress.)
- Using `DebugInfoFinder` to find the canonical subprogram for a
function is kind of crazy. We should just attach metadata to the
function, like this:
define weak i32 @foo(i32, i32) !dbg !MDSubprogram(...) {
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233164 91177308-0d34-0410-b5e6-96231b3b80d8
Reverts the code change from r221168 and the relevant test.
It was a mistake to disable the combiner, and based on the ultimate
definition of 'optnone' we shouldn't have considered the test case
as failing in the first place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233153 91177308-0d34-0410-b5e6-96231b3b80d8
A load from an invariant location is assumed to not alias any otherwise potentially aliasing stores. Our implementation only applied this rule to store instructions themselves whereas they it should apply for any memory accessing instruction. This results in both FRE and PRE becoming more effective at eliminating invariant loads.
Note that as a follow on change I will likely move this into AliasAnalysis itself. That's where the TBAA constant flag is handled and the semantics are essentially the same. I'd like to separate the semantic change from the refactoring and thus have extended the hack that's already in MemoryDependenceAnalysis for this change.
Differential Revision: http://reviews.llvm.org/D8591
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233140 91177308-0d34-0410-b5e6-96231b3b80d8
In a subtraction of the form A - B, if B is weak, there is no way to represent
that on ELF since all relocations add the value of a symbol.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233139 91177308-0d34-0410-b5e6-96231b3b80d8
We can't use TargetFrameLowering::getFrameIndexOffset directly, because
Win64 really wants the offset from the stack pointer at the end of the
prologue. Instead, use X86FrameLowering::getFrameIndexOffsetFromSP(),
which is a pretty close approximiation of that. It fails to handle cases
with interestingly large stack alignments, which is pretty uncommon on
Win64 and is TODO.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233137 91177308-0d34-0410-b5e6-96231b3b80d8
A while ago llvm-cov gained support for clang's instrumentation based
profiling in addition to its gcov support, and subcommands were added
to choose which behaviour to use. When no subcommand was specified, we
fell back to gcov compatibility with a warning that a subcommand would
be required in the future. Now, we require the subcommand.
Note that if the basename of llvm-cov is gcov (via symlink or
hardlink, for example), we still use the gcov compatible behaviour
with no subcommand required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233132 91177308-0d34-0410-b5e6-96231b3b80d8
The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)
SCEV looks like it'll need some restructuring - we'll have to do a bit
more work for GEP canonicalization, since it'll depend on how it's used
if we can even manage to canonicalize it to a non-ugly GEP. I guess we
can do some fun stuff like voting (do 2 out of 3 load from the GEP with
a certain type that gives a pretty GEP? Does every typed use of the GEP
use either a specific type or a generic type (i8*, etc)?)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233131 91177308-0d34-0410-b5e6-96231b3b80d8
It seems one windows bot fails since I added ilne table linking to
llvm-dsymutil (see r232333 commit thread).
Disable the affected tests until I can figure out what's happening.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233130 91177308-0d34-0410-b5e6-96231b3b80d8
The changes to InstCombine do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233126 91177308-0d34-0410-b5e6-96231b3b80d8
This patch tries to merge duplicate landing pads when they branch to a common shared target.
Given IR that looks like this:
lpad1:
%exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
cleanup
br label %shared_resume
lpad2:
%exn2 = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
cleanup
br label %shared_resume
shared_resume:
call void @fn()
ret void
}
We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well.
Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block.
Differential Revision: http://reviews.llvm.org/D8297
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233125 91177308-0d34-0410-b5e6-96231b3b80d8