Patch by: Aaron Watry
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>
NOTE: This is a candidate for the 3.3 branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181579 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run.
Patch by: Aaron Watry
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>
NOTE: This is a candidate for the 3.3 branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181578 91177308-0d34-0410-b5e6-96231b3b80d8
v2: Add v4i32 test
Patch by: Aaron Watry
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>
NOTE: This is a candidate for the 3.3 branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181577 91177308-0d34-0410-b5e6-96231b3b80d8
v2: Add vselect v4i32 test
Patch by: Aaron Watry
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>
NOTE: This is a candidate for the 3.3 branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181576 91177308-0d34-0410-b5e6-96231b3b80d8
We generate a `push' of a random register (%rax) if the stack needs to be
aligned by the size of that register. However, this could mess up compact unwind
generation. In particular, we want to still generate compact unwind in the
presence of this monstrosity.
Check if the push of of the %rax/%eax register. If it is and it's marked with
the `FrameSetup' flag, then we can generate a compact unwind encoding for the
function only if the push is the last FrameSetup instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181540 91177308-0d34-0410-b5e6-96231b3b80d8
When we replace an internal alias with its target, be careful not to
replace the entry in llvm.used (and llvm.compiler_used).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181524 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we only checked if the LR required saving if the frame size was
non zero. However because the caller reserves 1 word for the callee to use
that doesn't count towards our frame size it is possible for the LR to need
saving and for the frame size to be 0.
We didn't hit when the LR needed saving because of a function calls because
the 1 word of stack we must allocate for our callee means the frame size
is always non zero in this case. However we can hit this case if the LR is
clobbered in inline asm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181520 91177308-0d34-0410-b5e6-96231b3b80d8
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181518 91177308-0d34-0410-b5e6-96231b3b80d8
It was only implemented for ELF where it collected the Addend, so this
patch also renames it to getRelocationAddend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181502 91177308-0d34-0410-b5e6-96231b3b80d8
A computable loop exit count does not imply the presence of an induction
variable. Scalar evolution can return a value for an infinite loop.
Fixes PR15926.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181495 91177308-0d34-0410-b5e6-96231b3b80d8
- the temporaries "-debug.ll" files generated by DebugIR pass are considered tests, even though they are not
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181476 91177308-0d34-0410-b5e6-96231b3b80d8
for constructors and destructors since the original declaration given
by the AT_specification both won't and can't.
Patch by Yacine Belkadi, I've cleaned up the testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181471 91177308-0d34-0410-b5e6-96231b3b80d8
- simple one-function case
- function-calling case
- external function calling case
- exception throwing case
- vector case
Note: these tests are somewhat coupled to the current format of debug metadata.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181469 91177308-0d34-0410-b5e6-96231b3b80d8
This patch extends test/MC/PowerPC/ppc64-fixups.s to not only check for
the correct fixup type in the --show-encoding output, but also runs the
generated object file through llvm-readobj -r and verifies that the
correct ELF relocation records were generated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181453 91177308-0d34-0410-b5e6-96231b3b80d8
The floating-point record forms on PPC don't set the condition register bits
based on a comparison with zero (like the integer record forms do), but rather
based on the exception status bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181423 91177308-0d34-0410-b5e6-96231b3b80d8
The reference encoding is correct, but written in the wrong byte order (these are Thumb tests, while the reference is in ARM byte order).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181420 91177308-0d34-0410-b5e6-96231b3b80d8
Fold (xor (and x, y), y) -> (and (not x), y)
This removes an opportunity for a constant to appear twice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181395 91177308-0d34-0410-b5e6-96231b3b80d8
This provides basic functionality for imported declarations. For
subprograms and types some amount of lazy construction is supported (so
the definition of a function can proceed the using declaration), but it
still doesn't handle declared-but-not-defined functions (since we don't
generally emit function declarations).
Variable support is really rudimentary at the moment - simply looking up
the existing definition with no support for out of order (declaration,
imported_module, then definition).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181392 91177308-0d34-0410-b5e6-96231b3b80d8
The two nested loops were confusing and also conservative in identifying
reduction variables. This patch replaces them by a worklist based approach.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181369 91177308-0d34-0410-b5e6-96231b3b80d8
Apparently we didn't keep an association of Compile Unit metadata nodes
to DIEs so looking up that parental context failed & thus caused no
DW_TAG_imported_modules to be emitted at the CU scope. Fix this by
adding the mapping & sure up the test case to verify this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181339 91177308-0d34-0410-b5e6-96231b3b80d8
We were passing an i32 to ConstantInt::get where an i64 was needed and we must
also pass the sign if we pass negatives numbers. The start index passed to
getConsecutiveVector must also be signed.
Should fix PR15882.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181286 91177308-0d34-0410-b5e6-96231b3b80d8
The alignment is just a byte in the middle of Characteristics, not an
independent flag. Making it an independent field in the yaml
representation makes it more yamlio friendly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181243 91177308-0d34-0410-b5e6-96231b3b80d8
Test case by Michele Scandale!
Fixes PR10293: Load not hoisted out of loop with multiple exits.
There are few regressions with this patch, now tracked by
rdar:13817079, and a roughly equal number of improvements. The
regressions are almost certainly back luck because LoopRotate has very
little idea of whether rotation is profitable. Doing better requires a
more comprehensive solution.
This checkin is a quick fix that lacks generality (PR10293 has
a counter-example). But it trivially fixes the case in PR10293 without
interfering with other cases, and it does satify the criteria that
LoopRotate is a loop canonicalization pass that should avoid
heuristics and special cases.
I can think of two approaches that would probably be better in
the long run. Ultimately they may both make sense.
(1) LoopRotate should check that the current header would make a good
loop guard, and that the loop does not already has a sufficient
guard. The artifical SimplifiedLoopLatch check would be unnecessary,
and the design would be more general and canonical. Two difficulties:
- We need a strong guarantee that we won't endlessly rotate, so the
analysis would need to be precise in order to avoid the
SimplifiedLoopLatch precondition.
- Analysis like this are usually based on SCEV, which we don't want to
rely on.
(2) Rotate on-demand in late loop passes. This could even be done by
shoving the loop back on the queue after the optimization that needs
it. This could work well when we find LICM opportunities in
multi-branch loops. This requires some work, and it doesn't really
solve the problem of SCEV wanting a loop guard before the analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181230 91177308-0d34-0410-b5e6-96231b3b80d8
As pointed out by Rafael Espindola, we should match the DWARF encodings
produced by GCC in both pic and non-pic modes. This was not the case
for the non-pic case.
This patch changes all DWARF encodings to DW_EH_PE_absptr for the
non-pic case, just like GCC does. The test case is updated to check
for both variants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181222 91177308-0d34-0410-b5e6-96231b3b80d8
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A
These come up in code that has been hand-optimized from a select to a linear blend,
on platforms where that may have mattered. We want to undo such changes
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181216 91177308-0d34-0410-b5e6-96231b3b80d8