Summary:
This adds support in 'opt' to filter pass remarks emitted by
optimization passes. A new flag -pass-remarks specifies which
passes should emit a diagnostic when LLVMContext::emitOptimizationRemark
is invoked.
This will allow the front end to simply pass along the regular
expression from its own -Rpass flag when launching the backend.
Depends on D3227.
Reviewers: qcolombet
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D3291
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205775 91177308-0d34-0410-b5e6-96231b3b80d8
Confusingly, the NEON fmla instructions put the accumulator first but the
scalar versions put it at the end (like the fma lib function & LLVM's
intrinsic).
This should fix PR19345, assuming there's only one issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205758 91177308-0d34-0410-b5e6-96231b3b80d8
The IO normalizer would essentially lump I386 and AMD64 relocations
together. Relocation types with the same numeric value would then get
mapped in appropriately.
For example:
IMAGE_REL_AMD64_ADDR64 and IMAGE_REL_I386_DIR16 both have a numeric
value of one. We would see IMAGE_REL_I386_DIR16 in obj2yaml conversions
of object files with a machine type of IMAGE_FILE_MACHINE_AMD64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205746 91177308-0d34-0410-b5e6-96231b3b80d8
Moving these patterns from TableGen files to PerformDAGCombine()
should allow us to generate better code by eliminating unnecessary
shifts and extensions earlier.
This also fixes a bug where the MAD pattern was calling
SimplifyDemandedBits with a 24-bit mask on the first operand
even when the full pattern wasn't being matched. This occasionally
resulted in some instructions being incorrectly deleted from the
program.
v2:
- Fix bug with 64-bit mul
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205731 91177308-0d34-0410-b5e6-96231b3b80d8
into a constant size alloca by inlining.
Ran a run over the testsuite, no results out of the noise, fixes
the testcase in the PR.
PR19115.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205710 91177308-0d34-0410-b5e6-96231b3b80d8
cygwin has llvm-dwarfdump problems and isn't paying attention to the
specific xfail there.
s390x isn't matching for an unknown reason.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205708 91177308-0d34-0410-b5e6-96231b3b80d8
It affected callee's stack pop in x86. It is one of devergences between cygwin and mingw since mingw-gcc-4.6.
Added testcases to llvm/test/CodeGen/X86/win32_sret.ll for cygwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205688 91177308-0d34-0410-b5e6-96231b3b80d8
I really should read the spec more often (and test GCC more often too).
I just assumed that namespace aliases would be the same as using
directives, except with a name. But apparently that's not how the DWARF
standards suggests they be implemented. DWARF4 provides an example and
other non-normative text suggesting that namespace aliases be
implemented by named imported declarations intsead of named imported
modules.
So be it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205685 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a warning when linker_private or linker_private_weak is provided and
we handle it in a compatible manner.
Suggested by Chris Lattner!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205681 91177308-0d34-0410-b5e6-96231b3b80d8
This consolidates the duplicated MachO checks in the directive parsing for
various directives that are unsupported for Mach-O. The error message change is
unimportant as this restores the behaviour to that prior to the addition of the
new directive handling. Furthermore, use a more direct check for MachO
targeting rather than an indirect feature check of the assembler.
Also simplify the test execution command to avoid temporary files. Further more,
perform the check in both object and assembly emission.
Whether all non-applicable directives are handled is another question. .fnstart
is marked as being unsupported, however, the complementary .fnend is not. The
additional unwinding directives are also still honoured. This change does not
change that, though, it would be good to validate and mark them as being
unsupported if they are unsupported for the MachO emission.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205678 91177308-0d34-0410-b5e6-96231b3b80d8
This restores the linker_private and linker_private_weak lexemes to permit
translation of the deprecated lexmes. The behaviour is identical to the bitcode
handling: linker_private and linker_private_weak are handled as if private had
been specified. This enables compatibility with IR generated by LLVM 3.4.
Reported on IRC by ki9a!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205675 91177308-0d34-0410-b5e6-96231b3b80d8
This provides more realistic costs for the insert/extractelement instructions
(which are load/store pairs), accounts for the cheap unaligned Altivec load
sequence, and for unaligned VSX load/stores.
Bad news:
MultiSource/Applications/sgefa/sgefa - 35% slowdown (this will require more investigation)
SingleSource/Benchmarks/McGill/queens - 20% slowdown (we no longer vectorize this, but it was a constant store that was scalarized)
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 - 2% slowdown
Good news:
SingleSource/Benchmarks/Shootout/ary3 - 54% speedup
SingleSource/Benchmarks/Shootout-C++/ary - 40% speedup
MultiSource/Benchmarks/Ptrdist/ks/ks - 35% speedup
MultiSource/Benchmarks/FreeBench/neural/neural - 30% speedup
MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt - 20% speedup
Unfortunately, estimating the costs of the stack-based scalarization sequences
is hard, and adjusting these costs is like a game of whac-a-mole :( I'll
revisit this again after we have better codegen for vector extloads and
truncstores and unaligned load/stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205658 91177308-0d34-0410-b5e6-96231b3b80d8
gcc inline asm supports specifying "cc" as a clobber of all condition
registers. Add just enough modeling of the full register to make this work.
Fixed PR19326.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205630 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
They behave in accordance with the Has2008 and ABS2008 configuration bits of the
processor which are used to select between the 1985 and 2008 versions of IEEE
754. In 1985 mode, these instructions are arithmetic (i.e. they raise invalid
operation exceptions when given NaN), in 2008 mode they are non-arithmetic
(i.e. they are copies).
nmadd.[ds], and nmsub.[ds] are still subject to -enable-no-nans-fp-math because
the ISA spec does not explicitly state that they obey Has2008 and ABS2008.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3274
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205628 91177308-0d34-0410-b5e6-96231b3b80d8
When LLVM sees something like (v1iN (vselect v1i1, v1iN, v1iN)) it can
decide that the result is OK (v1i64 is legal on AArch64, for example)
but it still need scalarising because of that v1i1. There was no code
to do this though.
AArch64 and ARM64 have DAG combines to produce efficient code and
prevent that occuring in *most* such situations, but there are edge
cases that they miss. This adds a legalization to cope with that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205626 91177308-0d34-0410-b5e6-96231b3b80d8
There were several overlapping problems here, and this solution is
closely inspired by the one adopted in AArch64 in r201381.
Firstly, scalarisation of v1i1 setcc operations simply fails if the
input types are legal. This is fixed in LegalizeVectorTypes.cpp this
time, and allows AArch64 code to be simplified slightly.
Second, vselect with such a setcc feeding into it ends up in
ScalarizeVectorOperand, where it's not handled. I experimented with an
implementation, but found that whatever DAG came out was rather
horrific. I think Hao's DAG combine approach is a good one for
quality, though there are edge cases it won't catch (to be fixed
separately).
Should fix PR19335.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205625 91177308-0d34-0410-b5e6-96231b3b80d8
Removed "GNU Assembler extension (compatibility)" definitions from ARMInstrInfo.td
Fixed ARMAsmParser::ParseInstruction GNU compatability branch, so it also works for thumb mode from now.
Added new tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205622 91177308-0d34-0410-b5e6-96231b3b80d8
Sorry for the breakage.
For now, it will fail in two ways:
1. To fail for targeting x86_64-mingw32.
<stdin>:131:8: note: possible intended match here
0x30830a0100000002 3 0 1 0 0 is_stmt
2. To fail not to find the target x86.
llc: : error: unable to get target for 'x86_64-unknown-unknown',
see --version and --triple.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205621 91177308-0d34-0410-b5e6-96231b3b80d8
The previous patterns directly inserted FMOV or INS instructions into
the DAG for scalar_to_vector & bitconvert patterns. This is horribly
inefficient and can generated lots more GPR <-> FPR register traffic
than necessary.
It's much better to emit instructions the register allocator
understands so it can coalesce the copies when appropriate.
It led to at least one ISelLowering hack to avoid the problems, which
was incorrect for v1i64 (FPR64 has no dsub). It can now be removed
entirely.
This should also fix PR19331.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205616 91177308-0d34-0410-b5e6-96231b3b80d8
Without this change, the llvm_unreachable kicked in. The code pattern
being spotted is rather non-canonical for 128-bit MLAs, but it can
happen and there's no point in generating sub-optimal code for it just
because it looks odd.
Should fix PR19332.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205615 91177308-0d34-0410-b5e6-96231b3b80d8
recoloring cut-offs are encountered and register allocation failed.
This is related to PR18747
Patch by MAYUR PANDEY <mayur.p@samsung.com>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205601 91177308-0d34-0410-b5e6-96231b3b80d8
Removes unnecessary casts from non-generic address spaces to the generic address
space for certain code patterns.
Patch by Jingyue Wu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205571 91177308-0d34-0410-b5e6-96231b3b80d8
When rematerializing through truncates, the coalescer may produce instructions
with dead defs, but live implicit-defs of subregs:
E.g.
%X1<def,dead> = MOVi64imm 2, %W1<imp-def>; %X1:GPR64, %W1:GPR32
These instructions are live, and their definitions should not be rewritten.
Fixes <rdar://problem/16492408>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205565 91177308-0d34-0410-b5e6-96231b3b80d8
Acording to AMD documentation, the correct opcode for
BFE_INT is 0x5, not 0x4
Fixes Arithm/Absdiff.Mat/3 OpenCV test
Patch by: Bruno Jiménez
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205562 91177308-0d34-0410-b5e6-96231b3b80d8
these is very much off and is more than just the branch
from this bug incorrect:
Address Line Column File ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x30830a0100000002 3 0 1 0 0 is_stmt
0x30830a0100000008 3 0 1 0 0 is_stmt end_sequence
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205551 91177308-0d34-0410-b5e6-96231b3b80d8
More updating of tests to be explicit about the target triple rather than
relying on the default target triple supporting ARM mode.
Indicate to lit that object emission is not yet available for Windows on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205545 91177308-0d34-0410-b5e6-96231b3b80d8
This changes the tests that were targeting ARM EABI to explicitly specify the
environment rather than relying on the default. This breaks with the new
Windows on ARM support when running the tests on Windows where the default
environment is no longer EABI.
Take the opportunity to avoid a pointless redirect (helps when trying to debug
with providing a command line invocation which can be copy and pasted) and
removing a few greps in favour of FileCheck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205541 91177308-0d34-0410-b5e6-96231b3b80d8
Implementing this via ComputeMaskedBits has two advantages:
+ It actually works. DAGISel doesn't deal with the chains properly
in the previous pattern-based solution, so they never trigger.
+ The information can be used in other DAG combines, as well as the
trivial "get rid of truncs". For example if the trunc is in a
different basic block.
rdar://problem/16227836
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205540 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
test/MC/Mips/<isa1>/invalid-<isa2>.s
Test that <isa1> does not support <isa2>'s instructions.
test/MC/Mips/<isa1>/invalid-<isa2>-xfail.s
Things that should be invalid but currently aren't. Will XPASS if any
become invalid.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3262
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205538 91177308-0d34-0410-b5e6-96231b3b80d8