This is a minor drive-by fix with no robust way to unit test.
As an example see neon-div.ll:
SU(16): %Q8<def> = VMOVLsv4i32 %D17, pred:14, pred:%noreg, %Q8<imp-use,kill>
val SU(1): Latency=2 Reg=%Q8
...should be latency=1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158960 91177308-0d34-0410-b5e6-96231b3b80d8
Minor drive by fix to cleanup latency computation. Calling
getOperandLatency with a deliberately incorrect operand index does not
give you the latency you want.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158959 91177308-0d34-0410-b5e6-96231b3b80d8
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).
This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.
Fast mode - allows formation of fused FP ops whenever they're profitable.
Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.
Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.
Note: This option only controls formation of fused ops by the optimizers. Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.
Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158956 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
OR
UnsafeFPMath option (-enable-unsafe-fp-math)
are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).
If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158757 91177308-0d34-0410-b5e6-96231b3b80d8
The condition code didn't actually matter for arm "b" instructions,
unlike "bl". It should just use the R_ARM_JUMP24 reloc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158722 91177308-0d34-0410-b5e6-96231b3b80d8
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.
Add a command line option to llc that enables it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158692 91177308-0d34-0410-b5e6-96231b3b80d8
The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/
a different immediate value in bits [7,0]. Define a generic HINT
instruction and refactor NOP, WFI, WFI, SEV and YIELD to be
assembly aliases of that.
rdar://11600518
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158674 91177308-0d34-0410-b5e6-96231b3b80d8
when a compile time constant is known. This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.
<rdar://problem/11481151>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158659 91177308-0d34-0410-b5e6-96231b3b80d8
This patch will optimize abs(x-y)
FROM
sub, movs, rsbmi
TO
subs, rsbmi
For abs, we will use cmp instead of movs. This is necessary because we already
have an existing peephole pass which optimizes away cmp following sub.
rdar: 11633193
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158551 91177308-0d34-0410-b5e6-96231b3b80d8
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158302 91177308-0d34-0410-b5e6-96231b3b80d8
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158090 91177308-0d34-0410-b5e6-96231b3b80d8
This allows a subtarget to explicitly specify the issue width and
other properties without providing pipeline stage details for every
instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157979 91177308-0d34-0410-b5e6-96231b3b80d8
when a compile time constant is known. This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits.
<rdar://problem/11481151>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157966 91177308-0d34-0410-b5e6-96231b3b80d8
No functional change intended.
Sorry for the churn. The iterator classes are supposed to help avoid
giant commits like this one in the future. The TableGen-produced
register lists are getting quite large, and it may be necessary to
change the table representation.
This makes it possible to do so without changing all clients (again).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157854 91177308-0d34-0410-b5e6-96231b3b80d8
We handle struct byval by inserting a pseudo op, which will be expanded to a
loop at ExpandISelPseudos.
A separate patch for clang will be submitted to enable struct byval.
rdar://9877866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157793 91177308-0d34-0410-b5e6-96231b3b80d8
to pass around a struct instead of a large set of individual values. This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.
NV_CONTRIB
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157479 91177308-0d34-0410-b5e6-96231b3b80d8
32-bit offset jump tables just use real branch instructions and so aren't
marked as data regions. We were still emitting the .end_data_region
marker though, which assert()ed.
rdar://11499158
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157221 91177308-0d34-0410-b5e6-96231b3b80d8
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.
Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.
data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"
The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.
rdar://11459456
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157062 91177308-0d34-0410-b5e6-96231b3b80d8
the 0b10 mask encoding bits. Make MSR APSR writes without a _<bits> qualifier
an alias for MSR APSR_nzcvq even though ARM as deprecated it use. Also add
support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions. Some FIXMEs in
the code for better error checking when versions shouldn't be used.
rdar://11457025
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157019 91177308-0d34-0410-b5e6-96231b3b80d8
Add the MCRegisterInfo to the factories and constructors.
Patch by Tom Stellard <Tom.Stellard@amd.com>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156828 91177308-0d34-0410-b5e6-96231b3b80d8