- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder, or both.
Patch by Tyler Nowicki!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163150 91177308-0d34-0410-b5e6-96231b3b80d8
Change current Hexagon MI scheduler to use new converging
scheduler. Integrates DFA resource model into it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163137 91177308-0d34-0410-b5e6-96231b3b80d8
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.
Bug 12213
Patch by Yin Ma!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163136 91177308-0d34-0410-b5e6-96231b3b80d8
the NumMCOperands argument to the GetMCInstOperandNum() function that is set
to the number of MCOperands this asm operand mapped to.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163124 91177308-0d34-0410-b5e6-96231b3b80d8
MatchInstructionImpl() function.
These values are used by the ConvertToMCInst() function to index into the
ConversionTable. The values are also needed to call the GetMCInstOperandNum()
function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163101 91177308-0d34-0410-b5e6-96231b3b80d8
For example, the ARM target does not have efficient ISel handling for vector
selects with scalar conditions. This patch adds a TLI hook which allows the
different targets to report which selects are supported well and which selects
should be converted to CF duting codegen prepare.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163093 91177308-0d34-0410-b5e6-96231b3b80d8
NEON domain conversion was too heavy-handed with its widened
registers, which could have stripped existing instructions of their
dependency, leaving them vulnerable to scheduling errors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163070 91177308-0d34-0410-b5e6-96231b3b80d8
output chain is correctly setup.
As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.
rdar://11457792
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163036 91177308-0d34-0410-b5e6-96231b3b80d8
- In addition to undefined, if V2 is zero vector, skip 2nd PSHUFB and POR as
well as PSHUFB will zero elements with negative indices.
Patch by Sriram Murali <sriram.murali@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163018 91177308-0d34-0410-b5e6-96231b3b80d8
on the size of the extraction and its position in the 64 bit word.
This patch allows support of the dext transformations with mips64 direct
object output.
0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32
DINS
The field is entirely contained in the right-most word of the doubleword
32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64
DINSM
The field straddles the words of the doubleword
32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32
DINSU
The field is entirely contained in the left-most word of the doubleword
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163010 91177308-0d34-0410-b5e6-96231b3b80d8
Thumb2 instructions are mostly constrained to rGPR, not tGPR which is
for Thumb1.
rdar://problem/12203728
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162968 91177308-0d34-0410-b5e6-96231b3b80d8
The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v'
prefix, resulting in mis-assembly of the vanilla movd instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162963 91177308-0d34-0410-b5e6-96231b3b80d8
- Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is
enabled.
As the penalty of inter-mixing SSE and AVX instructions, we need
prevent SSE legacy insn from being generated except explicitly
specified through some intrinsics. For patterns supported by both
SSE and AVX, so far, we force AVX insn will be tried first relying on
AddedComplexity or position in td file. It's error-prone and
introduces bugs accidentally.
'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
by AVX, we need this predicate to force VEX encoding or SSE legacy
encoding only.
For insns not inherited by AVX, we still use the previous predicates,
i.e. 'HasSSEx'. So far, these insns fall into the following
categories:
* SSE insns with MMX operands
* SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
CRC, and etc.)
* SSE4A insns.
* MMX insns.
* x87 insns added by SSE.
2 test cases are modified:
- test/CodeGen/X86/fast-isel-x86-64.ll
AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
selected by fast-isel due to complicated pattern and fast-isel
fallback to materialize it from constant pool.
- test/CodeGen/X86/widen_load-1.ll
AVX code generation is different from SSE one after fixing SSE/AVX
inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
'vmovaps'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162919 91177308-0d34-0410-b5e6-96231b3b80d8
[Tobias von Koch] What's happening here is that the CR6SET/CR6UNSET is breaking the chain of register copies glued to the function call (BL_SVR4 node). The scheduler then moves other instructions in between those and the function call, which isn't good!
Right. That's the case where there is no chain of register copies before the call, so InFlag == 0... Attached is a new revision of the patch which should fix this for good.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162916 91177308-0d34-0410-b5e6-96231b3b80d8
- The root cause is that target constant materialization in X86 fast-isel
creates a PC-rel addressing which may overflow 32-bit range in non-Small code
model if .rodata section is allocated too far away from code segment in
MCJIT, which uses Large code model so far.
- Follow the similar logic to fix non-Small code model in fast-isel by skipping
non-Small code model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162881 91177308-0d34-0410-b5e6-96231b3b80d8
Ordered memory operations are more constrained than volatile loads and
stores because they must be ordered with respect to all other memory
operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162861 91177308-0d34-0410-b5e6-96231b3b80d8
We need to reserve space for the mandatory traceback fields,
though leaving them as zero is appropriate for now.
Although the ABI calls for these fields to be filled in fully, no
compiler on Linux currently does this, and GDB does not read these
fields. GDB uses the first word of zeroes during exception handling to
find the end of the function and the size field, allowing it to compute
the beginning of the function. DWARF information is used for everything
else. We need the extra 8 bytes of pad so the size field is found in
the right place.
As a comparison, GCC fills in a few of the fields -- language, number
of saved registers -- but ignores the rest. IBM's proprietary OSes do
make use of the full traceback table facility.
Patch by Bill Schmidt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162854 91177308-0d34-0410-b5e6-96231b3b80d8
This disables malloc-specific optimization when -fno-builtin (or -ffreestanding)
is specified. This has been a problem for a long time but became more severe
with the recent memory builtin improvements.
Since the memory builtin functions are used everywhere, this required passing
TLI in many places. This means that functions that now have an optional TLI
argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead
mallocs anymore if the TLI argument is missing. I've updated most passes to do
the right thing.
Fixes PR13694 and probably others.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162841 91177308-0d34-0410-b5e6-96231b3b80d8
I have tested the fix, but have not been successfull in generating
a robust unit test. This can only be exposed through particular
register assignments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162821 91177308-0d34-0410-b5e6-96231b3b80d8
on the size of the extraction and its position in the 64 bit word.
This patch allows support of the dext transformations with mips64 direct
object output.
0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32
DINS
The field is entirely contained in the right-most word of the doubleword
32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64
DINSM
The field straddles the words of the doubleword
32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32
DINSU
The field is entirely contained in the left-most word of the doubleword
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162782 91177308-0d34-0410-b5e6-96231b3b80d8
transformed to the final instruction variant. An
example would be dsrll which is transformed into
dsll32 if the shift value is greater than 32.
For direct object output we need to do this transformation
in the codegen. If the instruction was inside branch
delay slot, it was being missed. This patch corrects this
oversight.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162779 91177308-0d34-0410-b5e6-96231b3b80d8
traceback table on PowerPC64. This helps gdb handle exceptions. The other
mandatory fields are ignored by gdb and harder to implement so just add
there a FIXME.
Patch by Bill Schmidt. PR13641.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162778 91177308-0d34-0410-b5e6-96231b3b80d8
Fix a couple of bugs in mips' long branch pass.
This patch was supposed to be committed along with r162731, so I don't have a
new test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162777 91177308-0d34-0410-b5e6-96231b3b80d8
Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to
the PowerPC backend.
Patch by Tobias von Koch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162764 91177308-0d34-0410-b5e6-96231b3b80d8
it here, then a 'register-memory' version would wrongly get the commutative
flag.
<rdar://problem/12180135>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162741 91177308-0d34-0410-b5e6-96231b3b80d8
- Add a target-specific DAG optimization to recognize a pattern PTEST-able.
Such a pattern is a OR'd tree with X86ISD::OR as the root node. When
X86ISD::OR node has only its flag result being used as a boolean value and
all its leaves are extracted from the same vector, it could be folded into an
X86ISD::PTEST node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162735 91177308-0d34-0410-b5e6-96231b3b80d8
This wasn't the right way to enforce ordering of atomics.
We are already setting the isVolatile bit on memory operands of atomic
operations which is good enough to enforce the correct ordering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162732 91177308-0d34-0410-b5e6-96231b3b80d8
Instructions emitted to compute branch offsets now use immediate operands
instead of symbolic labels. This change was needed because there were problems
when R_MIPS_HI16/LO16 relocations were used to make shared objects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162731 91177308-0d34-0410-b5e6-96231b3b80d8
Slight reorganisation of PPC instruction classes for scheduling. No
functionality change for existing subtargets.
- Clearly separate load/store-with-update instructions from regular loads and stores.
- Split IntRotateD -> IntRotateD and IntRotateDI
- Split out fsub and fadd from FPGeneral -> FPAddSub
- Update existing itineraries
Patch by Tobias von Koch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162729 91177308-0d34-0410-b5e6-96231b3b80d8
Allow load-immediates to be rematerialised in the register coalescer for
PPC. This makes test/CodeGen/PowerPC/big-endian-formal-args.ll fail,
because it relies on a register move getting emitted. The immediate load is
equivalent, so change this test case.
Patch by Tobias von Koch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162727 91177308-0d34-0410-b5e6-96231b3b80d8
The 32-bit ABI requires CR bit 6 to be set if the call has fp arguments and
unset if it doesn't. The solution up to now was to insert a MachineNode to
set/unset the CR bit, which produces a CR vreg. This vreg was then copied
into CR bit 6. When the register allocator saw a bunch of these in the same
function, it allocated the set/unset CR bit in some random CR register (1
extra instruction) and then emitted CR moves before every vararg function
call, rather than just setting and unsetting CR bit 6 directly before every
vararg function call. This patch instead inserts a PPCcrset/PPCcrunset
instruction which are then matched by a dedicated instruction pattern.
Patch by Tobias von Koch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162725 91177308-0d34-0410-b5e6-96231b3b80d8
The zeroextend IR instruction is lowered to an 'and' node with an immediate
mask operand, which in turn gets legalised to a sequence of ori's & ands.
This can be done more efficiently using the rldicl instruction.
Patch by Tobias von Koch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162724 91177308-0d34-0410-b5e6-96231b3b80d8
It is not safe to use normal LDR instructions because they may be
reordered by the scheduler. The ATOMIC_LDR pseudos have a mayStore flag
that prevents reordering.
Atomic loads are also prevented from participating in rematerialization
and load folding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162713 91177308-0d34-0410-b5e6-96231b3b80d8
ARMConstantIslandPass expects this instruction to stay in the same basic
block as the jump table branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162615 91177308-0d34-0410-b5e6-96231b3b80d8
It's not clear that they should be marked as such, but tbb formation
fails if t2LEApcrelJT is hoisted of of a loop.
This doesn't change the flags on these instructions,
UnmodeledSideEffects was already inferred from the missing pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162603 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM BL and BLX instructions don't have predicate operands, but the
thumb variants tBL and tBLX do.
The argument registers should be added as implicit uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162593 91177308-0d34-0410-b5e6-96231b3b80d8
There is special magic happening when returning floating point values on
the x87 stack. The RET instructions get extra f80 operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162592 91177308-0d34-0410-b5e6-96231b3b80d8
the temporary register that was used to load the immediate. Currently, it always
returns register $at, but this will change if, in the future, we decide to use
another register.
No changes in functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162417 91177308-0d34-0410-b5e6-96231b3b80d8
Assertion failed: (Start.isValid() == End.isValid() && "Start and end should
either both be valid or both be invalid!")
when parsing inline asm. SMLoc assumes that the first char * in the source is
invalid. However, when parsing an inline asm the mnemonic is at this location.
I don't want to change SMLoc, so use a trivial workaround.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162381 91177308-0d34-0410-b5e6-96231b3b80d8
to prevent it from being clobbered. mips uses $gp to access small data section.
This bug was originally reported by Carl Norum.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162340 91177308-0d34-0410-b5e6-96231b3b80d8
within the codegen EK_GPRel64BlockAddress. This was not
supported for direct object output and resulted in an assertion.
This change adds support for EK_GPRel64BlockAddress for
direct object.
One fallout from this is to turn on rela relocations
for mips64 to match gas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162334 91177308-0d34-0410-b5e6-96231b3b80d8
this is the index of the operand that failed to match.
Note: This may cause a buildbot failure due to an API mismatch in clang. Should
recover with my next commit to clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162295 91177308-0d34-0410-b5e6-96231b3b80d8
this allows for better code generation.
Added a new DAGCombine transformation to convert FMAX and FMIN to FMANC and
FMINC, which are commutative.
For example:
movaps %xmm0, %xmm1
movsd LC(%rip), %xmm0
minsd %xmm1, %xmm0
becomes:
minsd LC(%rip), %xmm0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162187 91177308-0d34-0410-b5e6-96231b3b80d8
Add these transformations to the existing add/sub ones:
(and (select cc, -1, c), x) -> (select cc, x, (and, x, c))
(or (select cc, 0, c), x) -> (select cc, x, (or, x, c))
(xor (select cc, 0, c), x) -> (select cc, x, (xor, x, c))
The selects can then be transformed to a single predicated instruction
by peephole.
This transformation will make it possible to eliminate the ISD::CAND,
COR, and CXOR custom DAG nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162176 91177308-0d34-0410-b5e6-96231b3b80d8
arithmetic instructions. However, when small data types are used, a truncate
node appears between the SETCC node and the arithmetic operation. This patch
adds support for this pattern.
Before:
xorl %esi, %edi
testb %dil, %dil
setne %al
ret
After:
xorb %dil, %sil
setne %al
ret
rdar://12081007
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162160 91177308-0d34-0410-b5e6-96231b3b80d8
No new tests are added.
All tests in ExecutionEngine/MCJIT that have been failing pass after this patch
is applied (when "make check" is done on a mips board).
Patch by Petar Jovanovic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162135 91177308-0d34-0410-b5e6-96231b3b80d8
PEI can't handle the pseudo-instructions. This can be removed when the
pseudo-instructions are replaced by normal predicated instructions.
Fixes PR13628.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162130 91177308-0d34-0410-b5e6-96231b3b80d8
make it more consistent with its intended semantics.
The `linker_private_weak_def_auto' linkage type was meant to automatically hide
globals which never had their addresses taken. It has nothing to do with the
`linker_private' linkage type, which outputs the symbols with a `l' (ell) prefix
among other things.
The intended semantic is more like the `linkonce_odr' linkage type.
Change the name of the linkage type to `linkonce_odr_auto_hide'. And therefore
changing the semantics so that it produces the correct output for the linker.
Note: The old linkage name `linker_private_weak_def_auto' will still parse but
is not a synonym for `linkonce_odr_auto_hide'. This should be removed in 4.0.
<rdar://problem/11754934>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162114 91177308-0d34-0410-b5e6-96231b3b80d8
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.
Then the pseudo-instructions can go away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162061 91177308-0d34-0410-b5e6-96231b3b80d8
Without fastcc support, the caller just falls through to CallingConv::C
for fastcc, but callee still uses fastcc, this inconsistency of calling
convention is a problem, and fastcc support can fix it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162013 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.
This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161994 91177308-0d34-0410-b5e6-96231b3b80d8
When predicating this instruction:
Rd = ADD Rn, Rm
We need an extra operand to represent the value given to Rd when the
predicate is false:
Rd = ADDCC Rfalse, Rn, Rm, pred
The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.
Previously, Rd and Rn were tied, but that is not required.
Compare to MOVCC:
Rd = MOVCC Rfalse, Rtrue, pred
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161955 91177308-0d34-0410-b5e6-96231b3b80d8
reversed. This leads to wrong codegen for float-to-half conversion
intrinsics which are used to support storage-only fp16 type.
NEON variants of same instructions are fine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161907 91177308-0d34-0410-b5e6-96231b3b80d8
- FP_EXTEND only support extending from vectors with matching elements.
This results in the scalarization of extending to v2f64 from v2f32,
which will be legalized to v4f32 not matching with v2f64.
- add X86-specific VFPEXT supproting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover back the original
extending from v4f32 to v2f64.
- test case is enhanced to include different vector width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161894 91177308-0d34-0410-b5e6-96231b3b80d8
Refactor the TableGen'erated fixed length disassemblmer to use a
table-driven state machine rather than a massive set of nested
switch() statements.
As a result, the ARM Disassembler (ARMDisassembler.cpp) builds much more
quickly and generates a smaller end result. For a Release+Asserts build on
a 16GB 3.4GHz i7 iMac w/ SSD:
Time to compile at -O2 (averaged w/ hot caches):
Previous: 35.5s
New: 8.9s
TEXT size:
Previous: 447,251
New: 297,661
Builds in 25% of the time previously required and generates code 66% of
the size.
Execution time of the disassembler is only slightly slower (7% disassembling
10 million ARM instructions, 19.6s vs 21.0s). The new implementation has
not yet been tuned, however, so the performance should almost certainly
be recoverable should it become a concern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161888 91177308-0d34-0410-b5e6-96231b3b80d8
It never does anything when running 'make check', and it get's in the
way of updating live intervals in 2-addr.
The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.
When the MI scheduler is enabled, there will be no less than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161794 91177308-0d34-0410-b5e6-96231b3b80d8
This was causing unnecessary spills/restores of callee saved registers.
Fixes PR13572.
Patch by Pranav Bhandarkar!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161778 91177308-0d34-0410-b5e6-96231b3b80d8
OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed
to a memory operand.
PR13576
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161769 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, we used VLD1.32 in all cases, however there are both 16 and 64-bit
accesses being selected, so we need to use an appropriate width load in those
cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161748 91177308-0d34-0410-b5e6-96231b3b80d8
architecture
It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7
thumb O3.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161736 91177308-0d34-0410-b5e6-96231b3b80d8
- FCMOV only supports a subset of X86 conditions. Skip boolean
simplification if X86 condition is not valid for FCMOV.
- add a minimal test case for PR13577.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161732 91177308-0d34-0410-b5e6-96231b3b80d8
FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge.
FeatureFastUAMem is already on if we pass in nehalem or westmere as a command
argument.
rdar: 7252306
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161717 91177308-0d34-0410-b5e6-96231b3b80d8
- if a boolean test (X86ISD::CMP or X86ISD:SUB) checks a boolean value
generated from X86ISD::SETCC, try to simplify the boolean value
generation and checking by reusing the original EFLAGS with proper
condition code
- add hooks to X86 specific SETCC/BRCOND/CMOV, the major 3 places
consuming EFLAGS
part of patches fixing PR12312
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161687 91177308-0d34-0410-b5e6-96231b3b80d8
the register info for getEncodingValue. This builds on the
small patch of yesterday to set HWEncoding in the register
file.
One (deprecated) use was turned into a hard number to avoid
needing register info in the old JIT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161628 91177308-0d34-0410-b5e6-96231b3b80d8
This new API will be used by clang to parse ms-style inline asms.
One goal of this project is to use this style of inline asm for targets other
then x86. Therefore, this API needs to be implemented for non-x86 targets at
some point in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161624 91177308-0d34-0410-b5e6-96231b3b80d8
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.
Bug 12213
Patch by Yin Ma!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161581 91177308-0d34-0410-b5e6-96231b3b80d8
This way of using getNextOperandForReg() was unlikely to work as
intended. We don't give any guarantees about the order of operands in
the use-def chains, so looking only at operands following a given
operand in the chain doesn't make sense.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161542 91177308-0d34-0410-b5e6-96231b3b80d8
This replaces an existing subtarget hook on ARM and allows standard
CodeGen passes to potentially use the property.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161471 91177308-0d34-0410-b5e6-96231b3b80d8
We perform the following:
1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.
Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled
by peephole now.
rdar://11873276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161462 91177308-0d34-0410-b5e6-96231b3b80d8
We can't rematerialize a PIC base after register allocation anyway, and
scanning physreg use-def chains is very expensive in a function with
many calls.
<rdar://problem/12047515>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161461 91177308-0d34-0410-b5e6-96231b3b80d8
initialize fields of the class that it used.
The result was nonsense code.
Before:
0000000000000000 <foo>:
0: 00441100 0x441100
4: 03e00008 jr ra
8: 00000000 nop
After:
0000000000000000 <foo>:
0: 00041000 sll v0,a0,0x0
4: 03e00008 jr ra
8: 00000000 nop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161377 91177308-0d34-0410-b5e6-96231b3b80d8
This allows codegen passes to query properties like
InstrItins->SchedModel->IssueWidth. It also ensure's that
computeOperandLatency returns the X86 defaults for loads and "high
latency ops". This should have no significant impact on existing
schedulers because X86 defaults happen to be the same as global
defaults.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161370 91177308-0d34-0410-b5e6-96231b3b80d8
I hit this in a very large program (spirit.cpp), but
have not figured out how to make a small make check
test for it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161366 91177308-0d34-0410-b5e6-96231b3b80d8
were using a class defined for 32 bit instructions and
thus the instruction was for addiu instead of daddiu.
This was corrected by adding the instruction opcode as a
field in the base class to be filled in by the defs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161359 91177308-0d34-0410-b5e6-96231b3b80d8
These 2 relocations gain access to the
highest and the second highest 16 bits
of a 64 bit object.
R_MIPS_HIGHER %higher(A+S)
The %higher(x) function is [ (((long long) x + 0x80008000LL) >> 32) & 0xffff ].
R_MIPS_HIGHEST %highest(A+S)
The %highest(x) function is [ (((long long) x + 0x800080008000LL) >> 48) & 0xffff ].
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161348 91177308-0d34-0410-b5e6-96231b3b80d8
The MFTB instruction itself is being phased out, and its functionality
is provided by MFSPR. According to the ISA docs, using MFSPR works on all known
chips except for the 601 (which did not have a timebase register anyway)
and the POWER3.
Thanks to Adhemerval Zanella for pointing this out!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161346 91177308-0d34-0410-b5e6-96231b3b80d8
On PPC64, this can be done with a simple TableGen pattern.
To enable this, I've added the (otherwise missing) readcyclecounter
SDNode definition to TargetSelectionDAG.td.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161302 91177308-0d34-0410-b5e6-96231b3b80d8
(this corresponds by spilling/reloading regs in DTriple / DQuad reg classes).
No testcase, found by inspection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161300 91177308-0d34-0410-b5e6-96231b3b80d8
this makes this hack a bit more bearable
for poor souls who need to pass custom
preprocessor flags to the build process
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161240 91177308-0d34-0410-b5e6-96231b3b80d8
Fast isel doesn't currently have support for translating builtin function
calls to target instructions. For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization. Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel. <rdar://problem/12008746>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161232 91177308-0d34-0410-b5e6-96231b3b80d8
This just provides a way to look up a LibFunc::Func enum value for a
function name. Alphabetize the enums and function names so we can use a
binary search.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161231 91177308-0d34-0410-b5e6-96231b3b80d8
Now that TableGen supports references to NAME w/o it being explicitly
referenced in the definition's own name, use that to simplify
assembly InstAlias definitions in multiclasses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161218 91177308-0d34-0410-b5e6-96231b3b80d8
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.
rdar://10554090 and rdar://11873276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161207 91177308-0d34-0410-b5e6-96231b3b80d8
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.
This patch is a rework of r160919 and was tested on clang self-host on my local
machine.
rdar://10554090 and rdar://11873276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161152 91177308-0d34-0410-b5e6-96231b3b80d8
No new test case is added.
This patch makes test JITTest.FunctionIsRecompiledAndRelinked pass on mips
platform.
Patch by Petar Jovanovic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161098 91177308-0d34-0410-b5e6-96231b3b80d8