getCastInstrCost had an assert prohibiting scalar to vector casts. Such casts,
however, are allowed. This should make the vectorizer buildbot happier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166998 91177308-0d34-0410-b5e6-96231b3b80d8
When the operand is a plain immediate rather than a label, print it
as [pc, #imm] like we do for the Thumb2 wide encoding variant.
rdar://12154503
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166991 91177308-0d34-0410-b5e6-96231b3b80d8
We will make them delay slot forms if there is something that can be
placed in the delay slot during a separate pass. Mips16 extended instructions
cannot be placed in delay slots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166990 91177308-0d34-0410-b5e6-96231b3b80d8
is 24 bits not 20 and the decoding needed to correctly handle converting the
J1 and J2 bits to their I1 and I2 values to reconstruct the displacement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166982 91177308-0d34-0410-b5e6-96231b3b80d8
ELF ABI.
A varargs parameter consisting of a single-precision floating-point value,
or of a single-element aggregate containing a single-precision floating-point
value, must be passed in the low-order (rightmost) four bytes of the
doubleword stack slot reserved for that parameter. If there are GPR protocol
registers remaining, the parameter must also be mirrored in the low-order
four bytes of the reserved GPR.
Prior to this patch, such parameters were being passed in the high-order
four bytes of the stack slot and the mirrored GPR.
The patch adds a new test case to verify the correct code generation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166968 91177308-0d34-0410-b5e6-96231b3b80d8
ELF subtarget.
The existing logic is used as a fallback to avoid any changes to the Darwin
ABI. PPC64 ELF now has two possible data layout strings: one for FreeBSD,
which requires 8-byte alignment, and a default string that requires
16-byte alignment.
I've added a test for PPC64 Linux to verify the 16-byte alignment. If
somebody wants to add a separate test for FreeBSD, that would be great.
Note that there is a companion patch to update the alignment information
in Clang, which I am committing now as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166928 91177308-0d34-0410-b5e6-96231b3b80d8
Previously mips16 was sharing the pattern addr which is used for mips32
and mips64. This had a number of problems:
1) Storing and loading byte and halfword quantities for mips16 has particular
problems due to the primarily non mips16 nature of SP. When we must
load/store byte/halfword stack objects in a function, we must create a mips16
alias register for SP. This functionality is tested in stchar.ll.
2) We need to have an FP register under certain conditions (such as
dynamically sized alloca). We use mips16 register S0 for this purpose.
In this case, we also use this register when accessing frame objects so this
issue also affects the complex pattern addr16. This functionality is
tested in alloca16.ll.
The Mips16InstrInfo.td has been updated to use addr16 instead of addr.
The complex pattern C++ function for addr has been copied to addr16 and
updated to reflect the above issues.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166897 91177308-0d34-0410-b5e6-96231b3b80d8
This method emits nodes for passing byval arguments in registers and stack.
This has the same functionality as existing functions PassByValArg64 and
WriteByValArg which will be deleted later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166843 91177308-0d34-0410-b5e6-96231b3b80d8
This method copies byval arguments passed in registers onto the stack and has
the same functionality as existing functions CopyMips64ByValRegs and
ReadByValArg which will be deleted later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166841 91177308-0d34-0410-b5e6-96231b3b80d8
Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.
Port the LoopVectorizer to the new API.
The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166836 91177308-0d34-0410-b5e6-96231b3b80d8
Keep the integer_insertelement test case, the new coalescer can handle
this kind of lane insertion without help from pseudo-instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166835 91177308-0d34-0410-b5e6-96231b3b80d8
Some instructions in ARM require 2 even-odd paired GPRs. This
patch adds support for such register class.
Patch by Weiming Zhao!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166816 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes the rldcl/rldicl/rldicr instruction emission. The issue is
the MDForm_1 instruction defines the PowerISA MB field from 'rldicl'
with the name MBE, but RLDCL/RLDICL/RLDICR definition uses as 'MB'.
It end up by generatint the 'rldicl' enconding at
'lib/Target/PowerPC/PPCGenMCCodeEmitter.inc' to use the fourth argument as the
third. The patch changes it by adjusting to use the fourth argument as
intended.
Fixes PR14180.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166770 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on IRC, add VectorTargetTransform::getNumberOfParts
to provide a stable interface to the vector legalization splitting factor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166751 91177308-0d34-0410-b5e6-96231b3b80d8
and also fixes the R_PPC64_TOC16 and R_PPC64_TOC16_DS relocation offset.
The 'nop' is needed so a restore TOC instruction (ld r2,40(r1)) can be placed
by the linker to correct restore the TOC of previous function.
Current code has two issues: it defines in PPCInstr64Bit.td file a LDinto_toc
and LDtoc_restore as a DSForm_1 with DS_RA=0 where it should be
DS=2 (the 8 bytes displacement of the TOC saving). It also wrongly emits a
MC intruction using an uint32_t value while the PPC::BL8_NOP_ELF
and PPC::BLA8_NOP_ELF are both uint64_t (because of the following 'nop').
This patch corrects the remaining ExecutionEngine using MCJIT:
ExecutionEngine/2002-12-16-ArgTest.ll
ExecutionEngine/2003-05-07-ArgumentTest.ll
ExecutionEngine/2005-12-02-TailCallBug.ll
ExecutionEngine/hello.ll
ExecutionEngine/hello2.ll
ExecutionEngine/test-call.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166682 91177308-0d34-0410-b5e6-96231b3b80d8
structs having size 3, 5, 6, or 7. Such a struct must be passed and received
as right-justified within its register or memory slot. The problem is only
present for structs that are passed in registers.
Previously, as part of a patch handling all structs of size less than 8, I
added logic to rotate the incoming register so that the struct was left-
justified prior to storing the whole register. This was incorrect because
the address of the parameter had already been adjusted earlier to point to
the right-adjusted value in the storage slot. Essentially I had accidentally
accounted for the right-adjustment twice.
In this patch, I removed the incorrect logic and reorganized the code to make
the flow clearer.
The removal of the rotates changes the expected code generation, so test case
structsinregs.ll has been modified to reflect this. I also added a new test
case, jaggedstructs.ll, to demonstrate that structs of these sizes can now
be properly received and passed.
I've built and tested the code on powerpc64-unknown-linux-gnu with no new
regressions. I also ran the GCC compatibility test suite and verified that
earlier problems with these structs are now resolved, with no new regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166680 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds initial PPC64 TOC MC object creation using the small mcmodel
(a single 64K TOC) adding the some TOC relocations (R_PPC64_TOC,
R_PPC64_TOC16, and R_PPC64_TOC16DS).
The addition of 'undefinedExplicitRelSym' hook on 'MCELFObjectTargetWriter'
is meant to avoid the creation of an unreferenced ".TOC." symbol (used in
the .odp creation) as well to set the R_PPC64_TOC relocation target as the
temporary ".TOC." symbol. On PPC64 ABI, the R_PPC64_TOC relocation should
not point to any symbol.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166677 91177308-0d34-0410-b5e6-96231b3b80d8
[register].field
The operator returns the value at the location pointed to by register plus the
offset of field within its structure or union. This patch only handles
immediate fields (i.e., [eax].4). The original displacement has to be a
MCConstantExpr as well.
Part of rdar://12470415 and rdar://12470514
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166632 91177308-0d34-0410-b5e6-96231b3b80d8
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.
rdar://12559385
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166613 91177308-0d34-0410-b5e6-96231b3b80d8
see the offsetof operator. Previously, we were matching something like MOVrm
in the front-end and later matching MOVrr in the back-end. This change makes
things more consistent. It also fixes cases where we can't match against a
memory operand as the source (test cases coming).
Part of rdar://12470317
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166592 91177308-0d34-0410-b5e6-96231b3b80d8
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
v2f32 is added to improve the efficiency of the code generated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166545 91177308-0d34-0410-b5e6-96231b3b80d8
the difference from "int x" (which should go in registers and
"struct y {int x;}" (which should not).
Clang will be updated in the next patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166536 91177308-0d34-0410-b5e6-96231b3b80d8
non-zero value as we don't know the actual value at this point. This is
necessary to get the matching correct in some cases. However, the actual value
set as the base register doesn't matter, since we're just matching not emitting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166523 91177308-0d34-0410-b5e6-96231b3b80d8
and easier to read by adding a couple helper functions. Suggestion by
Chandler Carruth and seconded by Meador Inge!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166515 91177308-0d34-0410-b5e6-96231b3b80d8
- Check index being extracted to be constant 0 before simplfiying.
Otherwise, retain the original sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166504 91177308-0d34-0410-b5e6-96231b3b80d8
- Replace v4i8/v8i8 -> v8f32 DAG combine with custom lowering to reduce
DAG combine overhead.
- Extend the support to v4i16/v8i16 as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166487 91177308-0d34-0410-b5e6-96231b3b80d8
for the PowerPC target, and factoring the results. This will ease future
maintenance of both subtargets.
PPCTargetLowering::LowerCall_Darwin_Or_64SVR4() has grown a lot of special-case
code for the different ABIs, making maintenance difficult. This is getting
worse as we repair errors in the 64-bit ELF ABI implementation, while avoiding
changes to the Darwin ABI logic. This patch splits the routine into
LowerCall_Darwin() and LowerCall_64SVR4(), allowing both versions to be
significantly simplified. I've factored out chunks of similar code where it
made sense to do so. I also performed similar factoring on
LowerFormalArguments_Darwin() and LowerFormalArguments_64SVR4().
There are no functional changes in this patch, and therefore no new test
cases have been developed.
Built and tested on powerpc64-unknown-linux-gnu with no new regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166480 91177308-0d34-0410-b5e6-96231b3b80d8
The CFG of the machine function needs to know that the targets of the indirect
branch are successors to the indirect branch.
<rdar://problem/12529625>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166448 91177308-0d34-0410-b5e6-96231b3b80d8
Per the October 12, 2012 Proposal for annotated disassembly output sent out by
Jim Grosbach this set of changes implements this for X86 and arm. The llvm-mc
tool now has a -mdis option to produced the marked up disassembly and a couple
of small example test cases have been added.
rdar://11764962
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166445 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, it is enabled only if option "enable-mips-tail-calls" is given and
all of the callee's arguments are passed in registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166342 91177308-0d34-0410-b5e6-96231b3b80d8
a memory operand. Retain this information and then add the sizing directives
to the IR. This allows the backend to do proper instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166316 91177308-0d34-0410-b5e6-96231b3b80d8
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().
The X86 backend is already able to handle debugtrap(). This patch is to:
1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
make the __builtin_debugtrap() "available" to all existing ports without the hassle of
changing their code.
3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
__builtin_trap() will be expanded into the function call of the specified trap function.
This behavior may need change in the future.
The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166300 91177308-0d34-0410-b5e6-96231b3b80d8
- If INSERT_VECTOR_ELT is supported (above SSE2, either by custom
sequence of legal insn), transform BUILD_VECTOR into SHUFFLE +
INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few
(so far 1) elements being inserted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166288 91177308-0d34-0410-b5e6-96231b3b80d8
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166273 91177308-0d34-0410-b5e6-96231b3b80d8
test case on PowerPC caused by rounding errors when converting from a 64-bit
integer to a single-precision floating point. The reason for this are
double-rounding effects, since on PowerPC we have to convert to an
intermediate double-precision value first, which gets rounded to the
final single-precision result.
The patch fixes the problem by preparing the 64-bit integer so that the
first conversion step to double-precision will always be exact, and the
final rounding step will result in the correctly-rounded single-precision
result. The generated code sequence is equivalent to what GCC would generate.
When -enable-unsafe-fp-math is in effect, that extra effort is omitted
and we accept possible rounding errors (just like GCC does as well).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166178 91177308-0d34-0410-b5e6-96231b3b80d8
The TargetTransform changes are breaking LTO bootstraps of clang. I am
working with Nadav to figure out the problem, but I am reverting it for now
to get our buildbots working.
This reverts svn commits: 165665 165669 165670 165786 165787 165997
and I have also reverted clang svn 165741
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166168 91177308-0d34-0410-b5e6-96231b3b80d8
All callers of these functions really want the isPhysRegOrOverlapUsed()
functionality which also checks aliases. For historical reasons, targets
without register aliases were calling isPhysRegUsed() instead.
Change isPhysRegUsed() to also check aliases, and switch all
isPhysRegOrOverlapUsed() callers to isPhysRegUsed().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166117 91177308-0d34-0410-b5e6-96231b3b80d8
The previous MRI.isPhysRegUsed(YMM0) would also return true when the
function contains a call to a function that may clobber YMM0. That's
most of them.
Checking the use-def chains allows us to skip functions that don't
explicitly mention YMM registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166110 91177308-0d34-0410-b5e6-96231b3b80d8
- MBB address is only valid as an immediate value in Small & Static
code/relocation models. On other models, LEA is needed to load IP address of
the restore MBB.
- A minor fix of MBB in MC lowering is added as well to enable target
relocation flag being propagated into MC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166084 91177308-0d34-0410-b5e6-96231b3b80d8
- Add custom FP_TO_SINT on v8i16 (and v8i8 which is legalized as v8i16 due to
vector element-wise widening) to reduce DAG combiner and its overhead added
in X86 backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166036 91177308-0d34-0410-b5e6-96231b3b80d8
For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8
bytes are to be passed in the low-order bits ("right-adjusted") of the
doubleword register or memory slot assigned to them. A previous patch
addressed this for aggregates passed in registers. However, small
aggregates passed in the overflow portion of the parameter save area are
still being passed left-adjusted.
The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the
caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on
the callee side. The main fix on the callee side simply extends
existing logic for 1- and 2-byte objects to 1- through 7-byte objects,
and correcting a constant left over from 32-bit code. There is also a
fix to a bogus calculation of the offset to the following argument in
the parameter save area.
On the caller side, again a constant left over from 32-bit code is
fixed. Additionally, some code for 1, 2, and 4-byte objects is
duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only. The
LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to
handle both ABIs, and I propose to separate this into two functions in a
future patch, at which time the duplication can be removed.
The patch adds a new test (structsinmem.ll) to demonstrate correct
passing of structures of all seven sizes. Eight dummy parameters are
used to force these structures to be in the overflow portion of the
parameter save area.
As a side effect, this corrects the case when aggregates passed in
registers are saved into the first eight doublewords of the parameter
save area: Previously they were stored left-justified, and now are
properly stored right-justified. This requires changing the expected
output of existing test case structsinregs.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166022 91177308-0d34-0410-b5e6-96231b3b80d8
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.
If we took AAPCS reference, we can found the next statements:
A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).
B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).
So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.
Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.
Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.
P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166018 91177308-0d34-0410-b5e6-96231b3b80d8
Original message:
The attached is the fix to radar://11663049. The optimization can be outlined by following rules:
(select (x != c), e, c) -> select (x != c), e, x),
(select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.
The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.
While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.
The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".
Original message since r165661:
My previous change has a bug: I negated the condition code of a CMOV, and go ahead creating a new CMOV using the *ORIGINAL* condition code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166017 91177308-0d34-0410-b5e6-96231b3b80d8
- Besides used in SjLj exception handling, __builtin_setjmp/__longjmp is also
used as a light-weight replacement of setjmp/longjmp which are used to
implementation continuation, user-level threading, and etc. The support added
in this patch ONLY addresses this usage and is NOT intended to support SjLj
exception handling as zero-cost DWARF exception handling is used by default
in X86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165989 91177308-0d34-0410-b5e6-96231b3b80d8
This patch replaces the EmitRawText by a EmitTCEntry class (specialized for
each Streamer) in PowerPC64 TOC entry creation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165940 91177308-0d34-0410-b5e6-96231b3b80d8
Convert the internal representation of the Attributes class into a pointer to an
opaque object that's uniqued by and stored in the LLVMContext object. The
Attributes class then becomes a thin wrapper around this opaque
object. Eventually, the internal representation will be expanded to include
attributes that represent code generation options, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165917 91177308-0d34-0410-b5e6-96231b3b80d8
X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this
level is often not a good idea because it's too late for many optimizations to
kick in. This solution doesn't add any extensions (truncs are free) and tries
to avoid introducing partial register stalls by filtering direct copyfromregs.
I'm seeing a ~10% speedup on reading a random .png file with libpng15 via
graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165868 91177308-0d34-0410-b5e6-96231b3b80d8
the interface between the front-end and the MC layer when parsing inline
assembly. Unfortunately, this is too deep into the parsing stack. Specifically,
we're unable to handle target-independent assembly (i.e., assembly directives,
labels, etc.). Note the MatchAndEmitInstruction() isn't the correct
abstraction either. I'll be exposing target-independent hooks shortly, so this
is really just a cleanup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165858 91177308-0d34-0410-b5e6-96231b3b80d8
local frame causes problem.
For example:
void f(StructToPass s) {
g(&s, sizeof(s));
}
will cause problem with tail-call since part of s is passed via registers and
saved in f's local frame. When g tries to access s, part of s may be corrupted
since f's local frame is popped out before the tail-call.
The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach, if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
tail-call.
rdar://12442472
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165853 91177308-0d34-0410-b5e6-96231b3b80d8
isa<> et al. automatically infer when the cast is an upcast (including a
self-cast), so these are no longer necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165767 91177308-0d34-0410-b5e6-96231b3b80d8
For function calls on the 64-bit PowerPC SVR4 target, each parameter
is mapped to as many doublewords in the parameter save area as
necessary to hold the parameter. The first 13 non-varargs
floating-point values are passed in registers; any additional
floating-point parameters are passed in the parameter save area. A
single-precision floating-point parameter (32 bits) must be mapped to
the second (rightmost, low-order) word of its assigned doubleword
slot.
Currently LLVM violates this ABI requirement by mapping such a
parameter to the first (leftmost, high-order) word of its assigned
doubleword slot. This is internally self-consistent but will not
interoperate correctly with libraries compiled with an ABI-compliant
compiler.
This patch corrects the problem by adjusting the parameter addressing
on both sides of the calling convention.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165714 91177308-0d34-0410-b5e6-96231b3b80d8
Note: [D]M{T,F}CP2 is just a recommended encoding. Vendors often provide a
custom CP2 that interprets instructions differently and may wish to add their
own instructions that use this opcode. We should ensure that this is easy to
do. I will probably add a 'has custom CP{0-3}' subtarget flag to make this
easy: We want to avoid the GCC situation where every MIPS vendor makes a custom
fork that breaks every other MIPS CPU and so can't be merged upstream.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165711 91177308-0d34-0410-b5e6-96231b3b80d8
Original message:
The attached is the fix to radar://11663049. The optimization can be outlined by following rules:
(select (x != c), e, c) -> select (x != c), e, x),
(select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.
The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.
While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.
The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165661 91177308-0d34-0410-b5e6-96231b3b80d8
the compiler makes use of GPR0. However, there are two flavors of
GPR0 defined by the target: the 32-bit GPR0 (R0) and the 64-bit GPR0
(X0). The spill/reload code makes use of R0 regardless of whether we
are generating 32- or 64-bit code.
This patch corrects the problem in the obvious manner, using X0 and
ADDI8 for 64-bit and R0 and ADDI for 32-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165658 91177308-0d34-0410-b5e6-96231b3b80d8
the Altivec extensions were introduced. Its use is optional, and
allows the compiler to communicate to the operating system which
vector registers should be saved and restored during a context switch.
In practice, this information is ignored by the various operating
systems using the SVR4 ABI; the kernel saves and restores the entire
register state. Setting the VRSAVE register is no longer performed by
the AIX XL compilers, the IBM i compilers, or by GCC on Power Linux
systems. It seems best to avoid this logic within LLVM as well.
This patch avoids generating code to update and restore VRSAVE for the
PowerPC SVR4 ABIs (32- and 64-bit). The code remains in place for the
Darwin ABI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165656 91177308-0d34-0410-b5e6-96231b3b80d8
- Due to the current matching vector elements constraints in
ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
v2f32) is scalarized. Add a customized v2f32 widening to convert it
into a target-specific X86ISD::VFPROUND to work around this
constraints.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165631 91177308-0d34-0410-b5e6-96231b3b80d8
- Due to the current matching vector elements constraints in ISD::FP_EXTEND,
rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening
to convert it into a target-specific X86ISD::VFPEXT to work around this
constraints. This patch also reverts a previous attempt to fix this issue by
recovering the scalarized ISD::FP_EXTEND pattern and thus significantly
reduces the overhead of supporting non-power-2 vector FP extend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165625 91177308-0d34-0410-b5e6-96231b3b80d8
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.
7 ops is needed, but SDNode with only 6 is created.
In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.
The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165617 91177308-0d34-0410-b5e6-96231b3b80d8
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.
Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.
Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165616 91177308-0d34-0410-b5e6-96231b3b80d8
Allows the new machine model to be used for NumMicroOps and OutputLatency.
Allows the HazardRecognizer to be disabled along with itineraries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165603 91177308-0d34-0410-b5e6-96231b3b80d8
This patch provides initial implementation of load address
macro instruction for Mips. We have implemented two kinds
of expansions with their variations depending on the size
of immediate operand:
1) load address with immediate value directly:
* la d,j => addiu d,$zero,j (for -32768 <= j <= 65535)
* la d,j => lui d,hi16(j)
ori d,d,lo16(j) (for any other 32 bit value of j)
2) load load address with register offset value
* la d,j(s) => addiu d,s,j (for -32768 <= j <= 65535)
* la d,j(s) => lui d,hi16(j) (for any other 32 bit value of j)
ori d,d,lo16(j)
addu d,d,s
This patch does not cover the case when the address is loaded
from the value of the label or function.
Contributer: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165561 91177308-0d34-0410-b5e6-96231b3b80d8
- Teach it about dadd[i] instructions and move pseudo-instruction
- Make it parse the register names correctly (for N32 / N64)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165506 91177308-0d34-0410-b5e6-96231b3b80d8
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165488 91177308-0d34-0410-b5e6-96231b3b80d8
Vector compare using altivec 'vcmpxxx' instructions have as third argument
a vector register instead of CR one, different from integer and float-point
compares. This leads to a failure in code generation, where 'SelectSETCC'
expects a DAG with a CR register and gets vector register instead.
This patch changes the behavior by just returning a DAG with the
vector compare instruction based on the type. The patch also adds a testcase
for all vector types llvm defines.
It also included a fix on signed 5-bits predicates printing, where
signed values were not handled correctly as signed (char are unsigned by
default for PowerPC). This generates 'vspltisw' (vector splat)
instruction with SIM out of range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165419 91177308-0d34-0410-b5e6-96231b3b80d8
into separate versions for the Darwin and 64-bit SVR4 ABIs. This will
facilitate doing more major surgery on the 64-bit SVR4 ABI in the near future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165336 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure functions located in user specified text sections (via the
section attribute) are located together with the default text sections.
Otherwise, for large object files, the relocations for call instructions
are more likely to be out of range. This becomes even more likely in the
presence of LTO.
rdar://12402636
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165254 91177308-0d34-0410-b5e6-96231b3b80d8
a) frame setup instructions define the prologue
b) we shouldn't change our location mid-stream
Add a test to make sure that the stack adjustment stays within
the prologue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165250 91177308-0d34-0410-b5e6-96231b3b80d8
- Add 'HwEncoding' for X86 registers and call getEncodingValue() to
retrieve their encoding values.
- This's the first step to adopt new scheme. Furthur revising is onging.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165241 91177308-0d34-0410-b5e6-96231b3b80d8
"Instruction 'foo' has no tokens" errors during llvm-tblgen
-gen-asm-matcher attempts. At this time, the added
tokens are "#comment" style rather than the actual mnemonic. This will
be revisited once the rest of the base asmparser bits get straightened
out for ppc64-elf-linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165237 91177308-0d34-0410-b5e6-96231b3b80d8
macro instruction (li) in the assembler.
We have identified three possible expansions depending on
the size of immediate operand:
1) for 0 ≤ j ≤ 65535.
li d,j =>
ori d,$zero,j
2) for −32768 ≤ j < 0.
li d,j =>
addiu d,$zero,j
3) for any other value of j that is representable as a 32-bit integer.
li d,j =>
lui d,hi16(j)
ori d,d,lo16(j)
All of the above have been implemented in ths patch.
Contributer: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165199 91177308-0d34-0410-b5e6-96231b3b80d8
.set option
The patch implements following options
at - lets the assembler use the $at register for macros,
but generates warnings if the source program uses $at
noat - let source programs use $at without issuingwarnings.
noreorder - prevents the assembler from reordering machine
language instructions.
nomacro - causes the assembler to print a warning whenever
an assembler operation generates more than one
machine language instruction.
macro - lets the assembler generate multiple machine instructions
from a single assembler instruction
reorder - lets the assembler reorder machine language
instructions to improve performance
The above variants are parsed and their boolean values set or unset.
The code to actually use them will come later.
Following options are not implemented yet:
nomips16
nomicromips
move
nomove
Contributer: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165194 91177308-0d34-0410-b5e6-96231b3b80d8