Revert "Add classof implementations to the raw_ostream classes."
Revert "Use the cast machinery to remove dummy uses of formatted_raw_ostream."
The underlying issue can be fixed without classof.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234495 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, llvm (backend) doesn't know cortex-r4, even though it is the
default target for armv7r. Using "--target=armv7r-arm-none-eabi" provokes
'cortex-r4' is not a recognized processor for this target' by llvm.
This patch adds support for cortex-r4 and, very closely related, r4f.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234486 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Make the code more readable by fusing the for-loops together and explicitly checking for each register class.
Also, this version is more straightforward because it doesn't assume that FPU registers always come before CPU registers in the CalleeSavedInfo vector.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8033
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234475 91177308-0d34-0410-b5e6-96231b3b80d8
restrictions when choosing a type for small-memcpy inlining in
SelectionDAGBuilder.
This ensures that the loads and stores output for the memcpy won't be further
expanded during legalization, which would cause the total number of instructions
for the memcpy to exceed (often significantly) the inlining thresholds.
<rdar://problem/17829180>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234462 91177308-0d34-0410-b5e6-96231b3b80d8
If we know we are producing an object, we don't need to wrap the stream
in a formatted_raw_ostream anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234461 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Even though there is no 2nd register operand in the "lw/sw $8, symbol" case, we still try to find one,
and we end up with $0, which makes us generate an unnecessary "addu $8, $8, $0" (a.k.a. "move $8, $8").
We can avoid this by checking if the 2nd register operand is different from $0, before generating the addu.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8055
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234406 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
They are of the form "bnezl/beqzl $rs, offset" and expand to "bnel/beql $rs, $zero, offset".
These instructions are used in Linux inline assembly.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8540
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234401 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Looks like new code from [[ http://reviews.llvm.org/rL222057 | rL222057 ]] doesn't account for early `return` in `ARMFrameLowering::emitPrologue`, which leads to loosing `.cfi_def_cfa_offset` directive for functions without stack frame.
Reviewers: echristo, rengolin, asl, t.p.northover
Reviewed By: t.p.northover
Subscribers: llvm-commits, rengolin, aemerson
Differential Revision: http://reviews.llvm.org/D8606
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234399 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
These AssemblerPredicate's are unnecessary and actually make some instructions unusable when assembling pre-MIPS32 ISAs.
For example, this was causing the IAS to reject the 'j' instruction for MIPS I-V.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8300
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234398 91177308-0d34-0410-b5e6-96231b3b80d8
This is currently considered experimental, but most of the more
commonly used instructions should work.
So far only SI has been extensively tested, CI and VI probably work too,
but may be buggy. The current set of tests cases do not give complete
coverage, but I think it is sufficient for an experimental assembler.
See the documentation in R600Usage for more information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234381 91177308-0d34-0410-b5e6-96231b3b80d8
We weren't checking the sign of the floating point immediate before translating
it to "fmov sD, wzr". Similarly for D-regs.
Technically "movi vD.2s, #0x80, lsl #24" would work most of the time, but it's
not a blessed alias (and I don't think it should be since people expect writing
sD to zero out the high lanes, and there's no dD equivalent). So an error it is.
rdar://20455398
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234372 91177308-0d34-0410-b5e6-96231b3b80d8
This shouldn't affect anything in-tree, as the OperandType users are
mostly smart disassemblers and such; more information is helpful there.
However, on the flip side, that + the fact that this is just hinting at
the meaning of operands makes this not really test-worthy or testable.
Differential Revision: http://reviews.llvm.org/D8620
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234350 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of lowering SELECT to SELECT_CC which is further lowered later
immediately call the SELECT_CC lowering code. This is preferable
because:
- Avoids an unnecessary roundtrip through the legalization queues with
an intermediate node.
- More importantly: Lowered operations get visited last leading to SELECT_CC
getting visited with legalized operands and unlegalized ones for preexisting
SELECT_CC nodes. This does not hurt the current code (hence no testcase) but
is required for another patch I am working on.
Differential Revision: http://reviews.llvm.org/D8187
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234334 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is not possible when using the IAS for MIPS, but it is possible when using the IAS for other architectures and when using GAS for MIPS.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8578
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234316 91177308-0d34-0410-b5e6-96231b3b80d8
This also moves it earlier so that it they are produced before we print
an end symbol for the data section.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234315 91177308-0d34-0410-b5e6-96231b3b80d8
After recognising that a certain narrow instruction might need a relocation to
be represented, we used to unconditionally relax it to a Thumb2 instruction to
permit this. Unfortunately, some CPUs (e.g. v6m) don't even have most Thumb2
instructions, so we end up emitting a completely invalid instruction.
Theoretically, ELF does have relocations for these situations; but they are
fairly unusable with such short ranges and the ABI document even says they're
documented "for completeness". So an error is probably better there too.
rdar://20391953
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234195 91177308-0d34-0410-b5e6-96231b3b80d8
This patch allows SSE4.1 targets to use (V)PINSRB to create 16i8 vectors by inserting i8 scalars directly into a XMM register instead of merging pairs of i8 scalars into a i16 and using the SSE2 PINSRW instruction.
This allows folding of byte loads and reduces scalar register usage as well.
Differential Revision: http://reviews.llvm.org/D8839
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234193 91177308-0d34-0410-b5e6-96231b3b80d8
As pr19627 points out, every use of AliasedSymbol is likely a bug.
The main use was to avoid the oddity of a variable showing up as undefined. That
was fixed in r233995, which made these calls nops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234169 91177308-0d34-0410-b5e6-96231b3b80d8
This allows the compiler/assembly programmer to switch back to a
section. This in turn fixes the bootstrap failure on powerpc (tested
on gcc110) without changing the ppc codegen at all.
I will try to cleanup the various getELFSection overloads in a followup patch.
Just using a default argument now would lead to ambiguities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234099 91177308-0d34-0410-b5e6-96231b3b80d8
Previously the patterns didn't have high enough priority and we would only use the GR32 form if the only the upper 32 or 56 bits were zero.
Fixes PR23100.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234075 91177308-0d34-0410-b5e6-96231b3b80d8
We don't need to represent UnwindHelp in IR. Instead, we can use the
knowledge that we are emitting the parent function to decide if we
should create the UnwindHelp stack object.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234061 91177308-0d34-0410-b5e6-96231b3b80d8
The plan here is to push the API changes out from the common components
(like Constant::getGetElementPtr and IRBuilder::CreateGEP related
functions) and just update callers to either pass the type if it's
obvious, or pass null.
Do this with LoadInst as well and anything else that comes up, then to
start porting specific uses to not pass null anymore - this may require
some refactoring in each case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234042 91177308-0d34-0410-b5e6-96231b3b80d8
As a follow-up to r234021, assert that a debug info intrinsic variable's
`MDLocalVariable::getInlinedAt()` always matches the
`MDLocation::getInlinedAt()` of its `!dbg` attachment.
The goal here is to get rid of `MDLocalVariable::getInlinedAt()`
entirely (PR22778), but I'll let these assertions bake for a while
first.
If you have an out-of-tree backend that just broke, you're probably
attaching the wrong `DebugLoc` to a `DBG_VALUE` instruction. The one
you want is the location that was attached to the corresponding
`@llvm.dbg.declare` or `@llvm.dbg.value` call that you started with.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234038 91177308-0d34-0410-b5e6-96231b3b80d8
When enabling PPC64LE, I disabled some optimizations of BUILD_VECTOR
nodes for little endian because wrong results were produced. I've
subsequently investigated and found this is due to a call to
BuildVectorSDNode::isConstantSplat that was always specifying
big-endian. With this changed to correctly identify the target
endianness, the optimizations work as expected.
I found another case of a call to the same method with big-endian
hardcoded, in PPC::isAllNegativeZeroVector(). I discovered this was
an orphaned method with no callers, so I've just removed it.
The existing test/CodeGen/PowerPC/vec_constants.ll checks these
optimizations, so for testing I've just added a variant for little
endian.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234011 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR19582.
Previously, when an asm assignment (.set or =) was created, we would look up
the section immediately in MCSymbol::setVariableValue. This caused symbols
to receive the wrong section if the RHS of the assignment had not been seen
yet. This had a knock-on effect in the object file emitters, causing them
to emit extra symbols, or to give symbols the wrong visibility or the wrong
section. For example, in the following asm:
.data
.Llocal:
.text
leaq .Llocal1(%rip), %rdi
.Llocal1 = .Llocal2
.Llocal2 = .Llocal
the first assignment would give .Llocal1 a null section, which would never get
fixed up by the second assignment. This would cause the ELF object file emitter
to consider .Llocal1 to be an undefined symbol and give it external linkage,
even though .Llocal1 should not have been emitted at all in the object file.
Or in the following asm:
alias_to_local = Ltmp0
Ltmp0:
the Mach-O object file emitter would give the alias_to_local symbol a n_type
of N_SECT and a n_sect of 0. This is invalid under the Mach-O specification,
which requires N_SECT symbols to receive a non-zero section number if the
symbol is defined in a section in the object file.
https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/#//apple_ref/c/tag/nlist
After this change we do not look up the section when the assignment is created,
but instead look it up on demand and store it in Section, which is treated
as a cache if the symbol is a variable symbol.
This change also fixes a bug in MCExpr::FindAssociatedSection. Previously,
if we saw a subtraction, we would return the first referenced section, even in
cases where we should have been returning the absolute pseudo-section. Now we
always return the absolute pseudo-section for expressions that subtract two
section-derived expressions. This isn't always correct (e.g. if one of the
sections ends up being laid out at an absolute address), but it's probably
the best we can do without more context.
This allows us to remove code in two places where we appear to have been
working around this bug, in MachObjectWriter::markAbsoluteVariableSymbols
and in X86AsmPrinter::EmitStartOfAsmFile.
Re-applies r233595 (aka D8586), which was reverted in r233898.
Differential Revision: http://reviews.llvm.org/D8798
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233995 91177308-0d34-0410-b5e6-96231b3b80d8
Register coalescing can change the target of a RegPair hint to a
physreg, we should not crash on this. This also slightly improved the
way ARMBaseRegisterInfo::updateRegAllocHint() works.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233987 91177308-0d34-0410-b5e6-96231b3b80d8
Without this patch, we split the 256-bit vector into halves and produced something like:
movzwl (%rdi), %eax
vmovd %eax, %xmm0
vxorps %xmm1, %xmm1, %xmm1
vblendps $15, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]
Now, we eliminate the xor and blend because those zeros are free with the vmovd:
movzwl (%rdi), %eax
vmovd %eax, %xmm0
This should be the final fix needed to resolve PR22685:
https://llvm.org/bugs/show_bug.cgi?id=22685
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233941 91177308-0d34-0410-b5e6-96231b3b80d8
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the cahnce to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233938 91177308-0d34-0410-b5e6-96231b3b80d8
For code like this:
define <8 x i32> @load_v8i32() {
ret <8 x i32> <i32 7, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
}
We produce this AVX code:
_load_v8i32: ## @load_v8i32
movl $7, %eax
vmovd %eax, %xmm0
vxorps %ymm1, %ymm1, %ymm1
vblendps $1, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
retq
There are at least 2 bugs in play here:
We're generating a blend when a move scalar does the same job using 2 less instruction bytes (see FIXMEs).
We're not matching an existing pattern that would eliminate the xor and blend entirely. The zero bytes are free with vmovd.
The 2nd fix involves an adjustment of "AddedComplexity" [1] and mostly masks the 1st problem.
[1] AddedComplexity has close to no documentation in the source.
The best we have is this comment: "roughly corresponds to the number of nodes that are covered".
It appears that x86 has bastardized this definition by inflating its values for some other
undocumented reason. For example, we have a pattern with "AddedComplexity = 400" (!).
I searched my way to this page:
https://groups.google.com/forum/#!topic/llvm-dev/5UX-Og9M0xQ
Differential Revision: http://reviews.llvm.org/D8794
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233931 91177308-0d34-0410-b5e6-96231b3b80d8