Remove some accidental leftovers from the previous commit. I was
originally going to use MVC for memmove too, but that's less of
a clear win.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185804 91177308-0d34-0410-b5e6-96231b3b80d8
Use MVC for memcpy in cases where a single MVC is enough. Using MVC is
a win for longer copies too, but I'll leave that for later.
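As a sketch (register choices purely illustrative), a memcpy of 8 bytes
with the destination address in %r2 and the source address in %r3 can
now become a single instruction:
    mvc 0(8,%r2), 0(%r3)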
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185802 91177308-0d34-0410-b5e6-96231b3b80d8
The stack coloring pass has code to delete stores and loads that become
trivially dead after coloring. Extend it to cope with single instructions
that copy from one frame index to another.
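For instance (offsets illustrative), if coloring assigns two frame
indexes to the same slot, a frame-to-frame copy like:
    mvc 160(8,%r15), 168(%r15)
can become a copy from a slot to itself:
    mvc 160(8,%r15), 160(%r15)
which is trivially dead and can now be deleted.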
The testcase happens to show this kicking in at the moment, though it
did occur in real code too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185705 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes foldMemoryOperandImpl() so that it doesn't create duplicated
frame MMOs. I hadn't realized when writing r185434 that it was the caller's
responsibility to add these.
No behavioural change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185704 91177308-0d34-0410-b5e6-96231b3b80d8
...now that the problem that prompted the restriction has been fixed.
The original spill-02.py was a compromise because at the time I couldn't
find an example that actually failed without the two scavenging slots.
The version included here did.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185701 91177308-0d34-0410-b5e6-96231b3b80d8
This is another prerequisite for frame-to-frame MVC copies.
I'll commit the patch that makes use of the slot separately.
The downside of trying to test many corner cases with each of the
available addressing modes is that a fair few tests need to account
for the new frame layout. I do still think it's useful to have all
these tests though, since it's something that wouldn't get much coverage
otherwise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185698 91177308-0d34-0410-b5e6-96231b3b80d8
SystemZ wants normal register scavenging slots, as close to the stack or
frame pointer as possible. The only reason it was using custom code was
because PrologEpilogInserter assumed an x86-like layout, where the frame
pointer is at the opposite end of the frame from the stack pointer.
This meant that when frame pointer elimination was disabled,
the slots ended up being as close as possible to the incoming
stack pointer, which is the opposite of what we want on SystemZ.
This patch adds a new knob to say which layout is used and converts
SystemZ to use target-independent scavenging slots. It's one of the pieces
needed to support frame-to-frame MVCs, where two slots might be required.
The ABI requires us to allocate 160 bytes for calls, so one approach
would be to use that area as temporary spill space instead. It would need
some surgery to make sure that the slot isn't live across a call though.
I stuck to the "isFPCloseToIncomingSP - ..." style comment on the
"do what the surrounding code does" principle. The FP case is already
covered by several SystemZ/frame-* tests, which fail without the
PrologEpilogInserter change, so no new ones are needed.
No behavioural change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185696 91177308-0d34-0410-b5e6-96231b3b80d8
Add a mapping from register-based <INSN>R instructions to the corresponding
memory-based <INSN>. Use it to cut down on the number of spill loads.
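As a sketch (operands illustrative), instead of reloading a spilled
value and using the register form:
    l  %r0, 160(%r15)
    ar %r2, %r0
we can fold the reload into the memory form:
    a  %r2, 160(%r15)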
Some instructions extend their operands from smaller fields, so this
required a new TSFlags field to say how big the unextended operand is.
This optimisation doesn't trigger for C(G)R and CL(G)R because in practice
we always combine those instructions with a branch. Adding a test for every
other case might seem excessive, but it did catch a missed optimisation
for DSGF (fixed in r185435).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185529 91177308-0d34-0410-b5e6-96231b3b80d8
Rename Function->DispKey and PairType->DispSize. I'd originally used
"Function" because I thought it might be useful for other InstMappings.
However, it turns out that having two very similar instructions with the
same Function makes it pretty useless for anything other than the displacement
size key. Other InstMappings will want the key to be defined for only one
instruction in the pair.
No behavioural change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185526 91177308-0d34-0410-b5e6-96231b3b80d8
Get rid of some old code (and associated FIXME) for handling the
caller-allocated register save area. No behavioural change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185525 91177308-0d34-0410-b5e6-96231b3b80d8
This has been dead code since PIC16 was removed in 2010. The result was an odd mix,
where some parts would carefully pass it along and others would assert it was
zero (most of the object streamer for example).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185436 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes some cases where we were using full 64-bit division for (sdiv i32, i32)
and (sdiv i64, i32).
The "32" in "SDIVREM32" just refers to the second operand. The first operand
of all *DIVREM*s is a GR128.
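As an illustrative sketch (register choices made up), for (sdiv i64, i32)
we can now divide directly by the 32-bit operand:
    dsgfr %r2, %r4
instead of extending it first and using the full 64-bit form:
    lgfr %r0, %r4
    dsgr %r2, %r0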
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185435 91177308-0d34-0410-b5e6-96231b3b80d8
Try to use MVC when spilling the destination of a simple load or the source
of a simple store. As explained in the comment, this doesn't yet handle
the case where the load or store location is also a frame index, since
that could lead to two simultaneous scavenger spills, something the
backend can't handle yet. spill-02.py tests that this restriction kicks in,
but unfortunately I've not yet found a case that would fail without it.
The volatile trick I used for other scavenger tests doesn't work here
because we can't use MVC for volatile accesses anyway.
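For example (registers and offsets illustrative), spilling the
destination of a simple 32-bit load:
    l  %r0, 0(%r2)
    st %r0, 164(%r15)
can instead be done without touching a register:
    mvc 164(4,%r15), 0(%r2)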
I'm planning on relaxing the restriction later, hopefully with a test
that does trigger the problem...
Tests @f8 and @f9 also showed that L(G)RL and ST(G)RL were wrongly
classified as SimpleBDX{Load,Store}. It wouldn't be easy to test for
that bug separately, which is why I didn't split out the fix as a
separate patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185434 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first use of D(L,B) addressing, which required a fair bit
of surgery. For that reason, the patch just adds the instruction
definition and the associated assembler and disassembler support.
A later patch will actually make use of it for codegen.
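For reference (values illustrative), a D(L,B) operand encodes a
displacement, a length and a base register, so an operand like:
    0(8,%r2)
refers to the 8 bytes starting at the address in %r2.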
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185433 91177308-0d34-0410-b5e6-96231b3b80d8
Add pseudo conditional store instructions, so that we use:
    branch foo
    store
  foo:
instead of:
    load
    branch foo
    move
  foo:
    store
z196 has real 32-bit and 64-bit conditional stores, but we don't use
any z196 instructions yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185065 91177308-0d34-0410-b5e6-96231b3b80d8
Someone may want to do something crazy, like replace these objects if
they change.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184175 91177308-0d34-0410-b5e6-96231b3b80d8
NOTE: If this broke your out-of-tree backend, in *RegisterInfo.td, change
the instances of SubRegIndex that have a comps template arg to use the
ComposedSubRegIndex class instead.
In TableGen land, this adds Size and Offset attributes to SubRegIndex,
and the ComposedSubRegIndex class, for which the Size and Offset are
computed by TableGen. This also adds an accessor in MCRegisterInfo, and
Size/Offsets for the X86 and ARM subreg indices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183020 91177308-0d34-0410-b5e6-96231b3b80d8
Unlike most (hopefully "all other", but I'm still checking) memory
instructions we support, LOAD REVERSED and STORE REVERSED may access
the memory location several times. This means that they are not suitable
for volatile loads and stores.
This patch is a prerequisite for better atomic load and store support.
The same principle applies there: almost all memory instructions we
support are inherently atomic ("block concurrent"), but LOAD REVERSED
and STORE REVERSED are exceptions.
Other instructions continue to allow volatile operands. I will add
positive "allows volatile" tests at the same time as the "allows atomic
load or store" tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183002 91177308-0d34-0410-b5e6-96231b3b80d8
The code to distinguish between unaligned and aligned addresses was
already there, so this is mostly just a switch-on-and-test process.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182920 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR16146: gdb.base__call-ar-st.exp fails after
pre-RA-sched=source fixes.
Patch by Xiaoyi Guo!
This also fixes an unsupported dbg.value test case. Codegen was
previously incorrect but the test was passing by luck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182885 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for the CRJ and CGRJ instructions. Support for
the immediate forms will be a separate patch.
The architecture has a large number of comparison instructions. I think
it's generally better to concentrate on using the "best" comparison
instruction first and foremost, then only use something like CRJ if
CR really was the natural choice of comparison instruction. The patch
therefore opportunistically converts separate CR and BRC instructions
into a single CRJ while emitting instructions in ISelLowering.
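For example (operands illustrative), a separate compare and branch
such as:
    cr %r2, %r3
    je .Llabel
can be fused into a single compare-and-branch:
    crje %r2, %r3, .Llabel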
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182764 91177308-0d34-0410-b5e6-96231b3b80d8
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182703 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, an invalid instruction like:
    foo %r1, %r0
would generate the rather odd error message:
    ....: error: unknown token in expression
    foo %r1, %r0
    ^
We now get the more informative:
    ....: error: invalid instruction
    foo %r1, %r0
    ^
The same would happen if an address were used where a register was expected.
We now get "invalid operand for instruction" instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182644 91177308-0d34-0410-b5e6-96231b3b80d8
The idea is to make sure that:
(1) "register expected" is restricted to cases where ParseRegister()
is called and the token obviously isn't a register.
(2) "invalid register" is restricted to cases where a register-like "%..."
sequence is found, but the "..." makes no sense.
(3) the generic "invalid operand for instruction" is used in cases where
the wrong register type is used (GPR instead of FPR, etc.).
(4) the new "invalid register pair" is used if the register has the right type,
but is not a valid register pair.
Testing of (1)-(3) is now restricted to regs-bad.s. It uses a representative
instruction for each register class to make sure that only registers from
that class are accepted.
(4) is tested by both regs-bad.s (which checks all invalid register pairs)
and insn-bad.s (which tests one invalid pair for each instruction that
requires a pair).
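As an illustrative example of (4), using an instruction that requires
an even/odd register pair with an odd first register:
    dlr %r1, %r2
now gives:
    ....: error: invalid register pair
    dlr %r1, %r2
        ^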
While there, I changed "Number" to "Num" for consistency with the
operand class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182643 91177308-0d34-0410-b5e6-96231b3b80d8
There was exactly one caller using this API correctly; the others were
relying on specific behavior of the default implementation. Since it's
too hard to use correctly, just remove it and standardize on the default
behavior.
Defines away PR16132.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182636 91177308-0d34-0410-b5e6-96231b3b80d8
Addresses a review comment from Ulrich Weigand. No functional change intended.
I'm not sure whether the old TODO that this patch touches still holds,
but that's something we'd get to when adding a targeted scheduling
description.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182474 91177308-0d34-0410-b5e6-96231b3b80d8
The original version of the pass could underestimate the length of a backward
branch in cases like:
    alignment to N bytes or more
    ...
    relaxable branch A
    ...
  foo: (aligned to M < N bytes)
    ...
  bar: (aligned to N bytes)
    ...
    relaxable branch B to foo
We don't add any misalignment gap for "bar" because N bytes of alignment
had already been reached earlier in the function. In this case, assuming
that A is relaxed can push "foo" closer to "bar", and make B appear to be
in range. Similar problems can occur for forward branches.
I don't think it's possible to create blocks with mixed alignments as
things stand, not least because we haven't yet defined getPrefLoopAlignment()
for SystemZ (that would need benchmarking). So I don't think we can test
this yet.
Thanks to Rafael Espíndola for spotting the bug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182460 91177308-0d34-0410-b5e6-96231b3b80d8
Before this change, the SystemZ backend would use BRCL for all branches
and only consider shortening them to BRC when generating an object file.
E.g. a branch on equal would use the JGE alias of BRCL in assembly output,
but might be shortened to the JE alias of BRC in ELF output (see the
sketch after the list below). This was a useful first step, but it had
two problems:
(1) The z assembler isn't traditionally supposed to perform branch shortening
or branch relaxation. We followed this rule by not relaxing branches
in assembler input, but that meant that generating assembly code and
then assembling it would not produce the same result as going directly
to object code; the former would give long branches everywhere, whereas
the latter would use short branches where possible.
(2) Other useful branches, like COMPARE AND BRANCH, do not have long forms.
We would need to do something else before supporting them.
(Although COMPARE AND BRANCH does not change the condition codes,
the plan is to model COMPARE AND BRANCH as a CC-clobbering instruction
during codegen, so that we can safely lower it to a separate compare
and long branch where necessary. This is not a valid transformation
for the assembler proper to make.)
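As a sketch of the problem in (1) (label illustrative), a branch on
equal would be printed as:
    jge .Llabel
in assembly output but could become:
    je  .Llabel
in the object file when the target was in range, so the two routes
could produce different code.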
This patch therefore moves branch relaxation to a pre-emit pass.
For now, calls are still shortened from BRASL to BRAS by the assembler,
although this too is not really the traditional behaviour.
The first test takes about 1.5s to run, and there are likely to be
more tests in this vein once further branch types are added. The feeling
on IRC was that 1.5s is a bit much for a single test, so I've restricted
it to SystemZ hosts for now.
The patch exposes (and fixes) some typos in the main CodeGen/SystemZ tests.
A later patch will remove the {{g}}s from that directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182274 91177308-0d34-0410-b5e6-96231b3b80d8
The GNU assembler treats things like:
    brasl %r14, 100
in the same way as:
    brasl %r14, .+100
rather than as a branch to absolute address 100. We implemented this in
LLVM by creating an immediate operand rather than the usual expr operand,
and by handling immediate operands specially in the code emitter.
This was undesirable for (at least) three reasons:
- the specialness of immediate operands was exposed to the backend MC code,
rather than being limited to the assembler parser.
- in disassembly, an immediate operand really is an absolute address.
(Note that this means reassembling printed disassembly can't recreate
the original code.)
- it would interfere with any assembly manipulation that we might
try in future. E.g. operations like branch shortening can change
the relative position of instructions, but any code that updates
sym+offset addresses wouldn't update an immediate "100" operand
in the same way as an explicit ".+100" operand.
This patch changes the implementation so that the assembler creates
a "." label for immediate PC-relative operands, so that the operand
to the MCInst is always the absolute address. The patch also adds
some error checking of the offset.
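As an illustrative sketch (label name made up), the parser now handles:
    brasl %r14, 100
by defining a temporary label at the instruction and using it in the
operand:
    .Ltmp0:
    brasl %r14, .Ltmp0+100
so that the operand seen by the rest of MC is an ordinary expression
for the absolute address.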
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181773 91177308-0d34-0410-b5e6-96231b3b80d8