We would wrongly transform the testcase into the equivalent of an AND with 1.
The problem was that, when testing whether the shifted-in bits of the right
shift were significant, we used the width of the final zero-extended result
rather than the width of the shifted value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195731 91177308-0d34-0410-b5e6-96231b3b80d8
I've no idea why I decided to handle TMxx differently from all the other
high/low logic operations, but it was a stupid thing to do. The high
registers aren't available as separate 32-bit registers on z10,
so subreg_h32 can't be used on a GR64 there.
I've normally been testing with z196 and with -O3 and so hadn't noticed
this until now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195473 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file. The memory leaks in this version have been fixed. Thanks
Alexey for pointing them out.
Differential Revision: http://llvm-reviews.chandlerc.com/D2068
Reviewed by Andy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195064 91177308-0d34-0410-b5e6-96231b3b80d8
This change is incorrect. If you delete virtual destructor of both a base class
and a subclass, then the following code:
Base *foo = new Child();
delete foo;
will not cause the destructor for members of Child class. As a result, I observe
plently of memory leaks. Notable examples I investigated are:
ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194997 91177308-0d34-0410-b5e6-96231b3b80d8
As on other hosts, the CPU identification instruction is priveleged,
so we need to look through /proc/cpuinfo. I copied the PowerPC way of
handling "generic".
Several tests were implicitly assuming z10 and so failed on z196.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193742 91177308-0d34-0410-b5e6-96231b3b80d8
useAA significantly improves the handling of vector code that has TBAA
information attached. It also helps other cases, as shown by the testsuite
changes here. The only real downside I've seen is that it interferes with
MergeConsecutiveStores. The problem is that that optimization works top
down, starting at the first store in the chain, and looks for cases where
the chain result is only used by a single related store. These related
stores don't alias, so useAA will have rewritten all the later stores to
use a different chain input (typically the same one as the first store).
I think the advantages outweigh the disadvantages though, so for now I've
just disabled alias analysis for the unaligned-01.ll test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193521 91177308-0d34-0410-b5e6-96231b3b80d8
The input to an RxSBG operation can be narrower as long as the upper bits
are don't care. This fixes a FIXME added in r192783.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192790 91177308-0d34-0410-b5e6-96231b3b80d8
We previously used the default expansion to SELECT_CC, which in turn would
expand to "LHI; BRC; LHI". In most cases it's better to use an IPM-based
sequence instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192784 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes an old FIXME by creating a MCTargetStreamer interface
and moving the target specific functions for ARM, Mips and PPC to it.
The ARM streamer is still declared in a common place because it is
used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are
completely hidden in the corresponding Target directories.
I will send an email to llvmdev with instructions on how to use this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192181 91177308-0d34-0410-b5e6-96231b3b80d8
There are no corresponding patterns for small immediates because they would
prevent the use of fused compare-and-branch instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191775 91177308-0d34-0410-b5e6-96231b3b80d8
Floats are stored in the high 32 bits of an FPR, and the only GPR<->FPR
transfers are full-register transfers. This patch optimizes GPR<->FPR
float transfers when the high word of a GPR is directly accessible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191764 91177308-0d34-0410-b5e6-96231b3b80d8
Similar to low words, we can use the shorter LLIHL and LLIHH if it turns
out that the other half of the GR64 isn't live.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191750 91177308-0d34-0410-b5e6-96231b3b80d8
This just adds the basics necessary for allocating the upper words to
virtual registers (move, load and store). The move support is parameterised
in a way that makes it easy to handle zero extensions, but the associated
zero-extend patterns are added by a later patch.
The easiest way of testing this seemed to be add a new "h" register
constraint for high words. I don't expect the constraint to be useful
in real inline asms, but it should work, so I didn't try to hide it
behind an option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191739 91177308-0d34-0410-b5e6-96231b3b80d8
Originally committed as r191661, but reverted because it changed the matching
order of comparisons on some hosts. That should have been fixed by r191735.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191738 91177308-0d34-0410-b5e6-96231b3b80d8
For some reason, adding definitions for these load and store
instructions changed whether some of the build bots matched
comparisons as signed or unsigned.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191663 91177308-0d34-0410-b5e6-96231b3b80d8
The only thing this does on its own is make the definitions of RISB[HL]G
a bit more precise. Those instructions are only used by the MC layer at
the moment, so no behavioral change is intended. The class is needed by
later patches though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191660 91177308-0d34-0410-b5e6-96231b3b80d8
Use subreg_hNN and subreg_lNN for the high and low NN bits of a register.
List the low registers first, so that subreg_l32 also means the low 32
bits of a 128-bit register.
Floats are stored in the upper 32 bits of a 64-bit register, so they
should use subreg_h32 rather than subreg_l32.
No behavioral change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191659 91177308-0d34-0410-b5e6-96231b3b80d8
I'm about to add support for high-word operations, so it seemed better
for the low-word registers to have names like R0L rather than R0W.
No behavioral change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191655 91177308-0d34-0410-b5e6-96231b3b80d8
The backend tries to use block operations like MVC, NC, OC and XC for
simple scalar operations. For correctness reasons, it rejects any case
in which the regions might partially overlap. However, for performance
reasons, it should also reject cases where the regions might be equal,
since the instruction might then not use the fast path.
This fixes a performance regression seen in bzip2. We may want to limit
the optimisation even more in future, or even remove it entirely, but I'll
try with this for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191525 91177308-0d34-0410-b5e6-96231b3b80d8
The backend previously folded offsets into PC-relative addresses
whereever possible. That's the right thing to do when the address
can be used directly in a PC-relative memory reference (using things
like LRL). But if we have a register-based memory reference and need
to load the PC-relative address separately, it's better to use an anchor
point that could be shared with other accesses to the same area of the
variable.
Fixes a FIXME.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191524 91177308-0d34-0410-b5e6-96231b3b80d8
Another patch to avoid duplication of encoding information. Things like
NILF, NILL and NILH are used as both 32-bit and 64-bit instructions.
Here the 64-bit versions are defined as aliases of the 32-bit ones.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191369 91177308-0d34-0410-b5e6-96231b3b80d8
Similar to r191364, but for calls. This patch also removes the shortening
of BRASL to BRAS within a TU. Doing that was a bit controversial internally,
since there's a strong expectation with the z assembler that WYWIWYG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191366 91177308-0d34-0410-b5e6-96231b3b80d8
Another patch to reduce the duplication of encoding information.
Rather than define separate patterns for truncating 64-bit stores,
use the 32-bit stores with a subreg. No behavioral changed intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191365 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first of a few patches to reduce the dupliation of encoding
information. The return instruction is a normal BR in which one of the
registers is fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191364 91177308-0d34-0410-b5e6-96231b3b80d8
When loading immediates into a GR32, the port prefered LHI, followed by
LLILH or LLILL, followed by IILF. LHI and IILF are natural 32-bit
operations, but LLILH and LLILL also clear the upper 32 bits of the register.
This was represented as taking a 32-bit subreg of a 64-bit assignment.
Using subregs for something as simple as a move immediate was probably
a bad idea. Also, I have patches to add support for the high-word facility,
and we don't want something like LLILH and LLILL to stop the high word of
the same GPR from being used.
This patch therefore uses LHI and IILF to begin with and adds a late
machine-specific pass to use LLILH and LLILL if the other half of the
register is not live. The high-word patches extend this behavior to
IIHF, LLIHL and LLIHH.
No behavioral change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191363 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.
Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.
This should fix PR15840.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191165 91177308-0d34-0410-b5e6-96231b3b80d8
For some reason I never got around to adding these at the same time as
the signed versions. No idea why.
I'm not sure whether this SystemZII::BranchC* stuff is useful, or whether
it should just be replaced with an "is normal" flag. I'll leave that
for later though.
There are some boundary conditions that can be tweaked, such as preferring
unsigned comparisons for equality with [128, 256), and "<= 255" over "< 256",
but again I'll leave those for a separate patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190930 91177308-0d34-0410-b5e6-96231b3b80d8
The port originally had special patterns for extload, mapping them to the
same instructions as sextload. It seemed neater to have patterns that
match "an extension that is allowed to be signed" and "an extension that
is allowed to be unsigned".
This was originally meant to be a clean-up, but it does improve the handling
of promoted integers a little, as shown by args-06.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190777 91177308-0d34-0410-b5e6-96231b3b80d8
The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.
The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
ComplexDeprecationPredicate<"MCR">
would mean you would have to define the following function:
bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
std::string &Info)
Which returns 'false' for not deprecated, and 'true' for deprecated
and store the warning message in 'Info'.
The MCTargetAsmParser constructor was chaned to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190598 91177308-0d34-0410-b5e6-96231b3b80d8
The main complication here is that TM and TMY (the memory forms) set
CC differently from the register forms. When the tested bits contain
some 0s and some 1s, the register forms set CC to 1 or 2 based on the
value the uppermost bit. The memory forms instead set CC to 1
regardless of the uppermost bit.
Until now, I've tried to make it so that a branch never tests for an
impossible CC value. E.g. NR only sets CC to 0 or 1, so branches on the
result will only test for 0 or 1. Originally I'd tried to do the same
thing for TM and TMY by using custom matching code in ISelDAGToDAG.
That ended up being very ugly though, and would have meant duplicating
some of the chain checks that the common isel code does.
I've therefore gone for the simpler alternative of adding an extra
operand to the TM DAG opcode to say whether a memory form would be OK.
This means that the inverse of a "TM;JE" is "TM;JNE" rather than the
more precise "TM;JNLE", just like the inverse of "TMLL;JE" is "TMLL;JNE".
I suppose that's arguably less confusing though...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190400 91177308-0d34-0410-b5e6-96231b3b80d8
We used to generate the compact unwind encoding from the machine
instructions. However, this had the problem that if the user used `-save-temps'
or compiled their hand-written `.s' file (with CFI directives), we wouldn't
generate the compact unwind encoding.
Move the algorithm that generates the compact unwind encoding into the
MCAsmBackend. This way we can generate the encoding whether the code is from a
`.ll' or `.s' file.
<rdar://problem/13623355>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190290 91177308-0d34-0410-b5e6-96231b3b80d8
The architecture has many comparison instructions, including some that
extend one of the operands. The signed comparison instructions use sign
extensions and the unsigned comparison instructions use zero extensions.
In cases where we had a free choice between signed or unsigned comparisons,
we were trying to decide at lowering time which would best fit the available
instructions, taking things like extension type into account. The code
to do that was getting increasingly hairy and was also making some bad
decisions. E.g. when comparing the result of two LLCs, it is better to use
CR rather than CLR, since CR can be fused with a branch while CLR can't.
This patch removes the lowering code and instead adds an operand to
integer comparisons to say whether signed comparison is required,
whether unsigned comparison is required, or whether either is OK.
We can then leave the choice of instruction up to the normal isel code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190138 91177308-0d34-0410-b5e6-96231b3b80d8
For now this just handles simple comparisons of an ANDed value with zero.
The CC value provides enough information to do any comparison for a
2-bit mask, and some nonzero comparisons with more populated masks,
but that's all future work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189819 91177308-0d34-0410-b5e6-96231b3b80d8
For now just handles simple comparisons of an ANDed value with zero.
The CC value provides enough information to do any comparison for a
2-bit mask, and some nonzero comparisons with more populated masks,
but that's all future work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189469 91177308-0d34-0410-b5e6-96231b3b80d8
Lengths up to a certain threshold (currently 6 * 256) use a series of MVCs.
Lengths above that threshold use a loop to handle X*256 bytes followed
by a single MVC to handle the excess (if any). This loop will also be
needed in future when support for variable lengths is added.
Because the same tablegen classes are used to define MVC and CLC,
the patch also has the side-effect of defining a pseudo loop instruction
for CLC. That instruction isn't used yet (and wouldn't be handled correctly
if it were). I'm planning to use it soon though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189331 91177308-0d34-0410-b5e6-96231b3b80d8
If we had a store of an integer to memory, and the integer and store size
were suitable for a form of MV..., we used MV... no matter what. We could
then have sequences like:
lay %r2, 0(%r3,%r4)
mvi 0(%r2), 4
In these cases it seems better to force the constant into a register
and use a normal store:
lhi %r2, 4
stc %r2, 0(%r3, %r4)
since %r2 is more likely to be hoisted and is easier to rematerialize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189098 91177308-0d34-0410-b5e6-96231b3b80d8
...so that it can be used for z too. Most of the code is the same.
The only real change is to use TargetTransformInfo to test when a sqrt
instruction is available.
The pass is opt-in because at the moment it only handles sqrt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189097 91177308-0d34-0410-b5e6-96231b3b80d8
The initial port used MLG(R) for i64 UMUL_LOHI but left the other three
combinations as not-legal-or-custom. Although 32x32->{32,32}
multiplications exist, they're not as quick as doing a normal 64-bit
multiplication, so it didn't seem like i32 SMUL_LOHI and UMUL_LOHI
would be useful. There's also no direct instruction for i64 SMUL_LOHI,
so it needs to be implemented in terms of UMUL_LOHI.
However, not defining these patterns means that we don't convert
division by a constant into multiplication, so this patch fills
in the other cases. The new i64 SMUL_LOHI sequence is simpler
than the one that we used previously for 64x64->128 multiplication,
so int-mul-08.ll now tests the full sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188898 91177308-0d34-0410-b5e6-96231b3b80d8
These are extensions of the existing FI[EDX]BR instructions, but use a spare
bit to suppress inexact conditions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188894 91177308-0d34-0410-b5e6-96231b3b80d8
SystemZTargetLowering::emitStringWrapper() previously loaded the character
into R0 before the loop and made R0 live on entry. I'd forgotten that
allocatable registers weren't allowed to be live across blocks at this stage,
and it confused LiveVariables enough to cause a miscompilation of f3 in
memchr-02.ll.
This patch instead loads R0 in the loop and leaves LICM to hoist it
after RA. This is actually what I'd tried originally, but I went for
the manual optimisation after noticing that R0 often wasn't being hoisted.
This bug forced me to go back and look at why, now fixed as r188774.
We should also try to optimize null checks so that they test the CC result
of the SRST directly. The select between null and the SRST GPR result could
then usually be deleted as dead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188779 91177308-0d34-0410-b5e6-96231b3b80d8
For now this matches the equivalent of (neg (abs ...)), which did hit a few
times in projects/test-suite. We should probably also match cases where
absolute-like selects are used with reversed arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188671 91177308-0d34-0410-b5e6-96231b3b80d8
This first cut is pretty conservative. The final argument register (R6)
is call-saved, so we would need to make sure that the R6 argument to a
sibling call is the same as the R6 argument to the calling function,
which seems worth keeping as a separate patch.
Saying that integer truncations are free means that we no longer
use the extending instructions LGF and LLGF for spills in int-conv-09.ll
and int-conv-10.ll. Instead we treat the registers as 64 bits wide and
truncate them to 32-bits where necessary. I think it's unlikely we'd
use LGF and LLGF for spills in other situations for the same reason,
so I'm removing the tests rather than replacing them. The associated
code is generic and applies to many more instructions than just
LGF and LLGF, so there is no corresponding code removal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188669 91177308-0d34-0410-b5e6-96231b3b80d8
Generalize r188163 to cope with return types other than MVT::i32, just
as the existing visitMemCmpCall code did. I've split this out into a
subroutine so that it can be used for other upcoming patches.
I also noticed that I'd used the wrong API to record the out chain.
It's a load that uses DAG.getRoot() rather than getRoot(), so the out
chain should go on PendingLoads. I don't have a testcase for that because
we don't do any interesting scheduling on z yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188540 91177308-0d34-0410-b5e6-96231b3b80d8
r188163 used CLC to implement memcmp. Code that compares the result
directly against zero can test the CC value produced by CLC, but code
that needs an integer result must use IPM. The sequence I'd used was:
ipm <reg>
sll <reg>, 2
sra <reg>, 30
but I'd forgotten that this inverts the order, so that CC==1 ("less")
becomes an integer greater than zero, and CC==2 ("greater") becomes
an integer less than zero. This sequence should only be used if the
CLC arguments are reversed to compensate. The problem then is that
the branch condition must also be reversed when testing the CLC
result directly.
Rather than do that, I went for a different sequence that works with
the natural CLC order:
ipm <reg>
srl <reg>, 28
rll <reg>, <reg>, 31
One advantage of this is that it doesn't clobber CC. A disadvantage
is that any sign extension to 64 bits must be done separately,
rather than being folded into the shifts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188538 91177308-0d34-0410-b5e6-96231b3b80d8
For now this is restricted to fixed-length comparisons with a length
in the range [1, 256], as for memcpy() and MVC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188163 91177308-0d34-0410-b5e6-96231b3b80d8
This follows the same lines as the integer code. In the end it seemed
easier to have a second 4-bit mask in TSFlags to specify the compare-like
CC values. That eats one more TSFlags bit than adding a CCHasUnordered
would have done, but it feels more concise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187883 91177308-0d34-0410-b5e6-96231b3b80d8
Without explicit dependencies, both per-file action and in-CommonTableGen action could run in parallel.
It races to emit *.inc files simultaneously.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187780 91177308-0d34-0410-b5e6-96231b3b80d8
This patch just uses a peephole test for "add; compare; branch" sequences
within a single block. The IR optimizers already convert loops to
decrement-and-branch-on-nonzero form in some cases, so even this
simplistic test triggers many times during a clang bootstrap and
projects/test-suite run. It looks like there are still cases where we
need to more strongly prefer branches on nonzero though. E.g. I saw a
case where a loop that started out with a check for 0 ended up with a
check for -1. I'll try to look at that sometime.
I ended up adding the Reference class because MachineInstr::readsRegister()
doesn't check for subregisters (by design, as far as I could tell).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187723 91177308-0d34-0410-b5e6-96231b3b80d8
Perhaps predictably, doing comparison elimination on the fly during
SystemZLongBranch turned out to be a bad idea. The next patches make
use of LOAD AND TEST and BRANCH ON COUNT, both of which require
changes to earlier instructions.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187718 91177308-0d34-0410-b5e6-96231b3b80d8
This also fixes a bug in the predication of LR to LOCR: I'd forgotten
that with these in-place instruction builds, the implicit operands need
to be added manually. I think this was latent until now, but is tested
by int-cmp-45.c. It also adds a CC valid mask to STOC, again tested by
int-cmp-45.c.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187573 91177308-0d34-0410-b5e6-96231b3b80d8
Convert >= 1 to > 0, etc. Using comparison with zero isn't a win on its own,
but it exposes more opportunities for CC reuse (the next patch).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187571 91177308-0d34-0410-b5e6-96231b3b80d8
The loop optimizers were assuming that scales > 1 were OK. I think this
is actually a bug in TargetLoweringBase::isLegalAddressingMode(),
since it seems to be trying to reject anything that isn't r+i or r+r,
but it has no default case for scales other than 0, 1 or 2. Implementing
the hook for z means that z can no longer test any change there though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187497 91177308-0d34-0410-b5e6-96231b3b80d8