mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-01-13 09:33:50 +00:00
fe6bd52bf2
Summary: Correct the match patterns and the lowerings that made the CodeGen tests pass despite the mistakes. The original testcase that discovered the problem was SingleSource/UnitTests/SignlessType/factor.c in test-suite. During review, we also found that some of the existing CodeGen tests were incorrect and fixed them: * bitwise.ll: In bsel_v16i8 the IfSet/IfClear were reversed because bsel and bmnz have different operand orders and the test didn't correctly account for this. bmnz goes 'IfClear, IfSet, CondMask', while bsel goes 'CondMask, IfClear, IfSet'. * vec.ll: In the cases where a bsel is emitted as a bmnz (they are the same operation with a different input tied to the result) the operands were in the wrong order. * compare.ll and compare_float.ll: The bsel operand order was correct for a greater-than comparison, but a greater-than comparison instruction doesn't exist. Lowering this operation inverts the condition so the IfSet/IfClear need to be swapped to match. The differences between BSEL, BMNZ, and BMZ and how they map to/from vselect are rather confusing. I've therefore added a note to MSA.txt to explain this in a single place in addition to the comments that explain each case. Reviewers: matheusalmeida, jacksprat Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3028 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203657 91177308-0d34-0410-b5e6-96231b3b80d8
84 lines
3.6 KiB
Plaintext
Code Generation Notes for MSA
=============================

Intrinsics are lowered to SelectionDAG nodes where possible in order to enable
optimisation, reduce the size of the ISel matcher, and reduce repetition in
the implementation. In a small number of cases, this can cause different
(semantically equivalent) instructions to be used in place of the requested
instruction, even when no optimisation has taken place.

Instructions
============

This section describes any quirks of instruction selection for MSA. For
example, two instructions might be equally valid for some given IR and one is
chosen in preference to the other.

bclri.b:
It is not possible to emit bclri.b since andi.b covers exactly the
same cases. andi.b should use fractionally less power than bclri.b in
most hardware implementations so it is used in preference to bclri.b.

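The bclri.b/andi.b overlap can be checked with a small model. This is only a
sketch of one 8-bit lane (the real instructions apply the same operation to
all 16 byte lanes), and the helper name andi_immediate_for_bclri is
hypothetical, introduced here for illustration.

```python
def bclri_b(ws: int, m: int) -> int:
    """Clear bit m (0..7) of one 8-bit element."""
    return ws & ~(1 << m) & 0xFF


def andi_b(ws: int, i8: int) -> int:
    """AND one 8-bit element with an 8-bit immediate."""
    return ws & i8 & 0xFF


def andi_immediate_for_bclri(m: int) -> int:
    """Hypothetical helper: the andi.b immediate that clears exactly bit m."""
    return 0xFF ^ (1 << m)


# Exhaustively compare both forms over every byte value and bit position.
equivalent = all(
    bclri_b(ws, m) == andi_b(ws, andi_immediate_for_bclri(m))
    for ws in range(256)
    for m in range(8)
)
print(equivalent)
```

Because every bclri.b immediate has a matching andi.b immediate, a single
pattern is enough to cover both.
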
vshf.w:
It is not possible to emit vshf.w when the shuffle description is
constant since shf.w covers exactly the same cases. shf.w is used
instead. It is also impossible for the shuffle description to be
unknown at compile-time due to the definition of shufflevector in
LLVM IR.

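As an illustration of how a constant four-element shuffle description can be
folded into the shf.w immediate, the sketch below packs four 2-bit source
indices into an 8-bit value. The field layout (element 0 in the two least
significant bits) is an assumption based on a reading of the MSA
specification and should be checked against it; shf_imm and shf_w are
hypothetical helper names, not LLVM functions.

```python
def shf_imm(indices):
    """Pack four 2-bit source indices into an 8-bit immediate.

    Assumed layout: the index for result element 0 sits in bits 1:0,
    element 1 in bits 3:2, and so on.
    """
    assert len(indices) == 4 and all(0 <= i <= 3 for i in indices)
    imm = 0
    for pos, idx in enumerate(indices):
        imm |= idx << (2 * pos)
    return imm


def shf_w(ws, imm):
    """Model the shuffle on one group of four word elements."""
    return [ws[(imm >> (2 * pos)) & 3] for pos in range(4)]


ws = [10, 11, 12, 13]
reverse_imm = shf_imm([3, 2, 1, 0])
splat1_imm = shf_imm([1, 1, 1, 1])
print(hex(reverse_imm), shf_w(ws, reverse_imm))
print(hex(splat1_imm), shf_w(ws, splat1_imm))
```

Under this model a splat of one element is just another immediate, which is
consistent with the note on splati.w in this document.
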
vshf.[bhwd]:
When the shuffle description describes a splat operation, splat.[bhwd]
instructions will be selected instead of vshf.[bhwd]. Unlike the ilv*
and pck* instructions, this is matched from MipsISD::VSHF instead of
a special-case MipsISD node.

ilvl.d, pckev.d:
It is not possible to emit ilvl.d or pckev.d since ilvev.d covers the
same shuffle. ilvev.d will be emitted instead.

ilvr.d, ilvod.d, pckod.d:
It is not possible to emit ilvr.d or pckod.d since ilvod.d covers the
same shuffle. ilvod.d will be emitted instead.

splat.[bhwd]:
The intrinsic will work as expected. However, unlike other intrinsics,
it lowers directly to MipsISD::VSHF instead of using common IR.

splati.w:
It is not possible to emit splati.w since shf.w covers the same cases.
shf.w will be emitted instead.

copy_s.w:
On MIPS32, the copy_u.w intrinsic will emit this instruction instead of
copy_u.w. This is semantically equivalent since the general-purpose
register file is 32 bits wide.

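The equivalence on a 32-bit register file can be seen in a small model of
the two extension modes. This is a sketch only: copy_s and copy_u are
hypothetical helpers that return the bit pattern left in a general-purpose
register of the given width.

```python
def copy_s(element: int, elem_bits: int, gpr_bits: int) -> int:
    """Sign-extend an element to gpr_bits and return the register bits."""
    value = element & ((1 << elem_bits) - 1)
    sign = 1 << (elem_bits - 1)
    signed = (value ^ sign) - sign          # two's-complement interpretation
    return signed & ((1 << gpr_bits) - 1)


def copy_u(element: int, elem_bits: int, gpr_bits: int) -> int:
    """Zero-extend an element to gpr_bits and return the register bits."""
    return element & ((1 << elem_bits) - 1) & ((1 << gpr_bits) - 1)


samples = [0x00000000, 0x00000001, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]

# 32-bit element into a 32-bit register: no extension bits exist.
same_on_32 = all(copy_s(x, 32, 32) == copy_u(x, 32, 32) for x in samples)

# 32-bit element into a 64-bit register: the two modes can differ.
same_on_64 = all(copy_s(x, 32, 64) == copy_u(x, 32, 64) for x in samples)

print(same_on_32, same_on_64)
```

With a 64-bit register file the upper bits distinguish the two forms, so the
substitution described above only applies when the registers are 32 bits
wide.
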
binsri.[bhwd], binsli.[bhwd]:
These two operations are equivalent to each other with the operands
swapped and condition inverted. The compiler may use either one as
appropriate.
Furthermore, the compiler may use bsel.[bhwd] for some masks that do
not survive the legalization process (this is a bug and will be fixed).

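The relationship between the two bit-insert forms can be sketched on a
single 8-bit element. The model assumes, following the MSA specification as
understood here, that the immediate m means "copy m + 1 bits"; the function
names mirror the mnemonics but are illustrative helpers only.

```python
BITS = 8
MASK = (1 << BITS) - 1


def binsli(wd: int, ws: int, m: int) -> int:
    """Copy the most significant m+1 bits of ws into wd; keep the rest of wd."""
    left = (MASK << (BITS - (m + 1))) & MASK
    return (ws & left) | (wd & ~left & MASK)


def binsri(wd: int, ws: int, m: int) -> int:
    """Copy the least significant m+1 bits of ws into wd; keep the rest of wd."""
    right = (1 << (m + 1)) - 1
    return (ws & right) | (wd & ~right & MASK)


# Inserting the top m+1 bits of b into a gives the same value as inserting
# the remaining low bits of a into b (operands swapped, mask inverted).
equivalent = all(
    binsli(a, b, m) == binsri(b, a, BITS - m - 2)
    for a in range(256)
    for b in range(256)
    for m in range(BITS - 1)
)
print(equivalent)
```

The two forms produce the same value but tie a different input to the
result register, which is why either may be chosen.
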
bmnz.v, bmz.v, bsel.v:
These three operations differ only in the operand that is tied to the
result and the order of the operands.
It is (currently) not possible to emit bmz.v or bsel.v since bmnz.v is
the same operation and will be emitted instead.
In future, the compiler may choose between these three instructions
according to register allocation.
These three operations can be very confusing so here is a mapping
between the instructions and the vselect node in one place:
bmz.v  wd, ws, wt/i8 -> (vselect wt/i8, wd, ws)
bmnz.v wd, ws, wt/i8 -> (vselect wt/i8, ws, wd)
bsel.v wd, ws, wt/i8 -> (vselect wd, wt/i8, ws)

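The mapping can be checked against a bitwise model of each operation. The
sketch below works on one 8-bit slice of the vector; vselect here is a plain
bitwise select (cond, value-if-set, value-if-clear) standing in for the
SelectionDAG node, and the three instruction models follow the usual MSA
definitions as understood here rather than anything taken from the LLVM
sources.

```python
MASK = 0xFF


def vselect(cond: int, if_set: int, if_clear: int) -> int:
    """Per bit: take if_set where cond is 1, otherwise if_clear."""
    return ((cond & if_set) | (~cond & if_clear)) & MASK


def bmnz_v(wd: int, ws: int, wt: int) -> int:
    """Move ws bits into wd where the wt bit is not zero."""
    return ((ws & wt) | (wd & ~wt)) & MASK


def bmz_v(wd: int, ws: int, wt: int) -> int:
    """Move ws bits into wd where the wt bit is zero."""
    return ((ws & ~wt) | (wd & wt)) & MASK


def bsel_v(wd: int, ws: int, wt: int) -> int:
    """Select wt where the wd bit is set, ws where it is clear."""
    return ((ws & ~wd) | (wt & wd)) & MASK


values = range(0, 256, 17)  # 0x00, 0x11, ..., 0xFF
mapping_holds = all(
    bmz_v(wd, ws, wt) == vselect(wt, wd, ws)
    and bmnz_v(wd, ws, wt) == vselect(wt, ws, wd)
    and bsel_v(wd, ws, wt) == vselect(wd, wt, ws)
    for wd in values
    for ws in values
    for wt in values
)
print(mapping_holds)
```

In each case the first vselect operand is the condition mask: wt for bmz.v
and bmnz.v, and the tied register wd for bsel.v.
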
bmnzi.b, bmzi.b:
Like their non-immediate counterparts, bmnzi.b and bmzi.b are the same
operation with the operands swapped. bmnzi.b will (currently) be emitted
for both cases.

bseli.b:
Unlike the non-immediate versions, bseli.b is distinguishable from
bmnzi.b and bmzi.b and can be emitted.