//===-- README.txt - Notes for Blackfin Target ------------------*- org -*-===//
* Condition codes
** DONE Problem with asymmetric SETCC operations
The instruction
CC = R0 < 2
is not symmetric - there is no R0 > 2 instruction. On the other hand, IF CC
JUMP can take both CC and !CC as a condition. We cannot pattern-match (brcond
(not cc), target) because the DAG optimizer removes that kind of thing.
This is handled by creating a pseudo-register NCC that aliases CC. Register
classes JustCC and NotCC are used to control the inversion of CC.
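The idea can be sketched as a small lookup: a predicate with no direct compare form is emitted as its logical negation, and the user of the result reads the inverted register instead. This is only an illustration. The set of native predicates and the function name lower_setcc are assumptions for the sketch, not taken from the backend.

```python
# Predicates assumed to have a direct "CC = a OP b" form (assumption for this sketch).
NATIVE = {"eq", "lt", "le"}

# Each remaining predicate is the logical negation of a native one.
INVERSE = {"ne": "eq", "ge": "lt", "gt": "le"}


def lower_setcc(pred):
    """Return (native_pred, use_ncc).

    native_pred is the compare to emit; use_ncc says whether the consumer
    should read the inverted alias (NCC) rather than CC itself.
    """
    if pred in NATIVE:
        return pred, False
    return INVERSE[pred], True
```

With this mapping, R0 > 2 becomes a "less than or equal" compare whose result is consumed through NCC, so the branch tests !CC.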
** DONE CC as an i32 register
The AnyCC register class pretends to hold i32 values. It can only represent the
values 0 and 1, but we can copy to and from the D class. This hack makes it
possible to represent the setcc instruction without having i1 as a legal type.
In most cases, the CC register is set by a "CC = .." or BITTST instruction, and
then used in a conditional branch or move. The code generator thinks it is
moving 32 bits, but the value stays in CC. In other cases, the result of a
comparison is actually used as an i32 number, and CC will be copied to a D
register.
* Stack frames
** TODO Use Push/Pop instructions
We should use the push/pop instructions when saving callee-saved
registers. They are smaller, and we may even be able to use push-multiple
instructions.
** TODO requiresRegisterScavenging
We need more intelligence in determining when the scavenger is needed. We
should keep track of:
- Spilling D16 registers
- Spilling AnyCC registers
* Assembler
** TODO Implement PrintGlobalVariable
** TODO Remove LOAD32sym
It's a hack combining two instructions by concatenation.
* Inline Assembly
These are the GCC constraints from bfin/constraints.md:
| Code | Register class | LLVM |
|-------+-------------------------------------------+------|
| a | P | C |
| d | D | C |
| z | Call clobbered P (P0, P1, P2) | X |
| D | EvenD | X |
| W | OddD | X |
| e | Accu | C |
| A | A0 | S |
| B | A1 | S |
| b | I | C |
| v | B | C |
| f | M | C |
| c | Circular I, B, L | X |
| C | JustCC | S |
| t | LoopTop | X |
| u | LoopBottom | X |
| k | LoopCount | X |
| x | GR | C |
| y | RET*, ASTAT, SEQSTAT, USP | X |
| w | ALL | C |
| Z | The FD-PIC GOT pointer (P3) | S |
| Y | The FD-PIC function pointer register (P1) | S |
| q0-q7 | R0-R7 individually | |
| qA | P0 | |
|-------+-------------------------------------------+------|
| Code | Constant | |
|-------+-------------------------------------------+------|
| J | 1<<N, N<32 | |
| Ks3 | imm3 | |
| Ku3 | uimm3 | |
| Ks4 | imm4 | |
| Ku4 | uimm4 | |
| Ks5 | imm5 | |
| Ku5 | uimm5 | |
| Ks7 | imm7 | |
| KN7 | -imm7 | |
| Ksh | imm16 | |
| Kuh | uimm16 | |
| L | ~(1<<N) | |
| M1 | 0xff | |
| M2 | 0xffff | |
| P0-P4 | 0-4 | |
| PA | Macflag, not M | |
| PB | Macflag, only M | |
| Q | Symbol | |
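As a rough illustration of what the constant constraints accept, the predicates below test a value against the J, L, Ks and Ku rows. The sketch assumes 32-bit values and that immN means an N-bit two's complement field while uimmN means an N-bit unsigned field. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def is_J(value):
    """J: exactly one bit set, i.e. 1<<N with N<32."""
    value &= MASK32
    return value != 0 and (value & (value - 1)) == 0


def is_L(value):
    """L: exactly one bit clear, i.e. ~(1<<N)."""
    return is_J(~value)


def is_Ks(value, bits):
    """KsN: fits in a signed field of the given width (assumed meaning of immN)."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def is_Ku(value, bits):
    """KuN: fits in an unsigned field of the given width (assumed meaning of uimmN)."""
    return 0 <= value < (1 << bits)
```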
** TODO Support all register classes
* DAG combiner
** Create test case for each Illegal SETCC case
The DAG combiner may sometimes produce illegal i16 SETCC instructions.
*** TODO SETCC (srl (ctlz x), 5) == const
*** TODO SETCC (and load, const) == const
*** DONE SETCC (zext x) == const
*** TODO SETCC (sext x) == const
* Instruction selection
** TODO Better immediate constants
Like ARM, build constants as small imm + shift.
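A minimal sketch of the search this implies: look for a small signed immediate and a left shift that together reproduce a 32-bit constant. The 7-bit default width is an assumption chosen to match the Ks7 constraint above, and split_imm_shift is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF


def split_imm_shift(value, imm_bits=7):
    """Return (imm, shift) such that (imm << shift) equals value modulo 2**32.

    imm is limited to a signed imm_bits-bit range. The smallest usable
    shift is preferred. Returns None if no such pair exists.
    """
    value &= MASK32
    lo = -(1 << (imm_bits - 1))
    hi = (1 << (imm_bits - 1)) - 1
    for shift in range(32):
        for imm in range(lo, hi + 1):
            if ((imm << shift) & MASK32) == value:
                return imm, shift
    return None
```

A constant such as 0x12000 splits into a small immediate and a shift, while 0x12345678 yields None and would still need a full 32-bit load.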
** TODO Implement cycle counter
We have CYCLES and CYCLES2 registers, but the readcyclecounter intrinsic wants
to return i64, and the code generator doesn't know how to legalize that.
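The value the intrinsic should produce can be sketched as follows, assuming CYCLES holds the low 32 bits and CYCLES2 the high 32 bits of the counter. That ordering is an assumption here, and combine_cycles is a hypothetical helper, not backend code.

```python
def combine_cycles(cycles, cycles2):
    """Build the 64-bit counter value from two 32-bit register reads.

    cycles is assumed to be the low half and cycles2 the high half.
    """
    return ((cycles2 & 0xFFFFFFFF) << 32) | (cycles & 0xFFFFFFFF)
```

Legalizing the intrinsic would then amount to two 32-bit register reads whose results are paired into the i64 result.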
** TODO Instruction alternatives
Some instructions come in different variants, for example:
D = D + D
P = P + P
Cross combinations are not allowed:
P = D + D (bad)
Similarly for the subreg pseudo-instructions:
D16L = EXTRACT_SUBREG D16, bfin_subreg_lo16
P16L = EXTRACT_SUBREG P16, bfin_subreg_lo16
We want to take advantage of the alternative instructions. This could be done by
changing the DAG after instruction selection.
** Multipatterns for load/store
We should try to identify multipatterns for load and store instructions. The
available instruction matrix is a bit irregular.
Loads:
| Addr | D | P | D 16z | D 16s | D16 | D 8z | D 8s |
|------------+---+---+-------+-------+-----+------+------|
| P | * | * | * | * | * | * | * |
| P++ | * | * | * | * | | * | * |
| P-- | * | * | * | * | | * | * |
| P+uimm5m2 | | | * | * | | | |
| P+uimm6m4 | * | * | | | | | |
| P+imm16 | | | | | | * | * |
| P+imm17m2 | | | * | * | | | |
| P+imm18m4 | * | * | | | | | |
| P++P | * | | * | * | * | | |
| FP-uimm7m4 | * | * | | | | | |
| I | * | | | | * | | |
| I++ | * | | | | * | | |
| I-- | * | | | | * | | |
| I++M | * | | | | | | |
Stores:
| Addr | D | P | D16H | D16L | D 8 |
|------------+---+---+------+------+-----|
| P | * | * | * | * | * |
| P++ | * | * | | * | * |
| P-- | * | * | | * | * |
| P+uimm5m2 | | | | * | |
| P+uimm6m4 | * | * | | | |
| P+imm16 | | | | | * |
| P+imm17m2 | | | | * | |
| P+imm18m4 | * | * | | | |
| P++P | * | | * | * | |
| FP-uimm7m4 | * | * | | | |
| I | * | | * | * | |
| I++ | * | | * | * | |
| I-- | * | | * | * | |
| I++M | * | | | | |
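To show how the offset columns might drive selection, the sketch below checks whether a byte offset fits one of the immediate forms named in the tables. It assumes that the number in a name such as uimm6m4 is the total field width including the implied zero low bits, and that the mN suffix means "multiple of N". The FP-relative form is not modeled. Both function names are hypothetical.

```python
import re


def fits(offset, kind):
    """Check whether offset is encodable as kind, e.g. 'uimm6m4' or 'imm16'."""
    m = re.fullmatch(r"(u?)imm(\d+)(?:m(\d+))?", kind)
    if m is None:
        raise ValueError("unknown immediate kind: " + kind)
    unsigned = m.group(1) == "u"
    bits = int(m.group(2))
    multiple = int(m.group(3) or 1)
    if offset % multiple != 0:
        return False
    if unsigned:
        return 0 <= offset < (1 << bits)
    return -(1 << (bits - 1)) <= offset < (1 << (bits - 1))


def pick_load32_form(offset):
    """Pick the first P+offset form from the D/P load columns that fits."""
    for kind in ("uimm6m4", "imm18m4"):
        if fits(offset, kind):
            return "P+" + kind
    return None
```

Trying the short form first mirrors the usual preference for the smaller encoding when both forms are legal.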
* Workarounds and features
Blackfin CPUs have bugs. Each model comes in a number of silicon revisions with
different bugs. We learn about the CPU model from the -mcpu switch.
** Interpretation of -mcpu value
- -mcpu=bf527 refers to the latest known BF527 revision
- -mcpu=bf527-0.2 refers to silicon rev. 0.2
- -mcpu=bf527-any refers to all known revisions
- -mcpu=bf527-none disables all workarounds
The -mcpu setting affects the __SILICON_REVISION__ macro and enabled workarounds:
| -mcpu | __SILICON_REVISION__ | Workarounds |
|------------+----------------------+--------------------|
| bf527 | Def Latest | Specific to latest |
| bf527-1.3 | Def 0x0103 | Specific to 1.3 |
| bf527-any | Def 0xffff | All bf527-x.y |
| bf527-none | Undefined | None |
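The rules above can be sketched as a small parser. The encoding (major << 8) | minor is inferred from the 1.3 -> 0x0103 row. The default latest revision of 0.2 is taken from the Kookaburra row of the cores table and is an assumption for this example, as is the name parse_mcpu.

```python
def parse_mcpu(value, latest=(0, 2)):
    """Return (cpu, silicon_revision, workarounds) for an -mcpu value.

    silicon_revision is the value of __SILICON_REVISION__, or None when
    the macro should stay undefined. workarounds is a short label.
    """
    cpu, _, rev = value.partition("-")
    if rev == "":
        major, minor = latest  # bare name: latest known revision
        return cpu, (major << 8) | minor, "latest"
    if rev == "none":
        return cpu, None, "none"
    if rev == "any":
        return cpu, 0xFFFF, "all"
    major, minor = (int(part) for part in rev.split("."))
    return cpu, (major << 8) | minor, rev
```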
These are the known cores and revisions:
| Core | Silicon | Processors |
|-------------+--------------------+-------------------------|
| Edinburgh | 0.3, 0.4, 0.5, 0.6 | BF531 BF532 BF533 |
| Braemar | 0.2, 0.3 | BF534 BF536 BF537 |
| Stirling | 0.3, 0.4, 0.5 | BF538 BF539 |
| Moab | 0.0, 0.1, 0.2 | BF542 BF544 BF548 BF549 |
| Teton | 0.3, 0.5 | BF561 |
| Kookaburra | 0.0, 0.1, 0.2 | BF523 BF525 BF527 |
| Mockingbird | 0.0, 0.1 | BF522 BF524 BF526 |
| Brodie | 0.0, 0.1 | BF512 BF514 BF516 BF518 |
** Compiler implemented workarounds
Most workarounds are implemented in header files and source code using the
__ADSPBF527__ macros. A few workarounds require compiler support.
| Anomaly | Macro | GCC Switch |
|----------+--------------------------------+------------------|
| Any | __WORKAROUNDS_ENABLED | |
| 05000074 | WA_05000074 | |
| 05000244 | __WORKAROUND_SPECULATIVE_SYNCS | -mcsync-anomaly |
| 05000245 | __WORKAROUND_SPECULATIVE_LOADS | -mspecld-anomaly |
| 05000257 | WA_05000257 | |
| 05000283 | WA_05000283 | |
| 05000312 | WA_LOAD_LCREGS | |
| 05000315 | WA_05000315 | |
| 05000371 | __WORKAROUND_RETS | |
| 05000426 | __WORKAROUND_INDIRECT_CALLS | Not -micplb |
** GCC feature switches
| Switch | Description |
|---------------------------+----------------------------------------|
| -msim | Use simulator runtime |
| -momit-leaf-frame-pointer | Omit frame pointer for leaf functions |
| -mlow64k | |
| -mcsync-anomaly | |
| -mspecld-anomaly | |
| -mid-shared-library | |
| -mleaf-id-shared-library | |
| -mshared-library-id= | |
| -msep-data | Enable separate data segment |
| -mlong-calls | Use indirect calls |
| -mfast-fp | |
| -mfdpic | |
| -minline-plt | |
| -mstack-check-l1 | Do stack checking in L1 scratch memory |
| -mmulticore | Enable multicore support |
| -mcorea | Build for Core A |
| -mcoreb | Build for Core B |
| -msdram | Build for SDRAM |
| -micplb | Assume ICPLBs are enabled at runtime. |