//===-- README.txt - Notes for Blackfin Target ------------------*- org -*-===// * Condition codes ** DONE Problem with asymmetric SETCC operations The instruction CC = R0 < 2 is not symmetric - there is no R0 > 2 instruction. On the other hand, IF CC JUMP can take both CC and !CC as a condition. We cannot pattern-match (brcond (not cc), target), the DAG optimizer removes that kind of thing. This is handled by creating a pseudo-register NCC that aliases CC. Register classes JustCC and NotCC are used to control the inversion of CC. ** DONE CC as an i32 register The AnyCC register class pretends to hold i32 values. It can only represent the values 0 and 1, but we can copy to and from the D class. This hack makes it possible to represent the setcc instruction without having i1 as a legal type. In most cases, the CC register is set by a "CC = .." or BITTST instruction, and then used in a conditional branch or move. The code generator thinks it is moving 32 bits, but the value stays in CC. In other cases, the result of a comparison is actually used as am i32 number, and CC will be copied to a D register. * Stack frames ** TODO Use Push/Pop instructions We should use the push/pop instructions when saving callee-saved registers. The are smaller, and we may even use push multiple instructions. ** TODO requiresRegisterScavenging We need more intelligence in determining when the scavenger is needed. We should keep track of: - Spilling D16 registers - Spilling AnyCC registers * Assembler ** TODO Implement PrintGlobalVariable ** TODO Remove LOAD32sym It's a hack combining two instructions by concatenation. * Inline Assembly These are the GCC constraints from bfin/constraints.md: | Code | Register class | LLVM | |-------+-------------------------------------------+------| | a | P | C | | d | D | C | | z | Call clobbered P (P0, P1, P2) | X | | D | EvenD | X | | W | OddD | X | | e | Accu | C | | A | A0 | S | | B | A1 | S | | b | I | C | | v | B | C | | f | M | C | | c | Circular I, B, L | X | | C | JustCC | S | | t | LoopTop | X | | u | LoopBottom | X | | k | LoopCount | X | | x | GR | C | | y | RET*, ASTAT, SEQSTAT, USP | X | | w | ALL | C | | Z | The FD-PIC GOT pointer (P3) | S | | Y | The FD-PIC function pointer register (P1) | S | | q0-q7 | R0-R7 individually | | | qA | P0 | | |-------+-------------------------------------------+------| | Code | Constant | | |-------+-------------------------------------------+------| | J | 1<<N, N<32 | | | Ks3 | imm3 | | | Ku3 | uimm3 | | | Ks4 | imm4 | | | Ku4 | uimm4 | | | Ks5 | imm5 | | | Ku5 | uimm5 | | | Ks7 | imm7 | | | KN7 | -imm7 | | | Ksh | imm16 | | | Kuh | uimm16 | | | L | ~(1<<N) | | | M1 | 0xff | | | M2 | 0xffff | | | P0-P4 | 0-4 | | | PA | Macflag, not M | | | PB | Macflag, only M | | | Q | Symbol | | ** TODO Support all register classes * DAG combiner ** Create test case for each Illegal SETCC case The DAG combiner may someimes produce illegal i16 SETCC instructions. *** TODO SETCC (ctlz x), 5) == const *** TODO SETCC (and load, const) == const *** DONE SETCC (zext x) == const *** TODO SETCC (sext x) == const * Instruction selection ** TODO Better imediate constants Like ARM, build constants as small imm + shift. ** TODO Implement cycle counter We have CYCLES and CYCLES2 registers, but the readcyclecounter intrinsic wants to return i64, and the code generator doesn't know how to legalize that. ** TODO Instruction alternatives Some instructions come in different variants for example: D = D + D P = P + P Cross combinations are not allowed: P = D + D (bad) Similarly for the subreg pseudo-instructions: D16L = EXTRACT_SUBREG D16, bfin_subreg_lo16 P16L = EXTRACT_SUBREG P16, bfin_subreg_lo16 We want to take advantage of the alternative instructions. This could be done by changing the DAG after instruction selection. ** Multipatterns for load/store We should try to identify multipatterns for load and store instructions. The available instruction matrix is a bit irregular. Loads: | Addr | D | P | D 16z | D 16s | D16 | D 8z | D 8s | |------------+---+---+-------+-------+-----+------+------| | P | * | * | * | * | * | * | * | | P++ | * | * | * | * | | * | * | | P-- | * | * | * | * | | * | * | | P+uimm5m2 | | | * | * | | | | | P+uimm6m4 | * | * | | | | | | | P+imm16 | | | | | | * | * | | P+imm17m2 | | | * | * | | | | | P+imm18m4 | * | * | | | | | | | P++P | * | | * | * | * | | | | FP-uimm7m4 | * | * | | | | | | | I | * | | | | * | | | | I++ | * | | | | * | | | | I-- | * | | | | * | | | | I++M | * | | | | | | | Stores: | Addr | D | P | D16H | D16L | D 8 | |------------+---+---+------+------+-----| | P | * | * | * | * | * | | P++ | * | * | | * | * | | P-- | * | * | | * | * | | P+uimm5m2 | | | | * | | | P+uimm6m4 | * | * | | | | | P+imm16 | | | | | * | | P+imm17m2 | | | | * | | | P+imm18m4 | * | * | | | | | P++P | * | | * | * | | | FP-uimm7m4 | * | * | | | | | I | * | | * | * | | | I++ | * | | * | * | | | I-- | * | | * | * | | | I++M | * | | | | | * Workarounds and features Blackfin CPUs have bugs. Each model comes in a number of silicon revisions with different bugs. We learn about the CPU model from the -mcpu switch. ** Interpretation of -mcpu value - -mcpu=bf527 refers to the latest known BF527 revision - -mcpu=bf527-0.2 refers to silicon rev. 0.2 - -mcpu=bf527-any refers to all known revisions - -mcpu=bf527-none disables all workarounds The -mcpu setting affects the __SILICON_REVISION__ macro and enabled workarounds: | -mcpu | __SILICON_REVISION__ | Workarounds | |------------+----------------------+--------------------| | bf527 | Def Latest | Specific to latest | | bf527-1.3 | Def 0x0103 | Specific to 1.3 | | bf527-any | Def 0xffff | All bf527-x.y | | bf527-none | Undefined | None | These are the known cores and revisions: | Core | Silicon | Processors | |-------------+--------------------+-------------------------| | Edinburgh | 0.3, 0.4, 0.5, 0.6 | BF531 BF532 BF533 | | Braemar | 0.2, 0.3 | BF534 BF536 BF537 | | Stirling | 0.3, 0.4, 0.5 | BF538 BF539 | | Moab | 0.0, 0.1, 0.2 | BF542 BF544 BF548 BF549 | | Teton | 0.3, 0.5 | BF561 | | Kookaburra | 0.0, 0.1, 0.2 | BF523 BF525 BF527 | | Mockingbird | 0.0, 0.1 | BF522 BF524 BF526 | | Brodie | 0.0, 0.1 | BF512 BF514 BF516 BF518 | ** Compiler implemented workarounds Most workarounds are implemented in header files and source code using the __ADSPBF527__ macros. A few workarounds require compiler support. | Anomaly | Macro | GCC Switch | |----------+--------------------------------+------------------| | Any | __WORKAROUNDS_ENABLED | | | 05000074 | WA_05000074 | | | 05000244 | __WORKAROUND_SPECULATIVE_SYNCS | -mcsync-anomaly | | 05000245 | __WORKAROUND_SPECULATIVE_LOADS | -mspecld-anomaly | | 05000257 | WA_05000257 | | | 05000283 | WA_05000283 | | | 05000312 | WA_LOAD_LCREGS | | | 05000315 | WA_05000315 | | | 05000371 | __WORKAROUND_RETS | | | 05000426 | __WORKAROUND_INDIRECT_CALLS | Not -micplb | ** GCC feature switches | Switch | Description | |---------------------------+----------------------------------------| | -msim | Use simulator runtime | | -momit-leaf-frame-pointer | Omit frame pointer for leaf functions | | -mlow64k | | | -mcsync-anomaly | | | -mspecld-anomaly | | | -mid-shared-library | | | -mleaf-id-shared-library | | | -mshared-library-id= | | | -msep-data | Enable separate data segment | | -mlong-calls | Use indirect calls | | -mfast-fp | | | -mfdpic | | | -minline-plt | | | -mstack-check-l1 | Do stack checking in L1 scratch memory | | -mmulticore | Enable multicore support | | -mcorea | Build for Core A | | -mcoreb | Build for Core B | | -msdram | Build for SDRAM | | -micplb | Assume ICPLBs are enabled at runtime. |