llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2026-04-21 23:17:16 +00:00

Files

Evan Cheng 6edb0eac87 Teach machine sink to

1) Do forward copy propagation. This makes it easier to estimate the cost of the
   instruction being sunk.
2) Break critical edges on demand, including cases where the value is used by
   PHI nodes.
Critical edge splitting is not yet enabled by default.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114227 91177308-0d34-0410-b5e6-96231b3b80d8

2010-09-17 22:28:18 +00:00

AsmPrinter

If FE forgot to provide a file name (usually it uses "stdin" as name in such situation) then make one up to ensure that debug info is not malformed.

2010-09-16 20:57:49 +00:00

PBQP

Added initialisers for reduction rule counters.

2010-09-05 13:42:32 +00:00

SelectionDAG

Check bb to ensure that alloca is in separate basic block.

2010-09-15 18:13:55 +00:00

AggressiveAntiDepBreaker.cpp

Anti-dependency breaking needs to be careful not to use reserved regs

2010-09-02 17:12:55 +00:00

AggressiveAntiDepBreaker.h

Use std::vector instead of TargetRegisterInfo::FirstVirtualRegister.

2010-07-15 18:43:09 +00:00

Analysis.cpp

Using llvm.eh.catch.all.value instead of .llvm.eh.catch.all.value.

2010-07-26 22:36:52 +00:00

AntiDepBreaker.h

Make BreakAntiDependencies' SUnits argument const, and make the Begin

2010-04-19 23:11:58 +00:00

BranchFolding.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

BranchFolding.h

Tail merging pass shall not break up IT blocks. rdar://8115404

2010-06-22 01:18:16 +00:00

CalcSpillWeights.cpp

Clean up debug output.

2010-08-12 18:50:55 +00:00

CallingConvLower.cpp

Reapply bottom-up fast-isel, with several fixes for x86-32:

2010-07-10 09:00:22 +00:00

CMakeLists.txt

Revert "CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally."

2010-09-13 23:59:48 +00:00

CodePlacementOpt.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

CriticalAntiDepBreaker.cpp

Fix a comment typo.

2010-09-10 22:42:21 +00:00

CriticalAntiDepBreaker.h

Use std::vector instead of TargetRegisterInfo::FirstVirtualRegister. This time

2010-07-15 19:58:14 +00:00

DeadMachineInstructionElim.cpp

Track liveness of unallocatable, unreserved registers in machine DCE.

2010-08-31 21:51:05 +00:00

DwarfEHPrepare.cpp

Reapply commit 112702 which was speculatively reverted by echristo.

2010-09-03 08:31:48 +00:00

ELF.h

Get rid of a bunch of duplicated ELF enum values.

2010-07-16 07:53:29 +00:00

ELFCodeEmitter.cpp

Get rid of a bunch of duplicated ELF enum values.

2010-07-16 07:53:29 +00:00

ELFCodeEmitter.h

2010-03-14 01:41:15 +00:00

ELFWriter.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

ELFWriter.h

Tidy some #includes and forward-declarations, and move the C binding code

2010-08-07 00:43:20 +00:00

GCMetadata.cpp

zap dead code.

2010-09-04 18:12:00 +00:00

GCMetadataPrinter.cpp

mcize the gc metadata printing stuff.

2010-04-04 07:39:04 +00:00

GCStrategy.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

IfConversion.cpp

Teach if-converter to be more careful with predicating instructions that would

2010-09-10 01:29:16 +00:00

InlineSpiller.cpp

Clean up the Spiller.h interface.

2010-08-13 22:56:53 +00:00

IntrinsicLowering.cpp

undo 80 column trespassing I caused

2010-07-22 10:37:47 +00:00

LatencyPriorityQueue.cpp

Use 'llvm::next' instead of 'next' to make VC++ 2010 happy.

2010-05-30 13:14:21 +00:00

LiveInterval.cpp

Remove dead code.

2010-09-08 18:50:24 +00:00

LiveIntervalAnalysis.cpp

PHI elimination shouldn't require machineloopinfo since it's used at -O0. Move the requirement to LiveIntervalAnalysis instead. Note this does not change the number of times machineloopinfo is computed.

2010-08-17 21:00:37 +00:00

LiveStackAnalysis.cpp

Fix batch of converting RegisterPass<> to INTIALIZE_PASS().

2010-07-21 22:09:45 +00:00

LiveVariables.cpp

Remove unused functions.

2010-08-16 17:18:20 +00:00

LLVMTargetMachine.cpp

Stop using the dom frontier in DwarfEHPrepare by not promoting alloca's

2010-08-31 09:05:06 +00:00

LocalStackSlotAllocation.cpp

Improve virtual frame base register allocation heuristics.

2010-08-31 17:58:19 +00:00

LowerSubregs.cpp

Remove unused functions.

2010-08-16 17:18:20 +00:00

MachineBasicBlock.cpp

Properly update MachineDominators when splitting critical edge.

2010-08-19 23:32:47 +00:00

MachineCSE.cpp

Machine CSE was forgetting to clear some data structures.

2010-09-17 21:59:42 +00:00

MachineDominators.cpp

Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API.

2010-08-23 17:52:01 +00:00

MachineFunction.cpp

It's better to have the arrays, which would trigger the creation of stack

2010-07-27 01:55:19 +00:00

MachineFunctionAnalysis.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

MachineFunctionPass.cpp

Ok, third time's the charm. No changes from last time except the CMake

2010-04-02 23:17:14 +00:00

MachineFunctionPrinterPass.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

MachineInstr.cpp

Prefix 'next' iterator operation with 'llvm::'.

2010-08-02 06:00:15 +00:00

MachineLICM.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

MachineLoopInfo.cpp

Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API.

2010-08-23 17:52:01 +00:00

MachineModuleInfo.cpp

zap dead code.

2010-09-04 18:12:00 +00:00

MachineModuleInfoImpls.cpp

Add a bit along with the MCSymbols stored in the MachineModuleInfo maps that

2010-03-10 22:34:10 +00:00

MachinePassRegistry.cpp

…

MachineRegisterInfo.cpp

Replace copyRegToReg with COPY everywhere in lib/CodeGen except for FastISel.

2010-07-10 22:42:59 +00:00

MachineSink.cpp

Teach machine sink to

2010-09-17 22:28:18 +00:00

MachineSSAUpdater.cpp

Fix PR7096. When a block containing multiple defs is tail duplicated, the

2010-05-10 17:14:26 +00:00

MachineVerifier.cpp

Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API.

2010-08-23 17:52:01 +00:00

Makefile

…

ObjectCodeEmitter.cpp

…

OcamlGC.cpp

…

OptimizePHIs.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

Passes.cpp

Use the fast register allocator by default for -O0 builds.

2010-06-03 00:39:06 +00:00

PeepholeOptimizer.cpp

must not peephole away side effects

2010-09-14 20:46:08 +00:00

PHIElimination.cpp

Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API.

2010-08-23 17:52:01 +00:00

PHIElimination.h

2010-08-17 21:00:37 +00:00

PostRAHazardRecognizer.cpp

Teach if-converter to be more careful with predicating instructions that would

2010-09-10 01:29:16 +00:00

PostRASchedulerList.cpp

Teach if-converter to be more careful with predicating instructions that would

2010-09-10 01:29:16 +00:00

PreAllocSplitting.cpp

Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API.

2010-08-23 17:52:01 +00:00

ProcessImplicitDefs.cpp

Fix batch of converting RegisterPass<> to INTIALIZE_PASS().

2010-07-21 22:09:45 +00:00

PrologEpilogInserter.cpp

Simplify eliminateFrameIndex() interface back down now that PEI doesn't need

2010-08-26 23:32:16 +00:00

PrologEpilogInserter.h

Simplify eliminateFrameIndex() interface back down now that PEI doesn't need

2010-08-26 23:32:16 +00:00

PseudoSourceValue.cpp

Fix memcheck-found leaks: one false positive from using new[], and one true

2010-03-04 22:15:01 +00:00

README.txt

…

RegAllocFast.cpp

Add DEBUG message.

2010-09-10 20:32:09 +00:00

RegAllocLinearScan.cpp

Tweak to ignoring reserved regs. The allocator was occasionally still looking

2010-09-01 22:48:34 +00:00

RegAllocPBQP.cpp

Added support for register allocators to record which intervals are spill intervals, and where the uses and defs of the original intervals were in the original code.

2010-09-02 08:27:00 +00:00

RegisterCoalescer.cpp

Remove many calls to TII::isMoveInstr. Targets should be producing COPY anyway.

2010-07-16 04:45:42 +00:00

RegisterScavenging.cpp

The scavenger should just use getAllocatableSet() rather than reinventing it

2010-09-02 18:29:04 +00:00

RenderMachineFunction.cpp

Added support for register allocators to record which intervals are spill intervals, and where the uses and defs of the original intervals were in the original code.

2010-09-02 08:27:00 +00:00

RenderMachineFunction.h

Added support for register allocators to record which intervals are spill intervals, and where the uses and defs of the original intervals were in the original code.

2010-09-02 08:27:00 +00:00

ScheduleDAG.cpp

Remove trailing whitespace, no functionality changes.

2010-06-30 03:40:54 +00:00

ScheduleDAGEmit.cpp

Emit COPY instructions instead of using copyRegToReg in InstrEmitter,

2010-07-10 19:08:25 +00:00

ScheduleDAGInstrs.cpp

Teach if-converter to be more careful with predicating instructions that would

2010-09-10 01:29:16 +00:00

ScheduleDAGInstrs.h

Teach if-converter to be more careful with predicating instructions that would

2010-09-10 01:29:16 +00:00

ScheduleDAGPrinter.cpp

…

ShadowStackGC.cpp

use ArgOperand API and CallSite to access arguments of CallInst

2010-06-25 08:48:19 +00:00

ShrinkWrapping.cpp

…

SimpleRegisterCoalescing.cpp

Teach RemoveCopyByCommutingDef to check all aliases, not just subregisters.

2010-09-01 22:15:35 +00:00

SimpleRegisterCoalescing.h

Transpose the calculation of spill weights such that we are calculating one

2010-08-10 00:02:26 +00:00

SjLjEHPrepare.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

SlotIndexes.cpp

Fix batch of converting RegisterPass<> to INTIALIZE_PASS().

2010-07-21 22:09:45 +00:00

Spiller.cpp

Clean up the Spiller.h interface.

2010-08-13 22:56:53 +00:00

Spiller.h

Clean up the Spiller.h interface.

2010-08-13 22:56:53 +00:00

SplitKit.cpp

Use the value mapping provided by LiveIntervalMap. This simplifies the code a

2010-09-16 00:01:36 +00:00

SplitKit.h

Use the value mapping provided by LiveIntervalMap. This simplifies the code a

2010-09-16 00:01:36 +00:00

Splitter.cpp

Fix batch of converting RegisterPass<> to INTIALIZE_PASS().

2010-07-21 22:09:45 +00:00

Splitter.h

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

StackProtector.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

StackSlotColoring.cpp

remove dead proto

2010-08-28 03:45:03 +00:00

StrongPHIElimination.cpp

Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API.

2010-08-23 17:52:01 +00:00

TailDuplication.cpp

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

TargetInstrInfoImpl.cpp

Teach if-converter to be more careful with predicating instructions that would

2010-09-10 01:29:16 +00:00

TargetLoweringObjectFileImpl.cpp

two changes:

2010-08-30 18:12:35 +00:00

TwoAddressInstructionPass.cpp

Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API.

2010-08-23 17:52:01 +00:00

UnreachableBlockElim.cpp

Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API.

2010-08-23 17:52:01 +00:00

VirtRegMap.cpp

Fix batch of converting RegisterPass<> to INTIALIZE_PASS().

2010-07-21 22:09:45 +00:00

VirtRegMap.h

Reapply r110396, with fixes to appease the Linux buildbot gods.

2010-08-06 18:33:48 +00:00

VirtRegRewriter.cpp

Don't add <imp-def> operands during register rewriting.

2010-09-07 22:38:45 +00:00

VirtRegRewriter.h

Code clean up. Move includes from VirtRegRewriter.h to VirtRegRewriter.cpp.

2010-04-06 17:19:55 +00:00

README.txt

//===---------------------------------------------------------------------===//

Common register allocation / spilling problem:

        mul lr, r4, lr
        str lr, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        ldr r4, [sp, #+52]
        mla r4, r3, lr, r4

can be:

        mul lr, r4, lr
        mov r4, lr
        str lr, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        mla r4, r3, lr, r4

and then "merge" mul and mov:

        mul r4, r4, lr
        str lr, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        mla r4, r3, lr, r4

It also increases the likelihood that the store may become dead.

//===---------------------------------------------------------------------===//

bb27 ...
        ...
        %reg1037 = ADDri %reg1039, 1
        %reg1038 = ADDrs %reg1032, %reg1039, %NOREG, 10
    Successors according to CFG: 0x8b03bf0 (#5)

bb76 (0x8b03bf0, LLVM BB @0x8b032d0, ID#5):
    Predecessors according to CFG: 0x8b0c5f0 (#3) 0x8b0a7c0 (#4)
        %reg1039 = PHI %reg1070, mbb<bb76.outer,0x8b0c5f0>, %reg1037, mbb<bb27,0x8b0a7c0>

Note ADDri is not a two-address instruction. However, its result %reg1037 is an
operand of the PHI node in bb76 and its operand %reg1039 is the result of the
PHI node. We should treat it as a two-address code and make sure the ADDri is
scheduled after any node that reads %reg1039.
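
The ordering constraint described above can be sketched as extra scheduling
edges. This is only an illustration, not the scheduler's actual data
structures: the Instr tuple and the phi_ordering_edges helper are hypothetical
names introduced here, and the block is reduced to the two instructions shown.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    name: str
    defs: tuple   # virtual registers written
    uses: tuple   # virtual registers read


def phi_ordering_edges(block, phi_result, phi_incoming):
    """Return (before, after) pairs: every other reader of phi_result must be
    scheduled before the instruction that defines phi_incoming."""
    definers = [i for i in block if phi_incoming in i.defs]
    edges = []
    for d in definers:
        for i in block:
            if i is not d and phi_result in i.uses:
                edges.append((i.name, d.name))
    return edges


bb27 = [
    Instr("ADDri", ("%reg1037",), ("%reg1039",)),
    Instr("ADDrs", ("%reg1038",), ("%reg1032", "%reg1039")),
]
edges = phi_ordering_edges(bb27, "%reg1039", "%reg1037")
print(edges)
```

With ADDrs ordered before ADDri, %reg1037 and %reg1039 no longer overlap inside
bb27, which is what would let them share one physical register.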

//===---------------------------------------------------------------------===//

Use local info (i.e. register scavenger) to assign it a free register to allow
reuse:
        ldr r3, [sp, #+4]
        add r3, r3, #3
        ldr r2, [sp, #+8]
        add r2, r2, #2
        ldr r1, [sp, #+4]  <==
        add r1, r1, #1
        ldr r0, [sp, #+4]
        add r0, r0, #2

//===---------------------------------------------------------------------===//

LLVM aggressively lifts common subexpressions out of loops. Sometimes this can
have negative side effects:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
load [i + R1]
...
load [i + R2]
...
load [i + R3]

Suppose there is high register pressure; R1, R2, and R3 can be spilled. We need
to implement proper re-materialization to handle this:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
R1 = X + 4  @ re-materialized
load [i + R1]
...
R2 = X + 7 @ re-materialized
load [i + R2]
...
R3 = X + 15 @ re-materialized
load [i + R3]

Furthermore, with re-association, we can enable sharing:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
T = i + X
load [T + 4]
...
load [T + 7]
...
load [T + 15]
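
As a rough check of the re-association step, the sketch below computes the
addresses touched by both forms. The function names are made up for this
illustration, and it assumes the constants 4, 7 and 15 can be folded into the
addressing mode of each load.

```python
def addresses_hoisted(x, trip_count):
    # R1, R2 and R3 are all live across the loop.
    r1, r2, r3 = x + 4, x + 7, x + 15
    out = []
    for i in range(trip_count):
        out += [i + r1, i + r2, i + r3]
    return out


def addresses_reassociated(x, trip_count):
    # Only X is live across the loop; T is a short-lived temporary.
    out = []
    for i in range(trip_count):
        t = i + x
        out += [t + 4, t + 7, t + 15]
    return out


same = addresses_hoisted(100, 3) == addresses_reassociated(100, 3)
print(same)
```

Both forms visit the same addresses, but the second keeps one value live across
the loop instead of three.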

//===---------------------------------------------------------------------===//

It's not always a good idea to choose rematerialization over spilling. If all
the load / store instructions would be folded then spilling is cheaper because
it won't require new live intervals / registers. See 2003-05-31-LongShifts for
an example.

//===---------------------------------------------------------------------===//

With a copying garbage collector, derived pointers must not be retained across
collector safe points; the collector could move the objects and invalidate the
derived pointer. This is bad enough in the first place, but safe points can
crop up unpredictably. Consider:

        %array = load { i32, [0 x %obj] }** %array_addr
        %nth_el = getelementptr { i32, [0 x %obj] }* %array, i32 0, i32 %n
        %old = load %obj** %nth_el
        %z = div i64 %x, %y
        store %obj* %new, %obj** %nth_el

If the i64 division is lowered to a libcall, then a safe point will (must)
appear for the call site. If a collection occurs, %array and %nth_el no longer
point into the correct object.

The fix for this is to copy address calculations so that dependent pointers
are never live across safe point boundaries. But the loads cannot be copied
like this if there was an intervening store, so this may be hard to get right.
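
The hazard and the fix can be shown with a toy model. Everything here is
hypothetical: ToyHeap is not an LLVM or collector API, addresses are plain
integers, and collect() stands in for a safe point that moves every rooted
object and updates only the base pointers it is given.

```python
class ToyHeap:
    """Toy copying heap: objects are slot lists keyed by base address."""

    def __init__(self):
        self.objects = {}
        self.next_base = 1000

    def alloc(self, n):
        base = self.next_base
        self.objects[base] = [None] * n
        self.next_base += 100          # assumes objects smaller than 100 slots
        return base

    def collect(self, roots):
        """Move every rooted object and return the relocated base pointers."""
        moved, new_roots = {}, []
        for base in roots:
            new_base = self.next_base
            self.next_base += 100
            moved[new_base] = self.objects[base]
            new_roots.append(new_base)
        self.objects = moved
        return new_roots

    def valid(self, addr):
        """True if addr points into a live object."""
        return any(b <= addr < b + len(s) for b, s in self.objects.items())

    def store(self, addr, value):
        if not self.valid(addr):
            raise MemoryError("stale or invalid address %d" % addr)
        base = max(b for b in self.objects if b <= addr)
        self.objects[base][addr - base] = value


heap = ToyHeap()
array = heap.alloc(4)
nth_el = array + 2                     # derived pointer, computed early
(array,) = heap.collect([array])       # safe point: the object moves
stale_ok = heap.valid(nth_el)          # the old derived pointer is dangling
nth_el = array + 2                     # the fix: recompute after the safe point
heap.store(nth_el, "new")
```

Recomputing the address after the safe point is the "copy address
calculations" fix; the old nth_el no longer points into any live object.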

Only a concurrent mutator can trigger a collection at the libcall safe point.
So single-threaded programs do not have this requirement, even with a copying
collector. Still, LLVM optimizations would probably undo a front-end's careful
work.

//===---------------------------------------------------------------------===//

The ocaml frametable structure supports liveness information. It would be good
to support it.

//===---------------------------------------------------------------------===//

The FIXME in ComputeCommonTailLength in BranchFolding.cpp needs to be
revisited. The check is there to work around a misuse of directives in inline
assembly.

//===---------------------------------------------------------------------===//

It would be good to detect collector/target compatibility instead of silently
doing the wrong thing.

//===---------------------------------------------------------------------===//

It would be really nice to be able to write patterns in .td files for copies,
which would eliminate a bunch of explicit predicates on them (e.g. no side 
effects).  Once this is in place, it would be even better to have tblgen 
synthesize the various copy insertion/inspection methods in TargetInstrInfo.

//===---------------------------------------------------------------------===//

Stack coloring improvements:

1. Do proper LiveStackAnalysis on all stack objects including those which are
   not spill slots.
2. Reorder objects to fill in gaps between objects.
   e.g. 4, 1, <gap>, 4, 1, 1, 1, <gap>, 4 => 4, 1, 1, 1, 1, 4, 4
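
A minimal sketch of item 2, assuming each object is aligned to its own size
and that sorting by descending alignment is an acceptable ordering (the order
in the example above differs but gives the same frame size). The helper names
are illustrative only.

```python
def frame_size(sizes):
    """Lay objects out in the given order, padding each to its alignment."""
    offset = 0
    for size in sizes:
        offset = (offset + size - 1) // size * size   # round up to alignment
        offset += size
    return offset


def reorder(sizes):
    """Place the most strictly aligned objects first so no gaps are needed."""
    return sorted(sizes, reverse=True)


original = [4, 1, 4, 1, 1, 1, 4]
print(frame_size(original), frame_size(reorder(original)))
```

For this input the original order needs 20 bytes (two gaps, 3 + 1 bytes), and
the reordered frame needs 16.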

//===---------------------------------------------------------------------===//

The scheduler should be able to sort nearby instructions by their address. For
example, in an expanded memset sequence it's not uncommon to see code like this:

  movl $0, 4(%rdi)
  movl $0, 8(%rdi)
  movl $0, 12(%rdi)
  movl $0, 0(%rdi)

Each of the stores is independent, and the scheduler is currently making an
arbitrary decision about the order.
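
One way to picture the desired tie-break, as a sketch only: when stores are
known to be independent, order them by base register and then by offset. The
tuple representation and the function name are assumptions made for this
example, not scheduler interfaces.

```python
def sort_independent_stores(stores):
    """stores: (base_register, offset, value) tuples with no dependences
    between them, so any order is legal."""
    return sorted(stores, key=lambda s: (s[0], s[1]))


memset = [("%rdi", 4, 0), ("%rdi", 8, 0), ("%rdi", 12, 0), ("%rdi", 0, 0)]
ordered = sort_independent_stores(memset)
for base, off, val in ordered:
    print(f"movl ${val}, {off}({base})")
```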

//===---------------------------------------------------------------------===//

Another opportunity in this code is that the $0 could be moved to a register:

  movl $0, 4(%rdi)
  movl $0, 8(%rdi)
  movl $0, 12(%rdi)
  movl $0, 0(%rdi)

This would save substantial code size, especially for longer sequences like
this. It would be easy to have a rule telling isel to avoid matching MOV32mi
if the immediate has more than some fixed number of uses. It's more involved
to teach the register allocator how to do late folding to recover from
excessive register pressure.
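
A back-of-the-envelope sketch of the use-count rule. The byte counts are
assumptions about typical x86 encodings (a store of a 32-bit immediate carries
4 immediate bytes that a register store does not, and materializing the
constant with a mov of an imm32 takes about 5 bytes); the threshold and the
function names are hypothetical.

```python
IMM32_BYTES = 4        # assumed extra bytes per store-immediate
MATERIALIZE_BYTES = 5  # assumed size of one `movl $imm, %reg`


def bytes_saved(num_uses):
    """Estimated code size saved by sharing one register across all uses."""
    return num_uses * IMM32_BYTES - MATERIALIZE_BYTES


def prefer_register(num_uses, max_imm_uses=2):
    """Hypothetical isel rule: avoid the store-immediate form once the
    immediate has more than max_imm_uses uses."""
    return num_uses > max_imm_uses


print(bytes_saved(4), prefer_register(4))
```

Under these assumptions the four-store sequence above would shrink by 11
bytes. The estimate ignores register pressure, which is why late folding in
the register allocator is still needed.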