It would be interesting to see if this works OK on an ARM machine.

Running e.g. z-self-modify-1 to completion in -mc -mx 1 mode shows that the
memory usage of the run6502 process grows steadily, but valgrind doesn't show
any leaks. A quick web search suggests this might be down to internal
allocations in LLVM which are only exposed by workloads like this that
continually JIT. I am inclined to leave this and perhaps come back to it once
LLVM 3.5 is actually released; if there's still a problem then it might be
worth tracking down.

Would it be helpful to pass branch weights to CreateCondBr()? For example,
where we have a computed address which might trigger a read/write callback, we
could calculate the proportion of addresses in the address range which have
callbacks on them and use that as the probability of taking the callback-exists
branch.
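
A rough sketch of how the weights could be attached, assuming we emit the
branch through an llvm::IRBuilder<>; the function and variable names below are
illustrative, not existing lib6502-jit code:

    #include <cstdint>
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/MDBuilder.h"

    // Emit the callback-check branch with profile metadata derived from how
    // many addresses in the relevant range do and don't have callbacks.
    // Both weights should be at least 1.
    static void emitCallbackBranch(llvm::IRBuilder<> &builder,
                                   llvm::Value *has_callback,
                                   llvm::BasicBlock *callback_block,
                                   llvm::BasicBlock *plain_block,
                                   uint32_t addresses_with_callbacks,
                                   uint32_t addresses_without_callbacks)
    {
        llvm::MDBuilder md_builder(builder.getContext());
        llvm::MDNode *weights = md_builder.createBranchWeights(
            addresses_with_callbacks, addresses_without_callbacks);
        builder.CreateCondBr(has_callback, callback_block, plain_block,
                             weights);
    }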

We could potentially use Function objects to deduce properties of stretches of
code and use that information to improve the generated code. For example, if we
observed that a Function object didn't contain any external calls or any
stack-modification instructions except RTS then we could inline it in any
callers (adding its code ranges to their code ranges, of course) and the RTS
could be a no-op. (For 100% accuracy, the JSR should still push the return
address on the stack but not modify the stack pointer; code executed later on
might peek at the stack and expect those values to be there.) This might in
turn allow the callers of that Function to be inlined themselves. This is just
an example; it may be that in practice deciding when to re-translate code would
cost enough that it just isn't worth it in the first place.
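
As a very rough illustration of the JSR point above, assuming direct access to
the emulated memory and the S register (the function and parameter names here
are made up for the example, not lib6502's real interface):

    #include <cstdint>

    // What an inlined JSR could do: write the return address where a genuine
    // push would land, so code which later peeks at the stack still sees it,
    // but leave S untouched so the callee's RTS can be emitted as a no-op.
    void applyInlinedJsrEffect(uint8_t *memory, uint8_t s, uint16_t jsr_address)
    {
        // A real 6502 JSR pushes the address of its own last byte (opcode + 2).
        uint16_t pushed = jsr_address + 2;
        memory[0x0100 + s] = uint8_t(pushed >> 8);                  // high byte
        memory[0x0100 + ((s - 1) & 0xff)] = uint8_t(pushed & 0xff); // low byte
        // S is deliberately not decremented; execution falls straight through
        // into the inlined callee.
    }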

We could add support for counting the number of cycles executed by the JITted
code; lib6502 itself has some support for this in the form of the tick* macros,
but they don't do anything by default.
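
On the JIT side this could be as simple as having the translated code for each
instruction bump a counter by that opcode's base cycle cost. A minimal sketch,
using the LLVM 3.4-era IRBuilder API and an illustrative global counter (none
of these names exist in lib6502-jit):

    #include <cstdint>
    #include "llvm/IR/GlobalVariable.h"
    #include "llvm/IR/IRBuilder.h"

    // Emit "cycle_count += base_cycles" into the code being generated for one
    // 6502 instruction.  Penalties such as page crossings are ignored here.
    static void emitCycleCount(llvm::IRBuilder<> &builder,
                               llvm::GlobalVariable *cycle_count,
                               uint64_t base_cycles)
    {
        llvm::Value *old_count = builder.CreateLoad(cycle_count);
        llvm::Value *new_count =
            builder.CreateAdd(old_count, builder.getInt64(base_cycles));
        builder.CreateStore(new_count, cycle_count);
    }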

Would there be any performance improvement to be had by having Function objects
(tail) call one another where possible?
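
For reference, a sketch of the kind of call this would mean emitting, again
with illustrative names only (and note that the "tail" marker is merely a
hint; guaranteed tail calls generally need fastcc and compatible signatures):

    #include <vector>
    #include "llvm/IR/Function.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Instructions.h"

    // Transfer control from one translated Function straight to another
    // instead of returning to the dispatch loop.
    static void emitTailCallTo(llvm::IRBuilder<> &builder,
                               llvm::Function *callee,
                               llvm::Value *mpu_state)
    {
        std::vector<llvm::Value *> args(1, mpu_state);
        llvm::CallInst *call = builder.CreateCall(callee, args);
        call->setTailCall(true);
        builder.CreateRetVoid();  // the call must be immediately followed by ret
    }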

Hybrid mode currently makes no attempt to avoid re-generating Function objects
which are continually being invalidated due to self-modifying code. It might be
nice if some heuristic caused us to avoid this unnecessary work and just let
the interpreter always handle that code.
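
One possible heuristic, purely as a sketch (nothing here corresponds to
existing lib6502-jit code): count invalidations per Function start address and
give up re-JITting once a threshold is passed.

    #include <cstdint>
    #include <map>

    class InvalidationTracker
    {
    public:
        void noteInvalidation(uint16_t address) { ++counts_[address]; }

        // True once this address has been invalidated often enough that it is
        // probably better left to the interpreter for good.
        bool leaveToInterpreter(uint16_t address) const
        {
            std::map<uint16_t, uint32_t>::const_iterator it =
                counts_.find(address);
            return (it != counts_.end()) && (it->second >= threshold_);
        }

    private:
        static const uint32_t threshold_ = 16;  // arbitrary; would need tuning
        std::map<uint16_t, uint32_t> counts_;
    };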

On a related but distinct note, currently once an element of
FunctionManager::code_at_address_ is set, it is never cleared. This might cause
us to avoid optimistic writes which in reality would be OK. We could use some
heuristic to decide when to destroy Function objects which have not been
executed in a long time, and start clearing code_at_address_ elements when all
functions covering an address are removed. (See the note in
FunctionManager::destroyFunction(); this clearing must be done *outside* the
loop in FunctionManager::buildFunction(), or the implementation of
buildFunction() must be tweaked.)

However, it may be that it just isn't worth being that clever. Any such code
would need to be triggered inside the main loop between executions of Function
objects. We could do it only every nth time, and keeping track of how many
times we've been round the loop probably wouldn't significantly harm
performance, but we'd need to be careful.
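
As a shape for that periodic check, everything below is hypothetical: the
FunctionManager methods named here do not exist and are only meant to show
where the work would sit relative to the main loop.

    #include <cstdint>

    struct FunctionManagerSketch
    {
        // Hypothetical: drop Functions which haven't run for a while.
        void destroyIdleFunctions() {}
        // Hypothetical: clear code_at_address_ entries no longer covered by
        // any live Function.
        void clearUncoveredCodeAtAddressEntries() {}
    };

    // Called between executions of Function objects in the main loop.
    void maybeCleanUp(FunctionManagerSketch &manager, uint64_t &iterations)
    {
        const uint64_t cleanup_interval = 4096;  // arbitrary; needs tuning
        if (++iterations % cleanup_interval == 0)
        {
            manager.destroyIdleFunctions();
            manager.clearUncoveredCodeAtAddressEntries();
        }
    }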

Would a different default value for max_instructions be better?

Are there any other LLVM optimisation passes which would be helpful?
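
A few standard candidates that might be worth benchmarking, assuming the
LLVM 3.4-era per-function pass manager (whether any of them actually pays off
on this kind of generated code is exactly the open question):

    #include "llvm/PassManager.h"
    #include "llvm/Transforms/Scalar.h"

    static void addCandidatePasses(llvm::FunctionPassManager &fpm)
    {
        fpm.add(llvm::createEarlyCSEPass());
        fpm.add(llvm::createInstructionCombiningPass());
        fpm.add(llvm::createGVNPass());
        fpm.add(llvm::createCFGSimplificationPass());
    }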