gbeauche
45df157a5e
Implement lazy icache range invalidation. Disable for now until it shows
a real benefit of more than 2%
2003-11-21 14:20:01 +00:00
gbeauche
3630404307
fix fp_do_sgn1() for "double"-targets
2003-11-21 13:27:47 +00:00
gbeauche
309c2f0bd5
Add "jitblacklist" prefs item so that opcode ranges can be excluded from
translation. This should help debugging of (badly) translated code.
Usage: jitblacklist xxxx(-yyyy)?(;xxxx(-yyyy)?)*
where xxxx/yyyy are hexadecimal numbers
2003-10-14 10:29:19 +00:00
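
The usage pattern given in the commit above, `xxxx(-yyyy)?(;xxxx(-yyyy)?)*`, can be parsed roughly as follows. This is a minimal sketch, not the Basilisk II implementation: `parse_jit_blacklist`, `parse_hex16`, and `OpcodeRange` are hypothetical names, and the 16-bit limit and the rejection of reversed ranges are assumptions made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: an inclusive [first, last] range of 16-bit opcodes.
using OpcodeRange = std::pair<uint16_t, uint16_t>;

// Reads one hexadecimal number starting at pos; advances pos past it.
// Fails if no digit is present or the value exceeds 16 bits (assumption).
static bool parse_hex16(const std::string &s, std::size_t &pos, uint16_t &out)
{
    const std::size_t start = pos;
    uint32_t value = 0;
    while (pos < s.size() && std::isxdigit(static_cast<unsigned char>(s[pos]))) {
        const unsigned char c = static_cast<unsigned char>(s[pos++]);
        const int digit = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
        value = value * 16 + static_cast<uint32_t>(digit);
        if (value > 0xffff)
            return false;
    }
    if (pos == start)
        return false;
    out = static_cast<uint16_t>(value);
    return true;
}

// Parses xxxx(-yyyy)?(;xxxx(-yyyy)?)* into a list of inclusive ranges.
// Returns false on malformed input; ranges may then be partially filled.
bool parse_jit_blacklist(const std::string &spec, std::vector<OpcodeRange> &ranges)
{
    ranges.clear();
    std::size_t pos = 0;
    for (;;) {
        uint16_t first = 0;
        if (!parse_hex16(spec, pos, first))
            return false;
        uint16_t last = first;              // a single opcode is a one-element range
        if (pos < spec.size() && spec[pos] == '-') {
            ++pos;
            if (!parse_hex16(spec, pos, last))
                return false;
        }
        if (last < first)                   // assumption: reversed ranges are rejected
            return false;
        ranges.push_back({first, last});
        if (pos == spec.size())
            return true;
        if (spec[pos] != ';')
            return false;
        ++pos;                              // skip the separator, expect another item
    }
}
```

With this sketch, a value such as `4e75;f200-f2ff` would yield two ranges, the first covering a single opcode.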
gbeauche
b66f5972f9
Make sure a 32-bit B2/JIT works reasonably well on AMD64 too. This requires
forcing RAMBaseHost < 0x80000000. This is empirically determined to work on
Linux/x86 and Linux/amd64.
2003-10-03 18:18:15 +00:00
gbeauche
87e4d48b3e
flags are live after a call to fflags_into_flags_internal()
2003-10-02 09:51:14 +00:00
gbeauche
c464b19f06
get a chance to see some illegal instruction variants if we ever come to
encounter them.
2003-10-02 09:48:10 +00:00
gbeauche
f3ad33ed58
Call correct PUSHF/POPF macro
2003-06-03 09:01:03 +00:00
gbeauche
04167990a6
workaround a compiler bug on SPARC (Milan)
2003-05-28 10:17:43 +00:00
gbeauche
ccdec0782b
really make long double values (Milan)
2003-05-28 10:14:14 +00:00
gbeauche
3863961d26
- Fix "extended register" predicate to exclude X86_NOREG and X86_RIP
- Really handle requested 32-bit absolute address in AMD64 target
- Fix REX prefixes in 16-bit ALU instructions
- Fix POPF, remove useless? POPFD and PUSHFD
2003-05-19 17:15:17 +00:00
nigel
21c4e9da5b
Building on GCC 2 causes errors:
../uae_cpu/gencpu.c: In function `void gen_opcode(long unsigned int)':
../uae_cpu/gencpu.c:874: conversion from `unsigned int' to `enum wordsizes'
../uae_cpu/gencpu.c:875: conversion from `unsigned int' to `enum amodes'
due to mismatching of types in struct instr and types in function prototypes.
However, this only started happening recently and I don't know why :-(
2003-04-01 05:26:07 +00:00
gbeauche
9ed554b3a9
Remove some dead code. Start implementation of optimized calls to interpretive
fallbacks for untranslatable instruction handlers. Disabled for now since
call_m_01() is not correctly implemented yet.
2003-03-21 19:12:44 +00:00
gbeauche
b48a5a3253
Detect x86-64
2003-03-20 13:49:49 +00:00
gbeauche
96ae75cd7e
Optimize TEST[BWLQ]ir case where dest register is %rax
Add JCCSii and JCCii which directly take the displacement value to encode
2003-03-19 17:06:22 +00:00
gbeauche
ecab19aa4e
Emulate CMOV in the new code generator for processors that don't support
this instruction
2003-03-19 17:05:02 +00:00
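
The commit above does not show how CMOV is emulated. One common approach is a short conditional jump with the inverted condition that skips over a plain MOV, sketched below for 32-bit register-to-register moves. `emit_emulated_cmov` is a hypothetical name and the byte vector stands in for the real code buffer; this is an illustration under those assumptions, not the code generator's actual routine.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative sketch only: emits
//     Jcc(inverted) +2
//     MOV dst, src
// in place of CMOVcc dst, src. Neither Jcc nor MOV modifies EFLAGS, so the
// flags seen by later instructions are unchanged.
// cc is the 4-bit x86 condition code; dst and src are register numbers 0..7.
void emit_emulated_cmov(std::vector<uint8_t> &code,
                        unsigned cc, unsigned dst, unsigned src)
{
    assert(cc < 16 && dst < 8 && src < 8);
    // Short Jcc is 0x70 + cc; flipping the low bit of cc inverts the condition.
    code.push_back(static_cast<uint8_t>(0x70 | (cc ^ 1)));
    // rel8 displacement: skip the two-byte MOV that follows.
    code.push_back(0x02);
    // MOV r32, r/m32 (opcode 0x8B) with a register-direct ModRM byte.
    code.push_back(0x8B);
    code.push_back(static_cast<uint8_t>(0xC0 | (dst << 3) | src));
}
```

The emulated sequence is four bytes against three for a register-to-register CMOVcc, and it reintroduces a branch, which is presumably why it is used only on processors that lack the instruction.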
gbeauche
06af072a40
Add missing wrappers of the new runtime-assembler primitives
2003-03-19 16:32:51 +00:00
gbeauche
a3b815366a
Add facility to filter out some opcodes from the compfunctbl[] et al.
2003-03-19 16:28:23 +00:00
gbeauche
547bd6ab2c
Fix MOVBrr
2003-03-19 16:25:12 +00:00
gbeauche
c4bf8e0695
Fix 0(%rbp,<reg>,1) operand encoding
2003-03-19 11:34:10 +00:00
gbeauche
da8d81509e
Add new backend, disabled for now until it's proofread and fully functional
Remove obsolete string-related instructions
2003-03-18 17:26:32 +00:00
gbeauche
5fb74e3592
Add sign/zero-extend instructions
2003-03-18 17:01:44 +00:00
gbeauche
29f636c2eb
Fix _REXBmr(). Add CPUID. Some C++ compiler fixes. Make x86_emit_failure()
be void, and let x86_emit_failure0() be an int expression instead.
2003-03-18 16:28:23 +00:00
gbeauche
8271c0503e
Add CMOV and BSF/BSR instructions
2003-03-18 13:12:56 +00:00
gbeauche
e07bfdbc8b
Handle absolute and RIP addressing modes in x86-64
2003-03-18 10:08:16 +00:00
gbeauche
ce3d90ff5e
clobber "cc" for flags, not "flags". Thanks Milan for noticing it.
2003-03-17 22:37:55 +00:00
gbeauche
08e9f936eb
Add some SSE/SSE2 instructions
2003-03-17 17:18:24 +00:00
gbeauche
c2566295af
Implement a generic setzflg_l() for P4, thus permitting to re-enable
translation of ADDX/SUBX/BCLR/BTST/BSET/BCHG instructions. i.e. make
it faster. ;-)
2003-03-13 20:34:34 +00:00
gbeauche
0cfa3126b3
Workaround change in flags handling for BSF instruction on Pentium 4.
i.e. currently disable translation of ADDX/SUBX/B<CHG,CLR,SET,TST> instructions
in that case. That is to say, better (much?) slower than inaccurate. :-(
2003-03-13 15:57:01 +00:00
gbeauche
a8e76deb69
Fix align_target with a padding of 0 bytes
2003-03-13 09:51:31 +00:00
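
The log does not say what went wrong in align_target when the padding was 0 bytes. As a generic illustration of the edge case, the sketch below computes the filler size so that an already aligned address yields 0 rather than a full alignment unit. `padding_for` is a hypothetical helper, and `align` is assumed to be a power of two.

```cpp
#include <cstdint>

// Hypothetical helper: number of filler bytes needed to advance addr to the
// next multiple of align. align must be a power of two (assumption).
// The final mask maps the "already aligned" case to 0 instead of align.
inline std::uintptr_t padding_for(std::uintptr_t addr, std::uintptr_t align)
{
    return (align - (addr & (align - 1))) & (align - 1);
}
```

A caller that emits filler bytes would then skip emission entirely when the result is 0.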
gbeauche
45289042e6
Add some FPU instructions. Minor clean-ups.
2003-01-31 23:48:10 +00:00
gbeauche
ee7cea923a
Add new run-time assembler derived from GNU lightning. It is suitable for
both i386 and x86-64 architectures. Still needs some work (see TODO) and
an actual glue to the JIT backend.
Original work is LGPL, but per section 3 of this license, I opt for GPL v2
for Basilisk II purposes.
2003-01-31 20:39:53 +00:00
gbeauche
144e6f4e87
Use old x87 FPU stack on x86-64 too because we now use long doubles there for
better accuracy. Aka. prefer compatibility over speed.
2002-11-16 15:28:25 +00:00
gbeauche
bc5d7f9490
OPTIMIZED_FLAGS for x86-64 with the pushf/pop method since sahf/lahf are
invalid in long mode.
2002-11-05 11:59:12 +00:00
gbeauche
0a201217bf
Remove obsolete CFLOW_* constants but keep cpuop_{begin,end} for an
inline-threaded core.
2002-11-02 18:13:29 +00:00
gbeauche
2cda26edae
Fix buffer overflow reported by Aranym people
2002-11-02 17:23:20 +00:00
gbeauche
2cb7e02c9e
Some instructions assume offsets are only 1-byte long. I don't think this
is 100% correct. Therefore, insert some asserts that would fail otherwise.
2002-10-13 11:14:24 +00:00
gbeauche
aa6b264d21
Add raw_emit_nop_filler() with more efficient no-op fillers stolen from
GNU binutils 2.12.90.0.15. Speed bump is marginal (less than 6%). Make it
default though, that's conditionalized by tune_nop_fillers constant.
2002-10-12 16:27:13 +00:00
gbeauche
78ac3e667f
Don't forget to note CPU detection code mostly comes from Linux kernel.
2002-10-03 16:16:57 +00:00
gbeauche
d4ed937de6
JIT: add copyright notices just to notify people that it's a real derivative
work from GPL code (UAE-JIT). Additions and improvements are from B2
developers.
2002-10-03 16:13:46 +00:00
gbeauche
8de7ad1091
- Turn on runtime detection of loop and jump alignment as Aranym people
reported they got some improvement with it and larger loops. Small
loops are an issue for now until unrolling is implemented for DBcc.
- Const jumps are identified in readcpu. I don't want to duplicate code
uselessly. Rather, it's the JIT job to know whether we are doing block
inlining and un-marking those instructions as end-of-block.
2002-10-03 15:05:01 +00:00
gbeauche
a60c6da7c3
Turn on block inlining so that people can test this feature and report
whether they gain something or whether it renders the JIT less stable.
2002-10-03 15:01:53 +00:00
gbeauche
724516511a
Do translate BSR.L, we don't have any issue with that even if we are
doing block inlining since we have a complete chain of information about
the blocks to checksum.
2002-10-03 14:59:35 +00:00
gbeauche
e11dd3d375
Do translate FMUL instructions: the core needs to be fixed, and the problem
is not the translation of that instruction. I believe this is related to some
misgeneration of FPU core sequence and allocation of FP registers?
2002-10-03 14:58:02 +00:00
gbeauche
e9584dbcc1
Add PROFILE_UNTRANSLATED_INSNS information. Interestingly, the following
are the bottleneck now: DIVS, BSR.L (why isn't it translated yet?),
bit-field instructions (I need to self-motivate enough for that), and
A-Traps.
2002-10-02 16:22:51 +00:00
gbeauche
94a9038826
- Remove dead code in readcpu.cpp concerning CONST_JUMP control flow.
- Replace unused fl_compiled with fl_const_jump
- Implement block inlining enabled with USE_INLINING && USE_CHECKSUM_INFO.
However, this is currently disabled as it doesn't give much and exposes
even more of a cache/code generation problem with FPU JIT compiled code.
- Actual checksum values are now an integral part of a blockinfo regardless
of whether USE_CHECKSUM_INFO is set or not. This reduces the number of elements
in that structure and slightly speeds up checksum calculation for chained blocks.
- Don't care about show_checksum() for now.
2002-10-02 15:55:10 +00:00
gbeauche
21909f1eed
- Rewrite blockinfo allocator et al. Use a template class so that this
can work with other types related to blockinfos.
- Add new method to compute checksums. This should permit code inlining
and follow-ups of const_jumps without breaking the lazy cache invalidator,
aka. chain infos for checksumming. TODO: Incomplete support, thus disabled.
2002-10-01 16:22:36 +00:00
gbeauche
75de104c92
- Optimize use of quit_program variable. This is a real boolean for B2.
- Remove unused/dead code concerning surroundings of (debugging).
- m68k_compile_execute() is generated and optimized code now.
2002-10-01 09:39:55 +00:00
gbeauche
bdf9d76bb8
- #include "flags_x86.h" here to get NATIVE_CC_?? helper macros
- Add raw_cmp_b_mi() and raw_call_m_indexed() for generated
m68k_compile_execute() function
2002-10-01 09:37:03 +00:00
gbeauche
8748b48b7a
Disable USE_QUAD_DOUBLE for now and probably for good as (i) the emulator
implementation is not correct, (ii) I don't know of any CPU which
handles this kind of format *natively* with conformance to IEEE.
2002-09-20 16:52:48 +00:00
gbeauche
ec92457d68
Fix align_jumps for athlon, that's really "16" and gcc-3.2 sources contained
the same error. ;-)
2002-09-20 14:55:50 +00:00