121 Commits

Author SHA1 Message Date
gbeauche
45df157a5e Implement lazy icache range invalidation. Disabled for now until it shows
a real benefit of more than 2%
2003-11-21 14:20:01 +00:00
gbeauche
3630404307 fix fp_do_sgn1() for "double"-targets 2003-11-21 13:27:47 +00:00
gbeauche
309c2f0bd5 Add "jitblacklist" prefs item so that opcode ranges can be excluded from
translation. This should help debugging of (badly) translated code.

Usage: jitblacklist xxxx(-yyyy)?(;xxxx(-yyyy)?)*
where xxxx/yyyy are hexadecimal numbers
2003-10-14 10:29:19 +00:00
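
The usage grammar above can be sketched as a small parser. This is only an illustration, not the Basilisk II code: the names parse_jit_blacklist and OpcodeRange are hypothetical, and the 0xFFFF upper bound assumes that the numbers are 16-bit m68k opcode words.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: an inclusive [first, last] range of opcodes.
using OpcodeRange = std::pair<uint32_t, uint32_t>;

// Read one hexadecimal number starting at pos. Returns false if no digit is
// found or if the value exceeds 0xFFFF (assumption: 16-bit opcode words).
static bool parse_hex(const std::string &s, size_t &pos, uint32_t &value)
{
    const size_t start = pos;
    value = 0;
    while (pos < s.size() && std::isxdigit(static_cast<unsigned char>(s[pos]))) {
        const int c = std::tolower(static_cast<unsigned char>(s[pos++]));
        const uint32_t digit = (c >= '0' && c <= '9') ? c - '0' : c - 'a' + 10;
        value = value * 16 + digit;
        if (value > 0xFFFF)
            return false;
    }
    return pos > start;
}

// Parse "xxxx(-yyyy)?(;xxxx(-yyyy)?)*" into a list of inclusive ranges.
// Returns false on malformed input (empty item, trailing ';', yyyy < xxxx).
bool parse_jit_blacklist(const std::string &spec, std::vector<OpcodeRange> &out)
{
    out.clear();
    size_t pos = 0;
    for (;;) {
        uint32_t first = 0, last = 0;
        if (!parse_hex(spec, pos, first))
            return false;
        last = first;                       // single opcode: xxxx
        if (pos < spec.size() && spec[pos] == '-') {
            ++pos;                          // range: xxxx-yyyy
            if (!parse_hex(spec, pos, last))
                return false;
        }
        if (last < first)
            return false;
        out.push_back({first, last});
        if (pos == spec.size())
            return true;
        if (spec[pos] != ';')
            return false;
        ++pos;
    }
}
```

With this sketch, "4e71;f200-f2ff" yields two ranges, the single opcode 0x4e71 and the range 0xf200 to 0xf2ff.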
gbeauche
b66f5972f9 Make sure a 32-bit B2/JIT works reasonably well on AMD64 too. This requires
forcing RAMBaseHost < 0x80000000. This is empirically determined to work on
Linux/x86 and Linux/amd64.
2003-10-03 18:18:15 +00:00
gbeauche
87e4d48b3e flags are live after a call to fflags_into_flags_internal() 2003-10-02 09:51:14 +00:00
gbeauche
c464b19f06 get a chance to see some illegal instruction variants if we ever come to
encounter them.
2003-10-02 09:48:10 +00:00
gbeauche
f3ad33ed58 Call correct PUSHF/POPF macro 2003-06-03 09:01:03 +00:00
gbeauche
04167990a6 workaround a compiler bug on SPARC (Milan) 2003-05-28 10:17:43 +00:00
gbeauche
ccdec0782b really make long double values (Milan) 2003-05-28 10:14:14 +00:00
gbeauche
3863961d26 - Fix "extended register" predicate to exclude X86_NOREG and X86_RIP
- Really handle requested 32-bit absolute address in AMD64 target
- Fix REX prefixes in 16-bit ALU instructions
- Fix POPF, remove useless? POPFD and PUSHFD
2003-05-19 17:15:17 +00:00
nigel
21c4e9da5b Building on GCC 2 causes errors:
../uae_cpu/gencpu.c: In function `void gen_opcode(long unsigned int)':
../uae_cpu/gencpu.c:874: conversion from `unsigned int' to `enum wordsizes'
../uae_cpu/gencpu.c:875: conversion from `unsigned int' to `enum amodes'
due to mismatching of types in struct instr and types in function prototypes.
However, this only started happening recently and I don't know why :-(
2003-04-01 05:26:07 +00:00
gbeauche
9ed554b3a9 Remove some dead code. Start implementation of optimized calls to interpretive
fallbacks for untranslatable instruction handlers. Disabled for now since
call_m_01() is not correctly implemented yet.
2003-03-21 19:12:44 +00:00
gbeauche
b48a5a3253 Detect x86-64 2003-03-20 13:49:49 +00:00
gbeauche
96ae75cd7e Optimize TEST[BWLQ]ir case where dest register is %rax
Add JCCSii and JCCii, which directly take the displacement value to encode
2003-03-19 17:06:22 +00:00
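
The %rax special case for TEST with an immediate can be sketched for the 32-bit ("L") variant as follows. x86 has a short form, A9 id, when the destination is the accumulator, whereas the general form is F7 /0 id, one byte longer. The function name emit_test_l_ir is a hypothetical stand-in for the macro in the commit, and the sketch only handles registers 0-7 (no REX prefix).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical emitter: append "testl $imm, %reg" to code.
// reg is an x86 register number 0..7 (0 = %eax); REX prefixes are not handled.
void emit_test_l_ir(std::vector<uint8_t> &code, uint32_t imm, uint8_t reg)
{
    if (reg == 0) {
        code.push_back(0xA9);                                   // TEST EAX, imm32
    } else {
        code.push_back(0xF7);                                   // TEST r/m32, imm32
        code.push_back(static_cast<uint8_t>(0xC0 | (reg & 7))); // ModRM: mod=11, /0, rm=reg
    }
    for (int i = 0; i < 4; ++i)                                 // little-endian immediate
        code.push_back(static_cast<uint8_t>(imm >> (8 * i)));
}
```

The accumulator form takes 5 bytes and the general form 6 bytes, which is the saving such an optimization is after.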
gbeauche
ecab19aa4e Emulate CMOV in the new code generator for processors that don't support
this instruction
2003-03-19 17:05:02 +00:00
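
One common way to emulate CMOV is to jump over a plain MOV when the inverted condition holds. The sketch below shows that technique for 32-bit register-to-register moves; it is not taken from the commit, and emit_cmov_l_rr and have_cmov are hypothetical names. It relies on the x86 convention that flipping the low bit of a condition code inverts the condition.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical emitter: append "cmovcc %src, %dst" (32-bit, registers 0..7).
// cc is the 4-bit x86 condition code (e.g. 4 = E/Z, 5 = NE/NZ).
// If have_cmov is false, emit the equivalent "jcc-inverted +2; movl %src, %dst".
void emit_cmov_l_rr(std::vector<uint8_t> &code, bool have_cmov,
                    uint8_t cc, uint8_t dst, uint8_t src)
{
    // ModRM with mod=11: reg field holds the destination, rm the source.
    const uint8_t modrm = static_cast<uint8_t>(0xC0 | ((dst & 7) << 3) | (src & 7));
    if (have_cmov) {
        code.push_back(0x0F);
        code.push_back(static_cast<uint8_t>(0x40 | (cc & 15)));       // CMOVcc r32, r/m32
        code.push_back(modrm);
    } else {
        code.push_back(static_cast<uint8_t>(0x70 | ((cc & 15) ^ 1))); // Jcc rel8, inverted
        code.push_back(2);                                            // skip the 2-byte MOV
        code.push_back(0x8B);                                         // MOV r32, r/m32
        code.push_back(modrm);
    }
}
```

For example, "cmove %ecx, %eax" becomes 0F 44 C1 natively, or 75 02 8B C1 (jne over a movl) in the fallback. The fallback leaves the flags untouched, as CMOV does.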
gbeauche
06af072a40 Add missing wrappers of the new runtime-assembler primitives 2003-03-19 16:32:51 +00:00
gbeauche
a3b815366a Add facility to filter out some opcodes from the compfunctbl[] et al. 2003-03-19 16:28:23 +00:00
gbeauche
547bd6ab2c Fix MOVBrr 2003-03-19 16:25:12 +00:00
gbeauche
c4bf8e0695 Fix 0(%rbp,<reg>,1) operand encoding 2003-03-19 11:34:10 +00:00
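
Some background on why this operand needs care: in the ModRM/SIB encoding, a base field of 101 (%rbp) combined with mod=00 means "no base, 32-bit displacement", so 0(%rbp,<reg>,1) has to be encoded with mod=01 and an explicit zero disp8. The following is a minimal sketch of that rule, not the code from the commit; emit_modrm_sib is a hypothetical name, only the low three bits of each register are encoded (REX is left to the caller), and the index must not be %rsp.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: append ModRM, SIB and displacement bytes for the
// memory operand disp(base,index,1), with reg as the ModRM reg field.
void emit_modrm_sib(std::vector<uint8_t> &code, uint8_t reg,
                    uint8_t base, uint8_t index, int32_t disp)
{
    const bool base_is_bp = (base & 7) == 5;   // %rbp (and %r13 with REX.B)
    uint8_t mod;
    if (disp == 0 && !base_is_bp)
        mod = 0;                               // no displacement byte
    else if (disp >= -128 && disp <= 127)
        mod = 1;                               // disp8, also used for 0(%rbp,...)
    else
        mod = 2;                               // disp32

    code.push_back(static_cast<uint8_t>((mod << 6) | ((reg & 7) << 3) | 4)); // rm=100: SIB follows
    code.push_back(static_cast<uint8_t>(((index & 7) << 3) | (base & 7)));   // scale=1 (bits 00)

    const uint32_t u = static_cast<uint32_t>(disp);
    if (mod == 1) {
        code.push_back(static_cast<uint8_t>(u));
    } else if (mod == 2) {
        for (int i = 0; i < 4; ++i)
            code.push_back(static_cast<uint8_t>(u >> (8 * i)));
    }
}
```

With reg=%eax and index=%rcx, a %rbx base and zero displacement gives 04 0B, while a %rbp base gives 44 0D 00.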
gbeauche
da8d81509e Add new backend, disabled until it's proofread and fully functional
Remove obsolete string-related instructions
2003-03-18 17:26:32 +00:00
gbeauche
5fb74e3592 Add sign/zero-extend instructions 2003-03-18 17:01:44 +00:00
gbeauche
29f636c2eb Fix _REXBmr(). Add CPUID. Some C++ compiler fixes. Make x86_emit_failure()
be void, and let x86_emit_failure0() be an int expression instead.
2003-03-18 16:28:23 +00:00
gbeauche
8271c0503e Add CMOV and BSF/BSR instructions 2003-03-18 13:12:56 +00:00
gbeauche
e07bfdbc8b Handle absolute and RIP addressing modes in x86-64 2003-03-18 10:08:16 +00:00
gbeauche
ce3d90ff5e clobber "cc" for flags, not "flags". Thanks Milan for noticing it. 2003-03-17 22:37:55 +00:00
gbeauche
08e9f936eb Add some SSE/SSE2 instructions 2003-03-17 17:18:24 +00:00
gbeauche
c2566295af Implement a generic setzflg_l() for P4, making it possible to re-enable
translation of ADDX/SUBX/BCLR/BTST/BSET/BCHG instructions, i.e. make
it faster. ;-)
2003-03-13 20:34:34 +00:00
gbeauche
0cfa3126b3 Workaround change in flags handling for BSF instruction on Pentium 4.
i.e. currently disable translation of ADDX/SUBX/B<CHG,CLR,SET,TST> instructions
in that case. That is to say, better (much?) slower than inaccurate. :-(
2003-03-13 15:57:01 +00:00
gbeauche
a8e76deb69 Fix align_target with a padding of 0 bytes 2003-03-13 09:51:31 +00:00
gbeauche
45289042e6 Add some FPU instructions. Minor clean-ups. 2003-01-31 23:48:10 +00:00
gbeauche
ee7cea923a Add new run-time assembler derived from GNU lightning. It is suitable for
both i386 and x86-64 architectures. Still needs some work (see TODO) and
an actual glue to the JIT backend.

Original work is LGPL, but per section 3 of this license, I opt for GPL v2
for Basilisk II purposes.
2003-01-31 20:39:53 +00:00
gbeauche
144e6f4e87 Use old x87 FPU stack on x86-64 too because we now use long doubles there for
better accuracy. Aka. prefer compatibility over speed.
2002-11-16 15:28:25 +00:00
gbeauche
bc5d7f9490 OPTIMIZED_FLAGS for x86-64 with the pushf/pop method since sahf/lahf are
invalid in long mode.
2002-11-05 11:59:12 +00:00
gbeauche
0a201217bf Remove obsolete CFLOW_* constants but keep cpuop_{begin,end} for an
inline-threaded core.
2002-11-02 18:13:29 +00:00
gbeauche
2cda26edae Fix buffer overflow reported by Aranym people 2002-11-02 17:23:20 +00:00
gbeauche
2cb7e02c9e Some instructions assume offsets are only 1-byte long. I don't think this
is 100% correct. Therefore, insert some asserts that would fail in that case.
2002-10-13 11:14:24 +00:00
gbeauche
aa6b264d21 Add raw_emit_nop_filler() with more efficient no-op fillers stolen from
GNU binutils 2.12.90.0.15. Speed bump is marginal (less than 6%). Make it
the default anyway; it is controlled by the tune_nop_fillers constant.
2002-10-12 16:27:13 +00:00
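
The idea behind multi-byte no-op fillers can be sketched as follows: instead of padding with n one-byte NOPs, pad with fewer, longer instructions that have no effect. This is only an illustration for 32-bit code, not raw_emit_nop_filler() itself. The name emit_nop_filler is hypothetical, and whether the commit uses exactly these byte patterns is an assumption; real assemblers also have longer patterns than the three shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical filler: append n bytes of no-op instructions, using the
// longest available pattern first. Valid for i386 code only; in 64-bit
// mode the 3-byte pattern would clear the upper half of %rsi.
void emit_nop_filler(std::vector<uint8_t> &code, size_t n)
{
    static const uint8_t nop1[] = {0x90};              // nop
    static const uint8_t nop2[] = {0x89, 0xF6};        // movl %esi,%esi
    static const uint8_t nop3[] = {0x8D, 0x76, 0x00};  // leal 0(%esi),%esi
    static const uint8_t *const patterns[] = {nop1, nop2, nop3};

    while (n > 0) {
        const size_t len = n < 3 ? n : 3;              // longest pattern that fits
        code.insert(code.end(), patterns[len - 1], patterns[len - 1] + len);
        n -= len;
    }
}
```

Padding 5 bytes this way emits two instructions (a 3-byte leal and a 2-byte movl) rather than five single-byte NOPs, so the CPU has fewer instructions to decode.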
gbeauche
78ac3e667f Don't forget to note CPU detection code mostly comes from Linux kernel. 2002-10-03 16:16:57 +00:00
gbeauche
d4ed937de6 JIT: add copyright notices to make clear to people that this is a real
derivative work of GPL code (UAE-JIT). Additions and improvements are from B2
developers.
2002-10-03 16:13:46 +00:00
gbeauche
8de7ad1091 - Turn on runtime detection of loop and jump alignment as Aranym people
  reported they got some improvement with it and larger loops. Small
  loops are an issue for now until unrolling is implemented for DBcc.
- Const jumps are identified in readcpu. I don't want to duplicate code
  uselessly. Rather, it's the JIT job to know whether we are doing block
  inlining and un-marking those instructions as end-of-block.
2002-10-03 15:05:01 +00:00
gbeauche
a60c6da7c3 Turn on block inlining so that people can test this feature and report
whether they gain something or whether it renders the JIT less stable.
2002-10-03 15:01:53 +00:00
gbeauche
724516511a Do translate BSR.L, we don't have any issue with that even if we are
doing block inlining since we have a complete chain of information about
the blocks to checksum.
2002-10-03 14:59:35 +00:00
gbeauche
e11dd3d375 Do translate FMUL instructions; the core needs to be fixed, and the problem
is not the translation of that instruction. I believe this is related to some
misgeneration of FPU core sequence and allocation of FP registers?
2002-10-03 14:58:02 +00:00
gbeauche
e9584dbcc1 Add PROFILE_UNTRANSLATED_INSNS information. Interestingly, the following
are the bottleneck now: DIVS, BSR.L (why isn't it translated yet?),
bit-field instructions (I need to self-motivate enough for that), and
A-Traps.
2002-10-02 16:22:51 +00:00
gbeauche
94a9038826 - Remove dead code in readcpu.cpp concerning CONST_JUMP control flow.
- Replace unused fl_compiled with fl_const_jump
- Implement block inlining enabled with USE_INLINING && USE_CHECKSUM_INFO.
  However, this is currently disabled as it doesn't give much and exhibits
  even more a cache/code generation problem with FPU JIT compiled code.
- Actual checksum values are now an integral part of a blockinfo regardless
  of whether USE_CHECKSUM_INFO is set. This reduces the number of elements in
  that structure and slightly speeds up checksum calculation for chained blocks.
- Don't care about show_checksum() for now.
2002-10-02 15:55:10 +00:00
gbeauche
21909f1eed - Rewrite blockinfo allocator et al. Use a template class so that this
  can work with other types related to blockinfos.
- Add new method to compute checksums. This should permit code inlining
  and follow-ups of const_jumps without breaking the lazy cache invalidator,
  aka. chain infos for checksumming. TODO: Incomplete support thus disabled.
2002-10-01 16:22:36 +00:00
gbeauche
75de104c92 - Optimize use of quit_program variable. This is a real boolean for B2.
- Remove unused/dead code concerning surroundings of (debugging).
- m68k_compile_execute() is generated and optimized code now.
2002-10-01 09:39:55 +00:00
gbeauche
bdf9d76bb8 - #include "flags_x86.h" here to get NATIVE_CC_?? helper macros
- Add raw_cmp_b_mi() and raw_call_m_indexed() for generated
  m68k_compile_execute() function
2002-10-01 09:37:03 +00:00
gbeauche
8748b48b7a Disable USE_QUAD_DOUBLE for now and probably for good as (i) the emulator
implementation is not correct, (ii) I don't know of any CPU which
handles this kind of format *natively* with conformance to IEEE.
2002-09-20 16:52:48 +00:00
gbeauche
ec92457d68 Fix align_jumps for athlon, that's really "16" and gcc-3.2 sources contained
the same error. ;-)
2002-09-20 14:55:50 +00:00