gbeauche
b48a5a3253
Detect x86-64
2003-03-20 13:49:49 +00:00
gbeauche
96ae75cd7e
Optimize TEST[BWLQ]ir case where dest register is %rax
Add JCCSii and JCCii which directly take the displacement value to encode
2003-03-19 17:06:22 +00:00
gbeauche
ecab19aa4e
Emulate CMOV in the new code generator for processors that don't support
this instruction
2003-03-19 17:05:02 +00:00
gbeauche
06af072a40
Add missing wrappers of the new runtime-assembler primitives
2003-03-19 16:32:51 +00:00
gbeauche
a3b815366a
Add facility to filter out some opcodes from the compfunctbl[] et al.
2003-03-19 16:28:23 +00:00
gbeauche
547bd6ab2c
Fix MOVBrr
2003-03-19 16:25:12 +00:00
gbeauche
c4bf8e0695
Fix 0(%rbp,<reg>,1) operand encoding
2003-03-19 11:34:10 +00:00
gbeauche
da8d81509e
Add new backend, disabled until it's proofread and fully functional
Remove obsolete string-related instructions
2003-03-18 17:26:32 +00:00
gbeauche
5fb74e3592
Add sign/zero-extend instructions
2003-03-18 17:01:44 +00:00
gbeauche
29f636c2eb
Fix _REXBmr(). Add CPUID. Some C++ compiler fixes. Make x86_emit_failure()
be void, and let x86_emit_failure0() be an int expression instead.
2003-03-18 16:28:23 +00:00
gbeauche
8271c0503e
Add CMOV and BSF/BSR instructions
2003-03-18 13:12:56 +00:00
gbeauche
e07bfdbc8b
Handle absolute and RIP addressing modes in x86-64
2003-03-18 10:08:16 +00:00
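For context on the entry above: in 64-bit mode the ModRM form mod=00, rm=101 means RIP+disp32, so an absolute disp32 operand needs a SIB byte with no base and no index. A minimal sketch of the two encodings follows. The helper names are hypothetical and are not the assembler's actual macros. Opcode and REX bytes are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit displacement in little-endian order. */
static void put_disp32(uint8_t *buf, int32_t disp)
{
    uint32_t u = (uint32_t)disp;
    for (int i = 0; i < 4; i++)
        buf[i] = (uint8_t)(u >> (8 * i));
}

/* Absolute [disp32] in 64-bit mode: ModRM mod=00 rm=100 (SIB follows),
   then SIB 0x25 = scale 00, index 100 (none), base 101 (disp32, no base). */
static size_t emit_mem_abs(uint8_t *buf, unsigned reg, int32_t disp)
{
    buf[0] = (uint8_t)(((reg & 7) << 3) | 0x04);
    buf[1] = 0x25;
    put_disp32(buf + 2, disp);
    return 6;
}

/* RIP-relative disp32(%rip): ModRM mod=00 rm=101, no SIB byte. */
static size_t emit_mem_rip(uint8_t *buf, unsigned reg, int32_t disp)
{
    buf[0] = (uint8_t)(((reg & 7) << 3) | 0x05);
    put_disp32(buf + 1, disp);
    return 5;
}
```

The absolute form costs one byte more than the RIP-relative one, which is why code generators usually prefer RIP-relative operands when the target is within range.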
gbeauche
ce3d90ff5e
clobber "cc" for flags, not "flags". Thanks Milan for noticing it.
2003-03-17 22:37:55 +00:00
gbeauche
08e9f936eb
Add some SSE/SSE2 instructions
2003-03-17 17:18:24 +00:00
gbeauche
c2566295af
Implement a generic setzflg_l() for P4, thus permitting to re-enable
translation of ADDX/SUBX/BCLR/BTST/BSET/BCHG instructions. i.e. make
it faster. ;-)
2003-03-13 20:34:34 +00:00
gbeauche
0cfa3126b3
Workaround change in flags handling for BSF instruction on Pentium 4.
i.e. currently disable translation of ADDX/SUBX/B<CHG,CLR,SET,TST> instructions
in that case. That is to say, better (much?) slower than inaccurate. :-(
2003-03-13 15:57:01 +00:00
gbeauche
a8e76deb69
Fix align_target with a padding of 0 bytes
2003-03-13 09:51:31 +00:00
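The entry above does not say what went wrong with zero-byte padding. As a sketch only, assuming a power-of-two alignment and single-byte NOP (0x90) fill, the padding computation can be written so that an already aligned address yields 0 rather than a full alignment unit. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed to round addr up to a multiple of align (a power of two).
   Yields 0, not align, when addr is already aligned. */
static size_t align_padding(uintptr_t addr, size_t align)
{
    return (size_t)(((uintptr_t)0 - addr) & (uintptr_t)(align - 1));
}

/* Fill the gap with single-byte NOPs and return the aligned pointer.
   Nothing is written when no padding is required. */
static uint8_t *align_target_sketch(uint8_t *p, size_t align)
{
    size_t pad = align_padding((uintptr_t)p, align);
    if (pad != 0)
        memset(p, 0x90, pad);
    return p + pad;
}
```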
gbeauche
45289042e6
Add some FPU instructions. Minor clean-ups.
2003-01-31 23:48:10 +00:00
gbeauche
ee7cea923a
Add new run-time assembler derived from GNU lightning. It is suitable for
both i386 and x86-64 architectures. Still needs some work (see TODO) and
an actual glue to the JIT backend.
Original work is LGPL, but per section 3 of this license, I opt for GPL v2
for Basilisk II purposes.
2003-01-31 20:39:53 +00:00
gbeauche
144e6f4e87
Use old x87 FPU stack on x86-64 too because we now use long doubles there for
better accuracy. Aka. prefer compatibility over speed.
2002-11-16 15:28:25 +00:00
gbeauche
bc5d7f9490
OPTIMIZED_FLAGS for x86-64 with the pushf/pop method since sahf/lahf are
invalid in long mode.
2002-11-05 11:59:12 +00:00
gbeauche
0a201217bf
Remove obsolete CFLOW_* constants but keep cpuop_{begin,end} for an
inline-threaded core.
2002-11-02 18:13:29 +00:00
gbeauche
2cda26edae
Fix buffer overflow reported by Aranym people
2002-11-02 17:23:20 +00:00
gbeauche
2cb7e02c9e
Some instructions assume offsets are only 1-byte long. I don't think this
is 100% correct. Therefore, insert some asserts that fail when an offset does not fit.
2002-10-13 11:14:24 +00:00
gbeauche
aa6b264d21
Add raw_emit_nop_filler() with more efficient no-op fillers stolen from
GNU binutils 2.12.90.0.15. Speed bump is marginal (less than 6%). Make it
default though, that's conditionalized by tune_nop_fillers constant.
2002-10-12 16:27:13 +00:00
gbeauche
78ac3e667f
Don't forget to note CPU detection code mostly comes from Linux kernel.
2002-10-03 16:16:57 +00:00
gbeauche
d4ed937de6
JIT: add copyright notices just to notify people that this is real derivative
work from GPL code (UAE-JIT). Additions and improvements are from B2
developers.
2002-10-03 16:13:46 +00:00
gbeauche
8de7ad1091
- Turn on runtime detection of loop and jump alignment as Aranym people
reported they got some improvement with it and larger loops. Small
loops are an issue for now until unrolling is implemented for DBcc.
- Const jumps are identified in readcpu. I don't want to duplicate code
uselessly. Rather, it's the JIT job to know whether we are doing block
inlining and un-marking those instructions as end-of-block.
2002-10-03 15:05:01 +00:00
gbeauche
a60c6da7c3
Turn on block inlining so that people can test this feature and report
whether they gain something or it renders the JIT less stable.
2002-10-03 15:01:53 +00:00
gbeauche
724516511a
Do translate BSR.L, we don't have any issue with that even if we are
doing block inlining since we have a complete chain of information about
the blocks to checksum.
2002-10-03 14:59:35 +00:00
gbeauche
e11dd3d375
Do translate FMUL instructions: the core needs to be fixed, and the problem is
not the translation of that instruction. I believe this is related to some
misgeneration of FPU core sequence and allocation of FP registers?
2002-10-03 14:58:02 +00:00
gbeauche
e9584dbcc1
Add PROFILE_UNTRANSLATED_INSNS information. Interestingly, the following
are the bottleneck now: DIVS, BSR.L (why isn't it translated yet?),
bit-field instructions (I need to self-motivate enough for that), and
A-Traps.
2002-10-02 16:22:51 +00:00
gbeauche
94a9038826
- Remove dead code in readcpu.cpp concerning CONST_JUMP control flow.
- Replace unused fl_compiled with fl_const_jump
- Implement block inlining enabled with USE_INLINING && USE_CHECKSUM_INFO.
However, this is currently disabled as it doesn't give much and exhibits
even more a cache/code generation problem with FPU JIT compiled code.
- Actual checksum values are now an integral part of a blockinfo regardless
of whether USE_CHECKSUM_INFO is set. This reduces the number of elements in that
structure and slightly speeds up checksum calculation for chained blocks.
- Don't care about show_checksum() for now.
2002-10-02 15:55:10 +00:00
gbeauche
21909f1eed
- Rewrite blockinfo allocator et al. Use a template class so that this
can work with other types related to blockinfos.
- Add new method to compute checksums. This should permit code inlining
and follow-ups of const_jumps without breaking the lazy cache invalidator,
aka chain infos for checksumming. TODO: Incomplete support, thus disabled.
2002-10-01 16:22:36 +00:00
gbeauche
75de104c92
- Optimize use of quit_program variable. This is a real boolean for B2.
- Remove unused/dead code concerning surroundings of (debugging).
- m68k_compile_execute() is generated and optimized code now.
2002-10-01 09:39:55 +00:00
gbeauche
bdf9d76bb8
- #include "flags_x86.h" here to get NATIVE_CC_?? helper macros
- Add raw_cmp_b_mi() and raw_call_m_indexed() for generated
m68k_compile_execute() function
2002-10-01 09:37:03 +00:00
gbeauche
8748b48b7a
Disable USE_QUAD_DOUBLE for now and probably for good as (i) the emulator
implementation is not correct, (ii) I don't know of any CPU which
handles this kind of format *natively* with conformance to IEEE.
2002-09-20 16:52:48 +00:00
gbeauche
ec92457d68
Fix align_jumps for athlon, that's really "16" and gcc-3.2 sources contained
the same error. ;-)
2002-09-20 14:55:50 +00:00
gbeauche
d7c677d077
- Implement {make,extract}_extended() for USE_QUAD_DOUBLE
- Don't forget to fill in mantissa3 member for USE_QUAD_DOUBLE in
make_extended_*() but make sure NaN, inf, zeros are handled beforehand
2002-09-19 20:52:50 +00:00
gbeauche
a5ba7ea5ac
Don't define USE_LONG_DOUBLE when sizeof(long double) == 16. This still
is not very clean but it should build now. Probably live with USE_LONG_DOUBLE
for any case where native long double exists and sizeof > 8 ?
2002-09-19 16:02:13 +00:00
gbeauche
b765112cf9
Get rid of any "extern inline" bits. Use static inline instead as MIPS
compilers don't really like the former syntax.
2002-09-19 15:42:16 +00:00
gbeauche
ecd3db832e
- Rewrite raw_init_cpu() to match more details, from kernel sources.
- Add possibility to tune code alignment to the underlying processor. However,
this is turned off as I don't see much improvement and align_jumps = 64
for Athlon looks suspicious to me.
- Remove two extra align_target() that are already covered.
- Remove unused may_trap() predicate.
2002-09-19 14:59:03 +00:00
gbeauche
feca66d43e
Optimize runtime assembler with shorter equivalents when the accumulator
(%eax) is referenced along with immediates.
2002-09-18 15:56:17 +00:00
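As an illustration of the kind of saving the entry above refers to: x86 has dedicated opcodes for arithmetic between the accumulator and an immediate, which drop the ModRM byte. A minimal sketch for `addl $imm32, %reg` follows. The emitter function is hypothetical and is not the project's own code. The sign-extended imm8 form (opcode 0x83) is ignored here for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emitter: writes `addl $imm, %reg` into buf and returns the
   number of bytes written. reg is the 3-bit register number (0 = %eax). */
static size_t emit_add_l_ri(uint8_t *buf, unsigned reg, uint32_t imm)
{
    size_t n = 0;
    if ((reg & 7) == 0) {
        buf[n++] = 0x05;                        /* ADD EAX, imm32: no ModRM */
    } else {
        buf[n++] = 0x81;                        /* ADD r/m32, imm32 (/0) */
        buf[n++] = (uint8_t)(0xC0 | (reg & 7)); /* ModRM: mod=11, rm=reg */
    }
    for (int i = 0; i < 4; i++)                 /* little-endian imm32 */
        buf[n++] = (uint8_t)(imm >> (8 * i));
    return n;
}
```

With %eax as the destination the instruction takes 5 bytes instead of 6.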
gbeauche
54ac7a1493
Move -DSAHF_SETO_PROFITABLE down in x86 & gas specific block. Also ensure
SAHF_SETO_PROFITABLE is defined when compiling the JIT. Aka I don't want
to support obsolete and probably bogus code nowadays.
2002-09-18 11:41:56 +00:00
gbeauche
c40279294a
Don't forget to use vm_release() to free up translation cache. Also free
the right amount of memory that was previously allocated.
2002-09-18 09:55:37 +00:00
gbeauche
599f7e845f
Use vm_acquire() to allocate translation cache
2002-09-18 07:50:55 +00:00
gbeauche
4fc127c8df
- Changes to support 68040 -> x86 dynamic translator
- Globalize FLIGHT_RECORDER, possibly used in compiler/ sources as well
2002-09-17 16:05:39 +00:00
gbeauche
c0526db089
Import JIT compiler
2002-09-17 16:04:06 +00:00
gbeauche
6af88bc787
Only use *l() math functions when they are available
2002-09-16 15:40:23 +00:00
gbeauche
48986febc6
- FP endianness is now tested at configure time
- Fix junk introduced in previous rev for extract_extended()
2002-09-16 12:01:38 +00:00