154 commits

Author SHA1 Message Date
Henrik Rydgard
215a269b34 Optimize dl_write_matrix just because. Not expecting a big speedup... 2013-12-21 12:39:34 +01:00
Unknown W. Brackets
438361d0bc Clean up code pointer naming for the jit.
Now it properly identifies thunk code which is actually a decent percent
when fastmem is off at least.
2013-12-18 23:57:39 -08:00
Henrik Rydgard
1e300447e1 Fix some replace-related bugs. Add "jal" replace inlining, not activated. 2013-12-18 16:27:23 +01:00
Henrik Rydgard
2eab4aa1bf Play around with function replacement. Turned off by default of course. 2013-12-17 23:40:27 +01:00
Henrik Rydgard
2140892074 Initial preparations for ability to replace game functions with custom implementations.
Also auto-saves hashmap additions and reapplies the hashmap on function
rename so that if you rename a function that exists in several copies
they will all be labelled.

Note that actual function replacement is not activated yet.
2013-12-17 12:27:20 +01:00
Henrik Rydgard
2d8429ac48 Assorted cleanup in the MIPS emulation 2013-12-10 13:15:16 +01:00
Unknown W. Brackets
5d2ff64252 Support for modified jit-enabled VerySleepy.
This allows profiling the jit.  Should have zero perf impact when not
in use, since it's entirely triggered by VerySleepy.
2013-11-30 19:20:21 -08:00
Henrik Rydgard
5bb3824dcf Implement vocp on ARM and x86. 2013-11-19 21:41:47 +01:00
Unknown W. Brackets
a334aaf6ca x86jit: Refactor and skip flushes in branch cont.
Still not faster, but at least the code isn't as messy.
2013-11-12 00:45:28 -08:00
Unknown W. Brackets
7e19933f64 x86jit: Try predicting branch continues.
Still doesn't seem to work.  Something like a 4% gain in Star Ocean was
the best I saw...
2013-11-10 22:50:23 -08:00
Unknown W. Brackets
359110f010 x86/armjit: Add jump following (off by default.)
Inlines function calls up to a certain extent.  Allows us to get
immediates all the way to a syscall, for example, usually.

Not sure if faster.
2013-11-10 21:59:49 -08:00
Unknown W. Brackets
aacb31bc18 armjit: Copy over (disabled) immbranch optim.
This does a little loop unrolling.  Costs a bit more cache space, but
avoids flushing regs for longer.

Not enabled.
2013-11-10 21:59:48 -08:00
Unknown W. Brackets
455a7e090d Compile the cache instruction to nothing.
Was showing up in a few profiles, does nothing currently.
2013-11-10 14:38:10 -08:00
Henrik Rydgard
0a844ce98d Delete functions for vsge and vslt, these have been rolled into VecDo3 2013-11-09 19:29:52 +01:00
Henrik Rydgard
309f904c0c Extract JitState into its own header (arm/x86) 2013-11-08 18:51:52 +01:00
Henrik Rydgard
6eb7f94065 Implement vsgn in x86/x64 and ARM jit 2013-11-07 15:29:13 +01:00
Henrik Rydgard
aa3cf34fc1 Jit: Fix valgrind warnings.
The first time PrefixStart was entered with startDefaultPrefix = true, it would
call EatPrefix, which checks the so far entirely uninitialized prefixXFlags.
2013-10-16 22:33:48 +02:00
Henrik Rydgard
20174d9410 Delete the lookup table version of vh2f 2013-09-28 22:15:29 +02:00
Henrik Rydgard
7ca6d73857 Two approaches to vh2f (half-float to float): lookuptable and fast SSE 2013-09-28 22:08:44 +02:00
Henrik Rydgard
aa753c88b2 ARM: implement vhdp 2013-09-28 12:30:28 +02:00
Unknown W. Brackets
157b682344 Always use fastmem for sw/lw on SP. 2013-09-07 22:44:18 -07:00
Unknown W. Brackets
538a4c064c Add a note so as not to forget. 2013-09-01 01:15:08 -07:00
Unknown W. Brackets
b558189c37 Just invalidate blocks on ClearCacheAt().
This makes it safe to call from a jitted syscall, etc.
2013-09-01 00:32:43 -07:00
Unknown W. Brackets
97aa1a631e Improve typesafety in the x86 regalloc. 2013-08-24 19:41:10 -07:00
Unknown W. Brackets
6c97b66806 Cap imm branch instructions, reset compiling.
Break and other delay slot ops could've set it to false.

It's actually sometimes faster now.
2013-08-24 17:26:24 -07:00
Unknown W. Brackets
109ad17ac6 Use a typesafe struct for opcodes.
Also, correctly read delayslots using Read_Instruction on ARM.
2013-08-24 15:36:24 -07:00
Unknown W. Brackets
df32c99be6 Attempt to follow branches to a max # of ops.
Seems to make it slower also.  Maybe taking the branch would be better...
hmmph.
2013-08-16 01:07:11 -07:00
Unknown W. Brackets
defd2b6383 Attempt at doing branches with imm args. 2013-08-16 01:05:52 -07:00
Unknown W. Brackets
6b0b5145e5 Clean up some inconsistency in jit branches. 2013-08-16 00:44:23 -07:00
Unknown W. Brackets
64c2ea86c0 Add a method to save the gpr/fpr state in jit. 2013-08-16 00:12:49 -07:00
Henrik Rydgard
4e8958f42d A small optimization, a few jit stubs, and cross/quat product on x86. 2013-08-01 00:15:08 +02:00
Henrik Rydgard
0a8f85a919 Some JIT cleanup, implement VI2F on ARM. Also disabled untested impl of viim for x86. 2013-07-31 17:27:04 +02:00
Henrik Rydgard
51596b636a Fix numerous ARM JIT bugs. Activate vmtvc and vscl, and vadd/vmul/vdiv/vsub for real this time. 2013-07-31 10:34:58 +02:00
Henrik Rydgard
d8294f025f More VFPU stuff (nothing new activated) 2013-07-30 01:09:11 +02:00
Henrik Rydgard
8feeaf2e7a Jit: Implement vidt in both, plus translate a couple easy ones to ARM. 2013-07-28 16:14:21 +02:00
Unknown W. Brackets
2d15eb2acd Re-enable lwl/lwr/swl/swr on the x86 jit.
Now correctly handling ECX on x64.
2013-07-06 01:21:52 -07:00
Unknown W. Brackets
662ae77214 Save regs before/after 3-arg func calls on x86.
This fixes bugs only on x64 when ABI_CallFunctionACC etc. were used.
This was breaking things since R8 was not being saved (arg 3.)
2013-07-06 00:54:53 -07:00
Unknown W. Brackets
d823989330 Implement vmone/vmzero/vmidt for the x86 jit. 2013-07-04 18:16:57 -07:00
Unknown W. Brackets
e27ab6fa11 Add swl/swr to the x86 jit. 2013-07-04 17:34:56 -07:00
Unknown W. Brackets
203daf955b Implement lwl/lwr in the x86 jit. 2013-07-04 17:30:36 -07:00
Unknown W. Brackets
2d25d1eb05 Add a way to force alignment in JitSafeMem(). 2013-07-04 15:59:12 -07:00
Unknown W. Brackets
609f8d6340 Allow hitting Go on a breakpoint to continue.
Doesn't work for branches though, because of delay slots.
2013-06-29 11:23:24 -07:00
Henrik Rydgard
ce2c18d2fe Remove redundant vmov instructions (seen in wipeout) 2013-06-15 00:19:48 +02:00
Sacha
a26b48fc0b Stub wsbh/wsbw for x86. 2013-06-05 14:55:01 +10:00
Henrik Rydgard
1a1c161a0d Implement vmin/vmax in x86 jit, slots right into VecDo3 2013-04-27 20:52:42 +02:00
Henrik Rydgard
6f4ad05582 Remove some unused code, add some stubs to vfpu jit, some cleanup 2013-04-27 19:35:42 +02:00
Henrik Rydgard
9eace8a80e Combine the two JitCache implementations (x86, ARM) into one. 2013-04-27 01:32:03 +02:00
Unknown W. Brackets
3bb5651ca7 Initial x86 jit for vtfm/vhtfm. 2013-04-20 01:52:06 -07:00
Unknown W. Brackets
9245490b53 Initial / simple vmscl for x86 jit. 2013-04-20 01:34:16 -07:00
Unknown W. Brackets
29109d25af Non-optimal vmmul for x86 jit.
It's faster than interpreter anyway, but it could be much better.
2013-04-20 01:15:15 -07:00