Henrik Rydgard
215a269b34
Optimize dl_write_matrix just because. Not expecting a big speedup...
2013-12-21 12:39:34 +01:00
Unknown W. Brackets
438361d0bc
Clean up code pointer naming for the jit.
...
Now it properly identifies thunk code, which is actually a decent percentage,
at least when fastmem is off.
2013-12-18 23:57:39 -08:00
Henrik Rydgard
1e300447e1
Fix some replace-related bugs. Add "jal" replace inlining, not activated.
2013-12-18 16:27:23 +01:00
Henrik Rydgard
2eab4aa1bf
Play around with function replacement. Turned off by default of course.
2013-12-17 23:40:27 +01:00
Henrik Rydgard
2140892074
Initial preparations for ability to replace game functions with custom implementations.
...
Also auto-saves hashmap additions and reapplies the hashmap on function
rename so that if you rename a function that exists in several copies
they will all be labelled.
Note that actual function replacement is not activated yet.
2013-12-17 12:27:20 +01:00
Henrik Rydgard
2d8429ac48
Assorted cleanup in the MIPS emulation
2013-12-10 13:15:16 +01:00
Unknown W. Brackets
5d2ff64252
Support for modified jit-enabled VerySleepy.
...
This allows profiling the jit. Should have zero perf impact when not
in use, since it's entirely triggered by VerySleepy.
2013-11-30 19:20:21 -08:00
Henrik Rydgard
5bb3824dcf
Implement vocp on ARM and x86.
2013-11-19 21:41:47 +01:00
Unknown W. Brackets
a334aaf6ca
x86jit: Refactor and skip flushes in branch cont.
...
Still not faster, but at least the code isn't as messy.
2013-11-12 00:45:28 -08:00
Unknown W. Brackets
7e19933f64
x86jit: Try predicting branch continues.
...
Still doesn't seem to work. Something like a 4% gain in Star Ocean was
the best I saw...
2013-11-10 22:50:23 -08:00
Unknown W. Brackets
359110f010
x86/armjit: Add jump following (off by default.)
...
Inlines function calls up to a certain extent. Allows us to get
immediates all the way to a syscall, for example, usually.
Not sure if faster.
2013-11-10 21:59:49 -08:00
Unknown W. Brackets
aacb31bc18
armjit: Copy over (disabled) immbranch optim.
...
This does a little loop unrolling. Costs a bit more cache space, but
avoids flushing regs for longer.
Not enabled.
2013-11-10 21:59:48 -08:00
Unknown W. Brackets
455a7e090d
Compile the cache instruction to nothing.
...
Was showing up in a few profiles, does nothing currently.
2013-11-10 14:38:10 -08:00
Henrik Rydgard
0a844ce98d
Delete functions for vsge and vslt, these have been rolled into VecDo3
2013-11-09 19:29:52 +01:00
Henrik Rydgard
309f904c0c
Extract JitState into its own header (arm/x86)
2013-11-08 18:51:52 +01:00
Henrik Rydgard
6eb7f94065
Implement vsgn in x86/x64 and ARM jit
2013-11-07 15:29:13 +01:00
Henrik Rydgard
aa3cf34fc1
Jit: Fix valgrind warnings.
...
The first time PrefixStart was entered with startDefaultPrefix = true, it would
call EatPrefix, which checks the so far entirely uninitialized prefixXFlags.
2013-10-16 22:33:48 +02:00
Henrik Rydgard
20174d9410
Delete the lookup table version of vh2f
2013-09-28 22:15:29 +02:00
Henrik Rydgard
7ca6d73857
Two approaches to vh2f (half-float to float): lookup table and fast SSE
2013-09-28 22:08:44 +02:00
Henrik Rydgard
aa753c88b2
ARM: implement vhdp
2013-09-28 12:30:28 +02:00
Unknown W. Brackets
157b682344
Always use fastmem for sw/lw on SP.
2013-09-07 22:44:18 -07:00
Unknown W. Brackets
538a4c064c
Add a note so as not to forget.
2013-09-01 01:15:08 -07:00
Unknown W. Brackets
b558189c37
Just invalidate blocks on ClearCacheAt().
...
This makes it safe to call from a jitted syscall, etc.
2013-09-01 00:32:43 -07:00
Unknown W. Brackets
97aa1a631e
Improve type safety in the x86 regalloc.
2013-08-24 19:41:10 -07:00
Unknown W. Brackets
6c97b66806
Cap imm branch instructions, reset compiling.
...
Break and other delay slot ops could've set it to false.
It's actually sometimes faster now.
2013-08-24 17:26:24 -07:00
Unknown W. Brackets
109ad17ac6
Use a typesafe struct for opcodes.
...
Also, correctly read delayslots using Read_Instruction on ARM.
2013-08-24 15:36:24 -07:00
Unknown W. Brackets
df32c99be6
Attempt to follow branches to a max # of ops.
...
Seems to make it slower also. Maybe taking the branch would be better...
hmmph.
2013-08-16 01:07:11 -07:00
Unknown W. Brackets
defd2b6383
Attempt at doing branches with imm args.
2013-08-16 01:05:52 -07:00
Unknown W. Brackets
6b0b5145e5
Clean up some inconsistency in jit branches.
2013-08-16 00:44:23 -07:00
Unknown W. Brackets
64c2ea86c0
Add a method to save the gpr/fpr state in jit.
2013-08-16 00:12:49 -07:00
Henrik Rydgard
4e8958f42d
A small optimization, a few jit stubs, and cross/quat product on x86.
2013-08-01 00:15:08 +02:00
Henrik Rydgard
0a8f85a919
Some JIT cleanup, implement VI2F on ARM. Also disabled untested impl of viim for x86.
2013-07-31 17:27:04 +02:00
Henrik Rydgard
51596b636a
Fix numerous ARM JIT bugs. Activate vmtvc and vscl, and vadd/vmul/vdiv/vsub for real this time.
2013-07-31 10:34:58 +02:00
Henrik Rydgard
d8294f025f
More VFPU stuff (nothing new activated)
2013-07-30 01:09:11 +02:00
Henrik Rydgard
8feeaf2e7a
Jit: Implement vidt in both, plus translate a couple easy ones to ARM.
2013-07-28 16:14:21 +02:00
Unknown W. Brackets
2d15eb2acd
Re-enable lwl/lwr/swl/swr on the x86 jit.
...
Now correctly handling ECX on x64.
2013-07-06 01:21:52 -07:00
Unknown W. Brackets
662ae77214
Save regs before/after 3-arg func calls on x86.
...
This fixes bugs that only occurred on x64 when ABI_CallFunctionACC etc. were used.
This was breaking things since R8 was not being saved (arg 3.)
2013-07-06 00:54:53 -07:00
Unknown W. Brackets
d823989330
Implement vmone/vmzero/vmidt for the x86 jit.
2013-07-04 18:16:57 -07:00
Unknown W. Brackets
e27ab6fa11
Add swl/swr to the x86 jit.
2013-07-04 17:34:56 -07:00
Unknown W. Brackets
203daf955b
Implement lwl/lwr in the x86 jit.
2013-07-04 17:30:36 -07:00
Unknown W. Brackets
2d25d1eb05
Add a way to force alignment in JitSafeMem().
2013-07-04 15:59:12 -07:00
Unknown W. Brackets
609f8d6340
Allow hitting Go on a breakpoint to continue.
...
Doesn't work for branches though, because of delay slots.
2013-06-29 11:23:24 -07:00
Henrik Rydgard
ce2c18d2fe
Remove redundant vmov instructions (seen in wipeout)
2013-06-15 00:19:48 +02:00
Sacha
a26b48fc0b
Stub wsbh/wsbw for x86.
2013-06-05 14:55:01 +10:00
Henrik Rydgard
1a1c161a0d
Implement vmin/vmax in x86 jit, slots right into VecDo3
2013-04-27 20:52:42 +02:00
Henrik Rydgard
6f4ad05582
Remove some unused code, add some stubs to vfpu jit, some cleanup
2013-04-27 19:35:42 +02:00
Henrik Rydgard
9eace8a80e
Combine the two JitCache implementations (x86, ARM) into one.
2013-04-27 01:32:03 +02:00
Unknown W. Brackets
3bb5651ca7
Initial x86 jit for vtfm/vhtfm.
2013-04-20 01:52:06 -07:00
Unknown W. Brackets
9245490b53
Initial / simple vmscl for x86 jit.
2013-04-20 01:34:16 -07:00
Unknown W. Brackets
29109d25af
Non-optimal vmmul for x86 jit.
...
It's faster than interpreter anyway, but it could be much better.
2013-04-20 01:15:15 -07:00