Unknown W. Brackets
bbeb5758b7
x86jit: Simplify VS() / VSX() usage.
2014-11-27 00:07:17 -08:00
Unknown W. Brackets
f63c165f64
x86jit: Fix several cases of missing dirty checks.
2014-11-26 23:28:14 -08:00
Henrik Rydgard
acb711007f
x86 jit: SIMD-ify cross product
2014-11-27 00:18:19 +01:00
Henrik Rydgard
5033babb10
x86 Jit: SIMD-ify vdot
2014-11-26 23:47:18 +01:00
Henrik Rydgard
4b25afb7b4
x86 Jit: SIMD some more instructions
2014-11-26 22:30:06 +01:00
Henrik Rydgard
804de50711
x86 jit: SIMD-ify VFPU register file writebacks where possible
2014-11-26 01:33:05 +01:00
Henrik Rydgard
b3c8a82c49
x86 jit: SIMD-ify some more
2014-11-25 23:56:46 +01:00
Henrik Rydgard
b5ee47a80c
x86 jit: SIMD-ify lv.q and sv.q
2014-11-25 23:28:29 +01:00
Henrik Rydgård
4db6b7f3e2
SIMD-ify a couple instructions a bit
2014-11-25 22:47:26 +01:00
Unknown W. Brackets
5347431c20
x86jit: Initial simd for VecDo3(). Broken.
...
I'm not sure why/where it's broken...
2014-11-16 13:33:15 -08:00
Unknown W. Brackets
2862367927
x86jit: Add force-non-simd to all current ops.
...
Unless they already use MapRegs, because that will automatically handle
it.
2014-11-16 13:33:12 -08:00
Henrik Rydgard
bfcd3690b6
x86 jit: Fix+enable quaternion product, optimize "sw zero, *"
2014-11-16 18:37:38 +01:00
Henrik Rydgard
1c78e29c79
x86 jit: For clarity, use TEMPREG where it doesn't matter that it's EAX.
...
Might have missed a few places.
2014-11-16 17:38:26 +01:00
Henrik Rydgard
8b90f881b8
x86 jit: A tiny optimization and a tiny bugfix
2014-11-16 16:46:35 +01:00
Unknown W. Brackets
096b41cceb
x86jit: Interleave reg usage in vcmp.
2014-11-10 23:22:04 -08:00
Unknown W. Brackets
0e1aa35e84
x86jit: Just do the ES/NS compare once.
2014-11-10 23:04:38 -08:00
Unknown W. Brackets
2758e8fa3c
x86jit: Optimize vcmp for single and simd.
2014-11-10 23:04:37 -08:00
Unknown W. Brackets
27d8108bb2
x86jit: Optimize loads of 0 into fp regs.
2014-11-08 18:41:16 -08:00
Unknown W. Brackets
57caa95273
x86jit: Implement round.w.s and friends.
...
They are not terribly fast, though, updating MXCSR.
2014-11-08 17:59:38 -08:00
Unknown W. Brackets
671dee85c7
x86jit: Micro optimize vi2f a little bit.
...
This didn't help overall perf much but micro benchmarks are better.
2014-11-08 13:07:01 -08:00
Unknown W. Brackets
c29b126357
x86jit: Oops, can't have an imm here.
2014-11-08 12:41:48 -08:00
Unknown W. Brackets
c0be19edb6
x86jit: Simplify vavg a bit.
2014-11-08 12:40:04 -08:00
Unknown W. Brackets
761e269e5f
x86jit: Avoid some regcache pollution.
2014-11-08 12:38:08 -08:00
Unknown W. Brackets
bc7497857a
x86jit: Micro optimize vi2x a bit with ssse3/sse4.
...
Both are small wins.
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
0e646f748a
x86jit: Implement vi2x instructions.
...
Also, my opcodes were wrong in the test (shifted the pair bit the wrong
way, oops.)
AFAICT, there's no reason PSRAD/etc. were not encoding REX...
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
ddc90ee550
x86jit: Implement vfad and vavg.
2014-11-08 12:13:25 -08:00
Unknown W. Brackets
5ae43defd9
Oops, these should be signed.
2014-11-08 09:39:17 -08:00
Unknown W. Brackets
316e923b40
x86jit: Implement other forms of vx2i.
...
Gains 3.2% performance in Grand Knights History.
2014-11-08 00:39:40 -08:00
Unknown W. Brackets
097a483d77
x86jit: Micro optimize vs2i a bit.
2014-11-06 22:45:54 -08:00
Unknown W. Brackets
3061e89250
Fix copy/paste mistake.
2014-11-04 01:41:17 -08:00
Unknown W. Brackets
0d36d4e082
Add a helper to reduce duplicate code.
...
This is not performance critical. I wonder if compilers can inline
closures?
2014-11-03 23:50:23 -08:00
Unknown W. Brackets
16ca2b0155
x86jit: Fix trig vv2ops on 32-bit, arg.
2014-11-03 23:43:18 -08:00
Unknown W. Brackets
3e95763a3f
x86jit: Implement other rounding modes in vf2i.
...
3% improvement in Grand Knights History. I know other games use these
too.
2014-11-03 23:27:05 -08:00
Unknown W. Brackets
717cf25f0d
x86jit: Use our sincos funcs for VV2Op as well.
...
Small (0.7%) speedup in Gods Eater Burst. There are probably SSE
approximations we could use instead, but those will also need at least xmm
reg flushing/thunking.
At least this avoids flushing gprs, etc. The sin and cos ops are fairly
common.
2014-11-03 22:13:38 -08:00
Unknown W. Brackets
0f32103615
x86jit: Consistently use mips_.
2014-10-12 15:16:09 -07:00
Unknown W. Brackets
e3a04aa2d2
x86jit: Preload sp and similar regs used often.
...
This can help us avoid using a temporary.
Very tiny performance improvement.
2014-10-12 14:53:56 -07:00
Unknown W. Brackets
4459b8f483
jit: Actually jit vmtvc/vmfvc.
...
Since we have them and they are easy.
2014-09-01 23:13:39 -07:00
Unknown W. Brackets
252100aee5
Remove outdated comment (real cause found/fixed.)
2014-06-28 16:06:10 -07:00
Unknown W. Brackets
bc3d789c8a
x86jit: Cache the vfpu compare flags in a reg.
...
Again, to match armjit.
2014-06-28 00:38:55 -07:00
Henrik Rydgard
0879d76503
VFPU: Ensure that sin(4*x) returns 0.0 (and cos 1) for all x. Fixes #2921
2014-06-15 11:03:00 +02:00
Unknown W. Brackets
5b24e0107f
x86jit: Correct vsat0/vsat1 handling.
2014-05-16 01:04:58 -07:00
Unknown W. Brackets
ccc5458a84
x86jit: Correctly handle NaNs in vfpxd sat clamps.
...
May be a small performance hit, but we've seen games with bugs because of
NaNs recently, so better to be safe.
2014-05-16 01:04:57 -07:00
Unknown W. Brackets
6ccae8f5a7
x86jit: Use a faster safemem fallback.
...
Really helps performance in games that use uncached addresses a lot,
without really impacting performance of most games which don't.
Of course, fastmem is faster.
2014-05-06 08:05:12 -07:00
Unknown W. Brackets
246eaeb209
x86jit: Avoid mem temp for float cmp/loads.
2014-03-22 15:56:28 -07:00
Unknown W. Brackets
ce518a432f
x86jit: Add a missing unknown prefix check.
2014-02-21 09:47:28 -08:00
Unknown W. Brackets
2347498667
x86jit: Use templates to avoid some void * casts.
...
Makes it a bit cleaner and potentially safer.
2014-01-18 09:57:13 -08:00
Henrik Rydgard
2eab4aa1bf
Play around with function replacement. Turned off by default of course.
2013-12-17 23:40:27 +01:00
Henrik Rydgard
2d8429ac48
Assorted cleanup in the MIPS emulation
2013-12-10 13:15:16 +01:00
Henrik Rydgard
ab3037112f
Some scaffolding for a future VFPU-on-NEON implementation
2013-11-19 21:41:48 +01:00
Henrik Rydgard
99af10cb09
Get rid of bool disablePrefixes in ARM build (already gone in x86)
2013-11-19 21:41:48 +01:00