599 commits

Author SHA1 Message Date
Tyler Stachecki
b36574e5bd Commit AIO's VLT optimizations. 2014-12-30 10:32:55 -05:00
Tyler Stachecki
84544d9521 Work in AIO's optimizations for VABS. 2014-12-29 17:36:25 -05:00
Tyler Stachecki
b4c83e8d4b Set initial values for VCC/VCO/VCE.
Thanks, krom!
2014-12-29 17:23:29 -05:00
Tyler Stachecki
2585daa532 Move around and patch bugs in new functions. 2014-12-28 15:18:47 -05:00
Tyler Stachecki
b8fb829e1c Prevent register-caching on MinGW.
Since Microsoft decided to totally bork their x86_64 calling
convention, defer all Windows builds to non-optimized RSP
routines. When MinGW supports __vectorcall, this change can
be reverted.
2014-12-28 13:13:05 -05:00
Tyler Stachecki
dda369c888 Add support for PE/COFF executable formats. 2014-12-28 11:18:36 -05:00
Tyler Stachecki
25670f76e7 Update toolchains with GNU AS references. 2014-12-28 10:37:06 -05:00
Tyler Stachecki
7e6293684e Optimize register-caching version of VMRG. 2014-12-28 10:17:23 -05:00
Tyler Stachecki
9077006d24 Only use VEX-encoded SSE where it helps us.
Otherwise, stick to the "legacy" SSE instructions as they're
smaller and we don't use the upper halves of AVX registers
anyways.
2014-12-28 10:11:14 -05:00
Tyler Stachecki
d9d9eabf50 Fix register-caching version of VABS. 2014-12-28 10:05:43 -05:00
Tyler Stachecki
86967b828e Actually enable the register caching...
And fix a lot of bugs introduced with a regex.
2014-12-27 17:32:57 -05:00
Tyler Stachecki
fb3c395277 Implement register-caching version of VLT. 2014-12-27 16:35:20 -05:00
Tyler Stachecki
ed1e354c68 Change RSP calling convention.
pblendvb needs the mask in %xmm0, so change the calling convention
around just enough so we can cut out a movdqa from most instructions.
2014-12-27 15:46:33 -05:00
Tyler Stachecki
d551d2b631 Implement register-caching version of VMRG. 2014-12-27 14:58:09 -05:00
Tyler Stachecki
9c43cf65ac Minor tweaks to VEQ/VNE register-cached versions. 2014-12-27 14:32:50 -05:00
Tyler Stachecki
c979744c1a Implement register-caching versions of VGE. 2014-12-27 13:16:15 -05:00
Tyler Stachecki
8915df71d8 Implement register-caching versions of VEQ/VNE. 2014-12-27 12:19:31 -05:00
Tyler Stachecki
ef997fe107 Prepare to register-cache RSP flags. 2014-12-27 10:55:31 -05:00
Tyler Stachecki
41eba75bc7 Register-caching variations of bitwise functions. 2014-12-27 10:13:53 -05:00
Tyler Stachecki
2efca0d94c Implement register-caching versions of VABS. 2014-12-27 09:45:03 -05:00
Tyler Stachecki
4d55c15721 Actually optimize RelWithDebInfo builds. 2014-12-27 08:15:10 -05:00
Tyler Stachecki
502e412dd7 Fix SSSE3 builds/regex mistake in CMakeLists. 2014-12-26 14:45:08 -05:00
Tyler Stachecki
b0469d8c97 Commit latest fork of angrylion/MAME RDP. 2014-12-26 14:26:17 -05:00
Tyler Stachecki
3a582f81ac Clamp VMOV/VRCP/VRSQ inputs/outputs to full elements. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
c1f4ddd911 Fix MFC2/MTC2 odd-element byte indexing. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
71db976759 Fix a typo in the VMOV implementation. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
bc8300c7de Fix a pair of RSP flag-related bugs. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
574c85ad37 Add some missing flag clears to VCL. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
8f17a516bc Fix a stray memory copy. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
6d0af5d89a Clean up SSSE3+ loads and stores. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
ee526c543c Commit AIO's VCR optimizations. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
3a969b2379 Do some general cleanup/optimization. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
b740c9a5b3 Optimize RSP CP2 register transfers. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
fea458e70c Add (partial) implementations for LPV/LUV/SPV/SUV.
Also, clean up other SSSE3+ accelerated loads and stores.
2014-12-26 14:19:45 -05:00
Tyler Stachecki
03f04c1b82 Add implementation for MTC2. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
9f9e3ebf80 Sort out a pair of RSP bugs. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
b33f2800ae Add implementation for MFC2. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
a2f87f843c Optimize VRCP* and VRSQ* functions. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
824131db6b Use a union for RSP vectors to force alignment. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
dc008abe77 Fix more show-stopping RSP bugs. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
173815ed63 Another bug: make sure memory requests get filled. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
1e059e3f71 Fix a potentially disastrous RSP bug. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
645f4b06ea Minor cleanup to the RSP pipeline. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
6faca60054 Start reworking RSP vector loads and stores. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
f1929a056c Commit AIO's VMACF implementation. 2014-12-24 15:18:59 -05:00
Tyler Stachecki
ae714715fb Commit AIO's VABS optimization. 2014-12-23 01:13:50 -05:00
Tyler Stachecki
ab8dde80e9 Add AIO's implementation for VMULU. 2014-12-23 01:10:15 -05:00
Tyler Stachecki
2ee295a671 Fix RSP DMEM accesses.
Up until now, the simulator assumed that DMEM accesses had to be
aligned (similarly to the VR4300). This is not actually the case,
so allow scalar memory access to arbitrary DMEM addresses.
2014-12-22 23:53:13 -05:00
Tyler Stachecki
3f2329be5b Fix a bug in VRCP/VRSQ precision selection. 2014-12-22 21:06:17 -05:00
Tyler Stachecki
e52e031ce3 Add implementations for VRSQ, VRSQL, and VRSQH. 2014-12-22 20:47:48 -05:00
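
Note on 2ee295a671 (Fix RSP DMEM accesses): the commit message says scalar accesses may target arbitrary DMEM addresses, not only aligned ones. A minimal sketch of what such a read could look like is below. It is not the code from the commit. The helper name dmem_read_word, the 4 KiB DMEM size, the big-endian byte order, and the wrap-around at the end of DMEM are all assumptions made for illustration.

```c
#include <stdint.h>

/* Assumption: DMEM is 4 KiB, so byte addresses are masked to 12 bits. */
#define DMEM_SIZE 0x1000u

/*
 * Hypothetical helper: read a big-endian 32-bit word starting at any
 * byte address in DMEM. The word is assembled one byte at a time, so no
 * alignment is required, and each byte address is masked so that an
 * access running past the end wraps back to the start (an assumption).
 */
static uint32_t dmem_read_word(const uint8_t *dmem, uint32_t addr) {
  uint32_t word = 0;
  unsigned i;

  for (i = 0; i < 4; i++)
    word = (word << 8) | dmem[(addr + i) & (DMEM_SIZE - 1)];

  return word;
}
```

Assembling the word bytewise is slower than a single aligned load, but it keeps the sketch independent of host alignment rules and host byte order.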