51 commits

Author SHA1 Message Date
Derek "Turtle" Roe
c4afd44ed7 See long description
Replaced all references to simulation with emulation
Updated copyright year
Updated .gitignore to reduce chances of random files being uploaded to
the repo
Added .gitattributes to normalize all text files, and to ignore binary
files (which includes the logo and the NEC PDF)
2015-07-03 08:18:16 -04:00
Tyler Stachecki
9b7a3c5fb5 Vectorize/inline/optimize CFC2. 2015-01-27 10:27:08 -05:00
Tyler Stachecki
9edd00f286 Remove old function definitions. 2015-01-02 23:55:28 -05:00
Tyler Stachecki
84d19566b9 Merge more functions together. 2015-01-02 23:51:20 -05:00
Tyler Stachecki
7262516636 Start merging RSP vector functions.
No need to separate all these functions when they contain so
much common code, so start combining things for the sake of
locality and predictor effectiveness (and size). In addition
to these benefits, the CPU backend is usually busy during the
execution of these functions, so suffering a misprediction
isn't as painful (especially seeing as we can potentially
improve the prediction from the indirect branch).
2015-01-02 22:17:41 -05:00
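
As a rough illustration of the merging this commit describes, the sketch below folds three bitwise operations into one function that picks the variant with a small switch, instead of keeping three near-identical functions. The names rsp_vec, vec_op, and rsp_bitwise are hypothetical, and plain C arrays stand in for SSE registers. It is not the code from this commit.

```c
#include <stdint.h>

/* Hypothetical 8-lane vector of 16-bit elements. */
typedef struct { uint16_t e[8]; } rsp_vec;

/* Hypothetical selector for the merged variants. */
enum vec_op { VEC_AND, VEC_OR, VEC_XOR };

/* One body serves all three operations, so the shared loop and the
 * load/store code around it live in a single place. */
rsp_vec rsp_bitwise(enum vec_op op, rsp_vec a, rsp_vec b) {
  rsp_vec r;
  for (int i = 0; i < 8; i++) {
    switch (op) {
      case VEC_AND: r.e[i] = a.e[i] & b.e[i]; break;
      case VEC_OR:  r.e[i] = a.e[i] | b.e[i]; break;
      default:      r.e[i] = a.e[i] ^ b.e[i]; break;
    }
  }
  return r;
}
```

The trade-off is one extra branch inside the merged function in exchange for a smaller code footprint.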
Tyler Stachecki
03663a68f6 Add an implementation for VMACU. 2015-01-02 20:52:39 -05:00
Tyler Stachecki
7a6ecabcc1 Fix a series of RSP bugs that krom pointed out. 2015-01-01 21:09:08 -05:00
Tyler Stachecki
878521f54b Add register-caching version of VCH.
Thanks go out to AIO for rounding out this commit with
his optimized SSE2 variant.
2014-12-31 08:51:40 -05:00
Tyler Stachecki
31577f57e6 Enable register-caching on MinGW.
Use a prelude to get around Microsoft's stupid calling convention.
2014-12-30 11:37:08 -05:00
Tyler Stachecki
86967b828e Actually enable the register caching...
And fix a lot of bugs introduced with a regex.
2014-12-27 17:32:57 -05:00
Tyler Stachecki
fb3c395277 Implement register-caching version of VLT. 2014-12-27 16:35:20 -05:00
Tyler Stachecki
ed1e354c68 Change RSP calling convention.
pblendvb needs the mask in %xmm0, so change the calling convention
around just enough so we can cut out a movdqa from most instructions.
2014-12-27 15:46:33 -05:00
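
For readers unfamiliar with the instruction: the legacy SSE4.1 encoding of pblendvb takes its mask implicitly from %xmm0 and chooses each result byte according to the top bit of the matching mask byte. The scalar model below only illustrates that selection rule. The function name blend_bytes is hypothetical, and the code is not taken from this commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a 16-byte variable blend: for each byte, take b when
 * the mask byte's top bit is set, otherwise keep a. */
void blend_bytes(uint8_t dst[16], const uint8_t a[16],
                 const uint8_t b[16], const uint8_t mask[16]) {
  for (size_t i = 0; i < 16; i++)
    dst[i] = (mask[i] & 0x80) ? b[i] : a[i];
}
```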
Tyler Stachecki
c979744c1a Implement register-caching versions of VGE. 2014-12-27 13:16:15 -05:00
Tyler Stachecki
8915df71d8 Implement register-caching versions of VEQ/VNE. 2014-12-27 12:19:31 -05:00
Tyler Stachecki
ef997fe107 Prepare to register-cache RSP flags. 2014-12-27 10:55:31 -05:00
Tyler Stachecki
41eba75bc7 Register-caching variations of bitwise functions. 2014-12-27 10:13:53 -05:00
Tyler Stachecki
2efca0d94c Implement register-caching versions of VABS. 2014-12-27 09:45:03 -05:00
Tyler Stachecki
574c85ad37 Add some missing flag clears to VCL. 2014-12-26 14:19:46 -05:00
Tyler Stachecki
b33f2800ae Add implementation for MFC2. 2014-12-26 14:19:45 -05:00
Tyler Stachecki
824131db6b Use a union for RSP vectors to force alignment. 2014-12-26 14:19:45 -05:00
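
One way a union can force alignment, sketched under assumptions: giving the union a 16-byte-aligned member raises the alignment of the whole type, so every object of that type starts on a boundary suitable for aligned SSE loads and stores. The type name rsp_vect_t and its members are hypothetical, and C11 alignas stands in for whatever vector member the real code uses.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical RSP vector: eight 16-bit lanes viewed either as elements
 * or as raw bytes. The aligned member forces 16-byte alignment on the
 * whole union. */
typedef union {
  uint16_t e[8];
  alignas(16) uint8_t bytes[16];
} rsp_vect_t;

/* Reports whether a given vector object sits on a 16-byte boundary. */
int rsp_vect_is_aligned(const rsp_vect_t *v) {
  return ((uintptr_t)v % 16) == 0;
}
```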
Tyler Stachecki
f1929a056c Commit AIO's VMACF implementation. 2014-12-24 15:18:59 -05:00
Tyler Stachecki
ab8dde80e9 Add AIO's implementation for VMULU. 2014-12-23 01:10:15 -05:00
Tyler Stachecki
3f2329be5b Fix a bug in VRCP/VRSQ precision selection. 2014-12-22 21:06:17 -05:00
Tyler Stachecki
e52e031ce3 Add implementations for VRSQ, VRSQL, and VRSQH. 2014-12-22 20:47:48 -05:00
Tyler Stachecki
4b6904240e Add implementations for VRCP, VRCPL, and VRCPH. 2014-12-22 20:29:16 -05:00
Tyler Stachecki
73709f4c45 Add implementation for VCR. 2014-12-22 13:01:03 -05:00
Tyler Stachecki
88310a8104 Add AIO's implementation for VMULF. 2014-12-22 09:50:29 -05:00
Tyler Stachecki
f268795da5 Add implementation for VMRG. 2014-12-21 15:49:44 -05:00
Tyler Stachecki
9f4664a4b6 Add implementation for VADDC. 2014-12-21 15:29:16 -05:00
Tyler Stachecki
a955bf1e2c Add implementation for VSUBC. 2014-12-21 15:07:00 -05:00
Tyler Stachecki
f199c7bac8 Add implementation for VABS. 2014-12-21 12:59:36 -05:00
Tyler Stachecki
de5b5b0f96 Commit AIO's VSUB optimizations, fix carry/borrow issue. 2014-12-21 12:55:38 -05:00
Tyler Stachecki
0be40f4358 Add implementations for VGE and VLT. 2014-12-21 11:08:00 -05:00
Tyler Stachecki
dc50279609 Add implementations for VEQ and VNE. 2014-12-21 10:39:10 -05:00
Tyler Stachecki
e1de6cd92d Add implementations for VCH. 2014-12-21 09:29:58 -05:00
Tyler Stachecki
145141225e Add implementations for VCL and CFC2. 2014-12-20 12:27:38 -05:00
Tyler Stachecki
c72f2c5028 Fix RSP alignment issues once and for all. 2014-12-19 20:03:03 -05:00
Tyler Stachecki
0a9b8c2367 Make read_acc_* return a value.
Instead of writing through a pointer, just return the value.
Thank you, Jared, for pointing out my stupidity.
2014-11-13 19:54:33 -05:00
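
The change described here can be pictured as follows. The function names and the accumulator layout are hypothetical. The sketch only contrasts the out-parameter style with returning the value directly, which can let the compiler keep the result in registers rather than storing it to caller-provided memory.

```c
#include <stdint.h>

/* Hypothetical accumulator: the low 16-bit slice for each of eight lanes. */
typedef struct { uint16_t lo[8]; } acc_t;
typedef struct { uint16_t e[8]; } vec_t;

/* Before: the caller supplies storage and the result is written
 * through a pointer. */
void read_acc_lo_ptr(const acc_t *acc, vec_t *out) {
  for (int i = 0; i < 8; i++)
    out->e[i] = acc->lo[i];
}

/* After: the value is simply returned. */
vec_t read_acc_lo(const acc_t *acc) {
  vec_t v;
  for (int i = 0; i < 8; i++)
    v.e[i] = acc->lo[i];
  return v;
}
```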
Tyler Stachecki
3a24a67f1f Fix poor SSE2-based RSP performance. 2014-11-10 11:02:57 -05:00
Tyler Stachecki
380577dfe3 Remove an assertion from VSAR...
Thanks to krom for reverse engineering this!
2014-11-08 21:27:45 -05:00
Tyler Stachecki
9f8a9f9d62 Add implementations of VMADH and VMUDH. 2014-11-08 14:01:41 -05:00
Tyler Stachecki
007d72eda1 Add implementations of VMADL and VMADM. 2014-11-08 12:21:06 -05:00
Tyler Stachecki
e89f054674 Optimize extremely aggressively.
Tell GCC to optimize cold functions for size and stash them away in
a separate part of the binary. Put the simulate core, meanwhile, on
the hot path. Also, bump optimization to -O3 as we can now "afford"
to do so.
2014-11-05 08:39:47 -05:00
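
A minimal sketch of the GCC features this commit mentions, assuming a GCC-compatible compiler. The function names are hypothetical. GCC optimizes functions marked cold for size and, on many targets, groups them in a separate subsection of the text section, while functions marked hot are optimized more aggressively.

```c
#include <stdint.h>

#if defined(__GNUC__)
#define HOT  __attribute__((hot))
#define COLD __attribute__((cold))
#else
#define HOT
#define COLD
#endif

/* Rarely executed path: hypothetical error handler. */
COLD int report_bad_opcode(uint32_t opcode) {
  return -(int)(opcode & 0xFF);
}

/* Frequently executed path: hypothetical per-cycle step. */
HOT int step_core(uint32_t opcode) {
  if (opcode == 0xFFFFFFFFu)
    return report_bad_opcode(opcode);
  return (int)(opcode & 0x3F);
}
```

The attributes only steer code placement and optimization effort. Building with -O3, as the commit does, is a separate choice.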
Tyler Stachecki
b668296589 Add implementations of VADD and VSUB. 2014-11-03 18:06:32 -05:00
Tyler Stachecki
083ad75286 arch/x86_64: Cache RSP accumulator regs in host CPU. 2014-11-03 16:48:38 -05:00
Tyler Stachecki
d4a8f82b10 Change the RSP vector calling convention. 2014-11-02 22:45:33 -05:00
Tyler Stachecki
b5ff809881 Add an implementation of VMADN. 2014-11-02 22:31:58 -05:00
Tyler Stachecki
bf197cf3bd Implement VMUDL, VMUDM, VMUDN. 2014-11-02 12:44:19 -05:00
Tyler Stachecki
c4612418ed Implement VINV, fixup INV. 2014-11-02 11:57:26 -05:00
Tyler Stachecki
e63b13605e Various LWC2/SWC2 fixes, add VSAR. 2014-10-24 21:07:25 -04:00