Derek "Turtle" Roe
c4afd44ed7
See long description
Replaced all references to simulation with emulation
Updated copyright year
Updated .gitignore to reduce chances of random files being uploaded to
the repo
Added .gitattributes to normalize all text files, and to ignore binary
files (which includes the logo and the NEC PDF)
2015-07-03 08:18:16 -04:00
Tyler Stachecki
9b7a3c5fb5
Vectorize/inline/optimize CFC2.
2015-01-27 10:27:08 -05:00
Tyler Stachecki
9edd00f286
Remove old function definitions.
2015-01-02 23:55:28 -05:00
Tyler Stachecki
84d19566b9
Merge more functions together.
2015-01-02 23:51:20 -05:00
Tyler Stachecki
7262516636
Start merging RSP vector functions.
These functions share so much common code that there is no need
to keep them separate, so start combining them for the sake of
locality, branch-predictor effectiveness, and code size. The CPU
backend is also usually busy while these functions execute, so
suffering a misprediction isn't as painful (especially seeing as
we can potentially improve prediction of the indirect branch).
2015-01-02 22:17:41 -05:00
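One way to read the merge described in this commit: two operations that differ only in a final step share a single body, and a selector picks the variant. The sketch below is a minimal illustration under that assumption, using an AND/NAND pair in plain scalar C. The names `and_nand_lane` and `and_nand_vector` are hypothetical and are not taken from the repository.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical merged helper: one body serves both AND and NAND.
   invert_mask is 0x0000 for AND and 0xFFFF for NAND, so the variant is
   chosen with an XOR rather than a second function. */
uint16_t and_nand_lane(uint16_t a, uint16_t b, uint16_t invert_mask) {
  return (uint16_t)((a & b) ^ invert_mask);
}

/* Apply the merged helper across eight 16-bit lanes. */
void and_nand_vector(uint16_t dst[8], const uint16_t a[8],
                     const uint16_t b[8], int is_nand) {
  assert(dst != NULL && a != NULL && b != NULL);
  uint16_t invert_mask = is_nand ? 0xFFFF : 0x0000;
  for (int i = 0; i < 8; i++)
    dst[i] = and_nand_lane(a[i], b[i], invert_mask);
}
```

Both variants now execute the same instructions from the same place, which is the locality and code-size argument the commit message makes.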
Tyler Stachecki
03663a68f6
Add an implementation for VMACU.
2015-01-02 20:52:39 -05:00
Tyler Stachecki
7a6ecabcc1
Fix a series of RSP bugs that krom pointed out.
2015-01-01 21:09:08 -05:00
Tyler Stachecki
878521f54b
Add register-caching version of VCH.
Thanks go out to AIO for rounding out this commit with
his optimized SSE2 variant.
2014-12-31 08:51:40 -05:00
Tyler Stachecki
31577f57e6
Enable register-caching on MinGW.
Use a prelude to get around Microsoft's stupid calling convention.
2014-12-30 11:37:08 -05:00
Tyler Stachecki
86967b828e
Actually enable the register caching...
And fix a lot of bugs introduced with a regex.
2014-12-27 17:32:57 -05:00
Tyler Stachecki
fb3c395277
Implement register-caching version of VLT.
2014-12-27 16:35:20 -05:00
Tyler Stachecki
ed1e354c68
Change RSP calling convention.
pblendvb needs the mask in %xmm0, so change the calling convention
around just enough so we can cut out a movdqa from most instructions.
2014-12-27 15:46:33 -05:00
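For context, the legacy (non-VEX) encoding of pblendvb reads its mask implicitly from %xmm0 and selects each byte by the top bit of the matching mask byte. A minimal sketch of that select in portable C, not intrinsics, follows. The name `blend_bytes` is hypothetical and this is not the emulator's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-wise select modeled on pblendvb: where the top bit of mask[i]
   is set, take b[i]; otherwise keep a[i]. */
void blend_bytes(uint8_t dst[16], const uint8_t a[16],
                 const uint8_t b[16], const uint8_t mask[16]) {
  assert(dst != NULL && a != NULL && b != NULL && mask != NULL);
  for (int i = 0; i < 16; i++)
    dst[i] = (mask[i] & 0x80) ? b[i] : a[i];
}
```

Because the hardware instruction fixes the mask register, a calling convention that already delivers the mask in %xmm0 avoids a register-to-register copy before each blend.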
Tyler Stachecki
c979744c1a
Implement register-caching versions of VGE.
2014-12-27 13:16:15 -05:00
Tyler Stachecki
8915df71d8
Implement register-caching versions of VEQ/VNE.
2014-12-27 12:19:31 -05:00
Tyler Stachecki
ef997fe107
Prepare to register-cache RSP flags.
2014-12-27 10:55:31 -05:00
Tyler Stachecki
41eba75bc7
Register-caching variations of bitwise functions.
2014-12-27 10:13:53 -05:00
Tyler Stachecki
2efca0d94c
Implement register-caching versions of VABS.
2014-12-27 09:45:03 -05:00
Tyler Stachecki
574c85ad37
Add some missing flag clears to VCL.
2014-12-26 14:19:46 -05:00
Tyler Stachecki
b33f2800ae
Add implementation for MFC2.
2014-12-26 14:19:45 -05:00
Tyler Stachecki
824131db6b
Use a union for RSP vectors to force alignment.
2014-12-26 14:19:45 -05:00
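A minimal sketch of the union technique named in this commit: placing a 16-byte-aligned member beside the scalar lane array raises the alignment of the whole union. The type and function names (`rsp_vect_sketch_t`, `wide_sketch_t`, `rsp_vect_is_aligned`) are hypothetical, and the non-SSE2 fallback is an assumption added so the sketch builds on other targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i wide_sketch_t;  /* 16-byte aligned SSE2 type */
#else
/* Assumed fallback: a plain struct with an explicit 16-byte alignment. */
typedef struct { _Alignas(16) uint8_t bytes[16]; } wide_sketch_t;
#endif

typedef union {
  uint16_t      e[8];  /* scalar view: eight 16-bit lanes */
  wide_sketch_t v;     /* wide view: raises the union's alignment to 16 */
} rsp_vect_sketch_t;

_Static_assert(_Alignof(rsp_vect_sketch_t) >= 16, "vector must be 16-byte aligned");
_Static_assert(sizeof(rsp_vect_sketch_t) == 16, "vector must be 16 bytes");

/* Returns nonzero when p sits on a 16-byte boundary. */
int rsp_vect_is_aligned(const rsp_vect_sketch_t *p) {
  assert(p != NULL);
  return ((uintptr_t)p & 15u) == 0;
}
```

Every object of the union type, including array elements and struct members, then satisfies the alignment that aligned vector loads and stores require.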
Tyler Stachecki
f1929a056c
Commit AIO's VMACF implementation.
2014-12-24 15:18:59 -05:00
Tyler Stachecki
ab8dde80e9
Add AIO's implementation for VMULU.
2014-12-23 01:10:15 -05:00
Tyler Stachecki
3f2329be5b
Fix a bug in VRCP/VRSQ precision selection.
2014-12-22 21:06:17 -05:00
Tyler Stachecki
e52e031ce3
Add implementations for VRSQ, VRSQL, and VRSQH.
2014-12-22 20:47:48 -05:00
Tyler Stachecki
4b6904240e
Add implementations for VRCP, VRCPL, and VRCPH.
2014-12-22 20:29:16 -05:00
Tyler Stachecki
73709f4c45
Add implementation for VCR.
2014-12-22 13:01:03 -05:00
Tyler Stachecki
88310a8104
Add AIO's implementation for VMULF.
2014-12-22 09:50:29 -05:00
Tyler Stachecki
f268795da5
Add implementation for VMRG.
2014-12-21 15:49:44 -05:00
Tyler Stachecki
9f4664a4b6
Add implementation for VADDC.
2014-12-21 15:29:16 -05:00
Tyler Stachecki
a955bf1e2c
Add implementation for VSUBC.
2014-12-21 15:07:00 -05:00
Tyler Stachecki
f199c7bac8
Add implementation for VABS.
2014-12-21 12:59:36 -05:00
Tyler Stachecki
de5b5b0f96
Commit AIO's VSUB optimizations, fix carry/borrow issue.
2014-12-21 12:55:38 -05:00
Tyler Stachecki
0be40f4358
Add implementations for VGE and VLT.
2014-12-21 11:08:00 -05:00
Tyler Stachecki
dc50279609
Add implementations for VEQ and VNE.
2014-12-21 10:39:10 -05:00
Tyler Stachecki
e1de6cd92d
Add implementation for VCH.
2014-12-21 09:29:58 -05:00
Tyler Stachecki
145141225e
Add implementations for VCL and CFC2.
2014-12-20 12:27:38 -05:00
Tyler Stachecki
c72f2c5028
Fix RSP alignment issues once and for all.
2014-12-19 20:03:03 -05:00
Tyler Stachecki
0a9b8c2367
Make read_acc_* return a value.
Instead of writing through a pointer, just return the value.
Thank you, Jared, for pointing out my stupidity.
2014-11-13 19:54:33 -05:00
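The change described here can be sketched as the difference between an out-parameter and a return value. The accumulator layout and both function names below are hypothetical; the real read_acc_* signatures are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accumulator slice: eight 16-bit lanes. */
typedef struct { uint16_t lo[8]; } acc_sketch_t;

/* Before (assumed shape): the result is written through a pointer,
   so the caller must provide storage for it. */
void read_acc_lo_ptr(const acc_sketch_t *acc, unsigned lane, uint16_t *out) {
  assert(acc != NULL && out != NULL);
  *out = acc->lo[lane & 7];
}

/* After (assumed shape): the result is simply returned. */
uint16_t read_acc_lo_val(const acc_sketch_t *acc, unsigned lane) {
  assert(acc != NULL);
  return acc->lo[lane & 7];
}
```

Returning the value gives the compiler more freedom to keep the result in a register, and it lets the call sit directly inside an expression.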
Tyler Stachecki
3a24a67f1f
Fix poor SSE2-based RSP performance.
2014-11-10 11:02:57 -05:00
Tyler Stachecki
380577dfe3
Remove an assertion from VSAR...
Thanks to krom for reverse engineering this!
2014-11-08 21:27:45 -05:00
Tyler Stachecki
9f8a9f9d62
Add implementations of VMADH and VMUDH.
2014-11-08 14:01:41 -05:00
Tyler Stachecki
007d72eda1
Add implementations of VMADL and VMADM.
2014-11-08 12:21:06 -05:00
Tyler Stachecki
e89f054674
Optimize extremely aggressively.
Tell GCC to optimize cold functions for size and stash them away in
a separate part of the binary. Put the simulation core, meanwhile,
on the hot path. Also, bump optimization to -O3 as we can now
"afford" to do so.
2014-11-05 08:39:47 -05:00
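GCC exposes this kind of hot/cold split through function attributes. The sketch below is a minimal illustration, not the project's build setup: the macro names and both functions are hypothetical, and the attributes compile to nothing on compilers that do not define `__GNUC__`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__GNUC__)
#define SKETCH_COLD __attribute__((cold, noinline))
#define SKETCH_HOT  __attribute__((hot))
#else
#define SKETCH_COLD
#define SKETCH_HOT
#endif

/* Rarely taken error path: marked cold so the compiler can optimize it
   for size and group it away from frequently executed code. */
SKETCH_COLD int report_fault(int code) {
  fprintf(stderr, "fault: bad op %d\n", code);
  return -1;
}

/* Frequently executed loop: marked hot. Sums the ops, bailing out to the
   cold path on the first negative entry. */
SKETCH_HOT int run_cycles(const int *ops, int n) {
  assert(ops != NULL);
  int sum = 0;
  for (int i = 0; i < n; i++) {
    if (ops[i] < 0)
      return report_fault(ops[i]);
    sum += ops[i];
  }
  return sum;
}
```

Keeping the cold path out of line is what makes a higher optimization level affordable for the hot loop, since aggressive inlining and unrolling then apply only where they pay off.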
Tyler Stachecki
b668296589
Add implementations of VADD and VSUB.
2014-11-03 18:06:32 -05:00
Tyler Stachecki
083ad75286
arch/x86_64: Cache RSP accumulator regs in host CPU.
2014-11-03 16:48:38 -05:00
Tyler Stachecki
d4a8f82b10
Change the RSP vector calling convention.
2014-11-02 22:45:33 -05:00
Tyler Stachecki
b5ff809881
Add an implementation of VMADN.
2014-11-02 22:31:58 -05:00
Tyler Stachecki
bf197cf3bd
Implement VMUDL, VMUDM, VMUDN.
2014-11-02 12:44:19 -05:00
Tyler Stachecki
c4612418ed
Implement VINV, fixup INV.
2014-11-02 11:57:26 -05:00
Tyler Stachecki
e63b13605e
Various LWC2/SWC2 fixes, add VSAR.
2014-10-24 21:07:25 -04:00