Tyler J. Stachecki
c0d3be2561
rsp_veq_vge_vlt_vne optimization patch from izy.
...
mm: Might be better to fold these indirect branches back
into the RSP vector function table if we're going to ask
the compiler to generate a jump table here.
2016-02-06 14:00:49 -05:00
Tyler J. Stachecki
5e3c85df8d
Improved SSE2 vector shuffle patch from izy.
2016-02-06 13:58:04 -05:00
Tyler J. Stachecki
d62386cdd2
rsp_uclamp_acc optimization patch from izy.
...
mm: Not 100% sure on the correctness of this one;
will check and cherry-pick/edit this commit later
if that ends up not being the case.
2016-02-06 13:57:30 -05:00
Tyler J. Stachecki
b2341ce0e0
rsp.c patch from izy.
2016-02-03 22:24:33 -05:00
Derek "Turtle" Roe
c4afd44ed7
See long description
...
Replaced all references to simulation with emulation
Updated copyright year
Updated .gitignore to reduce chances of random files being uploaded to
the repo
Added .gitattributes to normalize all text files, and to ignore binary
files (which includes the logo and the NEC PDF)
2015-07-03 08:18:16 -04:00
Tyler J. Stachecki
65a9adddf2
Fix a serious typo.
2015-05-23 23:00:30 -04:00
Tyler J. Stachecki
40d7c43914
VR4300: Minor optimization.
2015-05-20 20:59:07 -04:00
Tyler J. Stachecki
b05f2b2c6a
Add a function to save/restore hostregs.
2015-03-07 11:37:51 -05:00
Tyler Stachecki
ec126dfc85
Fix SSE2 endian issue in the RSP ldst functions.
2015-01-28 22:38:24 -05:00
Tyler Stachecki
b80151069b
Alignment/size optimizations.
2015-01-28 22:16:50 -05:00
Tyler Stachecki
10d32ce427
Optimize FPU operations somewhat.
2015-01-27 10:27:13 -05:00
Tyler Stachecki
9b7a3c5fb5
Vectorize/inline/optimize CFC2.
2015-01-27 10:27:08 -05:00
Tyler Stachecki
ca8e052024
Add (unoptimized) SSE2 support.
2015-01-07 17:32:19 -05:00
John Paul Adrian Glaubitz
08feca5ff1
Fix name mismatches of 'srcp' parameter in rsp_vect_load_and_shuffle_operand.
...
Signed-off-by: Tyler Stachecki <tstache1@binghamton.edu>
2015-01-07 09:43:39 -05:00
Tyler Stachecki
5229996ecd
Trim off a few hundred bytes of code.
2015-01-05 23:00:49 -05:00
Tyler Stachecki
5240b35d45
More cleanup of the fault/TLB code.
2015-01-04 15:37:47 -05:00
Tyler Stachecki
179a81775f
Remove an old (unused) file.
2015-01-03 15:20:56 -05:00
Tyler Stachecki
84d19566b9
Merge more functions together.
2015-01-02 23:51:20 -05:00
Tyler Stachecki
4a40a4db8a
Merge a handful of the vector compares.
2015-01-02 23:03:15 -05:00
Tyler Stachecki
7262516636
Start merging RSP vector functions.
...
No need to separate all these functions when they contain so
much common code, so start combining things for the sake of
locality and predictor effectiveness (and size). In addition
to these benefits, the CPU backend is usually busy during the
execution of these functions, so suffering a misprediction
isn't as painful (especially seeing as we can potentially
improve the prediction from the indirect branch).
2015-01-02 22:17:41 -05:00
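The merging idea described in the commit above can be illustrated with a small, portable sketch. This is not the emulator's code: the names `cmp_op` and `vec_compare` are hypothetical, and the flag-register handling that the real RSP compares perform is left out. It only shows how several compares can share one loop and one select step, so that only the per-lane predicate differs.

```c
#include <stdint.h>

/* Hypothetical selector; these names are not taken from the emulator. */
enum cmp_op { CMP_EQ, CMP_NE, CMP_LT, CMP_GE };

#define LANES 8

/* One merged routine: the loop and the select/mask tail are shared,
 * and only the per-lane predicate depends on the operation.
 * Assumption: RSP flag registers are ignored in this simplified model. */
static void vec_compare(enum cmp_op op, const int16_t *vs, const int16_t *vt,
                        int16_t *vd, uint8_t *mask_out)
{
  uint8_t mask = 0;

  for (int i = 0; i < LANES; i++) {
    int hit;

    switch (op) {
      case CMP_EQ: hit = vs[i] == vt[i]; break;
      case CMP_NE: hit = vs[i] != vt[i]; break;
      case CMP_LT: hit = vs[i] <  vt[i]; break;
      default:     hit = vs[i] >= vt[i]; break; /* CMP_GE */
    }

    /* Shared tail: pick a lane and record the comparison bit. */
    vd[i] = hit ? vs[i] : vt[i];
    mask |= (uint8_t)(hit << i);
  }

  *mask_out = mask;
}
```

With a single entry point, the dispatcher makes one indirect call for the whole group of compares, and the choice of operation becomes a branch inside the shared body.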
Tyler Stachecki
2de77746e7
Disable register caching for now.
...
Until we can work around system libraries stomping over the
registers we want to reserve, just disable register caching for
the time being.
2015-01-02 20:59:09 -05:00
Tyler Stachecki
03663a68f6
Add an implementation for VMACU.
2015-01-02 20:52:39 -05:00
Tyler Stachecki
a3b9e13ac4
Fix VMACF accumulation issues and lighting problems.
2015-01-02 19:47:52 -05:00
Tyler Stachecki
54c79ebc73
Hacky fix to patch register caching.
...
On Windows, acc_lo (%xmm5) was clashing with the x64 calling
convention, which states %xmm5 is a volatile register and is
the caller's responsibility to save. We need the register
preserved across calls, so until we have a better solution to
the problem, pick registers that are not volatile according to
the calling convention.
2015-01-02 15:31:45 -05:00
Tyler Stachecki
b29b33edff
Fix a CFC2/VCE error that produced the wrong mask.
2015-01-01 23:12:07 -05:00
Tyler Stachecki
0ce394bfe8
Fix potential undefined behaviour issues.
2015-01-01 21:57:49 -05:00
Tyler Stachecki
287e3370c5
Commit some MSVC-specific workarounds.
2014-12-31 16:20:53 -05:00
Tyler Stachecki
30f9dce6b5
Fix VLT clipping bugs.
...
Thank you, AIO, for pointing this out.
2014-12-31 16:17:49 -05:00
Tyler Stachecki
d5eb2f2296
Clean up the recently committed VCH.
...
We should refer to %xmm5 as acc_lo.
2014-12-31 10:36:09 -05:00
Tyler Stachecki
878521f54b
Add register-caching version of VCH.
...
Thanks go out to AIO for rounding out this commit with
his optimized SSE2 variant.
2014-12-31 08:51:40 -05:00
Tyler Stachecki
f0c4c90d7a
Fix a typo that broke some builds.
2014-12-30 17:51:06 -05:00
Tyler Stachecki
f80d494723
Convert AIO's VABS optimization to AVX.
2014-12-30 17:49:19 -05:00
Tyler Stachecki
4fdf41cb61
Fix a mask typo in the last commit.
2014-12-30 17:36:06 -05:00
Tyler Stachecki
9804d16330
Fix a buggy accumulator clamp algorithm.
2014-12-30 17:26:35 -05:00
Tyler Stachecki
31577f57e6
Enable register-caching on MinGW.
...
Use a prelude to get around Microsoft's stupid calling convention.
2014-12-30 11:37:08 -05:00
Tyler Stachecki
b36574e5bd
Commit AIO's VLT optimizations.
2014-12-30 10:32:55 -05:00
Tyler Stachecki
84544d9521
Work in AIO's optimizations for VABS.
2014-12-29 17:36:25 -05:00
Tyler Stachecki
2585daa532
Move around and patch bugs in new functions.
2014-12-28 15:18:47 -05:00
Tyler Stachecki
b8fb829e1c
Prevent register-caching on MinGW.
...
Since Microsoft decided to totally bork their x86_64 calling
convention, defer all Windows builds to non-optimized RSP
routines. When MinGW supports __vectorcall, this change can
be reverted.
2014-12-28 13:13:05 -05:00
Tyler Stachecki
dda369c888
Add support for PE/COFF executable formats.
2014-12-28 11:18:36 -05:00
Tyler Stachecki
7e6293684e
Optimize register-caching version of VMRG.
2014-12-28 10:17:23 -05:00
Tyler Stachecki
9077006d24
Only use VEX-encoded SSE where it helps us.
...
Otherwise, stick to the "legacy" SSE instructions as they're
smaller and we don't use the upper halves of AVX registers
anyway.
2014-12-28 10:11:14 -05:00
Tyler Stachecki
d9d9eabf50
Fix register-caching version of VABS.
2014-12-28 10:05:43 -05:00
Tyler Stachecki
86967b828e
Actually enable the register caching...
...
And fix a lot of bugs introduced with a regex.
2014-12-27 17:32:57 -05:00
Tyler Stachecki
fb3c395277
Implement register-caching version of VLT.
2014-12-27 16:35:20 -05:00
Tyler Stachecki
ed1e354c68
Change RSP calling convention.
...
pblendvb needs the mask in %xmm0, so change the calling convention
around just enough so we can cut out a movdqa from most instructions.
2014-12-27 15:46:33 -05:00
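For context on the commit above: the legacy (non-VEX) encoding of pblendvb takes its mask implicitly from %xmm0, so a mask that lives in any other register has to be copied there first. The sketch below is a scalar model of what the blend computes, not the emulator's code. The function name `blend_bytes` is hypothetical.

```c
#include <stdint.h>

/* Scalar model of a 16-byte variable blend: for each byte, take b when
 * the top bit of the corresponding mask byte is set, otherwise keep a.
 * Hypothetical helper for illustration only. */
static void blend_bytes(const uint8_t a[16], const uint8_t b[16],
                        const uint8_t mask[16], uint8_t out[16])
{
  for (int i = 0; i < 16; i++)
    out[i] = (mask[i] & 0x80) ? b[i] : a[i];
}
```

Only the top bit of each mask byte matters, so a compare result of all-ones or all-zeros per lane can be used as the mask directly. Keeping that result in %xmm0 by convention is what makes the extra movdqa unnecessary.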
Tyler Stachecki
d551d2b631
Implement register-caching version of VMRG.
2014-12-27 14:58:09 -05:00
Tyler Stachecki
9c43cf65ac
Minor tweaks to VEQ/VNE register-cached versions.
2014-12-27 14:32:50 -05:00
Tyler Stachecki
c979744c1a
Implement register-caching versions of VGE.
2014-12-27 13:16:15 -05:00
Tyler Stachecki
8915df71d8
Implement register-caching versions of VEQ/VNE.
2014-12-27 12:19:31 -05:00