pseudophpt
3b879e32d7
Remove debug information from VMOV code
2018-09-03 22:54:57 -04:00
pseudophpt
643e4a028a
Add shuffling to VMOV instruction
2018-09-03 22:52:18 -04:00
Tyler J. Stachecki
156d592abb
rsp: Bugfix for SSE2 RSP.
...
Thanks to Tiny Tiger and AIO for helping to point this out.
One of the arguments was being overwritten before it was
used, which caused an issue with the SSE2 codepath (while
the SSE4.1 one was fine).
2016-08-06 20:53:04 -04:00
Tyler Stachecki
b808fe50e0
rsp: Qualify shuffle arrays as static.
2016-07-09 20:00:33 -04:00
Tyler Stachecki
55e64a6c27
rsp: Fix LRV bug (data shifting problem).
...
tl;dr: Using LUTs to shift and byteswap all in one x86
instruction is awesome for performance, but makes things
absolutely horrendous to debug.
With this commit, audio mixing on the RSP works properly.
2016-07-09 17:49:03 -04:00
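
The LUT technique mentioned in the commit above can be illustrated with a scalar model. This is only a sketch, not the code from this repository: `shuffle_bytes` mimics the semantics of the x86 PSHUFB byte shuffle, and `build_key` is a hypothetical helper that produces one lookup-table entry combining a byte shift with a 16-bit byteswap.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of the PSHUFB byte shuffle: a key byte with its high bit
 * set produces zero; otherwise its low four bits select a source byte. */
static void shuffle_bytes(uint8_t dst[16], const uint8_t src[16],
                          const uint8_t key[16]) {
  uint8_t tmp[16];

  for (int i = 0; i < 16; i++)
    tmp[i] = (key[i] & 0x80) ? 0 : src[key[i] & 0x0F];

  memcpy(dst, tmp, sizeof(tmp));
}

/* Hypothetical key builder (an assumption for illustration): move the data
 * `shift` bytes toward higher indices with zero fill, then swap the two
 * bytes of every 16-bit lane. Both steps end up in a single key. */
static void build_key(uint8_t key[16], unsigned shift) {
  for (unsigned i = 0; i < 16; i++) {
    unsigned swapped = i ^ 1; /* partner byte within the 16-bit lane */

    key[i] = (swapped >= shift) ? (uint8_t)(swapped - shift) : 0x80;
  }
}
```

Because the shift and the swap are folded into one table entry, an off-by-one in the table shows up as scrambled data rather than an obvious failure, which is likely the kind of debugging pain the commit message describes.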
Tyler Stachecki
1e20e171a8
rsp: Fix LPV bug (more endianness issues).
2016-07-09 16:17:36 -04:00
Tyler J. Stachecki
c85def363c
Fix a bug in the recent CTC2 impl.
2016-06-30 10:05:11 -04:00
Tyler Stachecki
91b18f2644
rsp: Implement CTC2.
2016-06-29 21:38:25 -04:00
Mike Ryan
73f4420a4c
fix all build warnings, does not affect functionality
2016-06-16 20:40:51 -07:00
Tyler J. Stachecki
3003d774cb
Improved SSE2 vector shuffle patch from izy.
2016-02-06 14:26:47 -05:00
Tyler J. Stachecki
4e0620c637
rsp.c patch from izy.
2016-02-03 22:30:54 -05:00
Derek "Turtle" Roe
8b89df2fdc
See long description
...
Replaced all references to simulation with emulation.
Updated copyright year.
Updated .gitignore to reduce chances of random files being uploaded to the repo.
Added .gitattributes to normalize all text files, and to ignore binary files (which includes the logo and the NEC PDF).
2015-07-01 18:44:21 -05:00
Tyler Stachecki
d177288d7b
Fix SSE2 endian issue in the RSP ldst functions.
2015-01-28 22:41:16 -05:00
Tyler Stachecki
1ba67eec9d
Alignment/size optimizations.
2015-01-28 22:41:07 -05:00
Tyler Stachecki
ca0b0c944d
Vectorize/inline/optimize CFC2.
2015-01-27 10:28:36 -05:00
Tyler Stachecki
88a3ea5646
Add (unoptimized) SSE2 support.
2015-01-07 17:37:24 -05:00
John Paul Adrian Glaubitz
5e16526958
Fix name mismatches of 'srcp' parameter in rsp_vect_load_and_shuffle_operand.
...
Signed-off-by: Tyler Stachecki <tstache1@binghamton.edu>
2015-01-07 09:41:43 -05:00
Tyler Stachecki
ec3748f0c2
Trim off a few hundred bytes of code.
2015-01-05 22:59:52 -05:00
Tyler Stachecki
2697ba9445
Merge more functions together.
2015-01-02 23:51:53 -05:00
Tyler Stachecki
d8f60c4afa
Merge a handful of the vector compares.
2015-01-02 23:51:40 -05:00
Tyler Stachecki
1c8f871df8
Start merging RSP vector functions.
...
No need to separate all these functions when they contain so
much common code, so start combining things for the sake of
locality and predictor effectiveness (and size). In addition
to these benefits, the CPU backend is usually busy during the
execution of these functions, so suffering a misprediction
isn't as painful (especially seeing as we can potentially
improve the prediction from the indirect branch).
2015-01-02 22:21:32 -05:00
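
As a rough illustration of the merging idea described in the commit above, two near-identical vector compares can share one body and differ only in a selector. The names `cmp_op` and `vec_compare` are hypothetical and the loop is scalar; the actual RSP routines are not shown in this log.

```c
#include <stdint.h>

/* Hypothetical selector for the merged routine. */
enum cmp_op { CMP_EQ, CMP_NE };

/* One function serves both compares: the shared work (the per-lane
 * equality test) is done once, and the selector only flips the result. */
static void vec_compare(uint16_t dst[8], const uint16_t a[8],
                        const uint16_t b[8], enum cmp_op op) {
  uint16_t invert = (op == CMP_NE) ? 0xFFFF : 0x0000;

  for (int i = 0; i < 8; i++) {
    uint16_t eq = (a[i] == b[i]) ? 0xFFFF : 0x0000;

    dst[i] = (uint16_t)(eq ^ invert);
  }
}
```

The selector is applied as a mask rather than as a branch inside the loop, so the merged body stays straight-line code.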
Tyler Stachecki
d50450e624
Disable register caching for now.
...
Until we can work around system libraries stomping over the
registers we want to reserve, just disable register caching for
the time being.
2015-01-02 21:05:07 -05:00
Tyler Stachecki
c1f1998c78
Add an implementation for VMACU.
2015-01-02 21:04:44 -05:00
Tyler Stachecki
b55940f139
Fix VMACF accumulation issues and lighting problems.
2015-01-02 21:04:33 -05:00
Tyler Stachecki
9ad566c658
Hacky fix to patch register caching.
...
On Windows, acc_lo (%xmm5) was clashing with the x64 calling
convention, which states %xmm5 is a volatile register and is
the caller's responsibility to save. We need the register
preserved across calls, so until we have a better solution to
the problem, pick registers that are not volatile according to
the calling convention.
2015-01-02 15:39:21 -05:00
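
For reference, the register split the commit above relies on can be written down as a small predicate. Under the Microsoft x64 calling convention, XMM0 through XMM5 are volatile and XMM6 through XMM15 are nonvolatile. The function name is a hypothetical one chosen for this sketch.

```c
#include <stdbool.h>

/* Microsoft x64 calling convention: XMM0-XMM5 are volatile (the caller
 * must save them if needed); XMM6-XMM15 are nonvolatile (a callee must
 * preserve them). A value cached across calls needs a nonvolatile one. */
static bool win64_xmm_is_nonvolatile(unsigned reg) {
  return reg >= 6 && reg <= 15;
}
```

With this split, `win64_xmm_is_nonvolatile(5)` is false, which matches the clash reported for acc_lo in %xmm5.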
Tyler Stachecki
33fd6a394d
Fix a CFC2/VCE error that produced the wrong mask.
2015-01-01 23:15:45 -05:00
Tyler Stachecki
fbd0a646f6
Fix potential undefined behaviour issues.
2015-01-01 23:15:35 -05:00
Tyler Stachecki
1a7611b6dc
Commit some MSVC-specific workarounds.
2015-01-01 10:47:25 -05:00
Tyler Stachecki
eba6ce1420
Fix VLT clipping bugs.
...
Thank you, AIO, for pointing this out.
2015-01-01 10:47:07 -05:00
Tyler Stachecki
62eacc11a4
Clean up the recently committed VCH.
...
We should refer to %xmm5 as acc_lo.
2015-01-01 10:46:53 -05:00
Tyler Stachecki
e100147379
Add register-caching version of VCH.
...
Thanks go out to AIO for rounding out this commit with
his optimized SSE2 variant.
2015-01-01 10:46:41 -05:00
Tyler Stachecki
70efd3de4a
Fix a typo that broke some builds.
2015-01-01 10:46:34 -05:00
Tyler Stachecki
3e094c8985
Convert AIO's VABS optimization to AVX.
2015-01-01 10:46:28 -05:00
Tyler Stachecki
52afe866d4
Fix a mask typo in the last commit.
2015-01-01 10:46:22 -05:00
Tyler Stachecki
bf30cf29fd
Fix a buggy accumulator clamp algorithm.
2015-01-01 10:46:16 -05:00
Tyler Stachecki
5e313634d3
Enable register-caching on MinGW.
...
Use a prelude to get around Microsoft's stupid calling convention.
2015-01-01 10:46:10 -05:00
Tyler Stachecki
8cb3c319f9
Commit AIO's VLT optimizations.
2015-01-01 10:45:57 -05:00
Tyler Stachecki
d9b9171f92
Work in AIO's optimizations for VABS.
2015-01-01 10:45:52 -05:00
Tyler Stachecki
d9b19d3f32
Move around and patch bugs in new functions.
2015-01-01 10:45:38 -05:00
Tyler Stachecki
b54f9618df
Prevent register-caching on MinGW.
...
Since Microsoft decided to totally bork their x86_64 calling
convention, defer all Windows builds to non-optimized RSP
routines. When MinGW supports __vectorcall, this change can
be reverted.
2015-01-01 10:45:31 -05:00
Tyler Stachecki
5f10b427e1
Add support for PE/COFF executable formats.
2015-01-01 10:45:22 -05:00
Tyler Stachecki
26d65b2ebe
Optimize register-caching version of VMRG.
2015-01-01 10:45:07 -05:00
Tyler Stachecki
cc785f9f5b
Only use VEX-encoded SSE where it helps us.
...
Otherwise, stick to the "legacy" SSE instructions, as they're
smaller and we don't use the upper halves of AVX registers
anyway.
2015-01-01 10:45:01 -05:00
Tyler Stachecki
84cc9c93cb
Fix register-caching version of VABS.
2015-01-01 10:44:54 -05:00
Tyler Stachecki
94ad149a12
Actually enable the register caching...
...
And fix a lot of bugs introduced with a regex.
2015-01-01 10:44:47 -05:00
Tyler Stachecki
7bc95ee3ee
Implement register-caching version of VLT.
2015-01-01 10:44:40 -05:00
Tyler Stachecki
9b941eced8
Change RSP calling convention.
...
pblendvb needs the mask in %xmm0, so change the calling convention
around just enough so we can cut out a movdqa from most instructions.
2015-01-01 10:44:34 -05:00
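
The constraint behind the commit above comes from the legacy (non-VEX) SSE4.1 encoding of pblendvb, which takes its mask implicitly from %xmm0. A scalar model of the blend itself, offered only as a sketch with a hypothetical function name, looks like this:

```c
#include <stdint.h>

/* Scalar model of PBLENDVB: for each byte, the high bit of the mask
 * selects the byte from b; otherwise the byte from a is kept. */
static void blend_bytes(uint8_t dst[16], const uint8_t a[16],
                        const uint8_t b[16], const uint8_t mask[16]) {
  for (int i = 0; i < 16; i++)
    dst[i] = (mask[i] & 0x80) ? b[i] : a[i];
}
```

If the mask is already in %xmm0 when a routine is entered, no extra movdqa is needed to place it there before the blend.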
Tyler Stachecki
ddb3c893e3
Implement register-caching version of VMRG.
2015-01-01 10:44:23 -05:00
Tyler Stachecki
4aabd7f49e
Minor tweaks to VEQ/VNE register-cached versions.
2015-01-01 10:44:16 -05:00
Tyler Stachecki
e810689fde
Implement register-caching versions of VGE.
2015-01-01 10:44:09 -05:00