43 commits

Author SHA1 Message Date
LegendOfDragoon
e5a215c547 Fix typo in tcmask 2016-11-04 22:49:49 -07:00
Tyler J. Stachecki
72ce1e0804 rdp: [AIO]: More LUT optimizations. 2016-07-18 18:49:14 -04:00
Tyler J. Stachecki
45b5e1cbc1 rdp: [AIO]: Reduce size of LUTs.
Some of the LUTs used unnecessarily large field sizes.
2016-07-18 18:10:23 -04:00
Tyler Stachecki
18ff341415 rdp: Fix the frameskipping problem.
Don't let the RDP get too far ahead of the other cores or
it causes lots of frameskipping issues. Unfortunately, this
also hurts performance, but such is life.
2016-07-18 01:48:57 -04:00
Tyler J. Stachecki
4b34b15e8d rdp: Lots of refactoring, optimizations.
  * Don't use global data to store RDP command list, etc.

  * Reduce the amount of time the RDP holds the lock at
    any given time. This should prevent the RCP thread
    from getting held up at the RCP/VR4300 sync window.

  * Thread safety: Hold RDP mutex when reading RDP regs.

  * Fix early DP status update (we should hold pipe busy,
    cmd busy, etc. until we're really done with the entire
    command list; not the moment we start processing it!)

  * Inline more functions.
2016-07-17 17:59:18 -04:00
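
The locking points in 4b34b15e8d can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `struct rdp_state`, `rdp_read_status`, `rdp_run_list`, `MAX_CMDS`, and the `STATUS_BUSY` bit are all hypothetical names, and summing the command words stands in for the real rendering work. The idea shown is to copy the command list out under the mutex, do the work without holding the lock, and clear the busy bit only once the whole list is finished.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CMDS 64
#define STATUS_BUSY 0x1u /* Illustrative bit, not the real DP status layout. */

/* Hypothetical per-device state (no globals). */
struct rdp_state {
  pthread_mutex_t lock;
  uint32_t status;
  uint64_t pending[MAX_CMDS];
  size_t num_pending; /* Assumed to never exceed MAX_CMDS. */
};

/* Register reads from other threads take the mutex. */
static uint32_t rdp_read_status(struct rdp_state *rdp) {
  uint32_t status;

  pthread_mutex_lock(&rdp->lock);
  status = rdp->status;
  pthread_mutex_unlock(&rdp->lock);
  return status;
}

/* Processes the pending list; returns a stand-in "work" value. */
static uint64_t rdp_run_list(struct rdp_state *rdp) {
  uint64_t local[MAX_CMDS];
  uint64_t work = 0;
  size_t i, n;

  /* Short critical section: snapshot the list and mark busy. */
  pthread_mutex_lock(&rdp->lock);
  n = rdp->num_pending;
  memcpy(local, rdp->pending, n * sizeof(local[0]));
  rdp->num_pending = 0;
  rdp->status |= STATUS_BUSY;
  pthread_mutex_unlock(&rdp->lock);

  /* Long-running work happens without the lock held. */
  for (i = 0; i < n; i++)
    work += local[i];

  /* Busy is dropped only after the entire list is done. */
  pthread_mutex_lock(&rdp->lock);
  rdp->status &= ~STATUS_BUSY;
  pthread_mutex_unlock(&rdp->lock);
  return work;
}
```

Keeping the critical sections this short is what would let another thread read the status register without waiting for rendering to finish.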
Tyler J. Stachecki
4b662eb234 rdp: Fix a regression introduced recently. 2016-07-16 12:28:29 -04:00
Tyler J. Stachecki
186fb254ea rdp: [izy] Optimize rgbaz_correct_clip and fbfills.
izy found some clever techniques to optimize the branch-y
part of rgbaz_correct_clip and the inner loop of fbfills.
2016-07-16 11:55:13 -04:00
Tyler J. Stachecki
8d31a56b91 rdp: Make RDP multithreaded.
CEN64 with -multithread now scales up to three threads.
This commit is very unoptimized, but still offers higher
VI/s than the single-threaded RDP.

Many things that were previously VI/s limited such as
Mario Tennis (in game), Vigilante 8, Goldeneye, etc.
will now run at 60 VI/s at least on an i7. More to come
in the future.
2016-07-13 18:20:21 -04:00
Tyler J. Stachecki
d74e9f4a7a rdp: Implement AIO's fbfill optimization.
AIO mentioned that fbfills could be sped up *multiple* times.

He wasn't lying.
2016-07-11 09:47:21 -04:00
Tyler Stachecki
9d9fc68796 rdp: Store hidden_bits along with RDRAM. 2016-07-10 21:34:04 -04:00
Tyler Stachecki
5839caf55d rdp: Use branch weights for unlikely events. 2016-07-10 20:19:31 -04:00
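
For readers unfamiliar with branch weights, the general technique named in 5839caf55d looks something like the sketch below. It relies on `__builtin_expect`, a GCC/Clang builtin. The `unlikely` macro and the `clamp_span_length` helper are hypothetical and are not taken from the repository.

```c
#include <stdint.h>

/* Hint that a condition is rarely true. Falls back to a plain
 * expression on compilers without __builtin_expect. */
#if defined(__GNUC__) || defined(__clang__)
#define unlikely(cond) __builtin_expect(!!(cond), 0)
#else
#define unlikely(cond) (cond)
#endif

/* Hypothetical helper: clamp a span length to a maximum. Overlong
 * spans are assumed to be rare, so that path is marked unlikely and
 * the compiler can lay out the common path as the fall-through. */
static uint32_t clamp_span_length(uint32_t length, uint32_t max_length) {
  if (unlikely(length > max_length))
    return max_length;

  return length;
}
```

The hint changes only code layout and prediction heuristics; the function returns the same values with or without it.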
Tyler Stachecki
becbff4cb1 rdp: Optimize fbread and fbwrite.
  * Kill fb_format global.
  * Reduce indirect branch targets (fold fb_read ptrs).
  * Reduce branching inside fbread/fbwrite functions.
2016-07-10 20:06:31 -04:00
Tyler J. Stachecki
9d5dbf564a rdp: Devirtualize the dither noise functions. 2016-07-10 17:06:05 -04:00
Tyler J. Stachecki
f30b2f8bff rdp: Devirtualize the dither function. 2016-07-10 17:06:01 -04:00
Tyler J. Stachecki
ddb390af27 rdp: devirtualize and optimize tcdiv. 2016-07-10 17:05:56 -04:00
Tyler J. Stachecki
66f44cb8a4 Fix segfaults caused by 25493f.
RDP commands can be 64-bit, so we can't guarantee 128-bit
alignment for loading ewdata into XMM registers.
2016-07-10 17:05:37 -04:00
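
The alignment issue described in 66f44cb8a4 can be illustrated with SSE2 intrinsics. This is a sketch under assumptions, not the actual fix: `load_ewdata` and `first_ewdata_word` are hypothetical names. `_mm_load_si128` requires a 16-byte-aligned address and may fault otherwise, while `_mm_loadu_si128` accepts any address, so it is the safe choice when the data is only guaranteed to be 8-byte aligned.

```c
#include <emmintrin.h> /* SSE2 intrinsics (x86/x86-64 only) */
#include <stdint.h>

/* Hypothetical helper: load four 32-bit command words into an XMM
 * register. The pointer may be only 8-byte aligned, so an unaligned
 * load is used; _mm_load_si128 could fault on such an address. */
static inline __m128i load_ewdata(const uint32_t *ewdata) {
  return _mm_loadu_si128((const __m128i *) ewdata);
}

/* Hypothetical helper: return the lowest 32-bit lane of the load,
 * which is the first word in memory on little-endian x86. */
static inline uint32_t first_ewdata_word(const uint32_t *ewdata) {
  return (uint32_t) _mm_cvtsi128_si32(load_ewdata(ewdata));
}
```

For example, if a command buffer starts on a 16-byte boundary, a command that begins two words (8 bytes) into it is not 16-byte aligned, yet `first_ewdata_word(buf + 2)` still reads it safely.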
Tyler J. Stachecki
cff7c9c0f3 Optimize some texel fetching operations. 2016-07-10 17:05:20 -04:00
Tyler J. Stachecki
2bdabf1798 Optimize prev color in texture_pipeline_cycle.
TODO: Verify correctness later, LGTM though...
2016-07-10 17:05:15 -04:00
Tyler J. Stachecki
333d5d5c98 Kill redundant code by folding common functions. 2016-07-10 17:05:11 -04:00
Tyler J. Stachecki
2cf82984cf Small rgbaz_correct_clip optimization.
Pick up a small optimization that the compiler missed.
This helps cut down on the otherwise branch-y switch
code.
2016-07-10 17:05:03 -04:00
Tyler J. Stachecki
a89e80fad7 Small edgewalker_for_prims optimization. 2016-07-10 17:04:57 -04:00
Tyler Stachecki
bb100347b9 Fix a typo in texture_pipeline_cycle optimization. 2016-07-10 17:02:59 -04:00
Tyler J. Stachecki
2d8c954139 Avoid spilling to stack in texture_pipeline_cycle. 2016-07-10 17:02:24 -04:00
Tyler J. Stachecki
7a5f80e4c6 More optimizations to texture_pipeline_cycle. 2016-07-10 17:02:18 -04:00
Tyler J. Stachecki
9322cc5d8f Minor optimization to fetch_texel_quadro_rgba16. 2016-07-10 17:02:12 -04:00
Tyler J. Stachecki
de4cafa1c5 Use 16-bit for COLOR storage.
Using 16-bit will allow us to better accelerate some of the
texel functions. This commit in and of itself doesn't really
change much.
2016-07-10 17:02:07 -04:00
Tyler Stachecki
22c87ca481 RDP: Force GCC to inline just about everything. 2016-07-10 17:02:02 -04:00
Tyler Stachecki
79286f49a7 Switch over to SSE-ified tcdiv_persp fns. 2016-07-10 17:01:54 -04:00
Tyler Stachecki
355991cfd9 More vectorization, eliminate some more globals. 2016-07-10 17:01:49 -04:00
Tyler Stachecki
74e0d391c0 Start vectorizing more of the render spans. 2016-07-10 17:01:43 -04:00
Tyler Stachecki
97703468a2 Pass some spans via registers instead of memory. 2016-07-10 17:01:38 -04:00
Tyler Stachecki
88784539b4 Add AIO's vectorized rgbaz_correct_clip algorithm. 2016-07-10 17:01:30 -04:00
Tyler Stachecki
d1694a2040 Optimize fetch_texel_quadro_rgba16. 2016-07-10 17:01:24 -04:00
Tyler Stachecki
d2bfebdfdf Vectorize some parts of spans rendering. 2016-07-10 17:01:19 -04:00
Tyler Stachecki
a65c44105c Vectorize the prologue of edgewalker_for_prims. 2016-07-10 17:01:12 -04:00
Tyler Stachecki
a8befe516c Vectorize one case of quadro texel access (RGBA16). 2016-07-10 17:00:49 -04:00
Tyler J. Stachecki
9a613894e3 Vectorize part of texture_pipeline_cycle.
AIO suggested vectorizing texture_pipeline_cycle for a
speed improvement. It seems to have worked quite nicely.
2016-07-10 17:00:37 -04:00
Tyler J. Stachecki
3288229a50 Start fixing MSVC builds.
Conflicts:
	rdp/n64video.c
2016-06-26 17:19:17 -04:00
Tyler J. Stachecki
8415caf9ad RDP fixes. Wonder how long these have been there? 2016-06-26 12:20:29 -04:00
Tyler Stachecki
0c41dc33dc Upgrade to angrylion r98.
r97 was VI filter related and r98 just adds MSVC solution files,
so this commit is really just for posterity more than anything.
2016-06-17 05:31:19 -04:00
Tyler J. Stachecki
cc07dad626 angrylion-rdp: Upgrade from r83 to r96.
Patch courtesy of Snowstorm64; thank you.

Conflicts:
	rdp/n64video.c
2016-06-17 05:11:27 -04:00
Tyler Stachecki
8ce013b165 Fix RDP RDRAM access range on Linux. 2016-06-17 05:07:25 -04:00
Tyler Stachecki
154343bdea Commit latest fork of angrylion/MAME RDP.
Conflicts:
	device/device.c
2016-06-17 05:06:34 -04:00