152 commits

Author SHA1 Message Date
Henrik Rydgård
f22249cef5 Reject zero-vertex-count draws.
I thought all the code was safe against it, but it isn't.
2023-12-29 14:09:45 +01:00
Henrik Rydgård
126d88ecfc Back out clearly inconsequential/useless .reserve() calls 2023-12-29 08:27:56 +01:00
Henrik Rydgård
ac208505a5 Remove bad debug assert 2023-12-21 11:48:52 +01:00
Henrik Rydgård
61acce195c Avoid decoding indices when we don't need them. 2023-12-20 14:25:19 +01:00
Henrik Rydgård
f86189c951 Show vertex decoders separately in profiles 2023-12-19 12:25:54 +01:00
Herman Semenov
0748ce610f [GPU/Common/D3D11/Directx9/GLES/Vulkan] Using reserve if possible 2023-12-15 14:08:22 +03:00
Henrik Rydgård
71aaad23fb Fix issue with zero-vertex draw calls. Though, should maybe just filter them out earlier. 2023-12-10 12:21:07 +01:00
Henrik Rydgård
aca3bbc9a0 DrawEngine: Remove the confusing MaxIndex accessor, replace with directly reading numDecodedVerts_ 2023-12-10 11:58:47 +01:00
Henrik Rydgård
904ce4f7e1 Quickfix regression in Outrun 2023-12-09 18:32:26 +01:00
Henrik Rydgård
7e85d3d10a Disable the new culling on RISC-V for now. 2023-12-09 16:49:02 +01:00
Henrik Rydgård
4e2a1bf81c NEON: vcvtq can scale directly, no need for a mul by const. 2023-12-09 16:48:59 +01:00
Henrik Rydgård
99548be8a3 NEON culling: Use mla operations to shave off some more cycles. ARM32 compat. 2023-12-09 16:36:01 +01:00
Henrik Rydgård
6a7ef83f4b NEON-optimize the culling 2023-12-09 15:55:51 +01:00
Henrik Rydgård
5b44e25150 SSE-optimize the frustum culling 2023-12-09 15:55:51 +01:00
Henrik Rydgård
62c936babf Flip the cull plane data around to avoid transforming each vertex multiple times. 2023-12-09 15:55:51 +01:00
Henrik Rydgård
a043962447 World space planes 2023-12-09 15:55:51 +01:00
Henrik Rydgård
dbf796bb66 Fastcull: SSE/NEON-optimize 16-bit position conversion 2023-12-09 15:55:51 +01:00
Henrik Rydgård
89d8ef87ec Use a less accurate but faster frustum cull for the general draws. 2023-12-09 15:55:51 +01:00
Henrik Rydgård
0905b6a5ad Frustum-cull small draws
Some games do a poor job of culling stuff, and some transparent
sprites can be very expensive if they cause a copy.
Skipping them if outside the viewport makes sense in that case.

One example is the flame sprites in #17797.

Additionally, we should be able to cull through-mode draws easily, this
one doesn't even try.
2023-12-09 15:55:51 +01:00
Henrik Rydgård
aec0606ba4 Optimize the bounding box code for more vertex formats 2023-11-26 13:40:37 +01:00
Henrik Rydgård
e4ea4831e9 Delete the vertex cache option from the code. 2023-10-10 15:43:43 +02:00
Henrik Rydgård
078018a943 Move the clockwise calculation out of DrawEngineCommon 2023-10-10 13:16:34 +02:00
Henrik Rydgård
82606b6eb2 Move the clockwise calculation out of the AddPrim loop 2023-10-10 13:00:57 +02:00
Henrik Rydgård
af47ad035d Also use the new descriptor mechanism for in-game 2023-10-10 09:00:29 +02:00
Henrik Rydgård
24409f6f94 Additional check fix 2023-10-09 21:15:17 +02:00
Henrik Rydgård
10bc6b4cd8 Safety check that doesn't fix crazy taxi 2023-10-09 21:10:53 +02:00
Henrik Rydgård
a8b8580756 Don't forget to check the stall address, even in the optimized primitive loop 2023-10-09 14:08:11 +02:00
Henrik Rydgård
7fd7015987 Fix bug in vertex cache using uninitialized data 2023-10-09 14:03:41 +02:00
Henrik Rydgård
cd35252400 DrawEngine; Convert strip sequences in a tight loop 2023-10-06 16:25:13 +02:00
Henrik Rydgård
4d95250052 Optimize further 2023-10-03 11:01:37 +02:00
Henrik Rydgård
0260aebc26 Implement fast-path for merging non-indexed draws quickly. 2023-10-03 11:01:37 +02:00
Henrik Rydgård
1c49d5718c Add an offset field that we'll need later 2023-10-03 11:01:37 +02:00
Henrik Rydgård
92ffef2626 Remove some state from IndexGenerator, fix bugs. Mostly works except vertex cache. 2023-10-03 11:01:37 +02:00
Henrik Rydgård
9b411af1f5 It's running. 2023-10-03 11:01:37 +02:00
Henrik Rydgård
10f93875c6 Fix the semantics of DenseHashMap to be consistent even when inserting nulls 2023-09-11 12:07:18 +02:00
Unknown W. Brackets
622c69dbb9 x86jit: Expose option to select new IR based jit. 2023-08-20 22:28:54 -07:00
Henrik Rydgård
ebfd76d742 Add back the self-render check that kept Ridge Racer working.
This hack was removed in #17838
2023-08-08 15:42:52 +02:00
Henrik Rydgård
5ed4b532b7 Micro-optimize SubmitPrim, remove outdated mitigation 2023-08-02 19:14:32 +02:00
Henrik Rydgård
1475fcb065 Fix a comment 2023-08-01 00:28:54 +02:00
Henrik Rydgård
061131ec8a Cache planes used for BBOX culling
This isn't a huge performance boost for the games that use BBOX (like
Tekken), but it'll be more valuable if we start using soft culling more
widely automatically, see #17808
2023-07-30 14:42:22 +02:00
Henrik Rydgård
77da36c03f SSE addstrip: Add the early-outs. 2023-06-13 11:47:53 +02:00
Henrik Rydgård
22632b82bd Merge pull request #17565 from hrydgard/breakout-vcache-vulkan
Vulkan: Breakout the vertex cache logic from DoFlush()
2023-06-13 09:56:52 +02:00
Henrik Rydgård
01cea7f088 Pass uvScale in as an argument to the vertex decoder
Cleaner than overwriting/restoring gstate_c.uvScale in the decoder
loop. A small cleanup I've been wanting to do for ages.

Expecting a negligible perf boost, if any.
2023-06-12 20:25:18 +02:00
Henrik Rydgård
f5516d3248 Actually switch away from XXH to a custom hash, to de-risk 2023-06-12 14:24:20 +02:00
Henrik Rydgård
468757b93a Add comment about possible UV scale/offset bug. Move loop-max to local. 2023-06-12 13:16:14 +02:00
Henrik Rydgård
186b0f105c Simplify the vertex cache ID handling 2023-06-12 13:16:13 +02:00
Henrik Rydgård
80e47b7bd3 Only dirty the uniform UVSCALEOFFSET when really needed
Broken out from #17479

With OpenGL, greatly reduces the amount of glUniform4fv calls in many games (and
similar in the other backends).
2023-05-25 15:00:57 +02:00
Henrik Rydgård
f16f879b41 Some renaming to follow the standard of appending _ to member vars 2023-05-23 18:00:50 +02:00
Henrik Rydgård
d51d1413a3 DrawEngineCommon: Rename decoded to decoded_ 2023-05-23 16:46:43 +02:00
Henrik Rydgård
0e2fb13c61 Make sure we never end up with a null vertex decoder. 2023-05-03 22:22:54 +02:00