Commit graph

2969 commits

Author SHA1 Message Date
Henrik Rydgård
dae758e5f4 Fix some bugs and mistakes found by Nemoumbra through static analysis 2023-11-26 13:43:11 +01:00
Henrik Rydgård
aec0606ba4 Optimize the bounding box code for more vertex formats 2023-11-26 13:40:37 +01:00
Henrik Rydgård
cb9c6dc661 Merge pull request #18418 from hrydgard/simplify-input-layout
thin3d/backends: Remove code that pretended that we supported multiple vertex streams
2023-11-13 12:51:09 +01:00
Henrik Rydgård
d891aaf9cd Remove code that pretended that we supported multiple vertex streams
Don't really see that we'll have much use for this feature, so simplify
it away. Only single vertex stream data is now supported by the thin3d
API.
2023-11-13 01:15:28 +01:00
Henrik Rydgård
77825484a0 If available, use 16-bit texture formats for MakePixelTexture when appropriate.
Optimization for God of War on low-end platforms. Avoids calling a color
conversion function that's currently only SIMD-optimized on x86, so will
also benefit ARM a little bit.
2023-11-12 15:58:03 +01:00
Henrik Rydgård
49f5da370a Simplify the logic in MakePixelTexture a bit 2023-11-12 11:19:45 +01:00
Henrik Rydgård
cc6f9a73ca Oops, fix for previous commit. And minor optimization. 2023-11-12 01:32:02 +01:00
Henrik Rydgård
632fa1c9d6 Cache and hash data for DrawPixels.
We already had a cache to reuse texture objects so just
opportunistically reuse them when easy to do so.
2023-11-11 19:58:12 +01:00
Henrik Rydgård
4f2f1c4392 Tilt: Fix some edge cases leading to division by zero and similar. 2023-11-09 19:14:31 +01:00
Henrik Rydgård
48a1348352 Move a var for clarity 2023-11-01 21:30:04 -06:00
Henrik Rydgård
ee6ffac28e Ignore triangle strips with less than 3 vertices.
Should fix the new issue reported in #18273
2023-11-01 21:28:37 -06:00
Henrik Rydgård
e4ea4831e9 Delete the vertex cache option from the code. 2023-10-10 15:43:43 +02:00
Henrik Rydgård
078018a943 Move the clockwise calculation out of DrawEngineCommon 2023-10-10 13:16:34 +02:00
Henrik Rydgård
82606b6eb2 Move the clockwise calculation out of the AddPrim loop 2023-10-10 13:00:57 +02:00
Henrik Rydgård
af47ad035d Also use the new descriptor mechanism for in-game 2023-10-10 09:00:29 +02:00
Henrik Rydgård
24409f6f94 Additional check fix 2023-10-09 21:15:17 +02:00
Henrik Rydgård
10bc6b4cd8 Safety check that doesn't fix crazy taxi 2023-10-09 21:10:53 +02:00
Henrik Rydgård
a8b8580756 Don't forget to check the stall address, even in the optimized primitive loop 2023-10-09 14:08:11 +02:00
Henrik Rydgård
7fd7015987 Fix bug in vertex cache using uninitialized data 2023-10-09 14:03:41 +02:00
Henrik Rydgård
c7a3e7bc32 Remove a redundant variable 2023-10-06 16:32:59 +02:00
Henrik Rydgård
cd35252400 DrawEngine; Convert strip sequences in a tight loop 2023-10-06 16:25:13 +02:00
Henrik Rydgård
10ccbfd68c Unify the clearing of variables after a draw call 2023-10-06 15:39:59 +02:00
Henrik Rydgård
d4703e9534 Decoded position format is always the same 2023-10-06 15:39:58 +02:00
Henrik Rydgård
69b43ab734 Extend the Test Drive color ramp smoother to detect up to 3 ramps in a texture.
Note that we also offset the lookup slightly to miss the wrap-around
points. The existing 31 scale factor instead of 32, together with that
half-texel, are enough to avoid that problem.

Fixes #18300
2023-10-03 23:30:18 +02:00
Henrik Rydgård
226d25721a Add a block transfer GPU stat, remove a redundant one 2023-10-03 13:15:55 +02:00
Henrik Rydgård
d07c3c5148 Fix main-thread stalls due to decimate during replacement texture loading 2023-10-03 12:17:43 +02:00
Henrik Rydgård
4d95250052 Optimize further 2023-10-03 11:01:37 +02:00
Henrik Rydgård
0260aebc26 Implement fast-path for merging non-indexed draws quickly. 2023-10-03 11:01:37 +02:00
Henrik Rydgård
1c49d5718c Add an offset field that we'll need later 2023-10-03 11:01:37 +02:00
Henrik Rydgård
92ffef2626 Remove some state from IndexGenerator, fix bugs. Mostly works except vertex cache. 2023-10-03 11:01:37 +02:00
Henrik Rydgård
9b411af1f5 It's running. 2023-10-03 11:01:37 +02:00
Unknown W. Brackets
e79e0e21ad arm64jit: Skip unnecessary const load w/4 weights. 2023-09-30 15:41:56 -07:00
Henrik Rydgård
cf48532ef5 Merge pull request #18219 from hrydgard/get-index-bounds-autovec
Make GetIndexBounds friendlier to autovectorization. Works on x86 at least.
2023-09-29 11:31:34 +02:00
Henrik Rydgård
b8fa3a2071 Merge pull request #18125 from unknownbrackets/arm64-vertexjit
arm64jit: Optimize weight loading a bit
2023-09-29 09:52:56 +02:00
Henrik Rydgård
db421165c0 Merge pull request #18172 from hrydgard/more-lenient-clear-detection
Make clear detection a bit more lenient
2023-09-29 09:52:08 +02:00
Henrik Rydgård
abbd1c83bd Revert "Merge pull request #18184 from hrydgard/expand-lines-mem-fix"
This reverts commit 65b995ac6c, reversing
changes made to 01c3c3638f.
2023-09-27 20:04:37 +02:00
Henrik Rydgård
45bc4d8750 Make GetIndexBounds friendlier to autovectorization. Works on x86 at least. 2023-09-24 12:15:04 +02:00
Unknown W. Brackets
b610e2f314 GPU: Handle invalid blendeq more accurately. 2023-09-23 13:08:25 -07:00
Henrik Rydgård
81f47caf2f Clarify the primitive expansion, add reporting 2023-09-22 10:27:02 +02:00
Henrik Rydgård
966144fa64 Bounds check writing to the index buffer when expanding lines/rects/points 2023-09-20 19:26:36 +02:00
Henrik Rydgård
3f2ef508c9 Make it easier to reason about space in the inds buffer by moving an offset instead of the pointer. 2023-09-20 19:23:24 +02:00
Henrik Rydgård
e6a864ee04 Make clear detection a bit more lenient. Allows using clears in Assassin's Creed and likely more. 2023-09-18 23:57:20 +02:00
Henrik Rydgård
f4b0cddda3 ShaderId: Safer way to check for backend. 2023-09-18 16:25:00 +02:00
Henrik Rydgård
946d4b6251 Avoid causing shader gen failures due to bad blend eq values 2023-09-18 16:12:27 +02:00
Henrik Rydgård
6600b7ab08 Improved logging 2023-09-12 17:15:26 +02:00
Henrik Rydgård
be65cf0fc2 Assert improvements 2023-09-12 17:15:26 +02:00
Henrik Rydgård
844f1de041 Revert "Merge pull request #18008 from hrydgard/naruto-video-flicker-heuristic"
This reverts commit 985af4b03d, reversing
changes made to 64d04782ea.
2023-09-12 12:19:37 +02:00
Henrik Rydgård
10f93875c6 Fix the semantics of DenseHashMap to be consistent even when inserting nulls 2023-09-11 12:07:18 +02:00
Unknown W. Brackets
3c7b05c3e8 PPGe: Use texture windows for atlas text.
This makes it so that software rendering, which correctly applies
clamp/wrap limits at 512x512, still has readable text. Other textures may
still be wrong.
2023-09-10 23:54:55 -07:00
Unknown W. Brackets
5c4e08fe19 arm64jit: Use FMLA for TC prescale. 2023-09-10 23:04:15 -07:00