Unknown W. Brackets
d5b4c98f96
softgpu: Reduce some non-SIMD lighting math.
...
Small perf improvement for vertex/lighting heavy (i.e. 3D) scenes.
2023-07-16 10:31:44 -07:00
fp64
b0f71e08f4
Simplify projective texcoord calculation
...
As mentioned in https://github.com/hrydgard/ppsspp/issues/17613#issuecomment-1613583152 .
2023-07-03 10:59:09 -04:00
fp64
cd9f01c4df
Remove SSE4 path from Vec4<int>::operator*
2023-06-30 22:07:26 -04:00
fp64
f133739cd0
Replace some signed division in SoftGPU
...
This also adds a few bitwise operations to Vec4<int> and further
SIMDifies it.
Also fixes an unrelated warning.
2023-06-29 16:43:21 -04:00
fp64
436b49c4f2
Streamline x86 SSE workaround
...
Seems clearer than using #ifdef's at each site. Also, the rationale
is clearly spelled out, one 'Go to definition' away from any instance.
2023-06-27 00:30:01 -04:00
Unknown W. Brackets
fedb92b0e9
softgpu: Ensure early depth test uses SIMD.
2023-06-25 10:18:21 -07:00
Henrik Rydgård
08d578dce9
Merge pull request #17618 from unknownbrackets/softgpu-opt-cast
...
Optimize casts in softgpu
2023-06-25 07:55:30 +02:00
Henrik Rydgård
ec92675c5e
Merge pull request #17619 from unknownbrackets/softgpu-opt-z
...
softgpu: Improve Z interpolation SIMD
2023-06-25 07:55:03 +02:00
Unknown W. Brackets
d42642edd2
softgpu: Improve Z interpolation SIMD.
2023-06-24 22:17:11 -07:00
Unknown W. Brackets
ae9d34370e
softgpu: Move wsum_recip out of the triangle loop.
...
Seems like a small benefit, but not seeing any issues from this.
Noticed by fp64.
2023-06-24 12:38:05 -07:00
fp64
159faaa2ec
softgpu: Optimize (bi-)linear texture filtering
...
Seeing as SampleLinearLevel is near the top in the profiler,
optimize actual bilinear filtering using SSE2. Solid win in the
synthetic benchmark (https://godbolt.org/z/fqh3xvbGx , also doubles
as correctness check), no visible difference in actual PPSSPP.
Note: profiler suggests that hot part of SampleLinearLevel is
elsewhere.
2023-06-21 20:02:34 +03:00
Unknown W. Brackets
efd8565ffe
Merge pull request #17592 from fp64/anymask-movemask
...
Use _mm_movemask_ps for AnyMask
2023-06-17 09:48:09 -07:00
fp64
ab85c46161
Use _mm_movemask_ps for AnyMask
...
Probably very minor speed improvement, but it's rather neat.
2023-06-17 01:05:02 -04:00
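For readers unfamiliar with the intrinsic named in the commit above: `_mm_movemask_ps` packs the top (sign) bit of each of the four 32-bit lanes into bits 0..3 of an integer, so "is any lane negative" or "are all lanes negative" becomes a single integer compare. The sketch below is only an illustration of that technique. The helper names (`SignBits`, `AnyLaneNegative`, `AllLanesNegative`) are hypothetical, and the exact condition tested by PPSSPP's `AnyMask` is not stated in this log, so it is not reproduced here.

```cpp
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

// Hypothetical helper (not PPSSPP code): returns a 4-bit value where
// bit i is the sign bit of lanes[i].
inline int SignBits(const int32_t lanes[4]) {
#ifdef SKETCH_HAVE_SSE2
	__m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(lanes));
	// Reinterpret as floats (no conversion), then gather the top bit of each lane.
	return _mm_movemask_ps(_mm_castsi128_ps(v));
#else
	// Scalar fallback for builds without SSE2.
	int bits = 0;
	for (int i = 0; i < 4; ++i)
		bits |= (lanes[i] < 0 ? 1 : 0) << i;
	return bits;
#endif
}

// Hypothetical wrappers showing how a mask test reduces to one compare.
inline bool AnyLaneNegative(const int32_t lanes[4]) { return SignBits(lanes) != 0; }
inline bool AllLanesNegative(const int32_t lanes[4]) { return SignBits(lanes) == 0xF; }
```

The cast intrinsic generates no instructions, so on SSE2 builds the whole test is a movemask plus a compare.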
Henrik Rydgård
5b4fa06b00
Revert Dot33 on 32-bit x86 only. See #17584
2023-06-16 23:43:33 +02:00
fp64
f0d844a5a3
Convert Dot33 to SSE2
...
Simpler, lower requirements, and doesn't seem to hurt speed. See #17571 .
2023-06-14 22:02:50 -04:00
Henrik Rydgård
963ca50ba7
Merge pull request #17567 from hrydgard/uvscale-as-argument
...
Pass uvScale in as a fourth argument to the vertex decoder
2023-06-13 09:49:31 +02:00
Unknown W. Brackets
a7fa37d114
softgpu: Use SIMD more for dot products.
2023-06-12 19:54:32 -07:00
Henrik Rydgård
01cea7f088
Pass uvScale in as an argument to the vertex decoder
...
Cleaner than overwriting/restoring gstate_c.uvScale in the decoder
loop. A small cleanup I've been wanting to do for ages.
Expecting a negligible perf boost if any.
2023-06-12 20:25:18 +02:00
Henrik Rydgård
880379c15d
Extract some minor changes from #17497
2023-06-12 20:20:06 +02:00
Henrik Rydgård
ad8827ae70
Cleanup, address feedback
2023-05-26 10:28:10 +02:00
Henrik Rydgård
5c94a20ecb
SoftGPU: implement CheckConfigChanged, have it check postshaders. Fixes #17511 .
2023-05-26 09:48:51 +02:00
Henrik Rydgård
16b243b007
Centralize allocation of vertex decode buffers
2023-04-24 12:11:58 +02:00
Unknown W. Brackets
0490ad0039
softgpu: Add NEON variants as well.
2023-04-16 13:09:56 -07:00
Unknown W. Brackets
860fc176d8
softgpu: Use more SSE in lighting.
2023-04-16 11:59:57 -07:00
Unknown W. Brackets
2868495cf8
softgpu: Use SSE for lighting ceil if available.
...
Tiny optimization, helps only a little.
2023-04-16 11:13:43 -07:00
Unknown W. Brackets
b5206df04f
softgpu: Calc worldnormal a bit less often.
...
This is clearer anyway.
2023-04-16 10:16:32 -07:00
Unknown W. Brackets
59fb374c38
softgpu: Small optimization to clut updates.
2023-04-16 10:16:06 -07:00
Henrik Rydgård
8f96ec371e
Rename iBufFilter -> iDisplayFilter
2023-04-05 09:34:18 +02:00
Henrik Rydgård
3af961f3ba
Revert DrawPixel changes
2023-04-02 16:41:29 +02:00
Henrik Rydgård
fc62d587c0
Fix whitespace issues
2023-04-02 16:36:39 +02:00
Герман Семенов
122b63b9a8
GPU: using if constexpr
C++17 optimization
2023-04-02 16:36:37 +02:00
Unknown W. Brackets
a88b8a14f6
softgpu: Fix over-optimization of alpha test.
...
When alpha blend is off, the alpha test was previously skipped if it was
the only thing enabled. See #17213 .
2023-03-31 23:53:37 -07:00
Unknown W. Brackets
2c5b0999e8
softgpu: Make debug-only optim more consistent.
...
Of course it doesn't matter when optimizations are enabled in any compiler
that can build PPSSPP...
2023-03-31 23:52:23 -07:00
Henrik Rydgård
c6352a262d
Fix crash in SoftGPU when frameskipping, noticed by sum2012 in Daxter
...
Fixes #17021
2023-02-28 23:21:36 +01:00
Henrik Rydgård
b3ce31c61e
Address feedback
2023-02-26 19:54:30 +01:00
Henrik Rydgård
72bed6f2b5
Some DeviceLost/DeviceRestore cleanup
2023-02-26 11:05:52 +01:00
Henrik Rydgård
231f4efbbb
Move some more stuff to GPUCommonHW
2023-02-26 10:33:11 +01:00
Henrik Rydgård
4c45f8a4b0
Pass in draw directly in GPUCommon::DeviceRestore, instead of awkwardly fetching it
2023-02-25 23:04:27 +01:00
Henrik Rydgård
c2c479b217
Remove function InitClear. Was only implemented for DX9, and only barely meaningful in non-buffered.
2023-02-25 16:32:50 +01:00
Henrik Rydgård
8b54a14bf2
Move the big command table to where it belongs, GPUCommonHW
2023-02-25 16:20:34 +01:00
Henrik Rydgård
e136ad795a
Some slight unification
2023-02-25 15:15:34 +01:00
Henrik Rydgård
18999c3687
Create the GPUCommonHW class.
2023-02-25 14:42:10 +01:00
Henrik Rydgård
ed03348c65
Unify PreExecuteOp, keep the soft GPU as a special case
2023-02-25 12:21:03 +01:00
Unknown W. Brackets
cd3fc26190
samplerjit: Prevent thread local stale cache read.
...
If the generation count happens to match, would still get a stale pointer
and crash. Let's just make the generation count static so it always
increases.
2023-02-22 21:15:03 -08:00
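The failure mode described in the commit above can be sketched as follows. This is a minimal illustration under assumptions, not the samplerjit code: `SketchCodeCache`, `NextGeneration`, and `LastHit` are hypothetical names. Each thread remembers its last hit together with the generation it came from. If generations were counted per cache and restarted from the same value, a cleared or recreated cache could match the remembered generation by coincidence and return a stale pointer. Drawing every generation from one static counter means a value is never reused.

```cpp
#include <atomic>
#include <cstdint>
#include <unordered_map>

// Hypothetical cache, for illustration only.
class SketchCodeCache {
public:
	SketchCodeCache() : generation_(NextGeneration()) {}

	void Add(uint32_t key, const void *code) { entries_[key] = code; }

	void Clear() {
		entries_.clear();
		// Take a fresh, never-before-used generation so per-thread
		// "last hit" records from before the clear can no longer match.
		generation_ = NextGeneration();
	}

	const void *Lookup(uint32_t key) const {
		struct LastHit { int generation; uint32_t key; const void *code; };
		// Generation 0 is never handed out, so the initial record never matches.
		static thread_local LastHit last = {0, 0, nullptr};
		if (last.generation == generation_ && last.key == key)
			return last.code;

		auto it = entries_.find(key);
		if (it == entries_.end())
			return nullptr;
		last = {generation_, key, it->second};
		return it->second;
	}

private:
	// Shared by all instances, so the value always increases.
	static int NextGeneration() {
		static std::atomic<int> counter{0};
		return ++counter;
	}

	std::unordered_map<uint32_t, const void *> entries_;
	int generation_;
};
```

With a per-instance counter starting at a fixed value, a second `SketchCodeCache` created after the first one would share the same generation number, which is the kind of accidental match the commit message describes.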
Unknown W. Brackets
89c18d8077
riscv: Cleanup missing Poison, Crash.
2023-02-12 12:10:29 -08:00
Unknown W. Brackets
88ba003f46
ThreadManager: Add a simple priority field.
...
Currently, not actually respected.
2023-02-02 17:08:24 -08:00
Unknown W. Brackets
3a6fa9b4ba
ThreadManager: Don't allow reordering of queue.
...
Allowing a priority item is faster, but can cause confusion when you
expect things to run in the same sequence they're enqueued.
2023-01-14 16:35:01 -08:00
Henrik Rydgård
ffb8a9be47
Fix another subtle NEON type mismatch.
...
Fixes #16777
2023-01-10 14:56:30 +01:00
Henrik Rydgård
ee3618290b
Typo fix in NEON code.
...
Fixes #16772
2023-01-10 12:32:33 +01:00
Unknown W. Brackets
1215714240
softgpu: Use NEON for lighting.
2023-01-07 19:06:35 -08:00