Unknown W. Brackets
e9f3720e20
softgpu: Cache fog color draw pixel state.
2022-01-15 13:03:10 -08:00
Henrik Rydgård
165e0a12a9
Merge pull request #15305 from unknownbrackets/softgpu-opt
softgpu: Avoid double calculating screenpos
2022-01-15 20:58:09 +01:00
Unknown W. Brackets
880826bab4
softgpu: Remove disable of cached pixel state.
That mode is slower now (with the other state changes), and we don't want
to read gstate anymore anyway.
2022-01-15 11:22:50 -08:00
Unknown W. Brackets
cf3384c993
softgpu: Avoid double calculating screenpos.
2022-01-15 11:22:36 -08:00
Unknown W. Brackets
3134bd1ff9
softgpu: Cleanup push/pop atomic handling.
Two concurrent push/pops would be a hazard, though we don't do that.
This improves perf a bit by avoiding an atomic read again.
2022-01-15 00:02:31 -08:00
Unknown W. Brackets
c86a0157d8
softgpu: Remove old task.
Oops.
2022-01-14 20:52:20 -08:00
Unknown W. Brackets
f091225572
softgpu: Stop storing model pos.
We don't even use this anywhere else. Also skip needless Lerp on clip.
2022-01-14 20:36:09 -08:00
Unknown W. Brackets
d6a8cb2a0e
softgpu: Stop storing normal/worldnormal/worldpos.
This is only needed for lighting, which is applied right away.
This improves perf simply because less data is copied.
2022-01-14 20:32:18 -08:00
Unknown W. Brackets
5a35525fd4
softgpu: Enqueue batches of prims when binning.
This cuts some thread overhead.
2022-01-14 20:19:32 -08:00
Unknown W. Brackets
46e3c71522
softgpu: Adjust binning thresholds.
This improves Persona 3 and LBP.
2022-01-13 23:14:45 -08:00
Unknown W. Brackets
dffc333120
softgpu: Avoid thread ordering hazard.
Must run the primitives in the right order. No shortcutting allowed.
2022-01-13 23:03:42 -08:00
Unknown W. Brackets
970e9c2f51
softgpu: Move threading into BinManager.
This threads much more effectively, across the entire prim call.
2022-01-13 22:45:23 -08:00
Unknown W. Brackets
48ef4a18b1
softgpu: Handle scissor/range in BinManager.
2022-01-13 19:07:41 -08:00
Unknown W. Brackets
a0a9b1e89b
softgpu: Add class to manage and enqueue for bins.
For now, just forwarding.
2022-01-13 09:26:59 -08:00
Unknown W. Brackets
6839aac109
Debugger: Cache list PC for softgpu tagging.
Still slow, but improved.
2022-01-12 21:23:49 -08:00
Unknown W. Brackets
d962fb35d3
softgpu: Centralize more prim drawing state.
2022-01-12 21:23:49 -08:00
Unknown W. Brackets
d06f17d27b
softgpu: Move tex filter setting check to state.
2022-01-11 00:07:24 -08:00
Unknown W. Brackets
75ff3e44e6
softgpu: Move texture addresses to prim state.
2022-01-11 00:00:03 -08:00
Unknown W. Brackets
d5c5e9478e
softgpu: Prepare more state per prim call.
2022-01-10 22:12:35 -08:00
Unknown W. Brackets
9ec7d65c49
softgpu: Use func IDs instead of gstate more.
2022-01-10 22:12:35 -08:00
Unknown W. Brackets
d7a82ab7b8
softgpu: Compute func IDs once per batch of verts.
This saves a decent chunk of time, especially when many verts are being
drawn.
2022-01-10 22:12:35 -08:00
Unknown W. Brackets
e57730a97d
softgpu: Output normals to GE debugger.
2022-01-09 21:33:45 -08:00
Unknown W. Brackets
b915a82c41
softgpu: Correct decal doubling without alpha.
2022-01-09 12:23:55 -08:00
Unknown W. Brackets
72aa4be879
samplerjit: Skip processing alpha if unused.
2022-01-09 12:23:55 -08:00
Unknown W. Brackets
fe0b3dbd01
samplerjit: Fix alpha for 565 in linear lookup.
2022-01-09 11:08:46 -08:00
Henrik Rydgård
2d7a7fd34e
Merge pull request #15288 from unknownbrackets/softgpu-self
softgpu: Draw top left of rectangles first
2022-01-09 08:33:28 +01:00
Unknown W. Brackets
88ef2d1ac1
softgpu: Skip threading when rendering to self.
This will probably always be a problem to thread.
2022-01-08 21:05:08 -08:00
Unknown W. Brackets
6367d5dc8f
softgpu: Draw top left of rectangles first.
This helps when things do self-rendering, since this way we won't read
from things we've just written to when scaling down. See #11623.
2022-01-08 20:53:01 -08:00
Unknown W. Brackets
8a00c2d233
GPU: Allow gcc/clang/icc runtime SSE4 usage.
All our builds before were only using SSE4 in jit...
2022-01-08 17:09:09 -08:00
Henrik Rydgård
eee62849fe
Merge pull request #15284 from unknownbrackets/softgpu-opt
Improve softgpu lighting accuracy and speed
2022-01-08 22:05:06 +01:00
Unknown W. Brackets
c7fc448869
softgpu: Use some SSE4 in triangle interpolation.
2022-01-08 11:38:07 -08:00
Unknown W. Brackets
3b1cc0d3b8
softgpu: Limit minX/maxX per line.
Only helps when single-threaded, though.
2022-01-08 10:04:52 -08:00
Unknown W. Brackets
9458610d96
softgpu: Avoid rsqrt path for normals.
In LittleBigPlanet, it's noticeable that the lighting is very off due to
the slight loss of accuracy - possibly due to cutoff or similar.
2022-01-07 23:22:57 -08:00
Unknown W. Brackets
ce8a49b1c1
softgpu: Retain floats in diffuse/specular.
This seems to be a bit more accurate. Color blending seems correct now,
but the factors and especially pow results are off.
Also, normalize normal to 0, 0, 1, which seems to match results better.
2022-01-06 21:52:31 -08:00
Unknown W. Brackets
bd354164bc
softgpu: Cleanup -NAN and diffuse factor.
2022-01-06 21:52:27 -08:00
Unknown W. Brackets
537e357741
softgpu: Correct NAN spotlight exponent/direction.
2022-01-06 21:19:48 -08:00
Unknown W. Brackets
b86bdc9456
softgpu: Correct handling of NAN attenuation.
2022-01-06 21:19:47 -08:00
Unknown W. Brackets
fa80c448ee
softgpu: More closely match PSP light rounding.
2022-01-06 21:19:47 -08:00
Unknown W. Brackets
079b67e7ed
softgpu: Use common SIMD matrix multiplies.
2022-01-06 21:19:47 -08:00
Unknown W. Brackets
cba2374abd
softgpu: Separate calculation of S/T.
We could probably reuse, but we're not right now and it complicates the
logic.
2022-01-06 21:19:47 -08:00
Henrik Rydgård
683289402c
Merge pull request #15279 from unknownbrackets/samplerjit-fastpath
softgpu: Correct mirroring in fastpath+nearest
2022-01-05 09:43:20 +01:00
Henrik Rydgård
f82f24a9bb
Merge pull request #15280 from unknownbrackets/samplerjit-dxt
Correct some recent regressions in samplerjit
2022-01-05 09:42:30 +01:00
Unknown W. Brackets
0993771104
samplerjit: Fix standard bufw check.
Oops, bufw could be intentionally higher while w is 16 bytes.
2022-01-05 00:11:34 -08:00
Unknown W. Brackets
741a9b0a4d
samplerjit: Fix DXT compilation.
2022-01-05 00:00:03 -08:00
Unknown W. Brackets
19998976c7
samplerjit: Correct linear compile failure.
It was resetting to nullptr, because `nearest` was nullptr.
2022-01-04 23:58:07 -08:00
Unknown W. Brackets
e2f8cf8bf2
softgpu: Correct mirroring in fastpath+nearest.
2022-01-04 23:42:31 -08:00
Unknown W. Brackets
d98e5bfc97
softgpu: Improve usage of SSE for lighting.
Gives about a 2% improvement in many places.
2022-01-03 06:45:10 -08:00
Unknown W. Brackets
2aa57679fa
softjit: Keep mip S/T calc in SIMD.
This is only a tiny bit faster, though.
2022-01-03 06:45:10 -08:00
Unknown W. Brackets
a309ed791b
softjit: Use RIP access in color/depth off.
Seems to help, though it's small.
2022-01-03 06:45:10 -08:00
Unknown W. Brackets
612cc0ab5c
softjit: Optimize depth range checks.
This was higher than I expected in the profile. Not a huge improvement,
but a bit faster.
2022-01-03 06:45:10 -08:00