Commit graph

1031 commits

Author SHA1 Message Date
Unknown W. Brackets
f4f7ea2736 softgpu: Cache colortest params in draw pix state. 2022-01-15 13:03:11 -08:00
Unknown W. Brackets
aa9d751248 softgpu: Cache alpha/stencil test masks in state. 2022-01-15 13:03:11 -08:00
Unknown W. Brackets
acad2640dd softgpu: Cache logicOp in draw pixel state. 2022-01-15 13:03:10 -08:00
Unknown W. Brackets
c0d548846f softgpu: Use cached write mask in draw pixel. 2022-01-15 13:03:10 -08:00
Unknown W. Brackets
f1ce2e7715 softgpu: Cache minz/maxz in draw pixel state. 2022-01-15 13:03:10 -08:00
Unknown W. Brackets
0b3f096c01 softgpu: Cache strides in draw pixel state. 2022-01-15 13:03:10 -08:00
Unknown W. Brackets
e9f3720e20 softgpu: Cache fog color draw pixel state. 2022-01-15 13:03:10 -08:00
Henrik Rydgård
165e0a12a9 Merge pull request #15305 from unknownbrackets/softgpu-opt
softgpu: Avoid double calculating screenpos
2022-01-15 20:58:09 +01:00
Unknown W. Brackets
880826bab4 softgpu: Remove disable of cached pixel state.
That mode is slower now (with the other state changes), and we don't want
to read gstate anymore anyway.
2022-01-15 11:22:50 -08:00
Unknown W. Brackets
cf3384c993 softgpu: Avoid double calculating screenpos. 2022-01-15 11:22:36 -08:00
Unknown W. Brackets
3134bd1ff9 softgpu: Cleanup push/pop atomic handling.
Two concurrent push/pops would hazard, though we don't do that.
This improves perf a bit by avoiding an atomic read again.
2022-01-15 00:02:31 -08:00
Unknown W. Brackets
c86a0157d8 softgpu: Remove old task.
Oops.
2022-01-14 20:52:20 -08:00
Unknown W. Brackets
f091225572 softgpu: Stop storing model pos.
We don't even use this anywhere else.  Also skip needless Lerp on clip.
2022-01-14 20:36:09 -08:00
Unknown W. Brackets
d6a8cb2a0e softgpu: Stop storing normal/worldnormal/worldpos.
This is only needed for lighting, which is applied right away.

This improves perf just simply from less data being copied.
2022-01-14 20:32:18 -08:00
Unknown W. Brackets
5a35525fd4 softgpu: Enqueue batches of prims when binning.
This cuts some thread overhead.
2022-01-14 20:19:32 -08:00
Unknown W. Brackets
46e3c71522 softgpu: Adjust binning thresholds.
This improves Persona 3 and LBP.
2022-01-13 23:14:45 -08:00
Unknown W. Brackets
dffc333120 softgpu: Avoid thread ordering hazard.
Must run the primitives in the right order.  No shortcutting allowed.
2022-01-13 23:03:42 -08:00
Unknown W. Brackets
970e9c2f51 softgpu: Move threading into BinManager.
This threads much more effectively, across entire prim call.
2022-01-13 22:45:23 -08:00
Unknown W. Brackets
48ef4a18b1 softgpu: Handle scissor/range in BinManager. 2022-01-13 19:07:41 -08:00
Unknown W. Brackets
a0a9b1e89b softgpu: Add class to manage and enqueue for bins.
For now, just forwarding.
2022-01-13 09:26:59 -08:00
Unknown W. Brackets
6839aac109 Debugger: Cache list PC for softgpu tagging.
Still slow, but improved.
2022-01-12 21:23:49 -08:00
Unknown W. Brackets
d962fb35d3 softgpu: Centralize more prim drawing state. 2022-01-12 21:23:49 -08:00
Unknown W. Brackets
d06f17d27b softgpu: Move tex filter setting check to state. 2022-01-11 00:07:24 -08:00
Unknown W. Brackets
75ff3e44e6 softgpu: Move texture addresses to prim state. 2022-01-11 00:00:03 -08:00
Unknown W. Brackets
d5c5e9478e softgpu: Prepare more state per prim call. 2022-01-10 22:12:35 -08:00
Unknown W. Brackets
9ec7d65c49 softgpu: Use func IDs instead of gstate more. 2022-01-10 22:12:35 -08:00
Unknown W. Brackets
d7a82ab7b8 softgpu: Compute func IDs once per batch of verts.
This saves a decent chunk of time, especially when many verts are being
drawn.
2022-01-10 22:12:35 -08:00
Unknown W. Brackets
e57730a97d softgpu: Output normals to GE debugger. 2022-01-09 21:33:45 -08:00
Unknown W. Brackets
b915a82c41 softgpu: Correct decal doubling without alpha. 2022-01-09 12:23:55 -08:00
Unknown W. Brackets
72aa4be879 samplerjit: Skip processing alpha if unused. 2022-01-09 12:23:55 -08:00
Unknown W. Brackets
fe0b3dbd01 samplerjit: Fix alpha for 565 in linear lookup. 2022-01-09 11:08:46 -08:00
Henrik Rydgård
2d7a7fd34e Merge pull request #15288 from unknownbrackets/softgpu-self
softgpu: Draw top left of rectangles first
2022-01-09 08:33:28 +01:00
Unknown W. Brackets
88ef2d1ac1 softgpu: Skip threading when rendering to self.
This will probably always be a problem to thread.
2022-01-08 21:05:08 -08:00
Unknown W. Brackets
6367d5dc8f softgpu: Draw top left of rectangles first.
This helps when things do self-rendering, since this way we won't read
from things we've just written to when scaling down.  See #11623.
2022-01-08 20:53:01 -08:00
Unknown W. Brackets
8a00c2d233 GPU: Allow gcc/clang/icc runtime SSE4 usage.
All our builds before were only using SSE4 in jit...
2022-01-08 17:09:09 -08:00
Henrik Rydgård
eee62849fe Merge pull request #15284 from unknownbrackets/softgpu-opt
Improve softgpu lighting accuracy and speed
2022-01-08 22:05:06 +01:00
Unknown W. Brackets
c7fc448869 softgpu: Use some SSE4 in triangle interpolation. 2022-01-08 11:38:07 -08:00
Unknown W. Brackets
3b1cc0d3b8 softgpu: Limit minX/maxX per line.
Only helps when single-threaded, though.
2022-01-08 10:04:52 -08:00
Unknown W. Brackets
9458610d96 softgpu: Avoid rsqrt path for normals.
In LittleBigPlanet, it's noticeable that the lighting is very off due to
the slight loss of accuracy - possibly due to cutoff or similar.
2022-01-07 23:22:57 -08:00
Unknown W. Brackets
ce8a49b1c1 softgpu: Retain floats in diffuse/specular.
This seems to be a bit more accurate.  Color blending seems correct now,
but the factors and especially pow results are off.

Also, normalize normal to 0, 0, 1, which seems to match results better.
2022-01-06 21:52:31 -08:00
Unknown W. Brackets
bd354164bc softgpu: Cleanup -NAN and diffuse factor. 2022-01-06 21:52:27 -08:00
Unknown W. Brackets
537e357741 softgpu: Correct NAN spotlight exponent/direction. 2022-01-06 21:19:48 -08:00
Unknown W. Brackets
b86bdc9456 softgpu: Correct handling of NAN attenuation. 2022-01-06 21:19:47 -08:00
Unknown W. Brackets
fa80c448ee softgpu: More closely match PSP light rounding. 2022-01-06 21:19:47 -08:00
Unknown W. Brackets
079b67e7ed softgpu: Use common SIMD matrix multiplies. 2022-01-06 21:19:47 -08:00
Unknown W. Brackets
cba2374abd softgpu: Separate calculation of S/T.
We could probably reuse, but we're not right now and it complicates the
logic.
2022-01-06 21:19:47 -08:00
Henrik Rydgård
683289402c Merge pull request #15279 from unknownbrackets/samplerjit-fastpath
softgpu: Correct mirroring in fastpath+nearest
2022-01-05 09:43:20 +01:00
Henrik Rydgård
f82f24a9bb Merge pull request #15280 from unknownbrackets/samplerjit-dxt
Correct some recent regressions in samplerjit
2022-01-05 09:42:30 +01:00
Unknown W. Brackets
0993771104 samplerjit: Fix standard bufw check.
Oops, bufw could be intentionally higher while w is 16 bytes.
2022-01-05 00:11:34 -08:00
Unknown W. Brackets
741a9b0a4d samplerjit: Fix DXT compilation. 2022-01-05 00:00:03 -08:00
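
Note on 6367d5dc8f ("Draw top left of rectangles first"): the message says that drawing from the top left avoids reading pixels that were just written when a surface is scaled down onto itself. The sketch below illustrates only that ordering argument and is not the rasterizer code from the commit. The function names are hypothetical, and a one-dimensional row of ints stands in for pixels, halved in place.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: shrink a row to half width in place, visiting
// destination pixels left to right. Destination i reads source index 2 * i,
// which is never below i, so every source pixel is read before any write
// can reach it.
std::vector<int> HalveInPlaceForward(std::vector<int> row) {
    const size_t outW = row.size() / 2;
    for (size_t i = 0; i < outW; ++i)
        row[i] = row[2 * i];
    row.resize(outW);
    return row;
}

// The same copy, visiting destination pixels right to left. Higher
// destinations are written first, so lower destinations may read a source
// index that has already been overwritten.
std::vector<int> HalveInPlaceBackward(std::vector<int> row) {
    const size_t outW = row.size() / 2;
    for (size_t i = outW; i-- > 0;)
        row[i] = row[2 * i];
    row.resize(outW);
    return row;
}
```

For the row 0..7, the forward pass yields 0, 2, 4, 6, while the backward pass yields 0, 4, 4, 6 because index 2 is overwritten before destination 1 reads it.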