Commit graph

31601 commits

Author SHA1 Message Date
Unknown W. Brackets
0ba2d05da5 samplerjit: Simplify AVX shift-copies.
These have been the most common and the fallback is safe.  Let's just add
a helper.
2022-01-17 15:15:36 -08:00
Henrik Rydgård
4ea1c08551
Merge pull request #15323 from unknownbrackets/softgpu-opt2
softgpu: Guide more SSE light factor handling
2022-01-17 15:56:46 +01:00
Unknown W. Brackets
7218fbbe97 softgpu: Guide more SSE light factor handling.
Missed these others in computed state.  Helps mostly to do this inside
Process().
2022-01-17 06:25:52 -08:00
Henrik Rydgård
cba7461157
Merge pull request #15322 from unknownbrackets/softgpu-opt
softgpu: Reduce copying during clipping
2022-01-17 09:19:06 +01:00
Unknown W. Brackets
abef17caca softgpu: Simplify mask check.
This performs a bit better.
2022-01-16 23:40:57 -08:00
Unknown W. Brackets
89bc87a388 softgpu: Reduce copying during clipping.
Common case is nothing needs to be clipped.
2022-01-16 23:33:46 -08:00
Henrik Rydgård
128e2fa14e
Merge pull request #15318 from unknownbrackets/softgpu-opt
softgpu: Heuristic to avoid over-draining
2022-01-17 07:43:34 +01:00
Henrik Rydgård
5c15054181
Merge pull request #15321 from unknownbrackets/debugger
Debugger: Fix crash in software renderer
2022-01-17 07:41:59 +01:00
Henrik Rydgård
e603e201da
Merge pull request #15320 from unknownbrackets/softgpu-flush
softgpu: Fix block transfer flush detection
2022-01-17 07:41:01 +01:00
Henrik Rydgård
1b5ceb1e72
Merge pull request #15319 from unknownbrackets/softgpu-verts
Precompute state for vertex transform
2022-01-17 07:40:41 +01:00
Unknown W. Brackets
653c036ac8 Debugger: Fix crash in software renderer.
The clut isn't set by sampler state, it's set normally by the binner.
2022-01-16 21:53:55 -08:00
Unknown W. Brackets
206d586c1f softgpu: Fix block transfer flush detection.
Fixes video graphics in Gods Eater Burst.
2022-01-16 21:40:19 -08:00
Unknown W. Brackets
fcc3b7684e softgpu: Use SSE in lighting param computation.
The compiler couldn't figure this out.  Halves time in this func.
2022-01-16 21:31:53 -08:00
Unknown W. Brackets
73c143c44c softgpu: Precompute some of screen space multiply.
This at least avoids the shifts and makes it easier to vectorize.
Only helps a little.
2022-01-16 21:31:53 -08:00
Unknown W. Brackets
31745110e8 softgpu: Premultiply matrix transforms.
Where possible, we can skip some multiplies per vertex.
2022-01-16 21:31:52 -08:00
Unknown W. Brackets
12a4c63fc7 softgpu: Precompute state for vertex transform.
Doesn't help a ton, but with lots of verts can improve a percent or two.
2022-01-16 21:31:52 -08:00
Unknown W. Brackets
423ec76258 softgpu: Correct texsize flush annotation. 2022-01-16 21:09:43 -08:00
Unknown W. Brackets
83adc44c2b softgpu: Heuristic to avoid over-draining.
Some games (e.g. VC3) benefit from an early drain, since they get more
done while processing more verts.  Others finish the draw quickly, and
then cause significant overhead in queueing new threads.

This attempts to balance the two, and improves Call of Duty and Blade
Dancer.
2022-01-16 21:09:28 -08:00
Henrik Rydgård
bdc69f5171
Merge pull request #15317 from unknownbrackets/softgpu-lighting
softgpu: Precompute lighting parameters
2022-01-17 01:06:35 +01:00
Henrik Rydgård
06ae4d0577
Merge pull request #15316 from unknownbrackets/softgpu-binning
Throw some memory at the softgpu problem
2022-01-17 01:05:48 +01:00
Unknown W. Brackets
1764111a4b softgpu: Reduce wasted memory. 2022-01-16 11:49:41 -08:00
Unknown W. Brackets
2797e035df softgpu: Precompute lighting parameters.
In many cases, games use lighting just for diffuse or something, this
helps skip what's not needed too.  Good improvement in a scene from a
Naruto game.
2022-01-16 11:27:53 -08:00
Unknown W. Brackets
cb5ac04d16 softgpu: Tune some queue sizes for perf.
Using a chunk of RAM for this, but mostly with many threads.
2022-01-16 11:27:43 -08:00
Unknown W. Brackets
d95475e021 softgpu: Expose flush reasons/times in debug stats. 2022-01-16 11:27:42 -08:00
Henrik Rydgård
d6d3bf360c
Merge pull request #15314 from unknownbrackets/softgpu-binning
Allow binning of separate textures
2022-01-16 19:54:47 +01:00
Unknown W. Brackets
7e5f03eed1 softgpu: Reduce flushing for smaller textures. 2022-01-16 08:23:52 -08:00
Unknown W. Brackets
86749a3fe0 softgpu: Flush block xfer only on overlap too. 2022-01-16 08:23:17 -08:00
Unknown W. Brackets
2de7993dc5 softgpu: Decorate some stats for flushes. 2022-01-16 08:23:15 -08:00
Unknown W. Brackets
cc155ec460 softgpu: Avoid texture/CLUT flush unless overlap.
Only need to flush here if there's some overlap in the target.
2022-01-16 08:22:13 -08:00
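A minimal sketch of the overlap idea in the commit above, not the actual softgpu code. The function name and parameters are hypothetical; it assumes each region can be described as a start address plus a byte size, and only a true result would call for a flush.

```cpp
#include <cstdint>

// Hypothetical helper: returns true when the half-open byte ranges
// [aStart, aStart + aSize) and [bStart, bStart + bSize) share any byte.
// Ends are computed in 64 bits so start + size cannot wrap around.
inline bool RangesOverlap(uint32_t aStart, uint32_t aSize, uint32_t bStart, uint32_t bSize) {
	// An empty range cannot overlap anything.
	if (aSize == 0 || bSize == 0)
		return false;
	const uint64_t aEnd = (uint64_t)aStart + aSize;
	const uint64_t bEnd = (uint64_t)bStart + bSize;
	return aStart < bEnd && bStart < aEnd;
}
```

Adjacent ranges that only touch at a boundary are reported as non-overlapping, so they would not force a flush under this assumption.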
Unknown W. Brackets
9466dc6397 softgpu: Flush on offset changes. 2022-01-16 08:14:10 -08:00
Unknown W. Brackets
d6fa301ab1 softgpu: Track CLUTs as states for binning.
This way we can have multiple CLUTs in process at once, which helps.
2022-01-16 08:14:09 -08:00
Henrik Rydgård
ba63d9cf09
Merge pull request #15312 from unknownbrackets/softgpu-state
softgpu: Fix alpha blend with one/zero
2022-01-16 10:32:28 +01:00
Henrik Rydgård
f96c22765c
Merge pull request #15313 from unknownbrackets/softgpu-binning
softgpu: Allow binning across prim calls
2022-01-16 10:27:36 +01:00
Unknown W. Brackets
18f2a45a6a softgpu: Allow binning across prim calls. 2022-01-16 00:49:49 -08:00
Henrik Rydgård
9bef900cd7
Merge pull request #15311 from unknownbrackets/softgpu-state
Avoid gstate references in rasterization
2022-01-16 09:40:25 +01:00
Henrik Rydgård
2aa41b45b6
Merge pull request #15309 from unknownbrackets/debugger
Debugger: Avoid flushing meminfo on write lookup
2022-01-16 09:39:18 +01:00
Unknown W. Brackets
2ad7d8ed29 softgpu: Fix alpha blend with one/zero.
Wasn't setting the fixed value constants in these cases, so need to handle
in the C++ version.
2022-01-16 00:38:49 -08:00
Henrik Rydgård
86714d9f96
Merge pull request #15310 from unknownbrackets/softgpu-opt
softgpu: Tune queue push/pop to reduce overhead
2022-01-16 09:38:45 +01:00
Unknown W. Brackets
fc292b127b softgpu: Correct dither matrix lookup.
Oops, need to wrap x/y, of course...
2022-01-15 23:51:21 -08:00
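A minimal sketch of what wrapping x/y in a dither lookup can look like, assuming a 4x4 matrix. The table values and the names kDither and DitherOffset are placeholders for illustration, not the real softgpu state.

```cpp
#include <cstdint>

// Hypothetical 4x4 dither table; the values are placeholders only.
static const int8_t kDither[4][4] = {
	{ -4,  0, -3,  1 },
	{  2, -2,  3, -1 },
	{ -3,  1, -4,  0 },
	{  3, -1,  2, -2 },
};

// Wraps the pixel position into the 4x4 table before indexing.
// Without the "& 3", any x or y of 4 or more would read out of bounds.
inline int DitherOffset(int x, int y) {
	return kDither[y & 3][x & 3];
}
```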
Unknown W. Brackets
6da7765309 softgpu: Correct logic op state update. 2022-01-15 22:31:28 -08:00
Unknown W. Brackets
b42ebe15d8 softgpu: Fix off-by-one size limit on bin queues. 2022-01-15 21:59:23 -08:00
Unknown W. Brackets
2539fb7c3c softgpu: Tune queue push/pop to reduce overhead.
These aren't safely atomic with concurrent pushers or poppers, but as
long as there's only one of each, they're still safe.

Shaves a decent % off Drain time for heavy scenes.
2022-01-15 20:18:49 -08:00
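A minimal sketch of the single-pusher, single-popper idea described in the commit above. This is a generic ring buffer written for illustration, not the actual bin queue; SpscQueue and its members are hypothetical names. It is only safe when exactly one thread calls Push and exactly one thread calls Pop.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical single-producer/single-consumer ring buffer.
// One slot is always left unused, so it holds at most N - 1 items.
template <typename T, size_t N>
class SpscQueue {
public:
	// Called only by the single producer. Returns false when full.
	bool Push(const T &item) {
		const size_t tail = tail_.load(std::memory_order_relaxed);
		const size_t next = (tail + 1) % N;
		if (next == head_.load(std::memory_order_acquire))
			return false;
		items_[tail] = item;
		// Publish the item only after it has been written.
		tail_.store(next, std::memory_order_release);
		return true;
	}

	// Called only by the single consumer. Returns false when empty.
	bool Pop(T *out) {
		const size_t head = head_.load(std::memory_order_relaxed);
		if (head == tail_.load(std::memory_order_acquire))
			return false;
		*out = items_[head];
		// Free the slot only after the item has been copied out.
		head_.store((head + 1) % N, std::memory_order_release);
		return true;
	}

private:
	T items_[N]{};
	std::atomic<size_t> head_{0};
	std::atomic<size_t> tail_{0};
};
```

Because each index has a single writer, no compare-and-swap loop is needed, which is where the reduced overhead would come from in a design like this. With a second pusher or popper, two threads could claim the same slot, so the sketch would no longer be safe.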
Unknown W. Brackets
0f2fc00f1b Debugger: Avoid flushing meminfo on write lookup.
Small improvement on frequent block transfers, etc.
2022-01-15 19:43:16 -08:00
Unknown W. Brackets
6896a7a64e softgpu: Use cached state for screen offset. 2022-01-15 18:20:25 -08:00
Unknown W. Brackets
edb79d968f softgpu: Cache CLUT params in sampler state.
And now there's no more gstate for pixel drawing or sampling.  Just a
little left in rasterization.
2022-01-15 18:09:09 -08:00
Unknown W. Brackets
c0e85e6170 softgpu: Move texenv color into sampler state. 2022-01-15 17:52:40 -08:00
Unknown W. Brackets
ad3635c82a softgpu: Move tex size to cached state. 2022-01-15 17:22:43 -08:00
Unknown W. Brackets
02c5559393 softgpu: Remove z from DrawingCoords.
It's not really used much of anywhere, anyway.
2022-01-15 15:38:56 -08:00
Unknown W. Brackets
bf2e060735 softgpu: Move c++ tex func to sampler.
It's not used anywhere else now.
2022-01-15 15:28:07 -08:00
Unknown W. Brackets
a228b2ab6c softgpu: Use cached sampler state outside jit. 2022-01-15 15:26:26 -08:00