Henrik Rydgård
10f93875c6
Fix the semantics of DenseHashMap to be consistent even when inserting nulls
2023-09-11 12:07:18 +02:00
Unknown W. Brackets
cd3fc26190
samplerjit: Prevent thread local stale cache read.
...
If the generation count happens to match, would still get a stale pointer
and crash. Let's just make the generation count static so it always
increases.
2023-02-22 21:15:03 -08:00
Unknown W. Brackets
d9522a7ac5
softgpu: Avoid clear hazard for last cached funcs.
2022-12-06 21:23:56 -08:00
Unknown W. Brackets
eda3ce556e
softgpu: Avoid atomic structs.
...
Apparently we don't link libatomic and rather than fighting that, I'll
just use thread local values.
2022-12-06 20:35:07 -08:00
Unknown W. Brackets
400f6abf9a
softgpu: Optimize lookup of last jit func.
...
This is common (for example, maybe a pixel state is updated but sampler is
not), and reduces time spent in ComputeRasterizerState() quite a bit in
Darkstalkers, where jits are available (i.e. Intel currently.)
2022-12-06 19:16:19 -08:00
Unknown W. Brackets
87fb9eef37
softgpu: Remove std::function usage.
...
Wanted to avoid coupling these, but don't like the std::function
construct/destructs showing in profiles...
2022-12-06 19:15:57 -08:00
Unknown W. Brackets
778a0487cb
softjit: Switch to DenseHashMap.
2022-12-02 20:59:13 -08:00
Unknown W. Brackets
4d06400548
softgpu: Fix compile hazard while running.
...
This prevents any clearing of cache while other threads may be using
previously cached funcs, and avoids wx exclusive hazards.
2022-11-20 12:04:02 -08:00
Unknown W. Brackets
ce51942508
softgpu: Correct WX-exclusive platform hazards.
...
Should mainly affect BSD at this point.
2022-11-20 10:55:35 -08:00
Unknown W. Brackets
79b1d1d35f
softgpu: Better approximate slope mip level mode ( #16276 )
...
* samplerjit: Remove unused x/y parameters.
Still need to tune the accuracy of filtering, but those were not the
right way.
* softgpu: Better approximate slope mip level mode.
This isn't exactly right, but it's closer.
* softgpu: Calculate auto from largest difference.
Direction shouldn't matter.
2022-10-23 10:15:43 +02:00
Unknown W. Brackets
167213c746
softgpu: Cache texture bufws at 16 bit.
...
Reducing the size of state a bit.
2022-09-12 21:57:00 -07:00
Unknown W. Brackets
90e009edb9
softgpu: Clamp/wrap textures at 512 pixels.
...
A texture larger than 512 is "valid", but simply wraps/clamps at 512.
Importantly, the texture coords are still calculated at the specified
size, which can be up to 32768.
2022-09-10 20:23:09 -07:00
Unknown W. Brackets
99d7703d33
samplerjit: Precalculate DXT1/3/5 offsets.
...
This improves WALL-E by 8% overall.
2022-02-05 13:04:17 -08:00
Unknown W. Brackets
1b2cf52bfe
samplerjit: Fix non-shared CLUT on Linux.
...
Oops, good that CI will catch this now - I've broken this more than once.
2022-01-29 22:20:46 -08:00
Unknown W. Brackets
d200ef40de
samplerjit: Compile sampler funcs together.
...
We can't have the cache clear between nearest/linear, because then we'll
call a bunch of int3's.
2022-01-29 20:28:20 -08:00
Unknown W. Brackets
eb70a90347
samplerjit: Avoid frac uv transfer to gen regs.
...
It should just stay in vec, this is more convenient anyway.
2022-01-28 23:50:54 -08:00
Unknown W. Brackets
99d6d569f0
samplerjit: Reduce transfers in nearest texel calc.
...
This benefits a few games, mostly where there's lots of UI or similar.
2022-01-24 21:28:04 -08:00
Unknown W. Brackets
c1e657ed47
samplerjit: Better vectorize UV linear calc.
...
Gives about 1-2% when mips are used.
2022-01-24 20:42:07 -08:00
Unknown W. Brackets
c2985bca31
softjit: Centralize some common funcs from sampler.
...
No need to duplicate this code.
2022-01-19 00:03:59 -08:00
Unknown W. Brackets
edb79d968f
softgpu: Cache CLUT params in sampler state.
...
And now there's no more gstate for pixel drawing or sampling. Just a
little left in rasterization.
2022-01-15 18:09:09 -08:00
Unknown W. Brackets
ad3635c82a
softgpu: Move tex size to cached state.
2022-01-15 17:22:43 -08:00
Unknown W. Brackets
a228b2ab6c
softgpu: Use cached sampler state outside jit.
2022-01-15 15:26:26 -08:00
Unknown W. Brackets
d5c5e9478e
softgpu: Prepare more state per prim call.
2022-01-10 22:12:35 -08:00
Unknown W. Brackets
72aa4be879
samplerjit: Skip processing alpha if unused.
2022-01-09 12:23:55 -08:00
Unknown W. Brackets
26e7768a67
samplerjit: Remove old linear nearest paths.
...
We only use it for DXT now, so let's not keep the dead code around.
2022-01-02 17:28:52 -08:00
Unknown W. Brackets
7806dfddea
samplerjit: Use VPGATHERDD for all types.
2022-01-02 17:19:18 -08:00
Unknown W. Brackets
ce6ea8da11
samplerjit: Apply gather lookup to all CLUT4.
2022-01-02 17:19:18 -08:00
Unknown W. Brackets
22f770c828
samplerjit: Use VPGATHERDD for simple CLUT4 loads.
...
Planning to expand this to more paths.
2022-01-02 17:19:17 -08:00
Henrik Rydgård
d3f0af7458
Merge pull request #15273 from unknownbrackets/softjit-bloom
...
Optimize software renderer handling of common bloom operations
2022-01-02 18:11:07 +01:00
Unknown W. Brackets
a259761262
samplerjit: Use nearest func in fast path too.
...
This uses the more optimal tex funcs.
2022-01-02 08:48:16 -08:00
Unknown W. Brackets
0eec4e7e4d
samplerjit: Decode colors in parallel.
...
Not used in a ton of games, but a decent improvement where it is used.
2022-01-02 08:27:55 -08:00
Unknown W. Brackets
7060035303
samplerjit: Implement nearest in jit.
...
This uses the tex func and similar within jit.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
91c9343e87
samplerjit: Refactor and reuse constant pool.
...
It's just here to be rip accessible, the fixed values can be output just
once.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
40240be91c
samplerjit: Update nearest args, temp disable jit.
...
This temporarily disables jit for nearest, but refactors to use the new
arg structure. It now matches linear.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
06e954fe2a
samplerjit: Create a separate fetch func.
...
This allows nearest to become more similar to linear, where it applies the
texture function.
2022-01-01 16:58:04 -08:00
Unknown W. Brackets
3bc6009158
samplerjit: Refactor sampler ID calculation.
...
Make it the same as pixel func IDs.
2022-01-01 16:58:04 -08:00
Unknown W. Brackets
6aec68aa5c
samplerjit: Correct wrong bufw at mip levels.
...
Oops, was always using the base bufw.
2022-01-01 16:40:02 -08:00
Unknown W. Brackets
1addf84e90
samplerjit: Use SSSE3/SSE4 in linear filtering.
2021-12-30 23:22:56 -08:00
Unknown W. Brackets
28cfbe0e5a
samplerjit: Add an alternate profiling method.
...
This is more useful to group common operations together for profiling.
2021-12-29 07:11:39 -08:00
Unknown W. Brackets
74eb450e76
samplerjit: Move texture function into jit.
...
Could do this also for nearest, might end up with a third set of functions
there for a direct sample lookup (for debug funcs.)
2021-12-28 17:52:17 -08:00
Unknown W. Brackets
940e6bb1d7
samplerjit: Lookup both mip tex values.
2021-12-28 16:22:54 -08:00
Unknown W. Brackets
6b55d328e5
samplerjit: Use regcache for linear filtering.
...
This makes it easier to reuse for mipmap filtering.
2021-12-28 15:37:25 -08:00
Unknown W. Brackets
cdf14c8579
samplerjit: Calculate mip level U/V/offsets.
...
Not actually doing the sampling for the second mip level in the single jit
pass yet, but close.
2021-12-28 14:12:58 -08:00
Unknown W. Brackets
a4558a5736
samplerjit: Take texptr/bufw as arrays.
...
Prep for moving mip map sampling into linear.
2021-12-28 12:04:16 -08:00
Unknown W. Brackets
4864850b3b
samplerjit: Handle mipmap width/height in S/T calc.
2021-12-28 11:29:29 -08:00
Unknown W. Brackets
a84accf713
samplerjit: Move S/T calculation into jit.
...
Gives a pretty decent 5-10% improvement in many places.
2021-12-28 09:58:23 -08:00
Unknown W. Brackets
820361f34b
samplerjit: Calculate texel byte offset as vector.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
b00a66e34c
samplerjit: Pass u/v coords as vector.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
ce3e29a649
softjit: Fix a function arg template warning.
...
We're just ignoring it because it's a false positive in this case.
2021-12-11 10:45:27 -08:00
Unknown W. Brackets
823c4adb15
softgpu: Keep arguments in vectors for sampling.
2021-12-04 15:45:06 -08:00