Henrik Rydgård
c1a290b41f
ReplacedTexture: Bugfix D3D workaround log check
2023-07-23 22:06:06 +02:00
Henrik Rydgård
ace217008a
In D3D11, force block compressed textures to have dimensions divisible by 4
...
Fixes #17745 (crash when loading certain texture packs in D3D11)
This is an old unfortunate limitation. Only applies to the top mip
level, which makes it obvious that it's kinda unnecessary for the
hardware and indeed, Vulkan and OpenGL don't have this limitation.
2023-07-20 19:44:00 +02:00
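The divisible-by-4 rule above follows from the 4x4 texel block size of block-compressed formats. A minimal sketch of how such a check and a round-up could look; the helper names are hypothetical and are not taken from the PPSSPP source, and the log does not say whether the actual fix pads, rejects, or converts the offending textures:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not PPSSPP's actual code): rounds a dimension up to
// the next multiple of 4, the edge length of one compressed block.
inline uint32_t RoundUpTo4(uint32_t v) {
    assert(v <= 0xFFFFFFFCu);  // guard against unsigned wrap-around
    return (v + 3u) & ~3u;
}

// Hypothetical helper: true if a top mip level of this size satisfies the
// D3D11 requirement described in the commit message.
inline bool IsValidBCTopLevelSize(uint32_t width, uint32_t height) {
    return (width % 4u) == 0 && (height % 4u) == 0;
}
```

Under these assumptions, a 30x32 replacement texture would fail the check, and padding it with `RoundUpTo4` would yield 32x32.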
Henrik Rydgård
6b574e497f
Merge pull request #17730 from unknownbrackets/gedebugger-steptex
...
GE Debugger: Make step tex jump to first prim
2023-07-16 21:02:06 +02:00
Unknown W. Brackets
3b03c1ca85
GE Debugger: Make step tex jump to first prim.
2023-07-16 11:34:51 -07:00
Unknown W. Brackets
d6a5e84db5
softgpu: Fix worldpos skipping.
...
Oops, was reversed. We need worldpos for non-directional lights.
2023-07-16 10:59:44 -07:00
Unknown W. Brackets
47c29e0874
softgpu: Disable lights if all else disabled.
...
Tiny gain, but I'm seeing it happen, so might as well.
2023-07-16 10:31:58 -07:00
Unknown W. Brackets
d5b4c98f96
softgpu: Reduce some non-SIMD lighting math.
...
Small perf improvement for vertex/lighting heavy (i.e. 3D) scenes.
2023-07-16 10:31:44 -07:00
Henrik Rydgård
b4419a9146
Remove the old screen resolution popup thing
2023-07-16 17:05:26 +02:00
Henrik Rydgård
952e125c7e
Break out rendering of "notices" from OnScreenDisplay. They can now also be used as views.
...
Use it for the new message in ControlMappingScreen, when you try to map
a combo when that's disabled. It'll have more uses.
2023-07-07 15:23:19 +02:00
fp64
b0f71e08f4
Simplify projective texcoord calculation
...
As mentioned in https://github.com/hrydgard/ppsspp/issues/17613#issuecomment-1613583152 .
2023-07-03 10:59:09 -04:00
Henrik Rydgård
fc797ec55f
Merge pull request #17656 from lvonasek/compat_openxr_fixes
...
OpenXR - Game compatibility fixes
2023-07-02 21:12:21 +02:00
Lubos
6e10f20f8b
OpenXR - Tony Hawk mirroring hack better
2023-07-02 20:29:59 +02:00
Lubos
843b169fa3
OpenXR - Digimon Adventure rendering fixed
2023-07-02 15:05:29 +02:00
Unknown W. Brackets
9c08e27a0c
Merge pull request #17648 from fp64/div-less
...
Replace some signed division in SoftGPU
2023-07-01 12:28:52 -07:00
fp64
cd9f01c4df
Remove SSE4 path from Vec4<int>::operator*
2023-06-30 22:07:26 -04:00
Henrik Rydgård
eb21a2e6c9
Break out the OSD data holder from Common/System/System.h, into OSD.cpp/h
2023-06-30 17:15:49 +02:00
fp64
f133739cd0
Replace some signed division in SoftGPU
...
This also adds a few bitwise operations to Vec4<int> and further
SIMDifies it.
Also fixes an unrelated warning.
2023-06-29 16:43:21 -04:00
Unknown W. Brackets
dfe113e846
Merge pull request #17634 from fp64/macro-x86-loadu
...
Streamline x86 SSE workaround
2023-06-27 23:01:41 -07:00
Henrik Rydgård
e4229886b7
Merge pull request #17636 from lvonasek/review_openxr
...
OpenXR - Major review
2023-06-27 20:07:42 +02:00
Lubos
880168ee3c
OpenXR - Fix render glitches caused by wrong mirroring
2023-06-27 18:54:38 +02:00
M4xw
99ce3125df
[Softgpu] Fix AArch64 oversight
2023-06-27 17:20:11 +02:00
fp64
436b49c4f2
Streamline x86 SSE workaround
...
Seems clearer than using #ifdefs at each site. Also, the rationale
is clearly spelled out, one 'Go to definition' away from any instance.
2023-06-27 00:30:01 -04:00
Unknown W. Brackets
fedb92b0e9
softgpu: Ensure early depth test uses SIMD.
2023-06-25 10:18:21 -07:00
Henrik Rydgård
08d578dce9
Merge pull request #17618 from unknownbrackets/softgpu-opt-cast
...
Optimize casts in softgpu
2023-06-25 07:55:30 +02:00
Henrik Rydgård
ec92675c5e
Merge pull request #17619 from unknownbrackets/softgpu-opt-z
...
softgpu: Improve Z interpolation SIMD
2023-06-25 07:55:03 +02:00
Unknown W. Brackets
d42642edd2
softgpu: Improve Z interpolation SIMD.
2023-06-24 22:17:11 -07:00
Unknown W. Brackets
15b66ba6c0
softgpu: Make SIMD on x86_32 a bit safer.
2023-06-24 14:49:23 -07:00
Unknown W. Brackets
ae9d34370e
softgpu: Move wsum_recip out of the triangle loop.
...
Seems like a small benefit, but not seeing any issues from this.
Noticed by fp64.
2023-06-24 12:38:05 -07:00
Unknown W. Brackets
795de9b164
softgpu: Use SIMD for more Vec4 casts.
...
A number of these were falling back to some pretty terrible code.
Thanks to fp64 for noticing.
2023-06-24 12:36:44 -07:00
Unknown W. Brackets
76990aec70
Merge pull request #17609 from fp64/optimize-softgpu-tex-linear
...
softgpu: Optimize (bi-)linear texture filtering
2023-06-21 23:39:15 -07:00
fp64
159faaa2ec
softgpu: Optimize (bi-)linear texture filtering
...
Seeing as SampleLinearLevel is near the top in the profiler,
optimize actual bilinear filtering using SSE2. Solid win in the
synthetic benchmark (https://godbolt.org/z/fqh3xvbGx , also doubles
as correctness check), no visible difference in actual PPSSPP.
Note: the profiler suggests that the hot part of SampleLinearLevel is
elsewhere.
2023-06-21 20:02:34 +03:00
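For reference, the arithmetic that bilinear filtering performs per colour channel can be written in scalar form as below. This is only an illustrative sketch: the function name and the 4-bit fixed-point fractions are assumptions, and it is not the SSE2 code from the commit.

```cpp
#include <cstdint>

// Hypothetical scalar reference: blends four neighbouring texel values of one
// channel. frac_u and frac_v are assumed to be 4-bit fractions in [0, 15].
inline uint8_t BilinearChannel(uint8_t tl, uint8_t tr, uint8_t bl, uint8_t br,
                               int frac_u, int frac_v) {
    // Horizontal blends, each scaled by 16.
    int top = tl * (16 - frac_u) + tr * frac_u;
    int bottom = bl * (16 - frac_u) + br * frac_u;
    // Vertical blend scales by another 16, so shift right by 8 to normalise.
    return static_cast<uint8_t>((top * (16 - frac_v) + bottom * frac_v) >> 8);
}
```

A SIMD version would typically compute all four channels of a pixel at once with the same weights.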
Henrik Rydgård
7cc8c6cea4
OSD: Add semantics, move the OSD state to common (while keeping the renderer in the UI).
2023-06-20 14:40:46 +02:00
Unknown W. Brackets
efd8565ffe
Merge pull request #17592 from fp64/anymask-movemask
...
Use _mm_movemask_ps for AnyMask
2023-06-17 09:48:09 -07:00
fp64
ab85c46161
Use _mm_movemask_ps for AnyMask
...
Probably very minor speed improvement, but it's rather neat.
2023-06-17 01:05:02 -04:00
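`_mm_movemask_ps` gathers the sign bit of each of the four 32-bit lanes into the low four bits of an integer, so a test across all lanes becomes a single comparison. A hedged sketch of the idea follows; the function name is hypothetical, the exact condition that AnyMask tests is not stated in this log, and a scalar fallback is included for non-x86 builds:

```cpp
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#endif

// Hypothetical helper: returns true if any of the four lanes has its sign
// bit set, i.e. is negative.
inline bool AnyLaneNegative(const int32_t lanes[4]) {
#ifdef SKETCH_HAS_SSE2
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(lanes));
    // One bit per lane, taken from each lane's sign bit.
    return _mm_movemask_ps(_mm_castsi128_ps(v)) != 0;
#else
    // Scalar fallback: the OR of all lanes is negative if any lane is.
    return (lanes[0] | lanes[1] | lanes[2] | lanes[3]) < 0;
#endif
}
```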
Henrik Rydgård
5b4fa06b00
Revert Dot33 on 32-bit x86 only. See #17584
2023-06-16 23:43:33 +02:00
Henrik Rydgård
6d4e5a0f3e
Merge pull request #17584 from fp64/sse2-dot33
...
Convert Dot33 to SSE2
2023-06-15 20:08:23 +02:00
Henrik Rydgård
def09bf575
Update the uvscale uniform a bit more conservatively on framebuffer changes
...
Plus fixes a few minor oversights
Fixes #17581 and possibly #17522
2023-06-15 11:57:30 +02:00
fp64
f0d844a5a3
Convert Dot33 to SSE2
...
Simpler, lower requirements, and doesn't seem to hurt speed. See #17571 .
2023-06-14 22:02:50 -04:00
Henrik Rydgård
6d8069dfd1
Vulkan: Remove the remains of the input attachment experiment
...
Haven't been using these for a while.
I've come to the conclusion here that I think it's better to try to
deal with the issues using safe workarounds like copies, instead of
relying on features with somewhat iffy driver support that are not
universal across APIs anyway.
2023-06-13 20:46:27 +02:00
Henrik Rydgård
df7bd89b7d
Division->shift. Since it's a signed integer, this gets rid of a cdq instruction.
2023-06-13 11:57:28 +02:00
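Signed division by a power of two has to round toward zero, so compilers emit a sign fix-up around the shift, which is where instructions such as `cdq` tend to come from. A plain arithmetic shift skips the fix-up and gives the same result whenever the value is non-negative. A small sketch with hypothetical function names, assuming an arithmetic right shift for negative values (guaranteed since C++20, and the behaviour of mainstream compilers before that):

```cpp
#include <cstdint>

// Rounds toward zero, so the compiler must adjust negative inputs.
inline int32_t HalfByDivision(int32_t x) {
    return x / 2;
}

// Rounds toward negative infinity; no adjustment needed.
inline int32_t HalfByShift(int32_t x) {
    return x >> 1;
}
```

The two agree for x >= 0 and differ for odd negative x, so the substitution is only safe where the value is known to be non-negative or where the different rounding is acceptable.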
Henrik Rydgård
0eb3702ecb
Then add the early-outs for NEON too.
2023-06-13 11:48:04 +02:00
Henrik Rydgård
9647872a09
Same for NEON, first the refactor...
2023-06-13 11:48:04 +02:00
Henrik Rydgård
77da36c03f
SSE addstrip: Add the early-outs.
2023-06-13 11:47:53 +02:00
Henrik Rydgård
39034586a4
SSE: Refactor AddStrip to prepare for early out
2023-06-13 11:45:59 +02:00
Henrik Rydgård
22632b82bd
Merge pull request #17565 from hrydgard/breakout-vcache-vulkan
...
Vulkan: Breakout the vertex cache logic from DoFlush()
2023-06-13 09:56:52 +02:00
Henrik Rydgård
963ca50ba7
Merge pull request #17567 from hrydgard/uvscale-as-argument
...
Pass uvScale in as a fourth argument to the vertex decoder
2023-06-13 09:49:31 +02:00
Henrik Rydgård
71a34d4ffc
Merge pull request #17569 from hrydgard/arm64dec-optimize-saved-regs
...
ARM64: Optimize saved registers in vertex decoder.
2023-06-13 09:49:08 +02:00
Unknown W. Brackets
a7fa37d114
softgpu: Use SIMD more for dot products.
2023-06-12 19:54:32 -07:00
Henrik Rydgård
cdcf3b272e
ARM64: Optimize saved registers in vertex decoder.
...
Simplify away some arrays with unused elements
2023-06-13 00:26:38 +02:00
Henrik Rydgård
4af6fac726
Nop-align the ARM and ARM64 loops too. Many CPUs benefit somewhat from hot loops being 16-byte aligned.
2023-06-13 00:05:48 +02:00