Henrik Rydgard
bc121242b3
Use fast_math matrix multiplication for culling and sw transform
2014-03-22 14:40:09 +01:00
Henrik Rydgård
98da5144ef
Merge pull request #5612 from raven02/patch-27
Shade mapping fix
2014-03-22 14:37:22 +01:00
Unknown W. Brackets
66f501b981
Avoid an invalid enum on GLES2 texture creation.
My device logs an error, which I'm guessing has perf impact.
2014-03-22 09:34:22 +01:00
Henrik Rydgard
f4db725400
Remove redundant call to ReplaceAlphaWithStencil
2014-03-22 09:28:45 +01:00
Henrik Rydgard
ba5d88e9d6
Fix bug in FastLoadBoneMatrix where the wrong uniform could be dirtied
2014-03-22 09:27:43 +01:00
Henrik Rydgard
0b673719c2
Crashfix for software renderer in 32-bit (SSE misalignment)
2014-03-22 00:12:21 +01:00
Unknown W. Brackets
a8a299c2e3
Fix ToRGB/ToRGBA possible accuracy loss.
It was always like this, but not used as much before. Shifts are fast and
it needs to sum anyway, there should not be any benefit to multiplying as
floats, and it will probably lose accuracy.
2014-03-18 22:56:27 -07:00
Unknown W. Brackets
678237aa6c
Improve SSE usage in software transform.
It's actually already pretty decent (unlike the softgpu), but there were a
few places it could use a bit of help. Speeds up things with hardware
transform off, or areas that need to use software transform.
2014-03-17 23:05:48 -07:00
Unknown W. Brackets
416df17088
Inline From/ToRGB(A) to avoid losing SSE.
Otherwise it has to store it, which I'd like to avoid.
2014-03-17 23:03:04 -07:00
Unknown W. Brackets
1ce6bf399a
Buildfix for 32-bit x86, arg.
2014-03-17 21:52:45 -07:00
Unknown W. Brackets
833c93bd98
Dumb mistake, forgot the divide.
Probably caused the blending issues.
2014-03-17 12:53:49 -07:00
Unknown W. Brackets
6630e45eff
Just add a packed version of Vec3f.
This way we can have it aligned to memory where needed. I think it'd be
better to avoid this if possible so that we can actually vectorize
spline/etc. code.
Fixes #5673.
2014-03-17 06:59:40 -07:00
Unknown W. Brackets
38d0bac1df
Optimize some 4444/8888 color conversions.
Small performance boost in softgpu.
2014-03-17 01:21:52 -07:00
Unknown W. Brackets
6de2129f98
softgpu: Don't re-pack 8888 colors.
It's like a bad joke, but MSVC was not optimizing this out.
2014-03-16 23:03:07 -07:00
Unknown W. Brackets
10456a09ac
Oops, forgot to multiply in float ToRGBA().
Not actually used...
2014-03-16 21:12:23 -07:00
Unknown W. Brackets
627027307c
softgpu: Use SSE in ToRGB()/FromRGB() etc.
2014-03-16 19:21:35 -07:00
Unknown W. Brackets
07ca96e226
softgpu: Use SSE in alpha blending.
2014-03-16 18:57:11 -07:00
Unknown W. Brackets
601ff10f1e
softgpu: Use SSE in tex modulation.
Could do others, this seems the most common. Gives a few more percent.
2014-03-16 18:28:06 -07:00
Unknown W. Brackets
47728528d7
softgpu: Use SSE in Vec?::Length().
Minor perf boost but if I do everything in Vec things get slower.
2014-03-16 17:56:34 -07:00
Unknown W. Brackets
6ef0aa123f
softgpu: Use SSE for the secondary color.
It's easy to speed up this code since it's so hot.
2014-03-16 16:21:12 -07:00
Unknown W. Brackets
7f3e158a0f
softgpu: Get all tex samples at the same time.
Kills a bunch of overhead, improving speed more.
2014-03-16 15:51:47 -07:00
Unknown W. Brackets
d9e29a2edf
softgpu: Optimize alpha blending handling.
This alone makes it a good bit faster.
2014-03-16 15:22:31 -07:00
Unknown W. Brackets
f21649e563
softgpu: Minor simplification for alpha blend.
2014-03-16 15:09:42 -07:00
Unknown W. Brackets
1ab7325d4a
softgpu: Use a full Vec4 for the prim color.
Simpler, and slightly faster.
2014-03-16 15:04:41 -07:00
Unknown W. Brackets
c3530a6674
softgpu: Don't multithread small triangles.
It ends up being slower with all the overhead, of course.
2014-03-16 14:49:49 -07:00
Unknown W. Brackets
b33d0c4046
softgpu: Use SSE for texture sampling.
2014-03-16 14:33:42 -07:00
Unknown W. Brackets
b357b00ace
softgpu: Use SSE for through texture coords.
2014-03-16 14:30:20 -07:00
Unknown W. Brackets
dd140b73bb
softgpu: Use SSE for gouraud shading.
2014-03-16 14:29:22 -07:00
Unknown W. Brackets
743854afc8
Fix off-by-one on fast matrix loads.
May matter mostly if there's a stall right at the end of the matrix.
2014-03-15 15:23:55 -07:00
Henrik Rydgård
78ce9b3f3c
Spline patches: Ignore too-small patch_div_s/t. May help #5663
2014-03-15 21:29:48 +01:00
Unknown W. Brackets
a843cbd580
Shrink the very common sceKernelThread.h include.
2014-03-15 11:44:02 -07:00
Unknown W. Brackets
996fa39684
Reduce some unnecessary includes in Core/.
2014-03-15 10:41:07 -07:00
Henrik Rydgard
b4d99b1981
Revert "Avoid caching when HW T&L with morph enabled."
This reverts commit 557eae7ca9.
2014-03-15 10:46:04 +01:00
raven02
557eae7ca9
Avoid caching when HW T&L with morph enabled.
2014-03-14 21:04:32 +08:00
Henrik Rydgard
4df49a72ab
Add yet another hack setting to work around the 3rd Birthday problem.
Hopefully temporary...
2014-03-13 19:00:35 +01:00
Henrik Rydgard
2eb6a4e2f2
Fix a warning, rename some parameters, etc.
2014-03-08 10:40:43 +01:00
raven02
1b831ce022
SW T&L
2014-03-07 21:41:40 +08:00
Sacha
05571df8ec
Use a VLDM in Vertex Decoder.
2014-03-07 14:25:05 +10:00
raven02
2c7c1f547d
Shade mapping fix
2014-03-06 22:07:08 +08:00
Unknown W. Brackets
4fbb245382
Avoid leaving the fast runloop on jumps.
Jumps are actually very common in some games, like FF4 and Crisis Core,
and tons more. They are used to jump around vertex data.
Improves performance by a few percent in FF4.
2014-03-05 23:24:18 -08:00
Unknown W. Brackets
505b0c388f
Fix a typo.
2014-03-04 07:37:32 -08:00
Henrik Rydgard
e11e4cfff2
GCC buildfix
2014-03-04 11:38:33 +01:00
Unknown W. Brackets
b1acde2679
Oops, forgot the world matrix too.
VerySleepy is telling me that time is spent in WORLDMATRIXDATA in games,
but I didn't check the perf impact exactly. It's probably small, but may
help some games.
2014-03-04 01:09:04 -08:00
Unknown W. Brackets
9e35822d16
Try to load view and model matrices a bit faster.
2014-03-04 00:37:28 -08:00
Unknown W. Brackets
a8f9635e28
Optimize loading of texgen matrices.
Pretty small impact, may help games that use them a lot.
2014-03-04 00:23:10 -08:00
Unknown W. Brackets
d60b0272fa
Avoid flushing if the bone matrix is the same.
2014-03-04 00:17:16 -08:00
Unknown W. Brackets
eb04031975
Try to optimize inline matrix loads.
Improves performance by a few percent in Gods Eater Burst.
2014-03-04 00:11:03 -08:00
Unknown W. Brackets
f124e7dddc
Fix a minor typo.
2014-03-03 00:21:04 -08:00
Unknown W. Brackets
c7437bbe8e
Fix some minor warnings.
2014-03-03 00:08:32 -08:00
Henrik Rydgård
c2f76ac549
Merge pull request #5594 from unknownbrackets/gpu-minor
Fix some software skinning glitches
2014-03-03 13:32:03 +07:00