Commit graph

2339 commits

Author SHA1 Message Date
Henrik Rydgard
bc121242b3 Use fast_math matrix multiplication for culling and sw transform 2014-03-22 14:40:09 +01:00
Henrik Rydgård
98da5144ef Merge pull request #5612 from raven02/patch-27
Shade mapping fix
2014-03-22 14:37:22 +01:00
Unknown W. Brackets
66f501b981 Avoid an invalid enum on GLES2 texture creation.
My device logs an error, which I'm guessing has perf impact.
2014-03-22 09:34:22 +01:00
Henrik Rydgard
f4db725400 Remove redundant call to ReplaceAlphaWithStencil 2014-03-22 09:28:45 +01:00
Henrik Rydgard
ba5d88e9d6 Fix bug in FastLoadBoneMatrix where the wrong uniform could be dirtied 2014-03-22 09:27:43 +01:00
Henrik Rydgard
0b673719c2 Crashfix for software renderer in 32-bit (SSE misalignment) 2014-03-22 00:12:21 +01:00
Unknown W. Brackets
a8a299c2e3 Fix ToRGB/ToRGBA possible accuracy loss.
It was always like this, but not used as much before.  Shifts are fast and
it needs to sum anyway, there should not be any benefit to multiplying as
floats, and it will probably lose accuracy.
2014-03-18 22:56:27 -07:00
Unknown W. Brackets
678237aa6c Improve SSE usage in software transform.
It's actually already pretty decent (unlike the softgpu), but there were a
few places it could use a bit of help.  Speeds up things with hardware
transform off, or areas that need to use software transform.
2014-03-17 23:05:48 -07:00
Unknown W. Brackets
416df17088 Inline From/ToRGB(A) to avoid losing SSE.
Otherwise it has to store it, which I'd like to avoid.
2014-03-17 23:03:04 -07:00
Unknown W. Brackets
1ce6bf399a Buildfix for 32-bit x86, arg. 2014-03-17 21:52:45 -07:00
Unknown W. Brackets
833c93bd98 Dumb mistake, forgot the divide.
Probably caused the blending issues.
2014-03-17 12:53:49 -07:00
Unknown W. Brackets
6630e45eff Just add a packed version of Vec3f.
This way we can have it aligned to memory where needed.  I think it'd be
better to avoid this if possible so that we can actually vectorize
spline/etc. code.

Fixes #5673.
2014-03-17 06:59:40 -07:00
Unknown W. Brackets
38d0bac1df Optimize some 4444/8888 color conversions.
Small performance boost in softgpu.
2014-03-17 01:21:52 -07:00
Unknown W. Brackets
6de2129f98 softgpu: Don't re-pack 8888 colors.
It's like a bad joke, but MSVC was not optimizing this out.
2014-03-16 23:03:07 -07:00
Unknown W. Brackets
10456a09ac Oops, forgot to multiply in float ToRGBA().
Not actually used...
2014-03-16 21:12:23 -07:00
Unknown W. Brackets
627027307c softgpu: Use SSE in ToRGB()/FromRGB() etc. 2014-03-16 19:21:35 -07:00
Unknown W. Brackets
07ca96e226 softgpu: Use SSE in alpha blending. 2014-03-16 18:57:11 -07:00
Unknown W. Brackets
601ff10f1e softgpu: Use SSE in tex modulation.
Could do others, this seems the most common.  Gives a few more percent.
2014-03-16 18:28:06 -07:00
Unknown W. Brackets
47728528d7 softgpu: Use SSE in Vec?::Length().
Minor perf boost but if I do everything in Vec things get slower.
2014-03-16 17:56:34 -07:00
Unknown W. Brackets
6ef0aa123f softgpu: Use SSE for the secondary color.
It's easy to speed up this code since it's so hot.
2014-03-16 16:21:12 -07:00
Unknown W. Brackets
7f3e158a0f softgpu: Get all tex samples at the same time.
Kills a bunch of overhead, improving speed more.
2014-03-16 15:51:47 -07:00
Unknown W. Brackets
d9e29a2edf softgpu: Optimize alpha blending handling.
This alone makes it a good bit faster.
2014-03-16 15:22:31 -07:00
Unknown W. Brackets
f21649e563 softgpu: Minor simplification for alpha blend. 2014-03-16 15:09:42 -07:00
Unknown W. Brackets
1ab7325d4a softgpu: Use a full Vec4 for the prim color.
Simpler, and slightly faster.
2014-03-16 15:04:41 -07:00
Unknown W. Brackets
c3530a6674 softgpu: Don't multithread small triangles.
It ends up being slower with all the overhead, of course.
2014-03-16 14:49:49 -07:00
Unknown W. Brackets
b33d0c4046 softgpu: Use SSE for texture sampling. 2014-03-16 14:33:42 -07:00
Unknown W. Brackets
b357b00ace softgpu: Use SSE for through texture coords. 2014-03-16 14:30:20 -07:00
Unknown W. Brackets
dd140b73bb softgpu: Use SSE for gouraud shading. 2014-03-16 14:29:22 -07:00
Unknown W. Brackets
743854afc8 Fix off-by-one on fast matrix loads.
May matter mostly if there's a stall right at the end of the matrix.
2014-03-15 15:23:55 -07:00
Henrik Rydgård
78ce9b3f3c Spline patches: Ignore too-small patch_div_s/t. May help #5663 2014-03-15 21:29:48 +01:00
Unknown W. Brackets
a843cbd580 Shrink the very common sceKernelThread.h include. 2014-03-15 11:44:02 -07:00
Unknown W. Brackets
996fa39684 Reduce some unnecessary includes in Core/. 2014-03-15 10:41:07 -07:00
Henrik Rydgard
b4d99b1981 Revert "Avoid caching when HW T&L with morph enabled."
This reverts commit 557eae7ca9.
2014-03-15 10:46:04 +01:00
raven02
557eae7ca9 Avoid caching when HW T&L with morph enabled. 2014-03-14 21:04:32 +08:00
Henrik Rydgard
4df49a72ab Add yet another hack setting to work around the 3rd Birthday problem.
Hopefully temporary...
2014-03-13 19:00:35 +01:00
Henrik Rydgard
2eb6a4e2f2 Fix a warning, rename some parameters, etc. 2014-03-08 10:40:43 +01:00
raven02
1b831ce022 SW T&L 2014-03-07 21:41:40 +08:00
Sacha
05571df8ec Use a VLDM in Vertex Decoder. 2014-03-07 14:25:05 +10:00
raven02
2c7c1f547d Shade mapping fix 2014-03-06 22:07:08 +08:00
Unknown W. Brackets
4fbb245382 Avoid leaving the fast runloop on jumps.
Jumps are actually very common in some games, like FF4 and Crisis Core,
and tons more.  They are used to jump around vertex data.

Improves performance by a few percent in FF4.
2014-03-05 23:24:18 -08:00
Unknown W. Brackets
505b0c388f Fix a typo. 2014-03-04 07:37:32 -08:00
Henrik Rydgard
e11e4cfff2 GCC buildfix 2014-03-04 11:38:33 +01:00
Unknown W. Brackets
b1acde2679 Oops, forgot the world matrix too.
VerySleepy is telling me that time is spent in WORLDMATRIXDATA in games,
but I didn't check the perf impact exactly.  It's probably small, but may
help some games.
2014-03-04 01:09:04 -08:00
Unknown W. Brackets
9e35822d16 Try to load view and model matrices a bit faster. 2014-03-04 00:37:28 -08:00
Unknown W. Brackets
a8f9635e28 Optimize loading of texgen matrices.
Pretty small impact, may help games that use them a lot.
2014-03-04 00:23:10 -08:00
Unknown W. Brackets
d60b0272fa Avoid flushing if the bone matrix is the same. 2014-03-04 00:17:16 -08:00
Unknown W. Brackets
eb04031975 Try to optimize inline matrix loads.
Improves performance by a few percent in Gods Eater Burst.
2014-03-04 00:11:03 -08:00
Unknown W. Brackets
f124e7dddc Fix a minor typo. 2014-03-03 00:21:04 -08:00
Unknown W. Brackets
c7437bbe8e Fix some minor warnings. 2014-03-03 00:08:32 -08:00
Henrik Rydgård
c2f76ac549 Merge pull request #5594 from unknownbrackets/gpu-minor
Fix some software skinning glitches
2014-03-03 13:32:03 +07:00