273 commits

Author SHA1 Message Date
unknown
c1ccc32408 deprecated MASK_XOR (old, broken, meant only to avoid multiplying) 2015-01-27 21:26:42 -05:00
unknown
95cf462dfb force dummy buffer allocations for LWC2, native SWC2 wraparound 2015-01-21 15:10:18 -05:00
unknown
a4a7f4bd8e forgot to modernize a few types 2015-01-18 16:39:59 -05:00
unknown
c42ac84651 Correct VCH sign flag comparison on -32768. 2015-01-18 12:56:16 -05:00
unknown
74b3ee72ce Force two-dimensional merging in VCL. 2015-01-17 22:29:44 -05:00
unknown
9c49dc4fff abolish SSSE3 configurator for byte-wise shuffling 2014-12-13 16:34:48 -05:00
unknown
699896f677 install new pointer types to distinguish mem. reference from decl.'s 2014-12-08 23:47:50 -05:00
unknown
ac4fb238da temporarily supporting the SSSE3 superset for early experiments 2014-11-16 15:46:54 -05:00
unknown
bfd74741f9 force vectorization of unsigned multiply, overflow and VMADL clamp 2014-10-28 20:50:10 -04:00
unknown
55ad9ad9d8 optimized VMADN with static overflow, carry and multiply-add 2014-10-28 15:35:05 -04:00
unknown
6d17d19dc6 correspond VMUDM intrinsics to multiply-accumulate variation 2014-10-26 23:58:18 -04:00
unknown
5cce9f457e new algorithm for mixed signed * unsigned factorization 2014-10-23 16:45:01 -04:00
unknown
ef09b4eb5d redesign VMUDN with carry and overflow/underflow SSE logic 2014-10-22 22:45:34 -04:00
unknown
f810a85e31 refer unsigned overflow to `negative' mask 2014-10-21 19:58:10 -04:00
unknown
c7a468e3d7 corresponding optimizations to VMUDL (same multiply, diff. clamp) 2014-10-21 00:05:26 -04:00
unknown
9dbdcc490c restyled some optimization and fix 48-bit MADD sign-extension 2014-10-20 22:25:01 -04:00
unknown
b832e39a92 merged bi-arch VMULF template into optimized SIMD mulf 2014-10-20 00:57:40 -04:00
unknown
d768f51077 more direct multiply-add high operation without bi-arch template 2014-10-18 22:27:08 -04:00
unknown
79c5aa0cf4 removed bi-arch template for VMUDH 2014-10-17 22:43:43 -04:00
unknown
291e7fb10b remove bi-arch template for VMUDL as mudl was greatly simplified in SSE 2014-10-17 19:02:02 -04:00
unknown
9490ea8e20 fix annoying "unused local variable" warnings ifndef ARCH_MIN_SSE2 2014-10-17 02:24:03 -04:00
unknown
2e1e9edf75 cut SHUFFLE_VECTOR to only 2 arguments with pre-loaded VT 2014-10-17 02:23:08 -04:00
unknown
f05e2d603e globalize the shuffle macro for a future shot at SMC in su.c 2014-10-17 00:50:30 -04:00
unknown
158a4d0b60 pass only 2 XMM operands, w/ no return slot ifndef ARCH_MIN_SSE2 2014-10-16 00:43:37 -04:00
unknown
fd765856fc regulate (temporarily?) that $vd begins as a zero'd vector 2014-10-15 00:55:40 -04:00
unknown
c550a19a8a fail paste kthxbai 2014-10-14 23:05:24 -04:00
unknown
e8e87ce602 some new flexible intrinsic macros for vector operations 2014-10-14 17:59:00 -04:00
unknown
9bfe2c20c3 new SSE2 shuffling template for later staticization 2014-10-14 14:21:54 -04:00
unknown
91ba902637 removed extra load/store from old scalar SHUFFLE_VECTOR template 2014-10-14 04:53:49 -04:00
unknown
7d80d7d115 fix macro re-definition warning in GCC (already in my_types.h) 2014-10-10 01:38:41 -04:00
unknown
f1481dd39b restructured modular layout of the source, dropped some optional features 2014-10-09 16:45:55 -04:00
unknown
d5692be247 dissolved VU arguments into higher-level SIMD vector call stacks 2014-10-07 00:33:15 -04:00
unknown
d24da458c6 clip test optimizations, fixes Wrestlemania 2000 clamped -(-32768) 2014-09-24 15:30:32 -04:00
unknown
a68b3251ed Harvest Moon 64 regression from 2013/9/11--broken zero-upper wrapping 2014-08-14 14:22:33 -04:00
unknown
d1f7688deb force copy alignment better to low GCC 4.8.1 compiler intelligence 2014-08-13 08:03:38 -04:00
unknown
3c9c183b47 if vector_copy() on shuffled vector, force ST alignment for compiler 2014-08-13 07:12:19 -04:00
unknown
93a6648926 more MSVC compliance (lack of implicit type conversion, force alignment) 2014-08-13 07:07:42 -04:00
unknown
1125d4a9d6 some MSVC compliance changes, force literal immediate syntax 2014-08-13 07:00:49 -04:00
unknown
861b636ced forgot to update the modification date 2014-08-13 00:08:15 -04:00
unknown
9d760917a1 Merge branch 'master' of https://github.com/cxd4/rsp 2014-08-12 22:21:58 -04:00
unknown
5cae28f628 Quake64 fix, delay shared register file writeback until after set NOTEQUAL 2014-08-12 22:19:44 -04:00
Sven Eckelmann
4392ebde42 Add an explicit public domain dedication statement
The process of dedicating a piece of work under the public domain is not the
same under different legal systems. It is possible that different rights are
given away depending on the origin. Sometimes even the dedication of the work
under public domain is not possible at all. CC0 tries to provide an explicit
way to waive all rights to still provide a secure way for other parties to use
this work [1] and provides a fallback when parts of the license may be judged
invalid under any jurisdiction.

[1] http://creativecommons.org/about/cc0
2013-12-17 13:36:56 +01:00
Sven Eckelmann
05696fde5a Replace long with (u)int32_t for systems with sizeof(long) > 4
This only affects long's which are used on an 64-bit system. The zilmar spec
with its DWORD is not touched because on 32 bit it is always 32 bit for common
systems. On Win64 systems it is also 32 bit because Microsoft adopted LLP64.
All other systems seem to use LP64 (IL32LP64), ILP64 or SILP64.

ILP64 and SILP64 would also need to have shorts and int (when the code expects
32 bit) to be changed to (u)int*_t counterparts. This is not done in here.
2013-12-12 12:53:27 +01:00
Sven Eckelmann
0faed67a64 Fix vector divides on systems with sizeof(long) > 4
The wrong div operations result in wrong transformations in many roms.

This regression was introduced in 87210c71a5
("cleanups and rewrites to the vector divide class").
2013-12-12 00:34:19 +01:00
unknown
92469fafc1 fetched wrong file for debugging VCE, fix signed acc logs 2013-12-04 19:05:39 -05:00
unknown
fdc611e6d4 staticized CPU loop PC updates to iterate within J/B 2013-12-04 18:16:28 -05:00
unknown
ec693a456d decided to implement the VRSQ op, with a warning 2013-12-01 20:38:23 -05:00
unknown
87210c71a5 cleanups and rewrites to the vector divide class 2013-11-26 19:15:19 -05:00
unknown
0b8b2ad900 restored old (surprisingly faster) small VU operand allocation 2013-11-26 13:22:37 -05:00
unknown
8607b1625d fixed jittery 3-D textures bug in World Driver Championship 2013-11-25 18:18:00 -05:00