25 commits

Author SHA1 Message Date
Julian Sikorski
e4ae22295e Merge remote-tracking branch 'upstream/master' 2019-07-15 20:51:11 +02:00
Iconoclast
2ea5951d80 Regulate undefined and defined states of RSP registers on boot.
Now with the correct file modification date set. :)
2018-12-19 01:12:14 -05:00
Iconoclast
b3f3736b54 fixed unused symbol warnings 2018-12-18 19:51:51 -05:00
Iconoclast
143911c8e8 VMOV from VT[de], not VT[e].
Fixes #21.

In the face of all adversity to other sources indicating that the four-bit shuffling element specifier is recycled as a selector for the source element from VT, the only way to pass krom's hardware tests on the VMOV operation with operands illegal to standard RSP assembler was to replace this notion with the seemingly oversimplified read from `de` instead of `e`, even though that specifier is already in use as the selector for which destination slice to write to and not just read from.

Despite being removed from any references in the corresponding translation unit's functional implementation, the four-bit element shuffling mask is still in use as with all other vector operations for pre-shuffling VT[] before jumping into the vector operation interpreter function pointer table.

In addition, the MovIn register is also half-emulated.  It is not maintained as a global state machine attribute and only stores the final, hardware-accurate result that was already going to be copied into VD[] anyway rather than the preconceived result of a direct copy from VT[e].
2018-12-18 19:16:34 -05:00
Iconoclast
e3c7f46090 refined optimization from bf7c98f to account for very high dividends
Fixes #19.

Disabling the optimized code is perhaps a temporary measure, but the more readable code under the #else clause should absolutely be kept.  The optimized version for 2's complement machines has however also been patched with a fix in case it becomes desirable to go back to enabling it for substantial speed gains.
2018-11-27 11:34:38 -05:00
Iconoclast
1f7c9fdc0f fixed regression from fixing VRCPL and VRSQL
Sign-extension is correct but only for single-precision reciprocal calculations.  Double-precision divides should still continue to mask in the zero-extended low 16 bits of the determined vector register slice if the previously executed divide instruction prepared a double-precision result rather than defining a single-precision one.
2018-11-25 17:35:40 -05:00
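
The distinction drawn in the two commit messages above (sign-extend for a single-precision reciprocal, mask in the zero-extended low 16 bits when a double-precision divide is pending) can be sketched in C roughly as follows. This is a minimal illustration, not the repository's code: the helper name `load_div_in_low` and its `div_in`, `vt_slice`, and `dp_pending` parameters are hypothetical stand-ins for the divide-unit state the messages describe.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the repository): compute the divide
 * input for a VRCPL/VRSQL-style operation.
 *
 *   div_in     - current divide input; its upper 16 bits are assumed to have
 *                been prepared by a preceding double-precision "high" op
 *   vt_slice   - the selected 16-bit slice of the source vector register
 *   dp_pending - nonzero if the previous divide instruction prepared a
 *                double-precision result
 */
static int32_t load_div_in_low(int32_t div_in, int16_t vt_slice, int dp_pending)
{
    if (dp_pending) {
        /* Double precision: keep the prepared upper half and mask in the
         * zero-extended low 16 bits of the slice. */
        uint32_t high = (uint32_t)div_in & 0xFFFF0000u;
        uint32_t low  = (uint16_t)vt_slice;
        return (int32_t)(high | low);
    }
    /* Single precision: the 16-bit slice is sign-extended to 32 bits. */
    return (int32_t)vt_slice;
}
```

Under these assumptions, a slice of `-1` becomes `-1` in the single-precision case, but contributes only `0xFFFF` to the low half when a double-precision result is pending.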
Zapeth
11acc78f6e Fix VRCPL and VRSQL ops
Removed the unsigned cast for DivIn, now passes all tests of this test rom -> https://github.com/PeterLemon/N64/tree/master/RSPTest/CP2/VRCPL
2018-11-24 22:16:53 +01:00
Francisco Zurita
cc6b8833e3 Add libretro NEON optimizations
credits: https://github.com/libretro/parallel-n64/tree/master/mupen64plus-rsp-cxd4
2017-03-04 23:36:21 -05:00
Francisco Zurita
e86432df61 Update to latest CXD4 2016-07-28 08:27:07 -04:00
34f17d1615 fixed rest of the set-but-never-used warnings 2016-03-23 23:52:01 -04:00
e9edb921cf warning: declaration of inst shadows a global declaration [-Wshadow] 2016-03-23 22:31:38 -04:00
88b125f6ab warning: overflow in implicit constant conversion [-Woverflow] 2016-03-05 17:14:06 -05:00
unknown
7d9a42c5ff Prevent in-line expansion of function do_div().
This is either for good or just temporary.  It depends how much performance is lost from having to call the NOINLINE function, but as this is the actual source of speed hits for the divide operations I find it all that much easier to benchmark it when it is not getting in-lined.

Furthermore, it's usually way low at the bottom of the function hot-spot lists anyway, so I'd rather save my 1 KB of DLL file size than worry about premature optimization for a function that needs more thorough benchmark testing anyway.
2015-12-12 18:21:19 -05:00
unknown
a1c53981a4 Make sure the RCP division ROM constants parse as unsigned. 2015-12-12 18:18:38 -05:00
unknown
4f251012b4 Try, yet again, to make GitHub not parse divide.h as C++.
At any rate, the new `static` storage class is advantageous for these divide unit state machine globals.
2015-11-29 21:10:00 -05:00
unknown
d807c78226 some trivial clean-ups 2015-11-27 19:40:39 -05:00
Gillou68310
8796295a2c Merge commit '73232513e7889c82f86fd77f81ac6a060fe7d828' 2015-11-10 11:57:18 +01:00
unknown
bf7c98f586 Conker's BFD micro-optimization with 2's cmpl. integer division 2015-06-08 16:28:21 -04:00
no
71fe84e2dc discovered and fixed implicit SE warning in 64-bit compiles 2015-01-30 15:13:53 -05:00
unknown
010f192a4d warning fix at enumerated storing of unsigned max to signed 2015-01-28 12:12:10 -05:00
unknown
cebd37c835 slight improvements to CPU complement/unsigned portability 2015-01-27 23:00:46 -05:00
unknown
9490ea8e20 fix annoying "unused local variable" warnings ifndef ARCH_MIN_SSE2 2014-10-17 02:24:03 -04:00
unknown
158a4d0b60 pass only 2 XMM operands, w/ no return slot ifndef ARCH_MIN_SSE2 2014-10-16 00:43:37 -04:00
unknown
fd765856fc regulate (temporarily?) that $vd begins as a zero'd vector 2014-10-15 00:55:40 -04:00
unknown
f1481dd39b restructured modular layout of the source, dropped some optional features 2014-10-09 16:45:55 -04:00
Renamed from vu/divrom.h