736 commits

Author SHA1 Message Date
Tyler Stachecki
2f0e33263d Add RSP reciprocal ROM contents. 2014-11-09 13:20:10 -05:00
Tyler Stachecki
380577dfe3 Remove an assertion from VSAR...
Thanks to krom for reverse engineering this!
2014-11-08 21:27:45 -05:00
Tyler Stachecki
4cfb7275a9 Fix and optimize rsp_uclamp_acc (once again). 2014-11-08 19:07:08 -05:00
Tyler Stachecki
9f8a9f9d62 Add implementations of VMADH and VMUDH. 2014-11-08 14:01:41 -05:00
Tyler Stachecki
007d72eda1 Add implementations of VMADL and VMADM. 2014-11-08 12:21:06 -05:00
Tyler Stachecki
e4da36ef72 Fix debug builds. 2014-11-05 19:22:40 -05:00
Tyler Stachecki
16a7c434da Fix/optimize the RSP accumulator clamp LO algorithm. 2014-11-05 16:58:59 -05:00
Tyler Stachecki
6a0604eaca Fix the RSP accumulator clamping algorithm. 2014-11-05 15:09:16 -05:00
Tyler Stachecki
e89f054674 Optimize extremely aggressively.
Tell GCC to optimize cold functions for size and stash them away in
a separate part of the binary. Put the simulate core, meanwhile, on
the hot path. Also, bump optimization to -O3 as we can now "afford"
to do so.
2014-11-05 08:39:47 -05:00
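As a rough illustration of the hot/cold split described in e89f054674, the sketch below uses the GCC function attributes "cold" and "hot". GCC documents that cold functions are optimized for size and, on many targets, grouped into their own text subsection, while hot functions are optimized more aggressively. The macro and function names (SKETCH_COLD, SKETCH_HOT, sketch_report_error, sketch_simulate) are hypothetical and not taken from the repository; how the commit itself applies these attributes is an assumption.

```c
#include <stdio.h>

/* Hypothetical macros: fall back to nothing on non-GCC-compatible compilers. */
#if defined(__GNUC__)
#define SKETCH_COLD __attribute__((cold, noinline))
#define SKETCH_HOT  __attribute__((hot))
#else
#define SKETCH_COLD
#define SKETCH_HOT
#endif

/* Rarely executed path: a candidate for size optimization and for being
 * placed away from the hot code. */
SKETCH_COLD static int sketch_report_error(const char *msg) {
  fprintf(stderr, "error: %s\n", msg);
  return -1;
}

/* Stand-in for a simulation loop that runs constantly. */
SKETCH_HOT static unsigned sketch_simulate(unsigned cycles) {
  unsigned acc = 0;
  unsigned i;

  for (i = 0; i < cycles; i++)
    acc += i & 3;

  return acc;
}
```

Keeping cold code out of line tends to leave more instruction cache for the hot loop, which is one plausible reason a higher optimization level becomes "affordable".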
Tyler Stachecki
b668296589 Add implementations of VADD and VSUB. 2014-11-03 18:06:32 -05:00
Tyler Stachecki
083ad75286 arch/x86_64: Cache RSP accumulator regs in host CPU. 2014-11-03 16:48:38 -05:00
Tyler Stachecki
f741923329 AVX seems to help now; enable it again. 2014-11-02 23:06:47 -05:00
Tyler Stachecki
da0436cbe1 Fix SSE preprocessor macro mistake. 2014-11-02 22:50:06 -05:00
Tyler Stachecki
716410d7b0 Remove an extra newline. 2014-11-02 22:48:31 -05:00
Tyler Stachecki
d4a8f82b10 Change the RSP vector calling convention. 2014-11-02 22:45:33 -05:00
Tyler Stachecki
b5ff809881 Add an implementation of VMADN. 2014-11-02 22:31:58 -05:00
Tyler Stachecki
89ecd417d8 Pack RSP results into a result structure. 2014-11-02 13:40:49 -05:00
Tyler Stachecki
bf197cf3bd Implement VMUDL, VMUDM, VMUDN. 2014-11-02 12:44:19 -05:00
Tyler Stachecki
c4612418ed Implement VINV, fixup INV. 2014-11-02 11:57:26 -05:00
Tyler Stachecki
f6c77de8ea Fix an annoying little load-aligner bug. 2014-11-02 11:53:39 -05:00
Tyler Stachecki
6f54353825 Fix another incorrect RSP branch target. 2014-11-02 10:29:19 -05:00
Tyler Stachecki
87e856634f Undo changes from 1a90e6981e.
Only supposed to return bits [11:0] of the PC reg.
2014-11-02 10:00:22 -05:00
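A minimal sketch of what "return bits [11:0] of the PC reg" in 87e856634f can look like: the read masks the stored value down to 12 bits, which span 4096 bytes. The function name sketch_sp_pc_read is hypothetical; the actual register read code in the repository is not shown here.

```c
#include <stdint.h>

/* Hypothetical helper: keep only bits [11:0] of a program counter value. */
static inline uint32_t sketch_sp_pc_read(uint32_t pc) {
  return pc & 0xFFFu;
}
```

For example, a stored value of 0x04001234 would read back as 0x234 under this masking.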
Tyler Stachecki
fae5dcca5d Get rid of some undefined behavior warnings. 2014-11-02 09:55:45 -05:00
Tyler Stachecki
1a90e6981e Make sure SP_PC_REG returns an address in IMEM. 2014-11-02 09:36:44 -05:00
Tyler Stachecki
aaf56a0928 Make sure RSP branch targets don't escape IMEM. 2014-11-02 09:35:50 -05:00
Tyler Stachecki
3f79c50369 Prevent wrap-arounds due to \t spillover. 2014-11-02 00:34:47 -04:00
Tyler Stachecki
a5b380c925 Couple temporary fixes for Windows/MSVC. 2014-11-02 00:28:05 -04:00
Tyler Stachecki
321c81b208 Start restoring the Windows/MSVC build. 2014-11-01 23:42:52 -04:00
Tyler Stachecki
1394c48f6b Cleanup vi/controller.c. 2014-11-01 23:37:17 -04:00
Tyler Stachecki
2bb8fa3b83 os/unix: Output VI/s in window title. 2014-11-01 21:49:58 -04:00
Tyler Stachecki
238d61d32e Cleanup hairy edges from the last commit. 2014-11-01 21:25:08 -04:00
Tyler Stachecki
9c9bdbe8bd os/unix: Thread the renderer and event system. 2014-11-01 20:42:40 -04:00
Tyler Stachecki
08439c0f5f Remove some calls to malloc(...). 2014-11-01 16:59:12 -04:00
Tyler Stachecki
1d339f8f74 Restructure a lot of OS-dependent stuff. 2014-11-01 14:36:48 -04:00
Tyler Stachecki
5b0297d777 More performance optimizations. 2014-10-28 00:10:14 -04:00
Tyler Stachecki
d45cab877a Add a -printsimstats switch.
Start working in an "extra" mode for debugging and other features
that we don't want on the main path. As a demonstration of what we
can do with this extra mode, print out a bunch of simulation info
that can help us optimize offline.
2014-10-27 19:25:22 -04:00
Tyler Stachecki
4ec470fed7 Cleanup CMakeLists, add support for ICC (on Unix). 2014-10-27 11:36:33 -04:00
Tyler Stachecki
79d24aa1ed Revert Clang optimization level to -O2. 2014-10-27 10:09:50 -04:00
Tyler Stachecki
56e3ed8408 Don't use AVX on processors that support it.
AVX instructions are a byte bigger than their non-AVX
counterparts due to the VEX prefix. Since we aren't mixing SSE and
AVX anywhere, disabling this flag on processors that support it
actually results in a performance boost.
2014-10-26 17:08:00 -04:00
Tyler Stachecki
e3700274a9 Fix Clang compiler warnings. 2014-10-26 16:34:31 -04:00
Tyler Stachecki
c522b7cab0 Some minor tweaks/fixes to the SU pipeline. 2014-10-25 17:11:45 -04:00
Tyler Stachecki
6a6f4174ca Fix edge cases for some LWC2 operations. 2014-10-25 16:46:18 -04:00
Tyler Stachecki
304f667674 Implement several LWC2/SWC2 opcodes. 2014-10-25 14:03:26 -04:00
Tyler Stachecki
b9b989131f More peephole optimizations. 2014-10-25 13:25:07 -04:00
Tyler Stachecki
0c64ae620b Combine SLL, SLLV function logic. 2014-10-25 13:01:20 -04:00
Tyler Stachecki
87986a5037 Cut some instructions from execution functions.
Extend a LUT by a couple of entries to avoid a shift at runtime.
2014-10-25 12:52:41 -04:00
Tyler Stachecki
85a21616cc Micro-optimization: faster li emulation.
If we think about how the assembler forms 32-bit immediates, it
usually generates a lui and addiu pair. Well, if we can craft the
simulation such that lui and addiu are the same indirect target
when branching to execution functions, we can reduce the chance
that we'll mispredict and have a resulting pipeline flush on the
host.

Every cycle counts!
2014-10-25 12:40:27 -04:00
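The idea in 85a21616cc can be sketched as follows, assuming a dispatch table indexed by the MIPS primary opcode. LUI (opcode 0x0F) and ADDIU (opcode 0x09) share one handler, so a lui/addiu pair takes the same indirect call target twice in a row. The handler tells the two apart by instruction bit 28, which is set for LUI and clear for ADDIU. All names (sketch_exec_fn, sketch_addiu_lui, sketch_table, sketch_step) are hypothetical, and the bit trick is one possible way to merge the two, not necessarily what the commit does.

```c
#include <stdint.h>

typedef uint32_t (*sketch_exec_fn)(uint32_t iw, uint32_t rs);

/* Shared handler: ADDIU computes rs + sext(imm); LUI computes imm << 16.
 * LUI encodes rs as register 0, so rs is 0 in that case. */
static uint32_t sketch_addiu_lui(uint32_t iw, uint32_t rs) {
  uint32_t imm = (uint32_t)(int32_t)(int16_t)(iw & 0xFFFF);
  unsigned shift = (iw >> 24) & 0x10; /* 16 for LUI, 0 for ADDIU */

  return rs + (imm << shift);
}

/* Dispatch table by primary opcode; only the two entries used here are set. */
static const sketch_exec_fn sketch_table[64] = {
  [0x09] = sketch_addiu_lui, /* ADDIU */
  [0x0F] = sketch_addiu_lui, /* LUI */
};

/* Decode, dispatch through the table, and write back to rt. */
static uint32_t sketch_step(uint32_t regs[32], uint32_t iw) {
  unsigned op = iw >> 26;
  unsigned rs = (iw >> 21) & 0x1F;
  unsigned rt = (iw >> 16) & 0x1F;
  uint32_t result = sketch_table[op](iw, regs[rs]);

  if (rt != 0) /* register 0 stays zero */
    regs[rt] = result;

  return result;
}
```

With this layout, executing lui $8, 0x1234 (0x3C081234) followed by addiu $8, $8, 0x5678 (0x25085678) leaves 0x12345678 in register 8, and both instructions go through the same function pointer.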
Tyler Stachecki
e698bfe1d1 Improving accuracy of RSP LWC2/SWC2 operations. 2014-10-25 02:06:30 -04:00
Tyler Stachecki
74327ef79e Compress LQV/SQV into one function. 2014-10-24 23:56:42 -04:00
Tyler Stachecki
c027d75198 Fix a typo leading to an unnecessarily large array. 2014-10-24 23:44:36 -04:00