Tyler Stachecki
2f0e33263d
Add RSP reciprocal ROM contents.
2014-11-09 13:20:10 -05:00
Tyler Stachecki
380577dfe3
Remove an assertion from VSAR...
...
Thanks to krom for reverse engineering this!
2014-11-08 21:27:45 -05:00
Tyler Stachecki
4cfb7275a9
Fix and optimize rsp_uclamp_acc (once again).
2014-11-08 19:07:08 -05:00
Tyler Stachecki
9f8a9f9d62
Add implementations of VMADH and VMUDH.
2014-11-08 14:01:41 -05:00
Tyler Stachecki
007d72eda1
Add implementations of VMADL and VMADM.
2014-11-08 12:21:06 -05:00
Tyler Stachecki
e4da36ef72
Fix debug builds.
2014-11-05 19:22:40 -05:00
Tyler Stachecki
16a7c434da
Fix/optimize the RSP accumulator clamp LO algorithm.
2014-11-05 16:58:59 -05:00
Tyler Stachecki
6a0604eaca
Fix the RSP accumulator clamping algorithm.
2014-11-05 15:09:16 -05:00
Tyler Stachecki
e89f054674
Optimize extremely aggressively.
...
Tell GCC to optimize cold functions for size and stash them away in
a separate part of the binary. Put the simulate core, meanwhile, on
the hot path. Also, bump optimization to -O3 as we can now "afford"
to do so.
2014-11-05 08:39:47 -05:00
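The hot/cold split described in e89f054674 can be sketched with GCC's function attributes. This is a minimal illustration, not the project's actual code: the macro names SIM_HOT and SIM_COLD and both functions are hypothetical. With GCC, a cold function is optimized for size and, on many targets, placed in a separate text subsection, while a hot function is optimized more aggressively.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical macro names; the attributes themselves are GCC/Clang features. */
#if defined(__GNUC__)
#define SIM_HOT  __attribute__((hot))
#define SIM_COLD __attribute__((cold))
#else
#define SIM_HOT
#define SIM_COLD
#endif

/* Rarely executed error path: a candidate for size optimization. */
SIM_COLD int report_fault(uint32_t pc) {
  fprintf(stderr, "fault at 0x%08x\n", (unsigned)pc);
  return -1;
}

/* Stand-in for the simulation core: the loop we want on the hot path. */
SIM_HOT int simulate(uint32_t cycles) {
  uint32_t pc = 0;

  for (uint32_t i = 0; i < cycles; i++) {
    pc = (pc + 4) & 0xFFF;      /* advance a 12-bit program counter */

    if (pc & 0x3)               /* misaligned PC: never true in this sketch */
      return report_fault(pc);
  }

  return (int)pc;
}
```

Calling simulate(4) walks the counter through four instruction slots and never touches the cold function, so the compiler is free to keep report_fault away from the loop's code.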
Tyler Stachecki
b668296589
Add implementations of VADD and VSUB.
2014-11-03 18:06:32 -05:00
Tyler Stachecki
083ad75286
arch/x86_64: Cache RSP accumulator regs in host CPU.
2014-11-03 16:48:38 -05:00
Tyler Stachecki
f741923329
AVX seems to help now; enable it again.
2014-11-02 23:06:47 -05:00
Tyler Stachecki
da0436cbe1
Fix SSE preprocessor macro mistake.
2014-11-02 22:50:06 -05:00
Tyler Stachecki
716410d7b0
Remove an extra newline.
2014-11-02 22:48:31 -05:00
Tyler Stachecki
d4a8f82b10
Change the RSP vector calling convention.
2014-11-02 22:45:33 -05:00
Tyler Stachecki
b5ff809881
Add an implementation of VMADN.
2014-11-02 22:31:58 -05:00
Tyler Stachecki
89ecd417d8
Pack RSP results into a result structure.
2014-11-02 13:40:49 -05:00
Tyler Stachecki
bf197cf3bd
Implement VMUDL, VMUDM, VMUDN.
2014-11-02 12:44:19 -05:00
Tyler Stachecki
c4612418ed
Implement VINV, fixup INV.
2014-11-02 11:57:26 -05:00
Tyler Stachecki
f6c77de8ea
Fix an annoying little load-aligner bug.
2014-11-02 11:53:39 -05:00
Tyler Stachecki
6f54353825
Fix another incorrect RSP branch target.
2014-11-02 10:29:19 -05:00
Tyler Stachecki
87e856634f
Undo changes from 1a90e6981e.
...
Only supposed to return bits [11:0] of the PC reg.
2014-11-02 10:00:22 -05:00
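The behavior restored by 87e856634f (a read of the PC register yields only bits [11:0]) amounts to a single mask. The sketch below assumes a hypothetical helper name, read_sp_pc_reg, and a caller that passes in the current PC value; it is not taken from the source tree.

```c
#include <stdint.h>

/* Hypothetical helper: return only bits [11:0] of the program counter. */
static uint32_t read_sp_pc_reg(uint32_t pc) {
  return pc & 0xFFF;
}
```

For an illustrative input of 0x04001234, the helper yields 0x234 rather than a full address.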
Tyler Stachecki
fae5dcca5d
Get rid of some undefined behavior warnings.
2014-11-02 09:55:45 -05:00
Tyler Stachecki
1a90e6981e
Make sure SP_PC_REG returns an address in IMEM.
2014-11-02 09:36:44 -05:00
Tyler Stachecki
aaf56a0928
Make sure RSP branch targets don't escape IMEM.
2014-11-02 09:35:50 -05:00
Tyler Stachecki
3f79c50369
Prevent wrap-arounds due to \t spillover.
2014-11-02 00:34:47 -04:00
Tyler Stachecki
a5b380c925
Couple temporary fixes for Windows/MSVC.
2014-11-02 00:28:05 -04:00
Tyler Stachecki
321c81b208
Start restoring the Windows/MSVC build.
2014-11-01 23:42:52 -04:00
Tyler Stachecki
1394c48f6b
Cleanup vi/controller.c.
2014-11-01 23:37:17 -04:00
Tyler Stachecki
2bb8fa3b83
os/unix: Output VI/s in window title.
2014-11-01 21:49:58 -04:00
Tyler Stachecki
238d61d32e
Cleanup hairy edges from the last commit.
2014-11-01 21:25:08 -04:00
Tyler Stachecki
9c9bdbe8bd
os/unix: Thread the renderer and event system.
2014-11-01 20:42:40 -04:00
Tyler Stachecki
08439c0f5f
Remove some calls to malloc(...).
2014-11-01 16:59:12 -04:00
Tyler Stachecki
1d339f8f74
Restructure a lot of OS-dependent stuff.
2014-11-01 14:36:48 -04:00
Tyler Stachecki
5b0297d777
More performance optimizations.
2014-10-28 00:10:14 -04:00
Tyler Stachecki
d45cab877a
Add a -printsimstats switch.
...
Start working in an "extra" mode for debugging and other features
that we don't want on the main path. As a demonstration of what we
can do with this extra mode, print out a bunch of simulation info
that can help us optimize offline.
2014-10-27 19:25:22 -04:00
Tyler Stachecki
4ec470fed7
Cleanup CMakeLists, add support for ICC (on Unix).
2014-10-27 11:36:33 -04:00
Tyler Stachecki
79d24aa1ed
Revert Clang optimization level to -O2.
2014-10-27 10:09:50 -04:00
Tyler Stachecki
56e3ed8408
Don't use AVX on processors that support it.
...
AVX instructions are a byte bigger than their non-AVX counterparts
due to the VEX prefix. Since we aren't mixing SSE and AVX
anywhere, disabling this flag on processors that support it
actually results in a performance boost.
2014-10-26 17:08:00 -04:00
Tyler Stachecki
e3700274a9
Fix Clang compiler warnings.
2014-10-26 16:34:31 -04:00
Tyler Stachecki
c522b7cab0
Some minor tweaks/fixes to the SU pipeline.
2014-10-25 17:11:45 -04:00
Tyler Stachecki
6a6f4174ca
Fix edge cases for some LWC2 operations.
2014-10-25 16:46:18 -04:00
Tyler Stachecki
304f667674
Implement several LWC2/SWC2 opcodes.
2014-10-25 14:03:26 -04:00
Tyler Stachecki
b9b989131f
More peephole optimizations.
2014-10-25 13:25:07 -04:00
Tyler Stachecki
0c64ae620b
Combine SLL, SLLV function logic.
2014-10-25 13:01:20 -04:00
Tyler Stachecki
87986a5037
Cut some instructions from execution functions.
...
Extend a LUT by a couple of entries to avoid a shift at runtime.
2014-10-25 12:52:41 -04:00
Tyler Stachecki
85a21616cc
Micro-optimization: faster li emulation.
...
If we think about how the assembler forms 32-bit immediates, it
usually generates a lui and addiu pair. Well, if we can craft the
simulation such that lui and addiu are the same indirect target
when branching to execution functions, we can reduce the chance
that we'll mispredict and have a resulting pipeline flush on the
host.
Every cycle counts!
2014-10-25 12:40:27 -04:00
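One way lui and addiu could share a single execution function, as 85a21616cc describes, is to parameterize the differences as data so that both opcodes dispatch to the same indirect target. The sketch below is an assumption about how that might look, not the commit's implementation: the struct imm_op, the two constants, and exec_imm are hypothetical names, and registers are assumed to be 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical per-opcode parameters, looked up at decode time. */
struct imm_op {
  uint32_t rs_mask;   /* all ones keeps rs (addiu); zero discards it (lui) */
  unsigned shift;     /* 0 for addiu, 16 for lui */
};

static const struct imm_op OP_ADDIU = { 0xFFFFFFFFu, 0 };
static const struct imm_op OP_LUI   = { 0x00000000u, 16 };

/* One execution function serves both opcodes, so the host sees one target. */
static uint32_t exec_imm(struct imm_op op, uint32_t rs, uint16_t imm) {
  /* Sign-extend the 16-bit immediate, then work in unsigned arithmetic. */
  uint32_t sext = (uint32_t)(int32_t)(int16_t)imm;

  return (rs & op.rs_mask) + (sext << op.shift);
}
```

Under these assumptions, a lui of 0x1234 followed by an addiu of 0x5678 produces 0x12345678 through the same function, which is the property that keeps the host's indirect branch predictor on one target for the pair.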
Tyler Stachecki
e698bfe1d1
Improving accuracy of RSP LWC2/SWC2 operations.
2014-10-25 02:06:30 -04:00
Tyler Stachecki
74327ef79e
Compress LQV/SQV into one function.
2014-10-24 23:56:42 -04:00
Tyler Stachecki
c027d75198
Fix a typo leading to an unnecessarily large array.
2014-10-24 23:44:36 -04:00