Commit graph

156 commits

Author SHA1 Message Date
Tyler Stachecki
b4b95d1f21 Fix SSE2 RSP vector loads/stores implementation. 2014-11-10 18:32:12 -05:00
Tyler Stachecki
316214d82d (Finally) permit SSE2-only builds.
Add SSE2 codepaths where necessary (even if not complete), while
still allowing the project to be compiled with SSSE3+ intrinsics.
2014-11-10 14:29:13 -05:00
Tyler Stachecki
3a24a67f1f Fix poor SSE2-based RSP performance. 2014-11-10 11:02:57 -05:00
Tyler Stachecki
2f0e33263d Add RSP reciprocal ROM contents. 2014-11-09 13:20:10 -05:00
Tyler Stachecki
380577dfe3 Remove an assertion from VSAR...
Thanks to krom for reverse engineering this!
2014-11-08 21:27:45 -05:00
Tyler Stachecki
9f8a9f9d62 Add implementations of VMADH and VMUDH. 2014-11-08 14:01:41 -05:00
Tyler Stachecki
007d72eda1 Add implementations of VMADL and VMADM. 2014-11-08 12:21:06 -05:00
Tyler Stachecki
e89f054674 Optimize extremely aggressively.
Tell GCC to optimize cold functions for size and stash them away in
a separate part of the binary. Put the simulate core, meanwhile, on
the hot path. Also, bump optimization to -O3 as we can now "afford"
to do so.
2014-11-05 08:39:47 -05:00
Tyler Stachecki
b668296589 Add implementations of VADD and VSUB. 2014-11-03 18:06:32 -05:00
Tyler Stachecki
083ad75286 arch/x86_64: Cache RSP accumulator regs in host CPU. 2014-11-03 16:48:38 -05:00
Tyler Stachecki
d4a8f82b10 Change the RSP vector calling convention. 2014-11-02 22:45:33 -05:00
Tyler Stachecki
b5ff809881 Add an implementation of VMADN. 2014-11-02 22:31:58 -05:00
Tyler Stachecki
89ecd417d8 Pack RSP results into a result structure. 2014-11-02 13:40:49 -05:00
Tyler Stachecki
bf197cf3bd Implement VMUDL, VMUDM, VMUDN. 2014-11-02 12:44:19 -05:00
Tyler Stachecki
c4612418ed Implement VINV, fixup INV. 2014-11-02 11:57:26 -05:00
Tyler Stachecki
f6c77de8ea Fix an annoying little load-aligner bug. 2014-11-02 11:53:39 -05:00
Tyler Stachecki
6f54353825 Fix another incorrect RSP branch target. 2014-11-02 10:29:19 -05:00
Tyler Stachecki
87e856634f Undo changes from 1a90e6981e.
Only supposed to return bits [11:0] of the PC reg.
2014-11-02 10:00:22 -05:00
Tyler Stachecki
fae5dcca5d Get rid of some undefined behavior warnings. 2014-11-02 09:55:45 -05:00
Tyler Stachecki
1a90e6981e Make sure SP_PC_REG returns an address in IMEM. 2014-11-02 09:36:44 -05:00
Tyler Stachecki
aaf56a0928 Make sure RSP branch targets don't escape IMEM. 2014-11-02 09:35:50 -05:00
Tyler Stachecki
c522b7cab0 Some minor tweaks/fixes to the SU pipeline. 2014-10-25 17:11:45 -04:00
Tyler Stachecki
304f667674 Implement several LWC2/SWC2 opcodes. 2014-10-25 14:03:26 -04:00
Tyler Stachecki
b9b989131f More peephole optimizations. 2014-10-25 13:25:07 -04:00
Tyler Stachecki
0c64ae620b Combine SLL, SLLV function logic. 2014-10-25 13:01:20 -04:00
Tyler Stachecki
87986a5037 Cut some instructions from execution functions.
Extend a LUT by a couple of entries to avoid a shift at runtime.
2014-10-25 12:52:41 -04:00
Tyler Stachecki
85a21616cc Micro-optimization: faster li emulation.
If we think about how the assembler forms 32-bit immediates, it
usually generates a lui and addiu pair. Well, if we can craft the
simulation such that lui and addiu are the same indirect target
when branching to execution functions, we can reduce the chance
that we'll mispredict and have a resulting pipeline flush on the
host.

Every cycle counts!
2014-10-25 12:40:27 -04:00
Tyler Stachecki
e698bfe1d1 Improving accuracy of RSP LWC2/SWC2 operations. 2014-10-25 02:06:30 -04:00
Tyler Stachecki
74327ef79e Compress LQV/SQV into one function. 2014-10-24 23:56:42 -04:00
Tyler Stachecki
ba2ca6f427 Fix more byte-ordering issues. This was hard. 2014-10-24 23:43:24 -04:00
Tyler Stachecki
e63b13605e Various LWC2/SWC2 fixes, add VSAR. 2014-10-24 21:07:25 -04:00
Tyler Stachecki
f395be631e Start adding in support for LWC2/SWC2 ops: LQV/SQV. 2014-10-24 18:31:13 -04:00
Tyler Stachecki
519f59f429 Start implementing some vector operators. 2014-10-22 18:15:44 -04:00
Tyler Stachecki
8ccf4eca32 Add writes to RDP space from RSP CP0. 2014-10-20 13:20:09 -04:00
Tyler Stachecki
62ebbd8c54 Fix a typo (wrong enumeration). 2014-10-20 12:58:50 -04:00
Tyler Stachecki
4c2b49c779 Assist optimizations by changing a macro'd value. 2014-10-20 08:10:32 -04:00
Tyler Stachecki
ab8687e263 Remove an unnecessary pair of RF writes.
We always write to $0 during bypass logic to make sure that a
forwarded value, regardless of its destination, never alters the
value of $0. Therefore, writing it to the RF as shown here is not
strictly necessary.
2014-10-20 07:42:07 -04:00
Tyler Stachecki
69b810cfaa Silence an unused variable warning. 2014-10-18 12:31:28 -04:00
Tyler Stachecki
9dc2a36313 Remove some RSP debugging code and sloppiness. 2014-10-18 12:30:36 -04:00
Tyler Stachecki
749b3906c9 Fix RSP DMEM endian issues and load-use code. 2014-10-18 12:26:03 -04:00
Tyler Stachecki
421b0e0519 Implement some RSP DMEM reads and writes. 2014-10-18 11:34:09 -04:00
Tyler Stachecki
4ff41a0e34 Fix DMA/interrupt issues with the RSP. 2014-10-18 11:34:02 -04:00
Tyler Stachecki
df68d13733 Fix some PC-related bugs in the RSP. 2014-10-18 11:33:56 -04:00
Tyler Stachecki
f5dc940dee Prevent the RSP from hanging the IPL. 2014-10-18 11:33:51 -04:00
Tyler Stachecki
0eea4f213e Start fleshing out the RSP backend. 2014-10-18 11:33:44 -04:00
Tyler Stachecki
b421093700 Start fleshing out the RSP frontend. 2014-10-18 11:33:14 -04:00
Tyler Stachecki
f520f6e9b8 Start fleshing out the RSP pipeline. 2014-10-18 11:33:04 -04:00
Tyler Stachecki
7ac625cec1 Implement RSP DMAs, COP0 registers, etc. 2014-10-18 11:32:51 -04:00
Tyler Stachecki
440c51fef2 Add modified functions for RSP. 2014-10-18 11:32:43 -04:00
Tyler Stachecki
71961f0b00 Implement the RSP decoder. 2014-10-18 11:32:36 -04:00
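
Note on the IMEM/PC commits (aaf56a0928, 1a90e6981e, 87e856634f, 6f54353825): a minimal sketch of the masking they describe, not the project's actual code. It assumes a 4 KiB IMEM, so that only bits [11:0] of the PC are meaningful, and word-aligned instructions. The names SP_IMEM_SIZE, SP_PC_MASK, sp_pc_reg_read and sp_branch_target are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Assumption: IMEM is 4 KiB, so a PC fits in 12 bits. */
#define SP_IMEM_SIZE 0x1000u
#define SP_PC_MASK   (SP_IMEM_SIZE - 1u) /* 0xFFF */

/* Hypothetical read of SP_PC_REG: return bits [11:0] only,
 * rather than a full bus address inside IMEM. */
static inline uint32_t sp_pc_reg_read(uint32_t pc) {
  return pc & SP_PC_MASK;
}

/* Hypothetical branch-target computation: sign-extend the 16-bit
 * offset, scale it to bytes, add it to the delay-slot address, then
 * wrap the result so it cannot escape IMEM and stays word-aligned. */
static inline uint32_t sp_branch_target(uint32_t pc, int16_t offset) {
  uint32_t target = pc + 4u + ((uint32_t)(int32_t)offset << 2);
  return target & SP_PC_MASK & ~0x3u;
}
```

With these assumptions, a branch taken near the end of IMEM wraps back to the start instead of producing an out-of-range address, and a register read never exposes anything above bit 11.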