Commit graph

953 commits

Author SHA1 Message Date
Tyler Stachecki
1d339f8f74 Restructure a lot of OS-dependent stuff. 2014-11-01 14:36:48 -04:00
Tyler Stachecki
5b0297d777 More performance optimizations. 2014-10-28 00:10:14 -04:00
Tyler Stachecki
d45cab877a Add a -printsimstats switch.
Start working in an "extra" mode for debugging and other features
that we don't want on the main path. As a demonstration of what we
can do with this extra mode, print out a bunch of simulation info
that can help us optimize offline.
2014-10-27 19:25:22 -04:00
Tyler Stachecki
4ec470fed7 Cleanup CMakeLists, add support for ICC (on Unix). 2014-10-27 11:36:33 -04:00
Tyler Stachecki
79d24aa1ed Revert Clang optimization level to -O2. 2014-10-27 10:09:50 -04:00
Tyler Stachecki
56e3ed8408 Don't use AVX, even on processors that support it.
AVX instructions are a byte bigger than their non-AVX
counterparts due to the VEX prefix. Since we aren't mixing SSE and AVX
anywhere, disabling this flag on processors that support it
actually results in a performance boost.
2014-10-26 17:08:00 -04:00
Tyler Stachecki
e3700274a9 Fix Clang compiler warnings. 2014-10-26 16:34:31 -04:00
Tyler Stachecki
c522b7cab0 Some minor tweaks/fixes to the SU pipeline. 2014-10-25 17:11:45 -04:00
Tyler Stachecki
6a6f4174ca Fix edge cases for some LWC2 operations. 2014-10-25 16:46:18 -04:00
Tyler Stachecki
304f667674 Implement several LWC2/SWC2 opcodes. 2014-10-25 14:03:26 -04:00
Tyler Stachecki
b9b989131f More peephole optimizations. 2014-10-25 13:25:07 -04:00
Tyler Stachecki
0c64ae620b Combine SLL, SLLV function logic. 2014-10-25 13:01:20 -04:00
Tyler Stachecki
87986a5037 Cut some instructions from execution functions.
Extend a LUT by a couple of entries to avoid a shift at runtime.
2014-10-25 12:52:41 -04:00
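The commit above does not show which table was extended, so the following is only a sketch of the general trade, with hypothetical names and data throughout: a table indexed by a shifted field can instead be padded so that the masked field indexes it directly, which drops the shift from the hot path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-entry table selected by bits 1..2 of a field. */
static const uint8_t lut_small[4] = {10, 20, 30, 40};

static uint8_t lookup_with_shift(uint32_t field) {
  return lut_small[(field >> 1) & 0x3]; /* shift, then mask */
}

/* Same data spread over the even indices; odd slots are padding. */
static const uint8_t lut_padded[7] = {10, 0, 20, 0, 30, 0, 40};

static uint8_t lookup_without_shift(uint32_t field) {
  return lut_padded[field & 0x6]; /* mask only, no shift */
}

/* Self-check: both lookups agree for every 4-bit field value. */
static void check_lookups_agree(void) {
  for (uint32_t f = 0; f < 16; f++)
    assert(lookup_with_shift(f) == lookup_without_shift(f));
}
```

The padded table costs a few extra bytes in exchange for one fewer instruction per lookup.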
Tyler Stachecki
85a21616cc Micro-optimization: faster li emulation.
If we think about how the assembler forms 32-bit immediates, it
usually generates a lui and addiu pair. Well, if we can craft the
simulation such that lui and addiu are the same indirect target
when branching to execution functions, we can reduce the chance
that we'll mispredict and have a resulting pipeline flush on the
host.

Every cycle counts!
2014-10-25 12:40:27 -04:00
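A minimal sketch of the idea in the commit above, assuming standard MIPS encodings (ADDIU is opcode 0x09, LUI is opcode 0x0F). The function name, the dispatch table, and the exact bit trick are illustrative assumptions, not necessarily what the commit does: one handler serves both opcodes, so a lui followed by an addiu branches to the same indirect target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical handler shared by ADDIU (opcode 0x09) and LUI (opcode 0x0F). */
static uint32_t exec_addiu_lui(uint32_t iw, uint32_t rs) {
  /* Debug-only sanity check: only these two opcodes may land here. */
  assert((iw >> 26) == 0x09 || (iw >> 26) == 0x0F);

  /* Bit 28 of the word (opcode bit 2) is set for LUI, clear for ADDIU. */
  unsigned shift = (iw >> 24) & 0x10; /* 16 for LUI, 0 for ADDIU */

  /* Portable sign extension of the 16-bit immediate. */
  uint32_t imm = ((iw & 0xFFFF) ^ 0x8000) - 0x8000;

  /* LUI encodes rs as $0, so rs is 0 and the sum is just imm << 16. */
  return rs + (imm << shift);
}

typedef uint32_t (*exec_fn)(uint32_t iw, uint32_t rs);

/* Both opcodes dispatch to the same indirect target. */
static const exec_fn dispatch[64] = {
  [0x09] = exec_addiu_lui,
  [0x0F] = exec_addiu_lui,
};
```

With this layout, `lui $t0, 0x1234` followed by `addiu $t0, $t0, 0x5678` makes two consecutive indirect calls to the same address, which is the case a host branch predictor handles best.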
Tyler Stachecki
e698bfe1d1 Improving accuracy of RSP LWC2/SWC2 operations. 2014-10-25 02:06:30 -04:00
Tyler Stachecki
74327ef79e Compress LQV/SQV into one function. 2014-10-24 23:56:42 -04:00
Tyler Stachecki
c027d75198 Fix a typo leading to an unnecessarily large array. 2014-10-24 23:44:36 -04:00
Tyler Stachecki
ba2ca6f427 Fix more byte-ordering issues. This was hard. 2014-10-24 23:43:24 -04:00
Tyler Stachecki
2a90218af5 Require SSSE3 until we get SSE2 back in order. 2014-10-24 21:38:39 -04:00
Tyler Stachecki
1292220694 Fix a byte-ordering issue in the x86_64 RSP backend. 2014-10-24 21:27:18 -04:00
Tyler Stachecki
e63b13605e Various LWC2/SWC2 fixes, add VSAR. 2014-10-24 21:07:25 -04:00
Tyler Stachecki
97587e3811 Add guards around SSSE3 version of rsp_vstore_dmem. 2014-10-24 18:34:36 -04:00
Tyler Stachecki
f395be631e Start adding in support for LWC2/SWC2 ops: LQV/SQV. 2014-10-24 18:31:13 -04:00
Tyler Stachecki
d0eb4d4532 Optimize (and fix a bug in) uncached reads. 2014-10-23 09:40:47 -04:00
Tyler Stachecki
5b49bb470d More branch folding: this time, loads. 2014-10-22 18:47:27 -04:00
Tyler Stachecki
e9e82b9b22 Fix a compilation error in the last commit. 2014-10-22 18:17:30 -04:00
Tyler Stachecki
519f59f429 Start implementing some vector operators. 2014-10-22 18:15:44 -04:00
Tyler Stachecki
1061cec86b Lots of branch folding in the LD/ST aligner. 2014-10-22 18:11:50 -04:00
Tyler Stachecki
620c1cbec5 Add SSE2 support to arch/x86_64/rsp. 2014-10-21 18:39:26 -04:00
Tyler Stachecki
8ccf4eca32 Add writes to RDP space from RSP CP0. 2014-10-20 13:20:09 -04:00
Tyler Stachecki
62ebbd8c54 Fix a typo (wrong enumeration). 2014-10-20 12:58:50 -04:00
Tyler Stachecki
b9ed6920c4 Implement multicycle instruction delays. 2014-10-20 12:55:20 -04:00
Tyler Stachecki
b245149f3e Optimize out a store, add a safety net.
Not sure whether or not both destination register variables need
to be waxed during a faulted stage, but I imagine so.
2014-10-20 08:41:04 -04:00
Tyler Stachecki
2079549cc5 Hoist assignment/store to assist optimizations. 2014-10-20 08:26:27 -04:00
Tyler Stachecki
4c2b49c779 Assist optimizations by changing a macro'd value. 2014-10-20 08:10:32 -04:00
Tyler Stachecki
ab8687e263 Remove an unnecessary pair of RF writes.
We always write to $0 during bypass logic to make sure that a
forwarded value, regardless of its destination, never alters the
value of $0. Therefore, writing it to the RF as shown here is not
strictly necessary.
2014-10-20 07:42:07 -04:00
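The invariant described above can be pictured roughly as follows. This is a sketch with hypothetical names, not the project's actual bypass code: the result is stored unconditionally and $0 is re-zeroed afterwards, which replaces a "destination is not $0" branch with one extra store and makes any later write of zero to $0 redundant.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 32

/* Hypothetical writeback: store unconditionally, then force $0 back to 0. */
static void writeback(uint64_t regs[NUM_REGS], unsigned dest, uint64_t value) {
  assert(dest < NUM_REGS); /* register indices are 5-bit fields */
  regs[dest] = value;      /* may clobber $0 when dest == 0 ... */
  regs[0] = 0;             /* ... so restore the hardwired zero */
}
```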
Tyler Stachecki
715f075088 Micro-optimization: Make LDI checks cheaper. 2014-10-19 14:15:49 -04:00
Tyler Stachecki
4f359d3ded Restructure main loop to assist in optimization. 2014-10-19 14:06:52 -04:00
Tyler Stachecki
69b810cfaa Silence an unused variable warning. 2014-10-18 12:31:28 -04:00
Tyler Stachecki
9dc2a36313 Remove some RSP debugging code and sloppiness. 2014-10-18 12:30:36 -04:00
Tyler Stachecki
749b3906c9 Fix RSP DMEM endian issues and load-use code. 2014-10-18 12:26:03 -04:00
Tyler Stachecki
421b0e0519 Implement some RSP DMEM reads and writes. 2014-10-18 11:34:09 -04:00
Tyler Stachecki
4ff41a0e34 Fix DMA/interrupt issues with the RSP. 2014-10-18 11:34:02 -04:00
Tyler Stachecki
df68d13733 Fix some PC-related bugs in the RSP. 2014-10-18 11:33:56 -04:00
Tyler Stachecki
f5dc940dee Prevent the RSP from hanging the IPL. 2014-10-18 11:33:51 -04:00
Tyler Stachecki
0eea4f213e Start fleshing out the RSP backend. 2014-10-18 11:33:44 -04:00
Tyler Stachecki
f021614648 Make sure LDI is triggered under the correct conditions.
We need to trigger LDI on consumption of sources that are
heading into the DC stage, not coming out of it.
2014-10-18 11:33:37 -04:00
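One way to read the condition above, sketched with hypothetical struct and function names and assuming LDI stands for a load delay interlock: the check compares the consumer's source registers against the load sitting in the latch that feeds the DC stage, not the latch that DC has already written.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pipeline latch: what the instruction in it will write. */
struct latch {
  unsigned dest; /* destination register index */
  bool is_load;  /* true if the result comes from memory */
};

/* Returns true if a consumer reading rs/rt must stall because a load
 * that is about to enter DC (the EX/DC latch) produces one of them. */
static bool load_delay_interlock(const struct latch *exdc,
                                 unsigned rs, unsigned rt) {
  assert(rs < 32 && rt < 32); /* register indices are 5-bit fields */
  if (!exdc->is_load || exdc->dest == 0)
    return false; /* not a load, or the target is the hardwired $0 */
  return exdc->dest == rs || exdc->dest == rt;
}
```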
Tyler Stachecki
b421093700 Start fleshing out the RSP frontend. 2014-10-18 11:33:14 -04:00
Tyler Stachecki
f520f6e9b8 Start fleshing out the RSP pipeline. 2014-10-18 11:33:04 -04:00
Tyler Stachecki
7ac625cec1 Implement RSP DMAs, COP0 registers, etc. 2014-10-18 11:32:51 -04:00