Tyler Stachecki
1d339f8f74
Restructure a lot of OS-dependent stuff.
2014-11-01 14:36:48 -04:00
Tyler Stachecki
5b0297d777
More performance optimizations.
2014-10-28 00:10:14 -04:00
Tyler Stachecki
d45cab877a
Add a -printsimstats switch.
...
Start working in an "extra" mode for debugging and other features
that we don't want on the main path. As a demonstration of what we
can do with this extra mode, print out a bunch of simulation info
that can help us optimize offline.
2014-10-27 19:25:22 -04:00
Tyler Stachecki
4ec470fed7
Cleanup CMakeLists, add support for ICC (on Unix).
2014-10-27 11:36:33 -04:00
Tyler Stachecki
79d24aa1ed
Revert Clang optimization level to -O2.
2014-10-27 10:09:50 -04:00
Tyler Stachecki
56e3ed8408
Don't use AVX on processors that support it.
...
AVX instructions are a byte bigger than their non-AVX
counterparts due to the VEX prefix. Since we aren't mixing SSE and AVX
anywhere, disabling this flag on processors that support it
actually results in a performance boost.
2014-10-26 17:08:00 -04:00
Tyler Stachecki
e3700274a9
Fix Clang compiler warnings.
2014-10-26 16:34:31 -04:00
Tyler Stachecki
c522b7cab0
Some minor tweaks/fixes to the SU pipeline.
2014-10-25 17:11:45 -04:00
Tyler Stachecki
6a6f4174ca
Fix edge cases for some LWC2 operations.
2014-10-25 16:46:18 -04:00
Tyler Stachecki
304f667674
Implement several LWC2/SWC2 opcodes.
2014-10-25 14:03:26 -04:00
Tyler Stachecki
b9b989131f
More peephole optimizations.
2014-10-25 13:25:07 -04:00
Tyler Stachecki
0c64ae620b
Combine SLL, SLLV function logic.
2014-10-25 13:01:20 -04:00
Tyler Stachecki
87986a5037
Cut some instructions from execution functions.
...
Extend a LUT by a couple of entries to avoid a shift at runtime.
2014-10-25 12:52:41 -04:00
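A minimal sketch of the general technique named in 87986a5037, not the actual change: widening a lookup table so the raw low bits index it directly and the shift disappears from the hot path. The table contents and the names lut_small, lut_wide, lookup_shifted and lookup_direct are hypothetical, and this sketch doubles the table, whereas the commit only adds a couple of entries.

```c
#include <stdint.h>

/* Hypothetical 4-entry table, indexed by bits 2..1 of x (needs a shift). */
static const uint8_t lut_small[4] = { 10, 20, 30, 40 };

static uint8_t lookup_shifted(unsigned x) {
  return lut_small[(x >> 1) & 0x3];
}

/* Same data with each entry repeated, so bits 2..0 index it directly. */
static const uint8_t lut_wide[8] = { 10, 10, 20, 20, 30, 30, 40, 40 };

static uint8_t lookup_direct(unsigned x) {
  return lut_wide[x & 0x7];
}
```

Both functions return the same value for every input; the second trades a few bytes of table space for one less operation per lookup.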
Tyler Stachecki
85a21616cc
Micro-optimization: faster li emulation.
...
If we think about how the assembler forms 32-bit immediates, it
usually generates a lui and addiu pair. Well, if we can craft the
simulation such that lui and addiu are the same indirect target
when branching to execution functions, we can reduce the chance
that we'll mispredict and have a resulting pipeline flush on the
host.
Every cycle counts!
2014-10-25 12:40:27 -04:00
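One way the idea in 85a21616cc could look, as a sketch under stated assumptions rather than the code from the commit: a single execution function serves both lui and addiu, so an indirect branch through the dispatch table lands on the same target for both halves of an li. The selector uses bit 2 of the standard MIPS opcodes (ADDIU = 0x09, LUI = 0x0F); the names exec_addiu_lui, rs_mask_lut, shift_lut and make_iw are hypothetical.

```c
#include <stdint.h>

enum { OP_ADDIU = 0x09, OP_LUI = 0x0F };

/* Index 0 = addiu (keep rs, no shift); index 1 = lui (drop rs, shift 16). */
static const uint32_t rs_mask_lut[2] = { 0xFFFFFFFFu, 0x00000000u };
static const unsigned shift_lut[2] = { 0, 16 };

/* Hypothetical helper: build an instruction word from opcode + immediate. */
static uint32_t make_iw(uint32_t opcode, uint32_t imm16) {
  return (opcode << 26) | (imm16 & 0xFFFFu);
}

/* Shared execution function: one indirect target for both opcodes. */
static uint32_t exec_addiu_lui(uint32_t iw, uint32_t rs_val) {
  /* Opcode bit 2 lives at bit 28 of the word: 0 for addiu, 1 for lui. */
  unsigned sel = (iw >> 28) & 1;

  /* Portable sign extension of the 16-bit immediate. */
  uint32_t imm = ((iw & 0xFFFFu) ^ 0x8000u) - 0x8000u;

  return (rs_val & rs_mask_lut[sel]) + (imm << shift_lut[sel]);
}
```

Running `lui` with 0x1234 and then `addiu` with 0x5678 through this one function yields 0x12345678, which is the li expansion the commit message describes.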
Tyler Stachecki
e698bfe1d1
Improving accuracy of RSP LWC2/SWC2 operations.
2014-10-25 02:06:30 -04:00
Tyler Stachecki
74327ef79e
Compress LQV/SQV into one function.
2014-10-24 23:56:42 -04:00
Tyler Stachecki
c027d75198
Fix a typo leading to an unnecessarily large array.
2014-10-24 23:44:36 -04:00
Tyler Stachecki
ba2ca6f427
Fix more byte-ordering issues. This was hard.
2014-10-24 23:43:24 -04:00
Tyler Stachecki
2a90218af5
Require SSSE3 until we get SSE2 back in order.
2014-10-24 21:38:39 -04:00
Tyler Stachecki
1292220694
Fix a byte-ordering issue in the x86_64 RSP backend.
2014-10-24 21:27:18 -04:00
Tyler Stachecki
e63b13605e
Various LWC2/SWC2 fixes, add VSAR.
2014-10-24 21:07:25 -04:00
Tyler Stachecki
97587e3811
Add guards around SSSE3 version of rsp_vstore_dmem.
2014-10-24 18:34:36 -04:00
Tyler Stachecki
f395be631e
Start adding in support for LWC2/SWC2 ops: LQV/SQV.
2014-10-24 18:31:13 -04:00
Tyler Stachecki
d0eb4d4532
Optimize (and fix a bug in) uncached reads.
2014-10-23 09:40:47 -04:00
Tyler Stachecki
5b49bb470d
More branch folding: this time, loads.
2014-10-22 18:47:27 -04:00
Tyler Stachecki
e9e82b9b22
Fix a compilation error in the last commit.
2014-10-22 18:17:30 -04:00
Tyler Stachecki
519f59f429
Start implementing some vector operators.
2014-10-22 18:15:44 -04:00
Tyler Stachecki
1061cec86b
Lots of branch folding in the LD/ST aligner.
2014-10-22 18:11:50 -04:00
Tyler Stachecki
620c1cbec5
Add SSE2 support to arch/x86_64/rsp.
2014-10-21 18:39:26 -04:00
Tyler Stachecki
8ccf4eca32
Add writes to RDP space from RSP CP0.
2014-10-20 13:20:09 -04:00
Tyler Stachecki
62ebbd8c54
Fix a typo (wrong enumeration).
2014-10-20 12:58:50 -04:00
Tyler Stachecki
b9ed6920c4
Implement multicycle instruction delays.
2014-10-20 12:55:20 -04:00
Tyler Stachecki
b245149f3e
Optimize out a store, add a safety net.
...
Not sure whether or not both destination register variables need
to be waxed during a faulted stage, but I imagine so.
2014-10-20 08:41:04 -04:00
Tyler Stachecki
2079549cc5
Hoist assignment/store to assist optimizations.
2014-10-20 08:26:27 -04:00
Tyler Stachecki
4c2b49c779
Assist optimizations by changing a macro'd value.
2014-10-20 08:10:32 -04:00
Tyler Stachecki
ab8687e263
Remove an unnecessary pair of RF writes.
...
We always write to $0 during bypass logic to make sure that a
forwarded value, regardless of its destination, never alters the
value of $0. Therefore, writing it to the RF as shown here is not
strictly necessary.
2014-10-20 07:42:07 -04:00
Tyler Stachecki
715f075088
Micro-optimization: Make LDI checks cheaper.
2014-10-19 14:15:49 -04:00
Tyler Stachecki
4f359d3ded
Restructure main loop to assist in optimization.
2014-10-19 14:06:52 -04:00
Tyler Stachecki
69b810cfaa
Silence an unused variable warning.
2014-10-18 12:31:28 -04:00
Tyler Stachecki
9dc2a36313
Remove some RSP debugging code and sloppiness.
2014-10-18 12:30:36 -04:00
Tyler Stachecki
749b3906c9
Fix RSP DMEM endian issues and load-use code.
2014-10-18 12:26:03 -04:00
Tyler Stachecki
421b0e0519
Implement some RSP DMEM reads and writes.
2014-10-18 11:34:09 -04:00
Tyler Stachecki
4ff41a0e34
Fix DMA/interrupt issues with the RSP.
2014-10-18 11:34:02 -04:00
Tyler Stachecki
df68d13733
Fix some PC-related bugs in the RSP.
2014-10-18 11:33:56 -04:00
Tyler Stachecki
f5dc940dee
Prevent the RSP from hanging the IPL.
2014-10-18 11:33:51 -04:00
Tyler Stachecki
0eea4f213e
Start fleshing out the RSP backend.
2014-10-18 11:33:44 -04:00
Tyler Stachecki
f021614648
Make sure LDI is triggered under the correct conditions.
...
We need to trigger LDI on consumption of sources that are
heading into the DC stage, not coming out of it.
2014-10-18 11:33:37 -04:00
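The condition described in f021614648 can be sketched as a textbook load-use check, assuming LDI means a load delay interlock. This is an illustration, not the simulator's code: the latch structs and the name needs_ldi are hypothetical. The point it shows is that the comparison is made against the load that is about to enter DC, not one that has already left it.

```c
#include <stdbool.h>

/* Hypothetical latch contents; field names are illustrative only. */
struct exdc_latch {    /* instruction heading into the DC stage */
  bool is_load;
  unsigned dest;       /* destination register; 0 means "none" */
};

struct rfex_latch {    /* instruction about to consume its sources */
  unsigned rs, rt;
};

/* True when the consumer must stall for the load ahead of it. */
static bool needs_ldi(const struct exdc_latch *exdc,
                      const struct rfex_latch *rfex) {
  if (!exdc->is_load || exdc->dest == 0)
    return false;      /* not a load, or the target is $0 */

  return exdc->dest == rfex->rs || exdc->dest == rfex->rt;
}
```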
Tyler Stachecki
b421093700
Start fleshing out the RSP frontend.
2014-10-18 11:33:14 -04:00
Tyler Stachecki
f520f6e9b8
Start fleshing out the RSP pipeline.
2014-10-18 11:33:04 -04:00
Tyler Stachecki
7ac625cec1
Implement RSP DMAs, COP0 registers, etc.
2014-10-18 11:32:51 -04:00