Commit graph

243 commits

Author SHA1 Message Date
Tyler J. Stachecki
bb3e305061 Delay when the cache operation requires it.
Also slightly tighten the emulated memory delays. With
this commit, WDC boots (but crashes shortly after). Seems
like memory timings are coming into play, among other
things.
2015-08-19 00:00:27 -04:00
Tyler Stachecki
dfe7d59ec9 Implement DCB-type stalls. 2015-07-05 08:15:04 -04:00
Derek "Turtle" Roe
c4afd44ed7 See long description
Replaced all references to simulation with emulation
Updated copyright year
Updated .gitignore to reduce chances of random files being uploaded to
the repo
Added .gitattributes to normalize all text files, and to ignore binary
files (which includes the logo and the NEC PDF)
2015-07-03 08:18:16 -04:00
Tyler J. Stachecki
b966e79a04 Add a temporary hack for the CACHE instruction.
When a CACHE instruction uses a mapped virtual address,
and a TLB miss results... just ignore it! Clearly, this
isn't the right thing to do, but all documentation is
ambiguous and this seems to float the boat for now.
2015-05-20 22:36:13 -04:00
Tyler J. Stachecki
a1b9e49186 VR4300: CACHE instructions can't cause TLB Mod. 2015-05-20 20:59:03 -04:00
Tyler J. Stachecki
df05ac51f4 VR4300: Minor pipeline optimizations. 2015-05-20 20:58:57 -04:00
Tyler J. Stachecki
5afe6c5f52 Various small optimizations. 2015-05-08 09:56:59 -04:00
Tyler Stachecki
7168fc5e6f Fix a slew of cache bugs. 2015-01-29 10:07:54 -05:00
Tyler Stachecki
4d3aca850b Various FPU optimizations. 2015-01-29 10:07:45 -05:00
Tyler Stachecki
b80151069b Alignment/size optimizations. 2015-01-28 22:16:50 -05:00
Tyler Stachecki
e8576c87dc Minor cache and TLB optimizations. 2015-01-28 21:33:39 -05:00
Tyler Stachecki
10d32ce427 Optimize FPU operations somewhat. 2015-01-27 10:27:13 -05:00
Tyler Stachecki
9d809eb73a Mark another uncommon function as cold. 2015-01-22 15:19:35 -05:00
Tyler Stachecki
b8481b0cd4 Unroll the top-level hot functions. 2015-01-22 14:31:25 -05:00
Tyler Stachecki
4819fee8b2 vr4300: Micro-optimizations. 2015-01-22 11:22:34 -05:00
Tyler Stachecki
f8afa41c91 Add some temporary hacks to the PI and UB fixes. 2015-01-13 17:41:39 -05:00
Tyler Stachecki
68a901e01f VR4300: Optimize load/store instructions. 2015-01-10 16:20:04 -05:00
Tyler Stachecki
d7427d6b73 VR4300: Cache read/write optimizations. 2015-01-10 14:16:40 -05:00
Tyler Stachecki
f896cc62b7 VR4300: Memory system optimizations. 2015-01-10 12:43:02 -05:00
Tyler Stachecki
4693c11887 Prevent 64DD thread from crashing.
RTC adjustment works and communication with the 64DD is
now present, but we don't actually save the RTC settings.
2015-01-06 21:44:45 -05:00
Tyler Stachecki
9e7c0c5e82 Decoder optimization: drastically reduce size. 2015-01-06 11:40:11 -05:00
Tyler Stachecki
5229996ecd Trim off a few hundred bytes of code. 2015-01-05 23:00:49 -05:00
Tyler Stachecki
c8ba97efc3 Mark LDI (interlocks) as unlikely.
MIPS compilers of the time optimized these out very aggressively, as
they waste cycles and there are generally other instructions you can
toss in the load delay slot, so flag the interlock as unlikely.
2015-01-05 21:27:49 -05:00
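A minimal sketch of flagging an interlock check as unlikely, assuming a GCC- or Clang-style `__builtin_expect`. The macro `example_unlikely` and the function `needs_load_interlock` are hypothetical names for illustration and are not taken from the CEN64 source.

```c
#include <stdbool.h>

/* Hypothetical hint macro: tells the compiler the condition is rarely true. */
#if defined(__GNUC__) || defined(__clang__)
#define example_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define example_unlikely(x) (x)
#endif

/*
 * Returns true when the instruction after a load reads the register the
 * load is still writing. Register 0 is hardwired to zero on MIPS, so a
 * load into it never creates a hazard.
 */
static bool needs_load_interlock(unsigned load_dest,
                                 unsigned src_a, unsigned src_b) {
  bool hazard = load_dest != 0 &&
                (load_dest == src_a || load_dest == src_b);

  /* The hint keeps the stall path off the fall-through (hot) path. */
  if (example_unlikely(hazard))
    return true;

  return false;
}
```

The hint changes only code layout and branch prediction defaults; the result of the check is the same with or without it.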
Tyler Stachecki
8dd7195dfc Prevent an if statement over ternary expressions. 2015-01-05 20:54:50 -05:00
Tyler Stachecki
2301115c7a Make interrupt exception checks more efficient. 2015-01-05 19:53:36 -05:00
Tyler Stachecki
c7d4fe77ad Fix a last-minute bug in TLB exceptions. 2015-01-05 12:29:00 -05:00
Tyler Stachecki
9008e999af Add support for TLB modification exceptions. 2015-01-05 12:14:34 -05:00
Tyler Stachecki
e342a0ba2a Implement cache operations, fix cache op bug.
If we're doing a cache operation in the DC stage, don't
change the state of the lines; the cache operations will
do it if needed. Also implement get/set taglo for DC.
2015-01-04 22:40:36 -05:00
Tyler Stachecki
da2fd05415 Respect the TLB entry coherency bits.
If the TLB entry 'C bits' indicate the cache isn't to be
used for that virtual address range... don't use the cache.
2015-01-04 21:33:29 -05:00
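The check described above could look roughly like this. The sketch assumes the MIPS EntryLo layout with the C field in bits 5..3, and assumes the value 2 means "uncached" while every other value means "cached"; both assumptions should be verified against the NEC VR4300 manual. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed EntryLo layout: C (coherency) field in bits 5..3. */
#define EXAMPLE_C_SHIFT    3
#define EXAMPLE_C_MASK     0x7u
#define EXAMPLE_C_UNCACHED 2u  /* assumed encoding for "do not use cache" */

/* Returns true if accesses through this TLB entry should use the cache. */
static bool example_page_is_cached(uint32_t entry_lo) {
  uint32_t c_bits = (entry_lo >> EXAMPLE_C_SHIFT) & EXAMPLE_C_MASK;
  return c_bits != EXAMPLE_C_UNCACHED;
}
```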
Tyler Stachecki
0d7a42c4ce Move cache functionality to the DC stage.
This is how the actual processor does it. In addition to
design correctness, we have the added benefit of being able
to support cache instructions whose virtual address lies
in a mapped part of the address space.
2015-01-04 21:10:12 -05:00
Tyler Stachecki
5240b35d45 More cleanup of the fault/TLB code. 2015-01-04 15:37:47 -05:00
Tyler Stachecki
17954bf0b8 Fix bugs, implement WatchLo/Hi support. 2015-01-04 11:52:11 -05:00
Tyler Stachecki
3725c7325a Squash IC->RF latch data on a fault. 2015-01-03 12:54:17 -05:00
Tyler Stachecki
8f602d576d Clean up the VR4300 exception logic somewhat. 2015-01-03 12:40:01 -05:00
Tyler Stachecki
06d3d54c60 Restore most TLB functionality from backport. 2015-01-01 15:47:20 -05:00
Tyler Stachecki
287e3370c5 Commit some MSVC-specific workarounds. 2014-12-31 16:20:53 -05:00
Tyler Stachecki
cd9e41e54f Add a list of TODO for the VR4300. 2014-12-19 21:16:18 -05:00
Tyler Stachecki
369d33c2d1 Windows fixes as reported by magumagu. 2014-12-07 10:40:42 -05:00
Tyler Stachecki
10a5983c0c Add support for SSE4 FPU acceleration.
0d4a5de2f6 is wrong; we can take
advantage of SSE4 rounding intrinsics.
2014-11-16 14:06:34 -05:00
Tyler Stachecki
c1dc7cba08 Refactor for another major performance boost.
Since the CEN64 core now runs in its own thread (and doesn't use
the FPU), we can steal the host's FPU state register and not have
to worry about preserving it.

Along with that major overhaul, don't force "extra" features like
simulation statistics and debugging if the user doesn't want them.
Including that code, even when it is not run, mucks with register
allocation or something ever so slightly.
2014-11-15 18:22:20 -05:00
Tyler Stachecki
4b806c5601 Remove some "experimental" code that got replaced. 2014-11-15 15:55:26 -05:00
Tyler Stachecki
172203eb70 Rework VR4300 CP1.
Use switch statements instead of if/else spaghetti to give the
compiler a better idea of what we're trying to do.
2014-11-15 15:40:15 -05:00
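As an illustration of the switch-based style, here is a sketch that decodes the format field of a MIPS COP1 arithmetic instruction (bits 25..21, where 16 = single, 17 = double, 20 = word, 21 = long). The enum and function names are hypothetical and do not reflect how CEN64 itself structures its CP1 code.

```c
#include <stdint.h>

/* Hypothetical names for the operand formats of a COP1 instruction. */
enum example_fmt {
  EXAMPLE_FMT_S,       /* single-precision float */
  EXAMPLE_FMT_D,       /* double-precision float */
  EXAMPLE_FMT_W,       /* 32-bit fixed point */
  EXAMPLE_FMT_L,       /* 64-bit fixed point */
  EXAMPLE_FMT_INVALID
};

/* One dense switch lets the compiler emit a jump table or compare tree. */
static enum example_fmt example_decode_fmt(uint32_t iw) {
  switch ((iw >> 21) & 0x1F) {
    case 16: return EXAMPLE_FMT_S;
    case 17: return EXAMPLE_FMT_D;
    case 20: return EXAMPLE_FMT_W;
    case 21: return EXAMPLE_FMT_L;
    default: return EXAMPLE_FMT_INVALID;
  }
}
```

Compared with a chain of if/else tests, the switch states up front that every branch depends on the same value, which is the information the commit message wants the compiler to have.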
Tyler Stachecki
0d4a5de2f6 Remove some comments about SSE4 intrinsics.
Since we have to convert to an integer, as well as round in some
direction, these intrinsics (_mm_ceil_*, _mm_floor_*, _mm_round_*)
aren't of much use to us.
2014-11-15 14:33:43 -05:00
Tyler Stachecki
31443e65c5 Mark another function as cen64_cold. 2014-11-14 22:22:00 -05:00
Tyler Stachecki
01df3de520 Aggressively push more code into the cold section.
We will likely only hit a couple of the slow_cycle functions in
the VR4300 code when we interrupt. Because of this, push everything
just before what will be hit after a data cache fault into the cold
section.
2014-11-14 21:28:34 -05:00
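A sketch of pushing a rarely executed function into the cold section, assuming the GCC/Clang `cold` function attribute. The macro `example_cold` is a hypothetical stand-in; the log mentions a `cen64_cold` marker, but its actual definition is not shown here and is not assumed.

```c
/* Hypothetical marker: place the function with other rarely run code. */
#if defined(__GNUC__) || defined(__clang__)
#define example_cold __attribute__((cold, noinline))
#else
#define example_cold
#endif

/* Rare path: only reached when a fault is pending. */
example_cold static int example_handle_fault(int fault_code) {
  return -fault_code;
}

/* Hot path: runs every cycle and almost always returns immediately. */
static int example_cycle(int fault_code) {
  if (fault_code != 0)
    return example_handle_fault(fault_code);

  return 0;
}
```

Keeping the fault handler out of line means the hot function stays small, so more of it fits in the host's instruction cache.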
Tyler Stachecki
4d46108cff Fix 8912b4cc50.
Commit 8912b4cc50 was mostly right,
but we still need to make sure we clear the fault type if an IADE
exception really does happen.
2014-11-14 21:11:52 -05:00
Tyler Stachecki
85654a891f Delay computing accurate value of count.
Instead, just bump the counter and don't track cycle count. When
it comes time to use count, shift it to the right by one instead.
2014-11-14 21:04:03 -05:00
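The idea can be sketched as follows: the per-cycle path does a single increment, and the halving is deferred to the (much rarer) read. The sketch assumes the architectural count advances once every two cycles, which is what the shift by one implies; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical counter state: raw ticks, one per emulated cycle. */
struct example_counter {
  uint64_t raw_ticks;
};

/* Hot path: called every cycle, so it does nothing but increment. */
static inline void example_counter_tick(struct example_counter *c) {
  c->raw_ticks++;
}

/* Cold path: derive the visible 32-bit count only when it is read. */
static inline uint32_t example_counter_read(const struct example_counter *c) {
  return (uint32_t)(c->raw_ticks >> 1);
}
```

A write to the count would need the inverse adjustment (store the value shifted left by one) so that later reads stay consistent.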
Tyler Stachecki
8912b4cc50 IC stage should never fault... I think. 2014-11-14 20:39:33 -05:00
Tyler Stachecki
6e474a3251 Implement a neat optimization in the VR4300 core.
Perf reported a window where the backend was busy, and the frontend
was idle. Take advantage of the situation by inserting a branch that
has the potential to filter out (a lot of) instructions from the
backend when it's clogged. This works to our advantage, because more
often than not we aren't executing FPU instructions, or we execute
the FPU instructions in small batches.
2014-11-12 14:06:24 -05:00
Tyler Stachecki
e4fbc9831d Increase VR4300_BUSY_WAIT_DETECTION performance.
Don't split branch functions across "normal" and "busy wait detect"
variants; just have everything use the "busy wait detect" variant.
2014-11-12 12:56:42 -05:00