Tyler Stachecki
ab8dde80e9
Add AIO's implementation for VMULU.
2014-12-23 01:10:15 -05:00
Tyler Stachecki
2ee295a671
Fix RSP DMEM accesses.
...
Up until now, the simulator assumed that DMEM accesses had to be
aligned (similarly to the VR4300). This is not actually the case,
so allow scalar memory access to arbitrary DMEM addresses.
2014-12-22 23:53:13 -05:00
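A minimal sketch of what an unaligned scalar DMEM read could look like, for readers following 2ee295a671. This is not the code from that commit: the helper name `example_dmem_read_word`, the byte-by-byte assembly, and the wrap-around at the 4 KiB boundary are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_DMEM_SIZE 0x1000u /* assumed: 4 KiB of RSP DMEM */

/* Hypothetical helper: read a big-endian 32-bit word from any DMEM offset,
 * aligned or not. Each byte address is masked separately, so an access
 * that runs off the end of DMEM wraps back to offset 0 (an assumption). */
static uint32_t example_dmem_read_word(const uint8_t *dmem, uint32_t addr) {
  uint32_t word = 0;
  unsigned i;

  assert(dmem != NULL);

  for (i = 0; i < 4; i++)
    word = (word << 8) | dmem[(addr + i) & (EXAMPLE_DMEM_SIZE - 1)];

  return word;
}
```

Assembling the word one byte at a time avoids any host alignment requirement, at the cost of speed; a real core would more likely use a wider unaligned load plus a byte swap.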
Tyler Stachecki
3f2329be5b
Fix a bug in VRCP/VRSQ precision selection.
2014-12-22 21:06:17 -05:00
Tyler Stachecki
e52e031ce3
Add implementations for VRSQ, VRSQL, and VRSQH.
2014-12-22 20:47:48 -05:00
Tyler Stachecki
4b6904240e
Add implementations for VRCP, VRCPL, and VRCPH.
2014-12-22 20:29:16 -05:00
Tyler Stachecki
73709f4c45
Add implementation for VCR.
2014-12-22 13:01:03 -05:00
Tyler Stachecki
d3938056a4
Add a CONTRIBUTORS file.
2014-12-22 11:32:20 -05:00
Tyler Stachecki
88310a8104
Add AIO's implementation for VMULF.
2014-12-22 09:50:29 -05:00
Tyler Stachecki
f268795da5
Add implementation for VMRG.
2014-12-21 15:49:44 -05:00
Tyler Stachecki
9f4664a4b6
Add implementation for VADDC.
2014-12-21 15:29:16 -05:00
Tyler Stachecki
a955bf1e2c
Add implementation for VSUBC.
2014-12-21 15:07:00 -05:00
Tyler Stachecki
bea9f197c0
Upgrade POSIX_C_SOURCE so we get snprintf.
2014-12-21 14:04:08 -05:00
Tyler Stachecki
f199c7bac8
Add implementation for VABS.
2014-12-21 12:59:36 -05:00
Tyler Stachecki
de5b5b0f96
Commit AIO's VSUB optimizations, fix carry/borrow issue.
2014-12-21 12:55:38 -05:00
Tyler Stachecki
0be40f4358
Add implementations for VGE and VLT.
2014-12-21 11:08:00 -05:00
Tyler Stachecki
dc50279609
Add implementations for VEQ and VNE.
2014-12-21 10:39:10 -05:00
Tyler Stachecki
579fb317a8
Formatting/consistency fixes (remove tabs).
2014-12-21 10:20:45 -05:00
Tyler Stachecki
bd899f5034
Unbreak SSE2 builds.
2014-12-21 09:48:01 -05:00
Tyler Stachecki
e1de6cd92d
Add implementations for VCH.
2014-12-21 09:29:58 -05:00
Tyler Stachecki
0c556f5d25
Fix a last minute SSE4.1->SSE2 change.
2014-12-20 17:01:31 -05:00
Tyler Stachecki
145141225e
Add implementations for VCL and CFC2.
2014-12-20 12:27:38 -05:00
Tyler Stachecki
7c83dcb0d3
Prevent GCC from eliding global register var writes.
...
Not sure why GCC was optimizing out these global register variable
writes when FLTO was enabled, but ensure that it does not by using
an inline assembly block.
2014-12-20 10:21:41 -05:00
Tyler Stachecki
affb4bb746
Add a patch job fix for SSE2 RSP builds.
2014-12-19 22:03:25 -05:00
Tyler Stachecki
cd9e41e54f
Add a list of TODOs for the VR4300.
2014-12-19 21:16:18 -05:00
Tyler Stachecki
c72f2c5028
Fix RSP alignment issues once and for all.
2014-12-19 20:03:03 -05:00
Tyler Stachecki
78b4c78757
Add support for cross-compiling with mingw64.
2014-12-18 00:46:56 -05:00
Tyler Stachecki
369d33c2d1
Windows fixes as reported by magumagu.
2014-12-07 10:40:42 -05:00
Tyler Stachecki
8b363895d1
Add missing #include for snprintf.
...
Thanks, balrog.
2014-11-19 10:42:15 -05:00
Tyler Stachecki
8b45d7eab5
Fix padding around SSE register types.
...
Really need to stop doing patchjobs and just fix this.
2014-11-16 14:27:43 -05:00
Tyler Stachecki
b1ada90657
Fix incorrect return value on successful exit.
2014-11-16 14:21:28 -05:00
Tyler Stachecki
10a5983c0c
Add support for SSE4 FPU acceleration.
...
0d4a5de2f6 is wrong; we can take
advantage of SSE4 rounding intrinsics.
2014-11-16 14:06:34 -05:00
Tyler Stachecki
9e9114d2fa
Clean up the CMakeLists a little.
2014-11-16 13:35:40 -05:00
Tyler Stachecki
459aed5e8d
Generate two binaries.
...
Generate a 'fast' release binary and a developer binary. The
developer binary contains extra calls that permit debugging and
such things.
2014-11-16 13:32:04 -05:00
Tyler Stachecki
11afa4123d
Give os/unix's UI thread a good waxing.
...
Periodically (~1000x) poll for input instead of waiting for a frame
boundary. Also relinquish the render_lock more aggressively in an
attempt to step out of the way of the simulator.
2014-11-16 11:51:49 -05:00
Tyler Stachecki
c90e55a05d
Lock around input reads.
...
Fix some obvious memory consistency issues.
2014-11-16 10:19:56 -05:00
Tyler Stachecki
c1dc7cba08
Refactor for another major performance boost.
...
Since the CEN64 core now runs in its own thread (and doesn't use
the FPU), we can steal the host's FPU state register and not have
to worry about preserving it.
Along with that major overhaul, don't force "extra" features like
simulation statistics and debugging if the user doesn't want them.
Including that code, even when it is not run, mucks with register
allocation or something ever so slightly.
2014-11-15 18:22:20 -05:00
Tyler Stachecki
d17db4cc18
Make sure to keep rsp_vect_t aligned to 16 bytes.
2014-11-15 15:58:35 -05:00
Tyler Stachecki
4b806c5601
Remove some "experimental" code that got replaced.
2014-11-15 15:55:26 -05:00
Tyler Stachecki
061a04e216
Change width of fpu_state_t for x86_64.
...
gcc (and probably other compilers) doesn't like working with 16-bit
types and will zero-extend where needed. Save some overhead and
just store the state as a 32-bit type.
2014-11-15 15:44:04 -05:00
Tyler Stachecki
172203eb70
Rework VR4300 CP1.
...
Use switch statements instead of if/else spaghetti to give the
compiler a better idea of what we're trying to do.
2014-11-15 15:40:15 -05:00
Tyler Stachecki
0d4a5de2f6
Remove some comments about SSE4 intrinsics.
...
Since we have to convert to an integer, as well as round in some
direction, these intrinsics (_mm_ceil_*, _mm_floor_*, _mm_round_*)
aren't of much use to us.
2014-11-15 14:33:43 -05:00
Tyler Stachecki
31443e65c5
Mark another function as cen64_cold.
2014-11-14 22:22:00 -05:00
Tyler Stachecki
01df3de520
Aggressively push more code into the cold section.
...
We will likely only hit a couple of the slow_cycle functions in
the VR4300 code when we interrupt. Because of this, push everything
just before what will be hit after a data cache fault into the cold
section.
2014-11-14 21:28:34 -05:00
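A rough sketch of the idea behind 01df3de520, assuming a GCC-compatible compiler. The macro `example_cold` and both functions are hypothetical stand-ins; the project's actual `cen64_cold` definition is not shown in this log and may differ.

```c
#include <stdint.h>

/* Assumed definition: ask the compiler to place the function in a cold
 * text section and keep it out of line. Expands to nothing elsewhere. */
#if defined(__GNUC__)
#define example_cold __attribute__((cold, noinline))
#else
#define example_cold
#endif

/* Rarely taken path: illustrative fault handler returning a vector address. */
example_cold static uint32_t example_handle_fault(uint32_t pc) {
  (void) pc;
  return 0x80000180u; /* illustrative exception vector */
}

/* Hot path: the common case stays small, and the rare case is a call
 * into code that lives away from the frequently executed instructions. */
static uint32_t example_step(uint32_t pc, int fault) {
  if (fault)
    return example_handle_fault(pc);

  return pc + 4;
}
```

Keeping rarely executed code out of the hot section tends to improve instruction cache locality for the main loop, which appears to be the motivation the commit describes.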
Tyler Stachecki
4d46108cff
Fix 8912b4cc50.
...
Commit 8912b4cc50 was mostly right,
but we still need to make sure we clear the fault type if an IADE
exception really does happen.
2014-11-14 21:11:52 -05:00
Tyler Stachecki
85654a891f
Delay computing accurate value of count.
...
Instead, just bump the counter and don't track cycle count. When
it comes time to use count, shift it to the right by one instead.
2014-11-14 21:04:03 -05:00
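The deferred count computation in 85654a891f might be sketched as follows. The struct and function names are hypothetical, and the sketch assumes the architectural counter should advance once every two cycles, which is what the right shift by one suggests.

```c
#include <stdint.h>

/* Hypothetical state: a raw counter bumped once per simulated cycle. */
struct example_cpu {
  uint64_t raw_count;
};

/* Per-cycle work is a single increment; no separate cycle tracking. */
static inline void example_cycle(struct example_cpu *cpu) {
  cpu->raw_count++;
}

/* The accurate value is only derived when something reads it. */
static inline uint32_t example_read_count(const struct example_cpu *cpu) {
  return (uint32_t) (cpu->raw_count >> 1);
}

/* A write stores the value pre-shifted so later reads stay consistent. */
static inline void example_write_count(struct example_cpu *cpu, uint32_t v) {
  cpu->raw_count = (uint64_t) v << 1;
}
```

The trade is one shift on the rare read or write in exchange for removing bookkeeping from every cycle.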
Tyler Stachecki
8912b4cc50
IC stage should never fault... I think.
2014-11-14 20:39:33 -05:00
Tyler Stachecki
0a9b8c2367
Make read_acc_* return a value.
...
Instead of writing through a pointer, just return the value.
Thank you, Jared, for pointing out my stupidity.
2014-11-13 19:54:33 -05:00
Tyler Stachecki
6e474a3251
Implement a neat optimization in the VR4300 core.
...
Perf reported a window where the backend was busy, and the frontend
was idle. Take advantage of the situation by inserting a branch that
has the potential to filter out (a lot of) instructions from the
backend when it's clogged. This works to our advantage, because more
often than not we aren't executing FPU instructions, or we execute
the FPU instructions in small batches.
2014-11-12 14:06:24 -05:00
Tyler Stachecki
e4fbc9831d
Increase VR4300_BUSY_WAIT_DETECTION performance.
...
Don't split branch functions across "normal" and "busy wait detect"
variants; just have everything use the "busy wait detect" variant.
2014-11-12 12:56:42 -05:00
Tyler Stachecki
a00af95ce1
os/unix: Remove stray character from window title.
2014-11-12 07:38:27 -05:00