52 commits

Author SHA1 Message Date
Derek "Turtle" Roe
c4afd44ed7 See long description
Replaced all references to simulation with emulation
Updated copyright year
Updated .gitignore to reduce chances of random files being uploaded to
the repo
Added .gitattributes to normalize all text files, and to ignore binary
files (which includes the logo and the NEC PDF)
2015-07-03 08:18:16 -04:00
Tyler Stachecki
d7427d6b73 VR4300: Cache read/write optimizations. 2015-01-10 14:16:40 -05:00
Tyler Stachecki
10a5983c0c Add support for SSE4 FPU acceleration.
0d4a5de2f6 is wrong; we can take
advantage of SSE4 rounding intrinsics.
2014-11-16 14:06:34 -05:00
Tyler Stachecki
c1dc7cba08 Refactor for another major performance boost.
Since the CEN64 core now runs in its own thread (and doesn't use
the FPU), we can steal the host's FPU state register and not have
to worry about preserving it.

Along with that major overhaul, don't force "extra" features like
simulation statistics and debugging if the user doesn't want them.
Including that code, even when it is not run, mucks with register
allocation or something ever so slightly.
2014-11-15 18:22:20 -05:00
Tyler Stachecki
4b806c5601 Remove some "experimental" code that got replaced. 2014-11-15 15:55:26 -05:00
Tyler Stachecki
172203eb70 Rework VR4300 CP1.
Use switch statements instead of if/else spaghetti to give the
compiler a better idea of what we're trying to do.
2014-11-15 15:40:15 -05:00
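The switch-based dispatch described in this commit can be pictured with a minimal sketch. The fmt encodings (S=16, D=17, W=20, L=21) come from the MIPS ISA; the helper name `cp1_fmt_width` and its return convention are assumptions made for illustration and are not taken from the CEN64 source.

```c
#include <stddef.h>
#include <stdint.h>

/* COP1 "fmt" field encodings as defined by the MIPS ISA. */
enum cp1_fmt { FMT_S = 16, FMT_D = 17, FMT_W = 20, FMT_L = 21 };

/* Hypothetical helper (not from CEN64): operand width in bytes for the
 * fmt field of an instruction word, or 0 for a reserved encoding. */
static size_t cp1_fmt_width(uint32_t iw) {
  /* One dense switch on a 5-bit field lets the compiler pick a jump
   * table or a compare tree instead of following an if/else chain. */
  switch ((iw >> 21) & 0x1F) {
    case FMT_S:
    case FMT_W:
      return 4;
    case FMT_D:
    case FMT_L:
      return 8;
    default:
      return 0;
  }
}
```

Whether this is faster than the equivalent if/else chain depends on the compiler and target, so it is worth measuring rather than assuming.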
Tyler Stachecki
0d4a5de2f6 Remove some comments about SSE4 intrinsics.
Since we have to convert to an integer, as well as round in some
direction, these intrinsics (_mm_ceil_*, _mm_floor_*, _mm_round_*)
aren't of much use to us.
2014-11-15 14:33:43 -05:00
Tyler Stachecki
e89f054674 Optimize extremely aggressively.
Tell GCC to optimize cold functions for size and stash them away in
a separate part of the binary. Put the simulate core, meanwhile, on
the hot path. Also, bump optimization to -O3 as we can now "afford"
to do so.
2014-11-05 08:39:47 -05:00
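A rough sketch of the hot/cold split this commit describes, using GCC's `hot` and `cold` function attributes. The macro names and both functions are hypothetical and only illustrate the idea; the commit does not show which attributes or flags CEN64 actually uses.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical macro names; the attributes themselves are GCC's. */
#if defined(__GNUC__)
#define HOT_FN  __attribute__((hot))
#define COLD_FN __attribute__((cold, noinline))
#else
#define HOT_FN
#define COLD_FN
#endif

/* Rarely executed: GCC optimizes cold functions for size and groups
 * them in a separate subsection of the text section. */
COLD_FN static int report_fault(uint32_t pc) {
  fprintf(stderr, "fault at 0x%08x\n", (unsigned) pc);
  return -1;
}

/* Executed constantly: optimized more aggressively for speed. */
HOT_FN static int step(uint32_t pc) {
  if (pc & 3)  /* misaligned program counter: take the cold path */
    return report_fault(pc);

  return 0;
}
```

On compilers that do not define `__GNUC__`, the macros expand to nothing and the code still builds, just without the placement hints.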
Tyler Stachecki
5b0297d777 More performance optimizations. 2014-10-28 00:10:14 -04:00
Tyler Stachecki
1061cec86b Lots of branch folding in the LD/ST aligner. 2014-10-22 18:11:50 -04:00
Tyler Stachecki
b9ed6920c4 Implement multicycle instruction delays. 2014-10-20 12:55:20 -04:00
Tyler Stachecki
971bcd131b Prevent namespace collisions. 2014-09-04 13:45:57 -04:00
Tyler Stachecki
d7aaff460d Fix a potential issue in the last commit.
request->size need not be >= 4 for SDL/SDR; fix it.
2014-08-31 12:11:29 -04:00
Tyler Stachecki
aba5b13ba1 Fix MFC1 now that FS/FT handling is correct. 2014-08-23 16:59:24 -04:00
Tyler Stachecki
c7e09f90bd Add enough support for libdragon to run. 2014-08-23 16:25:35 -04:00
Tyler Stachecki
2b94264a92 Pass IW to each opcode function.
Instead of having almost every opcode function load IW off the
stack through the VR4300 pointer, just pass it as an argument on
the stack to reduce binary size and hoist a load up.

Thanks go out to Narann for this idea.
2014-08-22 07:54:53 -04:00
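The calling-convention change can be sketched roughly as follows. The struct `cpu_sketch` and the function `op_addu` are hypothetical stand-ins, not CEN64's actual types; the field positions (rs at bits 25..21, rt at 20..16, rd at 15..11) are the standard MIPS R-type layout.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced CPU state (not the CEN64 struct). */
struct cpu_sketch {
  uint64_t regs[32];
  uint32_t iw;  /* latched instruction word */
};

/* The decoder already has the instruction word in hand, so it passes
 * it along as an argument; the handler never reloads cpu->iw. */
static void op_addu(struct cpu_sketch *cpu, uint32_t iw) {
  unsigned rs = (iw >> 21) & 0x1F;
  unsigned rt = (iw >> 16) & 0x1F;
  unsigned rd = (iw >> 11) & 0x1F;

  /* ADDU adds the low 32 bits and sign-extends the result to 64 bits.
   * A real core would also keep register 0 hard-wired to zero. */
  uint32_t sum = (uint32_t) cpu->regs[rs] + (uint32_t) cpu->regs[rt];
  cpu->regs[rd] = (uint64_t) (int64_t) (int32_t) sum;
}
```

For example, `0x00221821` encodes `addu $3, $1, $2`, so calling `op_addu(&cpu, 0x00221821)` stores the sum of registers 1 and 2 into register 3.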
Tyler Stachecki
81c799576a Fill and flush data cache lines as required. 2014-08-21 21:43:07 -04:00
Tyler Stachecki
3e8ba50851 Add common arch/ folder, move headers out of os/.
Much of the architecture-specific code uses compiler-agnostic
intrinsics. For this reason, split it out into an arch/ folder,
leaving only the compiler and environment-specific code in os/.
2014-08-18 16:08:45 +00:00
Tyler Stachecki
4cb2a1aca9 Fix potentially uninitialized variables to 0.
Shouldn't happen anyways, but do something safe if it does.
2014-07-31 16:48:30 -04:00
Tyler Stachecki
2ad431aaee Reduce VR4300/CP1 overhead. 2014-07-31 11:30:06 -04:00
Tyler Stachecki
ddb4626080 Change the FPU calling convention. 2014-07-29 00:13:39 -04:00
Tyler Stachecki
c4a121c139 Trim out some unused instructions. 2014-07-27 23:40:08 -04:00
Tyler Stachecki
d2f70fd2c8 Switch over ROUND, TRUNC, CEIL, and FLOOR. 2014-07-27 23:40:08 -04:00
Tyler Stachecki
e66c2e2e37 Simplify the floating point compare logic.
Also, finish converting things to SSE/SSE2.
2014-07-27 23:40:08 -04:00
Tyler Stachecki
45617d9c36 Start the switch from x87 to SSE. 2014-07-27 23:40:08 -04:00
Tyler Stachecki
81e9970b2e vr4300/cp1: Add missing unusable excps, fix bugs. 2014-07-27 23:40:08 -04:00
Tyler Stachecki
e7417bee66 Add get/set native FPU state functions. 2014-07-27 23:40:08 -04:00
Tyler Stachecki
596736f64d Hack in support for LDL/LDR. 2014-07-27 00:59:43 -04:00
Tyler Stachecki
d0662e9874 Remove preshift from memory operations. 2014-07-26 14:54:30 -04:00
Tyler Stachecki
349bdc1684 Merge krom's FPU comparison instructions.
Implements: C.F.D, C.F.S, C.NGE.D, C.NGE.S, C.NGL.D, C.NGL.S,
C.NGLE.D, C.NGLE.S, C.NGT.D, C.NGT.S, C.OLE.D, C.OLE.S,
C.OLT.D, C.OLT.S, C.SEQ.D, C.SEQ.S, C.SF.D, C.SF.S,
C.UEQ.D, C.UEQ.S, C.ULE.D, C.ULE.S, C.ULT.D, C.ULT.S, C.UN.D,
C.UN.S.
2014-07-17 20:03:28 -04:00
Tyler Stachecki
9ef940a6eb Fix a bug in MTC1.
Fix a mask-related bug that occurred when storing the high
part of an FPU register with MTC1 (when 32-bit FPU mode is
active).
2014-07-13 15:23:23 -04:00
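The commit does not show the fix itself, but the kind of masking involved can be illustrated. In 32-bit FPU mode a 64-bit register image holds two 32-bit halves, and a write to one half must leave the other untouched. The helper `set_half` below is hypothetical and is not CEN64's code.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit word into the high or low half
 * of a 64-bit register image, preserving the other half. */
static uint64_t set_half(uint64_t reg, uint32_t word, int high) {
  if (high) {
    /* Keep the low 32 bits, replace the high 32 bits. */
    return (reg & UINT64_C(0x00000000FFFFFFFF)) | ((uint64_t) word << 32);
  }

  /* Keep the high 32 bits, replace the low 32 bits. */
  return (reg & UINT64_C(0xFFFFFFFF00000000)) | (uint64_t) word;
}
```

Using the wrong mask in either branch would silently clobber the half that was supposed to survive, which is the general shape of a mask-related bug on this path.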
Tyler Stachecki
4b69669998 Add support for C.lt.fmt.
Conflicts:
	vr4300/cp1.h
2014-07-13 13:16:31 -04:00
Tyler Stachecki
9eede82001 Make FPU FCR writes available on the next cycle.
Even though the manual suggests otherwise...
See: http://cen64.com/viewtopic.php?f=6&t=109&p=876#p875
2014-07-13 13:01:58 -04:00
Tyler Stachecki
2bfd3870e8 Implement another conditional FPU operation. 2014-07-13 13:01:40 -04:00
Tyler Stachecki
16aea90110 Implement FPU conditional branches. 2014-07-13 13:01:31 -04:00
Tyler Stachecki
24c17acb62 Implement an FPU conditional operation.
Conflicts:
	vr4300/cp1.c
	vr4300/cp1.h
2014-07-13 13:01:19 -04:00
Tyler Stachecki
5dd0f5bc3c Add implementations for CFC1/CTC1. 2014-07-13 12:57:27 -04:00
Tyler Stachecki
c652f8359a Merge krom's FPU instructions.
Implements: ABS.D, ABS.S, CEIL.L.D, CEIL.L.S, CEIL.W.D,
CEIL.W.S, FLOOR.L.D, FLOOR.L.S, FLOOR.W.D, FLOOR.W.S, NEG.D,
NEG.S, ROUND.L.D, ROUND.L.S, ROUND.W.D, ROUND.W.S, SQRT.D,
SQRT.S, TRUNC.L.D, and TRUNC.L.S!
2014-07-08 00:00:25 -04:00
Tyler Stachecki
423d8d9b9c Wave 2 of CP1 housekeeping. 2014-07-05 14:28:43 -04:00
Tyler Stachecki
72481d5df4 Wave 1 of CP1 housekeeping. 2014-07-05 13:43:24 -04:00
Tyler J. Stachecki
665d7468f9 Add more FPU support for MSVC/x86_64. 2014-07-05 12:12:07 -04:00
Tyler J. Stachecki
6b0215a082 Start adding FPU support for MSVC/x86_64. 2014-07-05 11:35:06 -04:00
Tyler Stachecki
7193b288fc Fix another couple of CP1 bugs. 2014-07-05 08:46:52 -04:00
Tyler Stachecki
c318a781bc Small amount of CP1 cleanup. 2014-07-04 18:25:25 -04:00
Tyler Stachecki
0734d6b4e4 Add support for MOV.fmt and SUB.fmt. 2014-07-04 17:53:07 -04:00
Tyler Stachecki
0b6433ea45 Fix a handful of sloppy CP1 errors. 2014-07-04 17:36:33 -04:00
Tyler Stachecki
33322f0870 Add support for CVT.l.fmt, CVT.w.fmt. 2014-07-04 14:20:20 -04:00
Tyler Stachecki
8debccc73e Add SDC1, CP1 comment fixes. 2014-07-04 13:52:56 -04:00
Tyler Stachecki
31d9f611d1 Implement a handful of CP1 instructions. 2014-07-04 13:17:32 -04:00
Tyler Stachecki
92887871f0 Forward CP1 registers in EX logic. 2014-07-04 11:01:25 -04:00