8836 commits

Author SHA1 Message Date
Henrik Rydgård
331a8f91e8 Fix that weird unordered compare mode, hopefully 2018-01-04 20:06:26 +01:00
Henrik Rydgård
18be23eccc IR: More fixes. Still something wrong with VFPU compares (not caused by this PR). 2018-01-04 19:38:36 +01:00
Henrik Rydgård
ca9050b84c On Linux, can't even include nmmintrin without explicitly enabling SSE 4.2 support. 2018-01-04 18:27:19 +01:00
Henrik Rydgård
fe88d12055 IR interpreter: Add some braces to allow variable declaration in the switch cases. 2018-01-04 18:27:19 +01:00
Henrik Rydgård
e0cc126d09 Add some more SIMD support to IR interpreter. Mostly just because, but also serves as implementation reference for later code generation backends. 2018-01-04 18:27:19 +01:00
Henrik Rydgård
a128624f98 IRInterpreter: Fix bugs in floating point truncation functions 2018-01-04 18:25:54 +01:00
Henrik Rydgård
1a97f62dc9 Fix running the CPU test from the UI. 2018-01-04 18:10:41 +01:00
Henrik Rydgård
604b3c3e97 IR Interpreter: Add missing break; to switch case IROp::FSign. 2018-01-04 11:08:56 +01:00
Unknown W. Brackets
bc541bd020 irjit: Encode downcount directly as a constant.
Simpler this way, now.
2018-01-03 23:32:31 -08:00
Unknown W. Brackets
cffb2d61a7 irjit: Embed constant inside IRInst.
This simplifies a bunch of code and improves compile performance by about
30%, at the cost of a bit more memory.
2018-01-03 23:24:04 -08:00
Unknown W. Brackets
64b57a0329 irjit: Fix swr typo.
Shifting the wrong direction - oops.
2018-01-03 08:14:25 -08:00
Henrik Rydgård
3ac2350ad6 IR Interpreter: Add a comment, minor cleanup, minor SSE stuff. 2018-01-03 16:31:55 +01:00
Unknown W. Brackets
29ed48c32a Module: Avoid scanning stubs if possible.
In this case, we often scan some garbage, but let's reduce it at least.
2018-01-02 22:21:08 -08:00
Unknown W. Brackets
00a05e97ac Module: Scan modules with no sections at all.
Also, fix some off-by-one issues with end addresses.
2018-01-02 22:20:09 -08:00
Unknown W. Brackets
30b2d05bac Module: Correct detection of executable sections. 2018-01-02 21:53:13 -08:00
Henrik Rydgård
690a409dac Merge pull request #10496 from unknownbrackets/cpu-bgstart
Core: Asynchronously load the main ELF
2018-01-02 11:31:58 +01:00
Unknown W. Brackets
b41413b8a5 Core: Asynchronously load the main ELF.
Sometimes it takes a little time.  More importantly, this allows us to
load caches or do other things at start that might be a tad slow.

Not doing anything like that yet, though.
2018-01-01 22:58:06 -08:00
Unknown W. Brackets
b11858d9a0 irjit: Properly account for delay slots in size.
Otherwise we think blocks are 4 bytes too short, which can affect
invalidation.
2018-01-01 22:54:40 -08:00
Kentucky Compass
20794081ea iOS: Nix iosCanUseJit and targetIsJailbroken. Move NativeInit call to main so it can take cmd line args. 2018-01-01 19:10:44 -08:00
Henrik Rydgård
263941e9e0 Merge pull request #10494 from unknownbrackets/irjit
irjit: Implement lwl/etc.
2018-01-01 19:08:32 +01:00
Unknown W. Brackets
6509f8b433 HLE: Reset latestSyscall on save state load.
Loading a save state might call functions which call HLE log, such as
AtracSetContext.  This was outputting confusing log / reporting messages
based on a random recent syscall.
2018-01-01 08:57:08 -08:00
Unknown W. Brackets
3abcc4d6d8 irjit: Implement lwl/lwr/swl/swr.
This is very similar to the arm64jit implementation.
2018-01-01 08:38:13 -08:00
Unknown W. Brackets
b37ba9e599 irjit: Add options for compile/optimize steps.
This way the backend can set flags for the type of IR it wants.  It
seems too complex to combine certain things like lwl/lwr in a pass.
2018-01-01 08:38:12 -08:00
Unknown W. Brackets
671be24105 irjit: Add extra temps to make lwl/swl/etc. easier. 2018-01-01 08:38:11 -08:00
Unknown W. Brackets
905d2c2da6 irjit: Cleanup some invalid op handling.
And log blocks the same way as other backends.
2018-01-01 08:38:11 -08:00
Unknown W. Brackets
d8d174fa2b arm64jit: Avoid spilling an extra reg for lwl/lwr.
It's only needed for swl and swr.
2018-01-01 08:38:10 -08:00
Unknown W. Brackets
8ffb0101fe jit: Report blocks with uneaten VFPU prefixes.
There may be options to avoid, like continuing these blocks, especially if
they're likely or something.
2018-01-01 08:38:10 -08:00
Henrik Rydgård
bf36965410 Merge pull request #10482 from unknownbrackets/irjit
irjit: Speed up icache block invalidation
2018-01-01 09:48:54 +01:00
Unknown W. Brackets
3af78883c7 irjit: Speed up icache block invalidation.
Turns out, in games using a ton of small memcpys, this was causing perf
issues.
2017-12-31 10:37:09 -08:00
Kentucky Compass
d8b3f4af33 Handle iOS audio session interruptions by reinitializing audio 2017-12-31 00:37:20 -08:00
Unknown W. Brackets
9ff812b313 arm64jit: Negate in ADDI2R/SUBI2R as well.
Should've done this at the same time as CMN.  It's not as common, mostly
catches addu calls, but it's good to have these generic for other uses.
2017-12-30 11:11:04 -08:00
Unknown W. Brackets
ae63628360 arm64jit: Statically allocate ra as well.
This doesn't seem to have a significant impact on performance, but it
improves bloat by about 5%.
2017-12-30 11:11:03 -08:00
Unknown W. Brackets
89cbf36611 arm64jit: Free up W23 for static alloc.
We shouldn't always reserve W23 for this uncommon case.
2017-12-30 07:51:27 -08:00
Unknown W. Brackets
e7ac672522 arm64jit: Cleanup method names, temp discard.
This way MapDirtyIn won't accidentally discard temps.
2017-12-30 07:51:27 -08:00
Unknown W. Brackets
0fc8274ec4 arm64jit: Enable safe memory for lwl/lwr. 2017-12-29 17:30:18 -08:00
Unknown W. Brackets
c00044c5d8 arm64jit: Avoid arithmetic movs.
ORR is the preferred encoding and may be faster on some chips.
2017-12-29 17:30:18 -08:00
Unknown W. Brackets
98ed6fab3f arm64jit: Fix spilling for more than one temp reg.
Otherwise we hang trying to spill the same reg over and over.
2017-12-29 17:30:17 -08:00
Unknown W. Brackets
ee236743f0 arm64jit: Use TBZ/TBNZ for vfpu branch as well. 2017-12-29 17:30:16 -08:00
Unknown W. Brackets
3b4917a308 arm64jit: Use TBZ/TBNZ for fp branches. 2017-12-29 17:30:15 -08:00
Unknown W. Brackets
c71285c970 arm64jit: Use CBZ/CBNZ for zero compare branches.
These are pretty common, so it reduces bloat decently.  Seems about the
same speed, though.
2017-12-29 17:30:15 -08:00
Unknown W. Brackets
7f8a871e30 arm64jit: Handle more imm compare cases. 2017-12-29 17:30:14 -08:00
Unknown W. Brackets
56d64f5c67 arm64jit: Avoid temporary on variable shift.
I think we should trust that it works per the spec.
2017-12-29 17:30:12 -08:00
Unknown W. Brackets
1ecce2a2e1 arm64jit: Reuse code in I2R funcs. 2017-12-29 17:30:07 -08:00
Unknown W. Brackets
2498ce5e3e arm64jit: Oops, properly init temp locked flag.
Fixes #10469.
2017-12-29 14:36:18 -08:00
Henrik Rydgård
cb3b1876dd Merge pull request #10467 from unknownbrackets/arm64-jit
More arm64 optimizations and cleanup
2017-12-29 09:00:47 +01:00
Unknown W. Brackets
5177db0f91 arm64jit: Remove unnecessary address masking.
We use views like on x86_64, so this isn't needed.
2017-12-28 23:58:30 -08:00
Unknown W. Brackets
28da05fa7a HLE: Replace starocean framebuf clear func.
This reduces the performance impact significantly, by skipping the memset
uploads for each line.

Fixes #10466.
2017-12-28 23:40:18 -08:00
Unknown W. Brackets
27116dcb86 arm64jit: Avoid flushing when mapping as pointer. 2017-12-28 16:04:34 -08:00
Unknown W. Brackets
1b1e2c773b arm64jit: Jit lwl/lwr with proper temp regs.
It's possible rt might overlap with w9/w10, so we really need to allocate
these properly.  This locks and spills as necessary.
2017-12-28 15:54:03 -08:00
Unknown W. Brackets
970326c9e5 arm64jit: Fix and enable imm lwl/lwr. 2017-12-28 14:49:55 -08:00
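
Several of the commits listed above implement or fix the MIPS unaligned memory instructions lwl/lwr (and swl/swr) in the irjit and arm64jit backends. As a reading aid, here is a minimal C++ sketch of what lwl and lwr compute on a little-endian MIPS target. It is not taken from the repository: the function names (ReadAlignedU32, Lwl, Lwr, LoadUnalignedU32) and the flat byte-buffer memory model are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: read an aligned little-endian 32-bit word
// from a flat byte buffer standing in for guest memory.
static uint32_t ReadAlignedU32(const uint8_t *mem, uint32_t addr) {
    assert((addr & 3) == 0);
    return uint32_t(mem[addr]) |
           (uint32_t(mem[addr + 1]) << 8) |
           (uint32_t(mem[addr + 2]) << 16) |
           (uint32_t(mem[addr + 3]) << 24);
}

// lwl (little-endian): merge the low-address bytes of the aligned word
// containing addr into the most significant bytes of rt.
static uint32_t Lwl(const uint8_t *mem, uint32_t rt, uint32_t addr) {
    const uint32_t shift = (addr & 3) * 8;  // 0, 8, 16 or 24
    const uint32_t word = ReadAlignedU32(mem, addr & ~3u);
    return (rt & (0x00FFFFFFu >> shift)) | (word << (24 - shift));
}

// lwr (little-endian): merge the high-address bytes of the aligned word
// containing addr into the least significant bytes of rt.
static uint32_t Lwr(const uint8_t *mem, uint32_t rt, uint32_t addr) {
    const uint32_t shift = (addr & 3) * 8;  // 0, 8, 16 or 24
    const uint32_t word = ReadAlignedU32(mem, addr & ~3u);
    return (rt & (0xFFFFFF00u << (24 - shift))) | (word >> shift);
}

// The usual little-endian pairing for an unaligned 32-bit load:
//   lwr rt, 0(addr)
//   lwl rt, 3(addr)
static uint32_t LoadUnalignedU32(const uint8_t *mem, uint32_t addr) {
    uint32_t rt = 0;
    rt = Lwr(mem, rt, addr);
    rt = Lwl(mem, rt, addr + 3);
    return rt;
}
```

Each instruction only replaces part of the destination register, which is why a backend has to treat rt as both an input and an output and may need extra temporaries. The direction of each shift is also easy to get backwards.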