Unknown W. Brackets
|
1c163e4817
|
arm64: Avoid an ORR for c.ueq.
This is about 15% faster for this single, uncommon instruction on A57.
|
2015-06-28 10:52:17 -07:00 |
|
Unknown W. Brackets
|
febe435946
|
arm64: Use FP load/stores for non-reg pointers.
|
2015-06-28 10:45:44 -07:00 |
|
Unknown W. Brackets
|
213ad4bcc9
|
arm64: Cleanup branch code a tiny bit.
Want to make it clear that we can't kill W0 at this point (delay slots).
|
2015-06-28 09:28:54 -07:00 |
|
Unknown W. Brackets
|
0978aa4d5e
|
arm64: Use msub for div/divu remainder.
Not really much faster, but fewer instructions at least.
|
2015-06-28 09:05:39 -07:00 |
|
Unknown W. Brackets
|
0a5b1c030b
|
arm64: Implement ext and ins.
|
2015-06-28 08:45:17 -07:00 |
|
Unknown W. Brackets
|
daddb73f22
|
arm64: Implement nor.
|
2015-06-28 00:41:04 -07:00 |
|
Unknown W. Brackets
|
11a851a139
|
arm64: Enable movz/movn.
|
2015-06-28 00:41:04 -07:00 |
|
Unknown W. Brackets
|
223e55a453
|
arm64: Undisable clz/clo, they work.
Also, avoid a temp in clo. It's the tiniest bit faster on A57, though
we'll see how it works out elsewhere. A bit clearer without the temp
imho.
|
2015-06-28 00:41:03 -07:00 |
|
Unknown W. Brackets
|
81bc8107cf
|
arm64: Use UBFX, not LSR, for slti sign check.
This is about 22% faster on the A57 (for just this one instruction, so not
a huge impact overall.) Makes sense that it would be since not arith.
|
2015-06-28 00:41:03 -07:00 |
|
Unknown W. Brackets
|
fedbe645e0
|
arm64: Use all immediate compares in safemem.
Ah, this is better.
|
2015-06-27 00:22:09 -07:00 |
|
Unknown W. Brackets
|
3c29ec2051
|
arm64: Optimize codesize in safemem path a bit.
Will only be used for scratchpad, I think.
|
2015-06-27 00:22:04 -07:00 |
|
Unknown W. Brackets
|
fbd4db0fc4
|
arm64: Add a safemem path.
This is probably not optimal but at least it works.
|
2015-06-27 00:22:04 -07:00 |
|
Unknown W. Brackets
|
b3aa6d89e9
|
Fix UBFX encoding (thanks SonicAdvance1.)
|
2015-06-26 21:27:03 -07:00 |
|
Henrik Rydgard
|
e848247f88
|
ARM64: Also save FP registers around the JIT dispatcher loop
|
2015-06-14 13:03:46 +02:00 |
|
Henrik Rydgard
|
2c05334d47
|
ARM64: Fix bug where we didn't save the FP registers correctly in the vertex decoder.
Also port a few ops from dolphin's ARM64 emitter.
|
2015-06-14 12:56:44 +02:00 |
|
Henrik Rydgård
|
70fa830ba5
|
Split out the ReplaceJalTo test logic.
This makes it so the IR, in the future, can work correctly for
replacements.
|
2015-04-12 13:35:10 -07:00 |
|
Henrik Rydgård
|
d014d420db
|
Unify JitOptions across the backends.
This is required to make ExtractIR not a member of the various backends.
|
2015-04-12 11:41:26 -07:00 |
|
Henrik Rydgård
|
81dec36da8
|
Use an accessor to read the compilerPC.
In the IR it will be read from the block.
|
2015-04-11 01:14:37 -07:00 |
|
Henrik Rydgård
|
a897723e6a
|
Separate out jit reading nearby instructions.
This makes it easier to use an IR for these things, or remove them.
|
2015-04-11 00:53:24 -07:00 |
|
Unknown W. Brackets
|
b0d291032d
|
armjit: Avoid cfc1/mfc1 to $0.
|
2015-04-07 18:30:36 -07:00 |
|
Unknown W. Brackets
|
788b9d78f8
|
jit: Avoid a super unlikely write to zero.
|
2015-04-07 18:20:37 -07:00 |
|
Henrik Rydgård
|
a8c2d0945a
|
ARM64: lwl: Pass INVALID_REG to be sure SCRATCH1 doesn't get overwritten...
|
2015-04-06 18:13:41 +02:00 |
|
Henrik Rydgård
|
13c9390c53
|
ARM64: Emitter fix, disable swl/swr/lwl/lwr again fully
|
2015-04-06 18:13:38 +02:00 |
|
Henrik Rydgård
|
fbaffdceab
|
Remove some outdated comments, minor stuff
|
2015-04-06 18:13:36 +02:00 |
|
Henrik Rydgard
|
0a70618f87
|
ARM64: Accurate floating point rounding. For some reason, FTZ doesn't seem to work though.
|
2015-04-06 18:13:36 +02:00 |
|
Henrik Rydgard
|
ad3d539451
|
ARM64: Attempt at lwl/lwr/swl/swr. The first two don't work
|
2015-04-06 18:13:35 +02:00 |
|
Henrik Rydgard
|
44286a2b37
|
ARM64: Accurate float->int conversion with rounding mode.
|
2015-04-06 18:13:34 +02:00 |
|
Henrik Rydgard
|
acf08eefa8
|
ARM64: Fix FCVTL, use it in v2hf
|
2015-04-06 18:13:33 +02:00 |
|
Henrik Rydgard
|
8eedcc7fb0
|
ARM64: Speedup fpu/vfpu load/stores too using "pointerification". Actually noticeable gain.
|
2015-04-06 18:13:32 +02:00 |
|
Henrik Rydgard
|
ad648baa9c
|
ARM64 regcache: Add support to "pointerify" registers. Use in load/store to cut down instructions.
|
2015-04-06 18:13:32 +02:00 |
|
Henrik Rydgard
|
2780eef595
|
ARM64: Another couple of VFPU ops
|
2015-04-06 18:13:31 +02:00 |
|
Henrik Rydgard
|
ca58f322e5
|
ARM64: Port over some missing VFPU instructions from ARM. Not much left now.
|
2015-04-06 18:13:30 +02:00 |
|
Henrik Rydgard
|
f06e9a9d18
|
ARM64: Even more VFPU instructions
|
2015-04-06 18:13:30 +02:00 |
|
Henrik Rydgard
|
1b1ab73b0f
|
ARM64: Enable some more VFPU instructions, some code cleanup
|
2015-04-06 18:13:29 +02:00 |
|
Henrik Rydgard
|
500ca94ab8
|
ARM64: Port over tons of VFPU code from ARM, leave most of it disabled.
|
2015-04-06 18:13:28 +02:00 |
|
Henrik Rydgard
|
8df8c210d1
|
ARM64: Start porting over VFPU stuff from ARM, fix regalloc bug
|
2015-04-06 18:13:28 +02:00 |
|
Henrik Rydgard
|
6cb107d6fc
|
ARM64: Fix LDP disassembly
|
2015-04-06 18:13:25 +02:00 |
|
Henrik Rydgard
|
34e61ab875
|
ARM64: More FPU instructions (int<->float convert), minor stuff
|
2015-04-06 18:13:25 +02:00 |
|
Henrik Rydgard
|
25ec85551f
|
ARM64: Implement FP compares, misc
|
2015-04-06 18:13:22 +02:00 |
|
Henrik Rydgard
|
ceb9f66502
|
ARM64: Fix bug in mult
|
2015-04-06 18:13:21 +02:00 |
|
Henrik Rydgard
|
1a02e32ad1
|
ARM64: Implement the multiplication instructions
|
2015-04-06 18:13:20 +02:00 |
|
Henrik Rydgard
|
a12e448fb4
|
ARM64: Stub vertex decoder jit, implementing just enough for the cube.elf cube.
|
2015-04-06 18:13:18 +02:00 |
|
Henrik Rydgard
|
57e759a605
|
ARM64: Fix and turn on basic block linking
|
2015-04-06 18:13:17 +02:00 |
|
Henrik Rydgard
|
5dff3f8c89
|
ARM64: Implement scalar FMOV. This makes the FPU2op ops work.
|
2015-04-06 18:13:16 +02:00 |
|
Henrik Rydgard
|
4233921ab7
|
ARM64: Some more instructions, func replacements
|
2015-04-06 18:13:16 +02:00 |
|
Henrik Rydgard
|
9e2786b319
|
ARM64: Fix and enable a bunch more instructions. Temporarily disable movz/movn.
|
2015-04-06 18:13:15 +02:00 |
|
Henrik Rydgard
|
2bca05c4f2
|
ARM64: implement shifts, movz/movn. Corresponding fixes to emitter/disasm
|
2015-04-06 18:13:14 +02:00 |
|
Henrik Rydgard
|
86ff2a2806
|
ARM64: Enable a bunch of arithmetic instructions that now work, thanks to emitter fixes
|
2015-04-06 18:13:13 +02:00 |
|
Henrik Rydgard
|
77501e220d
|
ARM64: Enable a few more instructions, more emitter/disasm unittests
|
2015-04-06 18:13:13 +02:00 |
|
Henrik Rydgard
|
0922db6062
|
ARM64: Some FP work.
|
2015-04-06 18:13:11 +02:00 |
|