mirror of
https://github.com/n64dev/cen64.git
synced 2025-04-02 10:31:54 -04:00
izy noticed that the branch LUT was generating memory moves and could be replaced with an inlined function that coerces gcc into generating a lea in its place:

    4005ac: 8d 1c 00                lea    (%rax,%rax,1),%ebx
    4005af: c1 fb 1f                sar    $0x1f,%ebx
    4005b2: f7 d3                   not    %ebx

(no memory access)

    4005b9: c1 e8 1e                shr    $0x1e,%eax
    4005bc: 83 e0 01                and    $0x1,%eax
    4005bf: 44 8b 24 85 90 07 40    mov    0x400790(,%rax,4),%r12d

(original has memory access)

This ends up optimizing branch instructions quite nicely:

"You see that when you use 'mask' you execute '~mask'. The compiler understands that ~(~(partial_mask)) = partial_mask and removes both 'NOTs'. So in this case my version uses 2 instructions and no memory access/cache pollution."
||
---|---|---|
docs | ||
cp0.c | ||
cp0.h | ||
cp1.c | ||
cp1.h | ||
cpu.c | ||
cpu.h | ||
dcache.c | ||
dcache.h | ||
decoder.c | ||
decoder.h | ||
fault.c | ||
fault.h | ||
fault.md | ||
functions.c | ||
icache.c | ||
icache.h | ||
interface.c | ||
interface.h | ||
opcodes.c | ||
opcodes.h | ||
opcodes.md | ||
opcodes_priv.h | ||
pipeline.c | ||
pipeline.h | ||
registers.md | ||
segment.c | ||
segment.h | ||
TODO.txt |
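
The commit message at the top of this listing describes swapping a two-entry lookup table for a computed mask. The following is a minimal sketch of that idea, not the code from this repository: the names `branch_mask_table`, `branch_mask_lut`, and `branch_mask_inline` are hypothetical, and the table contents (`~0U` for a clear bit 30, `0U` for a set bit 30) are an assumption inferred from the `shr $0x1e` / `and $0x1` index in the disassembly. The computed version also assumes that right-shifting a negative signed value is an arithmetic shift, which is implementation-defined in C but is what gcc does.

```c
#include <stdint.h>

/* Assumed table layout: index 0 keeps every bit, index 1 clears every bit. */
static const uint32_t branch_mask_table[2] = { ~0U, 0U };

/* Table-based variant: the index comes from bit 30 of the word,
 * so producing the mask requires a load from memory. */
static inline uint32_t branch_mask_lut(uint32_t iw) {
  return branch_mask_table[(iw >> 30) & 0x1];
}

/* Computed variant: shift bit 30 into the sign position (the lea),
 * smear it across the word with an arithmetic shift (the sar),
 * then invert it (the not). No memory access is involved.
 * Relies on arithmetic right shift of signed values (gcc behavior). */
static inline uint32_t branch_mask_inline(uint32_t iw) {
  uint32_t smeared = (uint32_t) ((int32_t) (iw << 1) >> 31);
  return ~smeared;
}
```

If a caller then applies `~` to the result, the compiler can cancel the two inversions, which is the effect the quoted remark in the commit message points out.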