Added VU->IR code that updates the Q pipeline when a Q-reading instruction executes in a block where no stall has occurred (still needs to be implemented in the x64 emitter)
Added stub for FCOR
Forced a VU JIT flush when a new microprogram is uploaded
Made JIT XGKICK timings a bit better
Made MAC flags update every instruction (to be reworked)
The BIOS successfully boots now. Time to get FFX working...
Fixed a bug that caused multiple division operations to not properly update the Q register
All scenes in Slave of the VU now run. The third is bugged because of timing issues with XGKICK.
Slave of the VU, with some patches to skip IOP problems, runs at more than double the speed before getting bottlenecked by the GS.
Also changed execution to go at full speed and made the VU translator a class rather than a namespace
This means that we have general branching logic. It doesn't handle branches in branch delays or integer delay slots, however.
Next up: FMAND. This means we'll need to have an optimized way of updating the MAC flag pipeline...
Also added various optimizations, as well as stubs for IBNE and FCSET. Need to work on conditional branches now.
Side note: last commit fixed some ILWR bugs, not LQI. Sorry!
Replaced some (maybe all?) instances of jumbledcase with snake_case
Renamed some GS structs (GS_message, for example)
Renamed camelCaseFuncs() to snake_case_funcs()
Cases where jumbledcase and snake_case are mixed together (e.g. ebit_delay_slot) are kept as-is.
Made some changes to IR::Instruction to make arguments easier to read inside the emitter.
Added a reset function to the JIT to flush out the cache.
Things to consider for the future:
* Windows uses a different x64 ABI (thanks Microsoft). Perhaps we can have a generic ABI_Call function that translates a function call to the appropriate x64 assembly.
* The VU JIT is hardcoded to only work for a single VU. It needs to be modified to support both VUs.
* microVU (PCSX2's VU JIT) recompiles entire microprograms rather than blocks, allowing for advanced optimization. It would be ideal to follow a similar approach.
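A rough sketch of the register-selection part of the generic ABI_Call idea from the first bullet. All names here (Abi, Reg64, int_arg_reg, shadow_space_bytes) are hypothetical and not taken from the emitter. It only covers integer/pointer arguments; a real ABI_Call would also have to handle stack alignment and caller-saved registers.

```cpp
#include <cstddef>

// Hypothetical names; not part of the existing emitter.
enum class Reg64 { RCX, RDX, RSI, RDI, R8, R9 };
enum class Abi { SystemV, Microsoft };

// Pick the ABI of the host the JIT is being compiled for.
constexpr Abi host_abi()
{
#ifdef _WIN32
    return Abi::Microsoft;
#else
    return Abi::SystemV;
#endif
}

// Writes the register that carries the index-th integer/pointer argument
// into `out`. Returns false if that argument is passed on the stack.
inline bool int_arg_reg(Abi abi, std::size_t index, Reg64& out)
{
    // System V AMD64: RDI, RSI, RDX, RCX, R8, R9
    static constexpr Reg64 sysv[] = {Reg64::RDI, Reg64::RSI, Reg64::RDX,
                                     Reg64::RCX, Reg64::R8, Reg64::R9};
    // Microsoft x64: RCX, RDX, R8, R9
    static constexpr Reg64 ms[] = {Reg64::RCX, Reg64::RDX, Reg64::R8, Reg64::R9};

    if (abi == Abi::SystemV)
    {
        if (index >= 6)
            return false;
        out = sysv[index];
        return true;
    }
    if (index >= 4)
        return false;
    out = ms[index];
    return true;
}

// The Microsoft ABI requires the caller to reserve 32 bytes of shadow
// space on the stack before a call; System V has no such requirement.
constexpr std::size_t shadow_space_bytes(Abi abi)
{
    return abi == Abi::Microsoft ? 32 : 0;
}
```

With a table like this, the emitter could load each argument into int_arg_reg(host_abi(), i, reg) instead of hardcoding RDI/RSI.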
Next step: define what an IR instruction is. VU upper and lower ops can be swapped; swapping can be done at compile time, but this requires the source and destination registers of the ops to be known.
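A minimal sketch of the compile-time swap check, to show why the source and destination registers need to be known. OpInfo and lower_must_run_first are hypothetical names. The sketch assumes both halves of a pair read their sources before either writes its result, and it ignores field masks (x/y/z/w), integer registers, and special registers like Q and P.

```cpp
// Hypothetical per-op summary the IR would need to carry.
struct OpInfo
{
    int dest_vf;   // VF register written, or -1 if none
    int src_vf[3]; // VF registers read, -1 for unused slots
};

// Returns true if the lower op has to be emitted before the upper op.
// Assumption: emitting sequentially, upper-then-lower would let the lower
// op see the upper op's new value instead of the old one.
inline bool lower_must_run_first(const OpInfo& upper, const OpInfo& lower)
{
    // No destination, or vf0 (hardwired constant): nothing can be clobbered.
    if (upper.dest_vf <= 0)
        return false;

    for (int src : lower.src_vf)
    {
        if (src == upper.dest_vf)
            return true;
    }
    return false;
}
```

If the check returns true, the translator would emit the lower op's IR first; otherwise the default order is kept.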
A JitCache will contain all dynamically recompiled code belonging to the EE or VU JIT.
TODO: Windows has a different way of allocating virtual memory. Currently we just die on Windows compilers.
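One possible way to handle the Windows case, sketched with hypothetical helper names (alloc_code_buffer, make_executable, free_code_buffer) that are not part of the current JitCache. It assumes a write-then-protect scheme: the buffer is mapped read/write, filled with code, and then switched to read/execute. The existing allocator may do this differently.

```cpp
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

// Reserve and commit a read/write buffer for emitted code.
// Returns nullptr on failure.
inline void* alloc_code_buffer(std::size_t size)
{
#ifdef _WIN32
    return VirtualAlloc(nullptr, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
#else
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
#endif
}

// Flip the buffer to read/execute once the block has been emitted.
inline bool make_executable(void* p, std::size_t size)
{
#ifdef _WIN32
    DWORD old_protect;
    return VirtualProtect(p, size, PAGE_EXECUTE_READ, &old_protect) != 0;
#else
    return mprotect(p, size, PROT_READ | PROT_EXEC) == 0;
#endif
}

// Release the buffer. VirtualFree with MEM_RELEASE takes a size of 0.
inline void free_code_buffer(void* p, std::size_t size)
{
#ifdef _WIN32
    (void)size;
    VirtualFree(p, 0, MEM_RELEASE);
#else
    munmap(p, size);
#endif
}
```

Keeping the platform split inside three small functions would let the rest of the cache stay free of #ifdefs.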
* Added somewhat functional COP2 Interlocking
Fixed a couple of minor VU bugs
* Fix GIF DMA when CHCR is written to while active. Fixes Klonoa 2
* Modified VU0Wait to always interlock (solves hangs on MBit games)
Added COP2 VU0 updating when writing to COP2 if VU0 is busy
Added a Status register pipe to the VUs for handling FSSET
Small optimisation to the VU stall checks