[*] Fixed memcpy usage in the aligned path of the memcpy swizzle and enabled it for Linux/OSX
[*] Refactored unaligned path for memcpy swizzle and also fixed memcpy_test
[*] Set the SP mem region correctly for DMA_SP
[*] Removed redundant check for flashram read/write; the check was actually incorrect for flashram reads
[*] Added return for failed DMA flashram transfers
[*] Set correct save type for Derby Stallion
[*] Set flashram type correctly, fixes saving for Derby Stallion
[*] Added check to prevent out-of-range ROM reads
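
A minimal sketch of the kind of range check described in the last entry. ReadRomChecked and its parameters are hypothetical names for illustration, not the emulator's actual API. The end of the request is computed in 64 bits so that offset + length cannot wrap around, and the caller gets a failure return instead of a partial copy.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not the emulator's real function): copies `length`
// bytes starting at `offset` from a ROM image into `dst`, and refuses any
// request that would read past the end of the image.
bool ReadRomChecked(const std::vector<uint8_t>& rom, uint32_t offset,
                    uint32_t length, uint8_t* dst)
{
    // Widen to 64 bits so offset + length cannot overflow.
    const uint64_t end = static_cast<uint64_t>(offset) + length;
    if (end > rom.size())
        return false;  // out of range: report failure, copy nothing

    if (length != 0)
        std::memcpy(dst, rom.data() + offset, length);
    return true;
}
```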
* Added check for odd PI DMA length, fixes Doraemon 3
* Fixed incorrect mask for spmem_address, thanks to Rinnegatamante for pointing this out
* Moved rdram overflow check outside the count loop
* Added FAST_DMA_SP define and enforced 8 byte alignment for correctness. This is still only enabled for the PSP but should be safe to enable for other platforms where speed is important
* Small cleanups
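
A rough sketch of the alignment gate behind a fast DMA path like FAST_DMA_SP. CopyDma and its parameters are hypothetical, and the slow path assumes emulated memory is stored word-swapped on a little-endian host (byte address a lives at a ^ 3), which may not match every build. Buffers are assumed to be a multiple of 4 bytes long so the swizzled index stays in range.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical sketch, not the emulator's real routine.
// Returns true when the fast (memcpy) path was taken.
bool CopyDma(uint8_t* dst, const uint8_t* src,
             uint32_t dst_off, uint32_t src_off, uint32_t len)
{
    // Fast path only when both offsets and the length are multiples of 8.
    if (((dst_off | src_off | len) & 7) == 0) {
        if (len != 0)
            std::memcpy(dst + dst_off, src + src_off, len);
        return true;
    }

    // Slow path: copy byte by byte, applying the assumed ^3 swizzle to
    // translate emulated byte addresses into host buffer indices.
    for (uint32_t i = 0; i < len; ++i)
        dst[(dst_off + i) ^ 3] = src[(src_off + i) ^ 3];
    return false;
}
```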
*Corrected bad pointer cast for the optimized copy in Yoshi_Memrect and fixed the non-optimized copy
*Store the N64 RAM offset rather than the system memory address for Fast TMEM (thanks strmnnrmn!). Fixes random crashes when using the non-accurate path for TMEM emulation
*Make sure to reset the TMEM block for fast TMEM even when accurate TMEM is used, since we fall back to the non-accurate path when games set line to 0
*Ignore load tile for accurate TMEM when line is 0; this was causing a crash in Paper Mario, which sets line = 0
Fixes a bug in Paper Mario where a save slot would get duplicated; also fixes a bug in Majora's Mask where Link did not have a shield and sword during the first movie
*Fixed debug build when DAEDALUS_DEBUG_DISPLAYLIST is defined, and enabled it for debug builds
*Fixed ptr->u32 cast in DLParser_DumpVtxInfoDKR, it was causing a compilation error for me
*Removed unused zlib from third_party, also added webby
*Fixed possible overflow of ucode entries, also spread entries to avoid always overwriting the last entry
*Fixed out of bounds assert for microcode data
*Properly check both microcode data and code base for ucode cache
*Removed gLastUcodeBase, which potentially caused ucodes not to be loaded correctly. This was legacy code from before we had the ucode cache
*Fixed 32-bit pointer assumption in DLParser_SetCustom
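
To illustrate the ucode cache entries above, here is a small sketch of a fixed-size cache keyed on both the code base and the data base. UcodeCache, kMaxEntries and the field names are made up for this example and are not the emulator's real types. It uses a rotating replacement index so new entries are spread across slots instead of always landing on the last one, and the modulo keeps the index from running past the array.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical cache entry: a ucode is identified by BOTH base addresses.
struct UcodeEntry {
    uint32_t code_base = 0;
    uint32_t data_base = 0;
    int      ucode     = -1;
    bool     used      = false;
};

class UcodeCache {
public:
    static constexpr std::size_t kMaxEntries = 4;  // assumed size

    // Returns the cached ucode id, or -1 when there is no match.
    int Find(uint32_t code_base, uint32_t data_base) const {
        for (const UcodeEntry& e : entries_) {
            if (e.used && e.code_base == code_base && e.data_base == data_base)
                return e.ucode;
        }
        return -1;
    }

    // Stores an entry in the next slot, wrapping around when full.
    void Insert(uint32_t code_base, uint32_t data_base, int ucode) {
        UcodeEntry& e = entries_[next_];
        e.code_base = code_base;
        e.data_base = data_base;
        e.ucode     = ucode;
        e.used      = true;
        next_ = (next_ + 1) % kMaxEntries;  // never indexes out of bounds
    }

private:
    std::array<UcodeEntry, kMaxEntries> entries_{};
    std::size_t next_ = 0;
};
```

Matching on the code base alone would treat two ucodes that share code but differ in data as the same entry, which is the case the combined check is meant to rule out.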