Releases: FEX-Emu/FEX
FEX-2107
Compatibility & Bug Fixes
- Fixes bugs in unaligned atomic signal handlers
- CPUID cleanups
- Fixed a memory leak in Register Allocation
- Several syscall fixes (pidfd_send_signal, arch_prctl, fcntl, send(m)msg, shmdt, recvmsg, *chown32, edge cases around time syscalls, 32/tmpfile)
- Implement more syscalls (setfsuid32, setfsgid32, getgroups32, setgroups32, settimeofday, futimesat, utimes, 32/sigpending, 32/truncate64)
- Deferred signal handler registration, fixes bash (pts handover)
- Fixed handling of some rare elf files
- Implemented 32 bit iret
Usability
- Adds support for squashfs based rootfses
- CPack support for Debian packaging
Performance
- Huge page support in our VA allocator
Misc
- Reduce warnings
- Several assorted Cleanups
- Relocate ELF handling logic to the os frontend
- Improve logging SNR
FEX-2106
Compatibility & Bug Fixes
- Several syscall bug fixes (pselect, semctl, msgctl, pidfd_getfd, poll, fadvise64, sigaction, mmap, epoll, uname, openat, ...)
- Implemented SSE4.1
Performance
- Multi threaded AOTGen
- Optimized ioctl/drm marshaling
Misc
- Several minor cleanups & refactors
FEX-2105
Compatibility & Bug Fixes
- New, simpler ELF Loader
- ioctl32 marshaling for several devices
- Handle more cases of an application pinging self
- binfmt_misc fixes to support AppImage
- Several system call fixes (eventfd, eventfd2, openat, creat, truncate, clock_nanosleep, pselect6, sendmsg, recvmsg)
- Signal fixes
- Better handling of /proc/self/, /proc/pid-self/
- Implements support for CLFLUSH
Performance
- Faster AOTIR file loading
- AOTIR offline generation
- Switch to xxhash from fasthash
- IR structure optimizations
- Adds Long Divide removal pass
- Zero-cost asserts
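
As a rough illustration of the idea behind removing a long divide (a sketch of the general technique, not a description of FEX's actual pass): an x86-style DIV takes a double-width dividend split across two registers, and when the upper half is known to be zero the operation reduces to an ordinary single-width division. The function names below are hypothetical.

```python
MASK64 = (1 << 64) - 1


def long_udiv(hi, lo, divisor):
    """Unsigned divide of the 128-bit value hi:lo by a 64-bit divisor.

    Returns (quotient, remainder). Models the x86 rule that the
    quotient must fit in 64 bits.
    """
    dividend = (hi << 64) | lo
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK64:
        raise OverflowError("quotient does not fit in 64 bits")
    return quotient, remainder


def simplified_udiv(hi, lo, divisor):
    """Use a plain 64-bit divide when the upper half is zero.

    Illustrative only: a real optimization pass would prove hi == 0
    at compile time instead of checking it at run time.
    """
    if hi == 0:
        return divmod(lo, divisor)
    return long_udiv(hi, lo, divisor)
```

Both paths return the same result whenever the upper half is zero, which is what makes the cheaper form a safe substitute in that case.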
Misc
- Fixes host thread stacks ending up in lower 32-bit VA
- Remove reliance on librt and libnuma
- Default hidden visibility and strip symbols
- FEX now uses jemalloc
FEX-2104
Compatibility & Bug Fixes
- Disables RCPC on ARM64 JIT
- Implements CPUID 0x8000'0005 for L1 cacheline information
- Implements CPUID 0x8000'0006 for cacheline information
- Implements MOVNTDQA
- Restrict imm code motion around selects to matching sizes, fixes dav1d
- Validate LOCK handling, add missing segment offsets
- Add atomic logic for SecondaryALUOp
- Adds support for locked NOT
- Adds support for locked ADC and SBB
- Adds a couple new 32bit syscalls
- Fixes an edge case of 32bit cmpxchg <reg>, <reg>
- Lock around FDToNameMap accesses (Fixes Geekbench 4 stability issues)
- Flush context around OP_SYSCALLs, Syscalls might read it
- SA_NOCLDSTOP only blocks CLD_CONTINUED/STOPPED/TRAPPED
- Add missing break for UD2 in INTOp
- Init on X87FNSAVE, fix FNINIT
- Switches FEXCore over to pthreads implementation
Usability
- Support for global application profiles
- Thunks can be configured with json as an overlay
- FEXConfig improvements
- Adds support for Named RootFS folders in FEXConfig
- Cleanup threads when they exit
- Default to no logging
- Allows installing of FEXThunks in our data directory
Documentation
- New Readme.md & auto generated SourceOutline.md to help newcomers to the codebase
- Man Pages
Internal restructuring
- Unify all four dispatchers
- Separate thread and state
- Allow both ARM64 and X86_64 jits to be compiled at the same time
FEX-2103
Compatibility
- Support for unaligned atomic memory ops
- Thread local mman cache invalidation
- glibc 2.32 support (cpuid fixes, faccessat2)
- cmpxchg flag fixes
- Fixed cvtt* ops to actually truncate, for both x87 and SSE
- Implemented BTC, BTR and BTS atomic variants
- Workaround crashes around exit_group for graceful exit
- Fixed SAR8 & SAR16 sign extension
- Added support for x87 rounding modes
- Added support for x87 precision modes
- Temporary fix for x87's frem
- Fixed select system calls to update the descriptors
- Fixed handling of invalid ops when multiblock is enabled
- Fixed an edge case bug in CreateElementPair
- Fixed BRK handling
- Fixed /proc/cpuinfo topology
- Fixed compilation of superblocks with more than 256 spills
- Fixed a bug in the MUL -> SHL optimization
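
For readers unfamiliar with the MUL -> SHL optimization named in the last item, the general idea is that a multiply by a power-of-two constant can be replaced with a left shift. The sketch below shows that strength reduction in isolation; it is not FEX's implementation, and `mul_to_shift` and `multiply` are hypothetical names.

```python
def mul_to_shift(constant):
    """Return the shift amount if `constant` is a power of two, else None."""
    if constant > 0 and constant & (constant - 1) == 0:
        return constant.bit_length() - 1
    return None


def multiply(value, constant, bits=64):
    """Multiply with wraparound at `bits`, using a shift when possible."""
    mask = (1 << bits) - 1
    shift = mul_to_shift(constant)
    if shift is not None:
        # Power-of-two multiplier: a left shift gives the same low bits.
        return (value << shift) & mask
    return (value * constant) & mask
```

The shift is only equivalent for the low bits of the result, so any flags or upper-half outputs that a real multiply instruction produces need separate handling.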
Performance
- Added ir-cache that halves loading times when enabled (--aotir-capture && --aotir-load)
- x87 ops no longer force interpreter for the entire superblock
Usability
- uname now returns the host name
- Added checks for python in the CMake files
Misc
- ThunkLibs now generate asm wrappers, so they can be compiled with older GCCs
- Asserts on unsupported atomic ops
- Removed erratic asserts in RA
FEX-2102
Compatibility
- Unaligned cmpxchg & cmpxchg8b on ARMv8.1+
- Multiblock now gracefully handles unsupported / invalid instructions. This makes it safe to always enable multiblock
- Several fixes for 32-bit binaries: branch handling, signal return, memory allocation, ioctl32 for x86-64 host
- Improved BRK handling
Performance
- Much reduced stuttering during JIT compilation. The JIT optimizer is now over 3x as fast.
- Per-thread IR Caching with cached RA, for faster recovery from code cache resets
Usability
- Now defaults to jit, multiblock, 5000 instructions.
- CPUID returns fex version
Misc
- Cleaned up IR printing, extended asm tests to test IR dumping & IR printing
- Integrated gcc target tests to our CI
- Removed many warnings
The detailed change log is available here