
FEX-2502

@Sonicadvance1 Sonicadvance1 released this 08 Feb 10:04
· 57 commits to main since this release
512643d

Read the blog post at FEX-Emu's Site!

One month later and we have another exciting FEX-Emu release! While we don't have a lot of individual topics, these are some very good usability and
performance improvements to enjoy!

Fix Steam again

Once again this past month, Steam updated its embedded Chromium. The update introduced a behaviour of passing around zero-length environment variables.
While this isn't necessarily a problem, FEX had made an assumption that all environment variables contained at least one character. This minor change in
behaviour resulted in a crash, preventing Steam from fully starting. With this problem fixed, Steam is once again able to run.
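
As a rough illustration of this class of bug, here is a minimal Python sketch of environment parsing that tolerates zero-length entries. It is not FEX's actual loader code (FEX is written in C++). The function names are hypothetical, and skipping malformed entries rather than forwarding them unchanged is an assumption made for the example.

```python
from typing import Dict, List, Optional, Tuple


def split_env_entry(entry: str) -> Optional[Tuple[str, str]]:
    """Split a 'NAME=value' entry, returning None for entries that cannot be split."""
    if not entry:
        # Zero-length entry: there is no first character to inspect,
        # so bail out before any indexing can fail.
        return None
    name, sep, value = entry.partition("=")
    if not sep or not name:
        # No '=' at all, or an empty name: treat as malformed (assumption).
        return None
    return name, value


def parse_environment(entries: List[str]) -> Dict[str, str]:
    """Build a name -> value mapping, ignoring entries that cannot be parsed."""
    env: Dict[str, str] = {}
    for entry in entries:
        parsed = split_env_entry(entry)
        if parsed is not None:
            name, value = parsed
            env[name] = value
    return env
```

The key point is only that the length check happens before anything looks at the entry's contents. An empty value such as `EMPTY=` is still accepted.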

Multiblock improvements

This is one of the more exciting improvements this month, as it increases both JIT compile performance and JIT runtime performance!
bylaws took it upon themselves this month to fix bugs that had ended up in FEX's "Multiblock" implementation.

The changes:

  • Stop multiblock discovery beyond a page.
    • Significantly reducing the search space.
  • Early exit multiblock entries if the instructions are two bytes of zero
    • Highly unlikely to be real code and likely zero initialized memory
  • Split blocks at jump target boundaries
    • Significantly reducing the amount of redundant code compilation that occurs
  • Ensure our RIP reconstruction can handle large code jumps
    • Fixes crashes in game engines that do faulting magic
  • Stop copying the IR after generation
    • While not huge, when compiling hundreds of thousands of entrypoints, it adds up

With the combination of all these changes, JIT compilation time can sometimes be cut in half with multiblock enabled! That means more performance, less stuttering, and fewer crashes when multiblock is in use.
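
The first two items in the list above (stopping at a page boundary and bailing out on two zero bytes) can be sketched as a simple filter over candidate branch targets. This is a hedged Python illustration, not FEX's frontend code. The 4096-byte page size, the function names, and the choice to compare against the page of the entry point are all assumptions. On x86, two zero bytes decode as `add [rax], al`, which is rarely real code.

```python
from typing import List

PAGE_SIZE = 4096  # assumed page size for this sketch


def same_page(entry_addr: int, target_addr: int, page_size: int = PAGE_SIZE) -> bool:
    """True if both addresses fall inside the same page."""
    return entry_addr // page_size == target_addr // page_size


def looks_like_zeroed_memory(memory: bytes, offset: int) -> bool:
    """True if the two bytes at offset are both zero."""
    return bytes(memory[offset:offset + 2]) == b"\x00\x00"


def filter_block_targets(entry_addr: int, candidates: List[int],
                         memory_base: int, memory: bytes) -> List[int]:
    """Keep only the targets that discovery should still follow."""
    accepted = []
    for target in candidates:
        if not same_page(entry_addr, target):
            continue  # beyond the entry point's page: stop searching there
        if looks_like_zeroed_memory(memory, target - memory_base):
            continue  # most likely zero-initialized memory, not code
        accepted.append(target)
    return accepted
```

Both checks are cheap and prune targets before any decoding work is spent on them, which is where the reduced search space comes from.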

We don't yet enable this feature by default, but it can be enabled in our FEXConfig tool. After a month of dogfooding this feature, we will most
likely enable it by default next month.
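
Returning to the "split blocks at jump target boundaries" change listed earlier, the idea can be sketched as cutting a decoded address range wherever another branch lands inside it. The Python below is a simplified illustration with a hypothetical function name. It works on plain address ranges and does not reflect how FEX's frontend actually represents blocks.

```python
from typing import Iterable, List, Tuple


def split_at_jump_targets(block_start: int, block_end: int,
                          jump_targets: Iterable[int]) -> List[Tuple[int, int]]:
    """Split the half-open range [block_start, block_end) at each jump target inside it."""
    # Only targets strictly inside the block create a new boundary;
    # duplicates collapse because a set is used.
    cuts = sorted({t for t in jump_targets if block_start < t < block_end})
    bounds = [block_start, *cuts, block_end]
    return list(zip(bounds[:-1], bounds[1:]))
```

Without the split, a branch into the middle of an existing block would presumably force the tail of that block to be decoded and compiled a second time as its own block. With the split, both paths share the same tail piece.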

Fix WINE memory allocator behaviour

This was an interesting bug that has technically been in FEX's source for years but was only recently unearthed as a problem. Due to a recent change
that reordered how FEX allocates memory, we started allocating a region of memory that WINE's preloader also allocates at startup. This wasn't
previously an error since WINE was most likely just overwriting some temporary buffer that FEX wasn't actually using. At startup, WINE uses this memory
region for some heap allocations and expects it to always be available.

The good thing is that this only happened on ARM devices that are configured to use a 48-bit virtual address size, and due to ASLR it was unlikely to
cause problems. Now in this situation, FEX makes sure to keep its allocations out of the way of x86-64 applications, ensuring that there isn't a
memory conflict.
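
To make the address-space arithmetic concrete: x86-64 Linux user space with 4-level paging spans 47 bits, while a 48-bit VA ARM64 host gives a process 48 bits. The Python sketch below computes the range that a guest can never reach on such a host. Treating that upper range as the place for the emulator's own allocations is an assumption for illustration, and the names are hypothetical. The release notes only state that FEX keeps its allocations out of the guest's way.

```python
from typing import Optional, Tuple

X86_64_USER_VA_BITS = 47  # x86-64 user space with 4-level paging


def host_private_range(host_va_bits: int) -> Optional[Tuple[int, int]]:
    """Return the half-open range of host addresses a 47-bit guest cannot reach."""
    guest_limit = 1 << X86_64_USER_VA_BITS
    host_limit = 1 << host_va_bits
    if host_limit <= guest_limit:
        return None  # no spare space above the guest's range
    return guest_limit, host_limit


def is_guest_reachable(addr: int) -> bool:
    """True if a 47-bit x86-64 user-space pointer could refer to addr."""
    return 0 <= addr < (1 << X86_64_USER_VA_BITS)
```

On a 48-bit host this yields a private range as large as the entire guest address space, so emulator-side allocations placed there cannot collide with anything the guest application or WINE's preloader maps.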

Minor optimization for x87 address modes

While it is hard to say whether this will be visible in most cases, this showed up in the audio thread of Crysis 2: Maximum Edition, which was consuming 100% CPU time. This
game's audio thread is so heavy that it completely maxes out a CPU core and drops audio samples. We noticed that for address modes with a small
immediate, we weren't using the ARM instructions that can encode the same addressing directly. With that fixed, the game is still dropping a ton of audio samples,
but hopefully we have more room for improvements there.
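
For context on what "small immediate" means on the ARM side: AArch64 loads and stores can encode a displacement either as an unsigned 12-bit immediate scaled by the access size (the `LDR`/`STR` unsigned-offset form) or as an unscaled signed 9-bit immediate (the `LDUR`/`STUR` form). The Python sketch below checks whether a displacement fits either encoding. Which forms FEX's x87 optimization actually selects is not stated in these notes, so treat the function names and the combined check as assumptions.

```python
def fits_scaled_unsigned_imm12(offset: int, access_size: int) -> bool:
    """LDR/STR unsigned-offset form: multiple of access size, scaled value in 0..4095."""
    return offset >= 0 and offset % access_size == 0 and offset // access_size <= 4095


def fits_unscaled_signed_imm9(offset: int) -> bool:
    """LDUR/STUR form: any byte offset in -256..255."""
    return -256 <= offset <= 255


def can_fold_displacement(offset: int, access_size: int) -> bool:
    """True if the displacement can live inside a single load/store instruction."""
    return fits_scaled_unsigned_imm12(offset, access_size) or fits_unscaled_signed_imm9(offset)
```

When the displacement fits, the address computation needs no separate add instruction, which matters in a hot loop such as an audio mixer that issues many x87 loads and stores.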

Raw Changes

FEX Release FEX-2502

  • ARM64EC

    • Set EC_ENTRY_CPUAREA_REG at inline SMC dispatcher entry (20b00ec)
  • Allocator

  • Arm64

    • Fix bitmask used to match load/store instructions (0019bde)
  • CMake

    • Default enable clang thunk building (1becbab)
    • Check for compatible Catch2 versions (c892899)
    • Simplify vixl-related options (d01db8f)
    • Compile vixl if ENABLE_VIXL_DISASSEMBLER is set (a52dd71)
  • CPUBackend

    • Make guest RIP reconstruction offsets signed (981eea6)
  • CPUID

    • Remove duplicated ARM Neoverse-N2 (de431f1)
  • CodeEmitter

  • FEX

    • Allocate a VMA allocator when running on a 48-bit VA (c2f8b5b)
  • FEXConfig

    • Fixes instcount not being editable by keyboard (90db948)
  • FEXCore

    • Don't copy IR after compilation (d39dea1)

    • JIT

      • Encode the JITRIPReconstructionEntries using variable length integer (e94643d)
    • vl64

  • FEXLoader

    • ELFCodeLoader
      • Be robust against zero length environment variables (2a4c169)
  • FEXServer

  • Frontend

    • Split blocks at jump target boundaries (eaddd44)
    • Disallow cross-page branches in multiblock (3b8c368)
    • End multiblocks early after hitting 2 consecutive null bytes (48c03d7)
    • Stop all decoding once MaxInst/DecodeBufferSize is reached (f635a12)
  • GdbServer

    • Implement new netstream that can be interrupted (d2a56eb)
  • IRDumper

  • InstCountCI

    • Hardcode xchg instructions (f4c9275)
  • InstcountCI

    • Adds a hotblock for 32-bit TSO testing (e8cd655)
  • IoctlEmulation

  • JIT

    • Avoid OOB EC bitmap checks in ExitFunction (8c02bd4)
  • LinuxEmulation

    • Ensure syscall wrapper declaration has CpuStateFrame as the first argument (8d6a43d)
  • LinuxSyscalls

    • Update for new v6.13 syscalls (ac1b6d9)
  • NFC

  • Profiler

    • Setup for usage on Windows (8e2b4a3)

    • GPUViz

      • Stop allocating memory (fb2a59a)
      • Fixes typo in instant TraceObject (ae69c4d)
  • Scripts

    • Fix indentation of changelog items (276e9ad)
  • SignalDelegator

    • Protect first page of the altstack (62cfc26)
  • WINE

    • Fixes FEX_PORTABLE usage (9af52fb)
  • Windows

  • Misc

    • Add option StartupSleepProcName (55cbb0b)
    • Revert "Enable RA of SVE Predicate Registers" (db7fb56)
    • Skip 3DNow tests with precision issues (2bed744)
    • Predicate cache alternative implementation (b148cc6)
    • Library Forwarding/wayland: Fix regression caused by erroneous format (bd1bca2)
    • Update upload-artifact action to v4 (e9bd037)
    • Pass through FPRs argument (adff4bb)
    • Fix slight inaccuracy in test 3_F7_05_2 (42c931c)
    • Fix crashes in Paranoid TSO mode (56c95e3)
    • Fix warnings about unused objects (840f306)
    • Drop assume-asserting logging macros (9def89d)
    • x87 fst/fld optimization for different addrmodes (3f788eb)
    • Revert pred cache (8cfc016)
    • Print arg type f80Bit (8c94b78)
  • cmake

    • Adds some missing STATIC qualifiers (2293d30)