FEXCore: Optimize VPCMPISTRX implicit length calculation #4345

Sonicadvance1 · 2025-02-11T03:52:58Z

With ASIMD the implicit length calculation can be decently faster. In my microbenchmark this makes pcmpistri ~6% faster.

With #4324 this can be made even faster, since the incoming data can stay in vector registers, removing some of the umov overhead.
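For readers unfamiliar with the term: the "implicit length" of a pcmpistri/pcmpistrm operand is the index of the first zero element in the 16-byte register, or the full element count if no element is zero. The following is a minimal scalar sketch of that definition only, for reference. It is not the code from this PR, and `ImplicitLength` is a hypothetical name; the PR's ASIMD version computes the same value with vector instructions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical reference helper (not FEX code).
// T is the element type: std::uint8_t for byte strings,
// std::uint16_t for word strings.
// Returns the index of the first zero element, or the element
// count (16 or 8) when the register holds no zero element.
template <typename T>
std::size_t ImplicitLength(const std::array<std::uint8_t, 16>& reg) {
  static_assert(sizeof(T) == 1 || sizeof(T) == 2,
                "elements are bytes or 16-bit words");
  constexpr std::size_t kCount = 16 / sizeof(T);
  for (std::size_t i = 0; i < kCount; ++i) {
    T elem;
    // memcpy avoids aliasing and alignment problems when reading words.
    std::memcpy(&elem, reg.data() + i * sizeof(T), sizeof(T));
    if (elem == 0) {
      return i;
    }
  }
  return kCount;
}
```

A loop like this is the kind of per-element scalar work that a vector compare-against-zero can replace, which is where the speedup described above would come from.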

lioncash

Nice

Sonicadvance1 force-pushed the optimize_pcmpistri branch from 654211c to bd96ee9 on February 11, 2025 03:56

FEXCore: Optimize VPCMPISTRX implicit length calculation

d46722a

Sonicadvance1 force-pushed the optimize_pcmpistri branch from bd96ee9 to d46722a on February 11, 2025 04:17

lioncash approved these changes Feb 11, 2025

lioncash merged commit 2943cff into FEX-Emu:main Feb 11, 2025
12 checks passed

Sonicadvance1 deleted the optimize_pcmpistri branch February 11, 2025 17:52
