
Performance: two_percent is faster and more efficient than fzf #561

Open
kimono-koans opened this issue Mar 4, 2024 · 21 comments

@kimono-koans

kimono-koans commented Mar 4, 2024

If the reason you're not using skim is raw performance, my branch, two_percent, is faster and more efficient than fzf, and uses less memory, for largish inputs (40MB):

> hyperfine -i -w 3 "sk --algo=simple --no-sort --query=hopscotchbubble --exit-0 < ../countwords/kjvbible_x10.txt" "fzf --algo=v1 --no-sort --query=hopscotchbubble --exit-0 < ../countwords/kjvbible_x10.txt" "sk --algo=skim_v2 --no-sort --query=hopscotchbubble --exit-0 < ../countwords/kjvbible_x10.txt"
Benchmark 1: sk --algo=simple --no-sort --query=hopscotchbubble --exit-0 < ../countwords/kjvbible_x10.txt
  Time (mean ± σ):      81.2 ms ±   2.0 ms    [User: 205.2 ms, System: 18.6 ms]
  Range (min … max):    78.7 ms …  86.2 ms    35 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: fzf --algo=v1 --no-sort --query=hopscotchbubble --exit-0 < ../countwords/kjvbible_x10.txt
  Time (mean ± σ):     125.5 ms ±   2.8 ms    [User: 229.7 ms, System: 72.7 ms]
  Range (min … max):   116.7 ms … 130.8 ms    22 runs

  Warning: Ignoring non-zero exit code.
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 3: sk --algo=skim_v2 --no-sort --query=hopscotchbubble --exit-0 < ../countwords/kjvbible_x10.txt
  Time (mean ± σ):     135.2 ms ±   2.0 ms    [User: 797.5 ms, System: 18.3 ms]
  Range (min … max):   131.6 ms … 140.4 ms    21 runs

  Warning: Ignoring non-zero exit code.

Summary
  sk --algo=simple --no-sort --query=hopscotchbubble --exit-0 < ../countwords/kjvbible_x10.txt ran
    1.55 ± 0.05 times faster than fzf --algo=v1 --no-sort --query=hopscotchbubble --exit-0 < ../countwords/kjvbible_x10.txt
    1.66 ± 0.05 times faster than sk --algo=skim_v2 --no-sort --query=hopscotchbubble --exit-0 < ../countwords/kjvbible_x10.txt

If anyone else is interested in performance improvements, and refactoring to reduce long term memory usage, I've been building on skim actively and I am eager to work with others who are similarly inclined.

Re: @lotabout's wonderful comment on performance:

The overhead of skim's item is 88B while fzf's item is 56B.

For ordinary text inputs, I'm using Arc<Box<str>> which I think is 32B? If there is some way to use Arc<str>, that would be even better but I can't seem to make it work re: traits. Re: my 10x duplicated KJV Bible 43M-ish corpus, the memory usage is about 90M on my Mac.
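For context, a minimal sketch of the handle sizes involved (assuming a 64-bit target; this is illustrative, not skim's actual item layout), plus building an Arc&lt;str&gt; directly via its From impls:

```rust
use std::mem::size_of;
use std::sync::Arc;

fn main() {
    // Arc<Box<str>> is a thin pointer to the refcounted block, which in
    // turn holds a boxed fat pointer to the string bytes; Arc<str> is
    // itself a fat pointer straight at the refcounted string data, so it
    // saves one level of indirection per item.
    assert_eq!(size_of::<Arc<Box<str>>>(), 8);
    assert_eq!(size_of::<Arc<str>>(), 16);

    // Arc<str> can be constructed directly from &str or String:
    let a: Arc<str> = Arc::from("hopscotch");
    let b: Arc<str> = Arc::from(String::from("bubble"));
    assert_eq!(&*a, "hopscotch");
    assert_eq!(&*b, "bubble");
}
```

The trait friction mentioned is plausibly because `str` is unsized; a thin newtype wrapper around `Arc<str>` is one common workaround when a trait bound demands a sized type.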

In my experience, skim's memory usage is around 1.5x ~ 2x of fzf's.
A fuzzy finder is an excellent use case for shared memory, but Rust has limited support for it.

On ingest, I have a method for deduplicating lines. This is a significant performance win for inputs with lots of empty or repeated lines.
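The dedup-on-ingest idea can be sketched like this (a hypothetical `dedup_lines` helper, not the actual two_percent code): every repeated line, including runs of empty ones, ends up sharing a single allocation:

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Intern each distinct line so duplicates share one allocation.
fn dedup_lines<'a, I: IntoIterator<Item = &'a str>>(lines: I) -> Vec<Arc<str>> {
    let mut seen: HashMap<&'a str, Arc<str>> = HashMap::new();
    lines
        .into_iter()
        .map(|line| {
            // First sighting allocates; later duplicates just bump a refcount.
            seen.entry(line).or_insert_with(|| Arc::from(line)).clone()
        })
        .collect()
}

fn main() {
    let items = dedup_lines(["foo", "", "foo", "", "bar"]);
    assert_eq!(items.len(), 5);
    // Both "foo" items (and both empty lines) point at the same allocation.
    assert!(Arc::ptr_eq(&items[0], &items[2]));
    assert!(Arc::ptr_eq(&items[1], &items[3]));
}
```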

Both skim and fzf's matching algorithm are O(n)

Algorithm switching is broken in the latest version. This is fixed in mine. I have a simple algorithm which is much closer to the fzf v1 algo used for large inputs. See above. You too can now write your own super simple algo, or improve mine!

But the size of structure to store matching result is different (skim is bad at it).

I've made a few changes which may help.

Performance of crossbeam's channel seems to be slower than Go's (not sure)? It's claimed to be fast, but it's still one of the bottlenecks in skim's use case.

My experience has been that ingest was not reading in enough bytes at once, and other threads were spin-waiting on the lock.

@nrdxp

nrdxp commented Apr 1, 2024

Just curious. Is there a reason why you can't/won't merge these changes back into skim?

@kimono-koans
Author

Just curious. Is there a reason why you can't/won't merge these changes back into skim?

As I stated in my post:

If anyone else is interested in performance improvements, and refactoring to reduce long term memory usage, I've been building on skim actively and I am eager to work with others who are similarly inclined.

My experience has been that this project is not actively maintained. I think I still have PRs outstanding: https://github.com/lotabout/skim/pulls/kimono-koans.

If anyone wants to assist, and contribute here, I'd help.

If no one else is so inclined, right now, I'll just keep developing in my own fork/branch.

@d3-X-t3r

Just curious, what's the reason for the name "two_percent"? It's kinda hard to remember (as in, what it does) and also to recommend to folks...

@kimono-koans
Author

kimono-koans commented Apr 30, 2024

In the US, at least, we sell milk with different percentages of milk fat. skim has 0% milk fat. Whole milk has something like 4% milk fat. A popular option is called 2% milk or 2%.

If you install cargo install two_percent or cargo deb --install, the binary/command will be installed as sk.

@litoj

litoj commented May 4, 2024

Have you also implemented fzf-like color option parsing? I am very used to having some elements of the ui in bold, but that doesn't seem to be possible with the current implementation.

Edit: I tried cargo install two_percent, but the resulting executable just shows the loading indicator forever without actually loading anything.

@kimono-koans
Author

Have you also implemented fzf-like color option parsing? I am very used to having some elements of the ui in bold, but that doesn't seem to be possible with the current implementation.

No, I haven't done any work on this, but I already see bold colors in my view? See: https://asciinema.org/a/637475

@LoricAndre
Contributor

Hi @kimono-koans, we'd be happy to work with you to merge your improvements back into skim! Please tell us if you are interested.

@litoj

litoj commented Nov 8, 2024

No, I haven't done any work re this, but I already see bold colors in my view?

Maybe the white looks a little bit whiter, but that isn't bold as in bold.

@kimono-koans
Author

kimono-koans commented Nov 10, 2024

Hi @kimono-koans, we'd be happy to work with you to merge your improvements back into skim! Please tell us if you are interested.

Once you start merging new things, I'll perhaps work on some commits for upstream.

@LoricAndre
Contributor

With this release, we should have some "quiet time" to start progressively merging your branch if you'd like

@junegunn

junegunn commented Dec 20, 2024

I don't see much point in measuring the performance of the v1 algorithm, because it is rarely used in practice. fzf does switch to it if an item is very long, but only for that item; the rest of the items in the list are processed using the default algorithm. This switching happens when M * N > 102400, where M is the length of the query and N is the length of an item. So for the query hopscotchbubble, a single item would need to be longer than about 6.8KB for v1 to be used, and there are few such items in practice.
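The per-item fallback described here can be sketched as follows (the constant and function name are taken from the comment's own figures, not from fzf's source): with the 15-char query hopscotchbubble, the product crosses 102400 once an item passes roughly 102400 / 15 ≈ 6,827 chars:

```rust
// Threshold from the comment above: switch to the linear v1 algorithm
// only for items where query_len * item_len exceeds this product.
const V1_THRESHOLD: usize = 102_400;

fn uses_v1(query: &str, item: &str) -> bool {
    query.chars().count() * item.chars().count() > V1_THRESHOLD
}

fn main() {
    let query = "hopscotchbubble"; // 15 chars, so the cutoff is ~6,827 chars
    // An ordinary line stays on the default (v2) algorithm...
    assert!(!uses_v1(query, &"x".repeat(6_000)));
    // ...only an unusually long item trips the v1 fallback.
    assert!(uses_v1(query, &"x".repeat(7_000)));
}
```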

Also, please consider benchmarking against the latest version of fzf because the performance of fzf has improved a lot.

@kimono-koans
Author

kimono-koans commented Feb 16, 2025

Pretty cool that we got @junegunn to comment, huh?

Also, please consider benchmarking against the latest version of fzf because the performance of fzf has improved a lot.

This is very true as you'll see below. fzf is incredibly fast.

To @LoricAndre and skim maintainers:

As you can see below, the last entry, two_percent, is still way faster than skim, the first, and, for the moment, fzf, the second. This is on a Mac M3 with a 45MB dataset (the King James Bible concatenated 10x). On an old Intel machine, the difference over fzf is more pronounced: two_percent is up to 20% faster.

Note, I am using my own rudimentary fuzzy find algorithm as well, but you can see the more general performance improvements in the System: numbers.

I'd be pleased to begin merging my code back into skim. However, I want to make sure we are on the same page.

One issue I see is that my methods are/were pretty hacky compared to the right way to reduce memory usage here, which is to use a string interner, but that would probably require an API break. If you're looking at doing something like that, I'll take it no further.

Next, my priorities may be different from yours/the project's. One goal is to reduce long-term memory usage and allow for multiple invocations of skim as a library during a running program. That means removing as many 'static lifetimes as possible, etc.

So, philosophically, I don't know, but I hope we are on the same page. If not, I'd be pleased to let others merge my code as they see fit, and/or assist by pointing them to where the bodies are buried.

hyperfine -i -m 20 -w 3 "./sk --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt" "fzf --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt" "sk --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt"

Benchmark 1: ./sk --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
  Time (mean ± σ):      1.032 s ±  0.034 s    [User: 2.988 s, System: 3.376 s]
  Range (min … max):    0.985 s …  1.101 s    20 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: fzf --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
  Time (mean ± σ):      70.8 ms ±  12.0 ms    [User: 148.2 ms, System: 65.0 ms]
  Range (min … max):    60.0 ms …  85.5 ms    48 runs

  Warning: Ignoring non-zero exit code.
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 3: sk --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
  Time (mean ± σ):      65.9 ms ±   6.6 ms    [User: 117.4 ms, System: 19.0 ms]
  Range (min … max):    51.9 ms …  77.5 ms    38 runs

  Warning: Ignoring non-zero exit code.

Summary
  sk --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt ran
    1.07 ± 0.21 times faster than fzf --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
   15.66 ± 1.65 times faster than ./sk --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt

@junegunn

@kimono-koans Thanks for following up. Out of curiosity, I cloned your repository, built it (cargo build --release), and benchmarked it with the latest version of fzf. I don't have the input you tested with, so I just multiplied the find result of my local drive so that it's large enough.

unset FZF_DEFAULT_OPTS
QUERY=hopscotchbubblehopscotchbubble
for input in github8 github32; do
  du -sh $input
  wc -l $input
  hyperfine -i -m 20 -w 3 "./sk --query=$QUERY --exit-0 < $input" "fzf --query=$QUERY --exit-0 < $input"
done
 89M    github8
  823864 github8
Benchmark 1: ./sk --query=hopscotchbubblehopscotchbubble --exit-0 < github8
  Time (mean ± σ):     143.5 ms ±   5.1 ms    [User: 618.1 ms, System: 22.3 ms]
  Range (min … max):   136.6 ms … 157.0 ms    20 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: fzf --query=hopscotchbubblehopscotchbubble --exit-0 < github8
  Time (mean ± σ):      85.0 ms ±   2.9 ms    [User: 187.0 ms, System: 38.4 ms]
  Range (min … max):    69.2 ms …  87.1 ms    34 runs

  Warning: Ignoring non-zero exit code.
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  fzf --query=hopscotchbubblehopscotchbubble --exit-0 < github8 ran
    1.69 ± 0.08 times faster than ./sk --query=hopscotchbubblehopscotchbubble --exit-0 < github8
358M    github32
 3295456 github32
Benchmark 1: ./sk --query=hopscotchbubblehopscotchbubble --exit-0 < github32
  Time (mean ± σ):     520.0 ms ±   9.0 ms    [User: 2476.5 ms, System: 67.7 ms]
  Range (min … max):   505.5 ms … 537.3 ms    20 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: fzf --query=hopscotchbubblehopscotchbubble --exit-0 < github32
  Time (mean ± σ):     260.4 ms ±   2.9 ms    [User: 775.8 ms, System: 156.1 ms]
  Range (min … max):   257.3 ms … 270.6 ms    20 runs

  Warning: Ignoring non-zero exit code.

Summary
  fzf --query=hopscotchbubblehopscotchbubble --exit-0 < github32 ran
    2.00 ± 0.04 times faster than ./sk --query=hopscotchbubblehopscotchbubble --exit-0 < github32

The result was quite different from yours. Maybe it depends on the characteristics of the input and the query? Or am I doing something wrong? I noticed skim was a bit unstable in that it would sometimes hang; I had to restart hyperfine multiple times to get the final result.

@kimono-koans
Author

kimono-koans commented Feb 16, 2025

The result was quite different from yours. Maybe it depends on the characteristics of the input and the query? Or am I doing something wrong? I noticed skim was a bit unstable in that it would sometimes hang; I had to restart hyperfine multiple times to get the final result.

@junegunn: I didn't want to call you, sir! I think all of us respect what you've done with fzf and we are all impressed with exactly how fast it is. It is the gold standard. I refused to use skim as an everyday driver, until I seriously started tinkering with it, because fzf is so good.

But... so long as we're just having fun, my branch is called vendored. You'll want to use that branch for benchmarks. I have Debian builds of that branch here: https://github.com/kimono-koans/ppa

My test corpus is: https://github.com/benhoyt/countwords/blob/master/kjvbible.txt

You'll need to concatenate it together 10x: cat kjvbible.txt kjvbible.txt kjvbible.txt kjvbible.txt kjvbible.txt kjvbible.txt kjvbible.txt kjvbible.txt kjvbible.txt kjvbible.txt >kjvbible_x10.txt

No, I don't think the above is a reasonable test of what a fuzzy finder should do, but the Rust version must go faster, so I've tried to make it do so. While I have you here, could you give a brief summary of your v2 algo and what makes it so good? I'm very interested in how it is so fast.

Thank you.

@LoricAndre
Contributor

I'd love to discuss your changes and how we can make skim faster. I'm not sure I want hacky changes to be merged here, but skim is still in v0 and we can allow ourselves some minor API changes (as was the case for the first few PRs when I took over).
I am still working on and off on the ratatui rewrite (sorry this is taking so long; the way tuikit and ratatui are architected is different enough that some of the internal logic needs work, and it is still not as stable/responsive as I'd like with large inputs), but we could try adding the most impactful changes you have made in parallel.
If you could compile these "P0 changes" into a PR, even if it is breaking, I'd love to discuss the changes directly there and I'm sure we can find common ground.

@kimono-koans
Author

kimono-koans commented Feb 16, 2025

I'm not sure I want hacky changes to be merged here, but skim is still in v0 and we can allow ourselves some minor API changes (as was the case for the first few PRs when I took over).

When I say "hacky", I mean adding a string interner is the "correct" solution to our problem, but I can't imagine implementing it without blowing up much of skim, including the API.

it is still not as stable/responsive as I'd like with large inputs

Some of my prior PRs addressed this. See specifically: #505

That's another thing -- re: my prior PRs, although most are performance focused, I did add a few features, and I'll need the --disabled feature added to the CLI and the SkimOptions API: https://github.com/skim-rs/skim/pulls?q=is%3Apr+author%3Akimono-koans+is%3Aclosed

This shouldn't be too much of a problem because this is a feature fzf has and skim lacks: https://www.mankier.com/1/fzf#--disabled

At the end of this, I need skim to be API compatible with httm. See: https://github.com/kimono-koans/httm

@LoricAndre
Contributor

About the disabled option, the PR was missing both tests and documentation. Would you be able to reopen it with these elements against the current version of skim? Your changes in run_with and caller should be mostly compatible, but the way the options are defined has changed since then.

@kimono-koans
Author

kimono-koans commented Feb 18, 2025

@junegunn

I noticed skim was a bit unstable that it would sometimes hang, I had to restart hyperfine multiple times to get the final result.

Seems I had a deadlock caused by one thread trying to read an atomic bool, while I updated it from another thread. Had no idea this was even possible in Rust!
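For illustration, here is a sound version of the pattern involved, a hedged sketch rather than skim's actual code: the reader performs an atomic load on every iteration and yields to the scheduler. Had the loop read a plain (non-atomic) flag instead, the compiler could legally hoist the read out of the loop, leaving the reader spinning forever, which is one way a hang like this can arise:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Spin until the flag is set. The Acquire load each iteration forces a
/// fresh read, so the writer's Release store is guaranteed to be seen.
fn spin_until_set(flag: &AtomicBool) {
    while !flag.load(Ordering::Acquire) {
        thread::yield_now(); // let the writer make progress
    }
}

fn main() {
    let done = Arc::new(AtomicBool::new(false));
    let writer = {
        let done = Arc::clone(&done);
        thread::spawn(move || done.store(true, Ordering::Release))
    };
    spin_until_set(&done); // terminates once the writer's store lands
    writer.join().unwrap();
    assert!(done.load(Ordering::Relaxed));
}
```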

I've created a new version: https://github.com/kimono-koans/two_percent/releases/tag/0.12.10

This is against the default skimv2 algo, fzf 0.60.0 and a git clone of the skim repo:

> hyperfine -i -w 3 "sk --algo=simple --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt" "sk --algo=skimv2 --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt" "fzf --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt" "./target/release/sk --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt"
Benchmark 1: sk --algo=simple --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
  Time (mean ± σ):      63.7 ms ±   6.8 ms    [User: 117.0 ms, System: 20.1 ms]
  Range (min … max):    49.6 ms …  75.3 ms    39 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: sk --algo=skimv2 --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
  Time (mean ± σ):     108.1 ms ±   5.5 ms    [User: 587.7 ms, System: 21.6 ms]
  Range (min … max):    96.3 ms … 119.4 ms    28 runs

  Warning: Ignoring non-zero exit code.

Benchmark 3: fzf --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
  Time (mean ± σ):      71.1 ms ±  12.1 ms    [User: 148.4 ms, System: 65.2 ms]
  Range (min … max):    59.7 ms …  85.3 ms    48 runs

  Warning: Ignoring non-zero exit code.
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 4: ./target/release/sk --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
  Time (mean ± σ):      1.022 s ±  0.034 s    [User: 2.916 s, System: 3.084 s]
  Range (min … max):    0.985 s …  1.085 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  sk --algo=simple --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt ran
    1.12 ± 0.22 times faster than fzf --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
    1.70 ± 0.20 times faster than sk --algo=skimv2 --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt
   16.04 ± 1.79 times faster than ./target/release/sk --query=hopscotchbubble --exit-0 < ~/Programming/countwords/kjvbible_x10.txt

Thanks!

@kimono-koans
Author

kimono-koans commented Feb 18, 2025

@LoricAndre

About the disabled option, the PR was missing both tests and documentation. Would you be able to reopen it with these elements against the current version of skim? Your changes in run_with and caller should be mostly compatible, but the way the options are defined has changed since then.

Pleased to resubmit the disabled feature with tests and docs.

@junegunn

junegunn commented Feb 20, 2025

@kimono-koans

I think all of us respect what you've done with fzf and we are all impressed with exactly how fast it is.

Thank you.

my branch is called vendored

vendored is the default branch, so I did build skim from it.

Seems I had a deadlock caused by one thread trying to read an atomic bool, while I updated it from another thread. Had no idea this was even possible in Rust!

I can confirm that it no longer hangs.

My test corpus is: https://github.com/benhoyt/countwords/blob/master/kjvbible.txt

Thanks for the clarification. But even with that, I'm getting vastly different numbers.

~/github/two_percent/target/release/sk --version
fzf --version

unset FZF_DEFAULT_OPTS
QUERY=hopscotchbubblehopscotchbubble
for input in kjvbible_x10.txt github16 github32; do
  echo ----------
  wc $input
  echo ----------
  hyperfine -i -m 20 -w 3 \
    "~/github/two_percent/target/release/sk --query=$QUERY --exit-0 < $input" \
    "fzf --query=$QUERY --exit-0 < $input"
done
sk 0.12.10
0.60.0 (3347d61)
----------
  998170 8211330 43325060 kjvbible_x10.txt
----------
Benchmark 1: ~/github/two_percent/target/release/sk --query=hopscotchbubblehopscotchbubble --exit-0 < kjvbible_x10.txt
  Time (mean ± σ):     126.1 ms ±  10.3 ms    [User: 517.1 ms, System: 16.0 ms]
  Range (min … max):   114.9 ms … 151.0 ms    21 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: fzf --query=hopscotchbubblehopscotchbubble --exit-0 < kjvbible_x10.txt
  Time (mean ± σ):      73.1 ms ±  12.0 ms    [User: 142.0 ms, System: 31.7 ms]
  Range (min … max):    58.7 ms …  85.5 ms    49 runs

  Warning: Ignoring non-zero exit code.

Summary
  fzf --query=hopscotchbubblehopscotchbubble --exit-0 < kjvbible_x10.txt ran
    1.73 ± 0.32 times faster than ~/github/two_percent/target/release/sk --query=hopscotchbubblehopscotchbubble --exit-0 < kjvbible_x10.txt
----------
 1647728 1648848 187676288 github16
----------
Benchmark 1: ~/github/two_percent/target/release/sk --query=hopscotchbubblehopscotchbubble --exit-0 < github16
  Time (mean ± σ):     277.1 ms ±   9.3 ms    [User: 1227.8 ms, System: 41.6 ms]
  Range (min … max):   262.1 ms … 299.3 ms    20 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: fzf --query=hopscotchbubblehopscotchbubble --exit-0 < github16
  Time (mean ± σ):     138.1 ms ±   1.1 ms    [User: 372.9 ms, System: 73.4 ms]
  Range (min … max):   136.7 ms … 141.9 ms    21 runs

  Warning: Ignoring non-zero exit code.
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  fzf --query=hopscotchbubblehopscotchbubble --exit-0 < github16 ran
    2.01 ± 0.07 times faster than ~/github/two_percent/target/release/sk --query=hopscotchbubblehopscotchbubble --exit-0 < github16
----------
 3295456 3297696 375352576 github32
----------
Benchmark 1: ~/github/two_percent/target/release/sk --query=hopscotchbubblehopscotchbubble --exit-0 < github32
  Time (mean ± σ):     537.7 ms ±  19.8 ms    [User: 2433.6 ms, System: 77.1 ms]
  Range (min … max):   512.9 ms … 588.6 ms    20 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: fzf --query=hopscotchbubblehopscotchbubble --exit-0 < github32
  Time (mean ± σ):     258.8 ms ±   1.9 ms    [User: 752.8 ms, System: 152.5 ms]
  Range (min … max):   256.5 ms … 264.8 ms    20 runs

  Warning: Ignoring non-zero exit code.
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  fzf --query=hopscotchbubblehopscotchbubble --exit-0 < github32 ran
    2.08 ± 0.08 times faster than ~/github/two_percent/target/release/sk --query=hopscotchbubblehopscotchbubble --exit-0 < github32

the Rust version must go faster

I'm not convinced a Rust implementation "must be faster" than a highly optimized Go implementation, when the latter was carefully crafted to minimize GC overhead, avoid memory copies, and leverage SIMD operations, etc. PGO can also make some difference.

fzf implements a modified version of the Smith-Waterman algorithm. It's an O(nm) DP algorithm as I mentioned above, but in fzf, it's O(n) when the pattern doesn't match the item.
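The shape described can be sketched as follows (a toy scorer with made-up constants, not fzf's actual implementation): an O(n) subsequence pre-pass rejects non-matching items before any O(nm) local-alignment DP runs:

```rust
/// Toy fuzzy scorer: O(n) rejection pre-pass, then a Smith-Waterman-style
/// local alignment (match +2, mismatch/gap -1, scores floored at 0).
fn fuzzy_score(item: &str, query: &str) -> Option<i32> {
    let it: Vec<char> = item.chars().collect();
    let qu: Vec<char> = query.chars().collect();

    // O(n) pre-pass: if the query isn't even a subsequence, bail early.
    let mut qi = 0;
    for &c in &it {
        if qi < qu.len() && c.eq_ignore_ascii_case(&qu[qi]) {
            qi += 1;
        }
    }
    if qi < qu.len() {
        return None;
    }

    // O(n*m) DP, only for items that passed the pre-pass.
    let mut prev = vec![0i32; it.len() + 1];
    let mut best = 0;
    for &q in &qu {
        let mut cur = vec![0i32; it.len() + 1];
        for (i, &c) in it.iter().enumerate() {
            let diag = prev[i] + if c.eq_ignore_ascii_case(&q) { 2 } else { -1 };
            cur[i + 1] = diag.max(prev[i + 1] - 1).max(cur[i] - 1).max(0);
            best = best.max(cur[i + 1]);
        }
        prev = cur;
    }
    Some(best)
}

fn main() {
    assert_eq!(fuzzy_score("abc", "abc"), Some(6)); // full match: 3 chars * +2
    assert_eq!(fuzzy_score("abc", "zzz"), None);    // rejected in the O(n) pass
}
```

The early-out is what keeps the common case cheap: on a large input, most items fail the query and never touch the quadratic table.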

But anyway, I think we're at the point where the performance arms race doesn't matter much. Because they're all good enough for most everyday use cases. These days, I've mostly focused on the usability and extensibility aspects of fzf, not so much on performance.

@kimono-koans
Author

kimono-koans commented Mar 7, 2025

Thanks for the clarification. But even with that, I'm getting vastly different numbers.

You have to select a different search algorithm, or set it via an env var (SKIM_ALGORITHM=simple), as I may have done above, which may have misled you (sorry!).

➜ hyperfine -m 50 -i -w 3 "sk --algo=simple --query=hopscotchbubble --exit-0 < ~/junk/kjvbible_x10.txt" "fzf --query=hopscotchbubble --exit-0 < ~/junk/kjvbible_x10.txt" "sk --algo=skimv2 --query=hopscotchbubble --exit-0 < ~/junk/kjvbible_x10.txt"
Benchmark 1: sk --algo=simple --query=hopscotchbubble --exit-0 < ~/junk/kjvbible_x10.txt
  Time (mean ± σ):     446.5 ms ±  16.5 ms    [User: 578.9 ms, System: 182.6 ms]
  Range (min … max):   419.5 ms … 473.3 ms    50 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: fzf --query=hopscotchbubble --exit-0 < ~/junk/kjvbible_x10.txt
  Time (mean ± σ):     513.2 ms ±   5.2 ms    [User: 760.0 ms, System: 193.8 ms]
  Range (min … max):   499.6 ms … 528.7 ms    50 runs

  Warning: Ignoring non-zero exit code.

Benchmark 3: sk --algo=skimv2 --query=hopscotchbubble --exit-0 < ~/junk/kjvbible_x10.txt
  Time (mean ± σ):     723.5 ms ±  40.0 ms    [User: 1556.1 ms, System: 187.8 ms]
  Range (min … max):   672.3 ms … 916.3 ms    50 runs

  Warning: Ignoring non-zero exit code.

Summary
  sk --algo=simple --query=hopscotchbubble --exit-0 < ~/junk/kjvbible_x10.txt ran
    1.15 ± 0.04 times faster than fzf --query=hopscotchbubble --exit-0 < ~/junk/kjvbible_x10.txt
    1.62 ± 0.11 times faster than sk --algo=skimv2 --query=hopscotchbubble --exit-0 < ~/junk/kjvbible_x10.txt

the Rust version must go faster

I'm not convinced a Rust implementation "must be faster" than a highly optimized Go implementation, when the latter was carefully crafted to minimize GC overhead, avoid memory copies, and leverage SIMD operations, etc. PGO can also make some difference.

Agreed. My statement was meant, again, mostly as a joke.

Although I do think it's important that any Rust fuzzy finder make an effort to go faster than skim, which is only within an order of magnitude of fzf.

But anyway, I think we're at the point where the performance arms race doesn't matter much. Because they're all good enough for most everyday use cases. These days, I've mostly focused on the usability and extensibility aspects of fzf, not so much on performance.

Agreed. fzf is still the gold standard. skim and two_percent are obviously still chasing fzf, but, as stated, I do think being faster is a good start for any Rust fuzzy finder.
