Skip to content

File deletion is slow #130

@penberg

Description

@penberg

...which is weird because writing out a whiteout should be fast.

For a find -depth -type d -name node_modules -exec rm -rf {} \; workload takes 1 second on native filesystem to complete, but takes almost 30 seconds with agentfs.

The strace summary of the command in agentfs run sandbox looks like this:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.98    1.151167        2546       452           wait4
 24.40    0.551031          19     27917           unlinkat
  4.78    0.107826           9     11738           getdents64
  3.67    0.082846           3     20762           close
  3.65    0.082435           3     23556           fcntl
  2.88    0.065141           8      7568           openat
  2.71    0.061180           9      6312           newfstatat
  1.70    0.038493           3     10557           fstat
  1.70    0.038439          85       452           clone
  1.52    0.034374           5      5877      5424 execve
  0.60    0.013492           3      3638           mmap
  0.23    0.005167           3      1362           mprotect
  0.19    0.004258           3      1356           rt_sigprocmask
  0.19    0.004232           3      1403           brk
  0.14    0.003250           7       453           munmap
  0.12    0.002693           6       448           fstatfs
  0.09    0.001990           2       906           pread64
  0.06    0.001382           3       453           ioctl
  0.05    0.001202           2       453           getrandom
  0.05    0.001107           2       454           prlimit64
  0.05    0.001055           1       905           set_robust_list
  0.05    0.001045           2       454       453 access
  0.05    0.001044           2       456           read
  0.04    0.001011           2       453           arch_prctl
  0.04    0.001002           2       453           rseq
  0.04    0.001000           2       453           set_tid_address
  0.01    0.000148           0       453           fchdir
  0.00    0.000004           4         1           futex
  0.00    0.000003           3         1           rt_sigaction
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         2           statfs
------ ----------- ----------- --------- --------- ----------------
100.00    2.258017          17    129749      5877 total

It has some overhead compared to native filesystem, but only 2 seconds is spent on doing something, with majority of time unaccounted for.

However, tracing the agentfs FUSE daemon reveals what is going on:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 91.93   45.147216          26   1708232     16343 futex
  4.91    2.410616           6    379098           statx
  0.71    0.350193           2    158794           pread64
  0.62    0.304792           2    107531           writev
  0.50    0.245544           1    132323           read
  0.46    0.226943           5     41457     23978 openat
  0.33    0.164057           5     29186           pwrite64
  0.21    0.104429           2     35148           getdents64
  0.16    0.080203           4     17479           close
  0.07    0.034940           1     17478           fstat
  0.06    0.029388          13      2107           sched_yield
  0.02    0.008469           3      2249           madvise
  0.00    0.002008           5       336           pwritev
  0.00    0.000049           9         5           mprotect
  0.00    0.000047           7         6           mmap
  0.00    0.000047          15         3         1 prctl
  0.00    0.000045          22         2           clone3
  0.00    0.000033           5         6           rt_sigprocmask
  0.00    0.000033          16         2           rseq
  0.00    0.000028           7         4           sigaltstack
  0.00    0.000017           8         2           sched_getaffinity
  0.00    0.000015           7         2           getrandom
  0.00    0.000014           7         2           gettid
  0.00    0.000004           2         2           set_robust_list
------ ----------- ----------- --------- --------- ------------------
100.00   49.109130          18   2631454     40322 total

It's lock contention in the deamon, likely from all the block_on() calls.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions