Massive memory usage for large file #96

Open
thmd opened this issue Aug 29, 2020 · 1 comment
Labels
C-bug Category: Something isn't working

Comments

thmd commented Aug 29, 2020

Running on Mac (I am unable to test on Linux right now), sd doesn't seem to produce any output until it has consumed the entire file. I'm not sure if that is why it eats up so much memory. It's insanely fast compared to sed for my use case, but unusable at the same time.

I started with `cat file | sd '.*start' '' > out.file` and also tried `sd -p '.*start' '' file > out.file`.

My input file was 100GB+, and in both versions sd keeps consuming memory, pushing the system tens of GB into swap (32GB MacBook Pro) while nothing is written to the output file. Monitoring bytes read, I can see that sd works much faster than sed, but it uses roughly 2-3x the memory of the amount read.

Is sd not geared for large files, am I using it wrong, or is this a bug?
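
The profile described above (no output until EOF, memory roughly 2-3x the bytes read) is consistent with a read-everything-then-replace approach. As a purely illustrative sketch (this is not sd's actual source), a whole-buffer replace using the regex crate holds both the full input and the replaced output in RAM before writing anything:

```rust
use std::io::{self, Read, Write};

use regex::Regex;

fn main() -> io::Result<()> {
    let re = Regex::new(r".*start").unwrap();

    // The entire stream is pulled into RAM before any output exists.
    let mut input = String::new();
    io::stdin().read_to_string(&mut input)?;

    // `replace_all` allocates a second, roughly input-sized buffer,
    // so peak memory is about 2x the bytes read for a 100GB input.
    let output = re.replace_all(&input, "");

    // Nothing reaches the out-file until both buffers exist.
    io::stdout().write_all(output.as_bytes())?;
    Ok(())
}
```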

chmln (Owner) commented Sep 4, 2020

@thmd sounds like this may be solved with buffered writing.
I pushed a possible solution to a branch; please try it out and let me know if it helps :)

https://github.com/chmln/sd/tree/buffered-write
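
For reference, here is a minimal sketch of the buffered-writing idea (illustrative only, not the code on the buffered-write branch): process the input line by line and wrap the writer in a `BufWriter`, so replaced text streams out as it is produced and memory stays flat regardless of file size.

```rust
use std::io::{self, BufRead, BufWriter, Write};

use regex::Regex;

fn main() -> io::Result<()> {
    let re = Regex::new(r".*start").unwrap();

    let stdin = io::stdin();
    let stdout = io::stdout();
    let mut out = BufWriter::new(stdout.lock());

    // Only one line is resident at a time; output is flushed in
    // large chunks by the BufWriter instead of per write call.
    for line in stdin.lock().lines() {
        let line = line?;
        writeln!(out, "{}", re.replace_all(&line, ""))?;
    }
    out.flush()
}
```

Note that per-line streaming is only safe when the pattern cannot match across newlines; that holds for `.*start` here, since `.` does not match `\n` in the regex crate by default.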
