[request] show number of files / directories #124

Closed
leeoniya opened this issue Aug 28, 2018 · 17 comments · Fixed by #591
Labels: feature, help wanted

Comments

@leeoniya

hey @styfle

would be cool to also show the number of files & directories created during npm i. this number is often in the 10k+ range, which is nuts. some filesystems are notoriously slow at creating huge filesystem trees, as well as deleting them (ehm, Windows).

@styfle
Owner

styfle commented Aug 30, 2018

Hi Leon, thanks for the suggestion!

I think this would be relatively easy to implement.

Where do you think this information should be displayed on the page?

There is another request (#87) to get gzip transfer size and I kinda got stuck on how to display this information without making the UI too busy.

@styfle styfle added the feature label Aug 31, 2018
@leeoniya
Author

not sure if you're trolling or actually serious :) surely there's some room to squeeze this info in.

[image: capture]

@styfle
Owner

styfle commented Aug 31, 2018

I'm serious.

The design is minimal and I would like to keep it that way.

Your screenshot shows a package which doesn't have dependencies so it looks a little simpler.

Take a look at tape

[image]

Notice how the chart shows both publish size and install size.

Should the chart also show file count?
How about directory count?
Do both counts matter or is it the sum of both files and directories?

@leeoniya
Author

leeoniya commented Aug 31, 2018

> Should the chart also show file count? How about directory count?

no, i think just the text would be ok.

> Do both counts matter or is it the sum of both files and directories?

they're typically closely correlated. in my experience the ratio is about 10:1 for huge packages (thousands of files, hundreds of directories). a sum of both would be ok. "filesystem objects"

@styfle
Owner

styfle commented Aug 31, 2018

Should there be a "published file count" and "installed file count", similar to how the size is calculated?

@leeoniya
Author

probably only the installed one is relevant?

i assumed published/installed size are both relevant due to published being gzipped and transferred over network but installed is what ultimately ends up on disk? is that a correct assumption?

@styfle
Owner

styfle commented Aug 31, 2018

No, the purpose is to show the difference between bytes written by the package author vs bytes written by the dependency authors.

The "publish size" is relevant if you are the owner of the package and you forgot to use .npmignore or the files array in package.json.

I think I'm answering my own question. We probably should display both "published file count" and "installed file count" because the former could be relevant to the package author, and the latter is relevant to anyone who wishes to install this package.

@leeoniya
Author

i see. i was confused by the nomenclature. maybe it makes sense to rename the terms to "self" and "with deps"

Package

Size:
Files:

With Deps

Size:
Files:

@styfle
Owner

styfle commented Sep 2, 2018

That could be helpful!

I like the way you grouped them together.

Another thought is to show two charts: size and file count. This will allow the user to see which version a refactor (moving code from one file to many files) happened in. If the amount of data is the same, the chart with size will remain the same but the chart with files will change.

What do you think of this?

[image: two-charts]

@leeoniya
Author

leeoniya commented Sep 3, 2018

here's a bit of a brain-dump, take from it what you will.

i think there are 2 somewhat distinct purposes that packagephobia tries to be useful for.

  1. pre-bloat - authors of new projects looking to avoid large dependencies. this fits the tagline, "Find the cost of adding a new dependency to your project"
  2. post-bloat - authors trying to see where they can remove or replace large dependencies and/or optimize their own size.

for the pre-bloat audience, it is sufficient to show the full installed size and filecount, including all dependencies. this is still imperfect, because if you're already sucking in a large shared dependency, then the cost of adding a lib can be dramatically less. it would be interesting to allow importing a list of deps you already have and then show only the additional weight incurred.
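
The "additional weight" idea above can be sketched roughly as follows. This is only an illustration, not packagephobia code: the `PkgStats` shape, the `Registry` map, and the `additionalCost` function are all hypothetical names, and the sketch assumes one version per package name (real npm installs can contain several versions of the same package).

```typescript
// Hypothetical per-package stats; none of these names come from packagephobia.
interface PkgStats {
  size: number;   // bytes on disk for this package alone
  files: number;  // file count for this package alone
  deps: string[]; // names of direct dependencies
}

type Registry = Record<string, PkgStats>;

interface Cost {
  size: number;
  files: number;
}

// Collect the given packages plus everything reachable from them.
function closure(registry: Registry, roots: string[]): Set<string> {
  const seen = new Set<string>();
  const stack = [...roots];
  while (stack.length > 0) {
    const name = stack.pop() as string;
    if (seen.has(name)) continue;
    seen.add(name);
    const pkg = registry[name];
    if (pkg) stack.push(...pkg.deps);
  }
  return seen;
}

// Weight added by `candidate`, ignoring anything the project already pulls in.
function additionalCost(registry: Registry, existing: string[], candidate: string): Cost {
  const have = closure(registry, existing);
  const want = closure(registry, [candidate]);
  const cost: Cost = { size: 0, files: 0 };
  want.forEach((name) => {
    const pkg = registry[name];
    if (!pkg || have.has(name)) return; // already installed or unknown: no extra cost
    cost.size += pkg.size;
    cost.files += pkg.files;
  });
  return cost;
}
```

With an empty `existing` list this degenerates to the full install cost, which matches what the pre-bloat audience sees today.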

the post-bloat audience is quite under-served by packagephobia aside from alerting lib authors that they're publishing stuff that is unnecessary (as you mentioned). however, this is just the tip of the iceberg. most bloat in libs comes from some inadvertent huge dependency or two. for a clean-up effort to be fully effective, i would want some kind of a mix of package-dependencies [1] with size/file-count stats and a tree-map similar to webpack-bundle-analyzer [2].

i put together a sketch of what i think would be most useful to both audiences:

[image: deps]

this gives you a view that's only 1 level deep (but recursively summed), since the author only has control over the direct dependencies. it also clearly shows what will for sure be removed as part of ditching a dependency (its unshared deps) as well as what "may" be removed (its shared deps).
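
One possible way to compute such a one-level view is sketched below. The `Pkg`, `Graph`, and `DepRow` types and the `directDepRows` function are hypothetical and exist only for illustration. Here "unshared" is taken to mean "not reachable from any other direct dependency", which is one reading of the description above.

```typescript
// Hypothetical graph shape for illustration; not taken from any real tool.
interface Pkg {
  size: number;
  files: number;
  deps: string[];
}

type Graph = Record<string, Pkg>;

interface DepRow {
  name: string;
  size: number;       // recursively summed over the dependency's whole subtree
  files: number;      // recursively summed as well
  unshared: string[]; // would certainly go away if this dependency is dropped
  shared: string[];   // also reachable from another direct dependency
}

// Every package reachable from the given roots, roots included.
function reachable(graph: Graph, roots: string[]): Set<string> {
  const seen = new Set<string>();
  const stack = [...roots];
  while (stack.length > 0) {
    const name = stack.pop() as string;
    if (seen.has(name)) continue;
    seen.add(name);
    stack.push(...(graph[name]?.deps ?? []));
  }
  return seen;
}

// One row per direct dependency of `root`.
function directDepRows(graph: Graph, root: string): DepRow[] {
  const direct = graph[root]?.deps ?? [];
  return direct.map((name) => {
    const mine = reachable(graph, [name]);
    const others = reachable(graph, direct.filter((d) => d !== name));
    const row: DepRow = { name, size: 0, files: 0, unshared: [], shared: [] };
    mine.forEach((pkg) => {
      row.size += graph[pkg]?.size ?? 0;
      row.files += graph[pkg]?.files ?? 0;
      (others.has(pkg) ? row.shared : row.unshared).push(pkg);
    });
    return row;
  });
}
```

Note that the summed size of a row includes its shared packages, so the rows do not add up to the total install size when dependencies overlap.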

if you want to have two version-trend graphs (which i don't find particularly interesting, TBH) then i think it makes sense to group them by "installed" stats and "published" stats as i did above rather than by a specific metric (files, sizes)

[1] https://beta.observablehq.com/@mbostock/package-dependencies?name=tape
[2] https://github.com/webpack-contrib/webpack-bundle-analyzer

@styfle
Owner

styfle commented Sep 6, 2018

Thanks for the detailed write-up! 👍

As you discovered, there is a similar request open (#11) that attempts to visualize package bloat in a table or tree view. I even created a PR in another tool to add this feature: anvaka/npmgraph.an#27

I think that the visualization feature (#11) is outside the scope of this issue (#124) which is discussing file count 😄

That being said, I think we could start tracking file count now.

If you want to submit a PR, I think the place to start is npm-stats.ts which has a function getDirSize which could be complemented with getDirFileCount.

Then modify the PkgSize interface to include two new properties: installFiles and publishFiles.
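
For anyone picking this up, a minimal sketch of what that could look like is below. Only the names `getDirFileCount`, `PkgSize`, `installFiles`, and `publishFiles` come from the comment above. The function signature, the synchronous implementation, the other `PkgSize` fields, and the `countPackageFiles` helper are assumptions made for illustration, and the real `npm-stats.ts` may be structured differently.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed shape: the two *Size fields are guesses, the two *Files fields are the proposed additions.
interface PkgSize {
  publishSize: number;
  installSize: number;
  publishFiles: number;
  installFiles: number;
}

// Recursively count entries that are not directories under `root`.
// Symlinks are counted as one entry and are not followed.
function getDirFileCount(root: string): number {
  let count = 0;
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    if (entry.isDirectory()) {
      count += getDirFileCount(path.join(root, entry.name));
    } else {
      count += 1;
    }
  }
  return count;
}

// Hypothetical helper: fill in just the two new properties.
function countPackageFiles(
  publishDir: string,
  installDir: string
): Pick<PkgSize, "publishFiles" | "installFiles"> {
  return {
    publishFiles: getDirFileCount(publishDir),
    installFiles: getDirFileCount(installDir),
  };
}
```

Directories themselves are not counted here. If the combined "filesystem objects" number discussed earlier is preferred, the directory branch could add one as well.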

@David-Else

David-Else commented Dec 3, 2018

@styfle I think adding file count would be a huge bonus, something like #124 (comment) would be fine, but maybe having 4 different colours and the 4 variables on one graph would be equally good. Just a matter of aesthetic taste really.

Thanks for making this app, I find the bloat in some NPM packages literally repulsive! I think the tide is turning, more people seem to be talking about this now with the recent security fails. I use pnpm, so that stops them hurting my innocent SSD too much with symbolic links.

@styfle
Owner

styfle commented Dec 4, 2018

@David-Else Thanks for chiming in! Glad you like it!

Would you like to contribute a PR to add the file counting logic? (we can implement the graph at a later time if that makes it easier)

@David-Else

@styfle I am relatively new to JS, and have not used TS yet, so can't really contribute at this point. If I get the opportunity to learn TypeScript I would certainly like to contribute in the future. Great project!

@David-Else

This project could be of interest too: https://www.pikapkg.com/

@styfle
Owner

styfle commented Dec 5, 2018

That pikapkg project looks really cool! Did you make it?

@David-Else

No, afraid not.

@styfle styfle mentioned this issue Jun 28, 2020
5 tasks
styfle added a commit that referenced this issue Jun 28, 2020
Fixes #124 

- [x] Add file count to redis database
- [x] Add file count to API response
- [x] Add file count to tooltip (on hover)
- [x] Add file count to unit tests
- [ ] TODO: Add file count to barchart

3 participants