Skip to content

hugepage size parse issue #13214

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wangshaochuang opened this issue Apr 27, 2025 · 6 comments
Open

hugepage size parse issue #13214

wangshaochuang opened this issue Apr 27, 2025 · 6 comments

Comments

@wangshaochuang
Copy link

Thank you for taking the time to submit an issue!

Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

main branch

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

ubuntu 22.04

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

Please describe the system on which you are running

  • Operating system/version: ubuntu22.04
  • Computer hardware:
  • Network type: rdma

Details of the problem

Please describe, in detail, the problem that you are having, including the behavior you expect to see, the actual behavior that you are seeing, steps to reproduce the problem, etc. It is most helpful if you can attach a small program that a developer can use to reproduce your problem.

Note: If you include verbatim output (or a code block), please use a GitHub Markdown code block like below:

shell$ mpirun -n 2 ./hello_world

the hugepase parse function has wrong result in some corner cases.

Image

@wangshaochuang
Copy link
Author

Image

@jsquyres
Copy link
Member

jsquyres commented Apr 27, 2025

Good find!

This is extremely old code in Open MPI. I'm actually wondering why we're preferring to parse /proc/mounts output rather than using statfs(). Since this component only targets Linux (since it looks at /proc/mounts), and -- at least according to my googling -- Linux distributions introduced statfs() around 1994, is there a reason we don't just delete the /proc/mounts-parsing code and just rely on statfs() / statvfs()?

Or, at a minimum, prefer to use statfs() / statvfs() instead of parsing /proc/mounts?

@jsquyres
Copy link
Member

More specifically, I'm wondering why we don't do something like this (this diff is against v5.0.x, but would apply to main and v4.1.x as well):

diff --git i/opal/mca/mpool/hugepage/mpool_hugepage_component.c w/opal/mca/mpo>
index 888562e65f..191fc405e7 100644
--- i/opal/mca/mpool/hugepage/mpool_hugepage_component.c
+++ w/opal/mca/mpool/hugepage/mpool_hugepage_component.c
@@ -218,7 +218,6 @@ static void mca_mpool_hugepage_find_hugepages(void)
     mca_mpool_hugepage_hugepage_t *hp;
     FILE *fh;
     struct mntent *mntent;
-    char *opts, *tok, *ctx;
 
     fh = setmntent("/proc/mounts", "r");
     if (NULL == fh) {
@@ -232,6 +231,18 @@ static void mca_mpool_hugepage_find_hugepages(void)
             continue;
         }
 
+#if defined(USE_STATFS)
+        struct statfs info;
+        statfs(mntent->mnt_dir, &info);
+        page_size = info.f_bsize;
+#elif defined(HAVE_STATVFS)
+        struct statvfs info;
+        statvfs(mntent->mnt_dir, &info);
+        page_size = info.f_bsize;
+#else
+        // Fallback for extremely old systems that do not have
+        // statfs().
+        char *opts, *tok, *ctx;
         opts = strdup(mntent->mnt_opts);
         if (NULL == opts) {
             break;
@@ -240,25 +251,19 @@ static void mca_mpool_hugepage_find_hugepages(void)
         tok = strtok_r(opts, ",", &ctx);
         do {
             if (NULL != tok && 0 == strncmp(tok, "pagesize", 8)) {
-                break;
+                // It is expected that pagesize=X will be an integer
+                // number with no units qualifier following it.
+                // Specifically: Linux circa 2025 has /proc/mounts
+                // output like "... rw,relatime,pagesize=2M".  But if
+                // your system is signifncantly older than that
+                // (statfs() was introduced around 1994), we're
+                // assuming that there is no units qualifier.
+                (void) sscanf(tok, "pagesize=%lu", &page_size);
             }
             tok = strtok_r(NULL, ",", &ctx);
         } while (tok);
-
-        if (!tok) {
-#    if defined(USE_STATFS)
-            struct statfs info;
-
-            statfs(mntent->mnt_dir, &info);
-#    elif defined(HAVE_STATVFS)
-            struct statvfs info;
-            statvfs(mntent->mnt_dir, &info);
-#    endif
-            page_size = info.f_bsize;
-        } else {
-            (void) sscanf(tok, "pagesize=%lu", &page_size);
-        }
         free(opts);
+#endif
 
         if (0 == page_size) {
             /* could not get page size */

@jsquyres
Copy link
Member

Opened main PR #13217 with the above patch.

@bosilca
Copy link
Member

bosilca commented Apr 29, 2025

I'm not sure I understand your comment about nor parsing /proc/mounts ? Where do you get the path to the hugetlbfs to pass into statfs/statvfs ?

@jsquyres
Copy link
Member

More carefully stated: we don't need to parse /proc/mounts to get the hugepage size.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants