Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop internal copy of METIS #291

Closed
gruenich opened this issue Feb 6, 2023 · 9 comments
Closed

Drop internal copy of METIS #291

gruenich opened this issue Feb 6, 2023 · 9 comments

Comments

@gruenich
Copy link
Contributor

gruenich commented Feb 6, 2023

In December 2022 METIS 5.2.1 has been released. This might solve some of the problems, SuiteSparse try to work around by providing its own modified copy of METIS.

  • We should try to get rid of METIS as part of SuiteSparse and make it a pure external dependeny.
  • We should try get required changes upstreamed into SuiteSparse.

Packaging SuiteSparse with its own copy of METIS is a burden for package maintainers.

@DrTimothyAldenDavis
Copy link
Owner

Thanks for the update. I'll take a look.

I might still need to keep my own renamed copy of METIS, because of the definition of the METIS index type. I need to revise it to int64_t, which follows the instructions in the METIS include file, but it means that my revision would conflict with an unmodified METIS. If METIS 5.2.1 fixes that issue, then going back to an unmodified (64-bit capable) METIS would be fine. Otherwise, I'll have to stick with my modified version -- perhaps as a modification of METIS 5.2.1.

@gruenich
Copy link
Contributor Author

gruenich commented Feb 6, 2023

I investigated this further and I think your issue is not fixed. Further GKlib was spun off and has become a dependency of METIS, but there was no proper release for GKlib (1, 2).

If possible it would be good to have your diff somewhere documented, maybe we can create a merge request to METIS, and hope that we can convince Prof. George Karypis to include it upstream

@DrTimothyAldenDavis
Copy link
Owner

My diff is very short, but it's not a complete fix for these various METIS issues. I #include only the subset of METIS (GKlib and libmetis) that I need. I leave out features in GKlib that are not very portable (requiring POSIX for example) that are only needed for the METIS stand-alone programs. Perhaps you've seen this already:

https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/dev/CHOLMOD/Partition/cholmod_metis_wrapper.c

You see the commented out #include's are files in METIS that I don't use, and thus those files are unmodified from the original.

Here's the output of a diff between the original metis-5.1.0 and my SuiteSparse/CHOLMOD/SuiteSparse_metis folder, from:

diff -ar metis-5.1.0 ~/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis > metis_diff.txt

Gives this result below. I see though that I can trim the diff further. For example, I figured out that I do not need the GKlib/csr.c file at all, so it is now disabled in the cholmod_metis_wrapper.c. But I have an "#if 0 ... #endif" block to disable portions of that file. Now that I know I don't need any of csr.c, I can revert my copy back to the original, and then just ignore the entire file. Same for GKlib/graph.c and pdb.c.

Only in /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/build: .gitignore
Only in metis-5.1.0/build: Linux-x86_64
diff -ar metis-5.1.0/CMakeLists.txt /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/CMakeLists.txt
3a4,6
> # added for SuiteSparse:
> message ( FATAL "This copy of METIS (inside SuiteSparse) cannot be built as a stand-alone library.  Use the CHOLMOD/CMakeLists.txt to compile CHOLMOD with METIS." )
> 
diff -ar metis-5.1.0/GKlib/csr.c /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/GKlib/csr.c
348a349,350
> #if 0
> // gk_csr_Read, gk_csr_Write disabled for SuiteSparse, Dec 2022
649a652
> #endif
diff -ar metis-5.1.0/GKlib/error.c /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/GKlib/error.c
174a175,176
> #if 0
> // gk_strerror disabled for SuiteSparse, Jan 2023
191a194
> #endif
diff -ar metis-5.1.0/GKlib/gk_arch.h /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/GKlib/gk_arch.h
15a16,21
> 
> // stdint.h, inttypes.h: added for SuiteSparse, Dec 2022
> #include <stdint.h>
> #include <inttypes.h>
> 
> #if 0
46a53
> #endif
52c59,61
< #ifdef WIN32
---
> // revised for SuiteSparse, Jan 2023
> #if defined ( NO_SSIZE_T )
> // #ifdef WIN32
53a63,65
> #else
> // POSIX: ssize_t is defined in sys/types.h
> #include <sys/types.h>
60a73,74
> #if 0
> // rint and INFINITY disabled for SuiteSparse, Dec 2022
67a82
> #endif
diff -ar metis-5.1.0/GKlib/GKlib.h /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/GKlib/GKlib.h
39a40,53
> /* -------------------------------------------------------------------------- */
> /* Added for SuiteSparse, to disable signal handling when incorporated into
>  * a MATLAB mexFunction.  Tim Davis, Jan 30, 2016, Texas A&M University. */
> #ifdef MATLAB_MEX_FILE
> #include "mex.h"
> #define SIGABRT 6
> #define SIGTERM 15
> #define jmp_buf int
> #define raise(sig) { mexErrMsgTxt ("METIS error") ; }
> #define signal(sig,func) NULL
> #define longjmp(env,val) { mexErrMsgTxt ("METIS error") ; }
> #define setjmp(x) (0)
> #define exit(x) { mexErrMsgTxt ("METIS error") ; }
> #else
41a56,57
> #endif
> /* -------------------------------------------------------------------------- */
44a61,62
> #if 0
> // regex.h and gk_regex.h disabled for SuiteSparse, Jan 1, 2023.
53a72
> #endif
59a79,96
> 
> /* -------------------------------------------------------------------------- */
> /* Added for incorporation into SuiteSparse.
>    Tim Davis, Oct 31, 2022, Texas A&M University. */
> #include "SuiteSparse_config.h"
> #define malloc  SuiteSparse_config_malloc
> #define calloc  SuiteSparse_config_calloc
> #define realloc SuiteSparse_config_realloc
> #define free(p)                                 \
> {                                               \
>     if ((p) != NULL)                            \
>     {                                           \
>         SuiteSparse_config_free (p) ;           \
>         (p) = NULL ;                            \
>     }                                           \
> }
> 
> /* -------------------------------------------------------------------------- */
diff -ar metis-5.1.0/GKlib/gk_proto.h /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/GKlib/gk_proto.h
270a271,272
> #if 0
> // disabled for SuiteSparse
282a285
> #endif
diff -ar metis-5.1.0/GKlib/graph.c /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/GKlib/graph.c
84a85,86
> #if 0
> // gk_graph_Read, gk_graph_Write: disabled for SuiteSparse, Dec 2022
331a334
> #endif
diff -ar metis-5.1.0/GKlib/memory.c /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/GKlib/memory.c
100a101,104
> #if 1
>   // Revised for SuiteSparse: do not create gkmcore:
>   gkmcore = NULL ;
> #else
107a112
> #endif
Only in /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/GKlib: original
diff -ar metis-5.1.0/GKlib/pdb.c /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/GKlib/pdb.c
23a24
> // gk_threetoone disabled for SuiteSparse, Jan 2023
diff -ar metis-5.1.0/include/metis.h /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/include/metis.h
33c33,34
< #define IDXTYPEWIDTH 32
---
> // Revised for SuiteSparse:
> #define IDXTYPEWIDTH 64
112a114,115
>   // iabs replaced with SuiteSparse_metis_abs, Dec 2022
>   #if 0
113a117,123
>   #else
>   #define iabs          SuiteSparse_metis_abs64
>   static inline int64_t SuiteSparse_metis_abs64 (int64_t x)
>   {
>     return ((x < 0) ? (-x) : x) ;
>   }
>   #endif
170a181,182
> #if 0
> // Windows __declspec removed for SuiteSparse, Jan 2023
174a187,189
> #else
> #define METIS_API(type) type
> #endif
Only in /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis/include: original
Only in /home/faculty/d/davis/dev/SuiteSparse/CHOLMOD/SuiteSparse_metis: README.txt

That's my entire set of changes to METIS.

This strategy keeps my edits of METIS very terse, but it means that they are not a complete fix for all of the various METIS issues I've encountered.

@DrTimothyAldenDavis
Copy link
Owner

I have a cmake script that detects if ssize_t is available, and then I #define NO_SSIZE_T accordingly. The script is in SuiteSparse_config/cmake_modules. That's a better technique than the logic in the METIS file itself but it means my fix for METIS would require that cmake script.

@DrTimothyAldenDavis
Copy link
Owner

I've removed my edits to csr.c, pdb.c, and graph.c, reverting to the METIS 5.1.0 originals, in my dev2 branch.

I'll see about integrating METIS 5.2.1 into my SuiteSparse_metis variant. It's likely there are tweaks I had to make to METIS 5.1.0 that are now resolved in that version.

The main issues I've had with METIS are:

(1) size of the index type. I require 64 bits. This can be changed by editting the metis.h file, per the instruction in METIS, but then the resulting library conflicts with a default 32-bit build. METIS needs to use a different name space for 64-bit functions but it doesn't do that. So I have to keep my METIS revision private, to avoid conflicting with a default METIS that might be installed alongside SuiteSparse.

(2) memory functions. I require the use of alternative malloc/calloc/realloc/free functions. METIS doesn't support this, so I've editted GKlib.h to insert my own. I also must disable the setjmp/longjmp and gkmcore.

(3) signals. I can't let METIS raise signals since that breaks MATLAB. Throwing errors with signals, or modifying signal handlers, breaks all kinds of things, and METIS shouldn't have to do that. It throws a signal if it runs out of memory, which is a really awful thing to do to the applications that use SuiteSparse.

(4) portability problems: these could be fixed in the METIS upstream, but I have lots of problems with things like ssize_t, regex, Windows __declspec's, and other issues in files that I'm no longer using.

I've fixed most of these in my use of METIS. I still let METIS raise a signal if it's not inside MATLAB but I try to ensure this never happens by pre-allocating enough memory that I think METIS will use (before I call METIS) and then freeing it. See

/* === metis_memory_ok ====================================================== */

If that works, there's a good chance the memory will be there for METIS. It's not a complete solution. A complete solution would be to use a robust memory-error handling design in METIS itself: if malloc fails, free any workspace allocated so far, and return an "out of memory" flag to the caller. I do this in all of SuiteSparse and it works just fine, if done right.

METIS is a great package ... which is why I use it in spite of these quirks. I just have to work around the quirks.

@DrTimothyAldenDavis
Copy link
Owner

I have other workarounds too, to avoid some bugs that might now be fixed:

/* METIS workarounds */

@Fabian188
Copy link

I had quite some issues with the current Metis for non-SuiteSparse use. I had to stick to 5.1. I don't remember the issues, IIRC it was compiling on Windows and runtime behavior on Linux. I just want to suggest to be careful with switch there.

@DrTimothyAldenDavis
Copy link
Owner

Thanks for the heads up.

@DrTimothyAldenDavis
Copy link
Owner

I'm going to keep my internal copy of METIS. It's safe, since I don't conflict with the original METIS namespace (I rename all the functions) and I don't create any libmetis.so. None of my compiled METIS functions are visible to the user.

Using my own internal METIS allows me to revise/fix various issues (see the discussion above). I will evaluate METIS 5.2.1 in the future, but for now I'll stick with my copy of v5.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants