WIP: Unified Build#46

Open
bclyons12 wants to merge 15 commits into master from unified_build

@bclyons12
Collaborator

This PR creates a unified make build for the M3D-C1 code in unstructured and the libraries in m3dc1_scorec. I've finished it for stellar and it works, passing all regression tests.

Tasks to complete before merging

  1. Finalize strategy by making appropriate edits, using stellar as testbed
  2. Update all unstructured/*.mk and m3dc1_scorec/config-files/*.sh according to finalized strategy

@bclyons12
Collaborator Author

bclyons12 commented Apr 15, 2022

We may need to get a bit more sophisticated with the phony scorec target. Right now, it's causing M3D-C1 to get relinked every time, even if nothing has changed.

@jchensw
Collaborator

jchensw commented Apr 27, 2022

I tested it on stellar and everything works fine for me.

@bclyons12
Collaborator Author

@jchensw That's great! Any thoughts on the way it's been implemented? I think it can be improved. For example, the M3D-C1 code gets relinked every time even if nothing has changed. I think that's caused by the way the scorec target is set up.

@jchensw
Collaborator

jchensw commented Apr 27, 2022

> @jchensw That's great! Any thoughts on the way it's been implemented? I think it can be improved. For example, the M3D-C1 code gets relinked every time even if nothing has changed. I think that's caused by the way the scorec target is set up.

You mean the soft link in _$ARCH/bin directory?

@bclyons12
Collaborator Author

> You mean the soft link in _$ARCH/bin directory?

@jchensw I meant these lines:

M3DC1/unstructured/makefile

Lines 203 to 204 in 600d0b4

$(BIN): $(OBJS) scorec
	$(LOADER) $(LDOPTS) $(OBJS) $(LIBS) -o $@

scorec is a phony target, so it gets executed every time, which causes the M3D-C1 linking to happen every time even if m3dc1_scorec is unchanged.

A better way, I think, is to remove that prerequisite and then to add make scorec and make scorec COM=1 to make all. Thoughts on that change?
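
A minimal sketch of the behavior described above, using hypothetical targets (`lib`, `app_relinks`, `app_stable`) rather than the real M3D-C1 rules, and assuming GNU make. The second rule uses an order-only prerequisite, which is a different technique from the `make all` change proposed here; it is shown only to isolate the effect of the phony prerequisite.

```make
# Hypothetical stand-alone example; not the M3D-C1 makefile.
.PHONY: lib
lib:
	@echo "building library"

main.o:
	@touch main.o

# A phony target is always considered out of date, so listing it as a
# normal prerequisite makes this recipe run on every invocation.
app_relinks: main.o lib
	@echo "linking $@"
	@touch $@

# Prerequisites after '|' are order-only: 'lib' is still built first,
# but it never forces this recipe to rerun.
app_stable: main.o | lib
	@echo "linking $@"
	@touch $@
```

Running `make app_relinks` twice relinks both times, while the second `make app_stable` only rebuilds `lib`. The trade-off is that `app_stable` would also not relink if the library really did change.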

One thing I'm not sure of is how this will function:

SCOREC_LIBS= -L$(SCOREC_DIR)/lib $(M3DC1_SCOREC_LIB) \
-Wl,--start-group,-rpath,$(SCOREC_BASE_DIR)/lib -L$(SCOREC_BASE_DIR)/lib \
-lpumi -lapf -lapf_zoltan -lgmi -llion -lma -lmds -lmth -lparma \
-lpcu -lph -lsam -lspr -lcrv -Wl,--end-group

At the moment, $(SCOREC_BASE_DIR)/lib also contains libm3dc1_scorec.a, so I don't know which one will actually get linked to. As of 600d0b4, it definitely links to $(SCOREC_BASE_DIR)/lib if $(SCOREC_DIR)/lib has not been built yet (which makes testing challenging). The same goes for files in include folders. It may take some care to untangle these on all the systems.
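
For reference, GNU ld searches `-L` directories in the order they appear on the command line, so with the flags above `$(SCOREC_DIR)/lib` is tried first and `$(SCOREC_BASE_DIR)/lib` acts only as a fallback, which is consistent with the behavior seen at 600d0b4. One way to rule out the fallback is sketched below. It assumes the archive is named `libm3dc1_scorec.a` and is built into `$(SCOREC_DIR)/lib`; the variable `M3DC1_SCOREC_ARCHIVE` is hypothetical.

```make
# Hypothetical variable; assumes the archive is built into $(SCOREC_DIR)/lib.
M3DC1_SCOREC_ARCHIVE = $(SCOREC_DIR)/lib/libm3dc1_scorec.a

# Relink only when the archive itself changes. If the archive has not been
# built and no rule creates it, make stops with "No rule to make target"
# instead of the linker silently picking up a copy from another directory.
$(BIN): $(OBJS) $(M3DC1_SCOREC_ARCHIVE)
	$(LOADER) $(LDOPTS) $(OBJS) $(LIBS) -o $@
```

Depending on the real file rather than a phony target would also avoid the unconditional relink, at the cost of hard-coding the archive name in the makefile.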

@jchensw
Collaborator

jchensw commented Apr 28, 2022

Let me think through the first part.

As for the second part, we don't need to worry about it linking to $(SCOREC_BASE_DIR)/lib because, from now on, we will NOT install libm3dc1_scorec.a into the $(SCOREC_BASE_DIR) directory any more. So $(SCOREC_DIR) will be the only place to find libm3dc1_scorec.a.

@jchensw
Collaborator

jchensw commented Apr 28, 2022

Here is what could be done. In target.mk, "make scorec" is inserted as the first thing to do for "all" so that the freshly built "libm3dc1_scorec.a" could be used to link M3DC1 executables:

.PHONY: all
all :
	make scorec
	make OPT=1
	make OPT=1 COM=1
	make OPT=1 3D=1 MAX_PTS=60
	make OPT=1 3D=1 MAX_PTS=60 ST=1
	make a2cc
	make bin

and in "makefile", delete the "scorec" dependence from "BIN":

$(BIN): $(OBJS)
	$(LOADER) $(LDOPTS) $(OBJS) $(LIBS) -o $@

As for the second part, there is no need to worry about which libm3dc1_scorec.a is used. From now on, we will not manually install it in $(SCOREC_BASE_DIR), so it won't show up there anymore. Actually, we can remove it now on stellar so that the one automatically generated in the $(SCOREC_DIR) directory is used.

@bclyons12
Collaborator Author

@jchensw Done with 88bd541. Now we need to update all the .mk and config files. @seegyoung are all the up-to-date config files for each system in m3dc1_scorec/config-files? I don't see them for cori at the moment. There are only cori-cuda10.2-pgi19.10-config.sh and cori-mpich7.7.6-hsw-real-config.sh, but cori.mk is linking to /global/cfs/cdirs/mp288/jinchen/PETSC/core/upgrade-intel6610-craympich7719-hsw and cori_knl.mk to /global/cfs/cdirs/mp288/jinchen/PETSC/core/upgrade-intel6610-craympich7719-knl.

@jchensw
Collaborator

jchensw commented Apr 29, 2022

@bclyons12

I have these configuration files. You can access them at

/global/homes/j/jinchen/project/M3DC1/gitrepo/M3DC1/m3dc1_scorec
-rwxrwx--- 1 jinchen mp288 2333 Mar 17 21:37 build-real2-zoltan-knl/corihsw-real.sh
-rwxrwx--- 1 jinchen mp288 2332 Mar 17 21:45 build-cplx-zoltan-knl/coriknl-cplx.sh
-rwxrwx--- 1 jinchen mp288 2399 Apr 21 16:11 build-real2-zoltan-hsw/corihsw-real.sh
-rwxrwx--- 1 jinchen mp288 2334 Apr 21 16:16 build-cplx-zoltan-hsw/corihsw-cplx.sh

and the following two are for perlmutter
-rwxrwx--- 1 jinchen mp288 2188 Dec 16 14:27 build-cplx-zoltan/perlmutter-scorec-cplx-config.sh_cj
-rwxrwx--- 1 jinchen mp288 2234 Feb 16 22:38 build-real2-zoltan/perlmutter-scorec-real-config.sh_cj

Do you have a specific format for their names? If so, you can copy them and rename them to fit the rules you have already set up.

@seegyoung
Collaborator

@bclyons12 and @jchensw

Some files in m3dc1_scorec/config-files are not up-to-date. I will commit the latest.

@bclyons12
Collaborator Author

@jchensw I don't have any set format for the names. I just define it in the .mk file, like SCOREC_CONFIG=stellar-intelmpi-real-config.sh.

I would suggest that you and @seegyoung clean up the files in m3dc1_scorec/config-files/, removing any old ones that aren't needed anymore and making sure that there's one for each system that we actively use. You could either do that on master or on this branch as part of the pull request. Once those are all in place, I can work on updating all the .mk files accordingly.

@jchensw
Collaborator

jchensw commented May 4, 2022

I believe Seegyoung is working on this.

@seegyoung
Collaborator

@bclyons12 and @jchensw

Instead of reusing m3dc1_scorec/config-files, I suggest creating a new folder to keep any config files necessary for the unified build. The config-files folder is for internal SCOREC purposes only.

@bclyons12
Collaborator Author

@seegyoung Okay, we can make a new folder with config.sh files, maybe just labeling them ${M3DC1_ARCH}_config.sh for uniformity.
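
A sketch of how that naming rule might look in the makefile, assuming `M3DC1_ARCH` is defined before this point. `SCOREC_CONFIG_DIR` and the folder name `unified-config` are hypothetical placeholders, since the new folder has not been named yet.

```make
# Hypothetical folder name for the new unified-build config files.
SCOREC_CONFIG_DIR = m3dc1_scorec/unified-config

# Default to <arch>_config.sh. '?=' assigns only if SCOREC_CONFIG is not
# already defined, so a value set earlier (e.g. in a system's .mk file)
# is left alone.
SCOREC_CONFIG ?= $(M3DC1_ARCH)_config.sh
```

With a uniform default like this, a per-system .mk file would only need to set `SCOREC_CONFIG` when it deviates from the convention.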

@jchensw @nferraro @sjardin @changliu777 Maybe this is a good time to delete unused .mk files. They'll always be archived on git if we need them in the future. Which ones don't we need? edison.mk and eddy.mk for sure. Once we've pared down this list, we can focus on making the config.sh files for each remaining system. I'll make a separate issue to track this.

@nferraro
Collaborator

nferraro commented May 4, 2022 via email

@seegyoung
Collaborator

I updated the following config.sh files in m3dc1_scorec/config-files, which were used to build the m3dc1_scorec libraries for the corresponding .mk files:

  • centos7.mk: centos7-real-config.sh
  • sunfire.openmpi-4.0.3.mk: portal-openmpi-4.0.3-real-config.sh
  • stellar.mk: stellar-intelmpi-real-config.sh
  • stellar-openmpi.mk: stellar-intelmpi-real-config.sh
  • traverse.mk: traverse-pgi-real-config.sh
  • perseus.mk: perseus-real-config.sh
  • perseusamd.mk: perseus-amd-real-config.sh

@jchensw @changliu777, please provide the config.sh files used to build m3dc1_scorec on traverse_gpu and cori.

Please let me know if you need anything further from me to proceed.

@bclyons12
Collaborator Author

@seegyoung I'm working to clean up the config-files folder like we just did for the makefiles. In d8b6d54, I removed a bunch of them that we don't need. Some remain whose purpose I'm unsure of. Could you comment on whether we should keep any of the following?

core-openmpi-gcc4.4.5-config.sh
core-sim-openmpi-gcc4.4.5-config.sh
cori-cuda10.2-pgi19.10-config.sh
openmpi-gcc4.4.5-pumi-sim-config.sh
openmpi-gcc4.4.5-real-config.sh
openmpi-gcc4.4.5-sim-config.sh
portal-meshgen-config.sh
portal-pumi-sim-config.sh

@bclyons12
Collaborator Author

@jchensw @yaozhou1989 I just noticed that cori.mk and perlmutter.mk use a different PETSC for the stellarator version than for the other real versions of the code. That's not true in other makefiles, including stellar, cori_knl, and centos7. Is there a reason for that?

ifeq ($(ST), 1)
PETSC_ARCH=corihsw-PrgEnvintel6010-craympich7719-master-real-st
PETSC_WITH_EXTERNAL_LIB = -L${PETSC_DIR}/${PETSC_ARCH}/lib -Wl,-rpath,/global/cfs/cdirs/mp288/jinchen/PETSC/petsc.20220107/corihsw-PrgEnvintel6010-craympich7719-master-real-st/lib -L/global/cfs/cdirs/mp288/jinchen/PETSC/petsc.20220107/corihsw-PrgEnvintel6010-craympich7719-master-real-st/lib -lpetsc -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu -lsuperlu_dist -lflapack -lfblas -lzoltan -lparmetis -lmetis -lz -lquadmath -ldl -lstdc++
else
PETSC_ARCH=corihsw-PrgEnvintel6010-craympich7719-master-real
PETSC_WITH_EXTERNAL_LIB = -L${PETSC_DIR}/${PETSC_ARCH}/lib -Wl,-rpath,/global/cfs/cdirs/mp288/jinchen/PETSC/petsc.20220107/corihsw-PrgEnvintel6010-craympich7719-master-cplx/lib -L/global/cfs/cdirs/mp288/jinchen/PETSC/petsc.20220107/corihsw-PrgEnvintel6010-craympich7719-master-cplx/lib -lpetsc -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu -lsuperlu_dist -lflapack -lfblas -lzoltan -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lparmetis -lmetis -lz -lquadmath -ldl -lstdc++
endif

@jchensw
Collaborator

jchensw commented May 11, 2022

@bclyons12 One has netcdf & hdf5 included and the other one doesn't. This is because the stellarator version segfaults when the NERSC-built netcdf modules are used.
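
To make that difference easier to see, here is a sketch of the same conditional with the shared library list factored out. `PETSC_IO_LIBS` and `PETSC_LIBDIR` are hypothetical variable names. The sketch also replaces the hard-coded `-rpath` and second `-L` paths from cori.mk with `$(PETSC_DIR)/$(PETSC_ARCH)/lib`, which is an assumption and may not match the real directory layout (the original non-stellarator branch points those paths at a `-master-cplx` directory).

```make
# Sketch only; not a drop-in replacement for the block in cori.mk.
ifeq ($(ST), 1)
  PETSC_ARCH = corihsw-PrgEnvintel6010-craympich7719-master-real-st
  # Stellarator build: netcdf/HDF5 are left out, as in the original.
  PETSC_IO_LIBS =
else
  PETSC_ARCH = corihsw-PrgEnvintel6010-craympich7719-master-real
  PETSC_IO_LIBS = -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5
endif

# Assumed layout; the original hard-codes absolute paths instead.
PETSC_LIBDIR = $(PETSC_DIR)/$(PETSC_ARCH)/lib

# Library order is kept as in the original: the I/O libraries sit
# between -lzoltan and -lparmetis.
PETSC_WITH_EXTERNAL_LIB = -L$(PETSC_LIBDIR) -Wl,-rpath,$(PETSC_LIBDIR) \
  -lpetsc -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord \
  -lscalapack -lsuperlu -lsuperlu_dist -lflapack -lfblas -lzoltan \
  $(PETSC_IO_LIBS) -lparmetis -lmetis -lz -lquadmath -ldl -lstdc++
```

Written this way, the only per-branch differences are the PETSc arch name and the presence of the netcdf/HDF5 libraries.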

@bclyons12
Collaborator Author

@jchensw So does m3dc1_scorec need to get compiled with the same version of PETSc for the stellarator version? We need three different m3dc1_scorec libraries? It doesn't look like cori.mk does that right now.

@jchensw
Collaborator

jchensw commented May 11, 2022

@bclyons12 The stellarator version shares the same scorec lib as the real mode.

@jchensw
Collaborator

jchensw commented May 12, 2022

@bclyons12 All configuration files for cori haswell, knl, gpu, and perlmutter have been pushed to the master branch. Please check.

@bclyons12
Collaborator Author

@jchensw Thanks! I see the files that correspond to cori_gpu_pgi.mk. What about cori_gpu.mk?

@jchensw
Collaborator

jchensw commented May 13, 2022

@bclyons12 Please delete cori_gpu.mk, and then rename cori_gpu_pgi.mk to cori_gpu.mk.

@bclyons12 bclyons12 mentioned this pull request Jan 6, 2023

4 participants