episode sample project #349

elmbeech · 2025-01-12T20:27:55Z

This is the physicell episode sample project, written to run multiple consecutive episodes in a single runtime.
This project is linked to pull request #346 , #277 , and #266 .

…ersion 12.

drbergman

I'm glad that we're starting to see some of what you're planning with these additions! As this PR requires some of the others that are not yet merged, can you merge them into your elmbeech:patch1141_episode so that we can test this project?

Since this project's name does not immediately suggest what it is modeling, could you add a sample_projects/episode/README.md to describe what users should be expecting from this? I am not sure what part of this highlights the power of the episodes. To me, all I see is that this runs the model 4 times. Is there more to this that I'm missing? If that's the power (avoiding the overhead of relaunching the program), could you benchmark and provide a simple bar plot, maybe in that readme? Just running n sims with and without episodes and comparing wall time.

sample_projects/episode/config/PhysiCell_settings.xml

sample_projects/episode/main.cpp

elmbeech · 2025-01-14T15:21:03Z

Hi Daniel,

Sure, I can merge all the needed PR's together.

You are correct, all this sample project does is running consecutive episodes of a physicell model in one runtime.
This is an essential feature if you want to do learning. You simply cannot learn, if you have to destroy the runtime after each episode to reset to the initial condition because you will at the same time erase everything you learned from the episode.
We are on the way to releasing a PhysiCell project that does reinforcement learning, which simply can be installed under user_projects, but to have it work with the regular PhysiCell releases, we first have to get the changes from this PR into the code base.
The sample project is there for other developers, who want to do similar implementations and, most importantly, as a unit test, to make sure that this feature persists in future PhysiCell releases.

I can add a README with this explanation.

Best, Elmar

elmbeech · 2025-01-14T22:48:54Z

looks like windows cases some troubles.
I will look into it.

elmbeech · 2025-01-14T22:58:11Z

merged in pull requests:

elmbeech · 2025-01-15T00:16:21Z

ok @drbergman, now everything is merged, modified as requested, and all tests are passing!
could you please have another look?

sample_projects/episode/main.cpp

drbergman · 2025-01-15T15:29:25Z

Final question (no promises though 🙃): Does this sample project show the full use of the proposed updates to initialize_microenvironment and load_PhysiCell_config_file? In other words, will the reinforcement learning project rely on additional calls/functionality of those functions beyond what is in sample_projects/episode/main.cpp?

Makefile

sample_projects/episode/Makefile

rheiland · 2025-01-15T15:46:36Z

I'd probably prefer to stick with just the #if defined(_WIN32) and not use the MINGW-related directives. I can imagine allowing other C++ compilers on Windows someday (I've done past tests using cmake and native Windows C++).

I also confirmed that this sample project compiles, runs, and generates the output<episode#> directories on my Windows machine (in Command Prompt and Powershell). But I did not check that the output was "valid". One nitpick might be to consider using leading 0s in your output directory filenames (output_00001, etc) for easier post-processing.

Also, when running, I see the WARNING related to <random_seed> on episodes >= 1 which may be concerning to users, e.g.,

run episode: 1 !
(re)set global variables ...
load setting xml config/PhysiCell_settings.xml ...
Using config file config/PhysiCell_settings.xml ...
WARNING: Setting the random seed again.
        You probably have set a user parameter called random_seed.
        Here, we will use the random seed set in user parameters.
        HOWEVER, as of PhysiCell 1.14.0, you should set the random seed in the <options><random_seed> element in the config file.
        Future versions of PhysiCell may throw an error here. Kindly remove the user parameter and just use the <options><random_seed> element.

(to be clear, there is no random_seed in user params; the warning is printed due to the logic in the C++ code)

and related to that, do you really want to use the same random_seed for every episode and do you want to use just 1 core (omp_num_threads)?

elmbeech · 2025-01-15T15:54:27Z

Final question (no promises though 🙃): Does this sample project show the full use of the proposed updates to initialize_microenvironment and load_PhysiCell_config_file? In other words, will the reinforcement learning project rely on additional calls/functionality of those functions beyond what is in sample_projects/episode/main.cpp?

hmm ...
the main.cpp shows the full use of the proposed update to initialize_microenvironment and load_PhysiCell_config_file (and reset_max_basic_agent_ID and load_PhysiCell_config_file from earlier already accepted pull requests).
but the reinforcement learning will not run with this main.cpp. it is a bit more complicated what we had to implement for making reinforcement leading possible.
the episode sample project can be regarded as a totally boiled down version of what is absolutely needed to make (reinforcement) learning in physicell possible at all.

is this answering your question?

elmbeech · 2025-01-15T16:25:03Z

I'd probably prefer to stick with just the #if defined(_WIN32) and not use the MINGW-related directives. I can imagine allowing other C++ compilers on Windows someday (I've done past tests using cmake and native Windows C++).

cool! same conclusion on my side.

I also confirmed that this sample project compiles, runs, and generates the output<episode#> directories on my Windows machine (in Command Prompt and Powershell). But I did not check that the output was "valid". One nitpick might be to consider using leading 0s in your output directory filenames (output_00001, etc) for easier post-processing.

the leading 0 would be nice. could you please give me a hint, how to do this easy in C++?

I have a python script to check if the output is correct (exactly the same when running with one thread). it uses pcdl and the output is at this moment a bit cryptic. i tested heavily on linux. and trust that the output is the same on windows. this pcdl script will eventually become a unit test for the reinforcement learning project and run on github workflow. so i am not too worried that we not will catch such an error.
drift.py.txt

Also, when running, I see the WARNING related to <random_seed> on episodes >= 1 which may be concerning to users, e.g.,

run episode: 1 !
(re)set global variables ...
load setting xml config/PhysiCell_settings.xml ...
Using config file config/PhysiCell_settings.xml ...
WARNING: Setting the random seed again.
        You probably have set a user parameter called random_seed.
        Here, we will use the random seed set in user parameters.
        HOWEVER, as of PhysiCell 1.14.0, you should set the random seed in the <options><random_seed> element in the config file.
        Future versions of PhysiCell may throw an error here. Kindly remove the user parameter and just use the <options><random_seed> element.

(to be clear, there is no random_seed in user params; the warning is printed due to the logic in the C++ code)

I am aware of this warning, but this not really in my hand.

and related to that, do you really want to use the same random_seed for every episode and do you want to use just 1 core (omp_num_threads)?

yes, if you wanna check that you have no drift in consecutive episodes, you wanna run all episodes with the same seed and one thread.
and if you wanna do learning, you will have to run totally random anyway. so I think this is ok.
also, it is a sample project. if someone wants to change this behavior, he can simply modify it.

drbergman · 2025-01-15T16:59:35Z

hmm ... the main.cpp shows the full use of the proposed update to initialize_microenvironment and load_PhysiCell_config_file (and reset_max_basic_agent_ID and load_PhysiCell_config_file from earlier already accepted pull requests).

In this case, we can choose to refactor these functions rather than redefine them. This avoids introducing in these functions what I still find to be confusing delineations between what is and is not reloadable. See #346 for what is reloadable. My first pass replaces parts of your main.cpp with

if ( i_episode == 0 )
{
	XML_status = load_PhysiCell_config_file( settingxml );
}
else
{
	XML_status = read_PhysiCell_config_file( settingxml ); // NEWLY DEFINED (SEE BELOW)
	if (XML_status)
	{
		PhysiCell_settings.read_from_pugixml();
		create_output_directory( PhysiCell_settings.folder ); // could omit since elsewhere in main.cpp the output folder is created (or could omit that line instead)
	}
}

// reset microenvironment and mechanics voxel size and match the data structure to BioFVM
std::cout << "reset densities ..." << std::endl;
set_microenvironment_initial_condition(); // NEWLY DEFINED (does everything in set_microenvironment below the `if (!reload)` block)
microenvironment.display_information( std::cout );

where we define

bool read_PhysiCell_config_file( std::string filename )
{
	physicell_config_dom_initialized = false; 

	std::cout << "Using config file " << filename << " ... " << std::endl ; 
	pugi::xml_parse_result result = physicell_config_doc.load_file( filename.c_str()  );
	
	if( result.status != pugi::xml_parse_status::status_ok )
	{
		std::cout << "Error loading " << filename << "!" << std::endl; 
		return false;
	}
	
	physicell_config_root = physicell_config_doc.child("PhysiCell_settings");
	physicell_config_dom_initialized = true; 
	return true
}

I opened a PR showing the full set of changes for this refactor.

drbergman · 2025-01-15T19:30:55Z

Also, when running, I see the WARNING related to <random_seed> on episodes >= 1 which may be concerning to users, e.g.,

It only shows up on i_episode==1 for me. the SeedRandom has a static bool warned to make sure the user only gets warned once. If we want to avoid even that, we could move that (or probably the other static bool setup_done variable) into global scope and write a function to toggle its state.

.github/workflows/tests.yml

refactor to simplify edits to core

drbergman · 2025-01-15T20:08:23Z

still need to either include test svgs in tests/cases/output_episode-sample/ or set the output_folder: "" in tests.yml

sample_projects/episode/main.cpp

elmbeech · 2025-01-15T20:34:36Z

still need to either include test svgs in tests/cases/output_episode-sample/ or set the output_folder: "" in tests.yml

I thought had already updated the yml files.
please check for folder output3 then we are sure that really consecutive episodes were run!

drbergman · 2025-01-15T20:47:46Z

please check for folder output3 then we are sure that really consecutive episodes were run!

That works manually, but the gh actions will want to test the svgs in output3 against what is in tests/cases/output_episode-sample/ because of the line in the tests.yml matrix for episode-sample. See the failed actions.

do not compare episode svgs in test suite

elmbeech · 2025-01-15T23:01:46Z

this code passes all the tests.
i think we are pretty much there.
thank you!

drbergman

Looks good to me now. Refactors in the code improve code readability and make the necessary internals exposed for this project. A sample project that gives the framework a skeleton upon which to build will make this easier for users to pick up. Also, put the new project in the testing suite for CI.

elmbeech · 2025-01-16T03:52:08Z

Thank you, Daniel!

…ause the function is no longer that much different.

MathCancer · 2025-01-16T17:11:15Z

NICE WORK!!

elmbeech · 2025-01-16T18:18:59Z

Thank you, Paul!

elmbeech · 2025-01-16T20:58:09Z

I could successfully compile the reinforcement learning user_project.
The only thing I cannot make compatible (without going here into the details) with the current episode sample project is:

std::string settingxml = "config/PhysiCell_settings.xml";

to make it compatible, i might have to make it in C style like

char *settingxml = "config/PhysiCell_settings.xml";

i will check in deep right now.

… related reinforcement learning project.

elmbeech · 2025-01-17T02:34:48Z

ok! that's definitely it.
all ready for the 1.14.2 release.

MathCancer · 2025-01-20T18:18:53Z

Elmar thank you for the OUTSTANDING work on this, as well as for being so responsive on edits.

Daniel, thank you for all the mentoring and QC!!!

elmbeech added 6 commits January 3, 2025 15:13

for github workflow ubuntu unit tests adjust from gcc version 11 to v…

ef60e6e

…ersion 12.

enable reload setting xml file.

2cb6347

update Makefile-default with episode-sample project.

76e3584

add episode sample project.

34d817f

adjust to physicell formatting convention.

1a51f20

add github workflow yml entries for the episode sample project.

8972473

drbergman requested changes Jan 14, 2025

View reviewed changes

elmbeech added 4 commits January 14, 2025 15:19

Merge branch 'patch1141_unittest_ubuntu_gcc_v12' into patch1141_episode

56e93bc

Merge branch 'patch1141_reload_settingxml' into patch1141_episode

413ef48

requested changes for pull request 349.

ccabc3e

replace the current Makefile with the updated Makefile-default

589cc31

This was referenced Jan 14, 2025

enable reload setting xml file. #346

Closed

github workflow ubuntu unit tests adjustment from gcc version 11 to version 12 #345

Closed

elmbeech added 2 commits January 14, 2025 18:16

get mkdir command working on windows too.

8de6be4

get mkdir command working on windows too.

a18505e

minor formatting changes.

14a9678