Executing settings functions inside renvs #51

@azimov

There will be a problem on many systems where creating a study requires installed versions of specific R packages.

For example, suppose I want to use CohortGenerator with subsets. I can define this in my base/system R environment with CohortGenerator version 0.8.0 and then feed it to a package using version 0.7.0. In this situation I can still add cohorts with subsets to the settings.
However, when I do this, one of two things will happen:

  • The Strategus CohortGenerator module will crash because it can't handle the subset payloads
  • The Strategus CohortGenerator module will run without error but silently skip generating my subsetted cohorts (which is arguably worse)

Both of these situations are bad, and (currently) the only solution is to make sure your system env and Strategus module env are the same.
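
The "keep both environments the same" workaround can at least be checked automatically before any settings are created. A minimal sketch in base R, using the versions from the example above; `assertModuleVersion` and its arguments are hypothetical names for illustration and are not part of Strategus:

```r
# Hypothetical helper (not part of Strategus): stop early if the package
# version in the calling session differs from the version the module uses.
assertModuleVersion <- function(package, moduleVersion,
                                installedVersion = utils::packageVersion(package)) {
  installed <- package_version(as.character(installedVersion))
  required <- package_version(moduleVersion)
  if (installed != required) {
    stop(sprintf(
      "%s %s is installed here, but the module uses %s; settings may not be compatible.",
      package, as.character(installed), as.character(required)
    ), call. = FALSE)
  }
  invisible(TRUE)
}

# Versions are passed explicitly so the sketch runs without CohortGenerator
# installed; by default the installed version is looked up.
result <- tryCatch(
  assertModuleVersion("CohortGenerator", "0.7.0", installedVersion = "0.8.0"),
  error = function(e) conditionMessage(e)
)
print(result)
```

A check like this only turns the silent failure into a loud one. It does not remove the need to keep the two environments in sync.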

Note that this can happen with any parameters, not just subsets. For example, CohortDiagnostics creates a FeatureExtraction default feature set inside the base package.

However, I propose an alternative solution: allow execution of settings creation inside module renv contexts and make this the standard procedure.

In the simplest case, the module functions exposed (e.g. getSharedResources) should not require many changes, but module developers should be provided with an API to call into the module environment.

In a more complex case I would like to be able to run arbitrary code inside the renv with something like this:

cohortDefinitionPayload <- withModuleRenv(
  module = "CohortGenerator",
  version = ...,
  code = {
    library(CohortGenerator) # Load the renv version

    cohortDefinitionSet <- loadCohortDefinitionSet(...)

    subsetDefinition <- createCohortSubsetDefinition(
      .... DO stuff ...
    )

    # Last expression: the value handed back as the payload
    cohortDefinitionSet %>% addCohortSubsetDefinition(subsetDefinition)
  }
)

Note, I include similar code (though not designed to be exported in its current state) in this PR.

There is a lot of complexity here, though: we would only really want this code to create serialized payloads for the Strategus design definitions.

Passing data between the calling script and the renv is one example of this complexity. The best solution for these situations is probably to store the input and output as RDS files somewhere (RStudio optionally does this when sourcing jobs: it can copy the calling environment to the child process and copy the results back to the parent R session).
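
The RDS hand-off described above could look roughly like the following. This is a sketch under assumptions, not the code from the PR: `runInChildSession` and `projectPath` are hypothetical names, the child process is started with base R's `system2`, and the `renv::load()` step assumes renv is installed in the child's library and is only used when a project path is given.

```r
# Hypothetical helper (not part of Strategus): evaluate `code` in a fresh R
# process, passing `input` in and the result back out through RDS files.
runInChildSession <- function(code, input = list(), projectPath = NULL) {
  codeFile <- tempfile(fileext = ".R")
  inputFile <- tempfile(fileext = ".rds")
  outputFile <- tempfile(fileext = ".rds")
  on.exit(unlink(c(codeFile, inputFile, outputFile)), add = TRUE)

  saveRDS(input, inputFile)

  # Capture the unevaluated expression so it only runs in the child process.
  userCode <- deparse(substitute(code))

  script <- c(
    # Assumption: a module's renv project would be activated at this point.
    if (!is.null(projectPath)) sprintf("renv::load(%s)", deparse(projectPath)),
    sprintf("input <- readRDS(%s)", deparse(inputFile)),
    # Elements of `input` are visible to the code as ordinary variables.
    "result <- with(input,",
    userCode,
    ")",
    sprintf("saveRDS(result, %s)", deparse(outputFile))
  )
  writeLines(script, codeFile)

  rscript <- file.path(R.home("bin"), "Rscript")
  status <- system2(rscript, c("--vanilla", shQuote(codeFile)))
  if (status != 0 || !file.exists(outputFile)) {
    stop("Child R session failed with status ", status, call. = FALSE)
  }
  readRDS(outputFile)
}

# Usage: the block runs in the child session; only `payload` comes back.
payload <- runInChildSession(
  code = {
    list(cohortIds = ids, childPid = Sys.getpid())
  },
  input = list(ids = c(1, 2, 3))
)
str(payload)
```

Only serializable values survive the round trip, which fits the goal of producing serialized payloads. Objects such as open database connections would not.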
