Add nlmixr_formula() #5

Open
billdenney wants to merge 30 commits into main from formula

@billdenney
Contributor

@billdenney billdenney commented May 24, 2022

Fix #6.

This PR implements a formula interface that allows nlmixr2 models to be estimated when all that is required is an algebraic formula. The formula syntax takes inspiration from lme4::nlmer().
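
For readers unfamiliar with the lme4::nlmer() convention mentioned above, the following is a minimal base-R sketch of how a three-part formula (response ~ predictor ~ random effects) splits into pieces. The formula itself, the column names (`y`, `x`, `id`), the parameter names (`a`, `b`), and the rule used to detect parameters are illustrative assumptions; this is not the parser implemented in the PR.

```r
# Illustrative three-part formula in the style of lme4::nlmer():
#   response ~ nonlinear predictor ~ random effect | grouping column
f <- y ~ a * x + b ~ a | id

# `~` is left-associative and binds more loosely than `|`, so R parses this as
#   (y ~ a * x + b) ~ (a | id)
modelPart <- f[[2]]  # y ~ a * x + b
ranefPart <- f[[3]]  # a | id

response   <- modelPart[[2]]  # y
predictor  <- modelPart[[3]]  # a * x + b
ranefParam <- ranefPart[[2]]  # a
groupCol   <- ranefPart[[3]]  # id

# Hypothetical rule for this sketch: anything in the predictor that is not a
# data column is treated as a parameter to estimate.
dataCols <- c("y", "x", "id")
fixedParams <- setdiff(all.vars(predictor), dataCols)

print(fixedParams)
```

Because the split relies only on R's own operator precedence, no string manipulation of the formula is needed in this sketch.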

The current state of the PR does not have testing, and documentation needs expansion. There are some design decisions that I think are worth vetting, too.

The main design decisions that are questionable to me are:

  • Is there a better/simpler formula interface to use that incorporates fixed and random effects? What about automated modeling of parameters?
  • How should random effects be defined? Should they be required to be different than fixed effects or should the system automatically generate the different names for the user?
  • Can there be more than one type of random effect? In other words must it be just ID or are other columns also acceptable?
  • There's no way to specify the starting random effect variability; it simply starts at 1. Should an interface be added for that?
  • Is the way that it handles factors when generating multiple fixed effects good? Is there a better way to generate them?
  • Should it handle more types of automatic parameterization than just factors? (That can be a downstream option.)
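
To make the questions about random-effect naming and starting variability concrete, here is a hedged sketch of the kind of hand-written nlmixr2 model function that a formula such as `y ~ a * x + b ~ a | id` might stand in for. The names `etaA`, `addSd`, and `value`, and all starting values, are assumptions for illustration and are not taken from the PR's generated output.

```r
# Defining the function needs only base R; fitting it would need nlmixr2 and data.
handWrittenModel <- function() {
  ini({
    a <- 3      # fixed effect, initial estimate (assumed value)
    b <- 4      # fixed effect, initial estimate (assumed value)
    etaA ~ 1    # between-subject variability on a, starting at 1
    addSd <- 1  # additive residual standard deviation (assumed value)
  })
  model({
    aInd <- a + etaA       # individual value of a
    value <- aInd * x + b  # algebraic prediction
    value ~ add(addSd)     # additive residual error model
  })
}
```

In this sketch the random effect carries a different name from the fixed effect it modifies, which is the convention the second bullet asks about: whether users must supply such names or the interface should generate them.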

@mattfidler
Member

One minor comment: please adopt the style used in the rest of the package. I follow, more or less, the Bioconductor style guide:

https://contributions.bioconductor.org/r-code.html

@mattfidler
Member

mattfidler commented May 24, 2022

Mostly, I really want a consistent feel from both the user and developer perspectives. Following a consistent style guide would make this easier to achieve.

@mattfidler
Member

> Can there be more than one type of random effect? In other words must it be just ID or are other columns also acceptable?

Eventually possible; right now it is not.

@mattfidler
Member

> Is the way that it handles factors when generating multiple fixed effects good? Is there a better way to generate them?

This perhaps should be generalized to the rest of nlmixr?

@billdenney
Contributor Author

I'm happy to modify the style. My default style is snake_case, and I needed this pretty quickly for something else. I also thought that there may be significant modifications after considerations for the ways that it works, so I'm guessing there will be some pretty big rework before it's done.

I don't see a simple way to generalize the factor handling to the rest of nlmixr2, but I'd be interested to think about it more and hear thoughts of how it could be done. The way that the nlmixr2 ini() works right now, I don't readily see a way to make it work with factors. I could imagine a few different ways that it could be augmented, but at that point, I worry that the UI would start to get cumbersome. Balancing the feature with the UI needs should be discussed in depth.

A few ideas of how I considered it were:

  • Assignment with a vertical bar in the ini block could assign the same initial estimate to all factor levels (e.g. a <- fixed(1, 2, 3) | STUDYID). That would parallel what is done for random effects with fixed effects pretty simply. But, you would now have to watch for a somewhat subtle difference between <- and ~ which already trips me up sometimes when writing the models.
  • Only support it with post-processing of a model (but that doesn't feel good since I think you should have all of the flexibility in the main model syntax).
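
The subtlety in the first idea above can be seen by asking R how it parses the two forms. This is a small base-R sketch; `etaA` is a hypothetical random-effect name, and neither line is claimed here to be syntax that ini() currently accepts.

```r
# Proposed fixed-effect syntax from the bullet above.
fixedExpr <- quote(a <- fixed(1, 2, 3) | STUDYID)
# A random-effect style line that uses `~` with the same vertical bar.
ranefExpr <- quote(etaA ~ 1 | STUDYID)

# In both cases `|` binds more tightly than the outer operator, so the
# top-level calls differ only in their operator: `<-` versus `~`.
topLevelOps <- c(as.character(fixedExpr[[1]]), as.character(ranefExpr[[1]]))
print(topLevelOps)

# The right-hand side of each is a call to `|` whose last argument is the
# grouping column.
rhsFixed <- fixedExpr[[3]]
rhsRanef <- ranefExpr[[3]]
groupCols <- c(as.character(rhsFixed[[3]]), as.character(rhsRanef[[3]]))
print(groupCols)
```

Since the two parse trees have the same shape apart from the top-level operator, a parser would have to rely on that single symbol to tell a per-level fixed effect from a random effect.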

@billdenney
Contributor Author

FYI, this PR update only changes things around the edges (documentation issues, mainly). I'll go through the coding style after the underlying functionality has been reviewed.

@billdenney
Contributor Author

Another thought for generating parameters for a model with a fixed effect per factor level. Should we go all the way down the path of generating orthogonal contrasts for estimation and then back-transforming them? That seems like it would be mathematically the best (in most scenarios), but it gets a lot more cumbersome.

Or, maybe that is already sufficiently covered by nlmixr2extra::preconditionFit().
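
As a rough illustration of the orthogonal-contrast idea, the sketch below estimates a three-level factor effect on the Helmert contrast scale and back-transforms to one value per level. It uses lm() on simulated data as a linear stand-in for a nonlinear mixed-effects fit; the data, the level values, and the choice of Helmert contrasts are all assumptions.

```r
set.seed(1)
d <- data.frame(z = factor(rep(c("a", "b", "c"), each = 4)))
d$y <- c(4, 6, 9)[as.integer(d$z)] + rnorm(nrow(d), sd = 0.1)

# Helmert contrasts: the columns are mutually orthogonal, so the
# off-diagonal entries of their cross-product are zero.
C <- contr.helmert(nlevels(d$z))
offDiag <- crossprod(C)[upper.tri(crossprod(C))]

# Estimate on the contrast scale.
fit <- lm(y ~ z, data = d, contrasts = list(z = "contr.helmert"))

# Back-transform: each row of cbind(1, C) maps the contrast-scale
# coefficients to the fitted value for one factor level.
levelMeans <- drop(cbind(1, C) %*% coef(fit))
names(levelMeans) <- levels(d$z)
print(levelMeans)
```

The extra matrix multiplication is the "more cumbersome" part: users would see contrast-scale parameters during estimation and would need the back-transform to interpret them per level.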

@mattfidler
Member

mattfidler commented May 25, 2022 via email

@billdenney
Contributor Author

I don't understand "all factors are mu referenced".

@billdenney
Contributor Author

It would be good to switch to using the reformulas package for formula processing. https://cran.r-project.org/package=reformulas

@mattfidler
Member

This is entirely up to you

@billdenney
Contributor Author

@mattfidler, I think that this is ready to merge. Do you want to take one more look?

@billdenney billdenney changed the title from "Add nlmixr_formula() (testable work in progress)" to "Add nlmixr_formula()" on Sep 17, 2025
@mattfidler mattfidler requested a review from Copilot September 17, 2025 17:25

Copilot AI left a comment

Pull Request Overview

This PR implements a formula interface for nlmixr2 that allows algebraic models to be estimated using a simplified formula syntax similar to lme4::nlmer(). The interface enables users to specify models without writing full nlmixr2 functions, making it easier to fit simple algebraic solutions.

Key changes:

  • Adds nlmixrFormula() function with formula parsing and model generation capabilities
  • Implements automatic parameter expansion for factor variables
  • Provides support for mixed-effects models with random effects grouping
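
One way to picture "automatic parameter expansion for factor variables" is the base-R sketch below. The helper `expandByFactor()`, its naming scheme (`b.a`, `b.b`), and the indicator-style model line are hypothetical; they are not the PR's implementation, and the sketch does not assert that nlmixr2's model block would accept this exact expression.

```r
# Hypothetical helper: expand one parameter into one fixed effect per factor level.
expandByFactor <- function(paramName, colName, data) {
  stopifnot(is.factor(data[[colName]]))
  lev <- levels(data[[colName]])
  newNames <- paste(paramName, make.names(lev), sep = ".")
  # One indicator term per level; exactly one term is nonzero for each row.
  terms <- sprintf('%s * (%s == "%s")', newNames, colName, lev)
  list(
    names = newNames,
    modelLine = paste(paramName, "<-", paste(terms, collapse = " + "))
  )
}

d <- data.frame(z = factor(c("a", "b", "a")))
res <- expandByFactor("b", "z", d)
print(res$names)
print(res$modelLine)
```

Each generated name would then need its own initial estimate, which is where a single starting value versus one value per level becomes a design question.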

Reviewed Changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| R/nlmixrFormula.R | Main implementation with formula parser and model setup functions |
| tests/testthat/test-nlmixrFormula.R | Comprehensive unit tests for formula parsing and parameter expansion |
| man/*.Rd | Documentation files for the new functions and internal helpers |
| vignettes/nlmixrFormula.Rmd | User guide demonstrating the formula interface with examples |
| NAMESPACE | Exports the new nlmixrFormula() function |
| DESCRIPTION | Updates R dependency and adds vignette builder |

mattfidler and others added 3 commits September 17, 2025 13:41
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@mattfidler
Member

This does look fine to me.

I don't exactly follow what param is supposed to be doing here, and there seem to be no examples in the vignette to help me with this.

I was also trying to see if there was a way to take out nlmixrFormula() and replace it with nlmixr()

@mattfidler
Member

One last question -- do you want to support iov too?

@mattfidler
Member

Like the rest of the nlmixr2 documentation, I have changed addErr to addSd since this is a bit more clear.

I have added a nlmixr2() method for formula that uses nlmixr2Formula.

I still haven't done anything about IOV.

@mattfidler
Member

If you are OK with the changes I made, I am OK merging this.

@billdenney
Contributor Author

I like all your changes. Let me add how to use param in the vignette on top of your changes.

@mattfidler mattfidler requested a review from Copilot September 23, 2025 20:41

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

tests/testthat/test-nlmixrFormula.R:1

  • The comment is incomplete and ends abruptly with 'so that'. Complete the explanation or remove the incomplete comment.
test_that(".nlmixrFormulaParser breaks the formula up into the correct bits", {

Comment on lines +398 to +403
#' @param data The data used in the model
#' @return the interior of the model()
#' @keywords Internal
#' @noRd
#' @author William Denney
.nlmixrFormulaSetupModel <- function(start, predictor, residualModel, predictorVar="value", data) {

Copilot AI Sep 23, 2025

The data parameter is included in the function signature but is not documented in the roxygen comment above, and it's not used in the function body. Either remove the unused parameter or document it properly.

Suggested change
- #' @param data The data used in the model
  #' @return the interior of the model()
  #' @keywords Internal
  #' @noRd
  #' @author William Denney
- .nlmixrFormulaSetupModel <- function(start, predictor, residualModel, predictorVar="value", data) {
+ .nlmixrFormulaSetupModel <- function(start, predictor, residualModel, predictorVar="value") {

stopifnot(length(startValue) %in% c(1, length(paramLabel)))
if (length(startValue) == 1 && !is.ordered(data[[param]])) {
message("ordering the parameters by factor frequency: ", startName, " with parameter ", param)
paramLabel <- names(rev(sort(summary(data[[param]]))))

Copilot AI Sep 23, 2025

[nitpick] This line performs multiple nested operations that could be hard to follow. Consider breaking it into separate steps with descriptive variable names for better readability.

Suggested change
- paramLabel <- names(rev(sort(summary(data[[param]]))))
+ paramSummary <- summary(data[[param]])
+ sortedParamSummary <- sort(paramSummary)
+ reversedSortedParamSummary <- rev(sortedParamSummary)
+ paramLabel <- names(reversedSortedParamSummary)
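
For context on what the line under review computes, here is a small base-R sketch on a toy factor. The factor `x` and the `direct` alternative are illustrative assumptions and are not part of the PR.

```r
x <- factor(c("a", "b", "b", "c", "c", "c"))

# One-liner from the PR: level names ordered from most to least frequent.
nested <- names(rev(sort(summary(x))))

# Hypothetical alternative: a single descending sort on the frequency table.
direct <- names(sort(table(x), decreasing = TRUE))

print(nested)  # "c" "b" "a"
print(direct)  # "c" "b" "a"
```

The two agree here because no counts are tied; with tied counts they may order the tied levels differently, so swapping one for the other would not necessarily be behavior-preserving.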

# Setup the dataset for nonlinear mixed-effects fitting with different types of
# parameters.

# Simulate the equation y = 3*x + 4(when z = 'a') or 6(when z = 'b') with normally-distributed residual error

Copilot AI Sep 23, 2025

Missing space before parenthesis in '4(when'. Should be '4 (when' for proper formatting.

Suggested change
- # Simulate the equation y = 3*x + 4(when z = 'a') or 6(when z = 'b') with normally-distributed residual error
+ # Simulate the equation y = 3*x + 4 (when z = 'a') or 6 (when z = 'b') with normally-distributed residual error
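
The vignette code that follows this comment is not shown in the thread. As a hedged sketch of what simulating that equation could look like in base R, the block below uses an assumed sample size, seed, and residual SD; the column `mu` is added only to keep the noise-free mean visible.

```r
set.seed(5)
d <- data.frame(
  x = rep(1:10, times = 2),
  z = factor(rep(c("a", "b"), each = 10))
)
# Intercept depends on the factor level: 4 when z == "a", 6 when z == "b".
d$mu <- 3 * d$x + ifelse(d$z == "a", 4, 6)
# Normally distributed residual error (the SD of 0.5 is an assumption).
d$y <- d$mu + rnorm(nrow(d), mean = 0, sd = 0.5)

head(d)
```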

@mattfidler
Member

Is this ready for review/merging, Bill?

@billdenney
Contributor Author

Not quite yet. There are some issues with param that I need to fix.

Development

Successfully merging this pull request may close these issues.

Feature Request: nlmixr2.formula() function

3 participants