Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend assignment_operator to optionally allow '=' assignments #2698

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

MichaelChirico
Copy link
Collaborator

Closes #2441. Tests are partially copied from #2521 -- for implementation I'm starting from scratch rather than trying to re-work it based on the discussion there. Thanks @J-Moravec for getting things started & the fruitful discussion!

Pausing here for, well, even more discussion :)

I'm getting hung up on the interaction of the proposed argument with the other arguments, particularly allow_right_assign and allow_pipe_assign.

If we have (allow_right_assign=FALSE, top_level_operator="any"), that does make x -> 2 a bit ambiguous, right?

I can think of two options:

  1. Subsume the other options into this new one by allowing top_level_operator = c("<-", "->", "%<>%") to be equivalent to the current (allow_right_assign=TRUE, allow_pipe_assign=TRUE). That looks pretty transparent to me, but presents some back-compatibility issues.
  2. Ditch any in favor of either, i.e. either <- or =.

WDYT @IndrajeetPatil @AshesITR?

@AshesITR
Copy link
Collaborator

AshesITR commented Dec 9, 2024

From an interface standpoint I think top_level_operator should be called assignment_operator and "any" shouldn't be an option.

Maybe it's worth exploring what = assignment style should look like for inner assignments and reduce flexibility a bit when it comes to them.

@IndrajeetPatil
Copy link
Collaborator

I agree that we shouldn't support "any" as an option. That is effectively turning off the linter, which they can already do via the config file.

@MichaelChirico
Copy link
Collaborator Author

That is effectively turning off the linter, which they can already do via the config file.

Well, not exactly. It would still retain the remaining options -- even if we subsume all of allow_cascading_assign, allow_right_assign, allow_pipe_assign into one operator= argument, there's still the allow_trailing= argument that's independent of the assignment operator.

But maybe operator=NULL is the better way to say "any operator".

I think top_level_operator should be called assignment_operator

I didn't like the redundancy of assignment_linter(assignment_operator=...).

Maybe it's worth exploring what = assignment style should look like for inner assignments and reduce flexibility a bit when it comes to them.

I really wanted to punt here to make progress on this linter in the next release & enable using = even at some basic level. The discussion to resolving that in #2521 stalled a few times over 10 months. Not requiring <- is relatively highly requested; beyond the existing 👍 there I know we'd use it in {data.table} and I think {mlr} would use it too.

Using anything but <- for implicit assignments is already pretty niche, and implicit assignments are rare to begin with. It probably makes most sense to add a similar operator= argument there with similar behavior, but for implicit assignments. Glancing at whether it's prudent to subsume implicit_assignment_linter() into assignment_linter(), I'd say no -- it's not immediately clear how we'd cleanly include the three existing arguments to implicit_assignment_linter() in the assignment_linter() API.

@AshesITR
Copy link
Collaborator

I like the sound of operator = c("<-", "=", NA).

NULL can't be included in c() IINM.

@MichaelChirico
Copy link
Collaborator Author

I like the sound of operator = c("<-", "=", NA).

NULL can't be included in c() IINM.

what would NA mean? I was thinking 'NULL<->any operator' so it wouldn't need to be mixed with other strings & could be handled separately.

@AshesITR
Copy link
Collaborator

NA would mean any (i.e. disable to-level checks) but it allows exposition of the argument, as well as checking via match.arg(). I see no benefit for using NULL instead - the signature would have to hide this option. NA to disable this part of the linter seems somewhat more intuitive than "any".

@MichaelChirico
Copy link
Collaborator Author

I see, basically it's to facilitate match.arg(), makes sense. OTOH, it's not clear how we'd implement it clearly.

  • operator = c("<-", "=", ..., NA); match.arg(operator) --> the current behavior. But doesn't allow length(operator) > 1L.
  • operator = c("<-", "=", ..., NA); match.arg(operator, several.ok=TRUE) --> allow length(operator)>1L, but doesn't match current behavior.
  • operator = "<-"; match.arg(operator, c("<-", "=", ..., NA), several.ok=TRUE) --> allow length(operator)>1L, match current behavior, but the signature is no longer transparent about what operators are allowed.

I think NULL is still an option, then -- AFAICT there's no way to put NA (or NULL) in the signature for the behavior we want. And NULL cleans up the implementation a little (is.null(operator) vs. length(operator) == 1L && is.na(operator) or identical(operator, NA) but then NA_character_ won't work, something I always found awkward about testthat::expect_error(x, NA)).

I'm not sure there's much expositional difference between stating "NULL means 'any operator'" vs. "NA [logical or any type] means 'any operator'".

@J-Moravec
Copy link

Imho, assignment_operator = "any" is more readable than NA. What would not available mean operator mean? Null would be fine as way to turn off that part.

But in base, both are used as way to get some default behaviour or turn out the behaviour completely. I would probably not put NA in a signature, not using tidyverse, but can't say I saw it in base.

@MichaelChirico
Copy link
Collaborator Author

MichaelChirico commented Dec 12, 2024

In {base}, there's most saliently match(nomatch=NA), sort(na.last=NA), nchar(keepNA=NA), but 28 hits overall:

NULL # ensure .Last.value is NULL
getNamespaceExports('base') |>
  sapply(function(fn) {
    f <- get(fn)
    if (!is.function(f)) return(FALSE)
    f |>
      formals() |>
      sapply(\(arg) is.atomic(arg) && length(arg) == 1L && is.na(arg)) |>
      any()
  }) |>
  which() |>
  length()

37 more in other default packages.

@J-Moravec
Copy link

In {base}, there's most saliently match(nomatch=NA), sort(na.last=NA), nchar(keepNA=NA), but 28 hits overall:

NULL # ensure .Last.value is NULL
getNamespaceExports('base') |>
  sapply(function(fn) {
    f <- get(fn)
    if (!is.function(f)) return(FALSE)
    f |>
      formals() |>
      sapply(\(arg) is.atomic(arg) && length(arg) == 1L && is.na(arg)) |>
      any()
  }) |>
  which() |>
  length()

37 more in other default packages.

Yes, but correct me if I am wrong, but those are for single value. For instance, various plot options take some default value if they are NULL, but can be suppressed with NA (e.g., border in polygon(), the default NULL is par("fg"), but NA allows for not plotting border at all).

I haven't seen NA in multiple options. martch.arg is bit finicky with NAs and requires NA_character_ specifically.

opts = c("foo", "bar", NA) # NA converted to NA_character_ so fine
match.arg(NA, opts) # Error ... : 'arg' must be NULL or a character vector
match.arg(NA_character_, opts) # fine!

Maybe you have mentioned this in #2698 (comment), its a late here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend assignment_linter() to enforce '=' by default for assignment
4 participants