stibu81
Contributor

@stibu81 stibu81 commented Sep 19, 2025

This introduces two negated expectations as suggested in #1851 with the following functionality:

  • expect_not_contains(x, y) tests that x contains none of the elements of y (i.e. y is disjoint from x).
  • expect_not_in(x, y) tests that no element of x is in y (i.e. x is disjoint from y).

While the non-negated expectations, expect_contains() and expect_in(), test different things, these two negated versions are equivalent. It might still make sense to have both.
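
The equivalence follows from disjointness being symmetric. Here is a minimal base R sketch, assuming the element-wise definitions above; `none_in()` is a hypothetical helper used only for illustration and is not part of testthat:

```r
# Hypothetical helper (not a testthat function): TRUE when no element of
# `needles` occurs in `haystack`.
none_in <- function(needles, haystack) {
  !any(needles %in% haystack)
}

x <- c("a", "b", "c")
y <- c("d", "e")
z <- c("c", "d")

# "x contains none of y" and "no element of x is in y" agree in both cases,
# because an empty intersection does not depend on argument order.
not_contains_xy <- none_in(y, x)  # idea behind expect_not_contains(x, y)
not_in_xy <- none_in(x, y)        # idea behind expect_not_in(x, y)
not_contains_xz <- none_in(z, x)
not_in_xz <- none_in(x, z)
```

Swapping the arguments never changes the result here, which is why a single implementation can serve both names.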

During implementation I realised that one might have different expectations from these names. For example, one might expect that expect_not_in(x, y) checks that:

  • none of the elements of x are in y (which is what I implemented)
  • x is not a subset of y

Both of them could also meaningfully be understood as inversions of the other two expectations. Would the second variant also be of interest?

Let me know if anything should be improved.

@hadley
Member

hadley commented Oct 6, 2025

@stibu81 my inclination would be to define the tests like this:

  • expect_not_contains(x, y) tests that x contains no element of y (i.e. y is not a subset of x).
  • expect_not_in(x, y) tests that no element of x is in y (i.e. x is not a subset of y).

(i.e. just replacing "every" with "no", and adding "not" before subset). Does that make sense to you or have I confused myself? (As I do whenever I look at these functions)

@lionel-
Member

lionel- commented Oct 7, 2025

How about a single expect_disjoint() function?

@stibu81
Contributor Author

stibu81 commented Oct 7, 2025

@hadley It is very confusing, yes. I think the confusion has to do with what I briefly mentioned at the end of my original post: there are two reasonable ways to think about these functions: in terms of single elements or in terms of sets. In the case of expect_in(), these two ways are equivalent, but with expect_not_in() they are not.

For expect_in(x, y), they can be formulated as follows:

  • elements: check that every element of x is in y
  • set: check that the entire set x is in y

These two statements turn out to say exactly the same thing (x is a subset of y), but for expect_not_in(x, y) they differ:

  • elements: check that no element of x is in y (x is disjoint from y).
  • set: check that the set x is not in y (x is not a subset of y).

As an example, expect_not_in(c(3, 4), 1:3) would fail in the first case because 3 is in 1:3, but it would succeed in the second case, because c(3, 4) is not a subset of 1:3. Only this second case is the exact inverse of expect_in() in that it succeeds precisely when the other fails. But (at least to me), the element-wise check seems more natural and more useful.
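
The example can be checked directly in base R. This is only a sketch of the two readings, not the implementation from the PR; the variable names are made up for illustration:

```r
x <- c(3, 4)
y <- 1:3

# Element-wise reading: succeed only if no element of x is in y.
elementwise_ok <- !any(x %in% y)

# Set reading: succeed whenever x is not a subset of y.
set_ok <- !all(x %in% y)

elementwise_ok  # FALSE, because 3 is in 1:3
set_ok          # TRUE, because 4 is not in 1:3
```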

Your description of the tests mixes those two distinct ways of understanding the functions, so I think it is not correct.

The name suggested by @lionel- is much clearer and cannot be misunderstood, so I prefer that one. From the name, it is less obvious that this is a kind of inverse to expect_in() and expect_contains(), which someone might still try to find. But as I have said above, my implementation is also not the exact inverse of expect_in(), so it might actually be better to not imply that it is.

What is your preferred way forward? Should I replace the two functions by expect_disjoint()? And do you still see any justification to implement the "set-variants" of the functions (which would be difficult to name properly, I think)?

@DavisVaughan
Member

When I think of these, I do think the "elements" based approach mentioned above is what I'd expect them to do. I think pictures are useful here:

[Image: IMG_7850]

What falls out from these pictures is that not-contains and not-in would use the same implementation when defined this way. I do think their error messages would probably be a little different:

# Not contains
`actual` contains some of the values in `unexpected`

# Not in
Some values of `actual` are in `unexpected`

And of course you'd provide the arguments in different orders, expect_not_contains(haystack, needles) vs expect_not_in(needles, haystack).

But I do really like what @lionel- suggested here. I think expect_disjoint():

  • Is a very clear name. In particular I prefer to have a positive assertion over a not assertion.
  • Has no ambiguity about whether the vectors must be partially or fully separated (I think it implies fully disjoint with no overlap at all)
  • Is nice because it's a single function, capturing how the implementations are the same between the two of them.

I don't think the argument order actually matters all that much. Reporting something like this feels like it would be good enough for all use cases:

{act$lab} (`actual`) and {exp$lab} (`expected`) are not disjoint.
* Present in both `values(intersect(act$val, exp$val))`
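
The values present in both vectors are their intersection. A rough base R sketch of how such a message could be assembled follows; `disjoint_message()` is a hypothetical helper, and the formatting does not use testthat's internal helpers:

```r
# Hypothetical helper, not testthat's implementation: returns NULL when the
# vectors are disjoint, otherwise a message listing the shared values.
disjoint_message <- function(actual, expected) {
  common <- intersect(actual, expected)
  if (length(common) == 0) {
    return(NULL)  # nothing in common, so nothing to report
  }
  paste0(
    "`actual` and `expected` are not disjoint.\n",
    "* Present in both: ", paste(common, collapse = ", ")
  )
}

msg <- disjoint_message(c(1, 2, 3), c(3, 4))
no_msg <- disjoint_message(1:2, 3:4)
```

Because intersect() is symmetric in the values it returns, the message stays meaningful whichever argument is treated as the haystack.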

@stibu81
Contributor Author

stibu81 commented Oct 10, 2025

@DavisVaughan I think we agree, then. This would replace the two functions that I implemented with a single one that does exactly the same thing, but has a clearer name and produces slightly different output. But it would not be the exact inverse of expect_in() or expect_contains().

And I would not implement the set-variants of expect_not_in() and expect_not_contains().

Is it ok for me to go ahead or should I wait for a comment from @hadley?

@DavisVaughan
Member

DavisVaughan commented Oct 10, 2025

I think you can go ahead!

@hadley
Member

hadley commented Oct 10, 2025

Plan sounds good to me!

@stibu81
Contributor Author

stibu81 commented Oct 10, 2025

Second attempt, now with expect_disjoint(). I tried to keep the documentation and the failure message in the spirit of the other expectations in the "setequal-group".

Member

@DavisVaughan DavisVaughan left a comment

@hadley looks good to me and the implementation matches the spirit of expect_in() - I'll let you be the final approver and merge-er

)
msg_act <- c(
sprintf("Actual: %s", values(act$val)),
sprintf("Expected: none of %s", values(exp$val)),
Member

Suggested change
- sprintf("Expected: none of %s", values(exp$val)),
+ sprintf("Expected: None of %s", values(exp$val)),

I think I like having this capitalized more


expect_snapshot_failure(expect_disjoint(x1, x2))
expect_snapshot_failure(expect_disjoint(x1, x3))
})
Member

Might be useful to have a test for expect_disjoint(c("a", NA), NA) to test that missing values are matched exactly?
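
For context, here is a small base R sketch of why this case is worth covering. It assumes the check is built on `%in%` (or `match()`), which the thread does not confirm; the variable names are illustrative only:

```r
actual <- c("a", NA)
unexpected <- NA

# %in% is built on match(), which treats NA as matching NA.
in_unexpected <- actual %in% unexpected
overlap <- any(in_unexpected)

# By contrast, == propagates NA instead of reporting a match.
eq <- actual == unexpected
```

Under that assumption the two vectors are not disjoint, so expect_disjoint(c("a", NA), NA) would be expected to fail.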

Contributor Author

Thanks for the review. I made the requested changes. In doing so, I might have stumbled onto something else: expect_failure() does not succeed for expect_disjoint() and some other functions in this file, e.g.:

expect_failure(expect_in(3, 5))
## Error: Expected zero successes.
## Actually succeeded 1 times

I think that the reason is that the call of fail() is not inside return() for some functions, such that the later pass() is also executed. expect_snapshot_failure() seems to be ok with this but not expect_failure().

I could fix those missing return()s, but I'm not sure that it is good to mix this into this PR that is about something else.
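
The effect of a missing return() can be illustrated with a toy model in base R. `toy_fail()` and `toy_pass()` are hypothetical stand-ins that only record an outcome; they are not testthat's fail() and pass():

```r
results <- character()

# Hypothetical recorders standing in for a failure and a success signal.
toy_fail <- function() {
  results <<- c(results, "failure")
  invisible(NULL)
}
toy_pass <- function() {
  results <<- c(results, "success")
  invisible(NULL)
}

# Without return(): execution continues past the failure branch.
check_no_return <- function(ok) {
  if (!ok) {
    toy_fail()
  }
  toy_pass()
}

# With return(): the function exits right after recording the failure.
check_with_return <- function(ok) {
  if (!ok) {
    return(toy_fail())
  }
  toy_pass()
}

results <- character()
check_no_return(FALSE)
without_return <- results  # both "failure" and "success" are recorded

results <- character()
check_with_return(FALSE)
with_return <- results     # only "failure" is recorded
```

In this model, a checker that counts successes would see one success from `check_no_return(FALSE)`, which mirrors the "Expected zero successes" error quoted above.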

Member

@stibu81 that's because the expectation style has changed since you started working on this PR 😬 I've updated your expectation to the new style and expect_failure(expect_in(3, 5)) now correctly passes.

Contributor Author

Seems I picked a bad moment for this... 😆 Thanks for fixing it.

@DavisVaughan DavisVaughan changed the title from "expect_not_contains() and expect_not_in()" to "Implement expect_disjoint()" Oct 10, 2025
@hadley hadley merged commit ae5dda6 into r-lib:main Oct 10, 2025
14 of 15 checks passed
4 participants