Conversation

@hadley hadley commented Nov 5, 2025

First draft

Member

@lionel- lionel- left a comment
Nice release!


Overall it was a successful experiment. It helped me close over 100 issues in what felt like less time than usual. I don't have any hard numbers, but my gut feeling is that it was maybe a 10-20% improvement to my development velocity. This is still significant, especially since I'm an experienced R programmer and my workflow has been pretty stable for the last few years. I mostly used Claude for smaller, well-defined tasks where I had a good sense of what was needed. I found it particularly useful for refactoring, where it was easy to say precisely what I wanted, but executing the changes required a bunch of fiddly edits across many files.

I also found it generally useful for getting over the "activation energy hump": there were a few issues that had been stagnating for years because they felt like they were going to be hard to do and with relatively limited payoff. I let Claude Code loose on a few of these and found it super useful. It only produced code I was really happy with a couple of times, but every time it gave me something to react to (often with strong negative feelings!) and that got me started actually engaging with the problem.
Member
Anger-driven engagement algorithm for coders


* All expectations have new failure messages: they now state what was expected, what was actually received, and, if possible, they clearly illustrate the difference.

* Expectations now consistently return the value of the first argument, regardless of whether the expectation succeeds or fails (the only exception is `expect_error()` and friends which return the captured condition). This is a relatively subtle change that won't affect tests that already pass, but it does improve failures when you pipe together multiple expectations.
Member
Suggested change
* Expectations now consistently return the value of the first argument, regardless of whether the expectation succeeds or fails (the only exception is `expect_error()` and friends which return the captured condition). This is a relatively subtle change that won't affect tests that already pass, but it does improve failures when you pipe together multiple expectations.
* Expectations now consistently return the value of the first argument, regardless of whether the expectation succeeds or fails (the only exception is `expect_error()` and friends which return the captured condition so that you can perform additional assertions on the condition object). This is a relatively subtle change that won't affect tests that already pass, but it does improve failures when you pipe together multiple expectations.
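A sketch of the piping behaviour this bullet describes; the chain itself is illustrative, using the long-standing `expect_type()` and `expect_length()` expectations:

```r
library(testthat)

test_that("expectations pipe, each returning its first argument", {
  c(1, 2, 3) |>
    expect_type("double") |>  # passes and returns c(1, 2, 3)
    expect_length(3)
})
```

Because each expectation hands its first argument on, a failure partway through the chain can now report the actual value being tested.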

## Other new features
* testthat generally does a better job of handling nested tests, aka subtests, where you put a `test_that()` inside another `test_that()`, or more typically `it()` inside of `describe()`. Subtests will now generate more informative failure messages, free from duplication, with more informative skips if any subtests don't contain any expectations.
Member
I thought nesting test_that() was mostly useful for testing the testthat package with itself, but this paragraph makes it sound like there might be user-oriented cases where this is helpful? If that's the case, it might be interesting to add a sentence explaining the use case, otherwise a sentence explaining this is mostly for internal testing.
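For readers unfamiliar with the BDD-style syntax the bullet mentions, a minimal `it()`-inside-`describe()` subtest might look like this (the tested behaviour is just base `sum()`):

```r
library(testthat)

describe("sum()", {
  it("adds numbers", {
    expect_equal(sum(1, 2), 3)
  })
  it("returns 0 for no arguments", {
    expect_equal(sum(), 0)
  })
})
```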

* `vignette("mocking")` explains mocking in detail, and new `local_mocked_s3_method()`, `local_mocked_s4_method()`, and `local_mocked_r6_class()` make it easier to mock S3 and S4 methods and R6 classes.
* `test_dir()`, `test_check()`, and friends gain a `shuffle` argument that uses `sample()` to randomly reorder the top-level expressions in each test file. This random reordering surfaces dependencies between tests and code outside of any test, as well as dependencies between tests, helping you find and eliminate unintentional dependencies.
Member
That's nice

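A hypothetical invocation of the shuffle feature; the release notes name the `shuffle` argument, but its exact type (assumed here to be a logical flag) is a guess:

```r
# Assumed interface: shuffle = TRUE randomly reorders the
# top-level expressions in each test file via sample().
testthat::test_dir("tests/testthat", shuffle = TRUE)
```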
* `try_again()` is now publicized, as it's a useful tool for testing flaky code:
Member

Interesting, might be worth a note that this should still be skipped on CRAN? And include a skip_on_cran() in the example.

Comment on lines +154 to +163
```r
test_that("my flaky test is ok", {
  # 10% chance of failure:
  expect_equal(flaky_function(), 1)
  # 1% chance of failure:
  try_again(1, expect_equal(flaky_function(), 1))
  # 0.1% chance of failure:
  try_again(2, expect_equal(flaky_function(), 1))
})
```
Member

I was confused at first because I thought this was how you're meant to use it, but it makes more sense that this shows three different usages. Maybe split it into three different test_that() calls and move the comments into the titles, e.g. "my flaky test is ok, 10% chance of failure".

Contributor

I also got confused by this.
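Combining the suggestions above (one `test_that()` per usage, comments moved into the titles, plus a `skip_on_cran()` guard), the example might become:

```r
test_that("my flaky test is ok, 1% chance of failure", {
  skip_on_cran()
  try_again(1, expect_equal(flaky_function(), 1))
})

test_that("my flaky test is ok, 0.1% chance of failure", {
  skip_on_cran()
  try_again(2, expect_equal(flaky_function(), 1))
})
```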

* New `skip_unless_r()` skips tests on unsuitable versions of R. It has a convenient syntax so you can use, e.g., `skip_unless_r(">= 4.1.0")` to skip tests that require `...names()`.
Member

so long, `skip_if_not_installed("base", "4.1.0")`
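A sketch of the new helper in context, guarding a test that needs `...names()` (available since R 4.1.0); `f()` is a made-up example function:

```r
test_that("...names() captures argument names", {
  skip_unless_r(">= 4.1.0")
  f <- function(...) ...names()
  expect_equal(f(a = 1, b = 2), c("a", "b"))
})
```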

* New `SlowReporter` makes it easier to find the slowest tests in your package. You can run it with `devtools::test(reporter = "slow")`.
* New `vignette("challenging-functions")` provides an index to other documentation organized by various challenges.
Member

To use the new features, do you recommend bumping the version of testthat in Suggests? Might be a good place to mention it.

Unfortunately pkgload only checks for Imports not Suggests, so bumping the dep won't trigger the install prompt on load.
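The DESCRIPTION change being suggested here would look something like the following; the version number is a placeholder, not the actual release version:

```
Suggests:
    testthat (>= 3.3.0)
```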

Contributor

@teunbrand teunbrand left a comment
It makes me excited to try out the new features!


## Claude Code experiences
Contributor

At this point, it feels like the reader is all prepped to go read about the changes in testthat, but we're taking a detour through this Claude experience first. Perhaps it would feel more natural as a wrap-up at the end?

