Conversation

@hadley hadley commented Nov 5, 2025

First draft

Member

@lionel- lionel- left a comment
Nice release!


Overall it was a successful experiment. It helped me close over 100 issues in what felt like less time than usual. I don't have any hard numbers, but my gut feeling is that it was maybe a 10-20% improvement to my development velocity. This is still significant, especially since I'm an experienced R programmer and my workflow has been pretty stable for the last few years. I mostly used Claude for smaller, well-defined tasks where I had a good sense of what was needed. I found it particularly useful for refactoring, where it was easy to say precisely what I wanted, but executing the changes required a bunch of fiddly edits across many files.

I also found it generally useful for getting over the "activation energy hump": there were a few issues that had been stagnating for years because they felt like they were going to be hard to do and with relatively limited payoff. I let Claude Code loose on a few of these and found it super useful. It only produced code I was really happy with a couple of times, but every time it gave me something to react to (often with strong negative feelings!) and that got me started actually engaging with the problem.
Member
Anger-driven engagement algorithm for coders


* All expectations have new failure messages: they now state what was expected, what was actually received, and, if possible, they clearly illustrate the difference.

* Expectations now consistently return the value of the first argument, regardless of whether the expectation succeeds or fails (the only exception is `expect_error()` and friends which return the captured condition). This is a relatively subtle change that won't affect tests that already pass, but it does improve failures when you pipe together multiple expectations.
Member
Suggested change
* Expectations now consistently return the value of the first argument, regardless of whether the expectation succeeds or fails (the only exception is `expect_error()` and friends which return the captured condition). This is a relatively subtle change that won't affect tests that already pass, but it does improve failures when you pipe together multiple expectations.
* Expectations now consistently return the value of the first argument, regardless of whether the expectation succeeds or fails (the only exception is `expect_error()` and friends which return the captured condition so that you can perform additional assertions on the condition object). This is a relatively subtle change that won't affect tests that already pass, but it does improve failures when you pipe together multiple expectations.
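A sketch of the piping behaviour this bullet describes; the chain itself is illustrative, using the long-standing `expect_type()` and `expect_length()` expectations:

```r
library(testthat)

test_that("expectations pipe, each returning its first argument", {
  c(1, 2, 3) |>
    expect_type("double") |>  # passes and returns c(1, 2, 3)
    expect_length(3)
})
```

Because each expectation hands its first argument on, a failure partway through the chain can now report the actual value being tested.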

## Other new features
* testthat generally does a better job of handling nested tests, aka subtests, where you put a `test_that()` inside another `test_that()`, or more typically `it()` inside of `describe()`. Subtests will now generate more informative failure messages, free from duplication, with more informative skips if any subtests don't contain any expectations.
Member
I thought nesting test_that() was mostly useful for testing the testthat package with itself, but this paragraph makes it sound like there might be user-oriented cases where this is helpful? If that's the case, it might be interesting to add a sentence explaining the use case, otherwise a sentence explaining this is mostly for internal testing.
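For readers unfamiliar with the BDD-style syntax the bullet mentions, a minimal `it()`-inside-`describe()` subtest might look like this (the tested behaviour is just base `sum()`):

```r
library(testthat)

describe("sum()", {
  it("adds numbers", {
    expect_equal(sum(1, 2), 3)
  })
  it("returns 0 for no arguments", {
    expect_equal(sum(), 0)
  })
})
```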

* `vignette("mocking")` explains mocking in detail, and new `local_mocked_s3_method()`, `local_mocked_s4_method()`, and `local_mocked_r6_class()` make it easier to mock S3 and S4 methods and R6 classes.
* `test_dir()`, `test_check()`, and friends gain a `shuffle` argument that uses `sample()` to randomly reorder the top-level expressions in each test file. This random reordering surfaces dependencies between tests and code outside of any test, as well as dependencies between tests, helping you find and eliminate unintentional dependencies.
Member
That's nice

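A hypothetical invocation of the shuffle feature; the release notes name the `shuffle` argument, but its exact type (assumed here to be a logical flag) is a guess:

```r
# Assumed interface: shuffle = TRUE randomly reorders the
# top-level expressions in each test file via sample().
testthat::test_dir("tests/testthat", shuffle = TRUE)
```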
* `try_again()` is now publicized, as it's a useful tool for testing flaky code:
Member

Interesting, might be worth a note that this should still be skipped on CRAN? And include a skip_on_cran() in the example.

Comment on lines +154 to +163
```r
test_that("my flaky test is ok", {
  # 10% chance of failure:
  expect_equal(flaky_function(), 1)
  # 1% chance of failure:
  try_again(1, expect_equal(flaky_function(), 1))
  # 0.1% chance of failure:
  try_again(2, expect_equal(flaky_function(), 1))
})
```
Member

I was confused at first because I thought this was how you're meant to use it, but it makes more sense that this shows three different usages. Maybe split it into three different test_that() calls and move the comments into the titles, e.g. "my flaky test is ok, 10% chance of failure".

Contributor

I also got confused by this.
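Combining the suggestions above (one `test_that()` per usage, comments moved into the titles, plus a `skip_on_cran()` guard), the example might become:

```r
test_that("my flaky test is ok, 1% chance of failure", {
  skip_on_cran()
  try_again(1, expect_equal(flaky_function(), 1))
})

test_that("my flaky test is ok, 0.1% chance of failure", {
  skip_on_cran()
  try_again(2, expect_equal(flaky_function(), 1))
})
```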

* New `skip_unless_r()` skips tests on unsuitable versions of R. It has a convenient syntax so you can use, e.g., `skip_unless_r(">= 4.1.0")` to skip tests that require `...names()`.
Member

so long, `skip_if_not_installed("base", "4.1.0")`
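A sketch of the new helper in context, guarding a test that needs `...names()` (available since R 4.1.0); `f()` is a made-up example function:

```r
test_that("...names() captures argument names", {
  skip_unless_r(">= 4.1.0")
  f <- function(...) ...names()
  expect_equal(f(a = 1, b = 2), c("a", "b"))
})
```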

* New `SlowReporter` makes it easier to find the slowest tests in your package. You can run it with `devtools::test(reporter = "slow")`.
* New `vignette("challenging-functions")` provides an index to other documentation organized by various challenges.
Member

To use the new features, do you recommend bumping the version of testthat in Suggests? Might be a good place to mention it.

Unfortunately pkgload only checks for Imports not Suggests, so bumping the dep won't trigger the install prompt on load.
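The DESCRIPTION change being suggested here would look something like the following; the version number is a placeholder, not the actual release version:

```
Suggests:
    testthat (>= 3.3.0)
```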

Contributor

@teunbrand teunbrand left a comment
It makes me excited to try out the new features!


## Claude Code experiences
Contributor

At this point, it feels like the reader is all prepped to go read about the changes in testthat, but we're taking a detour through this Claude experience first. Perhaps it would feel more natural as a wrap-up at the end?

