tidyverse · hadley · Nov 5, 2025 · Nov 5, 2025 · Nov 6, 2025 · Nov 6, 2025
diff --git a/.gitignore b/.gitignore
@@ -6,4 +6,7 @@ public
 .DS_Store
 README.html
 
+# Don't accidentally commit quarto generated files
 /.quarto/
+content/**/index_files/**
+content/**/index.html
diff --git a/content/blog/testthat-3-3-0/index.Rmd b/content/blog/testthat-3-3-0/index.Rmd
@@ -0,0 +1,200 @@
+---
+output: hugodown::hugo_document
+
+slug: testthat-3-3-0
+title: testthat 3.3.0
+date: 2025-11-05
+author: Hadley Wickham
+description: >
+    testthat 3.3.0 brings improved expectations with better error messages,
+    new expectations for common testing patterns, and lifecycle changes including the removal of `local_mock()` and `with_mock()`. It also includes
+    a write-up of my experience doing package development with Claude Code.
+photo:
+  url: https://unsplash.com/photos/a-rack-filled-with-lots-of-yellow-hard-hats-wp81DxKUd1Ez
+  author: Pop & Zebra
+
+# one of: "deep-dive", "learn", "package", "programming", "roundup", or "other"
+categories: [package] 
+tags: [testthat, devtools]
+---
+
+```{=html}
+<!--
+TODO:
+* [x] Look over / edit the post's title in the yaml
+* [x] Edit (or delete) the description; note this appears in the Twitter card
+* [x] Pick category and tags (see existing with `hugodown::tidy_show_meta()`)
+* [x] Find photo & update yaml metadata
+* [x] Create `thumbnail-sq.jpg`; height and width should be equal
+* [x] Create `thumbnail-wd.jpg`; width should be >5x height
+* [x] `hugodown::use_tidy_thumbnails()`
+* [x] Add intro sentence, e.g. the standard tagline for the package
+* [x] `usethis::use_tidy_thanks()`
+-->
+```
+
+```{r}
+#| include: false
+knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
+options(cli.width = 70)
+```
+
+We're chuffed to announce the release of [testthat](https://testthat.r-lib.org) 3.3.0. testthat is a testing framework for R that makes it easy to turn your existing informal tests into formal, automated tests that you can rerun quickly and easily.
+
+You can install it from CRAN with:
+
+```{r, eval = FALSE}
+install.packages("testthat")
+```
+
+This blog post highlights the most important changes in this release, including lifecycle changes that removed long-deprecated mocking functions, improvements to expectations and their error messages, and a variety of new features that make testing easier and more robust. You can see a full list of changes in the [release notes](https://github.com/r-lib/testthat/releases/tag/v3.3.0).
+
+```{r setup}
+library(testthat)
+```
+
+## Claude Code experiences
+
+Before we dive into the changes, I wanted to talk a little bit about some changes to my development process, as I used this release as an opportunity to learn [Claude Code](https://www.claude.com/product/claude-code). This is the first package where I've really used AI to support the development of many features and I thought it might be useful to share my experience.
+
+Overall it was a successful experiment. It helped me close over 100 issues in what felt like less time than usual. I don't have any hard numbers, but my gut feeling is that it was maybe a 10-20% improvement to my development velocity. This is still significant, especially since I'm an experienced R programmer and my workflow has been pretty stable for the last few years. I mostly used Claude for smaller, well-defined tasks where I had a good sense of what was needed. I found it particularly useful for refactoring, where it was easy to say precisely what I wanted, but executing the changes required a bunch of fiddly edits across many files.
+
+I also found it generally useful for getting over the "activation energy hump": there were a few issues that had been stagnating for years because they felt like they were going to be hard to do and with relatively limited payoff. I let Claude Code loose on a few of these and found it super useful. It only produced code I was really happy with a couple of times, but every time it gave me something to react to (often with strong negative feelings!) and that got me started actually engaging with the problem.
+
+If you're interested in using Claude Code yourself, there are a couple of files you might find useful. My [`CLAUDE.md`](https://github.com/r-lib/testthat/blob/main/.claude/CLAUDE.md) tells Claude how to execute a devtools-based workflow, along with a few pointers to resolve common issues. My [`settings.json`](https://github.com/r-lib/testthat/blob/main/.claude/settings.json) allows Claude to run longer without human intervention, doing things that should mostly be safe. One note of caution: these settings allow Claude to run R code, which means it can do practically anything. In my experience, Claude only used R to run tests or build documentation.
+
+I also experimented with using Claude Code to review PRs. It was just barely useful enough that I kept it turned on for my own PRs, but I didn't bother trying to get it to work for contributed PRs. Most of the time it either gave a thumbs up or bad advice, but every now and then it would pick up a small error.
+
+(I've also used Claude Code to proofread this blog post!)
+
+## Lifecycle changes
+
+The biggest change in this release is that `local_mock()` and `with_mock()` are defunct. They were deprecated in 3.0.0 (2020-10-31) because it was becoming clear that the technique that made them work would be disallowed in a future version of R. This has now happened in R 4.5.0, so the functions have been removed. Removing `local_mock()` and `with_mock()` was a fairly disruptive change, affecting ~100 CRAN packages, but it had to be done. I've been notifying package developers since January so that everyone had plenty of time to update. Fortunately, the needed changes are generally small, since the newer `local_mocked_bindings()` and `with_mocked_bindings()` cover most use cases. (If you haven't heard of mocking before, you can read the new `vignette("mocking")` to learn what it is and why you might want to use it.)
+
+Other lifecycle changes:
+
+* testthat now requires R 4.1. This follows [our supported version policy](https://tidyverse.org/blog/2019/04/r-version-support/), which documents our commitment to support five versions of R (the current version and four previous versions). We're excited to be able to finally take advantage of the base pipe and compact anonymous functions (i.e. `\(x) x + 1`)!
+
+* `is_null()`/`matches()`, deprecated in 2.0.0 (2017-12-19), and `is_true()`/`is_false()`, deprecated in 2.1.0 (2019-04-23), have been removed. These conflicted with other tidyverse functions so we pushed their deprecation through, even though we have generally left the old `test_that()` API untouched.
+
+* `expect_snapshot(binary)`, soft deprecated in 3.0.3 (2021-06-16), is now fully deprecated. `test_files(wrap)`, deprecated in 3.0.0 (2020-10-31), has now been removed.
+
+* There were a few other changes that broke existing packages. The most impactful change was to start checking the inputs to `expect()` which, despite the name, is actually an internal helper. That revealed a surprising number of packages were accidentally using `expect()` instead of `expect_true()` or `expect_equal()`. We don't technically consider this a breaking change because it revealed off-label function usage: the function API hasn't changed; you just now learn when you're using it incorrectly.
+
+If you're interested in the process we use to manage the release of a package that breaks its reverse dependencies, you might like to read [the issue](https://github.com/r-lib/testthat/issues/2021) where I track all the problems and prepare PRs to fix them.
+
+## Expectations and the interactive testing experience
+
+A lot of work in this release was prompted by an overhaul of `vignette("custom-expectations")`, which describes how to create your own expectations that work just like testthat's. This has been a long time coming, and as I was working on it, I realized that I didn't really know how to write new expectations, which had led to a lot of variation in the existing implementations. This kicked off a bunch of experimentation and iteration, leading to a swath of improvements:
+
+* All expectations have new failure messages: they now state what was expected, what was actually received, and, if possible, they clearly illustrate the difference.
+
+* Expectations now consistently return the value of the first argument, regardless of whether the expectation succeeds or fails. The only exceptions are `expect_error()` and friends, which return the captured condition so that you can perform additional checks on the condition object. This is a relatively subtle change that won't affect tests that already pass, but it does improve failures when you pipe together multiple expectations.
+
+* A new `pass()` function makes it clear how to signal when an expectation succeeds. All existing expectations were rewritten to use `pass()` and (the existing) `fail()` instead of `expect()`, which I think makes the flow of logic easier to understand.
+
+* Improved `expect_success()` and `expect_failure()` expectations now test that an expectation always returns exactly one success or failure (this ensures that the counts that you see in the reporters are correct).
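+
+As a rough sketch of how these pieces fit together, a custom expectation might look something like the code below. The name `expect_length_of()` is made up for illustration, and the exact message wording is an assumption rather than testthat's own; the sketch only relies on `quasi_label()`, `pass()`, and `fail()`, and returns its input invisibly so that it can be piped into another expectation.
+
+```{r}
+library(testthat)
+
+# Hypothetical custom expectation (not part of testthat), for illustration.
+expect_length_of <- function(object, n) {
+  # Capture the value and a label for use in the failure message
+  act <- quasi_label(rlang::enquo(object))
+
+  # Every path signals exactly one success or one failure
+  act_n <- length(act$val)
+  if (act_n != n) {
+    fail(c(
+      sprintf("Expected %s to have length %i.", act$lab, n),
+      sprintf("Actual length: %i.", act_n)
+    ))
+  } else {
+    pass()
+  }
+
+  # Return the input invisibly so expectations can be chained
+  invisible(act$val)
+}
+
+test_that("custom expectations can be piped", {
+  x <- 1:3
+  x |>
+    expect_length_of(3) |>
+    expect_type("integer")
+})
+```
+
+Because the input is returned whether or not the expectation succeeds, the second expectation in the pipe still receives the original value.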
+
+This new framework helped us write six new expectations:
+
+*   `expect_all_equal()`, `expect_all_true()`, and `expect_all_false()` check that every element of a vector has the same value, giving better error messages than `expect_true(all(...))`:
+
+    ```{r}
+    #| error: true
+
+    test_that("some test", {
+      x <- c(0.408, 0.961, 0.883, 0.46, 0.537, 0.961, 0.851, 0.887, 0.023)
+      expect_all_true(x < 0.95)
+    })
+    ```
+
+*   `expect_disjoint()`, by [@stibu81](https://github.com/stibu81), expects specific values to be absent from a vector:
+
+    ```{r}
+    #| error: true
+
+    test_that("", {
+      expect_disjoint(c("a", "b", "c"), c("c", "d", "e"))
+    })
+    ```
+
+*   `expect_r6_class()` expects an R6 object that inherits from a given class:
+
+    ```{r}
+    #| error: true
+
+    test_that("", {
+      x <- 10
+      expect_r6_class(x, "foo")
+
+      x <- R6::R6Class("bar")$new()
+      expect_r6_class(x, "foo")
+    })
+    ```
+
+*   `expect_shape()`, by [@michaelchirico](https://github.com/michaelchirico), expects a specific shape (i.e., `nrow()`, `ncol()`, or `dim()`):
+
+    ```{r}
+    #| error: true
+
+    test_that("show off expect_shape() failure messages", {
+      x <- matrix(1:9, nrow = 3)
+      expect_shape(x, nrow = 4)
+      expect_shape(x, dim = c(3, 3, 3))
+      expect_shape(x, dim = c(3, 4))
+    })
+    ```
+
+As you can see from the examples above, when you run a single test interactively (i.e. not as a part of a test suite) you now see exactly how many expectations succeeded and failed.
+
+## Other new features
+
+* testthat generally does a better job of handling nested tests, aka subtests, where you put a `test_that()` inside another `test_that()`, or more typically `it()` inside of `describe()`. Subtests will now generate more informative failure messages, free from duplication, with more informative skips if any subtests don't contain any expectations.
+
+* The snapshot experience has been significantly improved, with all known bugs fixed and some new helpers added: `snapshot_reject()` rejects all modified snapshots by deleting the `.new` variants, and `snapshot_download_gh()` makes it easy to get snapshots off GitHub and into your local package. Additionally, `expect_snapshot()` and friends will now fail when creating a new snapshot on CI, as that's usually a signal that you've forgotten to run the snapshot code locally before committing.
+
+* On CRAN, `test_that()` will automatically skip if a package is not installed, which means that you no longer need to check if suggested packages are installed in your tests.
+
+* `vignette("mocking")` explains mocking in detail, and new `local_mocked_s3_method()`, `local_mocked_s4_method()`, and `local_mocked_r6_class()` make it easier to mock S3 and S4 methods and R6 classes.
+
+* `test_dir()`, `test_check()`, and friends gain a `shuffle` argument that uses `sample()` to randomly reorder the top-level expressions in each test file. This random reordering surfaces dependencies between tests and code outside of any test, as well as dependencies between tests, helping you find and eliminate unintentional dependencies.
+
+* `try_again()` is now publicized, as it's a useful tool for testing flaky code:
+
+    ```{r}
+    #| eval: false
+
+    flaky_function <- function() {
+      if (runif(1) < 0.1) 0 else 1
+    }
+
+    # 10% chance of failure:
+    test_that("my flaky test is ok", {
+      skip_on_cran()
+      expect_equal(flaky_function(), 1)
+    })
+
+    # 1% chance of failure:
+    test_that("my flaky test is ok", {
+      skip_on_cran()
+      try_again(1, expect_equal(flaky_function(), 1))
+    })
+
+    # 0.1% chance of failure:
+    test_that("my flaky test is ok", {
+      skip_on_cran()
+      try_again(2, expect_equal(flaky_function(), 1))
+    })
+    ```
+
+    Note that it's still good practice to skip such tests on CRAN.
+
+* New `skip_unless_r()` skips tests on unsuitable versions of R. It has a convenient syntax so you can use, e.g., `skip_unless_r(">= 4.1.0")` to skip tests that require `...names()`.
+
+* New `SlowReporter` makes it easier to find the slowest tests in your package. You can run it with `devtools::test(reporter = "slow")`.
+
+* New `vignette("challenging-functions")` provides an index to other documentation organized by various challenges.
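+
+To make a couple of these concrete, here is a minimal sketch that writes a throwaway test file to a temporary directory and runs it with `shuffle = TRUE`. The directory prefix, file name, and test contents are all made up for illustration, and the sketch assumes that `shuffle` and `skip_unless_r()` behave as described in the bullets above.
+
+```{r}
+library(testthat)
+
+# Build a throwaway test directory (hypothetical file name and contents)
+dir <- tempfile("shuffle-demo")
+dir.create(dir)
+writeLines(
+  c(
+    'test_that("...names() returns argument names", {',
+    '  skip_unless_r(">= 4.1.0")',
+    '  f <- function(...) ...names()',
+    '  expect_equal(f(a = 1, b = 2), c("a", "b"))',
+    '})',
+    '',
+    'test_that("addition works", {',
+    '  expect_equal(1 + 1, 2)',
+    '})'
+  ),
+  file.path(dir, "test-demo.R")
+)
+
+# Run the top-level expressions of each test file in a random order
+results <- test_dir(dir, shuffle = TRUE)
+```
+
+If the two tests shared hidden state, for example an object created at the top level of the file between them, a shuffled run would be more likely to expose it than a run in file order.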
+
+## Acknowledgements
+
+A big thank you to all the folks who helped make this release happen: [&#x0040;3styleJam](https://github.com/3styleJam), [&#x0040;afinez](https://github.com/afinez), [&#x0040;andybeet](https://github.com/andybeet), [&#x0040;atheriel](https://github.com/atheriel), [&#x0040;averissimo](https://github.com/averissimo), [&#x0040;d-morrison](https://github.com/d-morrison), [&#x0040;DanChaltiel](https://github.com/DanChaltiel), [&#x0040;DanielHermosilla](https://github.com/DanielHermosilla), [&#x0040;eitsupi](https://github.com/eitsupi), [&#x0040;EmilHvitfeldt](https://github.com/EmilHvitfeldt), [&#x0040;emstruong](https://github.com/emstruong), [&#x0040;gaborcsardi](https://github.com/gaborcsardi), [&#x0040;gael-millot](https://github.com/gael-millot), [&#x0040;hadley](https://github.com/hadley), [&#x0040;hoeflerb](https://github.com/hoeflerb), [&#x0040;jamesfowkes](https://github.com/jamesfowkes), [&#x0040;jan-swissre](https://github.com/jan-swissre), [&#x0040;jdblischak](https://github.com/jdblischak), [&#x0040;jennybc](https://github.com/jennybc), [&#x0040;jeroenjanssens](https://github.com/jeroenjanssens), [&#x0040;kevinushey](https://github.com/kevinushey), [&#x0040;krivit](https://github.com/krivit), [&#x0040;kubajal](https://github.com/kubajal), [&#x0040;lawalter](https://github.com/lawalter), [&#x0040;m-muecke](https://github.com/m-muecke), [&#x0040;maelle](https://github.com/maelle), [&#x0040;math-mcshane](https://github.com/math-mcshane), [&#x0040;mcol](https://github.com/mcol), [&#x0040;metanoid](https://github.com/metanoid), [&#x0040;MichaelChirico](https://github.com/MichaelChirico), [&#x0040;moodymudskipper](https://github.com/moodymudskipper), [&#x0040;njtierney](https://github.com/njtierney), [&#x0040;nunotexbsd](https://github.com/nunotexbsd), [&#x0040;pabangan](https://github.com/pabangan), [&#x0040;pachadotdev](https://github.com/pachadotdev), [&#x0040;plietar](https://github.com/plietar), [&#x0040;schloerke](https://github.com/schloerke), [&#x0040;schuemie](https://github.com/schuemie), [&#x0040;sebkopf](https://github.com/sebkopf), [&#x0040;shikokuchuo](https://github.com/shikokuchuo), [&#x0040;snystrom](https://github.com/snystrom), [&#x0040;stibu81](https://github.com/stibu81), [&#x0040;TimTaylor](https://github.com/TimTaylor), and [&#x0040;tylermorganwall](https://github.com/tylermorganwall).