From fdacb85903914250c90a0c8b636a8531e531f415 Mon Sep 17 00:00:00 2001
From: Hadley Wickham
Date: Fri, 20 Dec 2024 12:56:12 -0600
Subject: [PATCH] Improve req_retry() docs (#600)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Including ✨ proof reading ✨

Fixes #575
---
 NEWS.md                              |  3 ++
 R/req-retries.R                      | 65 +++++++++++++++-------------
 man/req_retry.Rd                     | 56 ++++++++++++------------
 tests/testthat/_snaps/req-retries.md | 15 +++++--
 tests/testthat/test-req-retries.R    | 13 ++++--
 5 files changed, 88 insertions(+), 64 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 05a0f3b0..64c1e8e3 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,5 +1,8 @@
 # httr2 (development version)

+* `req_retry()` now defaults to `max_tries = 2` with a message.
+  Set to `max_tries = 1` to disable retries.
+
 * Errors thrown during the parsing of an OAuth response now have a dedicated
   `httr2_oauth_parse` error class that includes the original response object
   (@atheriel, #596).
diff --git a/R/req-retries.R b/R/req-retries.R
index adf396af..1faf758b 100644
--- a/R/req-retries.R
+++ b/R/req-retries.R
@@ -1,55 +1,55 @@
-#' Control when a request will retry, and how long it will wait between tries
+#' Automatically retry a request on failure
 #'
 #' @description
-#' `req_retry()` alters [req_perform()] so that it will automatically retry
-#' in the case of failure. To activate it, you must specify either the total
-#' number of requests to make with `max_tries` or the total amount of time
-#' to spend with `max_seconds`. Then `req_perform()` will retry if the error is
-#' "transient", i.e. it's an HTTP error that can be resolved by waiting. By
-#' default, 429 and 503 statuses are treated as transient, but if the API you
-#' are wrapping has other transient status codes (or conveys transient-ness
-#' with some other property of the response), you can override the default
-#' with `is_transient`.
+#' `req_retry()` allows [req_perform()] to automatically retry failing
+#' requests. It's particularly important for APIs with rate limiting, but can
+#' also be useful when dealing with flaky servers.
 #'
-#' Additionally, if you set `retry_on_failure = TRUE`, the request will retry
-#' if either the HTTP request or HTTP response doesn't complete successfully
+#' By default, `req_perform()` will retry if the response is a 429
+#' ("too many requests", often used for rate limiting) or 503
+#' ("service unavailable"). If the API you are wrapping has other transient
+#' status codes (or conveys transience with some other property of the
+#' response), you can override the default with `is_transient`. And
+#' if you set `retry_on_failure = TRUE`, the request will retry
+#' if either the HTTP request or HTTP response doesn't complete successfully,
 #' leading to an error from curl, the lower-level library that httr2 uses to
-#' perform HTTP request. This occurs, for example, if your wifi is down.
+#' perform HTTP requests. This occurs, for example, if your Wi-Fi is down.
+#'
+#' ## Delay
 #'
 #' It's a bad idea to immediately retry a request, so `req_perform()` will
 #' wait a little before trying again:
 #'
 #' * If the response contains the `Retry-After` header, httr2 will wait the
 #'   amount of time it specifies. If the API you are wrapping conveys this
-#'   information with a different header (or other property of the response)
-#'   you can override the default behaviour with `retry_after`.
+#'   information with a different header (or other property of the response),
+#'   you can override the default behavior with `retry_after`.
 #'
 #' * Otherwise, httr2 will use "truncated exponential backoff with full
-#'   jitter", i.e. it will wait a random amount of time between one second and
-#'   `2 ^ tries` seconds, capped to at most 60 seconds. In other words, it
+#'   jitter", i.e., it will wait a random amount of time between one second and
+#'   `2 ^ tries` seconds, capped at a maximum of 60 seconds. In other words, it
 #'   waits `runif(1, 1, 2)` seconds after the first failure, `runif(1, 1, 4)`
 #'   after the second, `runif(1, 1, 8)` after the third, and so on. If you'd
 #'   prefer a different strategy, you can override the default with `backoff`.
 #'
 #' @inheritParams req_perform
-#' @param max_tries,max_seconds Cap the maximum number of attempts with
-#'   `max_tries` or the total elapsed time from the first request with
-#'   `max_seconds`. If neither option is supplied (the default), [req_perform()]
-#'   will not retry.
+#' @param max_tries,max_seconds Cap the maximum number of attempts
+#'   (`max_tries`), the total elapsed time from the first request
+#'   (`max_seconds`), or both.
 #'
-#'   `max_tries` is the total number of attempts make, so this should always
-#'   be greater than one.`
+#'   `max_tries` is the total number of attempts made, so this should always
+#'   be greater than one.
 #' @param is_transient A predicate function that takes a single argument
 #'   (the response) and returns `TRUE` or `FALSE` specifying whether or not
 #'   the response represents a transient error.
 #' @param retry_on_failure Treat low-level failures as if they are
-#'   transient errors, and can be retried.
+#'   transient errors that can be retried.
 #' @param backoff A function that takes a single argument (the number of failed
 #'   attempts so far) and returns the number of seconds to wait.
 #' @param after A function that takes a single argument (the response) and
-#'   returns either a number of seconds to wait or `NA`, which indicates
-#'   that a precise wait time is not available that the `backoff` strategy
-#'   should be used instead..
+#'   returns either a number of seconds to wait or `NA`. `NA` indicates
+#'   that a precise wait time is not available and that the `backoff` strategy
+#'   should be used instead.
 #' @returns A modified HTTP [request].
 #' @export
 #' @seealso [req_throttle()] if the API has a rate-limit but doesn't expose
@@ -61,7 +61,7 @@
 #'
 #' # use a constant 10s delay after every failure
 #' request("http://example.com") |>
-#'   req_retry(backoff = ~10)
+#'   req_retry(backoff = \(resp) 10)
 #'
 #' # When rate-limited, GitHub's API returns a 403 with
 #' # `X-RateLimit-Remaining: 0` and an Unix time stored in the
@@ -86,9 +86,16 @@ req_retry <- function(req,
                       is_transient = NULL,
                       backoff = NULL,
                       after = NULL) {
+
   check_request(req)
-  check_number_whole(max_tries, min = 2, allow_null = TRUE)
+  check_number_whole(max_tries, min = 1, allow_null = TRUE)
   check_number_whole(max_seconds, min = 0, allow_null = TRUE)
+
+  if (is.null(max_tries) && is.null(max_seconds)) {
+    max_tries <- 2
+    cli::cli_inform("Setting {.code max_tries = 2}.")
+  }
+
   check_bool(retry_on_failure)

   req_policies(req,
diff --git a/man/req_retry.Rd b/man/req_retry.Rd
index 20623108..0b7ebeec 100644
--- a/man/req_retry.Rd
+++ b/man/req_retry.Rd
@@ -2,7 +2,7 @@
 % Please edit documentation in R/req-retries.R
 \name{req_retry}
 \alias{req_retry}
-\title{Control when a request will retry, and how long it will wait between tries}
+\title{Automatically retry a request on failure}
 \usage{
 req_retry(
   req,
@@ -17,16 +17,15 @@ req_retry(
 \arguments{
 \item{req}{A httr2 \link{request} object.}

-\item{max_tries, max_seconds}{Cap the maximum number of attempts with
-\code{max_tries} or the total elapsed time from the first request with
-\code{max_seconds}. If neither option is supplied (the default), \code{\link[=req_perform]{req_perform()}}
-will not retry.
+\item{max_tries, max_seconds}{Cap the maximum number of attempts
+(\code{max_tries}), the total elapsed time from the first request
+(\code{max_seconds}), or both.

-\code{max_tries} is the total number of attempts make, so this should always
-be greater than one.`}
+\code{max_tries} is the total number of attempts made, so this should always
+be greater than one.}

 \item{retry_on_failure}{Treat low-level failures as if they are
-transient errors, and can be retried.}
+transient errors that can be retried.}

 \item{is_transient}{A predicate function that takes a single argument
 (the response) and returns \code{TRUE} or \code{FALSE} specifying whether or not
@@ -36,44 +35,45 @@ the response represents a transient error.}
 attempts so far) and returns the number of seconds to wait.}

 \item{after}{A function that takes a single argument (the response) and
-returns either a number of seconds to wait or \code{NA}, which indicates
-that a precise wait time is not available that the \code{backoff} strategy
-should be used instead..}
+returns either a number of seconds to wait or \code{NA}. \code{NA} indicates
+that a precise wait time is not available and that the \code{backoff} strategy
+should be used instead.}
 }
 \value{
 A modified HTTP \link{request}.
 }
 \description{
-\code{req_retry()} alters \code{\link[=req_perform]{req_perform()}} so that it will automatically retry
-in the case of failure. To activate it, you must specify either the total
-number of requests to make with \code{max_tries} or the total amount of time
-to spend with \code{max_seconds}. Then \code{req_perform()} will retry if the error is
-"transient", i.e. it's an HTTP error that can be resolved by waiting. By
-default, 429 and 503 statuses are treated as transient, but if the API you
-are wrapping has other transient status codes (or conveys transient-ness
-with some other property of the response), you can override the default
-with \code{is_transient}.
+\code{req_retry()} allows \code{\link[=req_perform]{req_perform()}} to automatically retry failing
+requests. It's particularly important for APIs with rate limiting, but can
+also be useful when dealing with flaky servers.

-Additionally, if you set \code{retry_on_failure = TRUE}, the request will retry
-if either the HTTP request or HTTP response doesn't complete successfully
+By default, \code{req_perform()} will retry if the response is a 429
+("too many requests", often used for rate limiting) or 503
+("service unavailable"). If the API you are wrapping has other transient
+status codes (or conveys transience with some other property of the
+response), you can override the default with \code{is_transient}. And
+if you set \code{retry_on_failure = TRUE}, the request will retry
+if either the HTTP request or HTTP response doesn't complete successfully,
 leading to an error from curl, the lower-level library that httr2 uses to
-perform HTTP request. This occurs, for example, if your wifi is down.
+perform HTTP requests. This occurs, for example, if your Wi-Fi is down.
+\subsection{Delay}{

 It's a bad idea to immediately retry a request, so \code{req_perform()} will
 wait a little before trying again:
 \itemize{
 \item If the response contains the \code{Retry-After} header, httr2 will wait the
 amount of time it specifies. If the API you are wrapping conveys this
-information with a different header (or other property of the response)
-you can override the default behaviour with \code{retry_after}.
+information with a different header (or other property of the response),
+you can override the default behavior with \code{retry_after}.
 \item Otherwise, httr2 will use "truncated exponential backoff with full
-jitter", i.e. it will wait a random amount of time between one second and
-\code{2 ^ tries} seconds, capped to at most 60 seconds. In other words, it
+jitter", i.e., it will wait a random amount of time between one second and
+\code{2 ^ tries} seconds, capped at a maximum of 60 seconds. In other words, it
 waits \code{runif(1, 1, 2)} seconds after the first failure, \code{runif(1, 1, 4)}
 after the second, \code{runif(1, 1, 8)} after the third, and so on. If you'd
 prefer a different strategy, you can override the default with \code{backoff}.
 }
 }
+}
 \examples{
 # google APIs assume that a 500 is also a transient error
 request("http://google.com") |>
@@ -81,7 +81,7 @@ request("http://google.com") |>

 # use a constant 10s delay after every failure
 request("http://example.com") |>
-  req_retry(backoff = ~10)
+  req_retry(backoff = \(resp) 10)

 # When rate-limited, GitHub's API returns a 403 with
 # `X-RateLimit-Remaining: 0` and an Unix time stored in the
diff --git a/tests/testthat/_snaps/req-retries.md b/tests/testthat/_snaps/req-retries.md
index a44fc28d..a1d8ea12 100644
--- a/tests/testthat/_snaps/req-retries.md
+++ b/tests/testthat/_snaps/req-retries.md
@@ -1,3 +1,10 @@
+# has useful default (with message)
+
+    Code
+      req <- req_retry(req)
+    Message
+      Setting `max_tries = 2`.
+
 # useful message if `after` wrong

     Code
@@ -9,17 +16,17 @@
 # validates its inputs

     Code
-      req_retry(req, max_tries = 1)
+      req_retry(req, max_tries = 0)
     Condition
       Error in `req_retry()`:
-      ! `max_tries` must be a whole number larger than or equal to 2 or `NULL`, not the number 1.
+      ! `max_tries` must be a whole number larger than or equal to 1 or `NULL`, not the number 0.
     Code
-      req_retry(req, max_seconds = "x")
+      req_retry(req, max_tries = 2, max_seconds = "x")
     Condition
       Error in `req_retry()`:
       ! `max_seconds` must be a whole number or `NULL`, not the string "x".
     Code
-      req_retry(req, retry_on_failure = "x")
+      req_retry(req, max_tries = 2, retry_on_failure = "x")
     Condition
       Error in `req_retry()`:
       ! `retry_on_failure` must be `TRUE` or `FALSE`, not the string "x".
diff --git a/tests/testthat/test-req-retries.R b/tests/testthat/test-req-retries.R
index c59b3dca..e3918848 100644
--- a/tests/testthat/test-req-retries.R
+++ b/tests/testthat/test-req-retries.R
@@ -1,3 +1,10 @@
+test_that("has useful default (with message)", {
+  req <- request_test()
+  expect_snapshot(req <- req_retry(req))
+  expect_equal(retry_max_tries(req), 2)
+  expect_equal(retry_max_seconds(req), Inf)
+})
+
 test_that("can set define maximum retries", {
   req <- request_test()
   expect_equal(retry_max_tries(req), 1)
@@ -70,9 +77,9 @@ test_that("validates its inputs", {
   req <- new_request("http://example.com")

   expect_snapshot(error = TRUE, {
-    req_retry(req, max_tries = 1)
-    req_retry(req, max_seconds = "x")
-    req_retry(req, retry_on_failure = "x")
+    req_retry(req, max_tries = 0)
+    req_retry(req, max_tries = 2, max_seconds = "x")
+    req_retry(req, max_tries = 2, retry_on_failure = "x")
   })
 })
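
The "Delay" section of the patched `req_retry()` docs describes the default wait as truncated exponential backoff with full jitter. The following is a minimal base-R sketch of that description, not httr2's actual implementation: `backoff_sketch` and its `cap` argument are hypothetical names introduced here for illustration, and the only facts assumed are the ones the docs state (a uniform draw between 1 and `2 ^ tries` seconds, capped at 60).

```r
# Hypothetical helper (not part of httr2): a sketch of the documented default delay.
# `tries` is the number of failed attempts so far, matching what the docs say
# the `backoff` argument receives.
backoff_sketch <- function(tries, cap = 60) {
  upper <- min(2^tries, cap)  # exponential growth, truncated at `cap` seconds
  stats::runif(1, 1, upper)   # jitter: uniform draw between 1 second and `upper`
}

# Delays after the 1st through 8th consecutive failure; from the 6th failure
# onward, 2^tries exceeds 60, so the upper bound stays at the cap.
delays <- vapply(1:8, backoff_sketch, numeric(1))
```

Because the docs say `backoff` takes a single argument (the number of failed attempts so far) and returns the number of seconds to wait, a function of this shape could presumably be supplied as `req_retry(backoff = backoff_sketch)`; that call is not exercised here.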