Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check if tests are coupled #1938

Open
wants to merge 20 commits into
base: main
Choose a base branch
from
Open

Check if tests are coupled #1938

wants to merge 20 commits into from

Conversation

IndrajeetPatil
Copy link
Collaborator

cf. #1937

@codecov-commenter

This comment was marked as off-topic.

@IndrajeetPatil
Copy link
Collaborator Author

Looks like there is some order dependence between our tests.

@MichaelChirico
Copy link
Collaborator

Could you describe how this works? I'm not reproducing. To check, I tried running our test suite with each file in an individual subprocess:

tests = list.files('tests/testthat', pattern = '^test', full.names = TRUE)
for (test in tests) {
  system2("Rscript", c("-e", shQuote(paste(
    c(
      'attach(asNamespace("lintr"), warn.conflicts=FALSE)',
      "library(testthat)",
      sprintf('test_file("%s")', test)
    ),
    collapse = ";"
  ))))
}

I see the errors in test-knitr_formats.R and test-object_name_linter.R (trivial fix incoming), but not any others.

@MichaelChirico
Copy link
Collaborator

MichaelChirico commented Apr 9, 2023

I guess this test also expects test_that() units within each file to run independently. Testing:

library(withr)
library(xml2)
library(xmlparsedata)

tests = list.files('tests/testthat', pattern = '^test', full.names = TRUE)
non_tests = setdiff(list.files('tests/testthat', full.names = TRUE), tests)
# required for assumed directory structure in tests to work in tempdir()
file.copy(non_tests, tempdir(), recursive = TRUE)

unit_xp = "expr/SYMBOL_FUNCTION_CALL[text() = 'test_that' or text() = 'with_parameters_test_that']"
lines_by_xpath <- function(xml, xpath) {
  matches <- xml_find_all(xml, xpath)
  line1 <- as.integer(xml_attr(matches, "line1"))
  line2 <- as.integer(xml_attr(matches, "line2"))
  mapply(`:`, line1, line2, SIMPLIFY = FALSE)
}
# NB: "testthat::CompactProgressReporter" is more informative
run_units_individually <- function(test_file, reporter = "testthat::FailReporter") {
  message(basename(test_file))
  test_xml <- read_xml(xml_parse_data(parse(test_file)))
  test_lines <- readLines(test_file)

  boilerplate_lines <- lines_by_xpath(test_xml, sprintf("expr[not(%s)]", unit_xp))
  test_boilerplate <- test_lines[unlist(boilerplate_lines)]

  test_that_units <- lines_by_xpath(test_xml, sprintf("expr[%s]", unit_xp))
  for (unit in test_that_units) {
    if (reporter == "testthat::FailReporter") message(".", appendLF = FALSE)
    unit_name <- gsub(" ", "_", gsub('^[^"]*"|"[^"]*$', '', test_lines[unit[1L]]))
    with_tempfile("tmp", pattern = paste0(basename(test_file), "_", unit_name), {
      cat(test_boilerplate, file = tmp, sep = "\n")
      cat(test_lines[unit], file = tmp, sep = "\n", append = TRUE)
      system2(
        "Rscript",
        env = "TESTTHAT_EDITION=3",
        c("-e", shQuote(sprintf(
          'testthat::test_file("%s", package = "lintr", reporter = %s)',
          tmp, reporter
        )))
      )
    })
  }
  if (reporter == "testthat::FailReporter")   cat("\n")
}
for (test in tests) run_units_individually(test)

I see some failures, but not the same as this CI test is giving (omitting passing files):

test-get_source_expressions.R
.............Error: Failures detected.
Execution halted
..
test-linter_tags.R
.........Error: Failures detected.
Execution halted
test-parse_exclusions.R
....Error: Failures detected.
Execution halted
.Error: Failures detected.
Execution halted
...

Investigating more.

@MichaelChirico
Copy link
Collaborator

The ones in test-get_source_expressions.R and test-linter_tags.R look spurious to me. Here's the failure output for test-parse_exclusions.R:

══ Testing test-parse_exclusions.R454a948b83fca ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1 ] Done!

══ Testing test-parse_exclusions.R454a974dd3abe ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 2 ] Done!

══ Testing test-parse_exclusions.R454a930e47343 ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 3 ] Done!

══ Testing test-parse_exclusions.R454a973607214 ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
[ FAIL 1 | WARN 0 | SKIP 0 | PASS 0 ]

── Failure (test-parse_exclusions.R454a973607214:21:3): it supports multiple linter exclusions ──
lintr:::parse_exclusions(t1) (`actual`) not identical to list(...) (`expected`).

`actual` is length 0
`expected` is length 4

`names(actual)` is absent
`names(expected)` is a character vector ('a', 'b', 'c', '')

`actual[[1]]` is absent
`expected[[1]]` is an integer vector (2, 4, 5, 6)

`actual[[2]]` is absent
`expected[[2]]` is an integer vector (2, 4, 5, 6)

`actual[[3]]` is absent
`expected[[3]]` is an integer vector (4, 5, 6)

`actual[[4]]` is absent
`expected[[4]]` is an integer vector (1, 5, 8, 9, 10, ...)

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 0 ]

══ Testing test-parse_exclusions.R454a91f8e2630 ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
[ FAIL 1 | WARN 0 | SKIP 0 | PASS 0 ]

── Failure (test-parse_exclusions.R454a91f8e2630:16:3): it supports overlapping exclusion ranges ──
lintr:::parse_exclusions(t1) (`actual`) not identical to list(a = 1L:5L, b = 3L:6L) (`expected`).

`actual` is length 0
`expected` is length 2

`names(actual)` is absent
`names(expected)` is a character vector ('a', 'b')

`actual$a` is absent
`expected$a` is an integer vector (1, 2, 3, 4, 5)

`actual$b` is absent
`expected$b` is an integer vector (3, 4, 5, 6)

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 0 ]

══ Testing test-parse_exclusions.R454a910304a6e ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 2 ] Done!

══ Testing test-parse_exclusions.R454a940e45510 ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1 ] Done!

══ Testing test-parse_exclusions.R454a92dbfdaf1 ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 2 ] Done!

@IndrajeetPatil
Copy link
Collaborator Author

This is the exact script I am using in the GHA workflow:

options(crayon.enabled = TRUE)
withr::local_envvar(TESTTHAT_PARALLEL = "FALSE")
library(cli)
library(glue)
pkgload::load_all(".")

test_script_paths <- testthat::find_test_scripts("tests/testthat")

seed <- sample.int(1e6, 1L)
cli_inform("Chosen seed for the current test run: {seed}")
set.seed(seed)

randomized_test_script_paths <- sample(test_script_paths)
any_test_failures <- FALSE
any_test_errors <- FALSE

test_path <- function(path) {
  report <- as.data.frame(testthat::test_file(path, reporter = "silent"))
  has_test_failures <- any(report$failed == 1L)
  has_test_errors <- any(report$error == 1L)

  if (has_test_failures) {
    cli_alert_danger(glue("Tests in `{path}` are failing."))
    any_test_failures <<- TRUE
    failed_tests <- tibble::as_tibble(subset(report, failed == 1L)[, c("test", "result")])
    print(glue::glue_data(failed_tests, "Test `{test}` is failing:\n{purrr::pluck(result, 1L, 1L)}"))
  }

  if (has_test_errors) {
    cli_alert_danger(glue("There was error while running tests in `{path}`."))
    any_test_errors <<- TRUE
    errored_tests <- tibble::as_tibble(subset(report, error == 1L)[, c("test", "result")])
    print(glue::glue_data(errored_tests, "Test `{test}` has error:\n{purrr::pluck(result, 1L, 1L)}"))
  }

  if (!has_test_failures && !has_test_errors) {
    cli_alert_success(glue("All tests passing in `{path}`."))
  }
}

cli_rule()
cli_inform("Running tests in random order:")
cli_rule()

purrr::walk(randomized_test_script_paths, test_path)

cli_rule()

if (any_test_failures) {
  cli_abort("Tests in some files are failing.")
}

if (any_test_errors) {
  cli_abort("There was error while running tests in some files.")
}

if (!any_test_failures && !any_test_errors) {
  cli_alert_success("Tests from all files are passing!")
}

cli_rule()

The approach is very simple: just randomize the order in which each test file is run. Tests within each file are always run in the same order, though. Maybe that's also something that can be included as an additional check.

@AshesITR
Copy link
Collaborator

I'd suggest putting the script into .dev instead of inline.
This would also allow adding convenience features such as a command line argument for a static seed when trying to debug a failure.

@IndrajeetPatil IndrajeetPatil added the internals Issues related to inner workings of lintr, i.e., not user-visible label Jun 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internals Issues related to inner workings of lintr, i.e., not user-visible testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants