
Automated tests on CourtListener

Alberto Islas edited this page Dec 27, 2024 · 2 revisions

Any time you're contributing to or hacking on the code base, whether adding features or fixing issues, you should validate that your changes don't break core functionality by running the test suite. This is also a great way to confirm that your development environment is properly configured. You should also add new tests for your new feature or fix.

In general, the easiest way to run the test suite is via Django's test command. In Docker, that's:

docker exec -it cl-django python /opt/courtlistener/manage.py test cl --exclude-tag selenium --keepdb

The cl parameter is the name of the Python package to search for tests. It's not required, but it's a good habit to include it, because you can narrow down which tests run by providing more detail. For example:

  • cl.search to execute only tests in the search module, or...
  • cl.search.tests.SearchTest to run a particular test class, or...
  • cl.search.tests.SearchTest.test_a_simple_text_query to run a particular test.

Also:

--exclude-tag selenium excludes the Selenium tests during local development. They run on CI and take a while, so it's usually best to skip them locally.

--keepdb will keep your test database between test runs, a big speed up.
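
As a minimal sketch of how the label and flags above combine, the helper below assembles the same docker exec invocation for an arbitrary test label. The function build_test_command and its parameters are hypothetical and not part of CourtListener; only the container name, the manage.py path, and the flags come from the command shown earlier.

```python
import shlex


def build_test_command(label="cl", keepdb=True, exclude_tags=("selenium",)):
    """Return the docker exec argument list for running one test label.

    Hypothetical helper for illustration; it is not part of CourtListener.
    """
    cmd = [
        "docker", "exec", "-it", "cl-django",
        "python", "/opt/courtlistener/manage.py", "test", label,
    ]
    # One --exclude-tag pair per tag to skip (e.g. the slow Selenium tests).
    for tag in exclude_tags:
        cmd += ["--exclude-tag", tag]
    if keepdb:
        cmd.append("--keepdb")
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command for a single test class.
    print(shlex.join(build_test_command("cl.search.tests.SearchTest")))
```

Passing keepdb=False or an empty exclude_tags tuple drops the corresponding flags, which can be handy when you do want to run the Selenium tests locally.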

We use a custom test runner to make our tests even faster:

  1. By default, it disables output aside from warnings. This makes tests slightly faster. You can enable output with our special flag, --enable-logging.

  2. By default, it runs tests in parallel. Normally, you have to use the --parallel flag of the test command to do this, but developers forget. No more. If you want to override this so your tests run on a single core (why would you?) you could pass --parallel=1.

  3. No matter how many databases you have configured in your settings, only one is used during tests. This makes tests faster since they don't have to mess around with transactions in databases that aren't even used.

  4. When you use --keepdb, if your database was not deleted because the last run crashed, it will delete it for you. Ahhh.

  5. We use custom test classes (see below) and our runner blocks you from using other test classes.

For more details, Django provides a lot of documentation on testing in Django. Make sure to read the docs for the Django release currently used in CourtListener.

This can also be set up using IntelliJ and a Docker Compose file.

About the Types of Tests

The CourtListener test suite contains a few different types of tests, which can be broadly categorized as follows, in order of increasing complexity:

  • Unit Tests that exercise core application logic and may require some level of access to the Django test database.

  • Elasticsearch Tests that rely on the ESIndexTestCase class to prevent data collisions among test classes when running tests in parallel.

  • Selenium Tests that rely on the full-stack of Django, the database, and Elasticsearch to be available in order to test from the point of view of a web browser accessing the application.

Unit Tests

Unit tests all derive from the classes in cl.tests.cases. Typically, they will not need database access, and should thus use cl.tests.cases.SimpleTestCase. If possible, these should run without a functioning Elasticsearch, PostgreSQL, or Selenium environment.

These are the bread and butter of validating functions and business logic. You should contribute these when you write any new functions or update them when enhancing existing functions.
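
A minimal sketch of such a unit test follows. The function under test, normalize_docket_number, and the test class are hypothetical and exist only for illustration. The import fallback is also an assumption, added so the sketch runs outside the repository; inside CourtListener you would import SimpleTestCase from cl.tests.cases directly.

```python
import unittest

try:
    # Inside the CourtListener code base, use the project's own base class.
    from cl.tests.cases import SimpleTestCase
except ImportError:
    # Fallback (assumption) so this sketch also runs outside the repository.
    SimpleTestCase = unittest.TestCase


def normalize_docket_number(raw: str) -> str:
    """Hypothetical function under test: collapse whitespace and upper-case."""
    return " ".join(raw.split()).upper()


class NormalizeDocketNumberTest(SimpleTestCase):
    """Pure-logic tests: no database, Elasticsearch, or Selenium needed."""

    def test_collapses_whitespace(self) -> None:
        self.assertEqual(
            normalize_docket_number(" 1:21-cv-00001  abc "),
            "1:21-CV-00001 ABC",
        )

    def test_empty_string(self) -> None:
        self.assertEqual(normalize_docket_number(""), "")
```

Because the test touches no external service, it stays fast and can run in parallel with the rest of the suite.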

Elasticsearch Tests

Elasticsearch/search tests should derive from cl.tests.cases.ESIndexTestCase. This class contains proper setup and teardown methods to create and delete unique indexes per test class, preventing data collisions when running tests in parallel. The class also includes helper methods to manage indexes: rebuild_index (delete and create the index), create_index, and delete_index. These methods can be used to ensure that previous data is removed from the test indices or to have custom control over the creation and removal of indices for specific tests.

To populate a test index, you can create model instance factories within setUpTestData or inside a test method.

There are two methods to ensure these instances get indexed:

  • If your test class derives from cl.tests.cases.TestCase for database support, you can create your model factory instances inside the context manager with self.captureOnCommitCallbacks(execute=True): so that they are indexed automatically upon creation. Alternatively, you can call the provided Django management command cl.search.management.commands.cl_index_parent_and_child_docs with the appropriate search_type (only available for the PEOPLE, RECAP, and OPINION search types) and testing_mode=True, e.g.:
from django.core.management import call_command
from cl.search.models import SEARCH_TYPES

# Index the RECAP parent and child documents into the test index.
call_command(
    "cl_index_parent_and_child_docs",
    search_type=SEARCH_TYPES.RECAP,
    testing_mode=True,
)
  • If your test class derives from cl.tests.cases.TransactionTestCase for database support, you only need to create model instance factories, and they will be automatically indexed without using any of the methods described previously.

Selenium Tests

Selenium tests should derive from cl.tests.base.BaseSeleniumTest, which automatically handles the setup and teardown of a Selenium webdriver instance available at self.browser from within your test code.

There are some helper methods provided via BaseSeleniumTest as well:

  • reset_browser() - start a new browser session
  • click_link_for_new_page(link_text, timeout) - a wrapper around the Selenium functions for finding an anchor based on the anchor text and calling click(), but also does an explicit wait up until timeout seconds for the browser page to change. Use when expecting a navigation event.
  • attempt_sign_in(username, password) - from a given CL page, will attempt to use the Sign in / Register link and input the given username and password.
  • get_url_and_wait(url, timeout) - will input the given url into the browser's address bar, submit, and wait until timeout seconds for the given url to load.
  • assert_text_in_body(text) - attempts to find the given text in the body of the current web page, failing the test if not found
  • assert_text_not_in_body(text) - similar to previous, but tests that text is NOT in the body, failing if it's found.
  • extract_result_count_from_serp() - if on the search result page, will attempt to find and parse the total results found count into a number and return it.

Windows/WSL Tip: If you are running tests on a Windows machine with WSL, you will probably hit a wall: Windows has no /dev/shm directory, and without one the Selenium tests won't run. To fix this, find the full path to /dev/shm or /run/shm inside your WSL virtual machine (you can get it from Windows Explorer); in my case it is \\wsl.localhost\Ubuntu-20.04\run\shm. Then set the environment variable CL_SHM_DIR to that path and restart your cl-selenium container.

Viewing the Remote Selenium Browser

You can watch the remote selenium browser using VNC. To do so, start a VNC client, and then connect to:

0.0.0.0:5900

The password is "secret". Make sure that SELENIUM_HEADLESS is set to False, or else you'll see nothing.

With those things done, run some tests and watch as it goes!

Increasing the Test Timeouts

The Selenium tests are wrapped with a timeout annotation that fails them if they take too long to run. To increase or decrease this limit, set the SELENIUM_TIMEOUT environment variable to the desired time in seconds.

For example, for a 2 minute timeout, you might do the following on Linux (or within the FreeLawBox):

export SELENIUM_TIMEOUT=120
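
How a test helper might consume this variable can be sketched as follows. The function selenium_timeout and its required default argument are hypothetical; CourtListener's own parsing and default value are not described on this page and may differ.

```python
import os


def selenium_timeout(default: int, environ=os.environ) -> int:
    """Return the timeout in seconds from SELENIUM_TIMEOUT, or `default`.

    Hypothetical helper; not CourtListener's actual implementation.
    """
    raw = environ.get("SELENIUM_TIMEOUT", "").strip()
    if not raw:
        # Variable unset or empty: fall back to the caller's default.
        return default
    timeout = int(raw)  # raises ValueError for non-numeric input
    if timeout <= 0:
        raise ValueError("SELENIUM_TIMEOUT must be a positive number of seconds")
    return timeout
```

Failing loudly on a non-numeric or non-positive value is a design choice in this sketch: a typo in the variable then surfaces immediately instead of silently reverting to the default.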

Taking Screenshots on Failure

While a little flaky at the moment, most Selenium tests will be able to take a screenshot of the browser window on a failure.

To enable screenshots, simply define a SELENIUM_DEBUG environment variable set to anything. Its presence indicates it's enabled.

export SELENIUM_DEBUG=1

That will create screenshots at the end of every test as part of the tearDown method. If you want screenshots at other times, you can always add a line like:

self.browser.save_screenshot('/tmp/' + filename)
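
If you take several ad hoc screenshots, it helps to give each one a unique name. The sketch below builds such a path; the helper screenshot_path and its naming scheme are hypothetical and not part of CourtListener. It uses a .png suffix because Selenium's save_screenshot writes PNG images.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath


def screenshot_path(test_id: str, directory: str = "/tmp", now=None) -> str:
    """Build a unique, filesystem-safe PNG path for a screenshot.

    Hypothetical helper for illustration; not part of CourtListener.
    """
    now = now or datetime.now()
    # Replace anything that is awkward in a file name with an underscore.
    safe_id = re.sub(r"[^A-Za-z0-9_.-]+", "_", test_id)
    return str(PurePosixPath(directory) / f"{safe_id}-{now:%Y%m%d-%H%M%S}.png")
```

Inside a test method this could be used as self.browser.save_screenshot(screenshot_path(self.id())), where self.id() is the standard unittest method that returns the dotted name of the running test.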

Screenshots will be saved into the cl-django container. To grab them, you can use docker cp. On GitHub, if the tests fail, these are stored as an "artifact" of the build, and you can download them to inspect them.