Common issues and solutions for Tach.
Run these commands to check system compatibility:
# Kernel version (needs 5.13+ for full features)
uname -r
# Landlock support
cat /sys/kernel/security/lsm | grep landlock
# Seccomp support
grep CONFIG_SECCOMP /boot/config-$(uname -r)
# Python version
python --version
# userfaultfd support
cat /proc/sys/vm/unprivileged_userfaultfd
Symptom:
error: could not find Python interpreter
Solution:
export PYO3_PYTHON=$(which python)
cargo build
WSL2 Users: Source .envrc to automatically set PYO3_PYTHON:
source .envrc
Symptom:
error: Python 3.10+ required
Solution:
# Use specific Python
export PYO3_PYTHON=/usr/bin/python3.12
cargo build
# Or with virtual environment
python3.12 -m venv .venv
source .venv/bin/activate
export PYO3_PYTHON=$(which python)
cargo build
Symptom:
error: linker `cc` not found
Solution:
# Ubuntu/Debian
sudo apt install build-essential
# Fedora
sudo dnf install gcc make
# Arch
sudo pacman -S base-devel
Symptom:
error: failed to run custom build command for `tikv-jemallocator`
Solution:
# Install autoconf
sudo apt install autoconf
# Clean and rebuild
cargo clean
cargo build
Symptom:
[WARN] Landlock not available: EPERM
Cause: Kernel < 5.13 or Landlock not enabled.
Diagnosis:
# Check kernel version
uname -r
# Check if Landlock is in LSM list
cat /sys/kernel/security/lsm
Solution:
- Upgrade kernel to 5.13+
- Or run with --no-isolation (reduced security)
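The kernel version check can also be scripted, which avoids comparing version strings by eye (as text, "5.9" sorts after "5.13"). The following is a minimal sketch, not part of Tach: the helper names `parse_kernel_release` and `kernel_at_least` are hypothetical, and only the 5.13 threshold comes from this guide.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release: str, minimum: tuple[int, int]) -> bool:
    """Compare numerically, so that 5.9 is correctly treated as older than 5.13."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`
    release = platform.release()
    try:
        ok = kernel_at_least(release, (5, 13))
    except ValueError:
        print(f"could not parse kernel release {release!r}")
    else:
        print(f"kernel {release}: Landlock-capable version = {ok}")
```

A sufficiently new kernel is necessary but not sufficient: Landlock must also appear in /sys/kernel/security/lsm.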
Symptom:
[WARN] Seccomp filter rejected: EPERM
Cause: Seccomp-BPF not enabled in kernel.
Diagnosis:
grep CONFIG_SECCOMP /boot/config-$(uname -r)
# Should show: CONFIG_SECCOMP=y and CONFIG_SECCOMP_FILTER=y
If /boot/config-* doesn't exist (common on WSL2 or cloud kernels):
zgrep CONFIG_SECCOMP /proc/config.gz
Solution:
- Tach degrades gracefully (Landlock-only mode)
- For full security, enable seccomp in kernel config
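The two config locations above can be checked in one pass. This is a sketch under the assumption that the kernel config uses the usual `NAME=y` lines; the function names are hypothetical and not part of Tach.

```python
import gzip
import platform
from pathlib import Path
from typing import Optional

# Options this guide says should be set to 'y'
REQUIRED = ("CONFIG_SECCOMP", "CONFIG_SECCOMP_FILTER")


def enabled_options(config_text: str) -> set[str]:
    """Return the names of all options set to 'y' in kernel config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments such as '# CONFIG_X is not set'
        name, _, value = line.partition("=")
        if value == "y":
            enabled.add(name)
    return enabled


def missing_seccomp_options(config_text: str) -> list[str]:
    """List the required seccomp options that are not enabled."""
    enabled = enabled_options(config_text)
    return [name for name in REQUIRED if name not in enabled]


def read_kernel_config() -> Optional[str]:
    """Try /boot/config-<release> first, then /proc/config.gz; None if neither exists."""
    boot = Path(f"/boot/config-{platform.release()}")
    if boot.exists():
        return boot.read_text()
    proc = Path("/proc/config.gz")
    if proc.exists():
        return gzip.decompress(proc.read_bytes()).decode()
    return None
```

Calling `missing_seccomp_options(read_kernel_config() or "")` returns an empty list when both options are enabled; if neither config file is readable, it reports both options as missing, which should be read as "unknown" rather than "disabled".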
Symptom:
Error: userfaultfd creation failed: EPERM
Cause: Unprivileged userfaultfd disabled.
Diagnosis:
cat /proc/sys/vm/unprivileged_userfaultfd
# 0 = disabled, 1 = enabled
Solution:
# Enable temporarily
sudo sysctl vm.unprivileged_userfaultfd=1
# Enable permanently
echo 'vm.unprivileged_userfaultfd=1' | sudo tee /etc/sysctl.d/99-userfaultfd.conf
sudo sysctl --system
Symptom: Tests hang indefinitely without progress.
Common Causes:
| Cause | Solution |
|---|---|
| Clone syscall blocked | Ensure clone NOT in seccomp filter |
| Deadlock in test | Check for lock contention in test code |
| Infinite loop in fixture | Add timeout to fixture |
| Network wait in sandbox | Use --no-isolation for network tests |
Diagnosis:
# Check for stuck processes
ps aux | grep tach
# Trace syscalls
strace -f -p <PID> 2>&1 | tail -20
# Check what process is waiting on
cat /proc/<PID>/wchan
Symptom:
CRASH: test_example.py::test_foo
Common Causes:
| Cause | Solution |
|---|---|
| Segfault in C extension | Check extension compatibility |
| Out of memory | Increase memory limits |
| Blocked syscall | Check seccomp filter |
| Signal handling | Test may be catching signals |
Diagnosis:
# Run with reduced isolation
tach-core --no-isolation tests/
# Check coredump
coredumpctl list
coredumpctl info <PID>
Symptom: Coverage report shows 0% or missing files.
Common Causes:
| Cause | Solution |
|---|---|
| Python < 3.12 | Upgrade Python (PEP 669 required) |
| Source not in path | Add source to [tool.tach.coverage].source |
| Files omitted | Check [tool.tach.coverage].omit patterns |
| Buffer overflow | Increase buffer size (rare) |
Diagnosis:
# Check Python version
python --version
# Verify coverage enabled
tach-core --coverage . 2>&1 | head -20
Symptom:
Discovered N tests in M files
Common Causes:
| Cause | Solution |
|---|---|
| Wrong pattern | Check [tool.tach].test_pattern |
| Syntax error in test file | Fix Python syntax |
| Non-standard naming | Rename to test_*.py |
| Wrong directory | Specify correct path |
Diagnosis:
# List discovered tests
tach-core list .
# Check for syntax errors
python -m py_compile tests/test_example.py
Symptom:
Discovered N tests, M fixtures
Discovery reports zero tests even though test files exist and have valid syntax.
Cause:
The .ignore file (used by tools like Claude Code for context filtering) may contain a pattern that blocks Python files:
*.py
Tach uses the ignore crate for file discovery, which respects .ignore files. This pattern causes ALL Python files to be skipped during test discovery.
Diagnosis:
# Check if .ignore contains *.py
grep '^\*\.py$' .ignore && echo "FOUND: *.py in .ignore is blocking discovery"
# Verify files exist but are being ignored
ls tests/**/*.py # Files exist
tach-core list . # But discovery finds nothing
Solution:
Remove *.py from .ignore:
sed -i '/^\*\.py$/d' .ignore
Or edit .ignore manually and remove the *.py line.
Prevention:
If you need to exclude Python files from other tools but not from tach-core, use more specific patterns (e.g., src/**/*.py instead of *.py).
Note: The .ignore file format is shared between multiple tools. Patterns added for one tool may affect others that use the ignore crate (ripgrep, fd, tach-core, etc.).
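A small check run before the test suite can catch this problem early. The sketch below is a simplification: it only looks for the literal catch-all patterns `*.py` and `**/*.py` and does not evaluate negation rules or directory-level patterns the way the ignore crate does. The function name is hypothetical.

```python
# Patterns that would hide every Python file in the tree
BROAD_PATTERNS = {"*.py", "**/*.py"}


def python_blocking_patterns(ignore_text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern) pairs for catch-all Python patterns."""
    hits = []
    for number, raw in enumerate(ignore_text.splitlines(), start=1):
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # blank lines and comments never match files
        if pattern in BROAD_PATTERNS:
            hits.append((number, pattern))
    return hits
```

For example, `python_blocking_patterns(Path(".ignore").read_text())` (with `from pathlib import Path`) lists the offending lines; a scoped pattern such as src/**/*.py is not reported.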
Symptom:
Error: Fixture 'my_fixture' not found
Common Causes:
| Cause | Solution |
|---|---|
| Missing conftest.py | Create conftest.py with fixture |
| Fixture in wrong scope | Move to correct conftest.py |
| Typo in fixture name | Check spelling |
| Dynamic fixture | Tach uses static analysis only |
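To see why a dynamically created fixture is invisible to static analysis, consider the sketch below. It illustrates the general technique using Python's `ast` module and is not Tach's actual implementation; `fixture_names` is a hypothetical helper. It only reports functions that carry a `fixture` decorator in the source text.

```python
import ast


def fixture_names(source: str) -> set[str]:
    """Collect names of functions decorated with `fixture` or `pytest.fixture`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for decorator in node.decorator_list:
            # Unwrap the called form, e.g. @pytest.fixture(scope="session")
            target = decorator.func if isinstance(decorator, ast.Call) else decorator
            if isinstance(target, ast.Attribute):
                label = target.attr
            else:
                label = getattr(target, "id", None)
            if label == "fixture":
                names.add(node.name)
    return names


SAMPLE = '''
import pytest

@pytest.fixture
def db():
    return object()

@pytest.fixture(scope="session")
def config():
    return {}

def _make():
    return 1

# Created by a call at import time: no decorator for a static scan to see
dynamic = pytest.fixture(_make, name="dynamic")
'''

print(sorted(fixture_names(SAMPLE)))
```

The scan finds `db` and `config` but not `dynamic`, because that fixture only exists once the module has been executed. Rewriting such fixtures with an explicit decorator makes them visible to a static scan.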
Diagnosis:
# Check conftest.py exists
ls -la tests/conftest.py
# Verify fixture is defined
grep -r "def my_fixture" tests/
Symptom: Async tests marked as skipped.
Solution: Ensure pytest-asyncio is installed and fixtures are properly scoped:
# conftest.py
import asyncio

import pytest

@pytest.fixture
def event_loop():
    # Give each test a fresh event loop and close it afterwards
    loop = asyncio.new_event_loop()
    yield loop
    loop.close()
Symptom: Long delay before first test runs.
Cause: Zygote initialization includes importing all dependencies.
Solution:
- Reduce imports in conftest.py
- Lazy-load heavy dependencies
- Use bytecode cache (enabled by default)
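One way to lazy-load a heavy dependency is to defer its import until the first call that needs it. This is a minimal sketch of that pattern; `lazy_module` and `parse_payload` are hypothetical names, and `json` merely stands in for an expensive package.

```python
import functools
import importlib


@functools.lru_cache(maxsize=None)
def lazy_module(name: str):
    """Import `name` on first use and return the cached module afterwards."""
    return importlib.import_module(name)


def parse_payload(text: str):
    # The import cost is paid on the first call, not when conftest.py is loaded
    return lazy_module("json").loads(text)
```

The trade-off is that the import cost moves into the first test that touches the dependency, so this helps most when only a few tests need it.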
Diagnosis:
# Profile import time
python -X importtime -c "import your_module" 2>&1 | head -30
Symptom: Tests consuming excessive memory.
Cause: Large test data or memory leaks in tests.
Solution:
# Check memory usage
/usr/bin/time -v tach-core .
# Profile with valgrind
valgrind --tool=massif ./target/release/tach-core .
Symptom: Tests running slower than expected.
Diagnosis:
# Check for toxic tests (require full restart)
tach-core list . 2>&1 | grep -i toxic
# Profile with perf
perf record -g ./target/release/tach-core .
perf report
Under high concurrency (100+ workers), userfaultfd page fault handling may experience latency spikes due to mmap_lock contention. Linux 6.4+ includes per-VMA locking, which eliminates this bottleneck.
| Kernel Version | mmap_lock Behavior | High-Concurrency Performance |
|---|---|---|
| < 6.4 | Global lock | May spike to 50ms under load |
| >= 6.4 | Per-VMA locking | Consistent sub-100us |
Recommendation: For production CI with high parallelism, use Linux 6.4+ (requires CONFIG_PER_VMA_LOCK=y in kernel config).
Symptom:
[WARN] Landlock not available in container
Solution: Add required capabilities:
# docker-compose.yml
services:
  tests:
    security_opt:
      - seccomp:unconfined
    cap_add:
      - SYS_PTRACE
      - SYS_ADMIN
Or with docker run:
docker run \
--cap-add SYS_PTRACE \
--cap-add SYS_ADMIN \
--security-opt seccomp=unconfined \
your-image
Symptom:
userfaultfd not available in container
Solution: Ensure host kernel supports userfaultfd and container has SYS_PTRACE:
# On host
sudo sysctl vm.unprivileged_userfaultfd=1
# In container
docker run --cap-add SYS_PTRACE your-image
Symptom: Tests fail in GitHub Actions with EPERM.
Solution: Ensure runner has required permissions. For self-hosted runners:
jobs:
  test:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: |
          # May need --no-isolation in some environments
          ./target/release/tach-core --no-isolation .
Symptom: No JUnit XML output in CI.
Solution:
# Specify output path explicitly
tach-core --junit-xml results.xml .
# Verify file exists
ls -la results.xml
Symptom: Database errors in Django tests.
Cause: Transaction isolation not working.
Solution: Configure Django for Tach:
# settings.py
DATABASES['default']['TEST'] = {
    'NAME': ':memory:',  # Use in-memory SQLite
}
Symptom:
OperationalError: too many connections
Solution: Configure connection limits:
# Django
DATABASES['default']['CONN_MAX_AGE'] = 0
# SQLAlchemy
engine = create_engine(url, pool_size=5, max_overflow=0)
# Verbose output
RUST_LOG=debug tach-core .
# Specific module
RUST_LOG=tach_core::isolation::sandbox=debug tach-core .
| Log Pattern | Meaning |
|---|---|
| [DEBUG] Landlock ABI: V3 | Landlock version detected |
| [WARN] Falling back to fork isolation | Snapshot mode unavailable |
| [INFO] Worker reset: 45us | Healthy reset time |
| [WARN] Worker reset: 5ms | Slow reset (check memory usage) |
| [ERROR] Worker crashed | Worker process died unexpectedly |
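When a debug log is long, the reset times can be extracted programmatically. The sketch below assumes log lines shaped like the examples in the table ("Worker reset: 45us", "Worker reset: 5ms"). The 1 ms threshold is an illustrative assumption, not a documented Tach value, and the function names are hypothetical.

```python
import re

# Matches e.g. "Worker reset: 45us" or "Worker reset: 5ms"
RESET_RE = re.compile(r"Worker reset: (\d+(?:\.\d+)?)(us|ms|s)\b")
UNIT_TO_US = {"us": 1, "ms": 1_000, "s": 1_000_000}


def reset_times_us(log_text: str) -> list[float]:
    """Return every worker reset time found in the log, in microseconds."""
    return [float(value) * UNIT_TO_US[unit] for value, unit in RESET_RE.findall(log_text)]


def slow_resets(log_text: str, threshold_us: float = 1_000.0) -> list[float]:
    """Return reset times above the threshold (assumed here to be 1 ms)."""
    return [t for t in reset_times_us(log_text) if t > threshold_us]
```

Feeding it the captured output of a `RUST_LOG=debug` run gives a quick count of slow resets without reading the log line by line.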
# System info
uname -a
python --version
cat /etc/os-release
# Tach version
./target/release/tach-core --version
# Kernel features
cat /sys/kernel/security/lsm
cat /proc/sys/vm/unprivileged_userfaultfd
# Run self-test
./target/release/tach-core self-test
When reporting issues, include:
- Full error message
- System diagnostic output (above)
- Minimal reproduction case
- pyproject.toml configuration
Tach uses structured error codes to help diagnose issues. Error codes follow the pattern EXXX where:
- E001-E004, E010, E012: User errors (test code, configuration, Python version)
- E005-E009, E011, E013-E016: System errors (kernel, permissions, resources)
- E017-E020: Extended user errors (syntax, fixtures, test status)
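For CI scripts that triage failures, the ranges above translate directly into a lookup. This sketch encodes only the ranges listed here; the `classify` function and the category labels it returns are hypothetical, and codes outside the listed ranges are reported as "unknown".

```python
import re

USER = {1, 2, 3, 4, 10, 12}
SYSTEM = {5, 6, 7, 8, 9, 11, 13, 14, 15, 16}
EXTENDED_USER = {17, 18, 19, 20}


def classify(code: str) -> str:
    """Map a code such as 'E011' to its category per the ranges listed above."""
    match = re.fullmatch(r"E(\d{3})", code)
    if match is None:
        raise ValueError(f"not an EXXX error code: {code!r}")
    number = int(match.group(1))
    if number in USER:
        return "user"
    if number in SYSTEM:
        return "system"
    if number in EXTENDED_USER:
        return "extended user"
    return "unknown"
```

A CI job could, for instance, retry on "system" codes (which often indicate a runner problem) and fail fast on "user" codes.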
Category: User
Cause: A test assertion failed during execution. The test's expected outcome did not match the actual result.
Solution:
- Review the test assertion and expected values
- Check if the code under test has changed
- Verify test data and fixtures are correct
Category: User
Cause: Failed to import a module in a test file. This could be a missing dependency or incorrect import path.
Solution:
- Ensure the module is installed: pip install <module>
- Verify the import path is correct
- Check for circular imports
- Ensure PYTHONPATH is set correctly
Category: User
Cause: A test requests a fixture that does not exist or is not accessible.
Solution:
- Define the fixture in conftest.py or the test file
- Check for typos in the fixture name
- Ensure conftest.py is in the correct directory
- Verify fixture scope is appropriate
Category: User
Cause: The marker expression passed via -m flag has invalid syntax.
Solution:
- Check marker syntax: -m "slow and not integration"
- Use proper boolean operators: and, or, not
- Ensure marker names are valid identifiers
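A marker expression can be sanity-checked before it is passed to -m. The sketch below approximates the rules above by borrowing Python's own parser: it accepts only identifiers combined with and, or, not, and parentheses. Tach's real parser may accept or reject other forms, so treat this as a pre-check only; the function name is hypothetical.

```python
import ast

# Node types that can appear in a pure and/or/not expression over identifiers
ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.UnaryOp,
    ast.And, ast.Or, ast.Not, ast.Name, ast.Load,
)


def is_valid_marker_expression(expression: str) -> bool:
    """Return True if the text is a boolean expression over plain identifiers."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
    except SyntaxError:
        return False
    # Any other node (calls, comparisons, numbers, unary minus) is rejected
    return all(isinstance(node, ALLOWED_NODES) for node in ast.walk(tree))
```

Because the check reuses Python's grammar, reserved words such as `True` or `None` are rejected as marker names.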
Category: User
Cause: A test exceeded the configured timeout limit.
Solution:
- Increase timeout: @pytest.mark.timeout(N) on the test
- Increase global timeout: --timeout N CLI flag
- Optimize the test for better performance
- Check for infinite loops or deadlocks
Category: User
Cause: The Python binary used does not match the expected version.
Solution:
- Set PYO3_PYTHON to the correct Python binary path
- Verify Python version: python --version
- Create a virtual environment with the correct version
Category: User
Cause: A Python syntax error was found in a test file.
Solution:
- Run python -m py_compile <file> to locate the error
- Fix the syntax error at the indicated line
- Check for missing colons, brackets, or indentation issues
Category: User
Cause: Fixtures have circular dependencies that cannot be resolved.
Solution:
- Review fixture dependency graph
- Refactor fixtures to break the cycle
- Use factory patterns to defer fixture creation
- Consider using fixture scopes to avoid the cycle
Category: User (Informational)
Cause: A test was skipped due to a skip marker or condition.
Note: This is informational, not an error. The test was intentionally skipped.
Category: User (Informational)
Cause: A test is marked as expected to fail (@pytest.mark.xfail).
Note: This is informational, not an error. The test is known to fail and tracked.
Category: System
Cause: The userfaultfd system call is not available. This is required for Tach's memory snapshot feature.
Solution:
- Enable unprivileged userfaultfd: sudo sysctl -w vm.unprivileged_userfaultfd=1
- Make it persistent by adding to /etc/sysctl.conf: vm.unprivileged_userfaultfd=1
- Alternatively, run with CAP_SYS_PTRACE: sudo setcap cap_sys_ptrace+ep ./tach-core
Category: System
Cause: Landlock filesystem sandboxing is not available. Requires Linux kernel 5.13+.
Solution:
- Upgrade to Linux kernel 5.13 or later
- Tach will run with degraded filesystem isolation
- Check kernel config: CONFIG_SECURITY_LANDLOCK=y
Category: System
Cause: An operation was denied due to insufficient permissions.
Solution:
- Check file and directory permissions
- Run with elevated privileges if necessary
- In containers, use --privileged flag
- Check SELinux/AppArmor policies
Category: System
Cause: System ran out of memory during test execution.
Solution:
- Reduce worker count: -n 2
- Increase system memory or swap
- Check for memory leaks in tests
- Use --force-toxic to reduce snapshot memory usage
Category: System
Cause: The process exceeded the file descriptor limit.
Solution:
- Increase file descriptor limit: ulimit -n 65536
- Make permanent in /etc/security/limits.conf by adding the lines "* soft nofile 65536" and "* hard nofile 65536"
- Reduce worker count to use fewer file descriptors
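The current limits can be inspected from Python before launching a large run. This is a minimal sketch using the standard `resource` module (Unix only); the 65536 target comes from the list above, and the helper names are hypothetical.

```python
import resource


def current_nofile_limits() -> tuple[int, int]:
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_satisfies(limit: int, required: int) -> bool:
    """An unlimited setting (RLIM_INFINITY) always satisfies the requirement."""
    return limit == resource.RLIM_INFINITY or limit >= required


if __name__ == "__main__":
    soft, hard = current_nofile_limits()
    print(f"soft={soft} hard={hard} ok={limit_satisfies(soft, 65536)}")
```

If the soft limit is too low but the hard limit is high enough, `ulimit -n` in the launching shell is sufficient; editing limits.conf is only needed when the hard limit itself is below the target.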
Category: System
Cause: Failed to mount an OverlayFS filesystem for test isolation.
Solution:
- Ensure the overlayfs kernel module is loaded: sudo modprobe overlay
- Check mount permissions
- Verify the work directory supports overlayfs
Category: System
Cause: Failed to create a Linux namespace for process isolation.
Solution:
- Check kernel configuration for namespace support
- Run with CAP_SYS_ADMIN: sudo setcap cap_sys_admin+ep ./tach-core
- In Docker, use --privileged or specific capability flags
Category: System
Cause: A worker process crashed with a signal (SIGSEGV, SIGBUS, etc.).
Solution:
- Check for memory corruption in C extensions
- Increase stack size: ulimit -s unlimited
- Run with --force-toxic to isolate problematic tests
- Check for segfault-causing code in tests
Category: System
Cause: Communication between supervisor and worker failed.
Solution:
- Check system resources (memory, file descriptors)
- Reduce worker count: -n 2
- Check for worker crashes in logs
- Ensure /dev/shm has sufficient space
Category: System
Cause: Memory snapshot verification failed, indicating corruption.
Solution:
- This is an internal error - please report a bug
- Try running with --force-toxic as a workaround
- Check for memory-corrupting C extensions
- Verify system memory is healthy: memtest86+
Tach uses static AST analysis for test discovery, which cannot reliably detect the following:
| Feature | Limitation | Workaround |
|---|---|---|
| pytest_generate_tests | Dynamic test generation not visible statically | Use explicit parametrize decorators |
| Autouse fixtures | May not be fully detected in all cases | Document in test or use explicit marks |
| Nested TestClass | Deeply nested classes may not be discovered | Flatten test class hierarchy |
| Plugin-generated tests | Tests created by plugins at runtime | Run with --collect-only to verify |
These limitations are inherent to static analysis. If tests are missing, use --no-ignore to verify they aren't being filtered, or run pytest --collect-only to compare discovery results.
- README - Project overview and quick start
- Configuration - CLI and config options
- Development - Build and test commands
- Sandbox - Security architecture
- Snapshot - Memory snapshot details