A Spring Boot application demonstrating thread pool starvation and cascading failures in microservices architecture.
ServiceConsumer is the "victim" application that demonstrates how a microservice can become completely unresponsive when its dependencies fail. This happens through thread pool exhaustion - when all worker threads become blocked waiting for a slow or unresponsive dependency.
When SlowDependency stops responding:
- Requests to /api/process-data wait for 3 seconds (timeout) before failing
- Under heavy load (50+ concurrent requests), all 20 threads become blocked
- Even /api/health becomes unreachable, despite having NO dependency on SlowDependency
- The entire application becomes unresponsive
This demonstrates the cascading failure problem in microservices.
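The effect can be reproduced without Spring or Tomcat. The following is a minimal JDK-only sketch, not the application's code: a fixed `ExecutorService` stands in for the Tomcat worker pool, `Thread.sleep` stands in for a blocked call to SlowDependency, and the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** JDK-only simulation of worker-pool starvation; no Spring or Tomcat involved. */
class StarvationSketch {

    /**
     * Fills a fixed pool with blocking "dependency" calls, then submits a trivial
     * "health" task and returns how long that task took to produce a result, in ms.
     */
    static long healthLatencyMillis(int poolSize, int slowRequests, long blockMillis) {
        ExecutorService workers = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < slowRequests; i++) {
                workers.submit(() -> {
                    try {
                        Thread.sleep(blockMillis); // stands in for a blocked socket read
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            long start = System.nanoTime();
            Future<String> health = workers.submit(() -> "UP"); // needs no dependency
            health.get(); // still has to wait for a free worker
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            workers.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("idle pool:      " + healthLatencyMillis(4, 0, 300) + " ms");
        System.out.println("saturated pool: " + healthLatencyMillis(4, 4, 300) + " ms");
    }
}
```

With an idle pool the health task returns almost immediately. With every worker blocked, it waits roughly as long as the slow calls do, even though it never touches the dependency.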
- Java: 8
- Spring Boot: 2.3.12.RELEASE
- Build Tool: Maven
- HTTP Client: RestTemplate (synchronous/blocking)
- Server: Embedded Tomcat (20 worker threads)
- Max Threads: 20 (realistic for production-like behavior)
- Connect Timeout: 2 seconds
- Read Timeout: 3 seconds
- Log Files: 50MB per file, 250MB total (5 files)
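To make the two timeouts concrete, here is a small JDK-only sketch. It is not the application's RestTemplateConfig, which is not shown in this README. It uses `HttpURLConnection` directly, and a local socket that accepts connections but never replies plays the part of a hung SlowDependency. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

/** JDK-only illustration of connect and read timeouts on a blocking HTTP client. */
class TimeoutSketch {

    /** Issues a GET and reports whether it was abandoned because a timeout fired. */
    static boolean timesOut(String url, int connectTimeoutMs, int readTimeoutMs) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(connectTimeoutMs); // max wait for the TCP handshake
            conn.setReadTimeout(readTimeoutMs);       // max wait for response bytes
            try {
                conn.getResponseCode(); // the calling thread blocks here
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Points the client at a local socket that accepts connections but never replies. */
    static boolean timesOutAgainstSilentServer(int connectTimeoutMs, int readTimeoutMs) {
        try (ServerSocket silent = new ServerSocket(0, 50, InetAddress.getByName("127.0.0.1"))) {
            String url = "http://127.0.0.1:" + silent.getLocalPort() + "/api/data";
            return timesOut(url, connectTimeoutMs, readTimeoutMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 2000 ms and 3000 ms mirror the connect and read timeouts listed above.
        System.out.println("timed out: " + timesOutAgainstSilentServer(2000, 3000));
    }
}
```

The connection itself succeeds quickly, so the connect timeout never fires. The calling thread is held for the full read timeout, which is what ties up a worker thread for 3 seconds per request.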
ServiceConsumer/
├── src/main/java/com/example/serviceconsumer/
│   ├── ServiceConsumerApplication.java        # Main application
│   ├── controller/
│   │   ├── DataController.java                # Vulnerable endpoint
│   │   ├── HealthController.java              # Control endpoint
│   │   └── MetricsController.java             # Thread pool metrics
│   ├── service/
│   │   └── DependencyService.java             # Calls SlowDependency
│   ├── filter/
│   │   └── RequestIdFilter.java               # Request ID tracking
│   ├── interceptor/
│   │   └── RestTemplateRequestIdInterceptor.java
│   ├── monitor/
│   │   └── ThreadPoolMonitor.java             # Scheduled monitoring
│   ├── config/
│   │   └── RestTemplateConfig.java            # HTTP client config
│   ├── model/
│   │   ├── ApiResponse.java
│   │   ├── HealthResponse.java
│   │   ├── MetricsResponse.java
│   │   └── ThreadPoolStats.java
│   └── exception/
│       └── GlobalExceptionHandler.java
└── src/main/resources/
    └── application.properties                 # Configuration
- Java 8 or higher
- Maven 3.6+
- SlowDependency app running on port 8081
mvn clean package
mvn spring-boot:run
Or run the JAR:
java -jar target/service-consumer-1.0.0.jar
The application will start on http://localhost:8080
GET http://localhost:8080/api/process-data
Behavior:
- Calls SlowDependency to fetch data
- Normal: Returns in ~100-200ms
- When SlowDependency hangs: Waits 3 seconds (timeout), then returns error
- Under load: Exhausts all threads, making entire app unresponsive
Example Response (Success):
{
  "status": "success",
  "data": "Data from SlowDependency",
  "message": "Data processed successfully",
  "timestamp": "2025-11-10T10:30:45.123",
  "processingTimeMs": 150
}
Example Response (Failure):
{
  "status": "error",
  "message": "Failed to fetch data from dependency",
  "error": "Read timed out",
  "timestamp": "2025-11-10T10:30:48.168",
  "processingTimeMs": 3012
}
GET http://localhost:8080/api/health
Behavior:
- NO external dependencies
- Normal: Returns in <10ms
- During thread starvation: HANGS (no available threads)
Response:
{
  "status": "UP",
  "timestamp": "2025-11-10T10:30:45.123",
  "message": "Application is healthy"
}
GET http://localhost:8080/api/metrics
Response:
{
  "threadPool": {
    "maxThreads": 20,
    "activeThreads": 15,
    "queueSize": 0,
    "completedTasks": 1234,
    "exhausted": false
  },
  "timestamp": "2025-11-10T10:30:45.123",
  "applicationName": "ServiceConsumer",
  "version": "1.0.0"
}
GET http://localhost:8080/actuator/health
# Get active threads
GET http://localhost:8080/actuator/metrics/tomcat.threads.busy
# Get max threads
GET http://localhost:8080/actuator/metrics/tomcat.threads.config.max
GET http://localhost:8080/actuator/threaddump
During thread starvation, this shows the worker threads blocked waiting for SlowDependency. Because Actuator is served on port 8080 by the same Tomcat worker pool (unless a separate management port is configured), this request may itself be delayed until a worker thread frees up.
Three bash scripts are provided for easy demonstration:
Quick verification that all endpoints are working:
./test.sh
Tests:
- ✅ /api/health endpoint
- ✅ /api/process-data endpoint
- ✅ /api/metrics endpoint
- ✅ Actuator endpoints
Full guided demonstration with explanations:
./demo.sh
This script walks through:
- Scenario 1: Baseline (healthy state)
- Scenario 2: Single request timeout
- Scenario 3: Thread pool exhaustion (the main demo)
Features:
- Color-coded output
- Interactive (press ENTER to continue)
- Automatic thread pool monitoring
- Thread dump analysis
- Clear observations and takeaways
Automated load testing to trigger thread pool exhaustion:
One-time load test:
./load-test.sh [concurrent_requests]
# Examples:
./load-test.sh # Default: 50 concurrent requests
./load-test.sh 100 # 100 concurrent requests
Continuous load test (constant load):
./load-test.sh [concurrent_requests] [interval_seconds]
# Examples:
./load-test.sh 50 5 # 50 requests every 5 seconds (forever)
./load-test.sh 30 10 # 30 requests every 10 seconds (forever)
./load-test.sh 20 3 # 20 requests every 3 seconds (forever)
Use Cases:
- One-time mode: Demonstrate instant thread pool exhaustion
- Continuous mode: Simulate sustained load over hours (gradual degradation)
What it does:
One-time mode:
- Launches N concurrent requests to /api/process-data
- Monitors thread pool during load
- Tests if /api/health is accessible (should hang!)
- Captures thread dump for analysis
- Waits for recovery
Continuous mode:
- Sends batches of N requests every X seconds
- Monitors thread pool status after each batch
- Shows statistics every 10 batches
- Runs forever until Ctrl+C (graceful shutdown)
- Perfect for simulating 3-4 hour gradual starvation
Output:
- Real-time batch execution logs
- Thread pool statistics (color-coded: healthy/high load/exhausted)
- Cumulative statistics (total batches, requests, running time)
- Thread dumps (one-time mode only)
Example Output (Continuous Mode):
[2025-11-10 10:30:45] Batch #1: Launching 50 concurrent requests...
✓ Batch #1: 50 requests sent
Thread Pool: 15/20 (Healthy)
[2025-11-10 10:30:50] Batch #2: Launching 50 concurrent requests...
✓ Batch #2: 50 requests sent
Thread Pool: 18/20 (High Load)
[2025-11-10 10:30:55] Batch #3: Launching 50 concurrent requests...
✓ Batch #3: 50 requests sent
Thread Pool: 20/20 (EXHAUSTED!)
Statistics:
Batches sent: 10
Total requests: 500
Running time: 50s
Avg requests/sec: 10
Stop the test: Press Ctrl+C for graceful shutdown
Prerequisites: SlowDependency running normally on port 8081
# Test health endpoint - should be fast
curl http://localhost:8080/api/health
# Test data endpoint - should be fast
curl http://localhost:8080/api/process-data
# Check thread pool - should show low activity
curl http://localhost:8080/api/metrics
Expected:
- All requests complete in <200ms
- Thread pool shows 1-2 active threads
Prerequisites: SlowDependency in "hang" mode
# Single request - will timeout after 3 seconds
curl http://localhost:8080/api/process-data
# Health endpoint should still work (threads available)
curl http://localhost:8080/api/health
Expected:
- /api/process-data takes 3 seconds, returns error
- /api/health still fast (threads available)
Prerequisites: SlowDependency in "hang" mode
Terminal 1: Monitor Logs
tail -f logs/serviceconsumer.log
Terminal 2: Flood with Requests
# Send 50 concurrent requests (more than 20 threads)
for i in {1..50}; do
  curl http://localhost:8080/api/process-data &
done
Terminal 3: Try Health Endpoint
# This should HANG! No threads available!
curl http://localhost:8080/api/health
Terminal 4: Check Thread Dump
# Shows all 20 threads blocked
curl http://localhost:8080/actuator/threaddump
Expected Results:
- All 50 requests start simultaneously
- First 20 requests grab all available threads
- Remaining 30 requests queue up
- All 20 threads block waiting for SlowDependency (3-second timeout)
- /api/health request cannot be processed (no free threads)
- Thread dump shows all threads in TIMED_WAITING or RUNNABLE state with stack traces pointing to RestTemplate socket reads
- After 3 seconds, first batch fails, next batch starts (continues until all 50 complete)
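The batch behavior in the expected results comes down to simple arithmetic. The sketch below is a back-of-envelope estimate, assuming every request blocks for the full read timeout and that connection setup and request overhead are negligible. The class and method names are hypothetical.

```java
/** Back-of-envelope drain time when every request blocks for the full read timeout. */
class DrainTimeSketch {

    /** Number of timeout-length waves needed to work through the burst (ceiling division). */
    static int waves(int requests, int workerThreads) {
        return (requests + workerThreads - 1) / workerThreads;
    }

    /** Seconds until the last request of the burst has failed. */
    static int drainSeconds(int requests, int workerThreads, int readTimeoutSeconds) {
        return waves(requests, workerThreads) * readTimeoutSeconds;
    }

    public static void main(String[] args) {
        System.out.println(waves(50, 20));           // 3 waves: 20 + 20 + 10 requests
        System.out.println(drainSeconds(50, 20, 3)); // 9 seconds
    }
}
```

Under these assumptions, a single burst of 50 requests keeps the pool fully or partly occupied for about 9 seconds.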
Log Evidence:
2025-11-10 10:30:00.000 [http-nio-8080-exec-1] [uuid-1] INFO - Incoming request: GET /api/process-data
...
2025-11-10 10:30:03.000 [http-nio-8080-exec-20] [uuid-20] INFO - Incoming request: GET /api/process-data
2025-11-10 10:30:05.000 [pool-monitor] WARN - ⚠️ THREAD POOL EXHAUSTED! Active: 20/20 (100%) [ALL THREADS BUSY]
The application logs thread pool status every 30 seconds:
# Healthy state
2025-11-10 10:30:00.000 [pool-monitor] INFO - Thread Pool Status: Active: 2/20 (10%) [HEALTHY]
# Moderate load
2025-11-10 10:30:30.000 [pool-monitor] INFO - Thread Pool Status: Active: 12/20 (60%) [MODERATE LOAD]
# High load
2025-11-10 10:31:00.000 [pool-monitor] WARN - Thread Pool Status: Active: 18/20 (90%) [HIGH LOAD]
# Exhausted!
2025-11-10 10:31:30.000 [pool-monitor] WARN - ⚠️ THREAD POOL EXHAUSTED! Active: 20/20 (100%) [ALL THREADS BUSY]
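The status labels in these log lines could be derived from the active/max ratio along the following lines. This is only a sketch: the README does not state the thresholds ThreadPoolMonitor actually uses, so the 50% and 80% cutoffs and the class name below are assumptions chosen to be consistent with the four sample lines.

```java
/** Maps thread pool utilization to a status label; thresholds are assumed, not the app's. */
class PoolStatusSketch {

    static int percent(int active, int max) {
        if (max <= 0) {
            throw new IllegalArgumentException("max must be positive");
        }
        return active * 100 / max;
    }

    static String classify(int active, int max) {
        int pct = percent(active, max);
        if (active >= max) {
            return "ALL THREADS BUSY";
        }
        if (pct >= 80) { // assumed cutoff
            return "HIGH LOAD";
        }
        if (pct >= 50) { // assumed cutoff
            return "MODERATE LOAD";
        }
        return "HEALTHY";
    }

    /** Formats the "Active: a/m (p%) [LABEL]" part of the log line. */
    static String describe(int active, int max) {
        return "Active: " + active + "/" + max
                + " (" + percent(active, max) + "%) [" + classify(active, max) + "]";
    }

    public static void main(String[] args) {
        System.out.println(describe(2, 20));  // Active: 2/20 (10%) [HEALTHY]
        System.out.println(describe(18, 20)); // Active: 18/20 (90%) [HIGH LOAD]
    }
}
```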
Every request gets a unique ID that appears in all logs:
# Send request with custom ID
curl -H "X-Request-ID: my-test-123" http://localhost:8080/api/process-data
# Filter logs by request ID
grep "my-test-123" logs/serviceconsumer.log
Example:
2025-11-10 10:30:45.123 [http-nio-8080-exec-1] [my-test-123] INFO - Incoming request: GET /api/process-data
2025-11-10 10:30:45.125 [http-nio-8080-exec-1] [my-test-123] DEBUG - Calling SlowDependency
2025-11-10 10:30:48.130 [http-nio-8080-exec-1] [my-test-123] ERROR - SlowDependency call failed
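The ID handling shown here (reuse a caller-supplied X-Request-ID, otherwise generate one) might look roughly like this. It is a JDK-only sketch of the decision logic. The real RequestIdFilter is not shown in this README, and the class and method names are hypothetical.

```java
import java.util.UUID;

/** Chooses the request ID to log: the caller's header value if present, else a fresh UUID. */
class RequestIdSketch {

    static final String HEADER = "X-Request-ID";

    /** headerValue is whatever the X-Request-ID header contained, or null if it was absent. */
    static String resolve(String headerValue) {
        if (headerValue != null && !headerValue.trim().isEmpty()) {
            return headerValue.trim();
        }
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("my-test-123")); // my-test-123
        System.out.println(resolve(null));          // a random UUID, different on every run
    }
}
```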
Key configuration in application.properties:
# Thread Pool
server.tomcat.threads.max=20
server.tomcat.threads.min-spare=10
# HTTP Client Timeouts
http.client.connect-timeout=2000
http.client.read-timeout=3000
# Dependency URL
dependency.service.url=http://localhost:8081/api/data
# Logging
logging.file.name=logs/serviceconsumer.log
logging.file.max-size=50MB
logging.file.total-size-cap=250MB
- Thread Pool Starvation is Real: When all threads are blocked, the entire application stops responding
- Cascading Failures: A failure in one dependency can make the entire application unreachable
- Independent Endpoints Affected: Even /api/health (with NO external dependencies) becomes unreachable
- Timeouts Are Not Enough: Timeouts eventually free threads, but they don't prevent starvation under sustained load. With 20 threads and a 3-second timeout, the pool can fail at most about 6.7 requests per second (20 / 3), so any sustained arrival rate above that keeps every thread busy
- Solutions (not implemented here, but worth discussing):
  - Circuit Breakers (Hystrix, Resilience4j)
  - Bulkheads (isolated thread pools)
  - Async/non-blocking clients (WebClient)
  - Rate limiting
  - Back pressure
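As one illustration of the bulkhead idea, the sketch below caps how many callers may be inside the dependency call at once, using a plain `java.util.concurrent.Semaphore`. This is a simplified, semaphore-based stand-in rather than isolated thread pools or the Resilience4j API, it is not part of this application, and the class name is hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Semaphore bulkhead: limits concurrent dependency calls and fails fast beyond the limit. */
class BulkheadSketch {

    private final Semaphore permits;

    BulkheadSketch(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    <T> T call(Supplier<T> dependencyCall, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get(); // reject immediately instead of blocking another worker
        }
        try {
            return dependencyCall.get();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        BulkheadSketch bulkhead = new BulkheadSketch(1);
        // The inner call runs while the outer call still holds the only permit.
        Supplier<String> inner = () -> bulkhead.call(() -> "inner ok", () -> "rejected");
        System.out.println(bulkhead.call(inner, () -> "outer rejected")); // rejected
        System.out.println(bulkhead.call(() -> "ok", () -> "rejected"));  // ok
    }
}
```

With a limit of, say, 10 on a 20-thread pool, at most 10 workers could be stuck on SlowDependency at any moment, leaving the rest free for endpoints such as /api/health. The limit itself would need tuning for a real workload.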
- Check Java version: java -version (should be 8+)
- Check port 8080 is available: lsof -i :8080
- Ensure SlowDependency is running on port 8081
- Check the dependency URL (dependency.service.url) in application.properties
- Capture the thread dump during active load (when threads are blocked)
- Look for threads with RestTemplate or SocketInputStream in stack traces
# Clean build
mvn clean package
# Run with production settings (if needed)
java -jar target/service-consumer-1.0.0.jar
Built to demonstrate thread pool starvation and cascading failures in microservices.
This is a demonstration application for educational purposes.