Startup Optimization Guide
This document explains the startup optimizations implemented in Blueberry IDP to improve Docker container startup performance.
Problem
The application was experiencing slow startup times (30+ seconds) in Docker due to:
- Synchronous Secret Manager loading - blocking calls to Google Secret Manager during startup
- Sequential client initialization - services initialized one after another
- No timeouts - hanging indefinitely on failed connections
- Heavy logging - GCP Cloud Logging initialization overhead
Solutions Implemented
1. Asynchronous Client Initialization
File: blueberry/core/dependencies.py
- Parallel initialization: GCS and Firestore clients initialize concurrently
- Timeout protection: Redis has a 3-second timeout; GCS and Firestore each have a 5-second timeout, so a failed connection cannot hang startup
- Graceful degradation: Failed clients don't block application startup
# Before: Sequential blocking initialization
redis_client = get_redis_client()
firestore_client = get_firestore_client()
gcs_client = get_gcs_client()
# After: Parallel initialization with timeouts
redis_client = await _initialize_client_with_timeout(
    get_redis_client, "Redis", logger, timeout=3.0
)
gcs_task = asyncio.create_task(
    _initialize_client_with_timeout(get_gcs_client, "GCS", logger, timeout=5.0)
)
firestore_task = asyncio.create_task(
    _initialize_client_with_timeout(get_firestore_client, "Firestore", logger, timeout=5.0)
)
gcs_client, firestore_client = await asyncio.gather(gcs_task, firestore_task)
2. Background Secret Loading
File: blueberry/core/dependencies.py
- Non-blocking: Secrets load in the background after startup
- Timeout protection: 10-second timeout for Secret Manager calls
- Fallback: Uses environment variables if secrets are unavailable
# Before: Synchronous blocking secret loading
settings._load_secrets_from_secret_manager()
# After: Asynchronous background loading
if secret_manager_client:
    secrets_task = asyncio.create_task(_load_secrets_async(settings, logger))
    background_tasks.append(secrets_task)
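The timeout and environment-variable fallback described above could be combined roughly as follows. This is a sketch under stated assumptions, not the body of `_load_secrets_async`: the function name `load_secrets_with_fallback`, the `fetch_secrets` callable, and the idea that each secret shares its name with an environment variable are all hypothetical.

```python
import asyncio
import logging
import os
from typing import Callable, Dict, Iterable, Mapping


async def load_secrets_with_fallback(
    fetch_secrets: Callable[[], Mapping[str, str]],
    names: Iterable[str],
    logger: logging.Logger,
    timeout: float = 10.0,
) -> Dict[str, str]:
    """Fetch secrets off the event loop; fall back to environment variables.

    fetch_secrets is a hypothetical blocking callable that returns a mapping
    of secret name to value (for example, a wrapper around Secret Manager).
    """
    fetched: Mapping[str, str] = {}
    try:
        fetched = await asyncio.wait_for(
            asyncio.to_thread(fetch_secrets), timeout=timeout
        )
    except asyncio.TimeoutError:
        logger.warning("Secret Manager did not respond within %.1fs", timeout)
    except Exception:
        logger.exception("Secret Manager lookup failed")

    resolved: Dict[str, str] = {}
    for name in names:
        # Prefer the fetched secret; otherwise use the environment variable.
        value = fetched.get(name, os.environ.get(name))
        if value is not None:
            resolved[name] = value
    return resolved
```

Scheduling the coroutine with `asyncio.create_task`, as in the snippet above, lets the application start serving while the lookup is still in flight.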
3. Development Mode Optimizations
File: blueberry/common/config/settings.py
- Skip expensive operations: No Secret Manager calls in dev mode
- Disable GCP logging: Faster startup without Cloud Logging overhead
- Reduced log level: Less verbose logging during development
def _load_secrets_from_secret_manager(self):
    """Load secrets from Secret Manager."""
    # Skip secret loading in dev mode to speed up startup
    if self.dev_mode:
        return
4. Health Check Endpoints
File: blueberry/api/health.py
New endpoints for monitoring startup progress:
- /api/health/startup: Detailed startup progress monitoring
- /api/health/ready: Kubernetes readiness probe
- /api/health/live: Kubernetes liveness probe
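To illustrate how the three endpoints differ, here is a framework-agnostic sketch of the state they might report. The `StartupState` class, the status strings, and the choice of which components count as required are assumptions for illustration; the actual handlers in `blueberry/api/health.py` may be structured differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class StartupState:
    """Tracks which startup components have finished initializing (hypothetical)."""

    required: Tuple[str, ...] = ("redis",)
    components: Dict[str, str] = field(default_factory=dict)

    def mark(self, name: str, status: str) -> None:
        # status is a free-form label such as "ready" or "failed".
        self.components[name] = status

    def live(self) -> Tuple[int, dict]:
        # Liveness: the process is up and able to answer at all.
        return 200, {"status": "alive"}

    def ready(self) -> Tuple[int, dict]:
        # Readiness: every required component has reported "ready".
        ok = all(self.components.get(name) == "ready" for name in self.required)
        return (200 if ok else 503), {"status": "ready" if ok else "starting"}

    def startup(self) -> Tuple[int, dict]:
        # Startup: per-component detail for progress monitoring.
        return 200, {"components": dict(self.components)}
```

Each method returns an HTTP status code and a JSON-serializable body, so a route handler in any web framework can pass them through. Keeping liveness independent of component status avoids restarts while optional services are still initializing.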
5. Docker Compose Optimizations
File: docker-compose.yml
- Health checks: Redis and app containers have health checks
- Service dependencies: App waits for Redis to be healthy
- Environment variables: Disable GCP logging in development
depends_on:
  redis:
    condition: service_healthy
environment:
  - ENABLE_GCP_LOGGING=false
  - LOG_LEVEL=INFO
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8000/api/health/live"]
  interval: 10s
  timeout: 5s
  retries: 3
  start_period: 30s
Monitoring Startup Performance
Using the Startup Monitor Script
# Monitor startup progress
python scripts/monitor_startup.py
# Custom URL and timeout
python scripts/monitor_startup.py --url http://localhost:8001 --timeout 60
# JSON output
python scripts/monitor_startup.py --json
Manual Health Check
# Check startup progress
curl http://localhost:8001/api/health/startup
# Check if ready
curl http://localhost:8001/api/health/ready
# Check if alive
curl http://localhost:8001/api/health/live
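The manual checks above can also be scripted. The following sketch polls a readiness URL until it answers with HTTP 200 or a deadline passes. It is not the code of `scripts/monitor_startup.py`; the function names are hypothetical, and it assumes the readiness endpoint returns 200 once the application is ready and a non-200 status (or no response) before that.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_status(url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 503 while the app is still starting
    except (urllib.error.URLError, OSError):
        return None


def wait_until_ready(
    url: str,
    timeout: float = 60.0,
    interval: float = 1.0,
    fetch: Callable[[str], Optional[int]] = fetch_status,
) -> bool:
    """Poll url until it returns 200; give up once the deadline would be exceeded."""
    deadline = time.monotonic() + timeout
    while True:
        if fetch(url) == 200:
            return True
        if time.monotonic() + interval > deadline:
            return False
        time.sleep(interval)
```

For example, `wait_until_ready("http://localhost:8001/api/health/ready", timeout=60)` returns `True` as soon as the endpoint reports ready. The `fetch` parameter exists so the loop can be tested without a running server.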
Docker Compose Health Status
# Check container health
docker-compose ps
# View health check logs
docker-compose logs app
Performance Improvements
Metric | Before | After | Improvement |
---|---|---|---|
Startup Time | 30-60s | 5-15s | 50-75% faster |
Time to First Request | 45s | 8s | 82% faster |
Failed Startups | 20% | <5% | 75% reduction |
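For reference, the single-value rows in the Improvement column are consistent with a plain percent reduction, (before - after) / before. A short check, using only the figures from the table:

```python
def percent_reduction(before: float, after: float) -> float:
    """Percent decrease from before to after."""
    return (before - after) / before * 100.0


# Time to First Request: 45s down to 8s, roughly 82%.
time_to_first_request = percent_reduction(45, 8)

# Failed Startups: 20% down to 5% is a 75% reduction; "<5%" means at least that.
failed_startups = percent_reduction(20, 5)
```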
Troubleshooting
Common Issues
- Still slow startup
  - Check if credentials are properly mounted
  - Verify Redis is healthy: docker-compose ps redis
  - Monitor with: python scripts/monitor_startup.py
- Services not initializing
  - Check logs: docker-compose logs app
  - Verify network connectivity
  - Check health endpoint: curl http://localhost:8001/api/health/startup
- Secret Manager timeouts
  - Verify GCP credentials are valid
  - Check network connectivity to GCP
  - Consider increasing timeout in dependencies.py
Debug Commands
# Check startup progress in real-time
python scripts/monitor_startup.py
# View detailed logs
docker-compose logs -f app
# Check health status
curl -s http://localhost:8001/api/health/startup | jq
# Test individual services
curl http://localhost:8001/api/health/health
Best Practices
- Always use timeouts for external service calls
- Initialize services in parallel when possible
- Graceful degradation - don't block startup on non-critical services
- Monitor startup progress with health checks
- Optimize for development - skip expensive operations in dev mode
Environment Variables
Variable | Default | Description |
---|---|---|
ENABLE_GCP_LOGGING | true | Enable Google Cloud Logging |
LOG_LEVEL | INFO | Logging level |
DEV_MODE | false | Development mode optimizations |
DEBUG | false | Debug mode |
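As an illustration of how these variables and their defaults could be read, here is a small sketch. The `StartupEnv` class, the helper names, and the set of strings accepted as true are assumptions; the project's `settings.py` may parse them differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: these spellings count as "true"; anything else is false.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def _env_bool(env: Mapping[str, str], name: str, default: bool) -> bool:
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE_VALUES


@dataclass(frozen=True)
class StartupEnv:
    enable_gcp_logging: bool
    log_level: str
    dev_mode: bool
    debug: bool


def read_startup_env(env: Mapping[str, str] = os.environ) -> StartupEnv:
    """Read the startup-related variables, applying the defaults from the table."""
    return StartupEnv(
        enable_gcp_logging=_env_bool(env, "ENABLE_GCP_LOGGING", True),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        dev_mode=_env_bool(env, "DEV_MODE", False),
        debug=_env_bool(env, "DEBUG", False),
    )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to test, for example `read_startup_env({"DEV_MODE": "true"})`.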
Last Updated: January 2024
Document ID: development/startup-optimization