Operations
Monitoring
Health and Readiness
# Health check (is the process running?)
curl http://localhost:3000/wayfinder/health
# Readiness check (is the router ready to serve traffic?)
curl http://localhost:3000/wayfinder/ready
# API Guard compatible health check
curl http://localhost:3000/ar-io/healthcheck
Use these endpoints for load balancer health checks and orchestration systems.
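For deploy scripts that should wait until the router can take traffic, a polling loop around the readiness endpoint is one option. The sketch below is a minimal example, assuming the endpoint answers with a non-2xx status while the router is not ready; the function name wait_for_ready and its arguments are illustrative, not part of the router.

```shell
# Poll the readiness endpoint until it answers with an HTTP success status.
# Usage: wait_for_ready [base_url] [max_attempts] [delay_seconds]
wait_for_ready() {
  base_url=${1:-http://localhost:3000}
  max_attempts=${2:-30}
  delay=${3:-1}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s hides progress output.
    if curl -fs -o /dev/null "$base_url/wayfinder/ready"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  # Still not ready after max_attempts polls.
  return 1
}
```

For example, wait_for_ready http://localhost:3000 60 2 polls every 2 seconds for up to 60 attempts and returns non-zero if the router never becomes ready.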
Prometheus Metrics
curl http://localhost:3000/wayfinder/metrics
Exposes standard metrics for scraping by Prometheus. Configure your Prometheus scrape targets to point at /wayfinder/metrics.
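As a starting point for the scrape target, the helper below prints a Prometheus scrape_configs fragment with metrics_path set to /wayfinder/metrics. It is a sketch: the job name wayfinder-router and the helper name wayfinder_scrape_config are arbitrary choices made here, and the target defaults to localhost:3000 to match the examples above.

```shell
# Print a Prometheus scrape_configs fragment for one router instance.
# Usage: wayfinder_scrape_config [host:port] [job_name]
wayfinder_scrape_config() {
  target=${1:-localhost:3000}
  job=${2:-wayfinder-router}   # arbitrary label, not defined by the router
  cat <<EOF
scrape_configs:
  - job_name: "$job"
    metrics_path: /wayfinder/metrics
    static_configs:
      - targets: ["$target"]
EOF
}
```

Merge the printed job entry into the scrape_configs section of your existing prometheus.yml rather than appending the whole fragment, since that key may only appear once.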
Gateway Statistics
# Summary statistics
curl http://localhost:3000/wayfinder/stats/gateways
# List all tracked gateways
curl http://localhost:3000/wayfinder/stats/gateways/list
# Detailed stats for a specific gateway
curl http://localhost:3000/wayfinder/stats/gateways/:gateway
# Export telemetry data
curl http://localhost:3000/wayfinder/stats/export
Router Info
curl http://localhost:3000/wayfinder/info
Returns current configuration, version, uptime, and operating mode.
Telemetry Storage
Telemetry is stored in SQLite at TELEMETRY_DB_PATH (default ./data/telemetry.db). Configure sampling rates to control storage growth:
TELEMETRY_SAMPLE_SUCCESS=0.1 # Sample 10% of successful requests
TELEMETRY_SAMPLE_ERRORS=1.0 # Record all errors
TELEMETRY_RETENTION_DAYS=30 # Auto-purge old data
Log Levels
Set log verbosity via LOG_LEVEL:
LOG_LEVEL=debug # trace, debug, info, warn, error, fatal
Content Moderation
Block ArNS names or transaction IDs from being served.
Setup
MODERATION_ENABLED=true
MODERATION_ADMIN_TOKEN=<your-secure-token>
Generate a secure token:
openssl rand -base64 32
The blocklist is stored at MODERATION_BLOCKLIST_PATH (default ./data/blocklist.json) and is hot-reloaded on changes.
API Endpoints
All admin endpoints require an Authorization: Bearer <token> header.
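To avoid repeating the header on every call, the admin requests can be wrapped in a small helper. This is a sketch only: moderation_api and the WAYFINDER_URL variable are hypothetical names introduced here, and the token is read from MODERATION_ADMIN_TOKEN in the calling shell's environment.

```shell
# Call a moderation endpoint with the bearer token attached.
# Usage: moderation_api METHOD PATH [JSON_BODY]
# WAYFINDER_URL is a hypothetical variable used only by this helper.
moderation_api() {
  method=$1
  path=$2
  body=${3:-}
  base_url=${WAYFINDER_URL:-http://localhost:3000}
  if [ -z "${MODERATION_ADMIN_TOKEN:-}" ]; then
    echo "MODERATION_ADMIN_TOKEN is not set" >&2
    return 2
  fi
  if [ -n "$body" ]; then
    # Requests with a body also declare it as JSON.
    curl -sS -X "$method" "$base_url/wayfinder/moderation/$path" \
      -H "Authorization: Bearer $MODERATION_ADMIN_TOKEN" \
      -H "Content-Type: application/json" \
      -d "$body"
  else
    curl -sS -X "$method" "$base_url/wayfinder/moderation/$path" \
      -H "Authorization: Bearer $MODERATION_ADMIN_TOKEN"
  fi
}
```

With this in place, moderation_api GET blocklist lists blocked content and moderation_api DELETE block/arns/badcontent removes a block. The check endpoint needs no token, so it can still be called with plain curl.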
# Block an ArNS name
curl -X POST http://localhost:3000/wayfinder/moderation/block \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"type":"arns","value":"badcontent","reason":"Policy violation"}'
# Block a transaction ID
curl -X POST http://localhost:3000/wayfinder/moderation/block \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"type":"txid","value":"abc123...","reason":"DMCA takedown"}'
# List all blocked content
curl http://localhost:3000/wayfinder/moderation/blocklist \
-H "Authorization: Bearer YOUR_TOKEN"
# Check if content is blocked (no auth required)
curl http://localhost:3000/wayfinder/moderation/check/arns/somename
# Moderation statistics
curl http://localhost:3000/wayfinder/moderation/stats \
-H "Authorization: Bearer YOUR_TOKEN"
# Unblock content
curl -X DELETE http://localhost:3000/wayfinder/moderation/block/arns/badcontent \
-H "Authorization: Bearer YOUR_TOKEN"
# Reload blocklist from disk
curl -X POST http://localhost:3000/wayfinder/moderation/reload \
-H "Authorization: Bearer YOUR_TOKEN"
Cache Management
For production, enable disk-backed caching. See Configuration for all cache settings.
Clearing Caches
From source:
bun run clear:telemetry # Clear telemetry database
bun run clear:all # Clear all data (telemetry + cache)
Binary or Docker: Delete the data directory contents directly:
rm -rf ./data/content-cache/*
rm ./data/telemetry.db
Graceful Shutdown
The router handles SIGTERM and SIGINT signals with a two-phase shutdown:
- Drain phase - Stop accepting new connections, wait for in-flight requests
- Force exit - If drain exceeds timeout, force shutdown
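Seen from the supervising side, the same pattern is: send SIGTERM, give the process time to drain, and force-kill only if it is still running afterwards. The function below is a generic sketch of that sequence for a locally running process; stop_with_grace is a name made up for this example, and the grace period should be at least as long as the router's total shutdown timeout configured below so that the drain phase is not cut short.

```shell
# Send SIGTERM, wait up to timeout_s seconds, then fall back to SIGKILL.
# Usage: stop_with_grace PID [timeout_s]
# Returns 0 if the process exited on its own (or was already gone),
# 1 if it had to be force-killed.
stop_with_grace() {
  pid=$1
  timeout_s=${2:-30}
  # If SIGTERM cannot be delivered, the process is already gone.
  kill -TERM "$pid" 2>/dev/null || return 0
  waited=0
  # kill -0 only tests whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout_s" ]; then
      kill -KILL "$pid" 2>/dev/null
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A return value of 1 means the grace period was too short for the drain to finish, which is worth logging.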
# Configuration
SHUTDOWN_DRAIN_TIMEOUT_MS=15000 # 15s drain period
SHUTDOWN_TIMEOUT_MS=30000 # 30s total timeout
# Graceful stop
kill -TERM <pid>
# Docker (sends SIGTERM, waits 10s, then SIGKILL)
docker stop wayfinder-router
# Docker with custom timeout
docker stop -t 30 wayfinder-router
Troubleshooting
Port Conflicts
If port 3000 or 3001 is already in use:
PORT=3080 ADMIN_PORT=3081 ./wayfinder-router-linux-x64
ADMIN_PORT must differ from PORT; the router validates this at startup.
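A launcher script can catch a bad port pair before the router starts. The check below is a sketch that only mirrors the rule stated above (two valid, distinct ports); check_ports is an illustrative name, and the router's own startup validation may be stricter or differently worded.

```shell
# Pre-flight check for PORT and ADMIN_PORT before launching the router.
# Usage: check_ports PORT ADMIN_PORT
check_ports() {
  for p in "$1" "$2"; do
    # Reject empty values, non-digits, and anything longer than 5 digits.
    case $p in
      ''|*[!0-9]*|??????*)
        echo "invalid port: '$p'" >&2
        return 1 ;;
    esac
    if [ "$p" -lt 1 ] || [ "$p" -gt 65535 ]; then
      echo "port out of range: $p" >&2
      return 1
    fi
  done
  if [ "$1" -eq "$2" ]; then
    echo "ADMIN_PORT must differ from PORT" >&2
    return 1
  fi
  return 0
}
```

For example: check_ports 3080 3081 && PORT=3080 ADMIN_PORT=3081 ./wayfinder-router-linux-x64. The check does not detect whether another process already holds either port.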
Gateway Health Issues
Check gateway status via the admin UI Gateways page or:
curl http://localhost:3000/wayfinder/stats/gateways
If all gateways show as unhealthy:
- Verify internet connectivity
- Check ROUTING_GATEWAY_SOURCE: if static, ensure URLs are correct
- Check circuit breaker settings
- Review logs: LOG_LEVEL=debug
ArNS Resolution Failures
ArNS names require consensus across multiple verification gateways. If resolution fails:
- Check ARNS_CONSENSUS_THRESHOLD (default: 2)
- Verify the ArNS name exists on the network
- Check verification gateway health
- Ensure VERIFICATION_ENABLED=true
Subdomain Routing Not Working
ArNS subdomains require BASE_DOMAIN to match your actual domain:
# For local development
BASE_DOMAIN=localhost
# For production
BASE_DOMAIN=yourdomain.com
Requests to {name}.yourdomain.com are only recognized as ArNS subdomains if BASE_DOMAIN=yourdomain.com.
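The matching rule can be illustrated with a small function that derives the ArNS name from a Host header. This is a sketch of the idea, not the router's implementation: arns_name_from_host is a made-up name, and treating only single-label subdomains as ArNS names is an assumption of this example.

```shell
# Print the ArNS name for a Host header, or return 1 if the host is not
# a subdomain of the given base domain.
# Usage: arns_name_from_host HOST BASE_DOMAIN
arns_name_from_host() {
  host=${1%%:*}          # drop an optional :port suffix
  base=$2
  case $host in
    *."$base") name=${host%".$base"} ;;
    *) return 1 ;;
  esac
  # Assumption of this sketch: reject empty and multi-label names.
  case $name in
    ''|*.*) return 1 ;;
  esac
  printf '%s\n' "$name"
}
```

With BASE_DOMAIN=yourdomain.com, the host myapp.yourdomain.com yields myapp, while the same host checked against BASE_DOMAIN=localhost yields nothing, which is the symptom of a mismatched BASE_DOMAIN.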
Content Verification Failures
If content consistently fails verification:
- Check VERIFICATION_GATEWAY_COUNT: more gateways increase reliability but add latency
- Verify VERIFICATION_GATEWAY_SOURCE is correctly configured
- Check if the content transaction is still being seeded
- Review logs for specific hash mismatch details
Memory Usage
If memory usage is high:
- Enable disk-backed cache: CONTENT_CACHE_PATH=./data/content-cache
- Reduce cache size: CONTENT_CACHE_MAX_SIZE_BYTES
- Reduce telemetry retention: TELEMETRY_RETENTION_DAYS
- Lower the success sampling rate: TELEMETRY_SAMPLE_SUCCESS=0.01
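To see how much the sampling and retention settings matter, it helps to estimate how many sampled requests are retained at steady state. The calculation below is a rough sketch that assumes one stored record per sampled request; the function name telemetry_rows and its arguments are illustrative, and the error fraction is something you would measure for your own traffic.

```shell
# Estimate the number of telemetry records kept at steady state.
# Usage: telemetry_rows REQS_PER_DAY ERROR_FRACTION SAMPLE_SUCCESS SAMPLE_ERRORS RETENTION_DAYS
telemetry_rows() {
  awk -v r="$1" -v e="$2" -v ss="$3" -v se="$4" -v d="$5" 'BEGIN {
    # Sampled successes plus sampled errors, per day.
    per_day = r * (1 - e) * ss + r * e * se
    printf "%.0f\n", per_day * d
  }'
}
```

For 1,000,000 requests per day with 1% errors, telemetry_rows 1000000 0.01 0.1 1.0 30 gives 3270000, while lowering the success sampling rate to 0.01 gives 597000.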
Slow Response Times
If responses are slow:
- Check routing strategy: temperature adapts to gateway performance
- Check verification gateway health
- Check network connectivity to gateways
- Review the STREAM_TIMEOUT_MS setting
- Consider reducing VERIFICATION_GATEWAY_COUNT if the resulting loss of verification reliability is acceptable
Production Checklist
Before deploying to production:
- Set BASE_DOMAIN to your actual domain
- Configure ROOT_HOST_CONTENT if serving at root
- Enable disk cache: CONTENT_CACHE_PATH=./data/content-cache
- Set appropriate cache size limits
- Enable rate limiting: RATE_LIMIT_ENABLED=true
- Secure admin UI: ADMIN_TOKEN or keep ADMIN_HOST=127.0.0.1
- Configure telemetry sampling rates
- Set up monitoring (health checks, Prometheus)
- Configure reverse proxy (nginx/Caddy) with SSL
- Set up log aggregation
- Plan for graceful shutdown in your orchestration
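Some of these items can be checked mechanically before a deploy. The function below is a partial sketch that inspects the current environment for a few of the variables named in the checklist; preflight is a made-up name, and treating an unset ADMIN_HOST as the loopback default is an assumption drawn from the wording "keep ADMIN_HOST=127.0.0.1".

```shell
# Warn about checklist settings that look unset in the current environment.
# Prints one line per finding; returns non-zero if anything was flagged.
preflight() {
  findings=0
  flag() { echo "WARN: $1"; findings=$((findings + 1)); }

  if [ -z "${BASE_DOMAIN:-}" ] || [ "$BASE_DOMAIN" = "localhost" ]; then
    flag "BASE_DOMAIN is unset or still localhost"
  fi
  [ -n "${CONTENT_CACHE_PATH:-}" ] || flag "CONTENT_CACHE_PATH is unset (no disk cache)"
  [ "${RATE_LIMIT_ENABLED:-}" = "true" ] || flag "RATE_LIMIT_ENABLED is not true"
  # Assumption: an unset ADMIN_HOST means the loopback default.
  if [ -z "${ADMIN_TOKEN:-}" ] && [ "${ADMIN_HOST:-127.0.0.1}" != "127.0.0.1" ]; then
    flag "admin UI is exposed: set ADMIN_TOKEN or ADMIN_HOST=127.0.0.1"
  fi
  [ "$findings" -eq 0 ]
}
```

Run it in the same environment the router will start in, for example preflight || exit 1 in a deploy script. It does not cover the reverse proxy, monitoring, or log aggregation items, which still need manual review.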