API Health Checks: Patterns Every Backend Developer Should Know
A health check endpoint is the single most valuable endpoint in your API. It's the first thing monitoring tools hit, the first thing load balancers check, and the first thing you look at during an incident. Yet most developers implement it as an afterthought — a route that returns 200 OK unconditionally. Here's how to do it properly.
Shallow vs. deep health checks
Shallow health check
A shallow check verifies that the application process is running and can respond to HTTP requests. It doesn't check dependencies.
// Shallow health check
app.get('/health', (req, res) => {
  res.status(200).json({ status: 'ok' });
});
Use case: Kubernetes liveness probes. You want to know "is the process alive?" not "are all dependencies healthy?" If the process is alive but a dependency is down, you don't want Kubernetes to restart the pod — that won't fix the dependency.
Deep health check
A deep check verifies the application and all its critical dependencies. This is what your monitoring service should hit.
// Deep health check
app.get('/health/ready', async (req, res) => {
  const checks: Record<string, string> = {};

  // Database
  try {
    const start = Date.now();
    await db.query('SELECT 1');
    checks.database = `ok (${Date.now() - start}ms)`;
  } catch {
    checks.database = 'error';
  }

  // Redis cache
  try {
    await redis.ping();
    checks.redis = 'ok';
  } catch {
    checks.redis = 'error';
  }

  // External API dependency
  try {
    const resp = await fetch('https://api.stripe.com/v1/', {
      signal: AbortSignal.timeout(3000)
    });
    // This request is unauthenticated, so a 4xx reply is likely; any
    // response below 500 still shows the API is reachable.
    checks.stripe = resp.status < 500 ? 'ok' : 'degraded';
  } catch {
    checks.stripe = 'unreachable';
  }

  // Healthy only if every check reported a value starting with "ok"
  const allOk = Object.values(checks).every(v => v.startsWith('ok'));
  const status = allOk ? 'healthy' : 'degraded';
  res.status(allOk ? 200 : 503).json({ status, checks });
});
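The handler above returns 503 as soon as any dependency is not ok, including a third-party API the service may be able to run without. One possible refinement, sketched here, tags each dependency as critical or not, so that only critical failures fail the check. The `DependencyResult` type, the `critical` flag, and the `summarize` function are hypothetical names introduced for illustration; they are not part of the example above.

```typescript
// Hypothetical types for illustration; not part of the handler above.
type DependencyResult = {
  name: string;
  ok: boolean;
  // Assumption: you decide per dependency whether the service can run without it.
  critical: boolean;
};

type Summary = {
  status: 'healthy' | 'degraded' | 'unhealthy';
  httpStatus: 200 | 503;
};

function summarize(results: DependencyResult[]): Summary {
  const failed = results.filter(r => !r.ok);
  if (failed.length === 0) {
    return { status: 'healthy', httpStatus: 200 };
  }
  // A failed critical dependency means this instance cannot serve traffic.
  if (failed.some(r => r.critical)) {
    return { status: 'unhealthy', httpStatus: 503 };
  }
  // Only non-critical failures: keep serving, but surface the problem in the body.
  return { status: 'degraded', httpStatus: 200 };
}
```

Whether a degraded instance should answer 200 or 503 is a design choice: returning 200 keeps it in the load balancer pool, while the `degraded` status in the body still gives monitoring something to alert on.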
The three-endpoint pattern
Production-grade applications should expose three health-related endpoints:
| Endpoint | Purpose | Used by |
|---|---|---|
| /health/live | Is the process alive? | Kubernetes liveness probe |
| /health/ready | Can it handle requests? | Load balancer, readiness probe |
| /health/startup | Has it finished starting? | Kubernetes startup probe |
Why separate them?
A service might be alive (process running) but not ready (database migration in progress). Or it might be ready but a non-critical dependency is down (degraded but functional). Separating these signals lets orchestrators and monitoring tools make smarter decisions.
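To make the distinction concrete, here is a minimal sketch of how the three probes could derive their status codes from shared lifecycle state. The `LifecycleState` shape, its flag names, and `probeStatusCode` are assumptions made for this example; a real service would update the flags from its own startup routine and dependency checks.

```typescript
// Hypothetical lifecycle state, updated elsewhere by the application.
type LifecycleState = {
  startupComplete: boolean;   // e.g. migrations finished, caches warmed
  dependenciesReady: boolean; // e.g. outcome of the latest deep check
};

type ProbeKind = 'live' | 'ready' | 'startup';

function probeStatusCode(kind: ProbeKind, state: LifecycleState): 200 | 503 {
  switch (kind) {
    case 'live':
      // If this code runs at all, the process is alive.
      return 200;
    case 'startup':
      return state.startupComplete ? 200 : 503;
    case 'ready':
      // Ready requires both a finished startup and healthy dependencies.
      return state.startupComplete && state.dependenciesReady ? 200 : 503;
  }
}
```

Each route handler would then call `probeStatusCode` with its own kind, so a pod that is still migrating reports alive but not ready, and is neither restarted nor sent traffic.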
Caching health check results
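If your deep health check queries the database, Redis, and external APIs, it might take 500ms+ to complete. A single monitor polling every 30 seconds is cheap, but readiness probes, load balancers, and multi-region monitors often hit the same endpoint every few seconds from several sources at once, and every uncached request fans out to all of your dependencies.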
Cache the result for a short period:
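// Assumes the deep check logic shown earlier is extracted into
// runDeepHealthCheck(), which resolves to { status, checks }.
type HealthResult = { status: string; checks: Record<string, string> };

let cachedResult: { data: HealthResult; timestamp: number } | null = null;
const CACHE_TTL = 10_000; // 10 seconds

app.get('/health/ready', async (req, res) => {
  // Serve the cached result while it is still fresh
  if (cachedResult && Date.now() - cachedResult.timestamp < CACHE_TTL) {
    return res.status(cachedResult.data.status === 'healthy' ? 200 : 503)
      .json(cachedResult.data);
  }
  const result = await runDeepHealthCheck();
  cachedResult = { data: result, timestamp: Date.now() };
  res.status(result.status === 'healthy' ? 200 : 503).json(result);
});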
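One gap in a plain TTL cache is that several requests arriving just after expiry each trigger their own deep check. A possible way to close it, sketched below, is to share a single in-flight run among concurrent callers. `createCachedCheck` and its injectable `now` clock are hypothetical names introduced for this example, and the sketch deliberately does not cache failures (a rejected run is simply retried by the next caller).

```typescript
// Hypothetical helper: wraps any async check with a TTL cache and lets
// concurrent callers share one in-flight run.
function createCachedCheck<T>(
  run: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now // injectable clock, mainly for tests
): () => Promise<T> {
  let cached: { value: T; at: number } | null = null;
  let inFlight: Promise<T> | null = null;

  return () => {
    // Fresh cached value: no dependency is touched.
    if (cached && now() - cached.at < ttlMs) {
      return Promise.resolve(cached.value);
    }
    // A run is already in progress: join it instead of starting another.
    if (inFlight) {
      return inFlight;
    }
    inFlight = run()
      .then(value => {
        cached = { value, at: now() };
        return value;
      })
      .finally(() => {
        inFlight = null;
      });
    return inFlight;
  };
}
```

A handler could create the wrapper once at startup, for example `const cachedCheck = createCachedCheck(runDeepHealthCheck, 10_000)`, and await `cachedCheck()` on every request.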
Timeout handling
Every dependency check in your health endpoint needs a timeout. Without one, a hanging database connection will cause your health check to hang, which causes your monitoring tool to report a timeout — obscuring the real issue.
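async function checkWithTimeout(
  name: string,
  fn: () => Promise<void>,
  timeoutMs = 3000
): Promise<{ name: string; status: string }> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  try {
    await Promise.race([
      fn(),
      new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error('timeout')), timeoutMs);
      }),
    ]);
    return { name, status: 'ok' };
  } catch (err) {
    // err is `unknown` under strict TypeScript, so narrow it before reading .message
    const timedOut = err instanceof Error && err.message === 'timeout';
    return { name, status: timedOut ? 'timeout' : 'error' };
  } finally {
    // Clear the timer so it does not linger after fn() settles
    clearTimeout(timer);
  }
}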
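Per-check timeouts bound each dependency, but running the checks one after another still makes the worst case the sum of all timeouts. The sketch below runs them in parallel so the endpoint is bounded by the slowest single deadline. `CheckSpec`, `CheckOutcome`, `withDeadline`, and `runAll` are hypothetical names for this example; `withDeadline` is a compact variant of `checkWithTimeout` that resolves a sentinel on timeout rather than rejecting.

```typescript
type CheckSpec = { name: string; fn: () => Promise<void>; timeoutMs: number };
type CheckOutcome = { name: string; status: 'ok' | 'timeout' | 'error'; ms: number };

const TIMED_OUT = 'timed-out' as const;
const DONE = 'done' as const;

async function withDeadline(spec: CheckSpec): Promise<CheckOutcome> {
  const start = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  // Resolves with a sentinel when the deadline passes.
  const deadline = new Promise<typeof TIMED_OUT>(resolve => {
    timer = setTimeout(() => resolve(TIMED_OUT), spec.timeoutMs);
  });
  try {
    const winner = await Promise.race([spec.fn().then(() => DONE), deadline]);
    const status = winner === TIMED_OUT ? 'timeout' : 'ok';
    return { name: spec.name, status, ms: Date.now() - start };
  } catch {
    // The check itself failed before the deadline.
    return { name: spec.name, status: 'error', ms: Date.now() - start };
  } finally {
    clearTimeout(timer);
  }
}

// withDeadline never rejects, so Promise.all cannot short-circuit here.
function runAll(specs: CheckSpec[]): Promise<CheckOutcome[]> {
  return Promise.all(specs.map(spec => withDeadline(spec)));
}
```

Note that a timeout only stops waiting; it does not cancel the underlying query or request, so a client-level timeout or abort signal on the dependency call is still worth having.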
Security considerations
Health check endpoints can leak information. Be careful about what you expose:
- Don't expose database connection strings, API keys, or internal hostnames
- Don't expose exact version numbers (use a build hash instead)
- Do expose dependency status (ok/error), response times, and service name
- Consider requiring an API key for detailed health info, with a simple 200/503 for public checks
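The last point can be sketched as a small function that picks the response body based on whether the caller is authorized. The `DetailedReport` shape and `buildResponse` are assumptions made for this example, and how `authorized` gets decided (an API key header, an internal network check, and so on) is left to the caller.

```typescript
// Hypothetical report shape; adapt it to whatever your deep check returns.
type DetailedReport = {
  status: 'healthy' | 'degraded';
  service: string;
  checks: Record<string, { status: string; ms?: number }>;
};

type PublicReport = { status: DetailedReport['status'] };

function buildResponse(
  report: DetailedReport,
  authorized: boolean
): { httpStatus: 200 | 503; body: DetailedReport | PublicReport } {
  const httpStatus = report.status === 'healthy' ? 200 : 503;
  // Unauthenticated callers only learn the aggregate status.
  const body = authorized ? report : { status: report.status };
  return { httpStatus, body };
}
```

Both audiences get the same status code, so load balancers and public monitors keep working, while dependency names and timings stay behind the authorization check.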
Monitoring your health checks
Health checks are only useful if something is watching them. Set up automated monitoring that hits your health endpoints at regular intervals and alerts you when something changes. If you need a quick way to monitor health check endpoints across multiple services, PingGuard checks your endpoints from 3 regions, verifies status codes, and alerts via Slack, email, or webhooks when things go wrong. Free for up to 5 endpoints.