Health API Documentation
Overview
The Health API provides comprehensive monitoring of the system's operational status and performance metrics. This endpoint is designed for load balancers, monitoring systems, and operational teams to ensure service reliability.
Health Check Endpoint
The main health endpoint is available at /status and returns detailed information about all system components.
Getting Started
The health endpoint returns different HTTP status codes based on system health:
- 200 OK: All systems operational
- 503 Service Unavailable: One or more critical components are down
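A client can branch on these two codes directly. The Python sketch below is a minimal illustration; the function name `interpret_status_code` and the `unknown` fallback for any other code are assumptions, not part of the API.

```python
def interpret_status_code(status_code: int) -> str:
    """Map the HTTP status code returned by /status to a coarse health state."""
    if status_code == 200:
        return "healthy"      # all systems operational
    if status_code == 503:
        return "unhealthy"    # one or more critical components are down
    # Any other code (for example a 404 from a misrouted proxy) is not defined
    # by the health API, so report it as unknown rather than guessing.
    return "unknown"
```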
Basic Health Check Request
curl -X GET "http://localhost:8080/status" \
-H "Accept: application/json"
const response = await fetch('http://localhost:8080/status', {
method: 'GET',
headers: {
'Accept': 'application/json'
}
});
const healthData = await response.json();
console.log('System Status:', healthData.status);
import requests
response = requests.get(
'http://localhost:8080/status',
headers={'Accept': 'application/json'}
)
health_data = response.json()
print(f"System Status: {health_data['status']}")
print(f"Response Time: {response.elapsed.total_seconds()}s")
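package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// HealthResponse holds the top-level fields of the /status payload.
type HealthResponse struct {
	Status      string `json:"status"`
	Timestamp   string `json:"timestamp"`
	Environment string `json:"environment"`
}

func checkHealth() {
	// A client-side timeout keeps a hung service from blocking the caller.
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get("http://localhost:8080/status")
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}
	defer resp.Body.Close()

	// Decode the JSON body and report a malformed payload instead of ignoring it.
	var health HealthResponse
	if err := json.NewDecoder(resp.Body).Decode(&health); err != nil {
		fmt.Printf("Error decoding response: %v\n", err)
		return
	}
	fmt.Printf("System Status: %s\n", health.Status)
}

func main() {
	checkHealth()
}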
Implementing Health Check Monitoring
Set up automated monitoring for your applications:
Production Ready
This configuration has been tested in production environments with 99.9% uptime.
version: '3.8'
services:
  api:
    image: rapidaigo:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
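To make the effect of `retries` concrete: Docker marks a container unhealthy only after that many consecutive probe failures, and a single success resets the count. The Python sketch below models that rule. `container_health` is a hypothetical helper for illustration, and it deliberately ignores the `start_period` grace window.

```python
from typing import Iterable


def container_health(probe_results: Iterable[bool], retries: int = 3) -> str:
    """Return the health state after a sequence of probe outcomes.

    True means the probe command exited 0; False means it failed or timed out.
    """
    state = "starting"
    consecutive_failures = 0
    for ok in probe_results:
        if ok:
            consecutive_failures = 0  # any success resets the failure streak
            state = "healthy"
        else:
            consecutive_failures += 1
            if consecutive_failures >= retries:
                state = "unhealthy"
    return state
```

With the values above (`interval: 30s`, `retries: 3`), a service that stops responding is reported unhealthy after three failed probes in a row.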
API Reference
Health Check Endpoint
Prop | Type | Default |
---|---|---|
method? | string | GET |
endpoint? | string | /status |
authentication? | boolean | false |
Request Headers
Header | Type | Required | Description |
---|---|---|---|
Accept | string | No | Response format preference (application/json) |
User-Agent | string | No | Client identification for monitoring logs |
Response Schema
Component Health Details
Each monitored component reports the following information:
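Prop | Type | Required | Description |
---|---|---|---|
status | enum | Yes | Either healthy or unhealthy |
response_time | string | Yes | Duration of the check as a string, for example 1.8ms or 5.001s |
error | string | No | Error message, present only when the check fails |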
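As an illustration of how a client might consume these fields, the Python sketch below lists the components whose `status` is not `healthy`, together with their `error` text when present. The helper name `failing_components` and the fallback message are hypothetical; the field names are the ones listed in the table.

```python
from typing import Any, Dict


def failing_components(payload: Dict[str, Any]) -> Dict[str, str]:
    """Return {component: error message} for every check that is not healthy."""
    failures = {}
    for name, check in payload.get("checks", {}).items():
        if check.get("status") != "healthy":
            # `error` is optional, so fall back to a generic message.
            failures[name] = check.get("error", "no error message reported")
    return failures
```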
Monitored Components
Database (MySQL)
Connection Pool Status
Tests database connectivity, query execution, and connection pool health
Timeout: 5 seconds
Critical: ✅ System fails if unhealthy
Response Examples
Real-world Scenarios
{
"status": "healthy",
"timestamp": "2024-08-18T14:30:15.123Z",
"environment": "production",
"checks": {
"database": {
"status": "healthy",
"response_time": "1.8ms"
},
"redis": {
"status": "healthy",
"response_time": "0.9ms"
}
}
}
Optimal performance with sub-2ms database response times.
{
"status": "unhealthy",
"timestamp": "2024-08-18T14:31:45.789Z",
"environment": "production",
"checks": {
"database": {
"status": "unhealthy",
"response_time": "5.001s",
"error": "dial tcp 127.0.0.1:3306: connect: connection refused"
},
"redis": {
"status": "healthy",
"response_time": "1.1ms"
}
}
}
Database Offline
Database connection failed - immediate attention required. Load balancer will route traffic to healthy instances.
{
"status": "healthy",
"timestamp": "2024-08-18T14:32:20.456Z",
"environment": "production",
"checks": {
"database": {
"status": "healthy",
"response_time": "3.2ms"
}
}
}
Redis unavailable but system remains operational with reduced performance.
{
"status": "healthy",
"timestamp": "2024-08-18T14:33:05.234Z",
"environment": "production",
"checks": {
"database": {
"status": "healthy",
"response_time": "45.2ms"
},
"redis": {
"status": "healthy",
"response_time": "8.7ms"
}
}
}
Elevated response times during peak load - still within acceptable thresholds.
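The `response_time` values in these examples are duration strings such as `1.8ms` and `5.001s`, the format produced by Go's `time.Duration.String()`. To compare them against a threshold, a client has to convert them to numbers first. The Python sketch below does that, assuming only the units `ns`, `µs`/`us`, `ms`, `s`, `m`, and `h` appear; `parse_duration_seconds` is a hypothetical helper, not part of the API.

```python
import re

# Seconds per unit. Go prints microseconds as "µs"; "us" is accepted as well.
_UNIT_SECONDS = {"ns": 1e-9, "µs": 1e-6, "us": 1e-6, "ms": 1e-3,
                 "s": 1.0, "m": 60.0, "h": 3600.0}
# Two-letter units are listed first so "ms" is not read as "m" followed by "s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|µs|us|ms|s|m|h)")


def parse_duration_seconds(text: str) -> float:
    """Convert a duration string such as '1.8ms' or '1m30s' to seconds."""
    parts = _PART.findall(text)
    # Reject input that contains anything besides number+unit pairs.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)
```

With this, a reading such as `45.2ms` can be checked numerically against whatever latency threshold you use.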
Implementation Details
Backend Implementation
The health check implementation in Go:
package handler

import (
	"context"
	"net/http"
	"time"

	"github.com/labstack/echo/v4"
	"github.com/pranavsoft/rapidaigo-web/internal/middleware"
	"github.com/pranavsoft/rapidaigo-web/internal/server"
)

type HealthHandler struct {
	Handler
}

func NewHealthHandler(s *server.Server) *HealthHandler {
	return &HealthHandler{
		Handler: NewHandler(s),
	}
}

func (h *HealthHandler) CheckHealth(c echo.Context) error {
	start := time.Now()
	// Assumes middleware.GetLogger returns a zerolog logger, as the
	// With().Str().Logger() chain suggests.
	logger := middleware.GetLogger(c).With().
		Str("operation", "health_check").
		Logger()

	response := map[string]interface{}{
		"status":      "healthy",
		"timestamp":   time.Now().UTC(),
		"environment": h.server.Config.Primary.Env,
		"checks":      make(map[string]interface{}),
	}
	checks := response["checks"].(map[string]interface{})
	isHealthy := true

	// Check database connectivity; the ping is abandoned after 5 seconds
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	dbStart := time.Now()
	if err := h.server.DB.DB.PingContext(ctx); err != nil {
		checks["database"] = map[string]interface{}{
			"status":        "unhealthy",
			"response_time": time.Since(dbStart).String(),
			"error":         err.Error(),
		}
		isHealthy = false
	} else {
		checks["database"] = map[string]interface{}{
			"status":        "healthy",
			"response_time": time.Since(dbStart).String(),
		}
	}

	// Set overall status and log the total duration of the check
	if !isHealthy {
		response["status"] = "unhealthy"
		logger.Warn().Dur("duration", time.Since(start)).Msg("health check failed")
		return c.JSON(http.StatusServiceUnavailable, response)
	}
	logger.Info().Dur("duration", time.Since(start)).Msg("health check passed")
	return c.JSON(http.StatusOK, response)
}
Frontend Type Definitions
import { z } from "zod";
const ZHealthCheck = z.object({
status: z.enum(["healthy", "unhealthy"]),
response_time: z.string(),
error: z.string().optional(),
});
export const ZHealthResponse = z.object({
status: z.enum(["healthy", "unhealthy"]),
timestamp: z.string().datetime(),
environment: z.string(),
checks: z.object({
database: ZHealthCheck,
redis: ZHealthCheck.optional(),
}),
});
export type HealthResponse = z.infer<typeof ZHealthResponse>;
Troubleshooting
OpenAPI Integration
Interactive API Documentation
The health endpoint is fully documented in our OpenAPI specification:
OpenAPI Specification
View the complete API documentation at /api-docs for interactive testing and detailed schemas.
OpenAPI Definition
/status:
  get:
    summary: System Health Check
    description: |
      Returns comprehensive health status of the API and all its dependencies
      including database connectivity, cache systems, and external services.
    tags:
      - health
    operationId: getHealth
    responses:
      '200':
        description: System is healthy
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/SystemHealthResponse'
            examples:
              healthy_system:
                summary: All systems operational
                value:
                  status: healthy
                  timestamp: "2024-08-18T14:30:00Z"
                  environment: production
                  checks:
                    database:
                      status: healthy
                      response_time: "2.5ms"
      '503':
        description: System is unhealthy
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/SystemHealthResponse'
            examples:
              unhealthy_system:
                summary: Database connection failed
                value:
                  status: unhealthy
                  timestamp: "2024-08-18T14:30:00Z"
                  environment: production
                  checks:
                    database:
                      status: unhealthy
                      response_time: "5.001s"
                      error: "Connection timeout after 5 seconds"
API Testing
Visit the interactive documentation at:
http://localhost:8080/api-docs
Features:
- ✅ Interactive request/response testing
- ✅ Real-time schema validation
- ✅ Response examples and status codes
- ✅ Authentication handling
Import the OpenAPI spec into Postman:
# Download OpenAPI spec
curl http://localhost:8080/openapi.json > api-spec.json
# Import into Postman collection
Collection Features:
- Pre-configured environments
- Automated testing scripts
- Response assertions
# Simple health check
curl -i http://localhost:8080/status
# With verbose output
curl -v \
-H "Accept: application/json" \
-H "User-Agent: HealthMonitor/1.0" \
http://localhost:8080/status
# Check response time
curl -w "Response Time: %{time_total}s\n" \
-o /dev/null -s \
http://localhost:8080/status
# Basic request
http GET localhost:8080/status
# With custom headers
http GET localhost:8080/status \
Accept:application/json \
User-Agent:HealthMonitor/1.0
# Print request and response headers and bodies
http --print=HhBb GET localhost:8080/status
Monitoring & Alerting
Metrics Collection
Set up comprehensive monitoring for the health endpoint:
Production Monitoring
These configurations are battle-tested in production environments serving millions of requests.
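scrape_configs:
  - job_name: 'rapidaigo-health'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/metrics'
    scrape_interval: 30s
  # Health check monitoring
  # Note: Prometheus can only parse its metrics exposition format. /status
  # returns JSON, so this job works only if the JSON is translated by an
  # exporter placed in front of the endpoint.
  - job_name: 'health-check'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/status'
    scrape_interval: 15s
    # Override the instance label
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: 'rapidaigo-api'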
Key Metrics:
- health_check_duration_seconds: Response time histogram
- health_check_status: Current health status (0=unhealthy, 1=healthy)
- component_health_status: Per-component health status
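One way to derive the two status gauges from a `/status` response is sketched below in Python. It assumes the 0/1 encoding listed above and a `component` label on `component_health_status` (the label used in the dashboard and alert expressions on this page). The function name `status_gauges` is hypothetical, and the output is plain Prometheus text exposition lines.

```python
from typing import Any, Dict, List


def status_gauges(payload: Dict[str, Any]) -> List[str]:
    """Render health_check_status and component_health_status sample lines."""

    def as_gauge(status: str) -> int:
        return 1 if status == "healthy" else 0

    lines = [f"health_check_status {as_gauge(payload.get('status', ''))}"]
    # Sort component names so the output order is stable between calls.
    for name, check in sorted(payload.get("checks", {}).items()):
        value = as_gauge(check.get("status", ""))
        lines.append(f'component_health_status{{component="{name}"}} {value}')
    return lines
```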
{
"dashboard": {
"title": "RapidAI Go Health Dashboard",
"panels": [
{
"title": "System Health Status",
"type": "stat",
"targets": [{
"expr": "health_check_status",
"legendFormat": "{{instance}}"
}],
"fieldConfig": {
"defaults": {
"thresholds": {
"steps": [
{"color": "red", "value": 0},
{"color": "green", "value": 1}
]
}
}
}
},
{
"title": "Response Time Trends",
"type": "graph",
"targets": [{
"expr": "histogram_quantile(0.95, sum by (le) (rate(health_check_duration_seconds_bucket[5m])))",
"legendFormat": "p95"
}]
},
{
"title": "Component Health Matrix",
"type": "heatmap",
"targets": [{
"expr": "component_health_status",
"legendFormat": "{{component}}"
}]
}
]
}
}
Dashboard Features:
- Real-time health status indicators
- Response time trends and percentiles
- Component-level health matrix
- Alert history and acknowledgments
groups:
  - name: health_check_alerts
    rules:
      - alert: ServiceDown
        expr: health_check_status == 0
        for: 1m
        labels:
          severity: critical
          service: rapidaigo-api
        annotations:
          summary: "RapidAI Go API is down"
          description: "Health check failing for {{ $labels.instance }}"
      - alert: HighResponseTime
        # 95th percentile over 5 minutes, computed from the histogram buckets
        expr: histogram_quantile(0.95, sum by (le, instance) (rate(health_check_duration_seconds_bucket[5m]))) > 0.1
        for: 5m
        labels:
          severity: warning
          service: rapidaigo-api
        annotations:
          summary: "High response time detected"
          description: "Health check response time: {{ $value }}s"
      - alert: DatabaseConnectionFailed
        expr: component_health_status{component="database"} == 0
        for: 30s
        labels:
          severity: critical
          service: rapidaigo-api
          component: database
        annotations:
          summary: "Database connection failed"
          description: "Database health check failing"
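The `for` clause means an alert fires only once its expression has been true continuously for that long; a single failing evaluation is not enough. The Python sketch below models that behaviour for the ServiceDown rule on a series of `(timestamp_seconds, health_check_status)` samples. It is a simplified illustration, not Prometheus's actual evaluation loop, and `is_firing` is a hypothetical helper.

```python
from typing import List, Optional, Tuple


def is_firing(samples: List[Tuple[float, int]], for_seconds: float = 60.0) -> bool:
    """Return True if status == 0 has held for at least `for_seconds`,
    up to and including the most recent sample."""
    pending_since: Optional[float] = None  # when the condition first became true
    for timestamp, status in samples:
        if status == 0:
            if pending_since is None:
                pending_since = timestamp
        else:
            pending_since = None  # condition cleared, so the timer restarts
    if pending_since is None:
        return False
    return samples[-1][0] - pending_since >= for_seconds
```

A brief recovery in the middle of an outage restarts the timer, which is why short `for` values are used for critical components such as the database.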
#!/bin/bash
# Health monitoring script (requires curl and jq)
ENDPOINT="http://localhost:8080/status"
SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
LOG_FILE="/var/log/health-check.log"

send_alert() {
  local message="$1"
  local details="$2"
  local payload
  # Build the payload with jq so quotes in the details are escaped correctly
  payload=$(jq -n --arg text "$message" --arg details "$details" \
    '{text: $text, attachments: [{color: "danger", fields: [{title: "Details", value: $details, short: false}]}]}')
  curl -s -X POST "$SLACK_WEBHOOK" \
    -H 'Content-type: application/json' \
    --data "$payload"
}

check_health() {
  local body_file http_code timestamp status db_time
  body_file=$(mktemp)
  # curl reports 000 as the status code when the connection itself fails
  http_code=$(curl -s -o "$body_file" -w "%{http_code}" "$ENDPOINT")
  timestamp=$(date -Iseconds)

  case "$http_code" in
    200|503)
      # Both codes carry a JSON body; 503 means a critical component is down
      status=$(jq -r '.status' "$body_file")
      db_time=$(jq -r '.checks.database.response_time' "$body_file")
      echo "[$timestamp] Health: $status, DB Response: $db_time" >> "$LOG_FILE"
      if [[ "$status" != "healthy" ]]; then
        send_alert "⚠️ System unhealthy but responding" "$status"
      fi
      ;;
    *)
      echo "[$timestamp] ERROR: HTTP $http_code" >> "$LOG_FILE"
      send_alert "🚨 Health endpoint unreachable" "HTTP $http_code"
      ;;
  esac
  rm -f "$body_file"
}

# Run health check
check_health
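Alert text can contain quotes or backslashes (error strings from a database driver often do), so building the JSON body with a serializer avoids malformed payloads. The Python sketch below produces the same payload shape as `send_alert`; the function name `build_slack_payload` is hypothetical.

```python
import json


def build_slack_payload(message: str, details: str) -> str:
    """Serialize a Slack webhook body; json.dumps escapes special characters."""
    payload = {
        "text": message,
        "attachments": [{
            "color": "danger",
            "fields": [{"title": "Details", "value": details, "short": False}],
        }],
    }
    return json.dumps(payload)
```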
Security Considerations
Rate Limiting
import (
	"net/http"
	"time"

	"github.com/labstack/echo/v4"
	"golang.org/x/time/rate"
)

// HealthRateLimit limits health checks to one request per second on average,
// with bursts of up to 10. The limiter is shared by all clients.
func HealthRateLimit() echo.MiddlewareFunc {
	limiter := rate.NewLimiter(rate.Every(time.Second), 10)
	return func(next echo.HandlerFunc) echo.HandlerFunc {
		return func(c echo.Context) error {
			if !limiter.Allow() {
				return c.JSON(http.StatusTooManyRequests, map[string]string{
					"error": "Rate limit exceeded",
				})
			}
			return next(c)
		}
	}
}
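For readers unfamiliar with the limiter's parameters: `rate.Every(time.Second)` refills one token per second and `10` is the burst size, so up to ten requests pass immediately and then roughly one per second. The Python sketch below models that token-bucket behaviour with an injectable clock. It illustrates the idea only and is not the `golang.org/x/time/rate` implementation; the class name `TokenBucket` is an assumption.

```python
class TokenBucket:
    """Token bucket: `rate` tokens are added per second, capped at `burst`."""

    def __init__(self, rate: float, burst: int, now: float = 0.0) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start full, so an initial burst is allowed
        self.last = now

    def allow(self, now: float) -> bool:
        """Consume one token if available; `now` is the current time in seconds."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Passing the time in explicitly keeps the sketch deterministic; a real caller would supply `time.monotonic()`.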
🎉 You're All Set!
You now have a comprehensive understanding of the Health API. Start implementing health checks in your applications and set up monitoring for production readiness.