Health Monitoring

Overview

EngageFabric provides comprehensive health monitoring endpoints to help you:

Monitor System Status: Check if all services are operational
Integrate with Monitoring Tools: Connect to Datadog, Prometheus, or custom dashboards
Automate Health Checks: Set up automated alerts for service degradation
Debug Issues: Identify which component is causing problems

System Status Endpoint

Get a comprehensive view of all system components:

curl https://api.engagefabric.com/health/status

Response:

{
  "api": {
    "status": "healthy",
    "uptime": 86400,
    "version": "0.1.10"
  },
  "database": {
    "status": "connected",
    "latencyMs": 5
  },
  "redis": {
    "status": "active",
    "latencyMs": 2
  },
  "eventQueue": {
    "status": "processing",
    "pendingEvents": 12
  },
  "websocket": {
    "connections": 150,
    "status": "active"
  },
  "timestamp": "2025-12-01T10:30:00Z"
}

The /health/status endpoint is public and does not require authentication, making it suitable for external monitoring tools.
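
As a minimal sketch of consuming this response from a script, the Python below fetches the endpoint with the standard library and extracts each component's status. The names `fetch_status` and `component_statuses` are illustrative and not part of any EngageFabric SDK, and the code assumes the response keeps the shape shown above.

```python
import json
import urllib.request

STATUS_URL = "https://api.engagefabric.com/health/status"


def fetch_status(url: str = STATUS_URL, timeout: float = 5.0) -> dict:
    """Fetch and decode the health payload (the endpoint needs no authentication)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def component_statuses(payload: dict) -> dict:
    """Map each component name to its status, skipping fields such as timestamp."""
    return {
        name: info["status"]
        for name, info in payload.items()
        if isinstance(info, dict) and "status" in info
    }
```

Applied to the sample response above, `component_statuses` would return the five component names (`api`, `database`, `redis`, `eventQueue`, `websocket`) mapped to their status strings.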

Component Status Values

API Status

Status	Description
`healthy`	All systems operational
`degraded`	Some features may be slow or unavailable
`unhealthy`	Critical systems are down

Database Status

Status	Description
`connected`	Database connection is active
`disconnected`	No database connection
`error`	Database error occurred

Redis Status

Status	Description
`active`	Redis connection is healthy
`disconnected`	No Redis connection
`error`	Redis error occurred

Event Queue Status

Status	Description
`processing`	Events are being processed normally
`idle`	No events in queue
`backlogged`	Queue has more than 1000 pending events
`error`	Event processing error

WebSocket Status

Status	Description
`active`	WebSocket server is accepting connections
`inactive`	WebSocket server is not running
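
One way to reduce these values to a single alert level is a lookup table per component. The sketch below is an assumption about how a client might rank the documented values: the `ok`/`warn`/`critical` levels, and the choice to treat any unrecognized value as `critical`, are illustrative and not defined by EngageFabric.

```python
# Severity per documented status value; the ranking itself is an assumption.
SEVERITY = {
    "api": {"healthy": "ok", "degraded": "warn", "unhealthy": "critical"},
    "database": {"connected": "ok", "disconnected": "critical", "error": "critical"},
    "redis": {"active": "ok", "disconnected": "critical", "error": "critical"},
    "eventQueue": {"processing": "ok", "idle": "ok", "backlogged": "warn", "error": "critical"},
    "websocket": {"active": "ok", "inactive": "critical"},
}
RANK = {"ok": 0, "warn": 1, "critical": 2}


def overall_severity(statuses: dict) -> str:
    """Return the worst severity across components; unknown values count as critical."""
    worst = "ok"
    for component, status in statuses.items():
        level = SEVERITY.get(component, {}).get(status, "critical")
        if RANK[level] > RANK[worst]:
            worst = level
    return worst
```

Treating unknown values as `critical` is a deliberately cautious default: if a new status value appears, the check fails loudly instead of silently passing.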

Monitoring Integration

Prometheus/Grafana

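The endpoint returns JSON rather than the Prometheus exposition format, so Prometheus cannot scrape it directly. Probe it through the Prometheus blackbox exporter instead. This example assumes the exporter runs at localhost:9115 with its stock http_2xx module:

# prometheus.yml
scrape_configs:
  - job_name: 'engagefabric'
    metrics_path: /probe
    params: {module: [http_2xx]}
    static_configs:
      - targets: ['https://api.engagefabric.com/health/status']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: localhost:9115  # blackbox exporter address (assumption)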

Datadog

Configure the Datadog Agent's HTTP check:

# conf.d/http_check.d/conf.yaml
init_config:

instances:
  - name: EngageFabric API
    url: https://api.engagefabric.com/health/status
    check_certificate_expiration: true
    tls_verify: true

UptimeRobot / Pingdom

Add the health endpoint URL as an HTTP(S) monitor:

https://api.engagefabric.com/health/status

Health Check Best Practices

Regular Polling

Poll health endpoints every 30-60 seconds for timely alerts without overwhelming the API.
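
If you poll from your own script rather than a monitoring tool, a loop along these lines keeps the interval fixed and treats a failed request as a result in its own right. This is a sketch under assumptions: `poll`, its parameters, and the `{"error": ...}` shape are hypothetical, and `fetch` stands for whatever function you use to request /health/status.

```python
import time
from typing import Callable, Iterator, Optional


def poll(
    fetch: Callable[[], dict],
    interval_s: float = 30.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield one result per poll, waiting interval_s seconds between polls."""
    count = 0
    while max_polls is None or count < max_polls:
        try:
            yield fetch()
        except Exception as exc:  # a failed request is itself a health signal
            yield {"error": str(exc)}
        count += 1
        # Skip the final sleep when a poll limit has been reached.
        if max_polls is None or count < max_polls:
            sleep(interval_s)
```

Passing `sleep` as a parameter is only there to make the loop testable without waiting; in normal use the default `time.sleep` applies.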

Alert on Degraded

Set up alerts for both unhealthy and degraded status to catch issues early.

Monitor Latency

Track latencyMs values over time to identify performance degradation trends.
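
A rolling average is one simple way to spot a trend without reacting to a single slow sample. The sketch below is illustrative only: the `LatencyTracker` class, the window of 20 samples, and the 50 ms threshold are assumptions to tune for your environment, not values recommended by EngageFabric.

```python
from collections import deque
from statistics import mean


class LatencyTracker:
    """Keep the last `window` latencyMs samples and flag a sustained increase."""

    def __init__(self, window: int = 20, threshold_ms: float = 50.0):
        self.samples = deque(maxlen=window)  # oldest samples drop off automatically
        self.threshold_ms = threshold_ms

    def add(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def average(self) -> float:
        return mean(self.samples) if self.samples else 0.0

    def is_degraded(self) -> bool:
        """True once the window is full and its mean exceeds the threshold."""
        full = len(self.samples) == self.samples.maxlen
        return full and self.average() > self.threshold_ms
```

You would typically keep one tracker per component, for example one fed from `database.latencyMs` and another from `redis.latencyMs`.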

Check Event Queue

Monitor pendingEvents to ensure events are being processed in a timely manner.
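
Because the queue is only reported as backlogged above 1000 pending events, it can help to warn earlier when the count is climbing steadily. The following sketch assumes you keep recent `pendingEvents` samples in a list; `queue_alert`, the three-sample growth rule, and the 50% warning fraction are hypothetical choices, not EngageFabric behavior.

```python
from typing import Optional, Sequence

BACKLOG_THRESHOLD = 1000  # documented point at which the queue counts as backlogged


def queue_alert(pending: Sequence[int], warn_fraction: float = 0.5) -> Optional[str]:
    """Return an alert message for recent pendingEvents samples, or None."""
    if not pending:
        return None
    latest = pending[-1]
    if latest > BACKLOG_THRESHOLD:
        return f"backlogged: {latest} pending events"
    # Early warning: at least three samples, each strictly larger than the last.
    growing = len(pending) >= 3 and all(a < b for a, b in zip(pending, pending[1:]))
    if growing and latest >= BACKLOG_THRESHOLD * warn_fraction:
        return f"queue growing: {latest} pending events"
    return None
```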

Status Page

For real-time service status and incident history, visit:

EngageFabric Status: View current service status and subscribe to incident notifications

Related

Rate Limits: Understand API rate limiting and quotas
Errors: Learn about error codes and handling
