Alarm Rules

Built-in alarm rules define which system events create alarms and at what severity. Rules cannot be created or deleted: they are defined in code and seeded on boot. Users can enable or disable each rule and override its severity. Adding a new alarm type requires only a code change (one emitEvent call plus one rule definition).
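
As a rough illustration of the "one emitEvent call plus one rule definition" pattern, the sketch below shows what a code-defined rule and its emitting call might look like. The AlarmRuleDefinition shape, the emitEvent signature, and the in-memory emitted list are assumptions made for illustration, not the actual schema. Only the rule name, event types, and default severity are taken from the scanner-down entry on this page.

```typescript
// Severity levels that appear on this page; the real set may include more.
type Severity = "warning" | "high" | "critical";

// Hypothetical rule shape: field names are assumptions.
interface AlarmRuleDefinition {
  name: string;                    // unique rule identifier
  description: string;             // what the rule detects
  eventType: string;               // event that fires the alarm
  defaultSeverity: Severity;       // code-defined default
  autoResolveEvent: string | null; // null = manual resolve only
}

// One rule definition (values from the scanner-down entry).
const scannerDown: AlarmRuleDefinition = {
  name: "scanner-down",
  description: "Scanner missed heartbeat timeout (2 minutes).",
  eventType: "scanner_down",
  defaultSeverity: "high",
  autoResolveEvent: "scanner_up",
};

interface SystemEvent {
  type: string;
  source: string; // hypothetical: which component reported the fault
  at: Date;
}

// Stand-in for the real emitEvent: it only records events in memory.
const emitted: SystemEvent[] = [];

function emitEvent(type: string, source: string): SystemEvent {
  const event: SystemEvent = { type, source, at: new Date() };
  emitted.push(event);
  return event;
}

// One emitEvent call at the point where the fault is detected.
emitEvent(scannerDown.eventType, "scanner-01");
```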

Fields & Columns

Field         Description
Enabled       Whether this rule creates alarms. Disabled rules ignore matching events.
Name          Unique rule identifier (e.g., scanner-down, tls-expiring).
Description   Human-readable explanation of what the rule detects.
Severity      Current severity for new alarms. Can be overridden from the default.
Auto-Resolve  Event type that auto-resolves matching alarms (null = manual only).

Rule                 Details
scanner-down         Availability: fires on scanner_down (default: high), auto-resolves on scanner_up. Scanner missed heartbeat timeout (2 minutes).
ingest-down          Availability: fires on ingest_down (default: critical), auto-resolves on ingest_up. Ingest API node stopped heartbeating.
db-down              Availability: fires on db_down (default: critical), auto-resolves on db_up. Database connection pool test failed.
proxy-down           Availability: fires on proxy_down (default: critical), auto-resolves on proxy_up. Traefik reverse proxy health check failed.
dns-down             Availability: fires on dns_down (default: high), auto-resolves on dns_up. Public FQDN DNS resolution failed.
backup-failed        Operations: fires on backup_failed (default: high), manual resolve only. Manual or scheduled backup completed with errors.
db-disk-high         Threshold: fires on db_disk_high (default: warning), auto-resolves on db_disk_normal. Database disk usage exceeded threshold.
db-connections-high  Threshold: fires on db_connections_high (default: warning), auto-resolves on db_connections_normal. Database connection pool has sustained waiting queries.
tls-expiring         Deadline: fires on tls_expiring (default: warning), auto-resolves on tls_renewed. TLS certificate expires within 30 days.
scanner-load-high    Resource: fires on scanner_load_high (default: warning), auto-resolves on scanner_load_normal. Scanner load average exceeds 80% of CPU capacity.
scanner-memory-high  Resource: fires on scanner_memory_high (default: warning), auto-resolves on scanner_memory_normal. Scanner available memory below 10%.
db-sessions-high     Database: fires on db_sessions_high (default: warning), auto-resolves on db_sessions_normal. Active database sessions exceed 80% of max_connections.
db-long-queries      Database: fires on db_long_queries_high (default: warning), auto-resolves on db_long_queries_normal. Database query running longer than 60 seconds.
auth-failures-high   Security: fires on auth_failures_high (default: high), auto-resolves on auth_failures_normal. More than 10 authentication failures in 5 minutes.
session-ip-spread    Security: fires on session_ip_spread_high (default: warning), manual resolve only. One user account has active sessions from too many distinct IP addresses.
scan-stuck           Stall: fires on scan_stuck (default: warning), manual resolve only. Scan still running but all jobs are finished.
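
The fire and auto-resolve behaviour of the rules listed above can be sketched as a single matching function. This is a simplified model under stated assumptions, not the actual implementation: the Rule and Alarm shapes and the handleEvent name are hypothetical, alarms are not deduplicated or keyed by source, and a disabled rule is assumed to still auto-resolve its existing alarms.

```typescript
type Severity = "warning" | "high" | "critical";

// Hypothetical shapes for illustration only.
interface Rule {
  name: string;
  enabled: boolean;
  eventType: string;               // event that fires the alarm
  severity: Severity;              // current (possibly overridden) severity
  autoResolveEvent: string | null; // null = manual resolve only
}

interface Alarm {
  rule: string;
  severity: Severity;
  status: "open" | "resolved";
}

// Returns a new alarm list after applying one incoming event.
function handleEvent(rules: Rule[], alarms: Alarm[], eventType: string): Alarm[] {
  const next: Alarm[] = alarms.map((a) => ({ ...a }));
  for (const rule of rules) {
    // An enabled rule creates an alarm when its firing event arrives.
    if (rule.enabled && rule.eventType === eventType) {
      next.push({ rule: rule.name, severity: rule.severity, status: "open" });
    }
    // The auto-resolve event closes open alarms created by the same rule.
    if (rule.autoResolveEvent === eventType) {
      for (const alarm of next) {
        if (alarm.rule === rule.name && alarm.status === "open") {
          alarm.status = "resolved";
        }
      }
    }
  }
  return next;
}

const rules: Rule[] = [
  { name: "scanner-down", enabled: true, eventType: "scanner_down", severity: "high", autoResolveEvent: "scanner_up" },
  { name: "backup-failed", enabled: false, eventType: "backup_failed", severity: "high", autoResolveEvent: null },
];

let alarms = handleEvent(rules, [], "scanner_down");  // opens a scanner-down alarm
alarms = handleEvent(rules, alarms, "backup_failed"); // ignored: rule is disabled
alarms = handleEvent(rules, alarms, "scanner_up");    // resolves the scanner-down alarm
```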

How To

Disable an alarm type

  1. Find the rule in the table
  2. Click the enabled toggle to turn it off
  3. New events of this type will be ignored (existing alarms remain)

Override severity

  1. Find the rule in the table
  2. Use the severity dropdown to change the level
  3. New alarms will use the overridden severity

Reset severity to default

  1. Find the rule with a non-default severity (shown in parentheses)
  2. Click the "Reset" button
  3. Severity reverts to the code-defined default

Gotchas

  1. Disabling a rule does not resolve existing alarms — it only prevents new ones.
  2. Severity overrides apply to new alarms only. Existing alarms keep their original severity.
  3. Rules are re-seeded on boot. New rules appear automatically after a code update.
  4. The default severity is immutable — it reflects the code-defined importance of the fault.
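
One way to picture how re-seeding, overrides, and reset fit together is the sketch below. It assumes (the page does not confirm this) that user settings survive a re-seed, that newly added rules start enabled, and that the override is stored separately from the immutable default. The names seedRules, effectiveSeverity, resetSeverity, and severityOverride are hypothetical.

```typescript
type Severity = "warning" | "high" | "critical";

// Code-defined part of a rule (hypothetical shape).
interface RuleDefinition {
  name: string;
  defaultSeverity: Severity;
}

// Persisted rule row (hypothetical shape).
interface StoredRule {
  name: string;
  enabled: boolean;
  defaultSeverity: Severity;         // always copied from code, never edited
  severityOverride: Severity | null; // null = use the default
}

// Re-seed on boot: code definitions win for the default severity,
// stored user settings (enabled flag, override) are assumed to be kept.
function seedRules(definitions: RuleDefinition[], stored: StoredRule[]): StoredRule[] {
  const byName = new Map(stored.map((r) => [r.name, r] as const));
  return definitions.map((def) => {
    const existing = byName.get(def.name);
    return {
      name: def.name,
      enabled: existing ? existing.enabled : true, // assumption: new rules start enabled
      defaultSeverity: def.defaultSeverity,
      severityOverride: existing ? existing.severityOverride : null,
    };
  });
}

// Severity that new alarms would receive.
function effectiveSeverity(rule: StoredRule): Severity {
  return rule.severityOverride ?? rule.defaultSeverity;
}

// Reset clears the override; the default itself is untouched.
function resetSeverity(rule: StoredRule): StoredRule {
  return { ...rule, severityOverride: null };
}

const seeded = seedRules(
  [
    { name: "scanner-down", defaultSeverity: "high" },
    { name: "scan-stuck", defaultSeverity: "warning" }, // new rule after a code update
  ],
  [{ name: "scanner-down", enabled: false, defaultSeverity: "high", severityOverride: "critical" }],
);
```

In this model, existing alarms are never touched by seeding or by a severity change, which matches the first two gotchas: only alarms created afterwards pick up the new setting.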

API Calls (3)

Method Path Description
GET /api/admin/alerting/alarm-rules List all built-in alarm rules
PATCH /api/admin/alerting/alarm-rules/:id Toggle enabled or override severity
POST /api/admin/alerting/alarm-rules/:id/reset Reset severity to default
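
The three calls above could be wrapped in small request builders like the ones below. This is a sketch under assumptions: the PATCH body field names (enabled, severity) are guesses based on the column names, the format of :id (rule name or numeric ID) is not specified on this page, and the helper names are hypothetical. The builders only describe the request; sending it, along with whatever authentication the admin API requires, is left to the HTTP client in use.

```typescript
type Severity = "warning" | "high" | "critical";

// Plain description of an HTTP request, independent of any client library.
interface ApiRequest {
  method: "GET" | "PATCH" | "POST";
  path: string;
  body?: string; // JSON-encoded payload, when present
}

const BASE = "/api/admin/alerting/alarm-rules";

// GET: list all built-in alarm rules.
function listRules(): ApiRequest {
  return { method: "GET", path: BASE };
}

// PATCH: toggle enabled and/or override severity.
// The body field names are assumptions, not confirmed by this page.
function updateRule(id: string, changes: { enabled?: boolean; severity?: Severity }): ApiRequest {
  return {
    method: "PATCH",
    path: `${BASE}/${encodeURIComponent(id)}`,
    body: JSON.stringify(changes),
  };
}

// POST: reset severity to the code-defined default.
function resetRule(id: string): ApiRequest {
  return { method: "POST", path: `${BASE}/${encodeURIComponent(id)}/reset` };
}

// Example: disable one rule, raise another, then reset it.
const requests: ApiRequest[] = [
  updateRule("scanner-down", { enabled: false }),
  updateRule("tls-expiring", { severity: "high" }),
  resetRule("tls-expiring"),
];
```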

Related Pages

  • Alarms — Rules create alarms when matching events arrive
  • Events — Events are matched against rules to create alarms
  • Notification Policies — Policies can be scoped to specific alarm rule names