Maintenance

The Maintenance page centralizes container management in a single Admin page. From here you can restart any service, infrastructure component, or database, with danger warnings shown for high-risk operations. The page also surfaces backup triggers, S3 connectivity tests, and recent maintenance activity.

Fields & columns

Name	Description
`Container`	Name of the Docker container
`Tier`	Classification: Services, Infrastructure, or Databases
`Status`	Current health status from the watchdog/health endpoints
`Danger`	Whether restarting this container carries risk (data loss, connection drops)

How to

Restart a service

Find the container in the Services section
Click the "Restart" button next to it
Type the container name to confirm
Wait for the restart to complete (a 5-minute cooldown applies per container before it can be restarted again)
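
The per-container cooldown can be pictured with a small sketch. This is an illustration only, not the manager's actual implementation: the `RestartCooldown` class and its method names are hypothetical, and only the 5-minute value comes from this page.

```python
import time

COOLDOWN_SECONDS = 5 * 60  # 5-minute cooldown, as stated on this page


class RestartCooldown:
    """Tracks the last restart time per container (hypothetical helper)."""

    def __init__(self, cooldown=COOLDOWN_SECONDS, clock=time.monotonic):
        self._cooldown = cooldown
        self._clock = clock
        self._last_restart = {}  # container name -> time of last restart

    def seconds_remaining(self, container):
        """Return the remaining cooldown in seconds, or 0.0 if a restart is allowed."""
        last = self._last_restart.get(container)
        if last is None:
            return 0.0
        return max(0.0, self._cooldown - (self._clock() - last))

    def try_restart(self, container):
        """Record a restart and return True, or return False while cooling down."""
        if self.seconds_remaining(container) > 0:
            return False
        self._last_restart[container] = self._clock()
        return True
```

Because the state is keyed by container name, restarting one container does not block a restart of another, which matches the "per container" wording in the Gotchas section.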

Restart a database (danger zone)

Find the database in the Databases section (marked with a danger warning)
Read the warning about potential data corruption
Click the "Restart" button
Type the container name to confirm
Monitor the health status to verify the database comes back healthy
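
If you script the final monitoring step rather than watching the page, a polling loop along the following lines may help. It is a minimal sketch under stated assumptions: `get_status` is a caller-supplied function (hypothetical) that returns the container's current status string, and the value `"healthy"` is assumed here, since this page does not list the exact status values.

```python
import time


def wait_until_healthy(get_status, timeout=120.0, interval=2.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns "healthy" or the timeout elapses.

    Returns True if the container became healthy in time, False otherwise.
    The "healthy" string is an assumption, not a documented status value.
    """
    deadline = clock() + timeout
    while True:
        if get_status() == "healthy":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; in normal use the defaults are enough.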

Trigger a manual backup

Scroll to the Backup Operations section
Click "Trigger Manual Backup"
Allow up to a minute for the backup to start: the backup container polls for triggers every 60 seconds, so the backup is not instantaneous
Check the recent activity log for the result

Test S3 connectivity

Scroll to the Backup Operations section
Click "Test S3 Connection"
The result will show whether the configured S3 bucket is reachable

Gotchas

Restarting the manager will disconnect your current session briefly — the page will reload automatically.
Database restarts risk data corruption if writes are in flight. Only restart databases when you are certain no active scans or writes are occurring.
Restarting Traefik drops ALL active connections to the system, including your own.
The watchdog container cannot restart itself — it is listed as non-restartable.
A 5-minute cooldown applies per container to prevent restart storms.
Backup triggers are not instantaneous — the backup container polls every 60 seconds.
The yellow row warning and the drawer's waiting count can disagree, and this is expected. The row warning requires sustained contention (3 or more consecutive 1-second samples with waiting > 0), whereas the drawer's Connection Pool tiles show only the most recent sample. A non-zero waiting count in the drawer with no yellow row warning therefore indicates a transient spike that did not escalate to an alarm. Use the "Peak waiting in last 60s" line in the drawer to spot bursts that have already resolved. If the drawer shows no Connection Pool section, the sampler has not produced its first sample yet (for example, because the manager has just started).
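
The difference between the row warning and the drawer tiles in the last gotcha can be sketched as follows. This illustrates the rule as described on this page and is not the manager's actual sampler: the `PoolContentionTracker` class and its attribute names are hypothetical, and one sample per second is assumed so that 60 stored samples cover the last 60 seconds.

```python
from collections import deque

SUSTAINED_SAMPLES = 3  # consecutive samples with waiting > 0 needed for the row warning
PEAK_WINDOW = 60       # samples kept for the peak line (assumes one sample per second)


class PoolContentionTracker:
    """Hypothetical model of the connection pool sampler's three views."""

    def __init__(self):
        self._recent = deque(maxlen=PEAK_WINDOW)
        self._streak = 0  # current run of consecutive samples with waiting > 0

    def add_sample(self, waiting):
        self._recent.append(waiting)
        self._streak = self._streak + 1 if waiting > 0 else 0

    @property
    def latest_waiting(self):
        """What the drawer tiles show: the most recent sample, or None before the first."""
        return self._recent[-1] if self._recent else None

    @property
    def row_warning(self):
        """What the yellow row warning reflects: sustained contention only."""
        return self._streak >= SUSTAINED_SAMPLES

    @property
    def peak_waiting(self):
        """What the peak line reflects: the highest waiting count in the window."""
        return max(self._recent, default=0)
```

Feeding the samples 0, 4, 0 gives a latest waiting count of 0, no row warning, and a peak of 4, which is the "burst that has already resolved" case.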

API calls (7)

Method	Path	Description
GET	`/api/admin/services/restartable`	List all restartable containers with tier and danger classification
GET	`/api/admin/services/maintenance-context`	Enriched container list with health diagnostics, open alarm counts, last restart history, and watchdog reachability — loaded once on page open to minimize round trips
GET	`/api/admin/services/:name/context`	Per-container detail loaded lazily when the detail drawer opens: health diagnostic, uptime, memory usage, DB pool stats (for database containers), recent audit events, and open alarms
POST	`/api/admin/services/:name/restart`	Restart a specific container via the watchdog sidecar
POST	`/api/admin/backups/trigger`	Request a manual backup (picked up on the backup container's next 60-second poll, so not instantaneous)
POST	`/api/admin/backups/test-s3`	Test S3 connection with current settings
GET	`/api/audit/changes`	Fetch recent audit events (filtered for maintenance activity)
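
For scripting against the per-container endpoints in the table above, the requests could be built roughly like this. This is a sketch under assumptions: the base URL `https://manager.example` is a placeholder, and authentication headers and any request body are omitted because this page does not document them. The functions only build the request objects and do not send anything.

```python
from urllib.parse import quote
from urllib.request import Request


def restart_request(base_url, container):
    """Build (but do not send) a POST to the restart endpoint for one container."""
    path = f"/api/admin/services/{quote(container, safe='')}/restart"
    return Request(base_url.rstrip("/") + path, method="POST")


def context_request(base_url, container):
    """Build (but do not send) a GET for the per-container detail endpoint."""
    path = f"/api/admin/services/{quote(container, safe='')}/context"
    return Request(base_url.rstrip("/") + path, method="GET")


# Placeholder host and container name, for illustration only.
req = restart_request("https://manager.example", "postgres")
print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) would additionally require whatever admin credentials the deployment expects.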

Related pages

Health Overview: Architecture diagram links to this page via wrench icons on each node
Backups: Full backup configuration; this page only surfaces the trigger and S3 test shortcuts
Backup Health: Read-only backup health monitoring