Health & Monitoring
A Sovereign install exposes three layers of operational visibility:
- Health probes — lightweight HTTP endpoints suitable for load balancers, container orchestration, and external monitoring tools.
- Scheduled jobs — background tasks that sweep posture, reconcile state, and surface drift as security events.
- Investigations & Operations consoles — admin UI pages where you see security events, the audit chain, and ongoing operational incidents.
You wire the first to your monitoring stack. You watch the second and third from inside the admin shell.
Health probes
Every Novantra app exposes two endpoints:
| Endpoint | What it answers | Use it for |
|---|---|---|
| GET /api/health/live | “Is the process responsive?” | Liveness probes. Restart the container/service if this stops responding. |
| GET /api/health/ready | “Can I serve real traffic?” | Readiness probes. Take the instance out of load-balancer rotation if this returns not-ready. |
The endpoints return small JSON payloads with a status field. They are intentionally cheap to call — they do not perform expensive checks on every request. Heavy posture work happens in scheduled jobs (see below) and surfaces through security events, not through the readiness probe.
Wiring to Kubernetes
Standard probe spec:
livenessProbe:
  httpGet:
    path: /api/health/live
    port: 7443
  periodSeconds: 30
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /api/health/ready
    port: 7443
  periodSeconds: 10
  failureThreshold: 3
Wiring to a generic monitor
Any HTTP monitor (UptimeRobot, Healthchecks.io, Datadog synthetics, your own Nagios) can hit these endpoints. Treat any non-200 response as a real incident; the probes are designed not to be flaky.
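If you script your own check, the rule above can be sketched in Python with only the standard library. The names `ProbeResult`, `parse_status_field`, and `check` are illustrative and not part of Novantra. The sketch assumes only what this page states: the payload is JSON with a status field, and anything other than HTTP 200 counts as an incident. The values the status field can take are not listed here, so the code reports the field without interpreting it.

```python
import json
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    http_status: Optional[int]   # None when no HTTP response was received
    status_field: Optional[str]  # value of the JSON "status" field, if present

    @property
    def is_incident(self) -> bool:
        # Anything other than HTTP 200 (including no response) is an incident.
        return self.http_status != 200


def parse_status_field(body: bytes) -> Optional[str]:
    """Return the "status" field of a probe payload, or None if it cannot be read."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if isinstance(payload, dict) and "status" in payload:
        return str(payload["status"])
    return None


def check(url: str, timeout: float = 5.0) -> ProbeResult:
    """Call one health endpoint and capture the HTTP status and status field."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return ProbeResult(resp.status, parse_status_field(resp.read()))
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for non-2xx responses; the body is still readable.
        return ProbeResult(err.code, parse_status_field(err.read()))
    except OSError:
        # Covers URLError, connection failures, and timeouts.
        return ProbeResult(None, None)
```

For example, `check("https://<your-host>:7443/api/health/ready").is_incident` could feed whatever alerting you already use. The scheme, host, and port depend on your deployment; 7443 is simply the port used in the Kubernetes example.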
Scheduled jobs
Several background jobs run on every install. They are the primary mechanism by which the system finds and surfaces drift.
| Job | Cadence | What it does |
|---|---|---|
| reconcile-license-state | Hourly | Contacts the control plane (connected installs only), refreshes the signed license, projects per-org entitlements, fires state-change events. |
| storage-active-binding-posture-sweep | Periodic | Re-validates each organization’s active storage binding (credential, bucket reachability, read/write/list/delete round trip). Surfaces drift as a security event. |
| storage-object-migration-runner | Periodic | Picks up active storage migration runs, processes the next batch of items. See Storage Migration. |
Jobs run inside the install, in the same process as the application. They are visible in the system logs (Linux: journalctl -u novantra; Docker: docker logs; Windows: Event Viewer).
Security events
When something operationally interesting happens — a posture sweep finds drift, a key provider returns an unexpected error, a migration fails — the system records it as a security event. Security events are distinct from the workspace audit log (member-driven actions); they are the install’s view of itself.
A small sample of the security event types you may see:
| Event | What it means |
|---|---|
| storageOperationsValidationFailed | A storage binding validation probe returned a failure. The active binding may be drifting (credential rotated, bucket policy changed). |
| storageOperationsActivationFailed | An attempt to activate a storage binding failed. |
| storageOperationsActiveBindingUnavailable | The active storage binding for an organization is unreachable. File operations for that org are failing closed. |
| storageOperationsMigrationFailed | A storage migration run hit an unrecoverable error and stopped. |
| licenseRefreshFailed | The hourly license reconcile job could not reach the control plane or got a signature/state mismatch. |
Every event carries:
- A timestamp.
- The organization involved (or `install` for install-scoped events).
- A sanitized reason (no raw provider errors, no secrets, no stack traces).
- Enough context to find related operations evidence.
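To make the shape of an event concrete, here is a minimal Python sketch of a record with those four pieces of information, plus a filter by event type, organization, and date range. It is illustrative only: the class, field names, and `filter_events` helper are hypothetical, and this page does not describe any export format or API for security events.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Iterable, List, Optional

INSTALL_SCOPE = "install"  # scope value used for install-scoped events


@dataclass
class SecurityEvent:
    event_type: str   # e.g. "storageOperationsValidationFailed"
    timestamp: datetime
    scope: str        # an organization identifier, or "install"
    reason: str       # sanitized reason: no provider errors, secrets, or stack traces
    context: Dict[str, str] = field(default_factory=dict)  # pointers to related evidence

    @property
    def is_install_scoped(self) -> bool:
        return self.scope == INSTALL_SCOPE


def filter_events(
    events: Iterable[SecurityEvent],
    event_type: Optional[str] = None,
    scope: Optional[str] = None,
    since: Optional[datetime] = None,
    until: Optional[datetime] = None,
) -> List[SecurityEvent]:
    """Keep events matching every criterion that was supplied (None means 'any')."""
    selected = []
    for event in events:
        if event_type is not None and event.event_type != event_type:
            continue
        if scope is not None and event.scope != scope:
            continue
        if since is not None and event.timestamp < since:
            continue
        if until is not None and event.timestamp > until:
            continue
        selected.append(event)
    return selected
```

Calling `filter_events(events, scope="install")` would, under these assumptions, return only the install-scoped events such as a failed license refresh.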
Investigations console
System → Investigations is where install admins review what the install has logged about itself:
- Security events — the stream described above, filterable by event type, organization, date range.
- Audit chain — the install-level audit log (different from per-organization audit logs). Includes a one-click chain verification action.
- Operation evidence — receipts for sensitive operations (storage activation, license import, key-provider verification, etc.).
Filtering and reviewing here is your first stop when something looks off.
Operations console
System → Operations is where ongoing operational state is summarized:
- Active incidents — anything currently unresolved (an unreachable binding, an unverified key provider).
- License status — current license state and the latest reconcile-state result.
- Version status — what version you’re running and what’s available (connected installs only).
- Recent scheduled-job runs — the last few runs of each background job with their status.
If you only have time to look at one page per day, this is the one.
Per-organization observability
Inside an organization workspace, members with the right permissions also have observability into their own org:
- Settings → Investigations → Audit Log — the audit log for that organization specifically.
- Settings → Storage — current binding status, last validation result, recent migration runs (see Self Managed Storage).
- Settings → Key management — provider verification status, last successful operation (see Self Managed Secret Keys).
Install admins also see all of the above via the system console; org members see only their own.
Logs
Application logs go to the standard system journal. Recommended starting points:
# Linux (.deb)
journalctl -u novantra -f
journalctl -u novantra --since "1 hour ago" | grep ERROR
# Docker
docker logs -f novantra
docker logs novantra --since 1h | grep ERROR
# Windows
Event Viewer → Applications → Novantra
Logs are deliberately conservative: routine traffic doesn’t log per-request; errors and significant state transitions do. You should not need a log indexer to keep up with a healthy install.
Logs never contain raw secrets, raw provider error payloads, plaintext credentials, or unredacted user data. If you need that level of detail for a vendor support call, ask Novantra support how to enable scoped diagnostic logging — it’s not the default for safety reasons.
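If you want a quick summary rather than raw grep output, the same filter can be sketched in Python. This assumes only that error lines contain the literal text ERROR, as the grep examples above do; no other log format is assumed. The function name and the `keep_last` parameter are illustrative.

```python
from collections import deque
from typing import Iterable, List, Tuple


def summarize_errors(lines: Iterable[str], keep_last: int = 5) -> Tuple[int, List[str]]:
    """Count lines containing "ERROR" and keep the most recent few of them."""
    recent = deque(maxlen=keep_last)  # older entries fall off automatically
    count = 0
    for line in lines:
        if "ERROR" in line:
            count += 1
            recent.append(line.rstrip("\n"))
    return count, list(recent)
```

To use it on live output, pipe `journalctl -u novantra --since "1 hour ago"` or `docker logs novantra --since 1h` into a small script that calls `summarize_errors(sys.stdin)` and prints the result.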
What’s not yet surfaced
A few things that are coming but aren’t operator-facing today:
- Per-organization connection pool status in a dashboard.
- Real-time performance metrics (request latency histograms, throughput).
- Prometheus / OpenMetrics endpoint for native scrape.
- Distributed tracing export (OpenTelemetry).
Until these land, the combination of health probes + journal logs + the Investigations/Operations consoles is the supported way to monitor.
Cloud customers
In Cloud, Novantra runs all of the above on your behalf. You don’t wire probes, you don’t watch the operations console — you’d contact support if you saw something off. Customer-facing service health information is published on Novantra’s status page.
Related
- License Management — license reconcile job and its events.
- Storage Migration — migration runner job and its failure events.
- Audit Log — per-organization audit (different from install security events).