Heartbeat Mechanism
Introduction
After enrollment, the sensor agent enters a continuous heartbeat loop, periodically collecting system metrics and sending them to the backend. This is the agent's primary ongoing responsibility — providing the defense center with real-time visibility into sensor host health and network interface activity.
Heartbeat Loop
The heartbeat loop never exits. Failed heartbeats are logged but the agent continues operating.
Heartbeat Request
Endpoint:
POST /api/sensors/:sensorId/agent/heartbeat
Headers:
| Header | Value |
|---|---|
| Content-Type | application/json |
| X-Sensor-Token | Durable sensor token from enrollment |
Payload:
{
  "sensorToken": "durable-token",
  "timestamp": "2025-12-19T12:00:00Z",
  "cpuUsage": 25.5,
  "memoryUsed": 4294967296,
  "memoryTotal": 17179869184,
  "memoryPercent": 25.0,
  "interfaces": [
    {
      "name": "enp130s0f0",
      "isManagement": true,
      "isLoopback": false,
      "isVirtual": false,
      "isWireless": false,
      "isUp": true,
      "bytesSent": 1073741824,
      "bytesRecv": 2147483648
    }
  ]
}
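The request above could be assembled in Go roughly as follows. This is a sketch, not the agent's actual code: the type names, the newHeartbeatRequest helper, the base URL, and the sensor ID are hypothetical, and formatting the timestamp as RFC 3339 in UTC is an assumption inferred from the sample value.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// InterfaceReport mirrors one entry of the "interfaces" array.
type InterfaceReport struct {
	Name         string `json:"name"`
	IsManagement bool   `json:"isManagement"`
	IsLoopback   bool   `json:"isLoopback"`
	IsVirtual    bool   `json:"isVirtual"`
	IsWireless   bool   `json:"isWireless"`
	IsUp         bool   `json:"isUp"`
	BytesSent    uint64 `json:"bytesSent"`
	BytesRecv    uint64 `json:"bytesRecv"`
}

// HeartbeatPayload mirrors the JSON body shown above.
type HeartbeatPayload struct {
	SensorToken   string            `json:"sensorToken"`
	Timestamp     string            `json:"timestamp"`
	CPUUsage      float64           `json:"cpuUsage"`
	MemoryUsed    uint64            `json:"memoryUsed"`
	MemoryTotal   uint64            `json:"memoryTotal"`
	MemoryPercent float64           `json:"memoryPercent"`
	Interfaces    []InterfaceReport `json:"interfaces"`
}

// newHeartbeatRequest marshals the payload and sets both documented headers.
func newHeartbeatRequest(baseURL, sensorID, token string, p HeartbeatPayload) (*http.Request, error) {
	body, err := json.Marshal(p)
	if err != nil {
		return nil, fmt.Errorf("marshal heartbeat: %w", err)
	}
	url := fmt.Sprintf("%s/api/sensors/%s/agent/heartbeat", baseURL, sensorID)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-Sensor-Token", token)
	return req, nil
}

func main() {
	p := HeartbeatPayload{
		SensorToken:   "durable-token",
		Timestamp:     time.Now().UTC().Format(time.RFC3339), // assumed format
		CPUUsage:      25.5,
		MemoryUsed:    4294967296,
		MemoryTotal:   17179869184,
		MemoryPercent: 25.0,
		Interfaces: []InterfaceReport{
			{Name: "enp130s0f0", IsManagement: true, IsUp: true,
				BytesSent: 1073741824, BytesRecv: 2147483648},
		},
	}
	// Hypothetical backend address and sensor ID.
	req, err := newHeartbeatRequest("https://backend.example", "sensor-123", "durable-token", p)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path)
}
```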
Metrics Collected
CPU Usage
- Source: gopsutil/cpu.Percent(time.Second, false)
- Returns aggregate CPU usage across all cores as a single percentage.
- The 1-second sample window means each heartbeat cycle takes at least 1 second for CPU measurement.
Memory
- Source: gopsutil/mem.VirtualMemory()
- Collected fields:
| Field | Type | Description |
|---|---|---|
| memoryUsed | uint64 | Used memory in bytes |
| memoryTotal | uint64 | Total memory in bytes |
| memoryPercent | float64 | Used memory percentage |
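The three memory fields are related by simple arithmetic. The sketch below only illustrates that relationship with the values from the sample payload; usedPercent is a hypothetical helper, and the agent presumably takes the percentage from the library rather than computing it this way.

```go
package main

import "fmt"

// usedPercent returns used memory as a percentage of total memory.
// It returns 0 when total is 0 to avoid dividing by zero.
func usedPercent(used, total uint64) float64 {
	if total == 0 {
		return 0
	}
	return float64(used) / float64(total) * 100
}

func main() {
	// 4 GiB used out of 16 GiB total, as in the sample payload.
	fmt.Printf("%.1f\n", usedPercent(4294967296, 17179869184)) // prints 25.0
}
```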
Network Interfaces
- Source: net.Interfaces() + gopsutil/net.IOCounters(true)
- Every interface on the host is reported with:
| Field | Type | Description |
|---|---|---|
| name | string | Interface name (e.g., enp130s0f0) |
| isManagement | bool | True if this is the default route interface |
| isLoopback | bool | True for loopback interfaces (lo) |
| isVirtual | bool | True for Docker bridges, veth, tun, tap, etc. |
| isWireless | bool | True for wireless interfaces (wl*, wlan*) |
| isUp | bool | True if the interface has the UP flag |
| bytesSent | uint64 | Cumulative bytes sent (since boot) |
| bytesRecv | uint64 | Cumulative bytes received (since boot) |
Interface Classification Rules
The agent classifies each network interface using these rules:
| Classification | Detection Method |
|---|---|
| Management | Interface name matches the default route device from ip route show default |
| Loopback | net.FlagLoopback flag set, or name is lo |
| Virtual | Name starts with: veth, docker, br-, virbr, vnet, tun, tap |
| Wireless | Name starts with: wl, wlan |
These classifications help the backend determine which interfaces are eligible for virtual sensor deployment. Management, loopback, virtual, and wireless interfaces are typically not eligible for packet capture.
Heartbeat Interval
The interval between heartbeats is determined by two sources:
- From enrollment response: The backend sends heartbeatIntervalSeconds during enrollment, and the agent adopts this value.
- From environment variable: HEARTBEAT_INTERVAL sets the initial interval (default: 30 seconds). This is used for the pre-enrollment period; once enrolled, the backend-provided value takes precedence.
Backend Processing
When the backend receives a heartbeat:
- Validates the X-Sensor-Token header.
- Stores CPU and memory metrics in InfluxDB (sensor_metrics measurement).
- Updates the sensor's lastHeartbeatAt timestamp.
- Stores interface state for the backend's interface inventory (SensorNetworkInterface).
- Updates online/offline status based on heartbeat recency.
The frontend then queries these metrics for:
- Sensor health badges (Healthy / Warning / Critical based on CPU/memory thresholds).
- Online/offline status (heartbeat younger than 5 minutes = online).
- Sensor detail page interface listing.
Failure Modes
| Scenario | Behavior |
|---|---|
| CPU metric collection fails | Heartbeat returns error, logged, loop continues |
| Memory metric collection fails | Heartbeat returns error, logged, loop continues |
| Interface collection fails | Heartbeat returns error, logged, loop continues |
| Backend unreachable | HTTP error logged, loop continues |
| Backend returns non-200 | Error logged with status code, loop continues |
| JSON marshaling fails | Error logged, loop continues |
The heartbeat loop is designed to be resilient — no single failure causes the agent to crash. This ensures continuous monitoring even during transient network issues.