
Realtime Relay

The Nuxt frontend receives live device-position updates through a self-hosted Centrifugo relay running alongside barnowl-aruba on the ingestion VM. This document is the architecture reference for the relay, the channel taxonomy, the auth flow, and how to operate it.

Migration context: PR #335 replaced Azure Web PubSub Free_F1 after its 20 000-message/day quota started silently throttling sendToAll() calls. See the realtime migration runbook for the decommission plan.

Topology

mermaid
flowchart LR
  Aruba[Aruba APs] --> DC[Data Collectors]
  DC --> Barnowl[barnowl-aruba<br/>nisc-ingestion VM]
  Barnowl --> EH[Azure Event Hub]
  EH --> Func[Azure Functions<br/>arubaEventHubTrigger]
  Func -->|HTTP POST /api<br/>X-API-Key| Relay[Centrifugo :8100<br/>nisc-ingestion VM]
  Nuxt[Nuxt client] -->|GET /api/negotiate<br/>Bearer JWT| Func
  Nuxt -->|WSS /connection/websocket<br/>Centrifugo JWT| Relay

Both barnowl-aruba and centrifugo run on the same VM (nisc-ingestion.centralus.cloudapp.azure.com), with nginx terminating TLS and routing:

| Path | Purpose | Upstream |
| --- | --- | --- |
| /connection/websocket | Browser WS connection | 127.0.0.1:8100 |
| /api | HTTP publish API (backend → relay) | 127.0.0.1:8100 |
| /centrifugo/ | Admin UI | 127.0.0.1:8100 |
| /aruba/aos10, /aruba/aos8 | Existing BLE ingest (unchanged) | 127.0.0.1:3001 (barnowl) |

Channel taxonomy

Centrifugo requires channel names of the form <namespace>:<suffix> to inherit namespace rules. A bare name like global falls into the default namespace where client subscription is denied by default (discovered in PR #335's follow-up commit).
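As an illustration of the naming rule above, the following sketch builds and checks channel names. The helper names (`floorChannel`, `isNamespaced`) are hypothetical and are not taken from iotHub.js; the regular expression only checks the `<namespace>:<suffix>` shape, not whether the namespace is actually configured in Centrifugo.

```javascript
// Hypothetical helpers for illustration only.
// Matches "<namespace>:<suffix>" with a non-empty namespace and suffix.
const CHANNEL_PATTERN = /^[^:\s]+:[^\s]+$/;

// Build the per-floor channel name, e.g. floorChannel(2) gives "floor:2".
function floorChannel(floorId) {
  if (floorId === undefined || floorId === null || String(floorId).trim() === '') {
    throw new Error('floorId is required to build a floor channel name');
  }
  return `floor:${floorId}`;
}

// True when the name carries an explicit namespace prefix.
function isNamespaced(channel) {
  return CHANNEL_PATTERN.test(channel);
}
```

A guard like `isNamespaced` in the publisher would reject a bare `global` before it reaches the relay, rather than letting subscriptions fail silently on the client.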

| Channel | Publisher | Subscriber | Payload | Frequency |
| --- | --- | --- | --- | --- |
| floor:<floorId> | iotHub.js after trilateration | Floorplan page when viewing that floor | {type: 'position-batch', data: [{transmitterId, position, positionMethod, floorId, …}]} | 1 per Event Hub batch containing trilat results (~3/sec aggregate across floors) |
| global:all | iotHub.js::processAmbientBatch when classified.length > 0 | Floorplan page (useRealtime auto-subscribes) | {type: 'ambient-batch', data: [{deviceId, deviceType, manufacturer, receiverId, rssi, timestamp}]} | Variable; only when new ambient devices are classified |
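A publish to one of these channels might look roughly like the sketch below. This is not the code in utils/centrifugoPublisher.js: `buildPublishRequest` and `publish` are hypothetical names, and the `/publish` suffix assumes the Centrifugo v5 HTTP API layout (`POST <base>/publish` with an `X-API-Key` header). If the relay runs a different version or uses the legacy single-endpoint format, the URL and body shape will differ.

```javascript
// Sketch only. Assumes Centrifugo v5-style HTTP API: POST <base>/publish
// with body { channel, data } and the key in an X-API-Key header.
function buildPublishRequest(env, channel, payload) {
  const base = env.CENTRIFUGO_PUBLISH_URL.replace(/\/+$/, ''); // drop trailing slashes
  return {
    url: `${base}/publish`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-API-Key': env.CENTRIFUGO_API_KEY,
      },
      body: JSON.stringify({ channel, data: payload }),
    },
  };
}

// Sends the request; fetchImpl is injectable so the function can be tested
// without a network. Defaults to the global fetch (Node 18+).
async function publish(env, channel, payload, fetchImpl = fetch) {
  const { url, init } = buildPublishRequest(env, channel, payload);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`Centrifugo publish failed: HTTP ${res.status}`);
  return res.json();
}
```

For example, `publish(process.env, 'floor:2', { type: 'position-batch', data: batch })` would deliver one batch only to clients subscribed to Floor 2.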

Why per-floor channels matter

Under the old Web PubSub sendToAll path, every connected client received every floor's ~70-update batches at ~3 batches/sec — roughly 260 000 broadcasts/day total. Per-floor channels send each batch only to subscribers of that floor. A user viewing Floor 2 never sees Floor 1's firehose.
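The daily figure above follows directly from the batch rate. A small worked check, assuming the approximate 3 batches/sec rate holds around the clock (whether the old quota counted calls or per-client deliveries is not stated here, so the ratio below is only indicative):

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60; // 86 400

// Broadcast calls per day at a steady aggregate batch rate.
function dailyBroadcasts(batchesPerSecond) {
  return batchesPerSecond * SECONDS_PER_DAY;
}

const perDay = dailyBroadcasts(3);
console.log(perDay);                        // 259200, i.e. roughly 260 000
console.log((perDay / 20000).toFixed(1));   // "13.0" times the 20 000/day quota
```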

Authentication

Two JWTs are in play and they are not the same token:

  1. App JWT (HMAC via TOKEN_SECRET) — the token returned by /api/auth/login. Used for normal API calls to the Function App. Stored as the httpOnly auth_token cookie.
  2. Centrifugo JWT (HMAC via CENTRIFUGO_HMAC_SECRET) — issued by /api/negotiate after the caller passes app-JWT auth. Claims: {sub: <userId>, iat, exp}, TTL 1 hour. Used only to prove identity to Centrifugo.
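To make the second token concrete, here is a minimal sketch of issuing an HS256 JWT with the `{sub, iat, exp}` claims and a one-hour TTL, using only Node's built-in crypto module. `signCentrifugoToken` is a hypothetical name; the real routes/realtime.js::negotiate may well use a JWT library instead, and this sketch assumes the secret is used as a plain string key.

```javascript
const { createHmac } = require('node:crypto');

// Base64url-encode a string (no padding), as JWT requires.
const b64url = (input) => Buffer.from(input).toString('base64url');

// Hypothetical helper: signs an HS256 token carrying {sub, iat, exp}.
// nowSeconds is injectable so the output is deterministic in tests.
function signCentrifugoToken(userId, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const claims = b64url(JSON.stringify({
    sub: String(userId),
    iat: nowSeconds,
    exp: nowSeconds + 3600, // 1 hour TTL
  }));
  const signature = createHmac('sha256', secret)
    .update(`${header}.${claims}`)
    .digest('base64url');
  return `${header}.${claims}.${signature}`;
}
```

Because the two tokens are signed with different secrets, a leaked Centrifugo JWT cannot be replayed against the Function App API, and vice versa.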

Token refresh is handled by centrifuge-js automatically: the client calls getToken (which re-hits /api/negotiate) on every refresh cycle, so a rolling 1 h window keeps long-lived sessions alive without re-authentication.
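The refresh loop can be sketched as a `getToken` callback handed to centrifuge-js. Several things here are assumptions: `makeGetToken` and `getAppJwt` are hypothetical names, the app JWT is sent as a Bearer header (as in the topology diagram), and the negotiate response is assumed to be JSON shaped like `{ url, token }`.

```javascript
// Hypothetical factory: returns a getToken callback for centrifuge-js.
// Assumes /api/negotiate answers with JSON shaped like { url, token }.
function makeGetToken(negotiateUrl, getAppJwt, fetchImpl = fetch) {
  return async function getToken() {
    const res = await fetchImpl(negotiateUrl, {
      headers: { Authorization: `Bearer ${await getAppJwt()}` },
    });
    if (!res.ok) throw new Error(`negotiate failed: HTTP ${res.status}`);
    const { token } = await res.json();
    return token; // a fresh Centrifugo JWT, valid for about an hour
  };
}

// Wiring it up (requires the centrifuge package, which is not imported here):
//   const centrifuge = new Centrifuge(clientUrl, {
//     getToken: makeGetToken('/api/negotiate', () => appJwt),
//   });
//   centrifuge.connect();
```

Injecting `fetchImpl` keeps the callback testable without a browser or a live Function App.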

Function App environment variables

All four must be set on nisc-muster-tracking-api:

| Name | Value | Consumers |
| --- | --- | --- |
| CENTRIFUGO_PUBLISH_URL | https://nisc-ingestion.centralus.cloudapp.azure.com/api | utils/centrifugoPublisher.js |
| CENTRIFUGO_CLIENT_URL | wss://nisc-ingestion.centralus.cloudapp.azure.com/connection/websocket | routes/realtime.js::negotiate |
| CENTRIFUGO_HMAC_SECRET | 64-hex; matches /etc/centrifugo/centrifugo.env CENTRIFUGO_TOKEN_HMAC_SECRET_KEY | routes/realtime.js::negotiate |
| CENTRIFUGO_API_KEY | 64-hex; matches /etc/centrifugo/centrifugo.env CENTRIFUGO_API_KEY | utils/centrifugoPublisher.js |
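A missing setting tends to surface only as a silent realtime outage, so a startup check can help. The sketch below is a hypothetical helper, not something the Function App is known to contain; it only reports which of the four settings above are unset or blank.

```javascript
const REQUIRED_SETTINGS = [
  'CENTRIFUGO_PUBLISH_URL',
  'CENTRIFUGO_CLIENT_URL',
  'CENTRIFUGO_HMAC_SECRET',
  'CENTRIFUGO_API_KEY',
];

// Returns the names of required settings that are unset or whitespace-only.
function missingCentrifugoSettings(env = process.env) {
  return REQUIRED_SETTINGS.filter((name) => !env[name] || env[name].trim() === '');
}
```

Logging the result of `missingCentrifugoSettings()` once at cold start gives a single searchable line when a deploy drops a setting.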

CI owns the Function App

Do not modify these settings with az functionapp config appsettings set outside CI. Changes made through the Azure CLI compete with how the GitHub Actions Azure/functions-action@v1 deploy step handles WEBSITE_RUN_FROM_PACKAGE, and the next CI run will fail with "please remove WEBSITE_RUN_FROM_PACKAGE app setting". Set values through the Portal or a repo-owned workflow step instead.

Monitoring

  • Admin UI: https://nisc-ingestion.centralus.cloudapp.azure.com/centrifugo/ — live connection count, channel list, publish stats.
  • Prometheus metrics: http://127.0.0.1:8100/metrics (loopback only; scrape from in-VM exporter).
  • systemd: journalctl -u centrifugo --since "10 min ago".
  • Client-side probe: curl -sS -H "Authorization: Bearer <app-jwt>" https://nisc-muster-tracking-api.azurewebsites.net/api/negotiate — expect a Centrifugo URL + token in the response.

Failure modes and how to diagnose

| Symptom | First check | Likely fix |
| --- | --- | --- |
| "Last Realtime Update" timestamp stops ticking | /api/negotiate response shape: Centrifugo URL vs. Web PubSub URL | Function App env vars (see table above) |
| Personnel markers don't move | Subscribe to the floor's channel via a listener; see whether publishes arrive | Per-floor routing in iotHub.js; check for a regression in position-batch publish |
| Ambient markers don't update | Subscribe to global:all | Ambient classification (processAmbientBatch::classified.length > 0) |
| Browser shows "Offline" instead of "Live" | Browser DevTools: WS upgrade to /connection/websocket | nginx config; Centrifugo service status |
| Centrifugo admin UI shows 0 clients despite browser tabs open | allowed_origins in config.json | Add the SWA hostname to the allowlist |
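The first check in the table (Centrifugo URL vs. Web PubSub URL) can be automated with a small classifier. This is a sketch under two assumptions: Azure Web PubSub client URLs live under the `webpubsub.azure.com` domain, and the relay URL ends in `/connection/websocket` as configured above. `classifyRealtimeUrl` is a hypothetical name.

```javascript
// Classify the URL returned by /api/negotiate.
// Returns 'centrifugo', 'webpubsub', 'unknown', or 'invalid'.
function classifyRealtimeUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return 'invalid'; // not parseable as a URL at all
  }
  if (url.hostname.endsWith('.webpubsub.azure.com')) return 'webpubsub';
  if (url.pathname === '/connection/websocket') return 'centrifugo';
  return 'unknown';
}
```

A result of `'webpubsub'` would suggest the Function App is still serving the pre-migration configuration.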

NISC Muster Tracking Documentation