
Milestone 1 runbook — operator vs family, two nodes, web app

This matches §2.5 Milestone 1 in P2POS_SOVEREIGN_FAMILY_VAULT_ARCHITECTURE.md.

Roles

Who | Runs
Operator | Signaling (p2pos-signal) + STUN/TURN (coturn) on a machine with a public IP (VPS or homelab with ports 8090, 3478 UDP/TCP published).
Family (user) | Two p2pos-node instances + vault static UI (nginx). Nodes connect outbound to the operator’s WebSocket and ICE servers.

Secrets: use a strong shared P2POS_REPLICATE_PSK on every family node; set TURN user/password in turnserver.conf (or your operator equivalent) and mirror them in each node’s P2POS_WEBRTC_ICE_JSON / browser P2POS_BROWSER_WEBRTC_ICE_JSON.
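As a rough illustration of preparing these secrets, the sketch below generates a random PSK and prints an ICE server list. The helper names gen_psk and ice_json are hypothetical. The JSON uses the standard WebRTC RTCIceServer shape, and it is an assumption that P2POS_WEBRTC_ICE_JSON expects exactly that shape and that the PSK may be a hex string, so check both against your node configuration.

```shell
# Hypothetical helper: 32 random bytes, hex-encoded (64 characters),
# for use as a shared P2POS_REPLICATE_PSK. Hex encoding is an assumption.
gen_psk() {
  od -An -v -N32 -tx1 /dev/urandom | tr -d ' \n'
}

# Hypothetical helper: print a STUN + TURN server list for one host on
# port 3478 in the standard RTCIceServer shape (assumed, not confirmed).
# Values are not JSON-escaped, so avoid quotes and backslashes in them.
ice_json() {
  host="$1"; user="$2"; pass="$3"
  printf '[{"urls":["stun:%s:3478"]},{"urls":["turn:%s:3478"],"username":"%s","credential":"%s"}]\n' \
    "$host" "$host" "$user" "$pass"
}

# Example use (turn.example.org, alice and s3cret are placeholders):
# export P2POS_REPLICATE_PSK="$(gen_psk)"
# export P2POS_WEBRTC_ICE_JSON="$(ice_json turn.example.org alice s3cret)"
```

The same TURN user and password must also appear in turnserver.conf, otherwise nodes will fail TURN authentication even though the JSON is well formed.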

Path A — All-in-one (developer / single host)

From repo root:

./scripts/e2e-docker-up.sh
# or:  cd docker/e2e && docker compose up --build -d

Open http://localhost:9080. Signaling is at ws://localhost:8090 and TURN at localhost:3478 (user/password e2e in the default E2E config).
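Containers can take a moment to start after compose returns. A minimal sketch of waiting for the UI before opening the browser follows. The helper retry_until is hypothetical, and the example assumes curl is installed and that port 9080 answers plain HTTP as described above.

```shell
# Hypothetical helper: run a command until it succeeds.
# Usage: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt fails.
retry_until() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: poll the vault UI for up to about a minute.
# retry_until 30 2 curl -fsS -o /dev/null http://localhost:9080
```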

Path B — Split: operator stack, then family nodes

Use this path when the operator VPS is distinct from the PCs running family nodes. The steps below run both stacks on the same Docker host. On different machines, publish the operator ports on the VPS and point P2POS_SIGNALING_WS_URL and the browser ICE settings at that host's DNS name or IP instead of signal / localhost.

1) Operator (terminal 1, docker/e2e/)

cd docker/e2e
docker compose -f docker-compose.operator-infra.yml up --build -d

This creates the Docker network sover_operator_backbone and the containers signal and coturn, which are resolvable by those names on that network.

2) Family nodes + UI (terminal 2, docker/e2e/)

cd docker/e2e
docker compose -f docker-compose.family-nodes.yml up --build -d

This uses the volumes family-node-a-data and family-node-b-data (separate from the all-in-one e2e_* volumes).

3) Browser

Open http://localhost:9080, the same as Path A, provided ports 8090 and 9080 are published on your machine.

If the browser runs on another machine, set P2POS_BROWSER_SIGNALING_WS_URL and P2POS_BROWSER_WEBRTC_ICE_JSON on node-a (the source of GET /v1/nodes) to your operator’s reachable WebSocket and STUN/TURN endpoints (public hostname or IP). Then either rebuild vault-ui with a matching VITE_SIGNALING_WS_URL or rely on the API response.
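To keep the signaling URL consistent across the node, the browser setting, and the UI build, one option is to derive all three from a single operator host. The sketch below prints env-file style lines. The helper operator_env and the host signal.example.org are hypothetical, and how these lines reach your containers (env file, compose overrides, shell exports) depends on your setup.

```shell
# Hypothetical helper: print signaling settings for a remote operator host.
# Usage: operator_env HOST [PORT]   (PORT defaults to 8090)
# Uses ws://, matching the Milestone 1 demos; no TLS is assumed.
operator_env() {
  host="$1"
  port="${2:-8090}"
  url="ws://${host}:${port}"
  printf 'P2POS_SIGNALING_WS_URL=%s\n' "$url"
  printf 'P2POS_BROWSER_SIGNALING_WS_URL=%s\n' "$url"
  printf 'VITE_SIGNALING_WS_URL=%s\n' "$url"
}

# Example (placeholder host):
# operator_env signal.example.org
```

The ICE settings are left out here on purpose, since they also carry TURN credentials and are better managed alongside the other secrets.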

Replication after a peer was down

Jobs that exhaust per-attempt retries become failed. The node’s replication worker automatically requeues failed jobs whose updated_at is older than P2POS_REP_FAILED_RETRY_AFTER_SECS (default 120 seconds) back to pending, so when the peer returns, sync can complete without editing SQLite.
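The requeue rule can be read as a simple predicate. The sketch below is not the node's implementation; it only restates the rule, assuming updated_at and the current time are Unix epoch seconds and that "older than" means strictly greater than the threshold. The function name should_requeue is hypothetical.

```shell
# Hypothetical predicate: succeeds (exit 0) when a job would be requeued.
# Usage: should_requeue STATUS UPDATED_AT NOW
# UPDATED_AT and NOW are assumed to be Unix epoch seconds.
should_requeue() {
  status="$1"; updated_at="$2"; now="$3"
  # Default of 120 seconds, as stated for P2POS_REP_FAILED_RETRY_AFTER_SECS.
  retry_after="${P2POS_REP_FAILED_RETRY_AFTER_SECS:-120}"
  [ "$status" = "failed" ] && [ $((now - updated_at)) -gt "$retry_after" ]
}

# Example: a job that failed 300 seconds ago is eligible.
# should_requeue failed 1000 1300 && echo "back to pending"
```

Lowering P2POS_REP_FAILED_RETRY_AFTER_SECS shortens the wait before a returning peer catches up, at the cost of more retry attempts while the peer is still down.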

Automated tests

  • Rust: cargo test -p p2pos-node (includes requeue_stale_failed_replication_jobs behavior).
  • E2E (HTTP vault, CI): docker/e2e with docker-compose.ci.yml + Playwright.
  • WebRTC stack smoke: ./scripts/verify-webrtc-e2e.sh (brings up default compose and hits /health).

TLS

Milestone 1 demos use ws:// and http://. For production-shaped operator deploys, terminate WSS and HTTPS at a reverse proxy in front of p2pos-signal and the vault UI, and use TURNS on 5349 as needed (coturn supports it).