
P2POS — Sovereign Family Photo Vault (MVP Architecture & Execution Plan)

Status: Phases A–D are implemented in-repo (peer registry, replication queue, HTTP PeerTransport in p2pos-replication, internal replicate HMAC, vault-web nodes UI); Phases E+ pending.
Stack: Rust (core + node runtime), React + TypeScript (web UX).
Shape: modular monolith, filesystem-backed encrypted blobs, explicit trusted-node replication.


1. Why the sovereign family photo vault is the best first stone

  • Emotional + clear: Everyone understands “our photos, our keys, our houses”—no blockchain or currency narrative required.
  • Exercises the right primitives: Identity (who is “family”), trust groups (which nodes), encrypted blobs (photos), replication (two houses), policies (who can read/write where), visibility (where copies live)—all map 1:1 to the long-term substrate.
  • Honest sovereignty story: Correctness does not depend on a central SaaS; a central host is optional for convenience only.
  • Small team–friendly: No LLM, no consensus research, no mobile store—ship a web UI + two node processes a technical family can run.
  • Wedge without painting into a corner: Albums/photos stay in an app layer; the core stays generic (objects, capabilities, nodes, policies).

2. Strict MVP

In scope

  • One family (single trust group) with 2–3 trusted nodes (e.g. home NAS + VPS + laptop).
  • Albums and photo objects (metadata + encrypted blob reference).
  • Upload from browser: client encrypts file; server/node stores ciphertext only.
  • Download/view in browser: decrypt with keys held by authorized members (see trust model).
  • Replication: user-configured targets; push-style sync between nodes (no global consensus).
  • Visibility UI: per-object or per-album “where is this stored?” and replication status (pending / ok / failed).
  • Enrollment: technical flow (token + URL, or paste public key)—no consumer onboarding polish. Product direction: “install + QR” with no home NAT/VPN setup, via libp2p (relay + hole punch)—see §12; not required for the first technical demo.
  • Single modular monolith binary for the “node” plus a separate vault web app (could be static + API to any node).

Explicit MVP success: A demo where Alice uploads at Node A, sees replication complete to Node B, opens the UI pointed at Node B, and sees the same album/photo without any third-party cloud being on the critical path.


3. Explicitly out of scope (MVP)

  • Currency / value / tokens.
  • Local or hosted LLMs; arbitrary remote execution.
  • Android node runtime, browser SDK package (beyond ad-hoc fetch in the web app).
  • Consumer phone onboarding, push notifications, app store.
  • Community / multi-tenant “cloud” product.
  • Byzantine consensus, leader election across untrusted parties, complex CRDT graphs.
  • Microservices mesh, Kafka, etc.
  • Fine-grained photo editing, EXIF stripping policy, large-scale search—unless trivial.

4. Overall architecture (three surfaces, one repo)

┌─────────────────────────────────────────────────────────────┐
│  Layer 3 — UX: vault-web (React + TS)                       │
│  Talks to one node’s HTTP API; holds client keys in memory  │
└────────────────────────────┬────────────────────────────────┘
                             │ HTTPS + JSON (+ optional SSE)
┌────────────────────────────▼────────────────────────────────┐
│  Layer 2 — App: family-vault                                │
│  Albums, photos, family membership projections, app policies│
└────────────────────────────┬────────────────────────────────┘
                             │ in-process calls
┌────────────────────────────▼────────────────────────────────┐
│  Layer 1 — P2POS core (substrate)                           │
│  Identity, nodes, trust groups, blob store, replication,    │
│  policy engine (embryo), attestations (signed manifests)    │
└─────────────────────────────────────────────────────────────┘

Deployment MVP: Each node runs the same p2pos-node process (Rust monolith). The web app is built as static assets; during demo you either proxy to one node or configure API base URL per deployment.

Critical rule: The family vault crate depends on substrate traits/types; substrate must not import vault types.


Security audit (implemented milestones A–D)

This section records a design-time security review of the code and configuration as shipped through Phase D. It is not a formal penetration test, compliance audit, or cryptographic proof. Use it to decide what is safe to expose and what must change before broader deployment.

Milestone coverage vs security goals

| Goal (from architecture) | A–D status | Notes |
| --- | --- | --- |
| Server stores ciphertext only for file payloads | Met for blob bytes | Node persists opaque octets under blobs/; no server-side decrypt. |
| Client-side encryption for photos | Met (web) | AES-GCM in browser; IV prepended to ciphertext (see vault-web crypto module). |
| Identity tied to keys | Partial | Ed25519 proves possession of a key at login only. The session token is not bound to that public key on later requests (see gaps). |
| Trusted-node replication | Partial | HMAC with shared P2POS_REPLICATE_PSK proves knowledge of the family secret, not per-node identity. |
| Policy-based read (who may fetch which blob) | Not implemented | Any valid session can GET /v1/blobs/{id} if the blob exists on that node (Phase E target). |
| No mandatory central cloud | Met for architecture | Operators self-host nodes; security then depends on how they expose them. |

Controls that exist today

  1. Blob confidentiality from the node (application layer)
    File content is encrypted before upload; the node stores and replicates ciphertext only. Compromise of the DB does not yield photo plaintext without client keys.

  2. Login proof-of-key (Ed25519)
    /v1/auth/challenge + /v1/auth/verify: client signs a fresh nonce. Invalid signatures are rejected.

  3. Opaque session token
    After verify, API routes (except /health and internal replicate) require Authorization: Bearer <token>; token is a random UUID-derived hex string, not a JWT.

  4. Inter-node replicate authenticity (symmetric)
    POST /internal/v1/replicate/{blob_id} requires header X-P2POS-Replicate-Signature: HMAC-SHA256 over blob_id (UTF-8) || body. Verification uses constant-time hex compare. Only parties with P2POS_REPLICATE_PSK should be able to ingest blobs this way.

  5. Transport for node→node client
    HttpPeerTransport uses rustls for HTTPS when the peer base_url is https://….

  6. SQLite integrity
    Albums/photos use foreign keys; replication jobs reference peers.
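The constant-time comparison behind control 4 can be sketched as follows. This is a minimal illustration (the function name is hypothetical, and production code would typically use a vetted crate such as `subtle`): the loop XOR-accumulates differences instead of returning at the first mismatched byte, so the comparison time does not leak how much of the signature matched.

```rust
// Constant-time byte comparison, as used conceptually when checking the
// X-P2POS-Replicate-Signature hex digest. Illustrative sketch only.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // length is not secret here
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate; never exit early
    }
    diff == 0
}
```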

Gaps and risks (prioritized for follow-up)

Critical / high

  • Session token is not bound to identity. After verify, the server only checks membership in a session set; it does not record “this token belongs to public key X”. Any party who steals a bearer token (XSS, localStorage scrape, log leak) has the same API access as the user until the node process restarts; there is no revocation path short of a code change. Phase E+ should attach capabilities or a key id to the session.

  • No session expiry or rotation. Tokens live until the process ends (in-memory set). Stolen tokens do not age out.

  • GET /v1/blobs/{id} is not authorization-scoped. There is no check that the caller “owns” or is a member for that blob; album/photo linkage is not enforced on read. Metadata in SQLite is readable with a valid session but blob fetch is effectively “any blob id”.

  • Shared replication PSK. All trusted peers share one secret. Leak of P2POS_REPLICATE_PSK allows forging replicates to any peer that trusts it. There is no per-peer or per-blob capability in the HMAC. Compromise of one node’s config compromises the replication trust model for the whole mesh.

  • Internal replicate endpoint is unauthenticated except HMAC. There is no rate limit; a holder of the PSK can fill disk (DoS). No TLS requirement for http:// peer URLs—ciphertext could be observed or modified on the wire unless operators use HTTPS and trust the path.
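A session shape that would close the first two gaps might look like the sketch below. Names and fields are assumptions, not the in-repo types: the token record is bound to the public key that passed /v1/auth/verify and carries a TTL, so stolen tokens age out and later requests can be authorized per identity.

```rust
use std::time::{Duration, Instant};

// Hypothetical session entry: binds the bearer token to the Ed25519 key
// proven at login and gives it an expiry. Illustrative only.
struct Session {
    identity_pubkey: [u8; 32], // key that signed the login challenge
    issued_at: Instant,
    ttl: Duration,
}

impl Session {
    fn new(identity_pubkey: [u8; 32], ttl: Duration) -> Self {
        Session { identity_pubkey, issued_at: Instant::now(), ttl }
    }

    // Middleware would reject the request when this returns true.
    fn is_expired(&self) -> bool {
        self.issued_at.elapsed() >= self.ttl
    }
}
```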

Medium

  • CORS is allow_origin(Any) on the node. Any script that learns the bearer token (e.g. XSS on the vault origin, or token pasted into a malicious page) can call the API from another origin and read responses, because the server reflects Access-Control-Allow-Origin: *. For production-shaped deployments, use an origin allowlist, httpOnly + Secure cookies (or similar), and short-lived tokens.

  • Default P2POS_REPLICATE_PSK and demo keys must be changed for any real deployment; they are documented defaults.

  • Album/photo metadata is cleartext in SQLite (titles, captions, blob ids). This is consistent with “metadata server” but is not “full secrecy” of everything about the vault.

  • /health is unauthenticated (by design for probes); ensure it reveals nothing sensitive (currently OK).

Lower / operational

  • No rate limiting, request size caps (beyond Axum defaults), or audit logging of security events.

  • Browser key storage: Ed25519 and AES key material in localStorage is convenient for demos and vulnerable to XSS and physical access.

API surface (security-relevant)

| Surface | AuthN | AuthZ / notes |
| --- | --- | --- |
| /v1/auth/challenge, /v1/auth/verify | N/A (verify proves key once) | Nonces are single-use when consumed. |
| /v1/blobs, /v1/albums, /v1/photos, /v1/nodes, /v1/replication/status | Bearer session | Weak authz on blob read; peers writable by any session. |
| /internal/v1/replicate/{id} | HMAC header | Not browser-facing; protect at network layer + strong PSK. |
| /health | None | OK for liveness. |

Relation to libp2p (future)

Phases A–D use HTTP + TLS (optional) + HMAC for replication. libp2p (§12, Phase G / p2pos-net) is planned to improve reachability and eventually peer identity at the transport layer; it does not replace the need for application-layer policies (who may read which blob) unless explicitly designed that way.

Suggested next security milestones

  1. Phase E: Session bound to IdentityId, expiry, and blob read policy (e.g. only if blob referenced by a photo in an album the identity may access—or wrapped-key model).
  2. Replication: Per-peer secrets or signed push tokens; TLS-only peer URLs in strict mode.
  3. Web: httpOnly sessions, strict CORS, CSP to reduce XSS impact.
  4. Ops: Threat model doc per deployment (LAN-only vs internet-facing).

5. Monorepo / repository structure

/
├── Cargo.toml                 # workspace root
├── crates/
│   ├── p2pos-core/            # identity, trust, policy types, crypto helpers
│   ├── p2pos-storage/         # encrypted blob store (filesystem impl)
│   ├── p2pos-replication/     # PeerTransport + HMAC (HTTP push; libp2p later in p2pos-net)
│   ├── p2pos-net/             # (post–first-demo) libp2p: QUIC, relay, hole punch, mDNS
│   ├── p2pos-node/            # HTTP API, wiring, config (binary)
│   └── family-vault/          # domain: albums, photos, vault-specific policies
├── apps/
│   └── vault-web/             # Vite + React + TypeScript
├── docs/
│   └── P2POS_SOVEREIGN_FAMILY_VAULT_ARCHITECTURE.md
└── scripts/                   # demo: docker-compose or two local dirs

Optional later (not MVP): packages/p2pos-client-ts for the browser SDK surface—do not block the MVP on extracting it.


6. Rust workspace / crates / modules

| Crate | Responsibility |
| --- | --- |
| p2pos-core | IdentityId, NodeId, TrustGroupId, Capability / Grant, Policy AST (minimal), SignedEnvelope, serialization contracts, error types. |
| p2pos-storage | BlobStore trait: put, get, delete, list; FsEncryptedBlobStore using per-blob AEAD + wrapped DEK in sidecar or manifest; content addressing optional (hash as id). |
| p2pos-replication | ReplicationTarget, ReplicationJob, retry/backoff, “last known state” per peer; transport trait PeerTransport (HTTP impl in node for dev; libp2p impl in p2pos-net for production reachability—see §12). |
| p2pos-net | (After initial demo.) rust-libp2p: QUIC + Noise, relay + hole punch, mDNS; replication stream protocol; bootstrap/rendezvous config. Mobile: shared via UniFFI in Phase M2 (§12.4). |
| family-vault | Album, Photo, membership, mapping photos → blob ids, vault-level “default replication targets”. |
| p2pos-node | Axum (or Actix) server, routes, auth middleware, SQLite/Redb for indexes and replication queue (blobs stay on disk), startup config; wires PeerTransport. |

Internal module boundaries inside p2pos-node: api/, auth/, config/, app_vault/ (handlers that call family-vault), substrate/ (re-exports wiring). Keep handlers thin.
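The p2pos-storage boundary above can be sketched as a trait plus an in-memory implementation for tests. Signatures are illustrative assumptions (the in-repo trait may be async or use richer error types); the point is that replication and vault code depend only on the trait, never on the filesystem layout.

```rust
use std::collections::BTreeMap;

type BlobId = String;

// Sketch of the BlobStore boundary: put, get, delete, list over ciphertext.
trait BlobStore {
    fn put(&mut self, id: &BlobId, ciphertext: &[u8]) -> std::io::Result<()>;
    fn get(&self, id: &BlobId) -> std::io::Result<Vec<u8>>;
    fn delete(&mut self, id: &BlobId) -> std::io::Result<()>;
    fn list(&self) -> std::io::Result<Vec<BlobId>>;
}

// Minimal in-memory impl, handy for unit-testing replication logic.
struct MemBlobStore {
    blobs: BTreeMap<BlobId, Vec<u8>>,
}

impl BlobStore for MemBlobStore {
    fn put(&mut self, id: &BlobId, ciphertext: &[u8]) -> std::io::Result<()> {
        self.blobs.insert(id.clone(), ciphertext.to_vec());
        Ok(())
    }
    fn get(&self, id: &BlobId) -> std::io::Result<Vec<u8>> {
        self.blobs
            .get(id)
            .cloned()
            .ok_or_else(|| std::io::Error::new(std::io::ErrorKind::NotFound, "no such blob"))
    }
    fn delete(&mut self, id: &BlobId) -> std::io::Result<()> {
        self.blobs.remove(id);
        Ok(())
    }
    fn list(&self) -> std::io::Result<Vec<BlobId>> {
        Ok(self.blobs.keys().cloned().collect())
    }
}
```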


7. Frontend structure (apps/vault-web)

apps/vault-web/
├── src/
│   ├── main.tsx
│   ├── App.tsx
│   ├── api/              # fetch client, types generated or hand-written
│   ├── crypto/           # encrypt/decrypt in Web Crypto (wrap in small module)
│   ├── pages/
│   │   ├── Dashboard.tsx
│   │   ├── Albums.tsx
│   │   ├── AlbumDetail.tsx
│   │   ├── Nodes.tsx
│   │   └── Settings.tsx
│   ├── components/
│   └── hooks/
└── vite.config.ts

Rule: No vault business logic hidden in components—use small hooks/services so a future p2pos-client-ts can lift the same patterns.


8. Core domain model

Substrate (generic)

  • Identity: IdentityId (Ed25519 public key or hash of it).
  • Node: NodeId, base URL, human label, public key for attestations.
  • TrustGroup: set of IdentityId + policy defaults (e.g. “members may read all blobs in group X”).
  • BlobRef: opaque id, size, content hash (of ciphertext), encryption scheme id, wrapped key material pointer.
  • Manifest / attestation: signed statement “this BlobRef is stored on NodeId at time T” (MVP: simple JSON + Ed25519).

Family vault (app)

  • Family ≈ one TrustGroupId for MVP (multi-family is a generalization).
  • Album: id, title, created_at, owner identity, ordered list of PhotoId.
  • Photo: id, album_id, blob_ref, thumbnail_blob_ref (optional second blob), caption, created_at.
  • Membership: which identities belong to the family (admin vs member optional flag).
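The types above might take shapes like the following. Field names and representations are assumptions for illustration (ids as strings, timestamps as unix seconds); the in-repo definitions may differ.

```rust
// Substrate side: identity and blob reference.
struct IdentityId(pub [u8; 32]); // Ed25519 public key (or its hash)

struct BlobRef {
    id: String,           // opaque blob id
    size: u64,
    content_hash: String, // hash of the *ciphertext*, not plaintext
    scheme: String,       // encryption scheme id, e.g. "aes-gcm-256"
}

// App side: family-vault domain objects.
struct Photo {
    id: String,
    album_id: String,
    blob_ref: BlobRef,
    thumbnail_blob_ref: Option<BlobRef>, // optional second blob
    caption: Option<String>,
    created_at: u64, // unix seconds
}

struct Album {
    id: String,
    title: String,
    created_at: u64,
    owner: IdentityId,
    photos: Vec<String>, // ordered list of PhotoId
}
```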

9. Minimal trust / policy model (MVP)

Enrollment

  • Node generates node keypair; operator adds trusted peers by pasting peer URL + peer public key (TOFU).
  • User login MVP: sign a nonce with Ed25519 key in browser (import key from file or generate and download backup)—no passwords required for demo purity.

Policies (embryo)

  • Storage policy: “Blob class photo must exist on at least k of targets = [A,B].”
  • Read policy: “Only identities in TrustGroup may fetch wrapped keys for blobs in album Z.”
  • Execution policy: stub interface only (ExecutionPolicy enum with NoRemoteExecution default).
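A minimal sketch of the policy embryo, assuming illustrative names: the storage policy mirrors “must exist on at least k of targets,” and the execution policy is only the stub with its safe default.

```rust
// Execution policy: stub only, defaulting to "no remote execution".
#[allow(dead_code)]
enum ExecutionPolicy {
    NoRemoteExecution,
}

// Storage policy: "blob class photo must exist on at least k of targets".
struct StoragePolicy {
    min_copies: usize,    // the "k"
    targets: Vec<String>, // NodeIds, e.g. ["A", "B"]
}

impl StoragePolicy {
    // Satisfied when enough of the *listed* targets currently hold the blob.
    fn satisfied_by(&self, present_on: &[String]) -> bool {
        let hits = self
            .targets
            .iter()
            .filter(|t| present_on.contains(t))
            .count();
        hits >= self.min_copies
    }
}
```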

Wrapped keys

  • Per-blob DEK encrypted for each authorized member (NaCl crypto_box or HPKE-style). MVP: encrypt DEK for each IdentityId public key listed on the album.

What you are not solving yet: revocation rotation drama, group key agreement at scale, hardware attestation—document as follow-ups.


10. Storage model

  • Filesystem layout (per node):
    data/blobs/<hex-prefix>/<blob_id> ciphertext
    data/meta/<blob_id>.json encryption metadata + wrapped DEKs (or inline in DB)
  • Small index DB (SQLite): blob presence, album/photo rows, replication queue.
  • Client: never sends plaintext to node; sends ciphertext + metadata. Node stores and replicates ciphertext + metadata only.
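The layout above can be expressed as two small path helpers. This is a sketch under stated assumptions: a 2-character hex prefix for fan-out (the prefix length is not specified above) and hypothetical function names.

```rust
use std::path::PathBuf;

// Ciphertext at data/blobs/<hex-prefix>/<blob_id>.
// Prefix fan-out keeps any single directory from growing unbounded.
fn blob_path(data_dir: &str, blob_id: &str) -> PathBuf {
    let prefix = &blob_id[..2.min(blob_id.len())]; // assumed 2-char prefix
    PathBuf::from(data_dir).join("blobs").join(prefix).join(blob_id)
}

// Encryption metadata + wrapped DEKs at data/meta/<blob_id>.json.
fn meta_path(data_dir: &str, blob_id: &str) -> PathBuf {
    PathBuf::from(data_dir)
        .join("meta")
        .join(format!("{blob_id}.json"))
}
```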

11. Replication model

Semantics: eventual consistency, single-writer per photo object for MVP (last write wins if misused—acceptable for demo if UI avoids concurrent edits).

Mechanism

  1. After put_blob local success, enqueue ReplicationJob { blob_id, targets[] }.
  2. Worker pulls from queue; POST /internal/v1/replicate (mTLS or HMAC with pre-shared node secret for MVP) to peer.
  3. Peer validates trust + optional quota, stores blob, acks.
  4. Originator marks target replicated; UI aggregates node attestations or simple presence map.
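The retry step in the worker loop can be sketched as capped exponential backoff. The shape of ReplicationJob and the exact schedule (1s doubling, capped at 60s) are illustrative assumptions, not the in-repo policy.

```rust
// Sketch of a replication job and its retry delay.
#[allow(dead_code)]
struct ReplicationJob {
    blob_id: String,
    target: String, // NodeId of the peer to push to
    attempts: u32,  // failed attempts so far
}

// Delay before the next attempt: 1s, 2s, 4s, ... capped at 60s.
fn backoff_ms(attempts: u32) -> u64 {
    let base: u64 = 1_000;
    base.saturating_mul(1u64 << attempts.min(6)) // bound the shift
        .min(60_000)
}
```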

Shortcuts OK: periodic full reconcile (list blobs, diff); no Merkle sync required for MVP.

Dangerous shortcut to avoid: requiring S3 or one global database for correctness.

Reachability (product goal, post–first-demo): Plain inbound HTTP between two home NATs forces port forwarding, VPN, or tunneling—bad for “install app + scan QR.” The intended evolution is PeerTransport backed by libp2p (see §12): outbound-first, relay + hole punching, no user router configuration.


12. Reachability and P2P transport (libp2p, QR, no home NAT setup)

Problem: Two houses behind typical NAT cannot accept arbitrary inbound connections from the public internet without port forwarding, VPN, or equivalent. Requiring users to configure routers violates the super-simple onboarding goal (install + QR).

Principle: Move complexity off the router and into protocol + optional infrastructure you operate (or the family self-hosts). Users only dial out; home networks stay default-deny inbound.

12.1 libp2p as the default Rust stack

Use rust-libp2p (or a compatible higher-level Rust stack evaluated later, e.g. Iroh) as the cross-node replication transport for both home nodes and mobile—same codecs and security assumptions, different power/availability profiles.

Recommended transport stack (opinionated):

| Layer | Choice | Rationale |
| --- | --- | --- |
| Transport | QUIC primary, TCP fallback | Good NAT behavior; ubiquitous on modern networks. |
| Security | Noise (libp2p standard) | Consistent with “identity = keys”; composes with app-layer trust. |
| NAT / reachability | Circuit Relay v2 on all nodes | Mobile and NAT’d home nodes do not rely on public listen addrs. |
| Direct path | Hole punching (e.g. DCUtR) | Upgrade relay → direct when possible (latency + cost). |
| LAN | mDNS discovery | Same WiFi: phone ↔ home box without internet path or manual IPs. |
| Internet | Bootstrap + optional Rendezvous | Small always-on hints (DNS name in QR); not “a sync SaaS,” just routing. |

Honest infrastructure: you need at least one of a cheap VPS (bootstrap + rendezvous + relay), an always-on home node with an outbound tunnel to a stable name, or community relays; otherwise cross-house sync is unreliable. This is about reachability, not the correctness of E2EE: the ciphertext and key model stays as in §9–§10, and relays must never require plaintext.

12.2 Node roles: home server vs mobile

| Role | Expectations | libp2p posture |
| --- | --- | --- |
| Home node | Always-on, large disk, stable power | Runs relay client; may run relay server or bootstrap for the family if exposed via tunnel/DNS; strong replication source. |
| Mobile | Intermittent, background limits, no public IP | Relay + hole punch only; dial-out; prefer mDNS when on home LAN. |

12.3 QR code payload (keep the QR small)

Do not embed long multiaddr lists in the QR. Prefer:

  • Short-lived join / pairing token (app-layer, signed or redeemable once).
  • Family or trust-group id (opaque).
  • Bootstrap hostname (stable DNS, e.g. bootstrap.example.org) resolving to current A/AAAA records so printed QRs do not go stale when IPs move.
  • Optional: rendezvous namespace or ticket for libp2p Rendezvous (or your thin coordinator).

The app uses the token + bootstrap/rendezvous to register and learn trusted peers’ PeerId / addresses out-of-band or via signed announcements—exact wire format is implementation detail; UX stays scan → joined.
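The payload fields above could be packed into a short URI, for example. Everything here is hypothetical illustration: the `p2pos://` scheme, the query parameter names, and the struct fields are not a wire format, only a demonstration that the QR stays small when it carries a token, an opaque id, and a stable hostname.

```rust
// Hypothetical compact QR payload (illustrative, not a spec).
struct JoinPayload {
    token: String,     // short-lived, redeemable once
    family: String,    // opaque trust-group id
    bootstrap: String, // stable DNS name resolving to current records
}

fn encode_qr(p: &JoinPayload) -> String {
    // A real encoder would percent-escape values; omitted for brevity.
    format!("p2pos://join?tok={}&fam={}&bs={}", p.token, p.family, p.bootstrap)
}
```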

12.4 Two-phase mobile plan (reduce schedule risk)

  1. Phase M1 — Asymmetric (faster): Mobile talks HTTP/WebSocket to the home node on LAN (or via tunnel URL); only home nodes run full libp2p to each other. Delivers “both platforms” quickly; weaker as a purity demo (mobile not a full peer).
  2. Phase M2 — Symmetric: Shared Rust crate p2pos-net (libp2p) behind a thin API; expose to iOS/Android via UniFFI (or equivalent). Mobile becomes a first-class libp2p peer (still relay-first, no user NAT).

Crate suggestion: add crates/p2pos-net/ (libp2p, discovery, relay, replication stream protocol over libp2p) implementing PeerTransport used by p2pos-replication. Keep HTTP transport as optional for dev/tests.

12.5 Sovereignty wording (stays coherent)

  • No user VPN / port forwarding is a product requirement.
  • Default bootstrap / rendezvous / relay is optional infrastructure for reachability; families can self-host those services or rely on an always-on home node + stable name.
  • Correctness of “our keys, our ciphertext” does not require trusting relay content—only availability of a path (relay sees ciphertext in motion if it must relay application bytes; design replication framing so application payload stays encrypted end-to-end as already specified).

13. Backend API (minimal)

Auth: Authorization: Bearer <session> where session is established after signature on /v1/auth/challenge.
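The single-use nonce bookkeeping behind the challenge/verify pair can be sketched as below. Names are illustrative, and a real store should also expire stale nonces; the key property is that verify consumes the nonce, so a replayed signature over the same nonce is rejected.

```rust
use std::collections::HashSet;

// Sketch of single-use nonce tracking for /v1/auth/challenge + /v1/auth/verify.
struct NonceStore {
    outstanding: HashSet<String>,
}

impl NonceStore {
    // /v1/auth/challenge: issue a fresh nonce for the client to sign.
    fn issue(&mut self, nonce: String) {
        self.outstanding.insert(nonce);
    }

    // /v1/auth/verify: returns true exactly once per issued nonce.
    fn consume(&mut self, nonce: &str) -> bool {
        self.outstanding.remove(nonce)
    }
}
```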

Vault

  • GET /v1/albums — list
  • POST /v1/albums — create
  • GET /v1/albums/:id — detail + photos
  • POST /v1/photos — register metadata + BlobRef (after upload)
  • POST /v1/blobs — upload ciphertext (multipart or base64 JSON for small demo)
  • GET /v1/blobs/:id — download ciphertext

Substrate / ops

  • GET /v1/nodes — this node + known peers
  • POST /v1/nodes/peers — add trusted peer
  • GET /v1/replication/status — aggregate queue + per-blob status

Internal (peer-to-peer)

  • POST /internal/v1/replicate/:id — ingest blob from trusted peer

All routes versioned under /v1 to preserve a stable path for a future browser SDK.


14. Frontend pages & major flows

| Page | Purpose |
| --- | --- |
| Dashboard | Family name, health: this node, peer count, replication backlog. |
| Albums | List/create albums. |
| Album detail | Grid of thumbnails; upload flow (encrypt → upload blob → register photo). |
| Nodes | Trusted nodes list, add peer, see last sync, storage location summary. |
| Settings | Key backup/download, API base URL, dark mode (optional). |

Flows: Create album → upload photos → open Nodes page → see two green checkmarks for replication → switch API URL to second node → album still visible.


15. Compelling demo scenario (script)

Cast: “River family”—two houses (Node Oak and Node Pine) and optional Cedar (cheap VPS) for off-site ciphertext.

  1. On Oak, create album “Summer 2026”, upload 5 photos; UI shows blobs only on Oak.
  2. Add Pine as trusted peer; replication jobs run; UI shows Oak ✓ Pine ✓.
  3. Disconnect Oak from network (or stop process); point browser at Pine; family opens same album; photos decrypt and display.
  4. Show Settings or Nodes copy: “No account on our servers—only keys and nodes you chose.”

16. Phased implementation plan

  1. Phase A — Skeleton: workspace, p2pos-node “hello”, vault-web shell, health endpoint.
  2. Phase B — Crypto path: browser encrypt/decrypt; node blob put/get; SQLite indexes.
  3. Phase C — Vault domain: albums/photos CRUD backed by DB + blob refs.
  4. Phase D — Replication: peer add, job queue, internal replicate endpoint (HTTP PeerTransport for dev), status UI.
  5. Phase E — Trust/policy embryo: signed challenges, wrapped DEKs per member, minimal policy checks.
  6. Phase F — Demo hardening: docker-compose, scripted reset, README runbook.
  7. Phase G — libp2p transport (post-MVP wedge toward QR onboarding): p2pos-net with QUIC + relay + hole punch + mDNS; pluggable bootstrap/rendezvous; Phase M1 mobile (HTTP to home) optional ahead of Phase M2 (UniFFI + full mobile peer).

17. Decisions that should stay stable (long-term)

  • Layering: substrate crates vs app crate vs UX app.
  • Blob abstraction: BlobStore trait + content ids.
  • Transport boundary: PeerTransport with multiple implementations (HTTP for dev/tests; libp2p for production reachability); replication logic must not assume HTTP-only.
  • API versioning: /v1/... for public HTTP.
  • Identity as keys: capabilities tied to cryptographic identity, not emails; align libp2p PeerId / keys with substrate IdentityId (explicit mapping layer).
  • Local-first truth: each node authoritative for what it has stored; sync merges via explicit protocols.
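The transport-boundary decision above amounts to a small trait. The signature below is an illustrative sketch (the in-repo PeerTransport is likely async and richer); the recording implementation shows why the boundary pays off: replication logic can be tested with no HTTP or libp2p at all.

```rust
use std::cell::RefCell;

// Sketch of the transport boundary: replication logic depends on this
// trait; HTTP and libp2p are interchangeable implementations behind it.
trait PeerTransport {
    // Push one ciphertext blob to a peer; Ok(()) on ack.
    fn push_blob(&self, peer: &str, blob_id: &str, ciphertext: &[u8]) -> Result<(), String>;
}

// Loopback impl for tests: records what would have been sent.
struct RecordingTransport {
    sent: RefCell<Vec<(String, String)>>, // (peer, blob_id)
}

impl PeerTransport for RecordingTransport {
    fn push_blob(&self, peer: &str, blob_id: &str, _ciphertext: &[u8]) -> Result<(), String> {
        self.sent.borrow_mut().push((peer.to_string(), blob_id.to_string()));
        Ok(())
    }
}
```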

18. Acceptable MVP shortcuts

  • TOFU peer enrollment; PSK or single shared secret between nodes.
  • Single global family per deployment.
  • SQLite only; no HA database.
  • Last-write-wins on metadata.
  • Full blob re-upload on conflict detection.
  • Session tokens stored in localStorage (demo only; document httpOnly cookie path for production).

19. Dangerous shortcuts (undermine the vision)

  • Putting album/photo tables or concepts inside p2pos-core (contaminates substrate).
  • Central object store as the only source of truth (kills sovereignty story).
  • Server-side decryption for convenience (kills privacy-by-default).
  • Hard-coding one cloud provider in core storage.
  • Monolithic frontend that bakes node URLs and vault API without a thin client boundary.
  • Implicit trust (any peer can pull any blob) without policy hooks—prevents future multi-app substrate.

p2pos/
├── Cargo.toml
├── README.md
├── docs/
│   └── P2POS_SOVEREIGN_FAMILY_VAULT_ARCHITECTURE.md
├── crates/
│   ├── p2pos-core/
│   ├── p2pos-storage/
│   ├── p2pos-replication/
│   ├── p2pos-net/
│   ├── family-vault/
│   └── p2pos-node/
├── apps/
│   └── vault-web/
│       ├── package.json
│       ├── vite.config.ts
│       └── src/
└── scripts/
    └── demo-two-nodes.sh

8-week milestone plan

| Week | Milestone |
| --- | --- |
| 1 | Rust workspace + p2pos-node binary + health + config; Vite/React app with API client stub. |
| 2 | BlobStore + filesystem impl; upload/download ciphertext; minimal auth challenge. |
| 3 | SQLite schema for albums/photos; vault CRUD API; list UI. |
| 4 | Browser encrypt/decrypt + wrapped DEK for one user; wire upload pipeline. |
| 5 | Peer registry + replication job + internal endpoint; second node in docker-compose. |
| 6 | Replication status in API + Nodes page; basic backoff/retries. |
| 7 | Trust group embryo: multiple identities, DEK wrap for each; policy check on download. |
| 8 | Demo script, polish, failure modes (offline node), documentation; freeze MVP scope. |

After week 8 (product track): Phase G / M1–M2: p2pos-net (libp2p), default bootstrap+relay, QR enrollment; mobile first HTTP-to-home (M1), then UniFFI full peer (M2). See §12.


Exact next Cursor prompt (Phase E onward)

Phases A–D are in the repo. Copy and use as your next message:

Implement Phase E from docs/P2POS_SOVEREIGN_FAMILY_VAULT_ARCHITECTURE.md: tighten auth (optional session expiry), wrapped DEKs per album member on the client, minimal server policy check on blob fetch, and tests. Keep layering and ciphertext-only storage.


End of architecture document.