Sealed Identity: Engineering Spike Plan

Companion brief: docs/privacy/SEALED_IDENTITY.md. Status: Plan only. Stage A phase 1–2 implementation has landed (PR #479). Counsel-track work runs in parallel with Stage B implementation, not as a gate; see brief §6 for the reframe.

Goal

Produce a concrete implementation plan (with cost estimate, schema migration, test plan, and rollout) for the two-stage sealed-identity architecture before any code is written. This is a planning artifact, not an implementation task.

Stage A: Hash the phone

Premise: plaintext users/{userId}.phone is the largest plain-PII surface in the auth path. Hashing it with a KMS-held pepper is independently valuable and also a prerequisite for Stage B.

A1. Schema diff

| Field | Today | After A |
| --- | --- | --- |
| users/{userId}.phone | E.164 plaintext | removed (or kept with a short TTL during the migration window) |
| users/{userId}.phoneHash | (none) | HMAC-SHA256(KMS_pepper, e164(phone)), hex-encoded |
| Index | phone | phoneHash (composite with role if needed) |
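
The phoneHash definition in the table can be sketched as follows. This is a minimal illustration, not the landed implementation: it computes the HMAC locally with Node's crypto module, whereas the plan calls for the pepper to stay inside KMS (or Secret Manager). The toE164 normalizer and its default country code are hypothetical stand-ins for whatever normalization the auth service actually uses.

```javascript
const { createHmac } = require("node:crypto");

// Hypothetical normalizer (assumption): strips formatting characters and
// prepends a default country code when the input has no leading "+".
// A production service would likely use a full phone-parsing library.
function toE164(raw, defaultCountryCode = "1") {
  const trimmed = String(raw).trim();
  const digits = trimmed.replace(/\D/g, "");
  if (digits.length === 0) throw new Error("empty phone number");
  return trimmed.startsWith("+")
    ? `+${digits}`
    : `+${defaultCountryCode}${digits}`;
}

// HMAC-SHA256(pepper, e164(phone)), hex-encoded, as in the schema table.
// Here the pepper is passed in directly; in the plan it never leaves the server.
function phoneHash(pepper, rawPhone) {
  return createHmac("sha256", pepper)
    .update(toE164(rawPhone), "utf8")
    .digest("hex");
}
```

Because the hash is taken over the normalized form, formatting variants of the same number map to the same phoneHash, which is what the index lookup relies on.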

A2. Code touch list

  • services/api/auth/src/routes/phone.js:42 - replace where('phone', ...) with where('phoneHash', ...). Compute hash server-side from submitted phone (server holds pepper).
  • packages/shared/encryption - add phoneHash(e164) helper that calls a server-side KMS API (web client never sees the pepper).
  • Signup paths writing users/{userId}.phone - write phoneHash instead. Audit all callers via grep -r "phone:" services/ apps/ | grep users.
  • Firestore indexes - add phoneHash index, drop phone index after migration window.
  • Firestore security rules - disallow client reads of phoneHash (server-only).

A3. KMS pepper

  • Store as a Cloud KMS MAC key (or, in the meantime, a Cloud Secret Manager secret accessed by the function; see Stage A phase 1 implementation). Auth service uses a service account with cloudkms.cryptoKeyVersions.useToSignMac (compute HMACs via macSign) and cloudkms.cryptoKeyVersions.useToVerifyMac (verify via macVerify). No decrypt rights, ever.
  • Rotation: dual-pepper window. Login lookup tries the new pepper first and falls back to the old one; on a match under the old pepper, it lazily re-hashes the row. After 90 days, drop the old pepper from the accepted set.
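
The dual-pepper fallback might look roughly like the sketch below. It assumes the peppers are available as an ordered list (newest first) and uses an in-memory Map in place of the Firestore phoneHash query; the function names and the Map-based store are illustrative assumptions, not existing code.

```javascript
const { createHmac } = require("node:crypto");

const hmacHex = (pepper, e164) =>
  createHmac("sha256", pepper).update(e164, "utf8").digest("hex");

// peppers: accepted peppers, newest first, e.g. [newPepper, oldPepper].
// index:   Map<phoneHash, userDoc>, a stand-in for the phoneHash query.
function lookupWithRotation(e164, peppers, index) {
  for (let i = 0; i < peppers.length; i += 1) {
    const hash = hmacHex(peppers[i], e164);
    const doc = index.get(hash);
    if (!doc) continue;
    if (i === 0) return { doc, rehashed: false };

    // Matched under an older pepper: lazily re-key the row under the newest.
    const newHash = hmacHex(peppers[0], e164);
    const updated = { ...doc, phoneHash: newHash };
    index.delete(hash);
    index.set(newHash, updated);
    return { doc: updated, rehashed: true };
  }
  return { doc: null, rehashed: false };
}
```

Dropping the old pepper after the 90-day window then amounts to removing it from the peppers list; rows that were never re-hashed simply stop matching.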

A4. Migration

  • One-shot Cloud Run job iterating users collection, computing phoneHash for each, writing alongside phone. Use Firestore batched writes; estimate cost.
  • Cutover: switch read path to phoneHash query. Verify duplicate-detection (limit(5) + role-preference logic in phone.js:48–55) still works.
  • Removal of plaintext phone: gated on a clean week of green metrics post-cutover.
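
One way to approach the backfill and its cost estimate is to plan the writes before committing anything. The sketch below is an assumption-laden illustration: planBackfill, the default batch size of 400, and the in-memory users array are all hypothetical, and it assumes stored phone values are already in E.164 form. The returned read and write counts can be multiplied by current Firestore pricing to get the estimate.

```javascript
const { createHmac } = require("node:crypto");

// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// users: array of { id, phone?, phoneHash? } read from the users collection.
// Returns the planned updates grouped into batches, plus operation counts.
// batchSize is an assumption; tune it to the batched-write limits in force.
function planBackfill(users, pepper, batchSize = 400) {
  const updates = users
    .filter((u) => u.phone && !u.phoneHash) // skip rows already migrated
    .map((u) => ({
      id: u.id,
      phoneHash: createHmac("sha256", pepper)
        .update(u.phone, "utf8")
        .digest("hex"),
    }));
  return {
    reads: users.length,
    writes: updates.length,
    batches: chunk(updates, batchSize),
  };
}
```

Running the planner against a snapshot copy gives the dry-run numbers without touching production; the job then commits each batch in turn, and re-running it is safe because already-hashed rows are filtered out.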

A5. Testing

  • Unit: phoneHash determinism, pepper-rotation fallback path, normalization edge cases (US/intl, formatting variants).
  • Integration: full login flow end-to-end against the Firebase emulator with phoneHash-only data.
  • Migration: dry-run on a snapshot copy of prod users; verify row counts and zero collision warnings.
  • Security: confirm users.where('phone', ...) queries return zero results post-cutover (i.e. nobody bypassed the change).

A6. Rollout

Behind feature flag auth.phoneHash.enabled. Phased:

  1. Dual-write (write both phone and phoneHash, keep reading phone)
  2. Migrate existing rows
  3. Switch reads to phoneHash
  4. Stop writing phone
  5. Drop phone field + index

Each step is independently reversible until step 5.
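
The phases can be read as a small table of read/write behaviors keyed by rollout step. The sketch below is one possible encoding, assuming a numeric phase value is stored behind the auth.phoneHash.enabled flag; phase 0 (flag off) and all function names are assumptions added for illustration.

```javascript
// Behavior per rollout phase. Phase 0 (flag disabled) is an assumption.
const PHASES = Object.freeze({
  0: { writePhone: true, writePhoneHash: false, readField: "phone" },
  1: { writePhone: true, writePhoneHash: true, readField: "phone" }, // dual-write
  2: { writePhone: true, writePhoneHash: true, readField: "phone" }, // backfill runs
  3: { writePhone: true, writePhoneHash: true, readField: "phoneHash" },
  4: { writePhone: false, writePhoneHash: true, readField: "phoneHash" },
  5: { writePhone: false, writePhoneHash: true, readField: "phoneHash" }, // field + index dropped
});

function phaseConfig(phase) {
  const config = PHASES[phase];
  if (!config) throw new RangeError(`unknown rollout phase: ${phase}`);
  return config;
}

// Fields a signup should write to users/{userId} in the given phase.
function buildUserWrite(phase, e164, hash) {
  const { writePhone, writePhoneHash } = phaseConfig(phase);
  const doc = {};
  if (writePhone) doc.phone = e164;
  if (writePhoneHash) doc.phoneHash = hash;
  return doc;
}

// Field the login lookup should query in the given phase.
const readFieldFor = (phase) => phaseConfig(phase).readField;

// Steps before 5 can be rolled back; step 5 deletes the plaintext.
const canRollBack = (phase) => phase < 5;
```

Keeping the behavior in one table makes a rollback a matter of lowering the phase value rather than redeploying code.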

A7. Estimate

Roughly one engineering sprint (~2 weeks) including migration job and rollout. Does not require counsel review: Stage A is a strict improvement over v1 with no semantic change to user-visible behavior.

Stage B: Seal the userId

Sequencing: ships immediately after Stage A. Counsel-track work (privacy policy audit, subpoena playbook, DPIA) runs in parallel and does not gate implementation. See brief §6.

B1. Schema diff (delta from Stage A)

| Field | After A | After B |
| --- | --- | --- |
| users/{userId} doc id | userId (Firebase Auth UID) | unchanged |
| users/{userId}.phoneHash | present | removed from the user doc; the hash becomes the document key in a separate auth_lookup/{phoneHash} collection (lookup goes the other way only) |
| auth_lookup/{phoneHash} | (none) | { encryptedUserIdBlob, phoneSalt, encryptedSeed, authProofHash, createdAt } |

The userId in users/{userId} doc id remains opaque (UUID, not derived from phone). The auth_lookup collection's documents are keyed by phoneHash and contain only sealed material.

B2. Login flow

1. POST /auth/phone/lookup { phone }
   → server: phoneHash = hmac(pepper, e164(phone))
   → server: read auth_lookup/{phoneHash}
   → server: 200 { phoneSalt, encryptedSeed, encryptedUserIdBlob, authProofHash }
            (note: no userId)
2. Client decrypts encryptedSeed with phone+PIN → entropy
3. Client derives blob_key = HKDF(entropy, "lantern-userid-blob-v1")
4. Client decrypts encryptedUserIdBlob with blob_key → userId
5. Client computes proofHmac = HMAC-SHA256(entropy, "lantern-auth-proof-v1")
6. POST /auth/phone/token { userId, proofHmac } as today

Server-side: step 1 produces a Firestore read of auth_lookup/{phoneHash} returning ciphertext. There is no path that yields phoneHash → userId as a single observable resolution.
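
Steps 3 to 5 of the flow, plus the matching signup-side seal, can be sketched with Node's crypto module. Several details here are assumptions because the plan does not fix them: HKDF uses SHA-256 with an empty salt, the blob cipher is AES-256-GCM, and the blob layout is iv, then auth tag, then ciphertext, base64-encoded. The web client would presumably use WebCrypto rather than Node APIs; step 2 (recovering entropy from phone+PIN) is omitted.

```javascript
const {
  hkdfSync, createHmac, createCipheriv, createDecipheriv, randomBytes,
} = require("node:crypto");

// Step 3: blob_key = HKDF(entropy, "lantern-userid-blob-v1").
// SHA-256, empty salt, and a 32-byte output are assumptions.
function deriveBlobKey(entropy) {
  return Buffer.from(
    hkdfSync("sha256", entropy, Buffer.alloc(0), "lantern-userid-blob-v1", 32)
  );
}

// Signup side: seal the userId. Layout (assumed): iv(12) | tag(16) | ciphertext.
function sealUserId(entropy, userId) {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveBlobKey(entropy), iv);
  const ct = Buffer.concat([cipher.update(userId, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ct]).toString("base64");
}

// Step 4: decrypt encryptedUserIdBlob. Throws if the entropy (and therefore
// the PIN) is wrong, because GCM tag verification fails.
function openUserId(entropy, blobB64) {
  const raw = Buffer.from(blobB64, "base64");
  const decipher = createDecipheriv(
    "aes-256-gcm", deriveBlobKey(entropy), raw.subarray(0, 12)
  );
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([
    decipher.update(raw.subarray(28)),
    decipher.final(),
  ]).toString("utf8");
}

// Step 5: proofHmac = HMAC-SHA256(entropy, "lantern-auth-proof-v1").
function proofHmac(entropy) {
  return createHmac("sha256", entropy)
    .update("lantern-auth-proof-v1")
    .digest("hex");
}
```

Using distinct labels for the blob key and the auth proof keeps the two values cryptographically separated even though both derive from the same entropy. An authenticated cipher is a natural fit for the blob because a wrong PIN then fails loudly instead of yielding a garbage userId.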

B3. Code touch list

  • services/api/auth/src/routes/phone.js - split read path into a "lookup auth params" call that does not return userId.
  • New collection auth_lookup with security rules: server-only writes, no client reads of foreign rows.
  • apps/web/src/lib/auth.js, apps/web/src/lib/encryption.js - add HKDF-derived blob-key path; add encryptedUserIdBlob decrypt step before custom-token request.
  • Signup flow - generate userId, derive blob_key, encrypt, write auth_lookup row alongside users doc.
  • Recovery flow - re-derive on user input; may require new UX if a recovery phrase is used.

B4. Migration

  • For existing users: at next successful login, server detects no auth_lookup/{phoneHash} row, asks client to upload encryptedUserIdBlob derived from current entropy, writes the row, and (separately) clears phoneHash from the user doc on a subsequent login.
  • Users who don't return within the migration window stay in v1 form. Decision: do we force a logout/migration push at a cutoff? (Open.)
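
The lazy backfill could be sketched as two server-side steps: a lookup that reports when no sealed row exists, and an upload handler that writes the row once. Everything below is hypothetical: the function names, the "needs_migration" status, the in-memory Map standing in for auth_lookup, and the assumption that the remaining auth parameters are copied from the user's existing record after a successful Stage A login.

```javascript
// authLookup: Map keyed by phoneHash, a stand-in for the auth_lookup collection.

// Lookup step: return sealed material, or signal that the client must migrate.
function lookupAuthParams(phoneHash, authLookup) {
  const row = authLookup.get(phoneHash);
  if (!row) return { status: "needs_migration" };
  const { createdAt, ...sealed } = row; // createdAt stays server-side
  return { status: "sealed", ...sealed };
}

// Upload step: called only after the user has authenticated via the existing
// path. existingParams (assumed) carries the fields already held for the user.
function acceptMigrationUpload(
  phoneHash, encryptedUserIdBlob, existingParams, authLookup, now = new Date()
) {
  if (authLookup.has(phoneHash)) return false; // already migrated: no overwrite
  const { phoneSalt, encryptedSeed, authProofHash } = existingParams;
  if (!encryptedUserIdBlob || !phoneSalt || !encryptedSeed || !authProofHash) {
    throw new TypeError("incomplete migration upload");
  }
  authLookup.set(phoneHash, {
    encryptedUserIdBlob,
    phoneSalt,
    encryptedSeed,
    authProofHash,
    createdAt: now.toISOString(),
  });
  return true;
}
```

Refusing to overwrite an existing row keeps the upload idempotent, so a retried or replayed migration request cannot swap in a different blob.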

B5. Testing

  • Unit: HKDF determinism, blob round-trip, wrong-PIN failure surfaces correctly (no userId leakage).
  • Integration: full login flow with sealed lookup; emulator suite covering brand-new signup, re-login, migration from Stage-A-only state, lockout behavior.
  • Security: red-team the /auth/phone/lookup response for any field that could be back-correlated to userId (e.g. lanternName length, createdAt timestamps with tight clustering).
  • Performance: measure added latency on /auth/phone/lookup + client-side decrypt round-trip across slow networks.
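
Part of the security check on the /auth/phone/lookup response can be automated with an allow-list. The sketch below is an illustration under assumptions: the allowed field names come from the B2 response shape, while findLeakyFields and its substring check for a known test userId are hypothetical helpers. It does not cover statistical correlation such as clustered timestamps, which still needs manual review.

```javascript
// Fields the B2 lookup response is expected to contain, and nothing else.
const ALLOWED_LOOKUP_FIELDS = new Set([
  "phoneSalt", "encryptedSeed", "encryptedUserIdBlob", "authProofHash",
]);

// Returns a list of human-readable problems; an empty list means no finding.
// knownUserId is the test account's userId, used to catch plaintext leakage.
function findLeakyFields(response, knownUserId) {
  const problems = [];
  for (const [key, value] of Object.entries(response)) {
    if (!ALLOWED_LOOKUP_FIELDS.has(key)) {
      problems.push(`unexpected field: ${key}`);
    }
    // String(...) guards against JSON.stringify returning undefined.
    if (knownUserId && String(JSON.stringify(value)).includes(knownUserId)) {
      problems.push(`userId appears in: ${key}`);
    }
  }
  return problems;
}
```

In an emulator test this would run against every lookup response, failing the suite if a new field such as createdAt or userId is ever added to the payload.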

B6. Rollout

Behind auth.sealedUserId.enabled. Stages:

  1. Dual-store (auth_lookup row written on signup; users.phoneHash still present)
  2. Switch login read path to auth_lookup-only response
  3. Backfill existing users on next login (lazy)
  4. Stop writing users.phoneHash
  5. Remove users.phoneHash field after backfill completion

B7. Estimate

Multi-sprint (~6–8 weeks engineering, plus parallel CS/T&S workflow redesign). Per the reframe in brief §6, counsel-track work runs in parallel with this estimate and is not a gating dependency.

Cross-cutting work

Independent of A/B but discovered during this audit:

  1. Build the banned_accounts collection described in SAFETY_MECHANICS.md. Stage A makes the hash form available; ban enforcement currently only works at userId level, allowing trivial re-registration. This should land alongside Stage A.
  2. Define a centralized login-events policy. Today, PIN failure counters live on users/{userId} and admin logins write to adminActions without IP. Decide what should be retained, for how long, and where (Cloud Run access logs vs. a dedicated collection). This is a prerequisite for any honest answer to §6 question 3 about user-notification policy.
  3. Document the existing zero-knowledge proof-of-entropy chain. customToken.service.js implements a sophisticated HMAC-of-entropy flow that's not described in the privacy docs. Add to HOW_ENCRYPTION_WORKS.md.

Out of scope

  • Profile data encryption (already in place)
  • Merchant-side k-anonymity gating (independent system)
  • Ad-targeting privacy (Right #6 forbids targeting; nothing to seal)
  • Device fingerprinting / cross-device linkage (forbidden by constraint)

Decision log

  • 2026-05-10: Brief rewritten against actual v1 (this plan's pair). Original draft's v1 description was inaccurate; corrected on this branch.
  • 2026-05-10: Two-stage decomposition (hash phone → seal userId) chosen over single-stage. Rationale: hashing the phone is independently valuable, low-risk, doesn't require counsel review, and unblocks banned_accounts.