Sealed Identity — Engineering Spike Plan

Companion brief: docs/privacy/SEALED_IDENTITY.md. Status: Plan only. Stage A phase 1–2 implementation has landed (PR #479). Counsel-track work runs in parallel with Stage B implementation, not as a gate — see brief §6 for the reframe.

Goal

Produce a concrete implementation plan (with cost estimate, schema migration, test plan, and rollout) for the two-stage sealed-identity architecture before any code is written. This is a planning artifact, not an implementation task.

Stage A — Hash the phone

Premise: plaintext users/{userId}.phone is the largest plain-PII surface in the auth path. Hashing it with a KMS-held pepper is independently valuable and also a prerequisite for Stage B.

A1. Schema diff

| Field | Today | After A |
| --- | --- | --- |
| `users/{userId}.phone` | E.164 plaintext | removed (or kept TTL-short during migration window) |
| `users/{userId}.phoneHash` | (none) | `HMAC-SHA256(KMS_pepper, e164(phone))` hex |
| Index | `phone` | `phoneHash` (composite with `role` if needed) |
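
The `phoneHash` formula in the table can be sketched in Node.js as follows. This is a minimal illustration under stated assumptions, not the landed implementation: the pepper is a local buffer standing in for the KMS-held key, and `normalizeE164` is a hypothetical, deliberately naive normalizer that assumes country code `1` for bare 10-digit input. A production path would likely rely on a full phone-number parsing library.

```javascript
const { createHmac } = require('node:crypto');

// Hypothetical, naive E.164 normalizer for illustration only.
// Assumes bare 10-digit input is a US/Canada number (country code 1).
function normalizeE164(raw, defaultCountryCode = '1') {
  const trimmed = String(raw).trim();
  const hasPlus = trimmed.startsWith('+');
  const digits = trimmed.replace(/\D/g, '');
  if (digits.length === 0) throw new Error('no digits in phone number');
  if (hasPlus) return `+${digits}`;
  if (digits.length === 10) return `+${defaultCountryCode}${digits}`;
  return `+${digits}`;
}

// HMAC-SHA256(pepper, e164(phone)) as lowercase hex.
// `pepper` is a local Buffer here; in the plan it stays server-side.
function phoneHash(pepper, rawPhone) {
  return createHmac('sha256', pepper)
    .update(normalizeE164(rawPhone), 'utf8')
    .digest('hex');
}

const demoPepper = Buffer.from('demo-pepper-not-a-real-secret');
const a = phoneHash(demoPepper, '+1 (415) 555-0100');
const b = phoneHash(demoPepper, '415.555.0100');
console.log(a === b, a.length);
```

Normalizing before hashing is what makes formatting variants of the same number land on the same `phoneHash`; any disagreement between the signup and login normalizers would silently break lookups.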

A2. Code touch list

services/api/auth/src/routes/phone.js:42 — replace where('phone', ...) with where('phoneHash', ...). Compute hash server-side from submitted phone (server holds pepper).
packages/shared/encryption — add phoneHash(e164) helper that calls a server-side KMS API (web client never sees the pepper).
Signup paths writing users/{userId}.phone — write phoneHash instead. Audit all callers via grep -r "phone:" services/ apps/ | grep users.
Firestore indexes — add phoneHash index, drop phone index after migration window.
Firestore security rules — disallow client reads of phoneHash (server-only).

A3. KMS pepper

Store as a Cloud KMS MAC key (or, in the meantime, a Cloud Secret Manager secret accessed by the function; see Stage A phase 1 implementation). The auth service uses a service account with cloudkms.cryptoKeyVersions.useToSign (compute HMACs via macSign) and cloudkms.cryptoKeyVersions.useToVerify (verify via macVerify), with no decrypt rights, ever.
Rotation: dual-pepper window. Login lookup tries new pepper first, falls back to old; on success with old, lazy re-hashes the row. After 90 days, drop old pepper from accepted set.
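
The dual-pepper window described above can be sketched as a lookup that tries the newest pepper first and lazily re-hashes on an old-pepper hit. This is an illustrative sketch only: `lookupWithRotation` and `hmacHex` are hypothetical names, the `Map` stands in for the Firestore query on `phoneHash`, and the HMAC is computed locally rather than through KMS.

```javascript
const { createHmac } = require('node:crypto');

// Local HMAC stand-in; with a KMS MAC key this would be a remote sign call.
function hmacHex(pepper, e164) {
  return createHmac('sha256', pepper).update(e164, 'utf8').digest('hex');
}

// `peppers` is ordered newest-first. `store` is a Map keyed by phoneHash,
// standing in for the real lookup (hypothetical shape).
function lookupWithRotation(e164, peppers, store) {
  const currentHash = hmacHex(peppers[0], e164);
  for (let i = 0; i < peppers.length; i++) {
    const candidate = i === 0 ? currentHash : hmacHex(peppers[i], e164);
    const row = store.get(candidate);
    if (!row) continue;
    if (i > 0) {
      // Matched under an old pepper: lazily re-hash to the current pepper.
      store.delete(candidate);
      store.set(currentHash, row);
    }
    return { row, rehashed: i > 0 };
  }
  return null; // no match under any accepted pepper
}

const oldPepper = Buffer.from('old-demo-pepper');
const newPepper = Buffer.from('new-demo-pepper');
const store = new Map([[hmacHex(oldPepper, '+14155550100'), { userId: 'u1' }]]);

const first = lookupWithRotation('+14155550100', [newPepper, oldPepper], store);
const second = lookupWithRotation('+14155550100', [newPepper, oldPepper], store);
console.log(first.rehashed, second.rehashed);
```

Dropping the old pepper after the window then amounts to passing a one-element `peppers` array; rows that were never re-hashed stop matching at that point.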

A4. Migration

One-shot Cloud Run job iterating users collection, computing phoneHash for each, writing alongside phone. Use Firestore batched writes; estimate cost.
Cutover: switch read path to phoneHash query. Verify duplicate-detection (limit(5) + role-preference logic in phone.js:48–55) still works.
Removal of plaintext phone: gated on a clean week of green metrics post-cutover.
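
A dry-run planner along the following lines could feed both the cost estimate and the collision check before the real job runs. It is a sketch under assumptions: `planMigration` and `chunk` are hypothetical helpers, the batch size of 400 is an arbitrary conservative choice rather than a documented limit, and the hash function is injected so the sketch does not depend on how the pepper is obtained.

```javascript
// Split rows into fixed-size chunks, one chunk per batched write.
function chunk(rows, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const out = [];
  for (let i = 0; i < rows.length; i += size) out.push(rows.slice(i, i + size));
  return out;
}

// Dry run: computes hashes, counts operations, and flags true collisions
// (two different phone values that map to the same hash).
function planMigration(users, hashFn, batchSize = 400) {
  const phoneByHash = new Map();
  const collisions = [];
  const updates = [];
  for (const user of users) {
    if (!user.phone) continue; // nothing to hash for this row
    const phoneHash = hashFn(user.phone);
    const seenPhone = phoneByHash.get(phoneHash);
    if (seenPhone !== undefined && seenPhone !== user.phone) {
      collisions.push({ phoneHash, phones: [seenPhone, user.phone] });
    }
    phoneByHash.set(phoneHash, user.phone);
    updates.push({ id: user.id, phoneHash });
  }
  return {
    reads: users.length,
    writes: updates.length,
    batches: chunk(updates, batchSize).length,
    collisions,
  };
}

const demoUsers = Array.from({ length: 1000 }, (_, i) => ({
  id: `u${i}`,
  phone: `+1415555${String(i).padStart(4, '0')}`,
}));
const plan = planMigration(demoUsers, (p) => `h:${p}`, 400);
console.log(plan.reads, plan.writes, plan.batches, plan.collisions.length);
```

Multiplying `reads` and `writes` by the current per-operation Firestore prices gives the one-shot cost; users that legitimately share a phone number produce the same hash and are intentionally not reported as collisions.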

A5. Testing

Unit: phoneHash determinism, pepper-rotation fallback path, normalization edge cases (US/intl, formatting variants).
Integration: full login flow end-to-end against the Firebase emulator with phoneHash-only data.
Migration: dry-run on a snapshot copy of prod users; verify row counts and zero collision warnings.
Security: confirm users.where('phone', ...) queries return zero results post-cutover (i.e. nobody bypassed the change).

A6. Rollout

Behind feature flag auth.phoneHash.enabled. Phased:

1. Dual-write (write both phone and phoneHash, keep reading phone)
2. Migrate existing rows
3. Switch reads to phoneHash
4. Stop writing phone
5. Drop phone field + index

Each step is independently reversible until step 5.
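
One way to keep the read and write paths consistent across these steps is a single phase table that both consult. The sketch below is hypothetical: the plan only names a boolean flag (`auth.phoneHash.enabled`), so the numeric `step` value, `PHASES`, `behaviourFor`, `signupFields`, and `isReversible` are all assumptions made for illustration.

```javascript
// Hypothetical phase table: step 0 is today's behaviour,
// steps 1-5 mirror the rollout list above.
const PHASES = Object.freeze([
  { step: 0, writePhone: true,  writeHash: false, readFrom: 'phone' },
  { step: 1, writePhone: true,  writeHash: true,  readFrom: 'phone' },     // dual-write
  { step: 2, writePhone: true,  writeHash: true,  readFrom: 'phone' },     // backfill runs
  { step: 3, writePhone: true,  writeHash: true,  readFrom: 'phoneHash' }, // reads switched
  { step: 4, writePhone: false, writeHash: true,  readFrom: 'phoneHash' }, // phone not written
  { step: 5, writePhone: false, writeHash: true,  readFrom: 'phoneHash' }, // field + index dropped
]);

function behaviourFor(step) {
  const phase = PHASES[step];
  if (!phase) throw new RangeError(`unknown rollout step: ${step}`);
  return phase;
}

// Build the fields a signup should write to users/{userId} at a given step.
function signupFields(step, e164, phoneHash) {
  const { writePhone, writeHash } = behaviourFor(step);
  return {
    ...(writePhone ? { phone: e164 } : {}),
    ...(writeHash ? { phoneHash } : {}),
  };
}

// Steps 1-4 can be undone by moving back one step; step 5 deletes data.
function isReversible(step) {
  return step >= 1 && step <= 4;
}

console.log(signupFields(1, '+14155550100', 'abc123'), isReversible(5));
```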

A7. Estimate

Roughly one engineering sprint (~2 weeks) including migration job and rollout. Does not require counsel review — Stage A is a strict improvement over v1 with no semantic change to user-visible behavior.

Stage B — Seal the userId

Sequencing: ships immediately after Stage A. Counsel-track work (privacy policy audit, subpoena playbook, DPIA) runs in parallel and does not gate implementation. See brief §6.

B1. Schema diff (delta from Stage A)

| Field | After A | After B |
| --- | --- | --- |
| `users/{userId}` doc id | `userId` (Firebase Auth UID) | unchanged |
| `users/{userId}.phoneHash` | present | removed from user doc; the hash moves to a separate `auth_lookup/{phoneHash}` collection (lookup goes the other way only) |
| `auth_lookup/{phoneHash}` | (none) | `{ encryptedUserIdBlob, phoneSalt, encryptedSeed, authProofHash, createdAt }` |

The userId in the users/{userId} doc id remains an opaque Firebase Auth UID that is not derived from the phone. The auth_lookup collection's documents are keyed by phoneHash and contain only sealed material.

B2. Login flow

1. POST /auth/phone/lookup { phone }
   → server: phoneHash = hmac(pepper, e164(phone))
   → server: read auth_lookup/{phoneHash}
   → server: 200 { phoneSalt, encryptedSeed, encryptedUserIdBlob, authProofHash }
            (note: no userId)
2. Client decrypts encryptedSeed with phone+PIN → entropy
3. Client derives blob_key = HKDF(entropy, "lantern-userid-blob-v1")
4. Client decrypts encryptedUserIdBlob with blob_key → userId
5. Client computes proofHmac = HMAC-SHA256(entropy, "lantern-auth-proof-v1")
6. POST /auth/phone/token { userId, proofHmac } as today

Server-side: step 1 produces a Firestore read of auth_lookup/{phoneHash} returning ciphertext. There is no path that yields phoneHash → userId as a single observable resolution.
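
Steps 3 to 5 of the flow can be sketched with Node's `crypto` module as follows. This is illustrative only and rests on several assumptions not stated in the plan: AES-256-GCM as the blob cipher, a blob layout of 12-byte IV, then 16-byte tag, then ciphertext, HKDF-SHA256 with an empty salt, and random bytes standing in for the entropy recovered in step 2. The web client would presumably use WebCrypto instead, and `sealUserId` exists here only to make the round trip demonstrable.

```javascript
const {
  hkdfSync, createCipheriv, createDecipheriv, createHmac, randomBytes,
} = require('node:crypto');

// Step 3: blob_key = HKDF(entropy, "lantern-userid-blob-v1").
// Assumptions: SHA-256, empty salt, 32-byte output key.
function deriveBlobKey(entropy) {
  const key = hkdfSync('sha256', entropy, Buffer.alloc(0), 'lantern-userid-blob-v1', 32);
  return Buffer.from(key);
}

// Signup-side counterpart. Assumed layout: IV (12) || GCM tag (16) || ciphertext.
function sealUserId(entropy, userId) {
  const iv = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', deriveBlobKey(entropy), iv);
  const ciphertext = Buffer.concat([cipher.update(userId, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Step 4: decrypt encryptedUserIdBlob; throws if the entropy is wrong.
function openUserId(entropy, blob) {
  const decipher = createDecipheriv('aes-256-gcm', deriveBlobKey(entropy), blob.subarray(0, 12));
  decipher.setAuthTag(blob.subarray(12, 28));
  const plaintext = Buffer.concat([decipher.update(blob.subarray(28)), decipher.final()]);
  return plaintext.toString('utf8');
}

// Step 5: proofHmac = HMAC-SHA256(entropy, "lantern-auth-proof-v1").
function authProof(entropy) {
  return createHmac('sha256', entropy).update('lantern-auth-proof-v1').digest('hex');
}

const entropy = randomBytes(32);
const blob = sealUserId(entropy, 'demo-user-id');
console.log(openUserId(entropy, blob), authProof(entropy).length);
```

With an authenticated cipher such as GCM, a wrong PIN yields different entropy, a different blob key, and a decryption failure rather than a garbled userId, which is the behaviour the B5 wrong-PIN test would want to pin down.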

B3. Code touch list

services/api/auth/src/routes/phone.js — split read path into a "lookup auth params" call that does not return userId.
New collection auth_lookup with security rules: server-only writes, no client reads of foreign rows.
apps/web/src/lib/auth.js, apps/web/src/lib/encryption.js — add HKDF-derived blob-key path; add encryptedUserIdBlob decrypt step before custom-token request.
Signup flow — generate userId, derive blob_key, encrypt, write auth_lookup row alongside users doc.
Recovery flow — re-derive on user input; may require new UX if a recovery phrase is used.

B4. Migration

For existing users: at the next successful login, the server detects that no auth_lookup/{phoneHash} row exists, asks the client to upload an encryptedUserIdBlob derived from current entropy, and writes the row. Clearing phoneHash from the user doc happens separately, on a subsequent login.
Users who don't return within the migration window stay in v1 form. Decision: do we force a logout/migration push at a cutoff? (Open.)
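
The two-login backfill above reduces to a small decision function on the server. The sketch below is hypothetical: `nextMigrationAction` and the action names are invented for illustration, and the inputs are plain booleans so the logic stays independent of how the Firestore reads are performed.

```javascript
// Decide the server's migration action after a successful login.
// Action names are illustrative placeholders, not an existing API.
function nextMigrationAction({ hasLookupRow, userDocHasPhoneHash }) {
  if (!hasLookupRow) return 'REQUEST_BLOB_UPLOAD';      // first post-rollout login
  if (userDocHasPhoneHash) return 'CLEAR_USER_PHONE_HASH'; // a later login
  return 'NONE';                                        // fully migrated
}

// Simulate one user across three consecutive logins.
const state = { hasLookupRow: false, userDocHasPhoneHash: true };
const actions = [];
for (let login = 0; login < 3; login++) {
  const action = nextMigrationAction(state);
  actions.push(action);
  if (action === 'REQUEST_BLOB_UPLOAD') state.hasLookupRow = true;
  if (action === 'CLEAR_USER_PHONE_HASH') state.userDocHasPhoneHash = false;
}
console.log(actions.join(' -> '));
```

Keeping the upload and the cleanup on separate logins means a failed or partial upload never leaves a user without any working lookup path.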

B5. Testing

Unit: HKDF determinism, blob round-trip, wrong-PIN failure surfaces correctly (no userId leakage).
Integration: full login flow with sealed lookup; emulator suite covering brand-new signup, re-login, migration from Stage-A-only state, lockout behavior.
Security: red-team the /auth/phone/lookup response for any field that could be back-correlated to userId (e.g. lanternName length, createdAt timestamps with tight clustering).
Performance: measure added latency on /auth/phone/lookup + client-side decrypt round-trip across slow networks.

B6. Rollout

Behind auth.sealedUserId.enabled. Stages:

1. Dual-store (auth_lookup row written on signup; users.phoneHash still present)
2. Switch login read path to auth_lookup-only response
3. Backfill existing users on next login (lazy)
4. Stop writing users.phoneHash
5. Remove users.phoneHash field after backfill completion

B7. Estimate

Multi-sprint (~6–8 weeks engineering, plus parallel CS/T&S workflow redesign). Counsel-track work runs in parallel and does not gate implementation (see brief §6).

Cross-cutting work

Independent of A/B but discovered during this audit:

Build the banned_accounts collection described in SAFETY_MECHANICS.md. Stage A makes the hash form available; ban enforcement currently only works at userId level, allowing trivial re-registration. This should land alongside Stage A.
Define a centralized login-events policy. Today, PIN failure counters live on users/{userId} and admin logins write to adminActions without IP. Decide what should be retained, for how long, and where (Cloud Run access logs vs. a dedicated collection). This is a prerequisite for any honest answer to §6 question 3 about user-notification policy.
Document the existing zero-knowledge proof-of-entropy chain. customToken.service.js implements a sophisticated HMAC-of-entropy flow that's not described in the privacy docs. Add to HOW_ENCRYPTION_WORKS.md.

Out of scope

Profile data encryption (already in place)
Merchant-side k-anonymity gating (independent system)
Ad-targeting privacy (Right #6 forbids targeting; nothing to seal)
Device fingerprinting / cross-device linkage (forbidden by constraint)

Decision log

2026-05-10: Brief rewritten against actual v1 (this plan's pair). Original draft's v1 description was inaccurate; corrected on this branch.
2026-05-10: Two-stage decomposition (hash phone → seal userId) chosen over single-stage. Rationale: hashing the phone is independently valuable, low-risk, doesn't require counsel review, and unblocks banned_accounts.
