Venue Caching & API Cost Control

Status: Active
Updated: 2026-03-16
Related Issues: #210, #155, #249, #224, #394

Overview

Venues are cached in Firestore and shared by all users. The Maps API (or other external venue source) is called only when:

Refreshing an already-mapped area, at most once every 14 days (rate limiting).
First-time mapping of a new/unmapped area.

This keeps paid API costs low while ensuring users always have venue data.

Geohash-Based Caching

Each geographic area is identified by a 5-character geohash prefix (~5km × 5km cell):

┌─────────────────────────────────────────────────────────────────┐
│                    GEOHASH PRECISION GUIDE                      │
├───────────┬──────────────────┬──────────────────────────────────┤
│ Precision │ Cell Size        │ Use Case                         │
├───────────┼──────────────────┼──────────────────────────────────┤
│ 4         │ ~39km × 19.5km   │ Metro area (too coarse)          │
│ 5         │ ~4.9km × 4.9km   │ ✅ Neighborhood (current)        │
│ 6         │ ~1.2km × 0.6km   │ City blocks (too granular)       │
└───────────┴──────────────────┴──────────────────────────────────┘

Why precision 5?

Matches typical user search radius (~5km)
Balances API cost (fewer cells = fewer refreshes)
Still granular enough for urban areas
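
To make the cell idea concrete, here is a minimal standalone sketch of the standard geohash encoding, truncated to 5 characters. The function name `encodeGeohash` is hypothetical and the sample coordinates are arbitrary; the app itself may rely on a library for this (the Scenarios section mentions geofire), so this only illustrates the algorithm.

```javascript
// Standard geohash base32 alphabet (no a, i, l, o).
const GEOHASH_BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz'

// Hypothetical helper: encode a coordinate as a geohash of the given length.
// Bits alternate between longitude and latitude, starting with longitude.
function encodeGeohash(lat, lng, precision = 5) {
  const latRange = [-90, 90]
  const lngRange = [-180, 180]
  let hash = ''
  let bits = 0
  let bitCount = 0
  let useLng = true

  while (hash.length < precision) {
    const range = useLng ? lngRange : latRange
    const value = useLng ? lng : lat
    const mid = (range[0] + range[1]) / 2
    if (value >= mid) {
      bits = (bits << 1) | 1
      range[0] = mid // keep the upper half
    } else {
      bits = bits << 1
      range[1] = mid // keep the lower half
    }
    useLng = !useLng
    bitCount += 1
    if (bitCount === 5) {
      // Every 5 bits become one base32 character.
      hash += GEOHASH_BASE32[bits]
      bits = 0
      bitCount = 0
    }
  }
  return hash
}

console.log(encodeGeohash(57.64911, 10.40744)) // "u4pru"
```

Because a longer geohash only refines a shorter one, a precision-7 hash of the same point starts with its precision-5 prefix, which is what makes prefix-keyed refresh metadata workable.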

Refresh Thresholds

Each geohash area has independent refresh metadata:

Threshold	Days	Behavior
Fresh	0–30	Return cached venues, no API call
Minimum refresh interval	14	Won't refresh more often than this
Moderately stale	30–90	Return cached + background refresh
Very stale	90+	Block and wait for fresh API data
Never imported	∞	Block — treated same as very stale

Configuration: src/lib/venueRefreshService.js
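
The thresholds can be read as a three-way classification. The sketch below assumes the 30-day and 90-day constants from the Configuration section are the effective boundaries and that exactly 90 days counts as very stale; `classifyStaleness` is a hypothetical name, not the signature of the real `checkStaleness()`.

```javascript
// Values mirror the constants documented for venueRefreshService.js.
const MODERATE_STALENESS_DAYS = 30
const VERY_STALE_DAYS = 90
const MS_PER_DAY = 24 * 60 * 60 * 1000

// Hypothetical helper: map an area's last refresh time to a refresh behavior.
// lastRefreshedAtMs is null/undefined for areas that were never imported.
function classifyStaleness(lastRefreshedAtMs, nowMs = Date.now()) {
  if (lastRefreshedAtMs == null) return 'blocking' // never imported
  const ageDays = (nowMs - lastRefreshedAtMs) / MS_PER_DAY
  if (ageDays >= VERY_STALE_DAYS) return 'blocking' // wait for fresh data
  if (ageDays >= MODERATE_STALENESS_DAYS) return 'background' // serve cache, refresh async
  return 'fresh' // serve cache, no API call
}
```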

Architecture Diagram

┌──────────────────────────────────────────────────────────────────────────┐
│                            FIRESTORE                                     │
│  venues collection (shared by all users)                                 │
│  - Keyed by area (geohash), contains name/lat/lng/category/etc.          │
│                                                                          │
│  venueRefreshMetadata collection                                         │
│  - Tracks lastRefreshedAt per 5-char geohash prefix                      │
│  - inProgress lock (5-min timeout) prevents concurrent refreshes         │
└──────▲───────────────────────────────────┬───────────────────────────────┘
       │ write (server-side only)          │ read (geohash range queries)
       │                                   ▼
┌──────┴──────────────┐          ┌───────────────────────────────────────┐
│  Venues API         │          │  Client (Dashboard)                    │
│  (Cloud Run)        │          │                                       │
│  POST /import/osm   │◄─────── │  1. loadInitialVenues()               │
│  POST /refresh/batch│          │  2. getNearbyVenues(lat,lng,1km)      │
│  POST /refresh/     │          │  3. prefetchVenueAddresses(venues)    │
│    enrich/:id       │          │  4. subscribeToVenueUpdates(ids)      │
└─────────────────────┘          └───────────────────────────────────────┘
       │                                   ▲
       ▼                                   │ (localStorage)
┌─────────────────────┐          ┌─────────┴──────────────────────────┐
│  OSM Overpass API    │          │  venueCacheManager                  │
│  (venue import)      │          │  - 5-min TTL + Haversine location   │
├─────────────────────┤          │  - Persisted to localStorage         │
│  Nominatim           │          │  - Loaded before geolocation resolves│
│  (address enrichment)│          └────────────────────────────────────┘
└─────────────────────┘

Full Dashboard Loading Pipeline

This traces every step from app open to venues on screen.

Step 1: Pre-Geolocation (immediate)

Dashboard mounts
 └─ loadInitialVenues()
     ├─ setLoadingVenues(true)
     └─ if cachedVenues && isCacheValid(cachedVenues)  // time-only check, 5min TTL
         ├─ setVenues(cachedVenues.venues)  ← user sees venues NOW
         ├─ setLoadingVenues(false)         ← spinner gone
         └─ (geolocation request starts in parallel)

Step 2: Geolocation Resolves

getLocation callback fires with lat/lng
 ├─ setUserLocation({lat, lng})
 └─ isCacheValidForLocation(cache, lat, lng)?  // time + Haversine distance
     │
     ├── YES (cache hit) ─────────────┐
     │   ├─ setLoadingVenues(false)   │
     │   ├─ subscribeToVenueUpdates   │  skipInitialSnapshot=false
     │   │   (corrects stale counts)  │  because data is from cache
     │   └─ return                    │
     │                                │
     └── NO (cache miss / moved) ─────┤
         └─ getNearbyVenues(lat, lng, 1km, {limit: 20})
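
Steps 1 and 2 rely on two different validity checks: time only, then time plus distance. A minimal sketch of both follows. The cache shape (`savedAt`, `lat`, `lng`) and the 500 m drift limit are assumptions for illustration; the real values live in `venueCacheManager.js`.

```javascript
const CACHE_TTL_MS = 5 * 60 * 1000 // 5-minute TTL, as documented
const MAX_DRIFT_METERS = 500 // assumed drift limit, not confirmed by the source

// Great-circle distance between two coordinates, in meters.
function haversineMeters(lat1, lng1, lat2, lng2) {
  const EARTH_RADIUS_M = 6371000
  const toRad = (deg) => (deg * Math.PI) / 180
  const dLat = toRad(lat2 - lat1)
  const dLng = toRad(lng2 - lng1)
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a))
}

// Time-only check, usable before geolocation resolves.
function isCacheValid(cache, nowMs = Date.now()) {
  return Boolean(cache) && nowMs - cache.savedAt < CACHE_TTL_MS
}

// Time + distance check, usable once the device location is known.
function isCacheValidForLocation(cache, lat, lng, nowMs = Date.now()) {
  if (!isCacheValid(cache, nowMs)) return false
  return haversineMeters(cache.lat, cache.lng, lat, lng) <= MAX_DRIFT_METERS
}
```

Splitting the checks this way lets the Dashboard paint cached venues immediately and only discard them if the resolved location turns out to be too far from where the cache was saved.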

Step 3: getNearbyVenues — Staleness Check

getNearbyVenues(lat, lng, radius)
 ├─ Compute geohash query bounds → touched prefixes (1–4 cells)
 ├─ checkStaleness(prefix) for each cell → partition into:
 │   ├─ freshCells      (< 30 days)   → skip
 │   ├─ backgroundCells (30–90 days)  → non-blocking refresh
 │   └─ blockingCells   (> 90 days or never imported) → WAIT
 │
 ├─ BLOCKING CELLS present?
 │   ├─ markRefreshInProgress(prefix) for each
 │   ├─ triggerAreaRefresh(prefix, lat, lng, radius)
 │   │   └─ venue-api POST /venues/import/osm
 │   │       ├─ Overpass API → fetch OSM venues
 │   │       ├─ importVenuesToFirestore() → dedup + write
 │   │       └─ response: { imported, skipped, importedVenues[] }
 │   │
 │   └─ SHORTCUT: if skipped=0 && importedVenues.length > 0
 │       ├─ Filter/sort imported venues client-side
 │       ├─ Seed in-memory cache
 │       └─ RETURN immediately (no Firestore query needed)
 │
 ├─ BACKGROUND CELLS present?
 │   └─ triggerAreaRefresh() fire-and-forget
 │
 ├─ Check in-memory cache (2min TTL, skip if any cell was stale)
 │
 └─ Firestore geohash range queries → filter/sort → return
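
The in-memory cache step can be sketched as a small TTL map that is bypassed whenever any touched cell was stale. The key format and the function names `getCachedQuery` / `setCachedQuery` are assumptions; only the `venueQueryCache` name and the 2-minute TTL come from this document.

```javascript
const QUERY_CACHE_TTL_MS = 2 * 60 * 1000 // 2-minute TTL, as documented
const venueQueryCache = new Map()

// Returns cached venues for a query key, or null on miss/expiry/bypass.
function getCachedQuery(key, { anyCellStale = false, nowMs = Date.now() } = {}) {
  if (anyCellStale) return null // a refresh was triggered, so skip the cache
  const entry = venueQueryCache.get(key)
  if (!entry) return null
  if (nowMs - entry.storedAt >= QUERY_CACHE_TTL_MS) {
    venueQueryCache.delete(key) // expired, drop it
    return null
  }
  return entry.venues
}

// Stores query results with the time they were cached.
function setCachedQuery(key, venues, nowMs = Date.now()) {
  venueQueryCache.set(key, { venues, storedAt: nowMs })
}
```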

Step 4: Display + Subscriptions

Venues returned to Dashboard
 ├─ setVenues(formattedVenues)
 ├─ setLoadingVenues(false)
 ├─ prefetchVenueAddresses(venues)    ← background enrichment via venue-api
 └─ subscribeToVenueUpdates(ids)      ← real-time lantern count updates
     └─ onSnapshot with batched 'in' queries (30 per batch)
         └─ skipInitialSnapshot=true for fresh data

Step 5: Load More (on-demand)

User scrolls to bottom → loadMoreVenues()
 ├─ getNearbyVenues(lat, lng, 5km)    ← wider radius
 ├─ Merge with existing venues (dedup by id)
 ├─ Update subscribeToVenueUpdates with all venue IDs
 └─ prefetchVenueAddresses(new venues)
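
The merge step might look like the sketch below. `mergeVenuesById` is a hypothetical name, and keeping the existing entry on an id collision is an assumption (it preserves any live counts already applied to venues on screen).

```javascript
// Merge newly loaded venues into the current list, deduplicating by id.
// Existing venues win on collision and original ordering is preserved.
function mergeVenuesById(existing, incoming) {
  const byId = new Map(existing.map((venue) => [venue.id, venue]))
  for (const venue of incoming) {
    if (!byId.has(venue.id)) byId.set(venue.id, venue)
  }
  return [...byId.values()]
}
```

The ids of the merged list can then be passed to `subscribeToVenueUpdates` so that both old and new venues receive real-time updates.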

Scenarios

Scenario 1: User opens app in a mapped area (e.g., San Diego)

User grants location permission → device gets lat/lng.
App calls getNearbyVenues(lat, lng).
Query hits Firestore using geohash bounds (geofire).
Venues returned from Firestore cache.
No Maps API call. User sees venues instantly.

Cost: $0 (Firestore reads only).

Scenario 2: User opens app in an unmapped area

User grants location → device gets lat/lng.
App calls getNearbyVenues(lat, lng).
Query hits Firestore → no venues found for this geohash area.
Proximity gate check:
- Is user actually at this location? (server-side validation)
- Is area allowed? (if you restrict to certain regions)
- Rate limit check (prevent abuse).
If all checks pass → trigger Maps API to fetch venues for this area.
Venues written to Firestore with lastRefreshedAt timestamp.
User sees venues.

Cost: 1 Maps API call (one-time for this area, shared by all future users).

Scenario 3: Scheduled refresh of a mapped area (TTL expired)

Background job or user request triggers refresh check.
Check lastRefreshedAt for the area's geohash prefix.
If older than TTL (30 days):
- Call Maps API to fetch fresh venues.
- Upsert venues into Firestore.
- Update lastRefreshedAt.
All users now get fresh data from Firestore.

Cost: 1 Maps API call per area per 30 days.

Scenario 4: User tries to spoof location to trigger new area mapping

User sends fake coordinates (e.g., claims to be in Tokyo from NYC).
Proximity gate rejects:
- Server-side validation compares claimed coords vs. IP geolocation (if implemented).
- Or: area is not in allowed regions list.
- Or: rate limit exceeded.
No Maps API call triggered.
User gets error or sees no venues.

Cost: $0 (blocked).

Scenario 5: User moves within the same area (e.g., across San Diego)

User moves 2 km within SD.
App calls getNearbyVenues(newLat, newLng).
Still within same geohash-5 bucket → Firestore query returns cached venues.
No Maps API call.

Cost: $0.

Scenario 6: User travels to a new mapped area (e.g., SD → LA)

User opens app in LA.
App calls getNearbyVenues(lat, lng).
Query hits Firestore for LA geohash → venues found (LA was already mapped).
No Maps API call.

Cost: $0.

Scenario 7: User travels to a new unmapped area (e.g., SD → Phoenix, if Phoenix isn't mapped yet)

User opens app in Phoenix.
Query hits Firestore → no venues for Phoenix geohash.
Proximity gate validates user is actually in Phoenix.
If allowed region + rate limit OK → trigger Maps API for Phoenix.
Venues written to Firestore.
Future Phoenix users get cached data.

Cost: 1 Maps API call (one-time for Phoenix).

Summary Table

Scenario	Maps API Call?	Who Pays?
Mapped area, normal use	No	—
Unmapped area, first user	Yes (1x)	Shared
TTL refresh (30 days)	Yes (1x)	Shared
Spoofed/blocked location	No	—
Move within same area	No	—
Travel to mapped city	No	—
Travel to unmapped city	Yes (1x)	Shared

Configuration

TTL & Staleness Thresholds

Defined in src/lib/venueRefreshService.js:

javascript

export const MINIMUM_REFRESH_INTERVAL_DAYS = 14 // Won't refresh more often
export const MODERATE_STALENESS_DAYS = 30       // Background refresh
export const VERY_STALE_DAYS = 90               // Blocking refresh

Proximity Gating

Defined in src/lib/locationProximityGate.js:

javascript

// Cache TTL for area metadata (30 days)
export const DEFAULT_PLACE_CACHE_TTL_MS = 30 * 24 * 60 * 60 * 1000

// Distance threshold for movement detection (500m)
export const DEFAULT_DISTANCE_THRESHOLD_METERS = 500

// Proximity gate radius for place operations (200m)
export const DEFAULT_PROXIMITY_GATE_RADIUS_METERS = 200

// Rate limit: max 10 API calls per minute
export const MAX_API_CALLS_PER_WINDOW = 10
export const RATE_LIMIT_WINDOW_MS = 60 * 1000
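
For illustration, the two rate-limit constants could drive a sliding-window limiter like the one below. This is a sketch under assumptions: `createRateLimiter` and `tryAcquire` are hypothetical names, and the real gate may use a fixed window or persist its state differently.

```javascript
const MAX_API_CALLS_PER_WINDOW = 10
const RATE_LIMIT_WINDOW_MS = 60 * 1000

// Returns a function that reports whether another API call is allowed now.
function createRateLimiter(maxCalls = MAX_API_CALLS_PER_WINDOW, windowMs = RATE_LIMIT_WINDOW_MS) {
  let timestamps = []
  return function tryAcquire(nowMs = Date.now()) {
    // Forget calls that have aged out of the window.
    timestamps = timestamps.filter((t) => nowMs - t < windowMs)
    if (timestamps.length >= maxCalls) return false
    timestamps.push(nowMs)
    return true
  }
}
```

A caller would check `tryAcquire()` before triggering an area import and skip the import when it returns `false`.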

Geohash Bucketing

We use geohash-5 prefixes (~4.9 km × 4.9 km) to group venues by area:

Geohash Length	Approx Size	Use Case
4	~39 km	Region/metro
5	~4.9 km	City district (recommended)
6	~1.2 km	Neighborhood
7	~150 m	Block

Geohash-5 balances:

Few enough buckets to minimize fragmentation.
Small enough to localize venue queries.

Cost Estimates

Google Places API Pricing

API	Cost per 1,000 calls
Place Details	$17
Nearby Search	$32
Text Search	$32
Autocomplete	$2.83

What Costs Money (API Calls)

Event	Frequency	Cost
First user in new geohash-5 area	Once ever	~$0.017–0.032
TTL refresh of existing area	Once per 30 days per area	~$0.017–0.032

What's Free

Reads from Firestore incur no Maps API cost: every user is served the same cached venue data.

Realistic Monthly Estimates

Scenario: 10k users, mostly in San Diego metro (~50 geohash-5 areas)

Cost Type	Calculation	Monthly Cost
TTL refreshes (30-day)	50 areas × $0.032	$1.60
New unmapped areas	~10 new areas × $0.032	$0.32
Total		~$2/month

Scenario: 10k users spread across 500 areas nationwide

Cost Type	Calculation	Monthly Cost
TTL refreshes	500 areas × $0.032	$16/month
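
The arithmetic behind these estimates is simple enough to check in a few lines. The sketch assumes one Nearby Search call per area refresh at the $32 per 1,000 rate from the pricing table; `monthlyApiCost` is a hypothetical helper.

```javascript
const NEARBY_SEARCH_COST_PER_1000 = 32 // USD, from the pricing table

// Cost in USD for a number of API calls at a per-1,000 rate.
function monthlyApiCost(calls, costPer1000 = NEARBY_SEARCH_COST_PER_1000) {
  return (calls * costPer1000) / 1000
}

console.log(monthlyApiCost(50)) // 1.6  (San Diego: 50 area refreshes)
console.log(monthlyApiCost(10)) // 0.32 (10 newly mapped areas)
console.log(monthlyApiCost(50 + 10)) // 1.92 (San Diego total, about $2)
console.log(monthlyApiCost(500)) // 16   (nationwide: 500 area refreshes)
```

Note that the cost scales with the number of mapped areas, not with the number of users.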

Caching Compliance

Google's Terms of Service allow caching Place data for up to 30 days as long as you:

Don't pre-fetch data speculatively
Refresh data that users request after 30 days
Don't resell the data

Our 30-day TTL aligns with this requirement.

Related Files

File	Role
`apps/web/src/lib/venueService.js`	Venue CRUD, `getNearbyVenues()`, enrichment queue, `subscribeToVenueUpdates()`
`apps/web/src/lib/venueRefreshService.js`	Geohash staleness tracking, `triggerAreaRefresh()`
`apps/web/src/lib/venueApiClient.js`	HTTP client for venue-api Cloud Run service
`apps/web/src/lib/venueCacheManager.js`	localStorage venue cache (TTL + location validation)
`apps/web/src/lib/locationProximityGate.js`	Proximity validation & rate limiting
`apps/web/src/screens/dashboard/Dashboard.jsx`	Orchestrates the full loading pipeline
`services/api/venues/`	Cloud Run venue-api (import, enrichment, admin)
`services/api/venues/src/services/venue.service.js`	Server-side Firestore writes
`services/api/venues/src/routes/import.js`	OSM import endpoint
`services/api/venues/src/routes/refresh.js`	Enrichment endpoints (single + batch)

Address Enrichment (via Venues API)

Overview

Venues imported from OSM have placeholder or missing addresses (just coordinates). We use the Venues API (Cloud Run service at services/api/venues/) which calls Nominatim server-side to enrich venue addresses. This replaces the old client-side Nominatim calls and Firebase Cloud Function approach.

Enrichment Pipeline

┌──────────────────────────────────────────────────────────────────┐
│ Venues load → prefetchVenueAddresses(venues)                     │
│                                                                  │
│ For each venue missing addressComponents:                        │
│   → Queue in prefetchQueue (Map<venueId, {id, lat, lng}>)        │
│   → Process up to 15 venues per batch                            │
│   → POST /venues/refresh/batch → Venues API (Cloud Run)          │
│       └→ Nominatim reverse geocode (server-side, sequential)     │
│       └→ Update Firestore addressComponents                      │
│   → onSnapshot delivers update to subscribed clients             │
│                                                                  │
│ On-demand: User opens venue detail                               │
│   → enrichVenueAddress(venueId, venueMeta)                       │
│   → POST /venues/refresh/enrich/:venueId → Venues API           │
│   → Client applies result optimistically + onSnapshot confirms   │
└──────────────────────────────────────────────────────────────────┘

Key Files

File	Purpose
`apps/web/src/lib/venueService.js`	`prefetchVenueAddresses()`, `enrichVenueAddress()`, batch queue
`apps/web/src/lib/venueApiClient.js`	HTTP client for venue-api (`enrichVenue()`, `batchRefreshVenues()`)
`services/api/venues/src/routes/refresh.js`	Server-side enrichment endpoints
`services/api/venues/src/services/nominatim.service.js`	Nominatim reverse geocoding
`services/api/venues/src/services/venue.service.js`	Firestore writes (server-side)

Session-Level Deduplication

Three guards prevent redundant enrichment calls:

enrichedThisSession (Set) — venues already enriched this browser session
enrichmentInProgress (Set) — venues currently being enriched (prevents concurrent calls)
prefetchQueue.pending (Map) — venues queued but not yet processed
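
A minimal sketch of how the three guards might combine into a single check follows. The container names come from the list above; `shouldEnrich` and `queueForEnrichment` are hypothetical helpers, and the real queue in `venueService.js` may be structured differently.

```javascript
const enrichedThisSession = new Set() // already enriched this session
const enrichmentInProgress = new Set() // request currently in flight
const prefetchQueue = { pending: new Map() } // queued, not yet processed

// A venue needs enrichment only if none of the three guards knows about it.
function shouldEnrich(venueId) {
  return (
    !enrichedThisSession.has(venueId) &&
    !enrichmentInProgress.has(venueId) &&
    !prefetchQueue.pending.has(venueId)
  )
}

// Queue a venue for batch enrichment; returns false if it was deduplicated.
function queueForEnrichment(venue) {
  if (!shouldEnrich(venue.id)) return false
  prefetchQueue.pending.set(venue.id, { id: venue.id, lat: venue.lat, lng: venue.lng })
  return true
}
```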

Monitoring

In development, enrichment timing stats are available:

javascript

import { getEnrichmentTimings } from './lib/venueService'
console.log(getEnrichmentTimings())
// { avgCloudFunctionMs: 450, avgTotalMs: 520, count: 12, batchCount: 10, singleCount: 2, history: [...] }

Caching Layers Summary

The venue system uses four distinct caching layers, each with different TTLs and scopes:

Layer	Location	TTL	Scope	Purpose
localStorage cache	Browser `localStorage`	5 min + location drift	Per-device	Show venues before geolocation resolves
In-memory query cache	`venueQueryCache` Map	2 min	Per-session, per-tab	Avoid redundant Firestore queries within a session
Firestore venues	`venues` collection	Permanent (refreshed by import)	Global (shared)	Source of truth for all venue data
Geohash refresh metadata	`venueRefreshMetadata` collection	14–90 day thresholds	Global (shared)	Determines when to re-import from OSM

Cache Invalidation Flow

App.jsx unmounts Dashboard
 └─ setCachedVenues(null)        ← only clears React state
    ⚠️ Does NOT call clearPersistedVenueCache()
    ⚠️ localStorage keeps the old cache (restored on remount)

invalidateVenueCache(setCachedVenues)
 └─ setCachedVenues(null)        ← same gap: localStorage not cleared
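
One way to close this gap is to clear the persisted copy alongside the React state. The sketch below is an assumption-laden illustration: the storage key is hypothetical, and the real `clearPersistedVenueCache()` may take no arguments; storage is injected here only so the example runs outside a browser.

```javascript
const VENUE_CACHE_STORAGE_KEY = 'venueCache' // hypothetical key name

// Remove the persisted cache entry from a Storage-like object.
function clearPersistedVenueCache(storage) {
  storage.removeItem(VENUE_CACHE_STORAGE_KEY)
}

// Clear both the in-memory React state and the persisted copy.
function invalidateVenueCache(setCachedVenues, storage) {
  setCachedVenues(null)
  clearPersistedVenueCache(storage)
}
```

In the browser, the `storage` argument would be `window.localStorage`.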

Known Cache Edge Cases

Latitude or longitude 0 (equator or prime meridian): isCacheValidForLocation guards with if (currentLat && currentLng), which treats 0 as falsy, so the location check is skipped whenever the user's current latitude or longitude is exactly 0.
invalidateVenueCache doesn't clear localStorage: Only clears React state. The old cache survives and gets restored on next mount. Low risk since TTL catches it, but can cause confusion during debugging.
In-progress lock race: Two tabs can both read inProgress=false, both write true, and both trigger the Venues API. The 5-min timeout mitigates this, and the server-side dedup prevents duplicate venues, but it wastes an API call.
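
The zero-coordinate edge case comes down to JavaScript truthiness. The snippet below contrasts the truthiness guard with a `Number.isFinite` check; both function names are hypothetical and shown only to demonstrate the difference.

```javascript
// Truthiness guard: rejects 0 even though it is a valid coordinate.
function hasCoordsTruthy(lat, lng) {
  return Boolean(lat && lng)
}

// Finite-number guard: accepts 0, rejects null, undefined, and NaN.
function hasCoordsFinite(lat, lng) {
  return Number.isFinite(lat) && Number.isFinite(lng)
}

console.log(hasCoordsTruthy(0, 36.8)) // false (equator treated as missing)
console.log(hasCoordsFinite(0, 36.8)) // true
console.log(hasCoordsFinite(undefined, 36.8)) // false
```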

Real-Time Updates (subscribeToVenueUpdates)

Venues subscribe to Firestore onSnapshot listeners for real-time lantern count changes. The implementation uses batched where(documentId(), 'in', ids) queries (max 30 per batch) instead of individual document listeners.
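
The batching itself can be sketched independently of Firestore. `chunkIds` is a hypothetical helper; each resulting batch would back one `in` query and one `onSnapshot` listener.

```javascript
const FIRESTORE_IN_QUERY_LIMIT = 30 // max ids per batch, as documented above

// Split venue ids into batches that each fit a single 'in' query.
function chunkIds(ids, size = FIRESTORE_IN_QUERY_LIMIT) {
  const batches = []
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size))
  }
  return batches
}
```

With 65 venue ids this yields three batches (30, 30, and 5), so three listeners instead of 65 individual document listeners.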

skipInitialSnapshot option:

true (default): The first snapshot from each batch is ignored. Use when the caller just fetched fresh data from Firestore.
false: The first snapshot is processed. Use when the caller's data is from cache (localStorage) and may have stale lantern counts.

subscribeToVenueUpdates(venueIds, onUpdate, { skipInitialSnapshot: false })
