Skip to content

BigQuery Billing โ€” Phase 2: Normalize + Attribute (Cloudflare) โ€‹

Last Updated: 2026-05-10 Status: Planning checkpoint (pre-implementation) Branch: claude/bigquery-billing-phase-2-v3K7yParent plan: docs/plans/2026-05-05_bigquery-billing-infrastructure.md ยง6 Phase 2 Phase 1 reference: PR #462 (scaffold) โ†’ PR #465 (flesh-out); same pattern intended here.


1. Goal โ€‹

Stand up the Dataform transform layer for the BigQuery billing pipeline, end-to-end for one vendor (Cloudflare). After Phase 2:

  • billing_raw.cloudflare_invoice_line_items_raw (Phase 1 output) is normalized into a vendor-agnostic billing_norm.fact_cost_line_items daily.
  • Rules in billing_attrib.dim_cost_allocation_rules join fact_cost_line_items against dim_taxonomy to produce billing_attrib.fact_cost_attributed.
  • billing_marts.unattributed_cost_daily (view) surfaces attribution coverage.

Categories enabled per parent ยง4: 2.1 (cost-by-service / SKU), 2.2 (cost-by-resource), 3.1 (cost-by-environment), 3.2 (cost-by-app), 5.4 (attribution coverage). All for Cloudflare only.

2. Done when โ€‹

sql
-- 1. Cloudflare attributed totals match raw within 1%, sliced by env + app:
WITH raw AS (
  SELECT snapshot_date, SUM(amount_usd) AS raw_total
  FROM `lantern-app-dev.billing_raw.cloudflare_invoice_line_items_raw`
  WHERE snapshot_date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) AND CURRENT_DATE()
  GROUP BY snapshot_date
),
attr AS (
  SELECT DATE(usage_start) AS snapshot_date, env, app,
         SUM(cost_usd) AS attr_total
  FROM `lantern-app-dev.billing_attrib.fact_cost_attributed`
  WHERE vendor = 'cloudflare'
    AND DATE(usage_start) BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) AND CURRENT_DATE()
  GROUP BY snapshot_date, env, app
)
SELECT raw.snapshot_date, raw.raw_total,
       SUM(attr.attr_total) AS attributed_total,
       ABS(raw.raw_total - SUM(attr.attr_total)) / NULLIF(raw.raw_total, 0) AS delta_pct
FROM raw LEFT JOIN attr USING (snapshot_date)
GROUP BY raw.snapshot_date, raw.raw_total
ORDER BY raw.snapshot_date DESC;
-- All rows: delta_pct < 0.01

-- 2. Unattributed coverage check:
SELECT * FROM `lantern-app-dev.billing_marts.unattributed_cost_daily`
WHERE snapshot_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY);
-- All rows: unattributed_pct < 5%

3. Locked decisions โ€‹

RefDecisionSource
D1Dataform (not dbt)Parent ยง8
D6Taxonomy: env + app + servicePhase 2 kickoff (2026-05-10)
D8Omit shared-resource attribution columns until Phase 4-5Phase 2 kickoff (2026-05-10)
P2-ARepo: in-tree at services/dataform/, mirrored to GCP-side lantern-billing-transformsPhase 2 kickoff
P2-BSchedule: Dataform workflow_config, daily 03:30 PT (30 min after ingest-cloudflare)Phase 2 kickoff
P2-CCI: dataform compile only (no live BQ dry-run) on every PRPhase 2 kickoff

4. Repo layout โ€‹

services/dataform/
โ”œโ”€โ”€ workflow_settings.yaml       # defaultProject, defaultLocation, defaultDataset
โ”œโ”€โ”€ package.json                 # @dataform/core pin
โ”œโ”€โ”€ definitions/
โ”‚   โ”œโ”€โ”€ sources/
โ”‚   โ”‚   โ””โ”€โ”€ cloudflare_raw.sqlx  # declares billing_raw.cloudflare_invoice_line_items_raw
โ”‚   โ”œโ”€โ”€ norm/
โ”‚   โ”‚   โ””โ”€โ”€ fact_cost_line_items.sqlx           # incremental, partitioned, clustered
โ”‚   โ”œโ”€โ”€ attrib/
โ”‚   โ”‚   โ”œโ”€โ”€ dim_taxonomy.sqlx                   # static seed table
โ”‚   โ”‚   โ”œโ”€โ”€ dim_cost_allocation_rules.sqlx      # static seed table
โ”‚   โ”‚   โ””โ”€โ”€ fact_cost_attributed.sqlx           # incremental
โ”‚   โ””โ”€โ”€ marts/
โ”‚       โ””โ”€โ”€ unattributed_cost_daily.sqlx        # view
โ”œโ”€โ”€ includes/
โ”‚   โ”œโ”€โ”€ constants.js             # dataset names, partition keys
โ”‚   โ”œโ”€โ”€ taxonomy_seed.js         # initial dim_taxonomy rows
โ”‚   โ””โ”€โ”€ allocation_rules_seed.js # initial dim_cost_allocation_rules rows
โ””โ”€โ”€ README.md

workflow_settings.yaml keyed off ${dataformCore.envOverrides} so the same compiled artifact targets lantern-app-dev or lantern-app-prod based on the workflow that invokes it.

5. Schemas โ€‹

5.1 billing_norm.fact_cost_line_items โ€‹

Vendor-agnostic. One row per billable cost event.

ColumnTypeNotes
vendorSTRING REQUIRED'cloudflare' in Phase 2; 'gcp', 'railway', 'anthropic', 'resend', 'github' in Phase 4
serviceSTRING NULLABLEVendor's product/service name (e.g. 'workers', 'r2', 'zone_plan')
skuSTRING NULLABLEVendor SKU / line-item code
usage_amountFLOAT NULLABLEQuantity used
usage_unitSTRING NULLABLEe.g. 'requests', 'GB', 'hours'
cost_usdFLOAT REQUIREDCost in USD
usage_startTIMESTAMP REQUIREDPeriod start; falls back to snapshot_date if vendor doesn't provide
usage_endTIMESTAMP NULLABLEPeriod end
resource_idSTRING NULLABLEVendor-shaped resource identifier โ€” for Cloudflare, zone_id; for GCP, project/instance
labelsJSON NULLABLEVendor-specific extras (e.g. Cloudflare account_id, raw description)
source_line_item_idSTRING REQUIREDForeign key into billing_raw.* for audit
loaded_atTIMESTAMP REQUIREDDataform run timestamp

Partition: DAY on DATE(usage_start). Cluster: (vendor, service, resource_id). Incremental: WHERE DATE(usage_start) >= DATE_SUB(CURRENT_DATE(), INTERVAL 2 DAY) to handle late-arriving raw rows.

5.2 billing_attrib.dim_taxonomy โ€‹

Static reference. Three dimensions per D6.

ColumnTypeExample
envSTRING REQUIRED'dev', 'prod', 'shared'
appSTRING REQUIRED'web', 'api', 'admin', 'functions', 'merchant-portal', 'analytics', 'unknown'
serviceSTRING REQUIREDFree-form vendor service name; '*' matches any
descriptionSTRING NULLABLEHuman-readable for review tooling

Seeded by includes/taxonomy_seed.js. ~15-20 rows for the cartesian we care about. (Allocation rules โ€” ยง5.3 โ€” are seeded separately from includes/allocation_rules_seed.js; one seed file per dim table.)

5.3 billing_attrib.dim_cost_allocation_rules โ€‹

Static. Match rules from raw fields to taxonomy.

ColumnTypeNotes
rule_idSTRING REQUIREDStable identifier for tooling
priorityINTEGER REQUIREDLower wins ties; default catch-alls have highest priority numbers
vendorSTRING REQUIREDScope rule to a vendor
match_fieldSTRING REQUIREDOne of 'resource_id', 'service', 'sku', or 'labels.<key>' (e.g. 'labels.account_id' for Cloudflare, 'labels.gcp_project_id' for GCP). The labels.<key> form is resolved with JSON_VALUE(labels, '$.' || <key>) at evaluation time.
match_patternSTRING REQUIREDExact match for v1; SQL LIKE pattern allowed ('%' wildcard)
assigned_envSTRING NULLABLENULL means "don't override"
assigned_appSTRING NULLABLENULL means "don't override"
assigned_serviceSTRING NULLABLENULL โ†’ use vendor's service field
effective_fromDATE REQUIREDFor backfill correctness
effective_toDATE NULLABLENULL = open-ended

Initial rules (per parent ยง6 Phase 2). Columns: vendor, match_field, match_pattern, assignment, priority.

gcp,        labels.gcp_project_id, %-dev,                       env=dev          (priority 100)
gcp,        labels.gcp_project_id, %-prod,                      env=prod         (priority 100)
cloudflare, resource_id,           <ZONE_LANTERN_APP_COM>,      app=web          (priority 50)
cloudflare, resource_id,           <ZONE_API_LANTERN_APP_COM>,  app=api          (priority 50)
cloudflare, service,               zone_plan,                   service=zone_plan(priority 200)
cloudflare, *,                     %,                           app=unknown,env=shared (priority 999, catch-all)

Phase 2 seeds Cloudflare rules only. GCP rules become live when GCP ingest lands (Phase 4) but the rule shape stays.

5.4 billing_attrib.fact_cost_attributed โ€‹

Same shape as fact_cost_line_items plus three attribution columns:

Added ColumnTypeNotes
envSTRING REQUIREDResolved from rules
appSTRING REQUIREDResolved from rules
attributed_serviceSTRING NULLABLEResolved (rules can override raw service)
matched_rule_idSTRING NULLABLENULL when only the catch-all hit โ€” drives ยง5 unattributed marts

No attribution_method / attribution_weight columns (D8 โ€” deferred).

Partition + cluster: same as fact_cost_line_items.

5.5 billing_marts.unattributed_cost_daily (view) โ€‹

sql
SELECT
  DATE(usage_start) AS snapshot_date,
  vendor,
  SUM(cost_usd) AS total_cost_usd,
  SUM(IF(matched_rule_id IS NULL OR app = 'unknown', cost_usd, 0)) AS unattributed_cost_usd,
  SAFE_DIVIDE(
    SUM(IF(matched_rule_id IS NULL OR app = 'unknown', cost_usd, 0)),
    SUM(cost_usd)
  ) AS unattributed_pct
FROM ${ref('fact_cost_attributed')}
GROUP BY snapshot_date, vendor;

Sources Category 5.4 in parent ยง4.

6. Cloudflare normalization mapping โ€‹

billing_raw.cloudflare_invoice_line_items_raw โ†’ fact_cost_line_items:

Raw columnNorm columnTransform
line_item_idsource_line_item_iddirect
serviceservicedirect (NULL โ†’ 'unknown')
skuskudirect
quantityusage_amountdirect
unitusage_unitdirect
amount_usdcost_usddirect
period_startusage_startCOALESCE(period_start, TIMESTAMP(snapshot_date))
period_endusage_enddirect
zone_idresource_iddirect
account_id, description, raw_payloadlabelsJSON_OBJECT('account_id', account_id, 'description', description, 'raw_payload', raw_payload) โ€” produces a JSON value matching the column type. raw_payload is already JSON in the raw schema (see services/jobs/ingest-cloudflare/src/schemas.js), so it embeds without stringification.
(constant)vendor'cloudflare'
(Dataform)loaded_atCURRENT_TIMESTAMP()

Phase 1 finding (parent handoff): the /accounts/{id}/subscriptions endpoint returns two $0-cost free-tier subscriptions and misses zone-plan cost unless the fallback fires. Out of scope here โ€” fix in Phase 4 retro by always emitting zone-plan rows alongside subscriptions in cloudflareClient.js. Phase 2 normalization works correctly either way.

7. Deploy workflow changes โ€‹

Mirror Phase 1's deploy-ingest-cloudflare-job step in both deploy-dev.yml and deploy-prod.yml. New job: deploy-dataform-billing-transforms. The deploy workflow ensures BigQuery datasets exist defensively (bq show then bq mk if missing) and provisions the Dataform repo + workflow config. It does not grant IAM โ€” see PR #474 for the manual-grant convention. Before this job's first run, a project owner adds roles/dataform.editor (project) and roles/bigquery.dataEditor on billing_norm, billing_attrib, billing_marts to the WIF SA via the loop documented in services/jobs/ingest-cloudflare/README.md.

yaml
deploy-dataform-billing-transforms:
  needs: [check-ci-success, changes]
  if: needs.changes.outputs.dataform == 'true' || needs.changes.outputs.shared-deps == 'true'
  env:
    PROJECT_ID: lantern-app-dev
    REGION: us-central1
    DATAFORM_REPO: lantern-billing-transforms
    WORKFLOW_NAME: billing-daily
    SCHEDULE: '30 3 * * *'
    SCHEDULE_TZ: America/Los_Angeles
  steps:
    # ... auth + setup ...

    - name: Ensure BigQuery datasets exist
      run: |
        for ds in billing_norm billing_attrib billing_marts; do
          # bq show first; bq mk only when missing โ€” same pattern as Phase 1
        done

    # No IAM step here โ€” bindings are granted manually per
    # services/jobs/ingest-cloudflare/README.md. New roles for Phase 2:
    #   roles/dataform.editor (project)
    #   roles/bigquery.dataEditor on billing_norm, billing_attrib, billing_marts
    # Append to the loop in that README and re-run.

    - name: Ensure Dataform repository
      run: |
        if ! gcloud dataform repositories describe $DATAFORM_REPO \
           --region=$REGION --project=$PROJECT_ID > /dev/null 2>&1; then
          gcloud dataform repositories create $DATAFORM_REPO \
            --region=$REGION --project=$PROJECT_ID \
            --git-remote-settings-url=https://github.com/cattreedev/lantern_app \
            --git-remote-settings-default-branch=dev \
            --git-remote-settings-token-secret-version=projects/$PROJECT_ID/secrets/GITHUB_DATAFORM_TOKEN/versions/latest
        fi

    - name: Ensure release configuration
      run: |
        # release_config = "billing-dev" tracking branch dev,
        # compile_config.default_database = lantern-app-dev

    - name: Ensure workflow configuration
      run: |
        # workflow_config = "billing-daily" referencing the release_config,
        # cron_schedule = SCHEDULE, time_zone = SCHEDULE_TZ

changes.outputs.dataform filter watches services/dataform/** (add to the existing dorny/paths-filter block).

8. CI changes (.github/workflows/ci.yml) โ€‹

New dataform-compile job:

yaml
dataform-compile:
  if: <changes in services/dataform/**>
  steps:
    - uses: actions/checkout@v6
    - uses: actions/setup-node@v6
      with: { node-version: 22, cache: npm, cache-dependency-path: services/dataform/package-lock.json }
    - run: npm ci
      working-directory: services/dataform
    - run: npx @dataform/cli compile
      working-directory: services/dataform

No GCP credentials. ~10s. Catches SQL syntax + broken ${ref(...)}.

9. Out of scope (Phase 4+) โ€‹

  • Other vendors (Railway, Anthropic, Resend, GitHub, GCP). Each adds a definitions/<vendor>/ directory and rules but reuses the fact_cost_line_items / fact_cost_attributed schema.
  • Phase 1 Cloudflare zone-plan fix (cloudflareClient.js).
  • Cat 4 unit-economics views and Cat 5.1 anomaly detection.
  • attribution_method / attribution_weight columns (D8).
  • Team / feature taxonomy dimensions (D6).
  • Phase 3 marts (daily_cost_by_app_env_vendor, top_cost_drivers_30d, etc.) and admin endpoints.

10. Open follow-ups (non-blocking, from handoff) โ€‹

  • [x] Close PR #302 (superseded by Phase 1 PR #465) โ€” already closed; courtesy supersession comment posted.
  • [x] Rename issue #240 โ€” was "OpenAI billing tracking", now "Add Anthropic billing to Admin Portal".
  • [x] Drop or simplify the "Apply defensive IAM bindings (best-effort)" step in deploy-{dev,prod}.yml โ€” handled by PR #474 (deleted; manual-grant convention documented in services/jobs/ingest-cloudflare/README.md).

11. Suggested implementation PR shape โ€‹

Mirror Phase 1's #462 scaffold โ†’ #465 flesh-out split:

  1. Scaffold PR (~200 LOC): services/dataform/ skeleton, workflow_settings.yaml, source declaration, empty fact_cost_line_items.sqlx, CI compile job. No deploy-workflow changes yet.
  2. Flesh-out PR (~400-500 LOC): Full normalize + attribute + marts SQLX, allocation-rules seed, deploy-workflow deploy-dataform-billing-transforms job, README. Done-when queries from ยง2 above run green against lantern-app-dev.

Both PRs base on dev.

Built with VitePress