NWS Cloud Anomaly

Raw weather observations don't answer the question a solar trader or grid operator actually has: how far is this from what we expected? We score each new METAR reading as a residual (observed cloud cover minus the station's hour-of-day baseline), then feed the residuals into a T-Digest streaming quantile sketch, entirely in your browser. Anomalous residuals flag themselves, and every observation emits a JSONFeed item with an _nws extension ready for a machine consumer.

Honest-about-the-baseline: a production anomaly detector would compare observed − forecast. The NWS doesn't serve a historical forecast archive we can diff against, so this demo uses each station's hour-of-day median across a rolling 7-day window as a proxy baseline, and falls back to the station's overall daily median for hour-buckets too sparse to trust (< 3 samples). The _nws.baseline_source field in every feed item states which one was used, so downstream consumers know what they're diffing.
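A minimal sketch of that baseline rule, for illustration only. The `Observation` shape, the `baselineFor` name, and the reading of "overall daily median" as the median over all of the station's samples in the window are assumptions; only the two `baseline_source` values and the 3-sample threshold come from the text above.

```typescript
// Hypothetical observation shape; the demo's real field names are not shown here.
interface Observation {
  observedAt: string;   // ISO 8601 timestamp in UTC
  cloudPercent: number; // observed cloud cover, 0..100
}

type BaselineSource = "hour_of_day_median_24h" | "daily_median_24h";

interface Baseline {
  percent: number;
  source: BaselineSource;
}

// Hour buckets with fewer samples than this are treated as too sparse to trust.
const MIN_SAMPLES = 3;

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Baseline for one station at one UTC hour, from that station's history window.
function baselineFor(history: Observation[], utcHour: number): Baseline {
  if (history.length === 0) {
    throw new Error("cannot compute a baseline from an empty history");
  }
  const bucket = history
    .filter((o) => new Date(o.observedAt).getUTCHours() === utcHour)
    .map((o) => o.cloudPercent);

  if (bucket.length >= MIN_SAMPLES) {
    return { percent: median(bucket), source: "hour_of_day_median_24h" };
  }
  // Assumed fallback: median over every sample the station has in the window.
  return {
    percent: median(history.map((o) => o.cloudPercent)),
    source: "daily_median_24h",
  };
}
```

Returning the source alongside the value keeps the fallback visible to whoever consumes the number, which is the same reason the feed carries `baseline_source`.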

Discoverable feed: this page advertises /feed.json (and an anomalies-only variant) via <link rel="alternate" type="application/feed+json"> in the page head. Point a JSONFeed reader, a discover job, or feed.works itself at this URL and you get the same scored items the UI shows, because the server uses the exact same scoring module the browser does.
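As a rough illustration of what a machine consumer might do with the feed once it has been fetched and parsed, the sketch below filters a JSON Feed document down to its flagged items. The `anomalousItems` helper and the trimmed-down types are hypothetical; the version-prefix check relies on the JSON Feed spec's version URLs, and the `_nws.anomaly` flag is the one described on this page.

```typescript
// Only the parts of the _nws extension this sketch reads.
interface NwsFlag {
  station_id: string;
  anomaly: boolean;
}

// Minimal slice of a JSON Feed item; extensions are keys that start with "_".
interface FeedItem {
  id: string;
  _nws?: NwsFlag;
}

interface JsonFeed {
  version: string;
  title: string;
  items: FeedItem[];
}

// Hypothetical helper: keep only items whose _nws extension is flagged anomalous.
function anomalousItems(feed: JsonFeed): FeedItem[] {
  if (!feed.version.startsWith("https://jsonfeed.org/version/")) {
    throw new Error("not a JSON Feed document");
  }
  // Items without an _nws extension are skipped rather than treated as errors.
  return feed.items.filter((item) => item._nws?.anomaly === true);
}
```

In a browser or a recent Node runtime, the input could come from something like `await (await fetch("/feed.json")).json()`. The anomalies-only variant would make the filter unnecessary, but its URL is not given here.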

How this works

  • Our nightly Lambda fetches the last 7 days of METAR observations for ~20 curated stations from api.weather.gov and writes a single JSON blob to S3 + CloudFront.
  • Your browser downloads that blob (~50 KB gzipped, cached).
  • Per station, we compute an hour-of-day baseline (median cloud cover for each UTC hour of day) and build a T-Digest of residuals (observed − baseline). It is a streaming quantile sketch that runs entirely in-browser, with no model download.
  • Every 5 seconds, the next observation is replayed. We score its residual against the station's digest and emit a JSONFeed item carrying the observation, the baseline, the residual, and the percentile, all under the _nws extension.
  • The server-side /feed.json route shares the same scoring module with the island, so the discoverable feed and the live UI agree about what's an anomaly.
  • Turn on webhook-echo and every anomaly is POSTed to the shared /api/demos/webhook-echo sink, in exactly the shape a real integrator would see.
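The scoring step in the list above can be sketched roughly as follows. This is not the demo's scoring module: it swaps the T-Digest for an exact sorted array (`ExactQuantiles`, a hypothetical stand-in that answers the same "what fraction of residuals are at or below this value" question without the bounded-memory approximation), and the two-sided 0.95 tail threshold is an assumption, since the page does not state the cutoff.

```typescript
// Exact stand-in for a streaming quantile sketch: keeps every residual, sorted.
// A T-Digest would answer cdf() approximately while using bounded memory.
class ExactQuantiles {
  private sorted: number[] = [];

  // Index of the first stored value strictly greater than `value`.
  private upperBound(value: number): number {
    let lo = 0;
    let hi = this.sorted.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.sorted[mid] <= value) lo = mid + 1;
      else hi = mid;
    }
    return lo;
  }

  add(value: number): void {
    this.sorted.splice(this.upperBound(value), 0, value);
  }

  // Fraction of stored residuals that are <= value (0 when nothing is stored).
  cdf(value: number): number {
    if (this.sorted.length === 0) return 0;
    return this.upperBound(value) / this.sorted.length;
  }
}

interface Score {
  residualPercent: number;
  percentile: number;
  anomaly: boolean;
}

// Assumed rule: flag residuals in either 5% tail of the station's history.
function scoreObservation(
  observedPercent: number,
  baselinePercent: number,
  residuals: ExactQuantiles,
  tail = 0.95,
): Score {
  const residualPercent = observedPercent - baselinePercent;
  const percentile = residuals.cdf(residualPercent);
  const anomaly = percentile >= tail || percentile <= 1 - tail;
  return { residualPercent, percentile, anomaly };
}
```

Whether the new residual is added to the digest before or after it is scored is a design choice the page does not spell out; this sketch scores against history only and leaves the `add` call to the caller.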

What's in the feed

Every item's _nws extension is self-contained, so a consumer doesn't have to re-fetch the baseline or re-run scoring:

{
  "_nws": {
    "station_id": "KPHX",
    "observed_at": "2026-04-20T15:53:00Z",
    "raw": "METAR KPHX 201553Z ...",
    "observed_percent": 87.5,
    "baseline_percent": 37.5,
    "baseline_source": "hour_of_day_median_24h",
    "residual_percent": 50.0,
    "percentile": 0.97,
    "anomaly_score": 1.88,
    "anomaly": true
  }
}

Consumers are free to use the precomputed residual, to subtract the baseline themselves, or to score the observation against their own baseline model. All three work from this one feed.
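The three options might look like this on the consumer side. The `NwsExtension` type is inferred from the sample item above rather than taken from a published schema, and the three function names are hypothetical.

```typescript
// Field names and types inferred from the sample _nws object shown above.
interface NwsExtension {
  station_id: string;
  observed_at: string;
  raw: string;
  observed_percent: number;
  baseline_percent: number;
  baseline_source: string;
  residual_percent: number;
  percentile: number;
  anomaly_score: number;
  anomaly: boolean;
}

// Option 1: trust the residual the feed already computed.
function precomputedResidual(nws: NwsExtension): number {
  return nws.residual_percent;
}

// Option 2: redo the subtraction from the observation and the feed's baseline.
function recomputedResidual(nws: NwsExtension): number {
  return nws.observed_percent - nws.baseline_percent;
}

// Option 3: ignore the feed's baseline and diff against your own model's value.
function residualAgainst(nws: NwsExtension, ownBaselinePercent: number): number {
  return nws.observed_percent - ownBaselinePercent;
}
```

For the sample item, the first two functions both return 50, and `residualAgainst(item, 60)` returns 27.5.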

Corpus: 7-day rolling window across desert-southwest / central-valley / pacific-northwest / mountain-west / midwest / northeast / southeast stations. Refresh: daily at 04:15 UTC. The _nws.baseline_source field is hour_of_day_median_24h when the station's bucket for that UTC hour holds enough samples (≥ 3) and falls back to daily_median_24h otherwise, so sparse stations still get a sensible baseline.