Author: Mark Ludwikowski <markl02us@yahoo.com> · INTERNAL / PREVIEW: ADRIZ self-assessment for INGV review.

Executive Summary — ADRIZ Etna Wildfire / Volcanic Visual Monitoring System

For: INGV decision-makers · Author: Mark Ludwikowski · Frozen baseline: commit 4d1b0cc (master) · Verified: 2026-06-29 UTC

What the system is

ADRIZ is a public-data-driven visual monitoring system for Mount Etna and the surrounding Sicilian flanks. It addresses the defining difficulty of this environment: vegetated, populated flanks that genuinely burn sit directly beneath an active volcano whose degassing, ash, lava glow, and strombolian activity are constant visual confounders. The system is built entirely from public feeds and low-cost commodity compute — no proprietary satellites, no edge or specialist hardware assumed.

It is a four-layer pipeline:

Camera wall — 5 configured public camera sources (the INGV EtnaTVChn mosaic, three Windy webcams, and the EtnaWalk stream), processed on a 75-second model-resident cycle, with honest per-source online/offline health reporting.
Frozen detector — a YOLO11s object detector (weights sha256 c0a3d0ea…) generates per-frame candidate boxes.
Crop-level Qwen3-VL semantic veto (qwen3-vl-32b, temperature 0.0) — invoked only on detector-routed hot/bright crops, asking WILDFIRE / VOLCANIC / NEITHER. A durable night-safety override makes it recall-safe at night.
Multi-feed corroboration + 64-feed data board — independent public satellite/weather/geospatial feeds (NASA FIRMS, Sentinel-3 SLSTR, MTG-FCI, MSG-SEVIRI, Sentinel-5P TROPOMI SO₂, CAMS) place each candidate in context, with operational staleness classification and a bilingual (Italian/English) dashboard.
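The routing and veto behaviour of layer 3 can be sketched as follows. This is a minimal illustration, not the deployed code: the `Crop` fields, the `decide` function, and the `"alarm"` / `"vetoed"` outcome strings are hypothetical names. Only the three VLM labels, the `uncertain_night` outcome, and the rule itself (at night, a VOLCANIC verdict does not veto unless independent on-crater satellite corroboration is present) come from this summary.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class VlmLabel(Enum):
    WILDFIRE = "WILDFIRE"
    VOLCANIC = "VOLCANIC"
    NEITHER = "NEITHER"


@dataclass(frozen=True)
class Crop:
    # All three field names are assumptions made for this sketch.
    hot_or_bright: bool            # detector routed this crop to the VLM
    is_night: bool                 # frame captured at night
    crater_sat_corroborated: bool  # independent on-crater satellite evidence


def decide(crop: Crop, ask_vlm: Callable[[Crop], VlmLabel]) -> str:
    """Return "alarm", "vetoed", or "uncertain_night" for one detector candidate."""
    # Quiet crops never reach the VLM; the detector candidate passes through.
    if not crop.hot_or_bright:
        return "alarm"

    label = ask_vlm(crop)
    if label is VlmLabel.WILDFIRE:
        return "alarm"

    # Night-safety override: withhold the volcanic veto unless satellites agree.
    if (label is VlmLabel.VOLCANIC and crop.is_night
            and not crop.crater_sat_corroborated):
        return "uncertain_night"

    # Daytime VOLCANIC, corroborated night VOLCANIC, or NEITHER: veto.
    return "vetoed"
```

Because the VLM is passed in as a callable, the sketch makes the cost property visible: on a crop that is not hot or bright, `ask_vlm` is never invoked.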

Headline numbers (held-out evaluation — research-grade, small-sample)

On a held-out, leakage-controlled evaluation (these are offline benchmark numbers, not a live alert-dispatch measurement):

Volcanic false-alarm rate: 8.1% (5 of 62 volcanic frames), 95% CI [3.5–17.5%], versus 9.7% (6 of 62) for the detector alone. The five survivors are summit-degassing plumes misread as smoke (passed through by design to protect wildfire recall).
Daytime wildfire recall: 94.4% (17 of 18), 95% CI [74.2–99.0%] — the one loss is a borderline town/ground-lights frame the VLM correctly rejected, not a missed fire.
Night-safe. The VLM originally vetoed 2 true nighttime vegetation fires as "lava" (single-frame flame and lava are near-identical). A durable night-safety guard now withholds that volcanic veto unless independent on-crater satellite corroboration is present, surfacing the alarm as uncertain_night instead of dropping it. This takes the night silent false-negative count from 2 to 0, with the daytime numbers and the volcanic false-alarm rate unchanged.
Honest verdict on the VLM. Its measured effect was strictly one-directional (it only ever removes alarms; McNemar p=0.50, not significant at this sample size): on this set it removed exactly one sensor-artifact false alarm. It is an advisory precision layer made recall-safe at night — not a general early-smoke detector.
Cost / efficiency. The VLM is called only on hot/bright crops (~0.15 calls per frame, 0 on quiet frames); estimated all-in cost ~$5–15/month, $0 when quiet (scale-to-zero).
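The two confidence intervals above can be reproduced from the raw counts. The sketch below assumes the Wilson score interval; the summary does not name the method, but Wilson bounds at the 95% level match the reported figures for both 5 of 62 and 17 of 18. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (default z gives 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    for name, k, n in [("volcanic false alarms", 5, 62),
                       ("daytime wildfire recall", 17, 18)]:
        lo, hi = wilson_interval(k, n)
        print(f"{name}: {k}/{n} = {k / n:.1%}, 95% CI [{lo:.1%}, {hi:.1%}]")
```

The width of both intervals is driven almost entirely by n: at n=18, a single additional miss would move the recall point estimate by more than five percentage points.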

What is LIVE vs STAGED vs RESEARCH

LIVE_OPERATIONAL (deployed, running, current health evidence):
- The camera wall (75 s cadence): 3 of 5 sources online at verification (the INGV mosaic and EtnaWalk were OFFLINE; reported honestly, never faked).
- The detector + crop-level VLM veto, running in the live service.
- The 64-feed board: 46 live / 7 stale / 10 catalogued / 1 key-pending / 0 error, with a guard ensuring a degraded feed cannot masquerade as "live-with-zero" (live read: 23,379 roads / 341 rail; FWI 13.06 moderate).
- WF36 corroboration logic for the 5 rule-backed classes (wildfire smoke, wildfire flame, lava/incandescence, ash plume, steam/degassing).
- Bilingual dashboard (240/240 IT/EN key parity); hourly feed refresh; dedup/staleness guards.

STAGED_NOT_LIVE (built but not active in production):
- Automated alert email dispatch (gated off) and the human-in-loop review workflow; the dashboard is explicitly an internal/preview self-assessment tool ("Not a public product").
- 8 corroboration classes (cloud, glare, frozen-frame, fog/haze, industrial smoke, dust, camera artifact, unknown): staged targets, not claimed as detected or corroborated.

RESEARCH_ONLY (evaluated offline; not an operational claim):
- All the headline performance metrics above (held-out, small-n).
- The quantum evaluation: on the Etna volcanic-vs-wildfire task, simulated quantum classifiers were beaten by matched classical models with confidence intervals that exclude zero (a clean publishable negative). No quantum hardware was used; the one genuine novelty (volcanic source-inversion as a QUBO) is kept as a research line.
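The feed-board guard listed under LIVE_OPERATIONAL (a degraded feed must not appear as "live-with-zero") can be illustrated with a small classifier. This is a sketch under stated assumptions: the `FeedRead` fields, the 24-hour staleness threshold, and both function names are hypothetical, and the catalogued and key-pending states are treated as configuration states and not modelled. The tally at the end uses the counts reported at verification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeedRead:
    fetch_ok: bool               # upstream request and parse succeeded
    age_hours: Optional[float]   # age of the newest record; None if unknown
    record_count: Optional[int]  # number of records returned, if any


def classify(read: FeedRead, max_age_hours: float = 24.0) -> str:
    """Return "error", "stale", or "live" (threshold is an assumed value)."""
    if not read.fetch_ok or read.record_count is None:
        return "error"
    if read.age_hours is None or read.age_hours > max_age_hours:
        return "stale"
    return "live"


def usable_count(read: FeedRead) -> Optional[int]:
    """Expose a count only for live feeds; otherwise None, meaning no evidence.

    Returning None instead of 0 keeps a failed or stale feed from being
    read as a genuine "zero detections" observation.
    """
    return read.record_count if classify(read) == "live" else None


# Board tally reported at verification; the five states sum to 64 feeds.
BOARD_TALLY = {"live": 46, "stale": 7, "catalogued": 10, "key-pending": 1, "error": 0}
assert sum(BOARD_TALLY.values()) == 64
```

The design point is the distinction between `0` and `None`: a fresh, successful read that returns zero records is real evidence of absence, while a failed or stale read carries no evidence at all.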

Honest limitations (reviewer-facing)

Small samples: n=62 volcanic and n=18 daytime-recall frames give wide confidence intervals, so every point estimate should be read as directional rather than precise. The VLM's confirmed benefit (1 artifact false alarm removed) is within statistical noise; no system-level FP improvement is claimed.
The night-fire↔lava ambiguity is mitigated, not solved. The night guard prevents silent losses, but the underlying single-frame ambiguity remains; the durable fix (temporal persistence + hard satellite night-override) is future work.
Not instrumented: p90/p99 latency tails and frame-capture success rate are UNKNOWN — queued for instrumentation.
7 stale feeds at verification (upstream archive lags / backend outages) are honestly flagged and treated as no-evidence, never as live data.
Monitoring flag: transient TLS resets on the /api/cams endpoint (recovered on retry).
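The "within statistical noise" statement in the first limitation can be checked with an exact McNemar test on the discordant pairs. The sketch below assumes a single discordant pair (the one artifact false alarm the VLM removed, with none added); the summary does not state which pairing or test variant was used, so this is an assumption. Under it, the one-sided exact p-value is 0.5, which matches the reported p=0.50, and the two-sided value is 1.0.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> tuple[float, float]:
    """Exact binomial McNemar test on discordant counts b and c.

    Returns (one_sided, two_sided) p-values, where one_sided is the
    probability of a split at least as lopsided as observed, in the
    observed direction, under a fair-coin null.
    """
    n = b + c
    if n == 0:
        return 1.0, 1.0
    k = max(b, c)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return tail, min(1.0, 2 * tail)


if __name__ == "__main__":
    print(mcnemar_exact(1, 0))  # the assumed single discordant pair
    for n in range(1, 8):       # all discordant pairs in one direction
        print(n, mcnemar_exact(n, 0))
```

Even if every discordant pair favoured the VLM, the one-sided test would need at least 5 such pairs, and the two-sided test at least 6, to fall below 0.05. This gives a concrete target for the planned enlargement of the held-out sets.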

Next steps

0–2 weeks: instrument latency tails and frame-capture success rate; harden the camera-frame endpoint; add a frozen-frame guard.
1 month: enlarge the held-out sets to tighten the confidence intervals; run an adversarial hard-negative robustness battery.
3 months: domain adaptation across the distinct camera domains (the core next pillar); benchmark detector/segmentation challengers (RT-DETR, Grounding DINO, SAM 2) against the frozen baseline.
6 months: IR/thermal fusion (directly attacks the night lava-vs-fire ambiguity), a fine-tuned Etna wildfire/volcano VLM, and MTG-FCI event tracking (rate-of-spread, fire-arrival maps) — moving FCI from coverage-only context to genuine event tracking.

Bottom line

ADRIZ is a reproducible, evidence-backed, operationally honest architecture for multi-source wildfire/volcanic monitoring at Etna. On the held-out evaluation it lowers the volcanic false-alarm rate (9.7% to 8.1%, a difference within statistical noise at this sample size) while preserving daytime wildfire recall and remaining recall-safe at night, with every component classified by its true operational status and every limitation stated. It is not claimed to solve early wildfire detection generally; it is a defensible foundation and a clear roadmap toward an operational INGV decision-support tool.