The silent killer
The worst field bug is not a crash — it is a device running last month's firmware while you debug this month's config. It reports healthy, logs flow, and the feature you shipped simply is not there. Fleet operations die from this quietly.
Rule 1: manifest before flash
Keep a manifest listing, per device role, the markers its current firmware must contain — key sensor fields, message types, buffer behaviors. Before every flash, mechanically verify the YAML you are about to burn:
# example: sentinel firmware must ship these markers
grep -cF "mag_h" sentinel.yaml           # expect ≥ 1
grep -cF "type:heartbeat" sentinel.yaml  # expect ≥ 1
A clean git status is not proof the file is current: the file can be committed and still be the wrong variant. Check content, not cleanliness.
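The per-marker greps can be folded into one check that fails loudly. This is a minimal sketch, not a prescribed tool: it assumes a hypothetical manifest layout of one plain-text file per device role, one marker per line, with `#` for comments. The function name `check_markers` and the file names are illustrative.

```sh
# check_markers MANIFEST CONFIG
# Succeeds only if every marker listed in MANIFEST appears in CONFIG.
# Assumed manifest format (hypothetical): one literal marker per line,
# blank lines and lines starting with '#' are ignored.
check_markers() {
    manifest=$1
    config=$2
    missing=0
    while IFS= read -r marker || [ -n "$marker" ]; do
        case $marker in
            ''|'#'*) continue ;;   # skip blanks and comments
        esac
        # -F: match the marker as a fixed string, not a regex
        if ! grep -qF -- "$marker" "$config"; then
            echo "MISSING in $config: $marker" >&2
            missing=$((missing + 1))
        fi
    done < "$manifest"
    [ "$missing" -eq 0 ]
}
```

Called as `check_markers sentinel.markers sentinel.yaml && echo "ok to flash"`, it refuses to pass a config that lacks any listed marker, which is the mistake a clean working tree cannot catch.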
Rule 2: log every flash
One line per flash in a FLASH_LOG: date, device, config file, firmware variant, who, why. When a site misbehaves three weeks later, this log answers "what is actually running out there?" in seconds. Register the flash in your Latent device entry too — the registry is your fleet's memory.
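A log only helps if writing to it costs nothing. The sketch below appends one tab-separated line per flash with the fields listed above. The function name `log_flash`, the `FLASH_LOG` environment variable, and the field order are assumptions for illustration, and it does not touch the registry entry.

```sh
# log_flash DEVICE CONFIG VARIANT WHO WHY
# Appends one tab-separated line (UTC timestamp first) to the flash log.
# The log path comes from $FLASH_LOG and defaults to ./FLASH_LOG
# (an assumed convention, not a standard).
log_flash() {
    if [ "$#" -ne 5 ]; then
        echo "usage: log_flash DEVICE CONFIG VARIANT WHO WHY" >&2
        return 2
    fi
    log=${FLASH_LOG:-FLASH_LOG}
    printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" "$3" "$4" "$5" >> "$log"
}
```

A call might look like `log_flash sentinel-07 sentinel.yaml sentinel-v2 "$USER" "add mag_h field"`, where the device and variant names are made up. Three weeks later, `grep sentinel-07 FLASH_LOG` answers what that device is running.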
Rule 3: own the serial port
Flashing fails mid-write if another process grabs the port — background relay daemons and forgotten serial monitors are the usual thieves. Before field flashing, stop anything that touches USB serial. A flash interrupted at 60% can leave a device boot-looping in a ceiling you need a ladder to reach.
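One way to catch a port thief before it strikes is to ask the OS who has the device file open. This sketch assumes a Linux or macOS machine with `lsof` installed. The function name `port_is_free` and the example port path are illustrative, and it only reports holders rather than stopping them.

```sh
# port_is_free PORT
# Returns 0 if no process has PORT open, 1 if something holds it,
# 2 if the check could not be performed.
port_is_free() {
    port=$1
    if [ ! -e "$port" ]; then
        echo "no such port: $port" >&2
        return 2
    fi
    if ! command -v lsof >/dev/null 2>&1; then
        echo "lsof not found; cannot check $port" >&2
        return 2
    fi
    # lsof lists each process holding the file and exits 0 if it found any;
    # it exits non-zero when nothing has the file open.
    if lsof "$port" 2>/dev/null; then
        echo "BUSY: stop the processes listed above before flashing" >&2
        return 1
    fi
    return 0
}
```

Run it as `port_is_free /dev/ttyUSB0 || exit 1` at the top of a flash script (the port path varies by machine). The check is a snapshot, so a daemon that reopens the port on a timer can still slip in; stopping the daemon remains the real fix.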
Rule 4: verify after
After flashing: watch one full boot in logs, confirm the markers behave (heartbeat arrives, sensor field present), then pack the ladder. Sixty seconds of watching saves a return visit.
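The watching can be made mechanical too. The sketch below reads a log stream on standard input and succeeds as soon as every expected marker has appeared. The function name `wait_for_markers` is hypothetical, and how you stream logs from the device is left to your tooling.

```sh
# some_log_command | wait_for_markers MARKER...
# Reads log lines from stdin; returns 0 once every marker has been seen,
# or 1 if the stream ends with markers still missing.
wait_for_markers() {
    # The positional parameters hold the markers not yet seen.
    while [ "$#" -gt 0 ] && IFS= read -r line; do
        n=$#
        while [ "$n" -gt 0 ]; do
            marker=$1
            shift
            case $line in
                *"$marker"*) echo "seen: $marker" ;;
                *) set -- "$@" "$marker" ;;   # still pending, keep it
            esac
            n=$((n - 1))
        done
    done
    if [ "$#" -eq 0 ]; then
        return 0
    fi
    echo "still missing when the log ended: $*" >&2
    return 1
}
```

Wrapping the log source in GNU `timeout`, as in `timeout 60 your-log-command | wait_for_markers "type:heartbeat" "mag_h"` (with `your-log-command` standing in for whatever streams your device's logs), turns the sixty-second watch into a pass or fail result before the ladder goes back in the van.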