Skip to content

Replace MetadataKeyStore with per-file Store (HA pattern)#509

Merged
bdraco merged 2 commits into
mainfrom
remote-build-pairings-per-file-storage
May 9, 2026
Merged

Replace MetadataKeyStore with per-file Store (HA pattern)#509
bdraco merged 2 commits into
mainfrom
remote-build-pairings-per-file-storage

Conversation

@bdraco

@bdraco bdraco commented May 9, 2026

Copy link
Copy Markdown
Member

What does this implement/fix?

Replaces helpers/metadata_store.py (vendored in #508) with
helpers/storage.py — a per-file Store modelled more faithfully
on Home Assistant's homeassistant.helpers.storage.Store.

MetadataKeyStore wrapped a sub-key of the shared
device-builder.json sidecar. Walking that back: switching to
HA's actual per-file model is cleaner —

  • Atomic writes per-domain. Corrupting the offloader pairings
    file can't take out the device list.
  • No lock contention between unrelated writers.
  • Independent backup / restore per concern.
  • Closer to HA muscle memory (the reference implementation is
    per-file).
  • Easier "blow it all away" reset story per-domain.

MetadataKeyStore had no consumers landed yet, so dropping it now
is cheap and saves the "shared-sidecar wrapping" deadweight from
accumulating callers. Nothing has shipped, so no migration logic
is needed.

The new Store owns one JSON file end to end:

  • async_load() returns None if the file doesn't exist;
    decoder failures propagate so the consumer makes the recovery
    decision rather than silently starting from empty state.
  • _atomic_write stages a sibling tempfile then os.replaces
    (same-FS rename, atomic). Failed replace cleans up the tempfile
    so orphans don't accumulate. Creates parent directories
    automatically.
  • async_delay_save(data_func, delay) matches HA's debounce +
    extend semantics bit-for-bit. data_func invoked at flush time
    so multiple mutations within the window land as one write of
    the latest state.
  • Single-flight writes via asyncio.Lock; async_save_now()
    awaits any in-flight write before issuing its own final flush.
  • Mandatory shutdown_register constructor arg — same structural
    contract as Vendor MetadataKeyStore: debounced single-key sidecar writer #508's helper — so a store can't be instantiated
    without telling someone who will flush it at shutdown. SIGKILL
    / process crash skip the registry the same way HA's
    EVENT_HOMEASSISTANT_FINAL_WRITE skips on hard kills.

Encoder / decoder are caller-supplied so the helper stays agnostic
about JSON / orjson / msgpack / etc. mashumaro dataclasses pair
with encoder=lambda v: orjson.dumps(v.to_dict()) /
decoder=lambda b: SomeClass.from_dict(orjson.loads(b)).

Drops the consumer-side migration to a follow-up PR (the
offloader pairings refactor will rebase on this).

Related issue or feature (if applicable):

Types of changes

  • Bugfix (non-breaking change which fixes an issue) — bugfix
  • New feature (non-breaking change which adds functionality) — new-feature
  • Enhancement to an existing feature — enhancement
  • Breaking change (fix or feature that would cause existing functionality to not work as expected) — breaking-change
  • Refactor (no behaviour change) — refactor
  • Documentation only — docs
  • Maintenance / chore — maintenance
  • CI / workflow change — ci
  • Dependencies bump — dependencies

Frontend coordination

  • No frontend change needed
  • Companion frontend PR: esphome/device-builder-dashboard-frontend#

Checklist

  • The code change is tested and works locally.
  • Pre-commit hooks pass (ruff, codespell, yaml/json/python checks).
  • Tests have been added or updated under tests/ where applicable.
  • components.json has not been hand-edited (regenerate via script/sync_components.py if a sync is needed).
  • Architecture-level changes are reflected in docs/ARCHITECTURE.md and/or docs/API.md.

#508 vendored a debounced ``MetadataKeyStore`` that wrapped a
sub-key of the shared ``device-builder.json`` sidecar. Walking
that back: switching to HA's actual per-file ``Store`` model is
cleaner — atomic writes per-domain, no lock contention between
unrelated writers, independent backup/restore per concern, and
the shape matches HA muscle memory. ``MetadataKeyStore`` had no
consumers landed yet, so dropping it now is cheap and saves the
"shared sidecar wrapping" deadweight from accumulating callers.

The new ``Store`` owns one JSON file end-to-end:

* Read on startup via ``async_load`` (returns ``None`` if the
  file doesn't exist; decoder failures propagate so the consumer
  makes the recovery decision rather than silently starting from
  empty state).
* Atomic write via ``write-tmp + os.replace`` in the destination
  directory (same-FS guarantee). Failed ``replace`` cleans up
  the tempfile so orphans don't accumulate.
* Debounce + extend semantics matching HA's
  ``Store.async_delay_save`` bit-for-bit.
* ``data_func`` invoked at flush time so multiple mutations
  within the debounce window all land as one write of the latest
  state.
* Single-flight writes via ``asyncio.Lock``; ``async_save_now``
  awaits any in-flight write before issuing its own final flush.
* Mandatory ``shutdown_register`` constructor arg — same shape
  as #508's MetadataKeyStore — so a store can't be instantiated
  without telling someone who will flush it at shutdown.

Diverges from HA only on scope-trims that don't earn complexity
here (no bus events, no preload manager, no version migration,
no ``_load_future`` reentrancy guard, no ``read_only`` /
``private`` knobs).

Encoders/decoders are caller-supplied so the helper stays
agnostic about JSON / orjson / msgpack / etc. — typical use is
``orjson.dumps(value.to_dict())`` for a mashumaro dataclass.

21 tests, 100% line coverage, mypy + pre-commit clean. Drops the
old ``MetadataKeyStore`` + its tests in the same commit.
Copilot AI review requested due to automatic review settings May 9, 2026 22:24
@github-actions github-actions Bot added the refactor Code refactor with no user-visible behaviour change label May 9, 2026
@codspeed-hq

codspeed-hq Bot commented May 9, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 12 untouched benchmarks


Comparing remote-build-pairings-per-file-storage (468546b) with main (0bf1ef0)

Open in CodSpeed

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces the previously introduced shared-sidecar sub-key writer (helpers/metadata_store.py) with a Home Assistant–style per-file persistence primitive (helpers/storage.py), enabling debounced, atomic, independent JSON (bytes) storage per concern.

Changes:

  • Added helpers.storage.Store: per-file load + atomic write + debounce/extend semantics + shutdown-flush registration.
  • Replaced the prior MetadataKeyStore implementation and its unit tests with per-file store tests.
  • Updated test coverage to validate debounce behavior, shutdown flush, atomic-write temp cleanup, and error logging for the new store.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
esphome_device_builder/helpers/storage.py Introduces new per-file Store implementation with HA-like debounce/extend semantics and atomic writes.
esphome_device_builder/helpers/metadata_store.py Removes the no-longer-needed shared-sidecar sub-key store implementation.
tests/test_helpers_storage.py Adds coverage for per-file store load/save/debounce/atomic-write/error logging behaviors.
tests/test_helpers_metadata_store.py Removes tests for the deleted MetadataKeyStore.

Comment thread esphome_device_builder/helpers/storage.py Outdated
Three changes folded together:

* **CI fix.** ``_async_handle_write`` was calling ``self._encoder``
  on the event-loop thread and only the file write in the
  executor. Linux CI's blockbuster instrumentation flagged a
  blocking call when the encoder did anything that acquired a
  lock (e.g. tests that gate on ``threading.Event``). Bundles
  encoder + write into a single ``run_in_executor`` hop via
  ``_encode_and_write``, so the loop stays responsive even when
  the encoder is slow.

* **mode kwarg (defaults to ``0o600``).** The previous shape
  relied on ``tempfile.mkstemp``'s undocumented POSIX default.
  Make it explicit: stores write at ``0o600`` unless the caller
  opts into a less-restrictive mode (e.g. ``0o644`` for a public
  catalog snapshot). Routed through ``atomic_write``'s ``mode``
  arg so the chmod lands on the staging tempfile *before* the
  rename — the destination inherits the mode atomically.

* **Delegate to ``helpers.atomic_io.atomic_write``** rather than
  re-implementing the mkstemp + chmod + rename dance. ``atomic_io``
  already handles the rare ``os.fdopen`` failure path (closes the
  raw fd to avoid leaking it before the with-block manages the
  file) — addresses Copilot's review on PR #509. Same on-error
  cleanup, less duplication.

Two new tests pin the mode behaviour (default ``0o600``, explicit
``0o644``); the inflight-write test now verifies the encoder
runs off the loop thread by relying on the same blocking
primitive that previously deadlocked.
@bdraco bdraco merged commit 30dec9f into main May 9, 2026
13 checks passed
@bdraco bdraco deleted the remote-build-pairings-per-file-storage branch May 9, 2026 22:33
@codecov-commenter

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.14%. Comparing base (0bf1ef0) to head (468546b).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #509   +/-   ##
=======================================
  Coverage   99.14%   99.14%           
=======================================
  Files          68       68           
  Lines        8070     8087   +17     
=======================================
+ Hits         8001     8018   +17     
  Misses         69       69           
Flag Coverage Δ
py3.12 99.10% <100.00%> (+<0.01%) ⬆️
py3.14 99.14% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
esphome_device_builder/helpers/storage.py 100.00% <100.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

bdraco added a commit that referenced this pull request May 9, 2026
Builds on #509's per-file ``Store`` helper. Drops the
``_offloader_remote_build`` sub-key from the shared
``device-builder.json`` sidecar entirely; offloader pairings now
live at ``<config_dir>/.offloader_pairings.json`` (sibling of the
metadata sidecar) under a per-file ``Store`` instance.

Storage shape changes:

* ``StoredPairing`` gains a ``status: PeerStatus`` field that
  defaults to ``APPROVED`` (so older sidecars without the field
  round-trip cleanly through mashumaro). The serialiser at flush
  time strips PENDING rows so the on-disk shape stays
  APPROVED-only.
* ``OffloaderRemoteBuildSettings`` is now an in-RAM serialisation
  shape only — no longer a sub-key of the shared sidecar.
* ``controllers.config`` drops every ``*_offloader_remote_build_*``
  helper (load, save, transaction, decoder, key constant). Their
  jobs are absorbed by the ``Store``'s encoder/decoder + the
  controller's debounced save scheduling.
* The accompanying storage-layer test module is dropped (no
  consumers + no helpers to test).

Controller refactor:

* Single in-RAM ``_pairings: dict[(host, port), StoredPairing]``
  replaces the ``_pending_pairings`` dict + on-disk APPROVED list
  duo. Both PENDING and APPROVED rows live here; the disk filter
  in ``_serialize_pairings`` drops PENDING at flush time.
* ``start()`` loads the per-file store (returns ``None`` if the
  file doesn't exist) and seeds the dict.
* Mutation paths (``request_pair`` / ``unpair`` /
  ``_apply_pair_status_result``) update the dict in-place and
  call ``_pairings_store.async_delay_save(...)``. APPROVED-flip
  via ``_apply_pair_status_result`` mutates the row's status in
  place rather than popping + reinserting.
* ``stop()`` walks ``_shutdown_callbacks`` (populated by the
  ``Store``'s self-registration at construction) so any debounced
  save lands before the dict is cleared. Same hook
  ``DeviceBuilder.stop`` ultimately drives.
* ``pairings_snapshot()`` becomes a sync RAM read — no executor
  hop, no disk read, no race window. The previous merge+dedupe
  logic + race test are obsolete by construction and dropped.

Codec uses ``helpers.json`` (orjson wrapper) per project
convention; the encoder writes ``orjson.dumps(value.to_dict())``,
the decoder soft-recovers to empty on a corrupt blob (logs the
exception so the recovery is observable in production traces).

Test updates:
* ``_pending_pairings`` → ``_pairings`` everywhere.
* Drop ``_persist_approved_pairing`` / ``_upsert_pairing`` /
  ``_remove_pairing_by_address`` references.
* New ``_saved_pairings(offloader)`` helper flushes the debounced
  save via the registered shutdown callback then loads from disk.
* The dedupe race test is removed (race no longer exists; sync
  RAM read is atomic against any concurrent mutation).
* ``test_subscribe_events_cleanup`` fixtures gain
  ``db.remote_build = None`` so the snapshot send doesn't trip
  AttributeError on the bare-bones DeviceBuilder stand-in. The
  forwards-bus-events test filters out the always-fired
  ``initial_state`` event when asserting on what arrives via the
  drain.
* Drops the obsolete
  ``tests/test_offloader_remote_build_storage.py`` module.

Pre-existing ``test_device_yaml.py`` failures (3 esphome-version
drift cases) are out of scope for this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

refactor Code refactor with no user-visible behaviour change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants