Merged
Changes from 9 commits
16 changes: 16 additions & 0 deletions .github/workflows/test.yml
@@ -104,6 +104,22 @@ jobs:
# #483-#492; flipped on once the standing count hit zero.
run: mypy esphome_device_builder

- name: Check import time against budget
# ``device_builder`` import cost lands on dashboard startup,
# which is painfully slow on low-power hosts (HA Green). Fails
# the build if a fresh eager import (e.g. an ``esphome.components.*``
# module) pushes the measured time past the budget + margin.
run: python script/check_import_time.py --check --har importtime.har

- name: Upload import-time waterfall
# Keep the HAR so a regression can be inspected as a waterfall.
if: always()
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
with:
name: import-time-waterfall
path: importtime.har
retention-days: 14

test:
name: Pytest (${{ matrix.os }} / Python ${{ matrix.python-version }})
runs-on: ${{ matrix.os }}
15 changes: 14 additions & 1 deletion CLAUDE.md
@@ -334,6 +334,16 @@ against legacy behaviour before assuming the simpler version suffices.
refinement, `unit_of_measurement` options). Component descriptions/
titles fall back to the docs MDX repo when the schema index is sparse.
All in `script/sync_components.py`.
- **The long-lived process never imports `esphome.components.*`.**
Importing `esphome.components.esp32` drags in espidf → requests →
`esphome.config` (~9s of cold start on an HA Green). Static platform
metadata is snapshotted into `platform_capabilities.index.json` by the
nightly sync and read at runtime; the one dynamic case
(`get_download_types` for libretiny/nrf52, which reads the build dir)
runs in the `device-builder-helper` subprocess. `script/check_import_time.py`
+ `tests/test_cold_import_floor.py` guard the invariant in CI. Don't add
an eager (or runtime in-process) `esphome.components.*` import on the
dashboard side; precompute into the index or push it into the helper.
- **Catalog id format**: `<domain>.<stem>` (e.g. `sensor.dht`). The
schema's natural format is the reverse — `<stem>.<domain>`;
`_split_qualified_key` flips it.
@@ -583,9 +593,12 @@ When changing the sync script or catalog handling, watch for these:
| `esphome_device_builder/definitions/components.index.json` + `components/<id>.json` | Generated; do not hand-edit. Slim index loaded eagerly; per-id bodies hydrate lazily via `ComponentCatalog.get_body`. |
| `esphome_device_builder/definitions/boards.index.json` + `board_bodies/<id>.json` + `featured_components.index.json` | Generated; do not hand-edit. Slim board index + per-id lazy bodies (via `BoardCatalog._body_store`) + aggregated featured-components map (read once by the components controller's registry build). |
| `esphome_device_builder/definitions/boards/<id>/manifest.yaml` | Curated; hand-edited. The body directory is `board_bodies/` (separate from this manifests dir) so the body-swap rmtree can't trample the hand-curated source. |
| `esphome_device_builder/definitions/platform_capabilities.index.json` | Generated; do not hand-edit. esphome platform metadata the long-lived process reads instead of importing `esphome.components.*` (download routing, wifi-inference no-wifi sets, static download-types). Loaded via `load_platform_capabilities_index`. |
| `esphome_device_builder/helper_cli.py` (`device-builder-helper`) | Subprocess for `get_download_types` on build-dir-dependent platforms (libretiny/nrf52), so the child imports `esphome.components.<X>`, not the dashboard process. |
| `script/sync_boards.py` | Regenerates the split board catalog from the manifests |
| `script/sync_components.py` | Regenerates the component catalog |
| `script/sync_components.py` | Regenerates the component catalog + `platform_capabilities.index.json` |
| `script/check_catalog.py` | Smoke test for popular components |
| `script/check_import_time.py` | CI guard: fails if `import …device_builder` regresses past `script/import_time_budget.json` (e.g. a fresh eager `esphome.components.*` import) |
| `script/validate_definitions.py` | Lint board manifests |
| `docs/ARCHITECTURE.md` | Full architecture + deployment + CI overview |
| `docs/API.md` | Every WS command + payload shape + event |
27 changes: 27 additions & 0 deletions docs/ARCHITECTURE.md
@@ -282,6 +282,33 @@ PR with a diff summary when the rebuild produces a change.

All workflow files are commented — start there for the source of truth.

## Cold-start import discipline

The long-lived dashboard process never imports `esphome.components.*`.
Importing `esphome.components.esp32` transitively pulls espidf → `requests` →
`esphome.config`, roughly 9s of cold start before the first log line on an HA
Green. Two mechanisms keep it out:

- **Snapshot the static data.** Everything the dashboard needs that's keyed on
the esphome version (esp32 variants + libretiny families for download
routing, esp32 no-wifi variants + rp2040 no-wifi boards for wifi inference,
and the static `get_download_types` lists for esp32/esp8266/rp2040) is
generated into `definitions/platform_capabilities.index.json` by
`script/sync_components.py` and read at runtime via
`load_platform_capabilities_index`. The committed index can lag the installed
esphome (the CI matrix runs newer esphome), so it may hold only a subset of
what that version defines; the nightly sync keeps it current.
- **Subprocess the one dynamic case.** `get_download_types` for libretiny and
nrf52 reads the build directory, so it can't be precomputed. The dashboard
spawns `device-builder-helper` (`helper_cli.py`), which imports
`esphome.components.<X>` in a throwaway child and returns JSON; the reply is
validated through `coerce_download_entries` at the boundary.
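
A minimal sketch of the subprocess pattern described above, under stated assumptions: `run_helper` and `coerce_entries` are hypothetical names, the child is an inline `python -c` script rather than the real `device-builder-helper`, and the validation rule (string `title` and `file`) is only a stand-in for whatever `coerce_download_entries` actually enforces.

```python
import json
import subprocess
import sys


def coerce_entries(payload: object) -> list[dict[str, str]]:
    """Keep only dict entries with string ``title`` and ``file``.

    Hypothetical stand-in for the real boundary validation.
    """
    if not isinstance(payload, list):
        return []
    return [
        {"title": entry["title"], "file": entry["file"]}
        for entry in payload
        if isinstance(entry, dict)
        and isinstance(entry.get("title"), str)
        and isinstance(entry.get("file"), str)
    ]


def run_helper(child_source: str, timeout_s: float = 60.0) -> list[dict[str, str]]:
    """Run *child_source* in a throwaway interpreter and parse its JSON reply.

    Any spawn failure, nonzero exit, timeout, or non-JSON stdout degrades
    to an empty list so the caller can keep rendering.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", child_source],
            check=True,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        payload = json.loads(result.stdout)
    except (OSError, subprocess.SubprocessError, json.JSONDecodeError):
        return []
    return coerce_entries(payload)
```

The heavy import happens only in the child, so the parent pays process-spawn latency on the rare dynamic path instead of import cost on every cold start.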

`script/check_import_time.py` (import budget) and
`tests/test_cold_import_floor.py` (`sys.modules` probe after import + `start()`)
guard the invariant in CI. New code that needs `esphome.components.*` data
precomputes it into the index or runs in the helper, never an in-process import.
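
The `sys.modules` probe idea can be sketched roughly as follows. This is an illustration, not the contents of `tests/test_cold_import_floor.py`: `leaked_modules` is a hypothetical name, and the sketch only covers the import step, not the `start()` call the real test also exercises. It runs the import in a fresh interpreter so modules already loaded by the test runner cannot mask or fake a leak.

```python
import json
import subprocess
import sys

# Child script: import the target, then report every loaded module that
# matches the forbidden prefix. It imports ``importlib`` and ``json`` itself,
# so those two names are not meaningful as forbidden prefixes.
_PROBE = """
import importlib, json, sys
target, prefix = sys.argv[1], sys.argv[2]
importlib.import_module(target)
hits = sorted(m for m in sys.modules if m == prefix or m.startswith(prefix + "."))
print(json.dumps(hits))
"""


def leaked_modules(target: str, forbidden_prefix: str) -> list[str]:
    """Return modules under *forbidden_prefix* loaded by importing *target*."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE, target, forbidden_prefix],
        check=True,
        capture_output=True,
        text=True,
        timeout=60,
    )
    return json.loads(result.stdout)
```

A guard in this style would assert that `leaked_modules("esphome_device_builder", "esphome.components")` is empty; the package name used in that call is an assumption based on the paths in this document.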

## Authentication

Auth is opaque server-issued session tokens, gated by the WebSocket handshake. See [API.md](API.md#authentication) for the wire protocol and [THREAT_MODEL.md](THREAT_MODEL.md) for what the auth gate is defending (short version: authenticated callers are host-equivalent, because `external_components:` provides arbitrary Python at compile time).
171 changes: 139 additions & 32 deletions esphome_device_builder/controllers/firmware/download.py
@@ -3,39 +3,76 @@
from __future__ import annotations

import asyncio
import importlib
import logging
import re
import secrets
import subprocess
import time
from dataclasses import dataclass
from functools import cache
from pathlib import Path
from typing import TYPE_CHECKING

from aiohttp import web
from esphome.components.esp32 import VARIANTS as ESP32_VARIANTS
from esphome.components.libretiny.const import (
FAMILY_COMPONENT as _LIBRETINY_FAMILY_COMPONENT,
)
from esphome.storage_json import StorageJSON

from ...definitions import (
PlatformCapabilities,
coerce_download_entries,
load_platform_capabilities_index,
)
from ...helpers.api import CommandError
from ...helpers.json import JSONDecodeError
from ...helpers.json import loads as json_loads
from ...helpers.storage_path import resolve_storage_path
from .helpers import _find_sibling_cli

if TYPE_CHECKING:
from .controller import FirmwareController

_LOGGER = logging.getLogger(__name__)

# Generous: the child pays a full esphome import before answering.
_HELPER_TIMEOUT_S = 60


def _capabilities() -> PlatformCapabilities:
"""Seam over the (cached) platform-capabilities index loader."""
return load_platform_capabilities_index()


@dataclass(frozen=True, slots=True)
class _DownloadRouting:
"""Sets that map a ``target_platform`` to an ``esphome.components`` module."""

esp32_variants: frozenset[str]
libretiny_targets: frozenset[str]


@cache
def _platform_sets() -> _DownloadRouting:
"""Return the download-routing sets, derived from the generated index.

Read from the index rather than ``esphome.components.esp32`` to keep espidf /
requests / esphome.config off cold start. LibreTiny chip families collapse to
the ``libretiny`` component, so the umbrella name joins that set.
"""
caps = _capabilities()
return _DownloadRouting(
esp32_variants=frozenset(caps.esp32_variants),
libretiny_targets=frozenset(caps.libretiny_families) | {"libretiny"},
)


# Prime the cached index read at import so the first download request doesn't
# pay the (small, esphome-free) file read inside the event loop.
_platform_sets()


def _helper_cmd() -> tuple[str, ...]:
"""Argv prefix for the device-builder-helper child (cached by _find_sibling_cli)."""
return _find_sibling_cli("device-builder-helper", "esphome_device_builder.helper_cli")

# Platforms whose ``target_platform`` value isn't the component
# module name. ESP32 variants collapse to the umbrella ``esp32``
# component; LibreTiny chip families collapse to ``libretiny``.
# The LibreTiny set is sourced from upstream's
# ``FAMILY_COMPONENT.values()`` so it picks up new chip families
# automatically on the next ``esphome`` dependency bump.
_LIBRETINY_TARGET_PLATFORMS: frozenset[str] = frozenset(_LIBRETINY_FAMILY_COMPONENT.values()) | {
"libretiny"
}

# Stable ``type`` tag per artifact filename so the frontend can map it to a
# localized label (falling back to the platform-supplied ``title`` for any
@@ -66,15 +103,18 @@ async def get_binaries(controller: FirmwareController, *, configuration: str) ->
loop = asyncio.get_running_loop()

def _get_types() -> list[dict]:
storage = StorageJSON.load(resolve_storage_path(configuration))
storage_path = resolve_storage_path(configuration)
storage = StorageJSON.load(storage_path)
if storage is None:
return []
return collect_download_entries(storage, label=configuration)
return collect_download_entries(storage, storage_path, label=configuration)

return await loop.run_in_executor(None, _get_types)


def collect_download_entries(storage: StorageJSON, *, label: str | None = None) -> list[dict]:
def collect_download_entries(
storage: StorageJSON, storage_path: Path | None = None, *, label: str | None = None
) -> list[dict]:
"""Return the downloadable artifacts on disk for *storage* as ``[{title, file, ...}]``.

The platform's ``get_download_types`` entries that exist under
@@ -84,20 +124,16 @@ def collect_download_entries(storage: StorageJSON, *, label: str | None = None)
offers; ``get_binaries`` is its async wrapper. *label* identifies the
build in the failure log -- the caller's configuration filename when it
has one (more specific than ``storage.name`` across colliding device
names); defaults to ``storage.name``.
names); defaults to ``storage.name``. *storage_path* is required to resolve
the build-dir-dependent platforms (libretiny / nrf52) through the helper
subprocess; the static platforms are answered from the catalog regardless.
"""
try:
component = _resolve_download_component(storage.target_platform)
module = importlib.import_module(f"esphome.components.{component}")
types = list(module.get_download_types(storage))
except Exception: # a third-party get_download_types regression could raise anything
_LOGGER.warning(
"Could not determine download types for %s", label or storage.name, exc_info=True
)
return []
# No build dir → can't confirm anything on disk → treat as not built.
# Checked before resolving types so an unbuilt libretiny / nrf52 device
# doesn't spawn the helper subprocess just to discard the result.
if storage.firmware_bin_path is None:
return []
types = _download_types_for(storage, storage_path, label=label)
build_dir = storage.firmware_bin_path.parent
# Filter to files that exist so a cleaned build reads as "compile
# first" rather than offering a name ``firmware/download`` would 404 on.
@@ -122,6 +158,71 @@ def collect_download_entries(storage: StorageJSON, *, label: str | None = None)
return downloads


def _download_types_for(
storage: StorageJSON, storage_path: Path | None, *, label: str | None
) -> list[dict]:
"""Return ``get_download_types`` entries for *storage*'s platform.

Static platforms (esp32 / esp8266 / rp2040) come straight from the
precomputed catalog index. Build-dir-dependent platforms (libretiny / nrf52)
are answered by the device-builder-helper subprocess, so the long-lived
process never imports ``esphome.components.*``. A missing *storage_path* or a
failing helper yields ``[]`` -- the same "treat as not built" fall-through
the in-process import used to take on error.
"""
component = _resolve_download_component(storage.target_platform)
if not component:
return []
precomputed = _capabilities().download_types.get(component)
if precomputed is not None:
return precomputed
if storage_path is None:
_LOGGER.warning(
"No storage path given to resolve %s download types for %s",
component,
label or storage.name,
)
return []
cmd = [*_helper_cmd(), "download-types", str(storage_path), component]
try:
# ``close_fds=False`` mirrors helpers.subprocess's policy (skip the
# fork-time /proc/self/fd close walk; we inherit nothing the child needs shut).
result = subprocess.run( # noqa: S603 — argv is internally built, no shell
cmd,
check=True,
capture_output=True,
text=True,
timeout=_HELPER_TIMEOUT_S,
close_fds=False,
)
except Exception as err: # spawn / nonzero exit / timeout / esphome regression
# An infrastructure failure (helper not installed, timeout, import error)
# is distinct from a built device with no artifacts; surface the child's
# stderr so it's diagnosable, not an unbuilt-looking empty row. Still
# degrade to [] (the listing must keep rendering for other devices).
_LOGGER.warning(
"download-types helper failed for %s: %s",
label or storage.name,
getattr(err, "stderr", None) or err,
exc_info=True,
)
return []
try:
payload = json_loads(result.stdout)
except JSONDecodeError: # non-JSON stdout (rare: the helper isolates its stdout)
_LOGGER.warning(
"download-types helper returned non-JSON for %s: stdout=%r stderr=%r",
label or storage.name,
result.stdout[:200],
result.stderr[:200],
exc_info=True,
)
return []
# Coerce at the boundary so a malformed reply can't reach a downstream
# ``entry["file"]``; same validation the index payload goes through.
return coerce_download_entries(payload)


def _resolve_artifact_path(configuration: str, file: str) -> tuple[Path, str]:
"""Resolve a build artifact to ``(path, download_name)``, traversal-safe.

@@ -229,13 +330,19 @@ async def http_download(request: web.Request) -> web.StreamResponse:
def _resolve_download_component(target_platform: str | None) -> str:
"""Return the ``esphome.components`` module name for *target_platform*.

``None`` / empty input collapses to ``""``; the caller's
``importlib.import_module`` then fails in its ``try/except``
and logs a warning.
``None`` / empty input collapses to ``""``; ``_download_types_for`` then
returns ``[]`` without spawning the helper, so the build reads as not built.
"""
platform = (target_platform or "").lower()
if platform.upper() in ESP32_VARIANTS:
routing = _platform_sets()
if platform.upper() in routing.esp32_variants:
return "esp32"
if platform in _LIBRETINY_TARGET_PLATFORMS:
if platform in routing.libretiny_targets:
return "libretiny"
# Every esp32 variant is the umbrella ``esp32`` component, so fold by prefix
# even when the index is degraded (empty variants) — a missing index then
# makes an ESP32-S3/C3/... download slow (helper spawn) rather than broken
# (helper importing a nonexistent ``esphome.components.esp32s3``).
if platform.startswith("esp32"):
return "esp32"
return platform
10 changes: 7 additions & 3 deletions esphome_device_builder/controllers/firmware/helpers.py
@@ -153,8 +153,12 @@ def _find_esptool_cmd() -> list[str]:


@lru_cache(maxsize=8)
def _find_sibling_cli(name: str) -> tuple[str, ...]:
"""Sibling script next to ``sys.executable``, else ``python -m <name>``.
def _find_sibling_cli(name: str, module: str | None = None) -> tuple[str, ...]:
"""Sibling script next to ``sys.executable``, else ``python -m <module or name>``.

*module* lets the ``-m`` fallback target an import path that differs from the
console-script *name* (e.g. ``device-builder-helper`` ->
``esphome_device_builder.helper_cli``); it defaults to *name*.

Result is cached so the ``sibling.exists()`` filesystem probe
runs once per ``name`` — async callers (``_run_esptool``,
@@ -168,7 +172,7 @@ def _find_sibling_cli(name: str) -> tuple[str, ...]:
sibling = Path(python).parent / (f"{name}.exe" if os.name == "nt" else name)
if sibling.exists():
return (str(sibling),)
return (python, "-m", name)
return (python, "-m", module or name)


def _parse_progress(line: str) -> int | None:
@@ -9,25 +9,34 @@

from __future__ import annotations

from esphome.components.esp32 import VARIANTS as _ESP32_VARIANTS
from functools import cache

from ....definitions import load_platform_capabilities_index
from . import bk72xx, esp32, esp8266, ln882x, nrf52, rp2040, rtl87xx

_PLATFORMS = (bk72xx, esp8266, esp32, ln882x, nrf52, rp2040, rtl87xx)

_BY_TARGET: dict[str, tuple[str, ...]] = {
mod.TARGET_PLATFORM.lower(): mod.BUILD_FILES for mod in _PLATFORMS
}

# ESP32 chip variants StorageJSON stores as ``target_platform``
# (``ESP32S3``, ``ESP32C3``, ``ESP32H2``, …) all build through the
# umbrella ``esp32`` component, so resolve them to the same module.
# Sourced from upstream so a new variant lands here without an
# edit. The base ``"esp32"`` is already in ``_BY_TARGET``.
for _variant in _ESP32_VARIANTS:
_BY_TARGET.setdefault(_variant.lower(), esp32.BUILD_FILES)
@cache
def _by_target() -> dict[str, tuple[str, ...]]:
"""Map ``target_platform`` -> BUILD_FILES, ESP32 chip variants folded to esp32.

StorageJSON stores variants (``ESP32S3``, ``ESP32C3``, …) as
``target_platform``; they all build through the umbrella ``esp32`` component.
The variant list comes from the generated index rather than
``esphome.components.esp32`` so this import stays off cold start.
"""
by_target = {mod.TARGET_PLATFORM.lower(): mod.BUILD_FILES for mod in _PLATFORMS}
for variant in load_platform_capabilities_index().esp32_variants:
by_target.setdefault(variant.lower(), esp32.BUILD_FILES)
return by_target


def build_files_for_platform(target_platform: str) -> tuple[str, ...]:
"""Return BUILD_FILES for *target_platform*; empty tuple if unrecognised."""
return _BY_TARGET.get(target_platform.lower(), ())
return _by_target().get(target_platform.lower(), ())


# Prime the cached map at import so the first artifact build doesn't pay the
# (small, esphome-free) index read inside the event loop.
_by_target()