Skip to content

Bug/login and device discovery#118

Open
lsiddiquee wants to merge 16 commits into
mainfrom
bug/login-and-device-discovery
Open

Bug/login and device discovery#118
lsiddiquee wants to merge 16 commits into
mainfrom
bug/login-and-device-discovery

Conversation

@lsiddiquee

@lsiddiquee lsiddiquee commented May 18, 2026

Copy link
Copy Markdown
Owner

Summary

Login & device-discovery hardening across 1.0.15 → 1.0.22, with the final two
versions (1.0.21 + 1.0.22) bundled into a single 1.0.22 publish because 1.0.21
was rate-limited by the Homey App Store and never went live.

Closes / addresses the reports in issue #108 from aagehellem, silvinnio and
others — "Authorization Failed" appearing after sign-in actually succeeded,
container limit (CU-429) cascading into permanent failures, retry burning extra
device-code requests, and stale leftover container entries blocking new pairings.

What changed

1.0.22 — Distinct post-auth failure UX + retry short-circuit

  • setup_error event split from authentication_error: when sign-in
    succeeds but the post-auth container step fails (e.g. CU-105, CU-429), the
    driver now emits setup_error instead of authentication_error. The pair
    view shows a "Device Setup Failed" title with an explicit hint that sign-in
    is kept, so users no longer think their authorization failed when it didn't.
  • request_device_code short-circuits on stored valid tokens: the
    "Try Again" button no longer re-requests a device code when sign-in is
    still valid. The handler now calls getValidAccessToken() first and, on
    hit, runs only the container step + advances the view. Saves one OAuth
    round and one slot from BMW's 50/24h request quota per retry.
  • Frontend tolerates a { skipCode: true } sentinel so the loading spinner
    stays up while the backend drives the view transition.

1.0.21 — Progressive container cleanup (folded into 1.0.22)

  • Container cleanup is no longer all-or-nothing. The app removes one
    stale leftover container per pairing attempt rather than trying to clear
    every old entry at once, so a quota-hit on delete doesn't block the create.
  • Zero deletes when an existing entry is being reused.
  • CU-429 from the delete call propagates to the user so they understand
    the daily limit was hit, while other delete errors are swallowed so a
    single bad leftover doesn't block the create.

1.0.15 → 1.0.20 (already published, recapped for context)

  • PKCE device-code OAuth flow stabilization
  • Token persistence in pollForTokens before container step
  • HomeyTokenStore migration + cleanup
  • DeviceCodeAuthProvider error categorization (auth-pending vs. denied
    vs. expired)
  • Pair / repair flow alignment using shared device_code.html

Files of note

  • drivers/ConnectedDriver.tssetup_error emission +
    request_device_code short-circuit + progressive cleanup
  • shared/pair_repair/device_code.html — split title, hint paragraph,
    separate setup_error listener, skipCode handling
    (auto-copied into drivers/{bmw,mini}/{pair,repair}/ by prebuild)
  • locales/en.jsonpair.device_code.setup_error.{title,hint}
  • lib/api/ContainerManager.ts — progressive cleanup behaviour
  • tests/drivers/ConnectedDriver.comprehensive.test.ts — asserts new
    setup_error event + that authentication_error is NOT emitted on
    post-auth container failures

Validation

  • 614 tests passing across 31 suites
  • npm run validate clean (prettier + eslint + markdownlint, 0 errors)
  • homey app build clean (publish-level validation passes)
  • 1.0.22 published: Build 78 uploaded to the Homey App Store
  • Manual regression checks against the issue Error Message #108 user reports

Notes for reviewers

  • app.json diff is generated from .homeycompose/ by homey app build
    and is committed for at-a-glance review of capabilities and flows.
  • The four drivers/{bmw,mini}/{pair,repair}/device_code.html diffs are
    identical and are auto-generated from shared/pair_repair/device_code.html
    by the prebuild step. Edit the shared file, not the copies.

- Changed integration test path in package.json to reflect new directory structure.
- Updated TypeScript and Node.js configurations to use Node 22.
- Updated devDependencies for TypeScript, Node types, and ESLint configurations.
@lsiddiquee lsiddiquee force-pushed the bug/login-and-device-discovery branch from 634f1f4 to 02956a0 Compare May 19, 2026 05:20
@lsiddiquee lsiddiquee linked an issue May 19, 2026 that may be closed by this pull request
lsiddiquee added 11 commits May 19, 2026 10:36
feat: update version to 1.0.18 with improvements to pairing and error messaging

fix: prevent creation of duplicate containers during rapid pairing attempts (issue #108)

fix: ensure error messages provide clear context and correlation IDs for support (issue #108)

refactor: remove cross-fetch dependency as Node.js 18+ supports fetch natively

test: add comprehensive tests for ConnectedDriver and ContainerManager to cover recent changes

test: enhance Logger tests to ensure context preservation when no Error object is provided

test: mock global fetch in OpenStreetMap tests to align with native support in Node.js 18+
BMW maps the daily 50-request quota to HTTP 403 with body field
exveErrorCode "CU-429" (doubly-string-encoded by the gateway), not HTTP
429. The generic 403 branch then appended the misleading "verify
Streaming API option" hint, sending users to check a permission toggle
when the real cause was an exhausted rolling 24h window.

- extractErrorCode / extractErrorMessage now also read exveErrorCode /
  exveErrorMsg and strip wrapping quotes (e.g. '"CU-429"' -> CU-429).
- mapHttpError detects bmwCode === 'CU-429' (or HTTP 429) BEFORE the
  generic 403 branch and returns RateLimitError with a message that
  explains the 50/24h quota and that pairing retries + background polls
  both consume requests.
- BMW code stashed in RateLimitError.context.bmwCode (the public `code`
  field stays at the taxonomy value 'RATE_LIMIT_EXCEEDED').
- extractRetryAfter now also honors the Retry-After response header.
- Streaming API hint is suppressed when the 403 is actually CU-429.

Tests: 3 new failing-first cases under "BMW rate-limit detection
(issue #108)" describe block. Full suite 608/608 green.
)

When BMW returns CU-429 (daily 50/24h quota) from the list-containers
endpoint, ContainerManager's catch-all fallback would still call
createContainer(), burning a second request from the already-exhausted
quota and guaranteeing a second CU-429. Field logs show pairing
attempts burning 2 requests minimum and retries multiplying that.

- ContainerManager.resolveContainer now rethrows RateLimitError
  instead of falling back to create.
- Transient/non-rate-limit failures still fall back to create as
  before (the existing test `should_createContainer_when_listEndpointFails`
  still passes).

Combined with the HttpClient CU-429 fix in the previous commit, pairing
on a rate-limited account now fails fast with one request consumed
(vs. up to ~7 in 1.0.18) and surfaces the accurate "wait for the
rolling 24h window" message.

Tests: 1 new failing-first case
`should_propagateRateLimitError_when_listContainersHitsCu429`.
Full suite 609/609 green.
…ssaging (issue #108)

aagehellem reported "Maximum number of containers reached" on a clean 1.0.19
pair attempt. Pre-1.0.18 created a fresh container on every pair attempt with
the identical literal name "Homey BMW Integration" (no identifier suffix), so
users who loop-paired before 1.0.18 filled their BMW per-user container quota.

Changes (v1.0.20):

ContainerManager:
- During every resolveContainer() list call, identify and proactively delete
  any container that looks like ours (legacy unsuffixed name or stale suffixed
  name from a previous client ID) so leftover quota slots are reclaimed.
- Log full container inventory (containerId + name) during every list so
  user-submitted diagnostic logs include the BMW-side state without another
  API round-trip.
- Stop cleanup early on RateLimitError to preserve remaining quota for the
  active create/reuse path; tolerate individual delete failures.

HttpClient:
- Remove the "share the correlation ID with BMW support" wording from the
  generic 403 container hint and direct the user to open a GitHub issue with
  the log entry copied from the Homey app's log view (logs are local-only;
  BMW cannot help with a third-party Homey app).
- Soften wording: "if it's already enabled" instead of "verify it is enabled".

ConnectedDriver:
- Add a comment documenting that DeviceCodeAuthProvider.pollForTokens has
  already persisted tokens by the time we reach the container step, so a
  failed container step does not force the user back through device-code
  flow on retry (client_id_entered short-circuits via getValidAccessToken).

Docs:
- New .github/instructions/user-facing-messages.instructions.md codifying:
  never tell users to contact BMW support, never present correlation IDs as
  BMW support tokens, point users at GitHub issues + Homey log view.

Tests:
- 5 new ContainerManager tests covering legacy unsuffixed cleanup, stale
  suffixed cleanup, partial-failure tolerance, rate-limit short-circuit, and
  inventory logging.
- Updated HttpClient 403-hint test for the new wording (no "BMW support",
  must reference github/log).
- 614 tests passing.

Version bump: 1.0.19 -> 1.0.20.
…ax (issue #108)

1.0.20 cleaned up *all* orphans on every pair attempt. Each DELETE costs 1
of the user's 50-per-24h request quota, so a backlog of orphans risked
exhausting the daily limit on a single pair. Field guidance and the BMW
CarData swagger confirm DELETE is a single-container operation with no
batch endpoint, so each orphan is its own request.

Now:
- Reuse path (match found): skip cleanup entirely. Zero delete cost.
- Create path (no match): delete at most ONE orphan to free a single slot,
  then create. Subsequent pair attempts progressively reclaim the rest.
- If the single orphan delete returns CU-429, propagate immediately — the
  create would also fail with the same quota error.
- Other delete failures (e.g. 500) are swallowed; create still attempts.

Tests rewritten:
- should_notDeleteAnyOrphans_when_matchExistsForReuse (new conservation)
- should_deleteExactlyOneOrphan_when_creatingNewContainer
- should_proceedWithCreate_when_orphanDeleteFailsNonRateLimit
- should_propagateRateLimit_when_orphanDeleteHitsRateLimit
- should_logContainerInventory_when_listingDuringResolve (unchanged)

Bumps to 1.0.21; changelog explains the gradual cleanup to end users.

614 tests passing, validate clean, build clean.
… (issue #108)

Two related user-experience fixes that ride on top of the 1.0.21 progressive
container cleanup. Both surfaced in field reports from aagehellem and
silvinnio on issue #108.

Problem 1 — misleading "Authorization Failed" title
---------------------------------------------------
When sign-in succeeded but the post-auth container step failed (e.g. with
CU-105 or CU-429), the pair view showed "Authorization Failed" with the
container error message underneath. Users naturally re-attempted the device
code flow, burning another OAuth round and another container slot.

Now the driver emits a distinct `setup_error` event for post-auth failures,
the pair view shows a "Device Setup Failed" title with an explicit hint that
sign-in is kept, and `authentication_error` remains reserved for genuine
auth failures.

Problem 2 — Try Again re-asks for the device code
-------------------------------------------------
`pollForTokens` already persists tokens to the store before the container
step runs, so the next pair attempt's `client_id_entered` handler can
short-circuit. But the pair view's "Try Again" button re-emits
`request_device_code`, which always asked BMW for a fresh device code —
burning another request from the 50/24h quota and forcing the user back
through the browser even though their sign-in was still valid.

The `request_device_code` handler now checks `getValidAccessToken()` first
and, when valid, runs the container step + advances the view (or `done()`
for repair) without ever calling `requestDeviceCode()`. The frontend
recognises a `{ skipCode: true }` sentinel response and stays on the
loading spinner while the backend drives the view transition.

Changes
-------
- drivers/ConnectedDriver.ts:
  - Container failure in `pollForAuthorizationAsync` now emits `setup_error`.
  - `request_device_code` handler short-circuits on stored valid tokens.
- shared/pair_repair/device_code.html:
  - Adds `errorTitle` element with switchable `data-i18n` key.
  - Listens for `setup_error` distinctly; passes title/hint keys.
  - Tolerates `{ skipCode: true }` response from `request_device_code`.
- locales/en.json: adds `pair.device_code.setup_error.title` + `.hint`.
- tests/drivers/ConnectedDriver.comprehensive.test.ts:
  - Renamed `should_emitContainerErrorMessage_…` to
    `should_emitSetupErrorEvent_…`, now asserts `setup_error` event and
    that `authentication_error` is NOT emitted.

Versioning
----------
- Bumps to 1.0.22 (1.0.21 was rate-limited by the Homey App Store and
  never published; bundle both fixes into 1.0.22).
- .homeychangelog.json: end-user-friendly notes for 1.0.22.

614 tests passing. validate clean. build clean.
…orrectly (issue #108)

Centralizes post-authentication completion in a single helper so every code path that has a valid token (polling success, request_device_code short-circuit, client_id_entered short-circuit) follows the same invariants:

- Tokens are persisted by DeviceCodeAuthProvider on pollForTokens success and are never deleted by container, store, or navigation failures. Tokens survive across app restarts via the HomeyTokenStore.
- Container failures emit setup_error (never authentication_error). Auth has already succeeded; the user does not need to re-authorize.
- Device-store update failures in repair flow emit setup_error, not authentication_error.
- Session-closed errors (Not Found: PairSession) are silently swallowed when emitting events or advancing views.
- The request_device_code short-circuit never falls through to a new BMW device-code round-trip when a token already exists; container failures stay put with setup_error.

Adds 3 new tests covering the additional hazards. Full suite: 621/621 green.
Defer container creation from pair flow to first API poll (Vehicle.ensureContainer)
so MQTT streaming can start immediately and pairing is decoupled from
container rate limits / failures. Fixes #108 'Could not find that PairSession'.

On container 404, clear stored containerId so next poll re-provisions.
Reverted the user-facing containerId device setting (overengineered; repair
flow already exposes container override).

Tests: 615/615 pass. Validate: 0 errors.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant