Split remediation schedules, UI controls, auto-bootstrap, and tests#253

Merged
jorgecuesta merged 2 commits into staging from feat/remediation-schedule-split
Mar 26, 2026

@jorgecuesta

Summary

  • Remediation schedule split: Separate SupplierRemediation (1001 ServiceMismatch, pausable by operator) from SupplierInitialStake (1003 OwnerInitialStake, always active). Both use the same underlying workflow with different args.
  • Provider UI controls: Auto Stake toggle (pause/unpause schedule), Request Remediation (two-step evaluate→confirm), and "Mark for Remediation" renamed to "Clear Remediation".
  • Workflow fixes: Owner address backfill for legacy keys, early return when no keys exist, fix Temporal client namespace defaults
  • Auto-bootstrap for dev: Idempotent seed scripts for provider and middleman. Tilt conditionally injects init container from local JSON config. Fetches minimumStake/height from Pocket API and delegators/providers from governance CDN.
  • Tests: 14 new tests with realistic fixtures covering service comparison edge cases from the re-staking loop bug
  • Docs & UI consistency: Unified "Pocket API URL" label, updated DEVELOP.md with auto-bootstrap setup, updated key-management docs with new actions
  • Tilt improvements: Simplified ignore patterns to prevent local test/build runs from triggering rebuild cycles

Test plan

  • All domain tests pass locally (49 tests, 4 suites)
  • Tilt cluster boots cleanly with auto-bootstrap (both apps seed from JSON + governance CDN)
  • Provider and provider-workflows running with 0 restarts
  • Middleman and middleman-workflows running with 0 restarts
  • Bootstrap seed is idempotent (skips on re-deploy)
  • Workflows handle empty key set gracefully (no crash)
  • Auto Stake toggle pause/unpause works from UI
  • Verify Request Remediation evaluate→confirm flow (needs staked keys — will test with localnet)
  • Verify Clear Remediation button behavior (needs keys in error state)
  • CI passes

Remediation schedule split:
- Separate SupplierRemediation (1001 ServiceMismatch, pausable) from
  SupplierInitialStake (1003 OwnerInitialStake, always active)
- Add workflowType field to decouple schedule key from Temporal workflow
- Bootstrap auto-updates existing schedule args on deploy
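
As a rough illustration of the split above, two schedule entries can point at the same workflow and differ only in their args. This is a sketch under assumptions: the `ScheduleDefinition` shape, the `reasons` argument, and the `pausable` flag are hypothetical names, not taken from the PR.

```typescript
// Hypothetical shapes for illustration; the real schedule config may differ.
type ScheduleKey = "SupplierRemediation" | "SupplierInitialStake";

interface ScheduleDefinition {
  workflowType: string; // Temporal workflow to run, decoupled from the schedule key
  args: { reasons: number[] }; // assumed argument shape
  pausable: boolean; // assumed flag: may the operator pause it from the UI?
}

const schedules: Record<ScheduleKey, ScheduleDefinition> = {
  // 1001 ServiceMismatch: the operator can pause this one.
  SupplierRemediation: {
    workflowType: "SupplierRemediation",
    args: { reasons: [1001] },
    pausable: true,
  },
  // 1003 OwnerInitialStake: same workflow, different args, always active.
  SupplierInitialStake: {
    workflowType: "SupplierRemediation",
    args: { reasons: [1003] },
    pausable: false,
  },
};

function canPause(key: ScheduleKey): boolean {
  return schedules[key].pausable;
}
```

Keeping `workflowType` separate from the key is what lets both entries reuse one workflow without being confused for the same schedule.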

Provider UI controls:
- Auto Stake toggle: pause/unpause remediation schedule from keys page
- Request Remediation: two-step evaluate→confirm flow for manual trigger
- Rename "Mark for Remediation" to "Clear Remediation" for clarity
- New server actions: Schedules.ts (get/toggle/trigger), Remediation.ts
  (evaluate needs, request remediation with Zod validation)

Workflow fixes:
- Owner sync: backfill empty ownerAddress from on-chain data for legacy keys
- Early return in SupplierStatus/SupplierRemediation when no keys exist
- Fix Temporal client defaults (middleman→provider namespace)
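
The owner backfill and the early return can be pictured together. A minimal sketch, assuming a `KeyRecord` shape and an `onChainOwners` lookup that are hypothetical and only stand in for the real DAL and chain queries:

```typescript
// Hypothetical record shape; legacy keys are assumed to carry an empty ownerAddress.
interface KeyRecord {
  address: string;
  ownerAddress: string;
}

// Returns only the keys whose empty ownerAddress could be filled from on-chain data.
function backfillOwners(
  keys: KeyRecord[],
  onChainOwners: Map<string, string>,
): KeyRecord[] {
  // Early return: with no keys there is nothing to sync, so skip all further work.
  if (keys.length === 0) return [];

  const updated: KeyRecord[] = [];
  for (const key of keys) {
    if (key.ownerAddress !== "") continue; // already has an owner, leave it alone
    const owner = onChainOwners.get(key.address);
    if (owner) updated.push({ ...key, ownerAddress: owner });
  }
  return updated;
}
```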

Auto-bootstrap for dev:
- Bootstrap seed scripts for provider and middleman (standalone, idempotent)
- Fetch minimumStake and height from Pocket API at seed time
- Fetch delegators/providers from governance CDN (single source of truth)
- Tilt conditionally injects init container + ConfigMap from local JSON
- Example configs with pocket-lego-testnet defaults
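
Idempotency for the seed scripts comes down to checking state before writing. The sketch below assumes an in-memory `SettingsStore` and a `SeedConfig` shape; both are hypothetical stand-ins for the database and the local JSON config.

```typescript
// Hypothetical shapes; the real seed scripts write to the app database.
interface SeedConfig {
  chainId: string;
  minimumStake: number;
}

interface SettingsStore {
  isBootstrapped: boolean;
  settings?: SeedConfig;
}

// Applies the seed once; later calls leave the store untouched.
function seed(store: SettingsStore, config: SeedConfig): "seeded" | "skipped" {
  if (store.isBootstrapped) return "skipped";
  store.settings = config;
  store.isBootstrapped = true;
  return "seeded";
}
```

A re-deploy that runs the init container again would hit the `skipped` branch and keep the existing settings.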

Tests:
- Fixture file with 7 realistic supplier/key scenarios
- 8 new tests for getExpectedServicesFromKey and getSupplierActiveServices
- 6 new comparison handler tests covering edge cases from re-staking bug

Docs and UI consistency:
- Unify "Node API URL"/"Shannon API URL" to "Pocket API URL" across both apps
- Update DEVELOP.md with auto-bootstrap setup instructions
- Update key-management docs with new button names and actions

Fix jest moduleNameMapper for enums and simplify Tilt ignore patterns:
- Map @igniter/db/provider/enums to correct schema/enums path in jest
- Revert fixture import to @igniter/db/provider/enums (valid for tsc)
- Simplify Tilt ignore across all 4 apps: **/dist, **/*.test.ts, **/__fixtures__
- Prevents local test/build from triggering Tilt rebuild cycles
@jorgecuesta jorgecuesta force-pushed the feat/remediation-schedule-split branch from 1e06b5b to c866b4d Compare March 26, 2026 04:08
@TheFeloniousMonk TheFeloniousMonk self-requested a review March 26, 2026 14:35
@jorgecuesta jorgecuesta merged commit 2375661 into staging Mar 26, 2026
6 checks passed
@jorgecuesta jorgecuesta deleted the feat/remediation-schedule-split branch March 26, 2026 14:37
Comment on lines +17 to +33
React.useEffect(() => {
  let cancelled = false
  GetRemediationScheduleStatus().then((result) => {
    if (cancelled) return
    if (result.success) {
      setPaused(result.data.paused)
    } else {
      setError(result.error.message)
    }
  }).catch((e) => {
    if (cancelled) return
    setError(e instanceof Error ? e.message : 'Failed to fetch schedule status')
  }).finally(() => {
    if (!cancelled) setIsLoading(false)
  })
  return () => { cancelled = true }
}, [])

Use react-query's useQuery for this instead of a manual useEffect.

Comment on lines +92 to +113
for (const key of keys) {
  const entry: RemediationHistoryEntry = {
    message: 'Service mismatch detected — remediation requested by operator',
    reason: RemediationHistoryEntryReason.ServiceMismatch,
    timestamp: Date.now(),
  }

  const history = [...(key.remediationHistory ?? [])]
  const existingIndex = history.findIndex((item) => item.reason === entry.reason)

  if (existingIndex !== -1) {
    history[existingIndex] = entry
  } else {
    history.push(entry)
    history.sort((a, b) => b.timestamp - a.timestamp)
  }

  updates.push({
    address: key.address,
    remediationHistory: history,
  })
}

Should we check that the key actually has a service mismatch before marking it here?
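
One way to act on this question is to filter keys before building the updates. The sketch below is an assumption-heavy illustration: the `expectedServices` and `activeServices` fields and the `hasServiceMismatch` helper are hypothetical, and unlike the loop under review it always re-sorts the history.

```typescript
interface RemediationHistoryEntry {
  message: string;
  reason: number;
  timestamp: number;
}

// Hypothetical key shape; the two service lists are assumed fields.
interface KeyWithServices {
  address: string;
  expectedServices: string[]; // service IDs the key should be staked for
  activeServices: string[]; // service IDs currently active on chain
  remediationHistory?: RemediationHistoryEntry[];
}

const SERVICE_MISMATCH = 1001; // reason code named in the PR summary

// Order-insensitive comparison of the two service ID lists.
function hasServiceMismatch(key: KeyWithServices): boolean {
  const expected = [...key.expectedServices].sort();
  const active = [...key.activeServices].sort();
  return expected.length !== active.length || expected.some((id, i) => id !== active[i]);
}

function buildRemediationUpdates(keys: KeyWithServices[], now: number) {
  // Keys without a real divergence are skipped entirely.
  return keys.filter(hasServiceMismatch).map((key) => {
    const entry: RemediationHistoryEntry = {
      message: "Service mismatch detected, remediation requested by operator",
      reason: SERVICE_MISMATCH,
      timestamp: now,
    };
    // Keep at most one entry per reason, newest first.
    const history = (key.remediationHistory ?? []).filter((item) => item.reason !== entry.reason);
    history.push(entry);
    history.sort((a, b) => b.timestamp - a.timestamp);
    return { address: key.address, remediationHistory: history };
  });
}
```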

setError(result.error.message)
return
}
setPaused(!paused)

Instead of deriving the new value from the captured paused state, we should use the functional updater: setPaused((prevPaused) => !prevPaused)
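
To show why the functional form matters, here is a tiny model of a queued state setter. It is not React and does not use React's API; `createState` is a hypothetical stand-in that only mimics how queued updates are applied in order.

```typescript
// Minimal model of a batched state setter (NOT React, only an illustration).
type Update<T> = T | ((prev: T) => T);

function createState<T>(initial: T) {
  let value = initial;
  const queue: Update<T>[] = [];
  return {
    get: () => value,
    set: (update: Update<T>) => {
      queue.push(update);
    },
    // Apply queued updates in order, as a batched re-render would.
    flush: () => {
      for (const update of queue.splice(0)) {
        value =
          typeof update === "function"
            ? (update as (prev: T) => T)(value)
            : (update as T);
      }
    },
  };
}

// Direct form: both calls reuse the value captured when the handler was created.
const direct = createState(false);
const captured = direct.get();
direct.set(!captured);
direct.set(!captured);
direct.flush();

// Functional form: each updater receives the latest pending value.
const functional = createState(false);
functional.set((prev) => !prev);
functional.set((prev) => !prev);
functional.flush();
```

After both flushes, `direct` ends up toggled once while `functional` ends up toggled twice, which is the difference the comment is pointing at.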

Comment on lines +341 to +353
export async function batchUpdateRemediationHistory(
  updates: Array<{ address: string; remediationHistory: RemediationHistoryEntry[] }>,
): Promise<void> {
  const dbClient = getDbClient()
  await Promise.all(
    updates.map((update) =>
      dbClient.db
        .update(keysTable)
        .set({ remediationHistory: update.remediationHistory })
        .where(eq(keysTable.address, update.address)),
    ),
  )
}

We should use a transaction here so the batch either applies fully or not at all.
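
A sketch of the transactional variant follows. To stay runnable without a database it uses a hypothetical `Db` interface and an in-memory fake; only the callback style (`transaction(async (tx) => ...)`) is meant to resemble what an ORM transaction API looks like, and none of these names come from the PR.

```typescript
interface HistoryUpdate {
  address: string;
  remediationHistory: unknown[];
}

// Hypothetical minimal transactional client.
interface Tx {
  updateHistory(update: HistoryUpdate): Promise<void>;
}
interface Db {
  transaction<T>(fn: (tx: Tx) => Promise<T>): Promise<T>;
}

async function batchUpdateRemediationHistory(db: Db, updates: HistoryUpdate[]): Promise<void> {
  if (updates.length === 0) return;
  await db.transaction(async (tx) => {
    for (const update of updates) {
      await tx.updateHistory(update); // a throw here aborts the whole batch
    }
  });
}

// In-memory fake: writes go to a staged copy and are committed only if fn resolves.
function createFakeDb(rows: Map<string, unknown[]>): Db {
  return {
    async transaction<T>(fn: (tx: Tx) => Promise<T>): Promise<T> {
      const staged = new Map(rows);
      const result = await fn({
        async updateHistory(update: HistoryUpdate): Promise<void> {
          if (!staged.has(update.address)) throw new Error(`unknown key ${update.address}`);
          staged.set(update.address, update.remediationHistory);
        },
      });
      rows.clear();
      staged.forEach((history, address) => rows.set(address, history));
      return result;
    },
  };
}
```

With independent updates run through `Promise.all`, one failed statement can leave the others applied; inside a transaction the failure discards the staged writes instead.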

jorgecuesta added a commit that referenced this pull request Mar 26, 2026
- AutoStakeToggle: replace manual useEffect with useQuery for schedule
  status polling (30s interval, consistent with codebase pattern)
- AutoStakeToggle: remove local paused state mutation, let useQuery
  refetch handle state updates after toggle
- Remediation.ts: verify actual service mismatch before marking keys
  in RequestRemediation (skip keys without real divergence)
- keys.ts: wrap batchUpdateRemediationHistory in a DB transaction
@jorgecuesta jorgecuesta mentioned this pull request Mar 26, 2026
5 tasks
jorgecuesta added a commit that referenced this pull request Mar 26, 2026
Review feedback (Alan):
- AutoStakeToggle: replace manual useEffect with useQuery (30s polling)
- AutoStakeToggle: remove local paused state mutation, let useQuery
  refetch handle state updates after toggle
- Remediation.ts: verify actual service mismatch before marking keys
  in RequestRemediation (skip keys without real divergence)
- keys.ts: wrap batchUpdateRemediationHistory in a DB transaction

Fix comparison divergence (Bug 1 from findings-supplier-restaking-loop.md):
- Filter 0% revShare entries in getExpectedServicesFromKey to match
  BuildSupplierServiceConfigHandler behavior (primary re-staking cause)
- Normalize rpcType to numeric via RPCTypeMap (prevents string vs
  number false mismatch)
- Use stakeOwner (from chain) as authoritative owner address, falling
  back to ownerAddress for backwards compatibility
- Update tests to verify new correct behavior
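
The three normalization steps listed above can be sketched as small pure functions. Everything here is illustrative: the `ServiceConfig` shape is assumed, and the `RPCTypeMap` entries are assumed values that may not match the real map in the repo.

```typescript
// Assumed subset of RPC types; the real RPCTypeMap may hold other entries or values.
const RPCTypeMap: Record<string, number> = {
  GRPC: 1,
  WEBSOCKET: 2,
  JSON_RPC: 3,
  REST: 4,
};

interface RevShare {
  address: string;
  share: number; // percentage
}

interface ServiceConfig {
  serviceId: string;
  rpcType: string | number;
  revShare: RevShare[];
}

// Strings and numbers collapse to the same numeric form before comparison.
function normalizeRpcType(rpcType: string | number): number {
  return typeof rpcType === "number" ? rpcType : RPCTypeMap[rpcType] ?? 0;
}

function normalizeService(service: ServiceConfig) {
  return {
    serviceId: service.serviceId,
    rpcType: normalizeRpcType(service.rpcType),
    // 0% entries are dropped so both sides of the comparison agree.
    revShare: service.revShare.filter((entry) => entry.share > 0),
  };
}

// stakeOwner (from chain) wins; ownerAddress is the backwards-compatible fallback.
function resolveOwner(key: { stakeOwner?: string; ownerAddress?: string }): string | undefined {
  return key.stakeOwner || key.ownerAddress;
}
```

Comparing normalized values on both sides is what avoids a false mismatch such as "JSON_RPC" versus 3, which would otherwise keep triggering a re-stake.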
jorgecuesta added a commit that referenced this pull request Mar 26, 2026
Review feedback:
- AutoStakeToggle: replace manual useEffect with useQuery (30s polling)
- AutoStakeToggle: remove local paused state mutation, let useQuery
  refetch handle state updates after toggle
- Remediation.ts: verify actual service mismatch before marking keys
  in RequestRemediation (skip keys without real divergence)
- keys.ts: wrap batchUpdateRemediationHistory in a DB transaction

Fix service comparison divergence (root cause of supplier re-staking loop):
- Filter 0% revShare entries in getExpectedServicesFromKey to match
  BuildSupplierServiceConfigHandler behavior
- Normalize rpcType to numeric via RPCTypeMap (prevents string vs
  number false mismatch)
- Use stakeOwner (from chain) as authoritative owner address, falling
  back to ownerAddress for backwards compatibility
- Update tests to verify new correct behavior
jorgecuesta added a commit that referenced this pull request Mar 26, 2026
…#256)

Fix service comparison divergence (root cause of supplier re-staking loop):
- Filter 0% revShare entries in getExpectedServicesFromKey to match
  BuildSupplierServiceConfigHandler behavior
- Normalize rpcType to numeric via RPCTypeMap (prevents string vs
  number false mismatch)
- Use stakeOwner (from chain) as authoritative owner address, falling
  back to ownerAddress for backwards compatibility
- Update tests to verify new correct behavior
jorgecuesta added a commit that referenced this pull request Mar 26, 2026
…253)

* Split remediation schedules, add UI controls, auto-bootstrap, and tests

Remediation schedule split:
- Separate SupplierRemediation (1001 ServiceMismatch, pausable) from
  SupplierInitialStake (1003 OwnerInitialStake, always active)
- Add workflowType field to decouple schedule key from Temporal workflow
- Bootstrap auto-updates existing schedule args on deploy

Provider UI controls:
- Auto Stake toggle: pause/unpause remediation schedule from keys page
- Request Remediation: two-step evaluate→confirm flow for manual trigger
- Rename "Mark for Remediation" to "Clear Remediation" for clarity
- New server actions: Schedules.ts (get/toggle/trigger), Remediation.ts
  (evaluate needs, request remediation with Zod validation)

Workflow fixes:
- Owner sync: backfill empty ownerAddress from on-chain data for legacy keys
- Early return in SupplierStatus/SupplierRemediation when no keys exist
- Fix Temporal client defaults (middleman→provider namespace)

Auto-bootstrap for dev:
- Bootstrap seed scripts for provider and middleman (standalone, idempotent)
- Fetch minimumStake and height from Pocket API at seed time
- Fetch delegators/providers from governance CDN (single source of truth)
- Tilt conditionally injects init container + ConfigMap from local JSON
- Example configs with pocket-lego-testnet defaults

Tests:
- Fixture file with 7 realistic supplier/key scenarios
- 8 new tests for getExpectedServicesFromKey and getSupplierActiveServices
- 6 new comparison handler tests covering edge cases from re-staking bug

Docs and UI consistency:
- Unify "Node API URL"/"Shannon API URL" to "Pocket API URL" across both apps
- Update DEVELOP.md with auto-bootstrap setup instructions
- Update key-management docs with new button names and actions

* Fix jest moduleNameMapper for enums and simplify Tilt ignore patterns

- Map @igniter/db/provider/enums to correct schema/enums path in jest
- Revert fixture import to @igniter/db/provider/enums (valid for tsc)
- Simplify Tilt ignore across all 4 apps: **/dist, **/*.test.ts, **/__fixtures__
- Prevents local test/build from triggering Tilt rebuild cycles
jorgecuesta added a commit that referenced this pull request Mar 26, 2026
* Add branch-gate check: only release branch can target main (#250)

* Improve bootstrap UX, workflow schedule management, and setup wizard (#251)

* Fix workflow bootstrap to evaluate and update existing schedules

- Bootstrap now compares args and interval of existing schedules against
  desired config. If different, logs warning and updates via Temporal API.
- Previously, existing schedules were always skipped regardless of config drift.
- Add ENV variable overrides for schedule intervals (SCHEDULE_<TYPE>_INTERVAL)
- Unify config: args, interval, and ENV var name in single record per workflow
- Fix dev overlay RPC URL: shannon-testnet-grove-rpc → sauron-rpc.beta
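
The drift check and the ENV override described above might look roughly like this. The `ScheduleConfig` record, the interval strings, and the concrete variable name used in the example are assumptions; only the `SCHEDULE_<TYPE>_INTERVAL` pattern comes from the commit message.

```typescript
// One record per workflow: args, default interval, and the ENV var that overrides it.
interface ScheduleConfig {
  args: unknown[];
  interval: string;
  envVar: string;
}

interface ScheduleState {
  args: unknown[];
  interval: string;
}

// An ENV override wins over the default when it is set and non-empty.
function resolveInterval(config: ScheduleConfig, env: Record<string, string | undefined>): string {
  const override = env[config.envVar]?.trim();
  return override ? override : config.interval;
}

// Note: JSON.stringify comparison is sensitive to object key order.
function needsUpdate(existing: ScheduleState, desired: ScheduleState): boolean {
  return (
    existing.interval !== desired.interval ||
    JSON.stringify(existing.args) !== JSON.stringify(desired.args)
  );
}

// Hypothetical example record; the variable name is made up for illustration.
const remediation: ScheduleConfig = {
  args: [{ reasons: [1001] }],
  interval: "5m",
  envVar: "SCHEDULE_SUPPLIER_REMEDIATION_INTERVAL",
};
```

On deploy, the bootstrap would build the desired state from the record plus `resolveInterval`, then call the Temporal update only when `needsUpdate` returns true.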

* Wait for app bootstrap before creating Temporal schedules

Workers now poll the database for isBootstrapped=true before proceeding
with schedule creation and starting the Temporal worker. This prevents
creating schedules for apps that haven't been set up through the web UI.

- Add isBootstrapped() method to both DALs (clean query, no noisy warnings)
- Poll interval configurable via BOOTSTRAP_POLL_INTERVAL_MS (default: 5000ms)
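
A minimal sketch of the polling gate, assuming the check is passed in as a function. `waitForBootstrap` and `readPollInterval` are hypothetical names; the variable `BOOTSTRAP_POLL_INTERVAL_MS` and its 5000ms default are from the commit message.

```typescript
const DEFAULT_POLL_INTERVAL_MS = 5000;

// Falls back to the default when the variable is missing or not a positive number.
function readPollInterval(env: Record<string, string | undefined>): number {
  const parsed = Number(env.BOOTSTRAP_POLL_INTERVAL_MS);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_POLL_INTERVAL_MS;
}

// Resolves with the number of checks performed once the app reports it is bootstrapped.
async function waitForBootstrap(
  isBootstrapped: () => Promise<boolean>,
  intervalMs: number,
): Promise<number> {
  let attempts = 1;
  while (!(await isBootstrapped())) {
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    attempts += 1;
  }
  return attempts;
}
```

A worker would await `waitForBootstrap` before creating schedules or starting the Temporal worker, so nothing is scheduled for an app that has not been set up yet.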

* Improve bootstrap UX, workflow schedule management, and setup wizard

Bootstrap workflows:
- Bootstrap now compares args and interval of existing schedules and updates if different
- Add ENV variable overrides for schedule intervals (SCHEDULE_<TYPE>_INTERVAL)
- Workers wait for app isBootstrapped=true before creating Temporal schedules
- Fix dev overlay RPC URL to use sauron-api.beta

Setup wizard (both apps):
- Fix RPC/Indexer URL validation: clear errors on change, no focus loss during validation
- Reset height when switching networks to prevent false height regression errors
- Add .prettierignore for proto/generated files
- Fix provider setup layout: remove sidebar, add scroll support
- Compact step navigation (prev ← current → next) for provider's 8 steps
- Add skeleton loading states instead of spinner
- Add SetupHelpBar component with links to docs for every step
- Add quick-fill buttons (Mainnet/Beta) for Node API + Indexer URLs
- Provider settings form: real-time validation with debounce, chain-id mismatch as error, height regression as warning
- Fix form label/input alignment in shared UI components

Tables and dialogs:
- Redesign table component: bordered container, header separation, row hover
- Fix DataTable: align cell padding with headers, always show action buttons
- All dialogs: max-h-[90vh] with scroll, consistent styling
- RelayMiner dialog: auto-generate identity from name, auto-select single region
- Region dialog: auto-generate urlValue from displayName
- Service dialog: load all services from chain, local search with dropdown
- Address group dialog: real buttons for Add Share/Add Address, Trash2Icon for remove
- Switch component: larger size, better unchecked contrast
- Back buttons use outline variant across all steps

Bootstrap complete step:
- Summary with settings details and entity counts
- Overview page: graceful handling of empty state with summary cards

* Split remediation schedules, UI controls, auto-bootstrap, and tests (#253)

* Split remediation schedules, add UI controls, auto-bootstrap, and tests

Remediation schedule split:
- Separate SupplierRemediation (1001 ServiceMismatch, pausable) from
  SupplierInitialStake (1003 OwnerInitialStake, always active)
- Add workflowType field to decouple schedule key from Temporal workflow
- Bootstrap auto-updates existing schedule args on deploy

Provider UI controls:
- Auto Stake toggle: pause/unpause remediation schedule from keys page
- Request Remediation: two-step evaluate→confirm flow for manual trigger
- Rename "Mark for Remediation" to "Clear Remediation" for clarity
- New server actions: Schedules.ts (get/toggle/trigger), Remediation.ts
  (evaluate needs, request remediation with Zod validation)

Workflow fixes:
- Owner sync: backfill empty ownerAddress from on-chain data for legacy keys
- Early return in SupplierStatus/SupplierRemediation when no keys exist
- Fix Temporal client defaults (middleman→provider namespace)

Auto-bootstrap for dev:
- Bootstrap seed scripts for provider and middleman (standalone, idempotent)
- Fetch minimumStake and height from Pocket API at seed time
- Fetch delegators/providers from governance CDN (single source of truth)
- Tilt conditionally injects init container + ConfigMap from local JSON
- Example configs with pocket-lego-testnet defaults

Tests:
- Fixture file with 7 realistic supplier/key scenarios
- 8 new tests for getExpectedServicesFromKey and getSupplierActiveServices
- 6 new comparison handler tests covering edge cases from re-staking bug

Docs and UI consistency:
- Unify "Node API URL"/"Shannon API URL" to "Pocket API URL" across both apps
- Update DEVELOP.md with auto-bootstrap setup instructions
- Update key-management docs with new button names and actions

* Fix jest moduleNameMapper for enums and simplify Tilt ignore patterns

- Map @igniter/db/provider/enums to correct schema/enums path in jest
- Revert fixture import to @igniter/db/provider/enums (valid for tsc)
- Simplify Tilt ignore across all 4 apps: **/dist, **/*.test.ts, **/__fixtures__
- Prevents local test/build from triggering Tilt rebuild cycles

* Fix false-positive success in remediateSupplier (#243)

* Update `remediateSupplier` to use `ApplicationFailure` exceptions for improved error handling and retry logic

* Update `upsertSupplierStatus` to return detailed result with state, remediation reasons, and supplier context

* Normalize domains to second-level structure in activity handler (#252)

* Fix text contrast in gradient-border status pills (#254)

White text on all gradient-border pills (green, orange, purple, slate)
for readable contrast. Affects node detail, transaction detail, stake,
unstake, and import success views in the middleman app.

* Run docker builds only after quality checks pass (#255)

* Address PR #253 review feedback and fix service comparison divergence (#256)

Fix service comparison divergence (root cause of supplier re-staking loop):
- Filter 0% revShare entries in getExpectedServicesFromKey to match
  BuildSupplierServiceConfigHandler behavior
- Normalize rpcType to numeric via RPCTypeMap (prevents string vs
  number false mismatch)
- Use stakeOwner (from chain) as authoritative owner address, falling
  back to ownerAddress for backwards compatibility
- Update tests to verify new correct behavior

* chore(deploy): update staging image to a0e0afb

---------

Co-authored-by: Jorge S. Cuesta <jorge.s.cuesta@gmail.com>
Co-authored-by: Alan Rojas <alan2rm7@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
3 participants