Closed
Changes from 8 commits
32 changes: 32 additions & 0 deletions .ai-pr-analyzer/config.yaml
@@ -0,0 +1,32 @@
repo: metamask/metamask-mobile

critical:
  files:
    - package.json
    - metro.config.js
    - babel.config.js
    - .detoxrc.js
  keywords:
    - Controller
    - Engine
  paths:
    - app/core/
    - tests/
    - .github/workflows/
    - .github/actions/

searchDirs:
  - app/
  - e2e/
  - .github/
  - scripts/
  - tests/

models:
  default: anthropic/claude-haiku-4-5
  escalation: anthropic/claude-sonnet-4-6
  escalationThreshold: 3

modes:
  - pr-risk-analysis
  - select-tags
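How the analyzer consumes the `critical` block is not shown in this diff. The following is a minimal sketch under stated assumptions: a changed file counts as critical when its basename is listed under `files`, when its path contains one of the `keywords`, or when it starts with one of the `paths`. The function name `is_critical` and these matching rules are hypothetical, not taken from the tool.

```python
from pathlib import PurePosixPath

# Values copied from the `critical` block of config.yaml.
CRITICAL = {
    "files": ["package.json", "metro.config.js", "babel.config.js", ".detoxrc.js"],
    "keywords": ["Controller", "Engine"],
    "paths": ["app/core/", "tests/", ".github/workflows/", ".github/actions/"],
}


def is_critical(path: str, critical: dict = CRITICAL) -> bool:
    """Hypothetical matcher; the real analyzer's rules may differ."""
    # Assumption: `files` entries are compared against the basename.
    if PurePosixPath(path).name in critical["files"]:
        return True
    # Assumption: `keywords` are case-sensitive substrings of the path.
    if any(keyword in path for keyword in critical["keywords"]):
        return True
    # Assumption: `paths` entries are repository-relative prefixes.
    return any(path.startswith(prefix) for prefix in critical["paths"])
```

Under these assumptions `app/core/Engine/Engine.ts` is critical through both a keyword and a path prefix, while `app/components/UI/Button/index.tsx` is not.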
22 changes: 22 additions & 0 deletions .ai-pr-analyzer/modes/pr-risk-analysis/hard-rules.json
@@ -0,0 +1,22 @@
[
  {
    "name": "mass-critical-changes",
    "description": "More than 20 critical files changed — automatic high risk",
    "trigger": {
      "type": "criticalFileCount",
      "threshold": 20
    },
    "action": "run-all",
    "result": {
      "risk_score": 85,
      "risk_level": "high",
      "summary": "Large-scale change affecting more than 20 critical files. Manual review strongly recommended.",
      "testing_recommendations": [
        "Full regression test suite recommended",
        "Manual review of all critical file changes",
        "Verify no breaking changes to controller initialization order",
        "Check Engine startup sequence"
      ]
    }
  }
]
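The rule above can be read as a short-circuit that bypasses the model when the trigger fires. The sketch below assumes the comparison is strictly greater than `threshold`, which is how the rule's own description words it; `apply_hard_rule` is a hypothetical helper, and the rule is trimmed to the fields the sketch uses.

```python
import json

# Trimmed copy of the rule; only the fields used by the sketch are kept.
RULE = json.loads("""
{
  "name": "mass-critical-changes",
  "trigger": {"type": "criticalFileCount", "threshold": 20},
  "action": "run-all",
  "result": {"risk_score": 85, "risk_level": "high"}
}
""")


def apply_hard_rule(rule: dict, critical_file_count: int):
    """Return the canned result if the trigger fires, otherwise None.

    Assumption: "more than 20" means a strict `>` comparison.
    """
    trigger = rule["trigger"]
    if trigger["type"] != "criticalFileCount":
        return None  # other trigger types are out of scope for this sketch
    if critical_file_count > trigger["threshold"]:
        return rule["result"]
    return None
```

With exactly 20 critical files the sketch returns `None` and analysis would proceed normally. With 21 it returns the fixed high-risk result.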
48 changes: 48 additions & 0 deletions .ai-pr-analyzer/modes/pr-risk-analysis/prompt-context.md
@@ -0,0 +1,48 @@
You are analyzing MetaMask Mobile, a React Native mobile wallet for Ethereum networks (iOS and Android).

Your focus is **production regression risk** — the likelihood that these changes break existing user-facing functionality, introduce new bugs, or cause unexpected side effects in the shipped app.

Production regression risk is what matters most. Test-only changes (fixtures, mocks, helpers, E2E specs) are worth noting but should not dominate the overall score unless they also impact production code paths.

ARCHITECTURE OVERVIEW:

- Engine (app/core/Engine/) is the central orchestrator managing 70+ controllers via ComposableController
- Controllers follow MetaMask's BaseController pattern with messenger-based communication
- Redux for global state, Redux Saga for side effects, redux-persist for persistence
- React Navigation v5 for navigation
- Detox for E2E testing, Jest for unit tests

CRITICAL DEPENDENCY CHAINS (changes here cascade widely):

- NetworkController and KeyringController are Phase 2 (foundational) — changes cascade to 40+ controllers
- TransactionController depends on Network, GasFee, Keyring, and Approval controllers
- Token controllers (Tokens, TokenBalances, TokenRates, NftController) depend on Network and AssetsContract
- AccountsController depends on KeyringController

REGRESSION-PRONE PATTERNS:

- Engine.ts initialization order changes → can break app startup
- Controller messenger configuration changes (allowedActions/allowedEvents) → breaks inter-controller communication
- BACKGROUND_STATE_CHANGE_EVENT_NAMES modifications → UI state sync failures
- Direct imports between controllers (vs messenger calls) → circular dependency risk
- State schema changes without migration → data loss on app update
- Package version bumps in @metamask/\* controller packages → API surface changes

SIGNALS THAT INCREASE RISK:

- Modifying a function/class used by many consumers (high fan-out)
- Changing exported API signatures or return types
- Altering behaviour of shared utilities or hooks
- Removing or renaming exports without updating all consumers
- Changing default values or configuration
- Modifying state shapes that are persisted or serialized

SIGNALS THAT DECREASE RISK:

- Purely additive changes (new files, new exports, new optional params)
- Documentation, comments, formatting only
- Test-only changes with no production code modified
- Isolated component changes with no shared dependencies
- Adding testIDs for E2E testing

Use the `load_skill` tool with `metamask-core-architecture` for deeper architectural analysis when changes involve Engine, controllers, or messenger patterns.
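The "critical dependency chains" list in this prompt can be pictured as a small graph in which a change to one controller reaches everything that depends on it, directly or transitively. The sketch below encodes only the edges the prompt states, using the short names as written there. It is an illustration, not the analyzer's data model, and it covers far fewer controllers than the 40+ the prompt mentions.

```python
from collections import deque

# controller -> controllers it depends on, as listed in the prompt.
DEPENDS_ON = {
    "Transaction": ["Network", "GasFee", "Keyring", "Approval"],
    "Tokens": ["Network", "AssetsContract"],
    "TokenBalances": ["Network", "AssetsContract"],
    "TokenRates": ["Network", "AssetsContract"],
    "NftController": ["Network", "AssetsContract"],
    "Accounts": ["Keyring"],
}


def affected_by(changed: str, depends_on: dict = DEPENDS_ON) -> set:
    """Return every controller that transitively depends on `changed`."""
    # Invert the edges: dependency -> set of direct dependents.
    dependents = {}
    for controller, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(controller)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen
```

In this reduced graph a change to `Keyring` reaches `Transaction` and `Accounts`, while a change to `Accounts` reaches nothing, which matches the prompt's point that foundational controllers carry the widest blast radius.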
47 changes: 47 additions & 0 deletions .ai-pr-analyzer/modes/select-tags/fallback.json
@@ -0,0 +1,47 @@
{
  "conservative": {
    "selected_tags": [
      "SmokeAccounts",
      "SmokeConfirmations",
      "SmokeIdentity",
      "SmokeNetworkAbstractions",
      "SmokeNetworkExpansion",
      "SmokeTrade",
      "SmokeWalletPlatform",
      "SmokeCard",
      "SmokePerps",
      "SmokeRamps",
      "SmokeMultiChainAPI",
      "SmokePredictions",
      "FlaskBuildTests"
    ],
    "risk_level": "high",
    "confidence": 100,
    "reasoning": "Fallback: AI analysis did not complete successfully. Running all tests.",
    "performance_tests": {
      "selected_tags": [
        "@PerformanceAccountList",
        "@PerformanceOnboarding",
        "@PerformanceLogin",
        "@PerformanceSwaps",
        "@PerformanceLaunch",
        "@PerformanceAssetLoading",
        "@PerformancePredict",
        "@PerformancePreps"
      ],
      "reasoning": "Fallback: AI analysis did not complete successfully. Running all performance tests."
    },
    "label": "risk-high"
  },
  "empty": {
    "selected_tags": [],
    "risk_level": "low",
    "confidence": 100,
    "reasoning": "No files changed - no analysis needed",
    "performance_tests": {
      "selected_tags": [],
      "reasoning": "No files changed - no performance tests needed"
    },
    "label": "risk-low"
  }
}
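The two `reasoning` strings suggest when each fallback applies: `empty` when the PR has no changed files, `conservative` when the AI analysis fails. A minimal sketch of that selection follows; `choose_fallback` and its parameters are hypothetical, and the fallback objects are trimmed for brevity.

```python
# Trimmed stand-in for the parsed fallback.json.
FALLBACKS = {
    "conservative": {"risk_level": "high", "label": "risk-high"},
    "empty": {"risk_level": "low", "label": "risk-low"},
}


def choose_fallback(fallbacks: dict, changed_files: list, analysis_succeeded: bool):
    """Pick a canned result, or None when the model's own output should be used.

    Assumption: an empty change set is checked before the analysis outcome,
    since no analysis is needed in that case.
    """
    if not changed_files:
        return fallbacks["empty"]
    if not analysis_succeeded:
        return fallbacks["conservative"]
    return None
```

Failing toward the full tag set means a broken analyzer costs CI time rather than test coverage.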
58 changes: 58 additions & 0 deletions .ai-pr-analyzer/modes/select-tags/finalize-schema.json
@@ -0,0 +1,58 @@
{
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "selected_tags": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "Selected Detox E2E tags to run"
    },
    "risk_level": {
      "type": "string",
      "enum": ["low", "medium", "high"],
      "description": "Regression risk level for this PR"
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 100,
      "description": "Confidence score for this selection"
    },
    "reasoning": {
      "type": "string",
      "description": "Reasoning for selected tags and risk level"
    },
    "performance_tests": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "selected_tags": {
          "type": "array",
          "items": {
            "type": "string"
          },
          "description": "Selected performance test tags; empty array means none required"
        },
        "reasoning": {
          "type": "string",
          "description": "Reasoning for performance test selection"
        }
      },
      "required": ["selected_tags", "reasoning"]
    },
    "label": {
      "type": "string",
      "description": "Suggested risk label (e.g., risk-low, risk-medium, risk-high)"
    }
  },
  "required": [
    "selected_tags",
    "risk_level",
    "confidence",
    "reasoning",
    "performance_tests",
    "label"
  ]
}
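To illustrate what this schema accepts and rejects, here is a small hand-rolled checker that supports only the keywords the schema uses (`type`, `properties`, `required`, `additionalProperties`, `items`, `enum`, `minimum`, `maximum`). It is a sketch for reading the schema, not a replacement for a full JSON Schema validator, and `validate` is a hypothetical name. Descriptions are omitted from the embedded copy.

```python
STRING_LIST = {"type": "array", "items": {"type": "string"}}
SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "selected_tags": STRING_LIST,
        "risk_level": {"type": "string", "enum": ["low", "medium", "high"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 100},
        "reasoning": {"type": "string"},
        "performance_tests": {
            "type": "object",
            "additionalProperties": False,
            "properties": {"selected_tags": STRING_LIST, "reasoning": {"type": "string"}},
            "required": ["selected_tags", "reasoning"],
        },
        "label": {"type": "string"},
    },
    "required": ["selected_tags", "risk_level", "confidence", "reasoning",
                 "performance_tests", "label"],
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms."""
    errors = []
    kind = schema.get("type")
    if kind == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        props = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required property")
        if schema.get("additionalProperties") is False:
            errors += [f"{path}.{k}: unexpected property" for k in value if k not in props]
        for key, sub in props.items():
            if key in value:
                errors += validate(value[key], sub, f"{path}.{key}")
    elif kind == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array"]
        for i, item in enumerate(value):
            errors += validate(item, schema.get("items", {}), f"{path}[{i}]")
    elif kind == "string":
        if not isinstance(value, str):
            return [f"{path}: expected string"]
        if "enum" in schema and value not in schema["enum"]:
            errors.append(f"{path}: {value!r} is not one of {schema['enum']}")
    elif kind == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return [f"{path}: expected number"]
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: above maximum {schema['maximum']}")
    return errors
```

Because `additionalProperties` is `false` at both levels, a result with a stray key such as `risk_score` would be rejected, even though that key is valid in the `pr-risk-analysis` hard rule.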
43 changes: 43 additions & 0 deletions .ai-pr-analyzer/modes/select-tags/hard-rules.json
@@ -0,0 +1,43 @@
[
  {
    "name": "controller-version-update",
    "description": "@metamask controller package version updated in package.json",
    "trigger": {
      "type": "diffPattern",
      "file": "package.json",
      "diffPattern": "@metamask\\/[^\"\\n]*controller[^\"\\n]*\""
    },
    "result": {
      "selected_tags": [
        "SmokeAccounts",
        "SmokeConfirmations",
        "SmokeIdentity",
        "SmokeNetworkAbstractions",
        "SmokeNetworkExpansion",
        "SmokeTrade",
        "SmokeWalletPlatform",
        "SmokeCard",
        "SmokePerps",
        "SmokeRamps",
        "SmokeMultiChainAPI",
        "SmokePredictions",
        "FlaskBuildTests"
      ],
      "risk_level": "high",
      "label": "risk-high",
      "performance_tests": {
        "selected_tags": [
          "@PerformanceAccountList",
          "@PerformanceOnboarding",
          "@PerformanceLogin",
          "@PerformanceSwaps",
          "@PerformanceLaunch",
          "@PerformanceAssetLoading",
          "@PerformancePredict",
          "@PerformancePreps"
        ],
        "reasoning": "Hard rule: package-level controller updates can broadly impact runtime behavior. Running all performance tests."
      }
    }
  }
]
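Once JSON-unescaped, the `diffPattern` value is the regular expression `@metamask\/[^"\n]*controller[^"\n]*"`: an `@metamask/` package name that contains the lowercase word `controller` and ends at the closing quote. The sketch below decodes the pattern exactly as written in the rule and applies it to added and removed lines of a `package.json` diff. Scanning only changed lines is an assumption, and `touches_controller_package` is a hypothetical helper.

```python
import json
import re

# The "diffPattern" value exactly as it appears (JSON-escaped) in the rule.
RAW_JSON = r'"@metamask\\/[^\"\\n]*controller[^\"\\n]*\""'
DIFF_PATTERN = re.compile(json.loads(RAW_JSON))


def touches_controller_package(diff_text: str) -> bool:
    """Return True if any added or removed diff line matches the pattern."""
    for line in diff_text.splitlines():
        # Skip the "+++ b/..." and "--- a/..." file headers.
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")) and DIFF_PATTERN.search(line):
            return True
    return False
```

The match is case-sensitive, so it fires on names such as `@metamask/network-controller` but not on `@metamask/utils`.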
4 changes: 4 additions & 0 deletions .ai-pr-analyzer/modes/select-tags/mode.yaml
@@ -0,0 +1,4 @@
id: select-tags
description: Select E2E and performance test tags based on regression risk
finalizeToolName: finalize_select_tags
outputFile: select-tags.json
37 changes: 37 additions & 0 deletions .ai-pr-analyzer/modes/select-tags/prompt-context.md
@@ -0,0 +1,37 @@
AVAILABLE E2E TEST TAGS (the only valid E2E tags):

- SmokeAccounts: account security, SRP reveal/export, multi-account management
- SmokeConfirmations: transaction/signature confirmations, approvals, gas flows
- SmokeIdentity: profile sync, account/contact sync flows
- SmokeNetworkAbstractions: network manager, chain permissions, selector flows
- SmokeNetworkExpansion: Solana and multi-chain provider behavior
- SmokeTrade: swap/bridge/staking trade flows
- SmokeWalletPlatform: trending, activity, core wallet platform behavior
- SmokeCard: card home/add-funds/management integration
- SmokePerps: perps add-funds and account trading flows
- SmokeRamps: fiat on-ramp/off-ramp flows
- SmokeMultiChainAPI: CAIP-25 wallet session APIs and permission updates
- SmokePredictions: prediction lifecycle and related balances/activities
- FlaskBuildTests: Snaps functionality and Flask-specific behavior

TAG DEPENDENCY RULES:

- If selecting `SmokeTrade` for swap/bridge flows, also include `SmokeConfirmations`.
- If selecting `SmokeNetworkExpansion` for Solana signing/transactions, also include `SmokeConfirmations`.
- For Snaps-related changes, include `FlaskBuildTests`.

AVAILABLE PERFORMANCE TAGS:

- @PerformanceAccountList: account selector/list rendering and dismissal performance
- @PerformanceOnboarding: wallet setup/onboarding performance
- @PerformanceLogin: unlock/login/session restoration performance
- @PerformanceSwaps: swap flow performance
- @PerformanceLaunch: cold/warm launch startup performance
- @PerformanceAssetLoading: token/NFT/balance loading performance
- @PerformancePredict: prediction market loading and interaction performance
- @PerformancePreps: perps market and order flow performance

PERFORMANCE SELECTION NOTES:

- Use an empty array when no meaningful performance risk is introduced.
- For broad shared-surface changes or uncertainty, be conservative and include relevant performance tags.
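The tag dependency rules in this file could also be enforced after the model responds, as a safety net. The sketch below is a simplification: it adds `SmokeConfirmations` whenever `SmokeTrade` or `SmokeNetworkExpansion` is selected, whereas the rules as written are conditional on the kind of flow (swap/bridge, Solana signing/transactions). The Snaps rule is omitted because it depends on the nature of the change rather than on another tag. `normalize_tags` is a hypothetical helper.

```python
# E2E tag catalog, in the order listed in this file.
E2E_TAGS = [
    "SmokeAccounts", "SmokeConfirmations", "SmokeIdentity",
    "SmokeNetworkAbstractions", "SmokeNetworkExpansion", "SmokeTrade",
    "SmokeWalletPlatform", "SmokeCard", "SmokePerps", "SmokeRamps",
    "SmokeMultiChainAPI", "SmokePredictions", "FlaskBuildTests",
]

# Simplified dependency rules: selecting the key implies the listed tags.
REQUIRES = {
    "SmokeTrade": ["SmokeConfirmations"],
    "SmokeNetworkExpansion": ["SmokeConfirmations"],
}


def normalize_tags(selected: list) -> list:
    """Reject unknown tags, add implied tags, and return them in catalog order."""
    unknown = [tag for tag in selected if tag not in E2E_TAGS]
    if unknown:
        raise ValueError(f"unknown E2E tags: {unknown}")
    chosen = set(selected)
    for tag in selected:
        chosen.update(REQUIRES.get(tag, []))
    return [tag for tag in E2E_TAGS if tag in chosen]
```

Returning tags in catalog order also removes duplicates and keeps the output stable across runs.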
39 changes: 39 additions & 0 deletions .ai-pr-analyzer/modes/select-tags/system-prompt.md
@@ -0,0 +1,39 @@
You are an expert in E2E testing for MetaMask Mobile.

GOAL:
Select only the E2E and performance test tags needed to validate this PR while minimizing regression risk.

{{skills_section}}

{{reasoning_section}}

{{tools_section}}

{{confidence_guidance}}

{{critical_patterns}}

RISK ASSESSMENT:

- Low: minimal regression likelihood, narrow or non-behavioral changes
- Medium: moderate regression likelihood, behavior changed in bounded areas
- High: high regression likelihood, shared flows, infra, or risky architectural surfaces changed

GUIDANCE:

- Selecting all E2E tags is valid when uncertainty is high.
- Selecting no E2E tags is valid when changes are clearly unrelated to app/runtime behavior.
- Changes only in `wdio/` or `tests/performance/` usually do not require Detox E2E tags unless app code is also changed.
- Be conservative when the PR touches testing infrastructure, workflows, fixtures, page objects, or broad shared components.
- `FlaskBuildTests` is for Snaps functionality and Flask-specific behavior.
- Prefer several independent tool calls in each iteration so you can investigate thoroughly before finalizing.
- Do not exceed {{max_iterations}} iterations.

PERFORMANCE TEST GUIDANCE:
Select performance tags when changes can impact:

- rendering and interaction responsiveness
- startup/login/account/network loading paths
- state management and data-fetch heavy flows
- swap/trade/predict/perps critical journeys
- performance test infrastructure itself
21 changes: 21 additions & 0 deletions .ai-pr-analyzer/modes/select-tags/task-prompt.md
@@ -0,0 +1,21 @@
Analyze this PR and determine:

1. which Detox E2E tags must run
2. which performance tags must run

Use only tags from the catalog below.

{{prompt_context}}

{{changed_files}}

PRELOADED CRITICAL DIFF SUMMARY:
{{change_summary}}

Return your decision by calling `{{finalize_tool_name}}`.

Requirements before finalizing:

- Validate selected tags cover likely impacted user flows and shared dependencies.
- Use `performance_tests.selected_tags` as an empty array when performance testing is not needed.
- Keep reasoning specific to regression risk and potential bug introduction.
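The `{{name}}` placeholders in this prompt are presumably filled in before it is sent to the model. The substitution mechanism is not part of this diff, so the following is only a sketch of one plausible approach: a regex-based `render` function (hypothetical) that fails loudly on a missing value rather than leaving a raw placeholder in the prompt.

```python
import re

# Matches {{identifier}} placeholders such as {{finalize_tool_name}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, values: dict) -> str:
    """Replace every {{name}} with values[name]; raise KeyError if one is missing."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing template value: {name}")
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)
```

For example, rendering the line ``Return your decision by calling `{{finalize_tool_name}}`.`` with `finalize_tool_name` set to `finalize_select_tags` (the value in `mode.yaml`) yields the tool name the model is expected to call.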