MetaMask · cmd-ob · Mar 27, 2026
diff --git a/.ai-pr-analyzer/config.yaml b/.ai-pr-analyzer/config.yaml
@@ -0,0 +1,32 @@
+repo: metamask/metamask-mobile
+
+critical:
+  files:
+    - package.json
+    - metro.config.js
+    - babel.config.js
+    - .detoxrc.js
+  keywords:
+    - Controller
+    - Engine
+  paths:
+    - app/core/
+    - tests/
+    - .github/workflows/
+    - .github/actions/
+
+searchDirs:
+  - app/
+  - e2e/
+  - .github/
+  - scripts/
+  - tests/
+
+models:
+  default: anthropic/claude-haiku-4-5
+  escalation: anthropic/claude-sonnet-4-6
+  escalationThreshold: 3
+
+modes:
+  - pr-risk-analysis
+  - select-tags
diff --git a/.ai-pr-analyzer/modes/pr-risk-analysis/hard-rules.json b/.ai-pr-analyzer/modes/pr-risk-analysis/hard-rules.json
@@ -0,0 +1,22 @@
+[
+  {
+    "name": "mass-critical-changes",
+    "description": "More than 20 critical files changed — automatic high risk",
+    "trigger": {
+      "type": "criticalFileCount",
+      "threshold": 20
+    },
+    "action": "run-all",
+    "result": {
+      "risk_score": 85,
+      "risk_level": "high",
+      "summary": "Large-scale change affecting more than 20 critical files. Manual review strongly recommended.",
+      "testing_recommendations": [
+        "Full regression test suite recommended",
+        "Manual review of all critical file changes",
+        "Verify no breaking changes to controller initialization order",
+        "Check Engine startup sequence"
+      ]
+    }
+  }
+]
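Review note: the `criticalFileCount` trigger above can be read as "count the changed files that match the `critical` section of `config.yaml`, then compare against `threshold`". The sketch below illustrates that reading only. The analyzer's real matching logic is not part of this diff, so `isCritical`, `massCriticalChanges`, and the matching semantics (exact match for `files`, substring for `keywords`, prefix for `paths`, strictly greater than the threshold) are all assumptions.

```typescript
// Hypothetical sketch: names and matching semantics are assumptions,
// not the analyzer's actual implementation.
interface CriticalConfig {
  files: string[];
  keywords: string[];
  paths: string[];
}

// Values copied from the `critical` section of config.yaml in this PR.
const critical: CriticalConfig = {
  files: ["package.json", "metro.config.js", "babel.config.js", ".detoxrc.js"],
  keywords: ["Controller", "Engine"],
  paths: ["app/core/", "tests/", ".github/workflows/", ".github/actions/"],
};

// Assumed: exact match for files, substring for keywords, prefix for paths.
function isCritical(file: string, cfg: CriticalConfig): boolean {
  return (
    cfg.files.includes(file) ||
    cfg.keywords.some((keyword) => file.includes(keyword)) ||
    cfg.paths.some((prefix) => file.startsWith(prefix))
  );
}

// Assumed: the rule fires only when the count strictly exceeds the threshold,
// matching the "More than 20" wording in the rule description.
function massCriticalChanges(
  changedFiles: string[],
  cfg: CriticalConfig,
  threshold = 20,
): boolean {
  const count = changedFiles.filter((file) => isCritical(file, cfg)).length;
  return count > threshold;
}
```

Under these assumptions a PR touching exactly 20 critical files would not trigger the rule, while 21 would. If the analyzer instead uses `>=`, the description and the threshold disagree by one.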
diff --git a/.ai-pr-analyzer/modes/pr-risk-analysis/prompt-context.md b/.ai-pr-analyzer/modes/pr-risk-analysis/prompt-context.md
@@ -0,0 +1,48 @@
+You are analyzing MetaMask Mobile, a React Native mobile wallet for Ethereum networks (iOS and Android).
+
+Your focus is **production regression risk** — the likelihood that these changes break existing user-facing functionality, introduce new bugs, or cause unexpected side effects in the shipped app.
+
+Production regression risk is what matters most. Test-only changes (fixtures, mocks, helpers, E2E specs) are worth noting but should not dominate the overall score unless they also impact production code paths.
+
+ARCHITECTURE OVERVIEW:
+
+- Engine (app/core/Engine/) is the central orchestrator managing 70+ controllers via ComposableController
+- Controllers follow MetaMask's BaseController pattern with messenger-based communication
+- Redux for global state, Redux Saga for side effects, redux-persist for persistence
+- React Navigation v5 for navigation
+- Detox for E2E testing, Jest for unit tests
+
+CRITICAL DEPENDENCY CHAINS (changes here cascade widely):
+
+- NetworkController and KeyringController are Phase 2 (foundational) — changes cascade to 40+ controllers
+- TransactionController depends on Network, GasFee, Keyring, and Approval controllers
+- Token controllers (Tokens, TokenBalances, TokenRates, NftController) depend on Network and AssetsContract
+- AccountsController depends on KeyringController
+
+REGRESSION-PRONE PATTERNS:
+
+- Engine.ts initialization order changes → can break app startup
+- Controller messenger configuration changes (allowedActions/allowedEvents) → breaks inter-controller communication
+- BACKGROUND_STATE_CHANGE_EVENT_NAMES modifications → UI state sync failures
+- Direct imports between controllers (vs messenger calls) → circular dependency risk
+- State schema changes without migration → data loss on app update
+- Package version bumps in @metamask/\* controller packages → API surface changes
+
+SIGNALS THAT INCREASE RISK:
+
+- Modifying a function/class used by many consumers (high fan-out)
+- Changing exported API signatures or return types
+- Altering behaviour of shared utilities or hooks
+- Removing or renaming exports without updating all consumers
+- Changing default values or configuration
+- Modifying state shapes that are persisted or serialized
+
+SIGNALS THAT DECREASE RISK:
+
+- Purely additive changes (new files, new exports, new optional params)
+- Documentation, comments, formatting only
+- Test-only changes with no production code modified
+- Isolated component changes with no shared dependencies
+- Adding testIDs for E2E testing
+
+Use the `load_skill` tool with `metamask-core-architecture` for deeper architectural analysis when changes involve Engine, controllers, or messenger patterns.
diff --git a/.ai-pr-analyzer/modes/select-tags/fallback.json b/.ai-pr-analyzer/modes/select-tags/fallback.json
@@ -0,0 +1,47 @@
+{
+  "conservative": {
+    "selected_tags": [
+      "SmokeAccounts",
+      "SmokeConfirmations",
+      "SmokeIdentity",
+      "SmokeNetworkAbstractions",
+      "SmokeNetworkExpansion",
+      "SmokeTrade",
+      "SmokeWalletPlatform",
+      "SmokeCard",
+      "SmokePerps",
+      "SmokeRamps",
+      "SmokeMultiChainAPI",
+      "SmokePredictions",
+      "FlaskBuildTests"
+    ],
+    "risk_level": "high",
+    "confidence": 100,
+    "reasoning": "Fallback: AI analysis did not complete successfully. Running all tests.",
+    "performance_tests": {
+      "selected_tags": [
+        "@PerformanceAccountList",
+        "@PerformanceOnboarding",
+        "@PerformanceLogin",
+        "@PerformanceSwaps",
+        "@PerformanceLaunch",
+        "@PerformanceAssetLoading",
+        "@PerformancePredict",
+        "@PerformancePreps"
+      ],
+      "reasoning": "Fallback: AI analysis did not complete successfully. Running all performance tests."
+    },
+    "label": "risk-high"
+  },
+  "empty": {
+    "selected_tags": [],
+    "risk_level": "low",
+    "confidence": 100,
+    "reasoning": "No files changed - no analysis needed",
+    "performance_tests": {
+      "selected_tags": [],
+      "reasoning": "No files changed - no performance tests needed"
+    },
+    "label": "risk-low"
+  }
+}
diff --git a/.ai-pr-analyzer/modes/select-tags/finalize-schema.json b/.ai-pr-analyzer/modes/select-tags/finalize-schema.json
@@ -0,0 +1,58 @@
+{
+  "type": "object",
+  "additionalProperties": false,
+  "properties": {
+    "selected_tags": {
+      "type": "array",
+      "items": {
+        "type": "string"
+      },
+      "description": "Selected Detox E2E tags to run"
+    },
+    "risk_level": {
+      "type": "string",
+      "enum": ["low", "medium", "high"],
+      "description": "Regression risk level for this PR"
+    },
+    "confidence": {
+      "type": "number",
+      "minimum": 0,
+      "maximum": 100,
+      "description": "Confidence score for this selection"
+    },
+    "reasoning": {
+      "type": "string",
+      "description": "Reasoning for selected tags and risk level"
+    },
+    "performance_tests": {
+      "type": "object",
+      "additionalProperties": false,
+      "properties": {
+        "selected_tags": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          },
+          "description": "Selected performance test tags; empty array means none required"
+        },
+        "reasoning": {
+          "type": "string",
+          "description": "Reasoning for performance test selection"
+        }
+      },
+      "required": ["selected_tags", "reasoning"]
+    },
+    "label": {
+      "type": "string",
+      "description": "Suggested risk label (e.g., risk-low, risk-medium, risk-high)"
+    }
+  },
+  "required": [
+    "selected_tags",
+    "risk_level",
+    "confidence",
+    "reasoning",
+    "performance_tests",
+    "label"
+  ]
+}
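Review note: because every property in the schema above is required and `additionalProperties` is `false` at both levels, a valid result must carry exactly these keys. The sketch below mirrors the schema as a TypeScript type with a hand-written guard, purely for illustration. `SelectTagsResult`, `isSelectTagsResult`, and the helpers are hypothetical names, and a real consumer would more likely use a JSON Schema validator.

```typescript
// Hypothetical mirror of finalize-schema.json; all names are illustrative.
type RiskLevel = "low" | "medium" | "high";

interface SelectTagsResult {
  selected_tags: string[];
  risk_level: RiskLevel;
  confidence: number; // 0..100 inclusive
  reasoning: string;
  performance_tests: { selected_tags: string[]; reasoning: string };
  label: string;
}

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((x) => typeof x === "string");

const isPlainObject = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// All properties are required and no extras are allowed, so the key sets must match.
const hasExactlyKeys = (obj: Record<string, unknown>, allowed: string[]): boolean => {
  const keys = Object.keys(obj);
  return keys.length === allowed.length && allowed.every((k) => keys.includes(k));
};

function isSelectTagsResult(v: unknown): v is SelectTagsResult {
  if (!isPlainObject(v)) return false;
  const topLevelKeys = [
    "selected_tags", "risk_level", "confidence",
    "reasoning", "performance_tests", "label",
  ];
  if (!hasExactlyKeys(v, topLevelKeys)) return false;

  const perf = v.performance_tests;
  if (!isPlainObject(perf)) return false;
  if (!hasExactlyKeys(perf, ["selected_tags", "reasoning"])) return false;

  const confidence = v.confidence;
  const riskLevel = v.risk_level;
  return (
    isStringArray(v.selected_tags) &&
    typeof riskLevel === "string" &&
    ["low", "medium", "high"].includes(riskLevel) &&
    typeof confidence === "number" && confidence >= 0 && confidence <= 100 &&
    typeof v.reasoning === "string" &&
    typeof v.label === "string" &&
    isStringArray(perf.selected_tags) &&
    typeof perf.reasoning === "string"
  );
}
```

Checked this way, both entries in `fallback.json` satisfy the schema, whereas the `result` object in the select-tags `hard-rules.json` omits `confidence` and the top-level `reasoning`. Presumably the analyzer fills those in before emitting output, but that is not visible in this diff.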
diff --git a/.ai-pr-analyzer/modes/select-tags/hard-rules.json b/.ai-pr-analyzer/modes/select-tags/hard-rules.json
@@ -0,0 +1,43 @@
+[
+  {
+    "name": "controller-version-update",
+    "description": "@metamask controller package version updated in package.json",
+    "trigger": {
+      "type": "diffPattern",
+      "file": "package.json",
+      "diffPattern": "@metamask\\/[^\"\\n]*controller[^\"\\n]*\""
+    },
+    "result": {
+      "selected_tags": [
+        "SmokeAccounts",
+        "SmokeConfirmations",
+        "SmokeIdentity",
+        "SmokeNetworkAbstractions",
+        "SmokeNetworkExpansion",
+        "SmokeTrade",
+        "SmokeWalletPlatform",
+        "SmokeCard",
+        "SmokePerps",
+        "SmokeRamps",
+        "SmokeMultiChainAPI",
+        "SmokePredictions",
+        "FlaskBuildTests"
+      ],
+      "risk_level": "high",
+      "label": "risk-high",
+      "performance_tests": {
+        "selected_tags": [
+          "@PerformanceAccountList",
+          "@PerformanceOnboarding",
+          "@PerformanceLogin",
+          "@PerformanceSwaps",
+          "@PerformanceLaunch",
+          "@PerformanceAssetLoading",
+          "@PerformancePredict",
+          "@PerformancePreps"
+        ],
+        "reasoning": "Hard rule: package-level controller updates can broadly impact runtime behavior. Running all performance tests."
+      }
+    }
+  }
+]
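Review note: the `diffPattern` value above is a regex stored inside a JSON string, so it is escaped twice. The sketch below decodes it with `JSON.parse` and tries it on two sample diff lines. The sample lines and version numbers are made up for illustration, and how the analyzer actually applies the pattern (regex flags, added lines only or also context and removed lines) is not shown in this diff.

```typescript
// The JSON string literal exactly as it appears in hard-rules.json.
// String.raw keeps the backslashes intact so JSON.parse sees the original escapes.
const jsonLiteral = String.raw`"@metamask\\/[^\"\\n]*controller[^\"\\n]*\""`;

// After JSON decoding, the regex source is: @metamask\/[^"\n]*controller[^"\n]*"
const patternSource = JSON.parse(jsonLiteral) as string;
const pattern = new RegExp(patternSource);

// Illustrative diff lines; the versions are invented for this example.
const controllerBump = '+    "@metamask/transaction-controller": "^62.0.0",';
const unrelatedBump = '+    "@metamask/utils": "^11.0.0",';

const controllerMatches = pattern.test(controllerBump); // true
const unrelatedMatches = pattern.test(unrelatedBump); // false
```

The pattern keys on the package name up to its closing quote rather than on the version string, so it matches any line in the `package.json` diff that mentions a `@metamask/*controller*` package. It is also case-sensitive as written, so a name containing only `Controller` with a capital C would not match unless the analyzer adds the `i` flag.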
diff --git a/.ai-pr-analyzer/modes/select-tags/mode.yaml b/.ai-pr-analyzer/modes/select-tags/mode.yaml
@@ -0,0 +1,4 @@
+id: select-tags
+description: Select E2E and performance test tags based on regression risk
+finalizeToolName: finalize_select_tags
+outputFile: select-tags.json
diff --git a/.ai-pr-analyzer/modes/select-tags/prompt-context.md b/.ai-pr-analyzer/modes/select-tags/prompt-context.md
@@ -0,0 +1,37 @@
+AVAILABLE E2E TEST TAGS (the only valid E2E tags):
+
+- SmokeAccounts: account security, SRP reveal/export, multi-account management
+- SmokeConfirmations: transaction/signature confirmations, approvals, gas flows
+- SmokeIdentity: profile sync, account/contact sync flows
+- SmokeNetworkAbstractions: network manager, chain permissions, selector flows
+- SmokeNetworkExpansion: Solana and multi-chain provider behavior
+- SmokeTrade: swap/bridge/staking trade flows
+- SmokeWalletPlatform: trending, activity, core wallet platform behavior
+- SmokeCard: card home/add-funds/management integration
+- SmokePerps: perps add-funds and account trading flows
+- SmokeRamps: fiat on-ramp/off-ramp flows
+- SmokeMultiChainAPI: CAIP-25 wallet session APIs and permission updates
+- SmokePredictions: prediction lifecycle and related balances/activities
+- FlaskBuildTests: Snaps functionality and Flask-specific behavior
+
+TAG DEPENDENCY RULES:
+
+- If selecting `SmokeTrade` for swap/bridge flows, also include `SmokeConfirmations`.
+- If selecting `SmokeNetworkExpansion` for Solana signing/transactions, also include `SmokeConfirmations`.
+- For Snaps-related changes, include `FlaskBuildTests`.
+
+AVAILABLE PERFORMANCE TAGS:
+
+- @PerformanceAccountList: account selector/list rendering and dismissal performance
+- @PerformanceOnboarding: wallet setup/onboarding performance
+- @PerformanceLogin: unlock/login/session restoration performance
+- @PerformanceSwaps: swap flow performance
+- @PerformanceLaunch: cold/warm launch startup performance
+- @PerformanceAssetLoading: token/NFT/balance loading performance
+- @PerformancePredict: prediction market loading and interaction performance
+- @PerformancePreps: perps market and order flow performance
+
+PERFORMANCE SELECTION NOTES:
+
+- Use an empty array when no meaningful performance risk is introduced.
+- For broad shared-surface changes or uncertainty, be conservative and include relevant performance tags.
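Review note: the TAG DEPENDENCY RULES in this prompt are conditional on what the change touches (swap/bridge flows, Solana signing), so the model is expected to apply judgment. A deterministic post-processing step could still enforce a conservative version of them. The sketch below is such a simplification: it always adds `SmokeConfirmations` when either parent tag is present. `impliedTags` and `withImpliedTags` are hypothetical names, and nothing in this PR indicates the analyzer does this.

```typescript
// Hypothetical, conservative encoding of the two tag-to-tag rules in the prompt.
// The Snaps rule is omitted because it depends on the change, not on another tag.
const impliedTags: Record<string, string[]> = {
  SmokeTrade: ["SmokeConfirmations"],
  SmokeNetworkExpansion: ["SmokeConfirmations"],
};

// Returns the selection plus any implied tags, without duplicates,
// preserving the original order and appending new tags at the end.
function withImpliedTags(selected: string[]): string[] {
  const result = new Set(selected);
  for (const tag of selected) {
    for (const implied of impliedTags[tag] ?? []) {
      result.add(implied);
    }
  }
  return Array.from(result);
}
```

For example, `withImpliedTags(["SmokeTrade"])` yields `["SmokeTrade", "SmokeConfirmations"]`, while a selection that already contains `SmokeConfirmations` is returned unchanged.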
diff --git a/.ai-pr-analyzer/modes/select-tags/system-prompt.md b/.ai-pr-analyzer/modes/select-tags/system-prompt.md
@@ -0,0 +1,39 @@
+You are an expert in E2E testing for MetaMask Mobile.
+
+GOAL:
+Select only the E2E and performance test tags needed to validate this PR while minimizing regression risk.
+
+{{skills_section}}
+
+{{reasoning_section}}
+
+{{tools_section}}
+
+{{confidence_guidance}}
+
+{{critical_patterns}}
+
+RISK ASSESSMENT:
+
+- Low: minimal regression likelihood, narrow or non-behavioral changes
+- Medium: moderate regression likelihood, behavior changed in bounded areas
+- High: high regression likelihood, shared flows, infra, or risky architectural surfaces changed
+
+GUIDANCE:
+
+- Selecting all E2E tags is valid when uncertainty is high.
+- Selecting no E2E tags is valid when changes are clearly unrelated to app/runtime behavior.
+- Changes only in `wdio/` or `tests/performance/` usually do not require Detox E2E tags unless app code is also changed.
+- Be conservative when the PR touches testing infrastructure, workflows, fixtures, page objects, or broad shared components.
+- `FlaskBuildTests` is for Snaps functionality and Flask-specific behavior.
+- Prefer several independent tool calls in each iteration so you can investigate thoroughly before finalizing.
+- Do not exceed {{max_iterations}} iterations.
+
+PERFORMANCE TEST GUIDANCE:
+Select performance tags when changes can impact:
+
+- rendering and interaction responsiveness
+- startup/login/account/network loading paths
+- state management and data-fetch heavy flows
+- swap/trade/predict/perps critical journeys
+- performance test infrastructure itself
diff --git a/.ai-pr-analyzer/modes/select-tags/task-prompt.md b/.ai-pr-analyzer/modes/select-tags/task-prompt.md
@@ -0,0 +1,21 @@
+Analyze this PR and determine:
+
+1. which Detox E2E tags must run
+2. which performance tags must run
+
+Use only tags from the catalog below.
+
+{{prompt_context}}
+
+{{changed_files}}
+
+PRELOADED CRITICAL DIFF SUMMARY:
+{{change_summary}}
+
+Return your decision by calling `{{finalize_tool_name}}`.
+
+Requirements before finalizing:
+
+- Validate selected tags cover likely impacted user flows and shared dependencies.
+- Use `performance_tests.selected_tags` as an empty array when performance testing is not needed.
+- Keep reasoning specific to regression risk and potential bug introduction.