feat: Add AI PR analysis configuration and workflow #28024
Closed
Changes from 8 commits
f63b199 feat: Add AI PR analysis configuration and workflow (cmd-ob)
48127d8 fix: Update AI PR analysis workflow to use new analyzer action (cmd-ob)
b5b5db9 chore: Update AI PR analysis workflow to use local analyzer action (cmd-ob)
b08ea7e chore: Update AI PR analysis workflow to use remote analyzer action (cmd-ob)
aae1b27 chore: Update AI PR analysis workflow to use local analyzer action (cmd-ob)
ff576c7 chore: Update AI PR analysis workflow to use specific branch for anal… (cmd-ob)
b4f9333 chore: Refactor AI PR analysis workflow to streamline input handling (cmd-ob)
c22e7b0 feat: Add select-tags mode for AI PR analysis (cmd-ob)
014d185 chore: Update AI PR analysis workflow to use specific branch for depe… (cmd-ob)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,32 @@ | ||
| repo: metamask/metamask-mobile | ||
| | ||
| critical: | ||
| files: | ||
| - package.json | ||
| - metro.config.js | ||
| - babel.config.js | ||
| - .detoxrc.js | ||
| keywords: | ||
| - Controller | ||
| - Engine | ||
| paths: | ||
| - app/core/ | ||
| - tests/ | ||
| - .github/workflows/ | ||
| - .github/actions/ | ||
| | ||
| searchDirs: | ||
| - app/ | ||
| - e2e/ | ||
| - .github/ | ||
| - scripts/ | ||
| - tests/ | ||
| | ||
| models: | ||
| default: anthropic/claude-haiku-4-5 | ||
| escalation: anthropic/claude-sonnet-4-6 | ||
| escalationThreshold: 3 | ||
| | ||
| modes: | ||
| - pr-risk-analysis | ||
| - select-tags | ||
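
The `critical` section above lists exact file names, path prefixes, and keywords. A minimal sketch of how a changed path might be classified against it follows. The matching semantics (exact match for `files`, prefix match for `paths`, case-sensitive substring match for `keywords`) and the names `CriticalConfig` and `is_critical` are assumptions for illustration; the diff does not show how the analyzer actually interprets these lists.

```python
from dataclasses import dataclass, field


@dataclass
class CriticalConfig:
    """Hypothetical container mirroring the `critical` block of the config."""
    files: list[str] = field(default_factory=list)
    keywords: list[str] = field(default_factory=list)
    paths: list[str] = field(default_factory=list)


# Values copied from the config shown above.
CRITICAL = CriticalConfig(
    files=["package.json", "metro.config.js", "babel.config.js", ".detoxrc.js"],
    keywords=["Controller", "Engine"],
    paths=["app/core/", "tests/", ".github/workflows/", ".github/actions/"],
)


def is_critical(changed_path: str, cfg: CriticalConfig = CRITICAL) -> bool:
    """Return True if the path matches any critical rule (assumed semantics)."""
    # Assumption: `files` entries are matched exactly against the repo-relative path.
    if changed_path in cfg.files:
        return True
    # Assumption: `paths` entries are directory prefixes.
    if any(changed_path.startswith(prefix) for prefix in cfg.paths):
        return True
    # Assumption: `keywords` are case-sensitive substrings of the path.
    return any(keyword in changed_path for keyword in cfg.keywords)
```

Under these assumptions, `app/core/Engine/Engine.ts` is critical through the `app/core/` prefix, while a controller file outside `app/core/` would still match through the `Controller` keyword.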
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| [ | ||
| { | ||
| "name": "mass-critical-changes", | ||
| "description": "More than 20 critical files changed — automatic high risk", | ||
| "trigger": { | ||
| "type": "criticalFileCount", | ||
| "threshold": 20 | ||
| }, | ||
| "action": "run-all", | ||
| "result": { | ||
| "risk_score": 85, | ||
| "risk_level": "high", | ||
| "summary": "Large-scale change affecting more than 20 critical files. Manual review strongly recommended.", | ||
| "testing_recommendations": [ | ||
| "Full regression test suite recommended", | ||
| "Manual review of all critical file changes", | ||
| "Verify no breaking changes to controller initialization order", | ||
| "Check Engine startup sequence" | ||
| ] | ||
| } | ||
| } | ||
| ] |
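
The hard rule above short-circuits the analysis when the number of critical files crosses a threshold. A rough sketch of that check is below. The function name `apply_hard_rule` is hypothetical, and the strict `>` comparison is an assumption taken from the rule's own description ("More than 20"); the analyzer's real comparison is not visible in this diff.

```python
# Trimmed copy of the rule above; only the fields the sketch needs.
RULE = {
    "name": "mass-critical-changes",
    "trigger": {"type": "criticalFileCount", "threshold": 20},
    "action": "run-all",
    "result": {"risk_score": 85, "risk_level": "high"},
}


def apply_hard_rule(rule: dict, critical_file_count: int):
    """Return the rule's preset result if the trigger fires, else None."""
    trigger = rule["trigger"]
    if trigger["type"] != "criticalFileCount":
        return None  # this sketch only handles the trigger type shown above
    # Assumption: "more than 20" means a strict comparison against the threshold.
    if critical_file_count > trigger["threshold"]:
        return rule["result"]
    return None
```

With this reading, exactly 20 critical files would still go through the normal analysis path, and 21 would return the preset high-risk result.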
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| You are analyzing MetaMask Mobile, a React Native mobile wallet for Ethereum networks (iOS and Android). | ||
| | ||
| Your focus is **production regression risk** — the likelihood that these changes break existing user-facing functionality, introduce new bugs, or cause unexpected side effects in the shipped app. | ||
| | ||
| Production regression risk is what matters most. Test-only changes (fixtures, mocks, helpers, E2E specs) are worth noting but should not dominate the overall score unless they also impact production code paths. | ||
| | ||
| ARCHITECTURE OVERVIEW: | ||
| | ||
| - Engine (app/core/Engine/) is the central orchestrator managing 70+ controllers via ComposableController | ||
| - Controllers follow MetaMask's BaseController pattern with messenger-based communication | ||
| - Redux for global state, Redux Saga for side effects, redux-persist for persistence | ||
| - React Navigation v5 for navigation | ||
| - Detox for E2E testing, Jest for unit tests | ||
| | ||
| CRITICAL DEPENDENCY CHAINS (changes here cascade widely): | ||
| | ||
| - NetworkController and KeyringController are Phase 2 (foundational) — changes cascade to 40+ controllers | ||
| - TransactionController depends on Network, GasFee, Keyring, and Approval controllers | ||
| - Token controllers (Tokens, TokenBalances, TokenRates, NftController) depend on Network and AssetsContract | ||
| - AccountsController depends on KeyringController | ||
| | ||
| REGRESSION-PRONE PATTERNS: | ||
| | ||
| - Engine.ts initialization order changes → can break app startup | ||
| - Controller messenger configuration changes (allowedActions/allowedEvents) → breaks inter-controller communication | ||
| - BACKGROUND_STATE_CHANGE_EVENT_NAMES modifications → UI state sync failures | ||
| - Direct imports between controllers (vs messenger calls) → circular dependency risk | ||
| - State schema changes without migration → data loss on app update | ||
| - Package version bumps in @metamask/\* controller packages → API surface changes | ||
| | ||
| SIGNALS THAT INCREASE RISK: | ||
| | ||
| - Modifying a function/class used by many consumers (high fan-out) | ||
| - Changing exported API signatures or return types | ||
| - Altering behaviour of shared utilities or hooks | ||
| - Removing or renaming exports without updating all consumers | ||
| - Changing default values or configuration | ||
| - Modifying state shapes that are persisted or serialized | ||
| | ||
| SIGNALS THAT DECREASE RISK: | ||
| | ||
| - Purely additive changes (new files, new exports, new optional params) | ||
| - Documentation, comments, formatting only | ||
| - Test-only changes with no production code modified | ||
| - Isolated component changes with no shared dependencies | ||
| - Adding testIDs for E2E testing | ||
| | ||
| Use the `load_skill` tool with `metamask-core-architecture` for deeper architectural analysis when changes involve Engine, controllers, or messenger patterns. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| { | ||
| "conservative": { | ||
| "selected_tags": [ | ||
| "SmokeAccounts", | ||
| "SmokeConfirmations", | ||
| "SmokeIdentity", | ||
| "SmokeNetworkAbstractions", | ||
| "SmokeNetworkExpansion", | ||
| "SmokeTrade", | ||
| "SmokeWalletPlatform", | ||
| "SmokeCard", | ||
| "SmokePerps", | ||
| "SmokeRamps", | ||
| "SmokeMultiChainAPI", | ||
| "SmokePredictions", | ||
| "FlaskBuildTests" | ||
| ], | ||
| "risk_level": "high", | ||
| "confidence": 100, | ||
| "reasoning": "Fallback: AI analysis did not complete successfully. Running all tests.", | ||
| "performance_tests": { | ||
| "selected_tags": [ | ||
| "@PerformanceAccountList", | ||
| "@PerformanceOnboarding", | ||
| "@PerformanceLogin", | ||
| "@PerformanceSwaps", | ||
| "@PerformanceLaunch", | ||
| "@PerformanceAssetLoading", | ||
| "@PerformancePredict", | ||
| "@PerformancePreps" | ||
| ], | ||
| "reasoning": "Fallback: AI analysis did not complete successfully. Running all performance tests." | ||
| }, | ||
| "label": "risk-high" | ||
| }, | ||
| "empty": { | ||
| "selected_tags": [], | ||
| "risk_level": "low", | ||
| "confidence": 100, | ||
| "reasoning": "No files changed - no analysis needed", | ||
| "performance_tests": { | ||
| "selected_tags": [], | ||
| "reasoning": "No files changed - no performance tests needed" | ||
| }, | ||
| "label": "risk-low" | ||
| } | ||
| } |
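
The two fallbacks above cover distinct cases: `empty` when nothing changed, and `conservative` when the AI analysis did not complete. A small sketch of how a caller might choose between them is below. `pick_fallback` and its parameters are hypothetical names, and the tag lists are abbreviated to keep the example short.

```python
# Abbreviated copy of the fallback file above (tag lists shortened for the example).
FALLBACKS = {
    "conservative": {
        "selected_tags": ["SmokeAccounts", "SmokeConfirmations"],
        "risk_level": "high",
        "label": "risk-high",
    },
    "empty": {
        "selected_tags": [],
        "risk_level": "low",
        "label": "risk-low",
    },
}


def pick_fallback(fallbacks: dict, changed_files: list[str], analysis_ok: bool):
    """Return a fallback result, or None when the real analysis result should be used."""
    if not changed_files:
        # Nothing changed, so there is nothing to analyze.
        return fallbacks["empty"]
    if not analysis_ok:
        # Analysis failed: run everything rather than risk skipping a needed suite.
        return fallbacks["conservative"]
    return None
```

Checking for an empty change set first is a design choice in this sketch: it avoids running every suite when an analysis failure coincides with a PR that has no file changes.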
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| { | ||
| "type": "object", | ||
| "additionalProperties": false, | ||
| "properties": { | ||
| "selected_tags": { | ||
| "type": "array", | ||
| "items": { | ||
| "type": "string" | ||
| }, | ||
| "description": "Selected Detox E2E tags to run" | ||
| }, | ||
| "risk_level": { | ||
| "type": "string", | ||
| "enum": ["low", "medium", "high"], | ||
| "description": "Regression risk level for this PR" | ||
| }, | ||
| "confidence": { | ||
| "type": "number", | ||
| "minimum": 0, | ||
| "maximum": 100, | ||
| "description": "Confidence score for this selection" | ||
| }, | ||
| "reasoning": { | ||
| "type": "string", | ||
| "description": "Reasoning for selected tags and risk level" | ||
| }, | ||
| "performance_tests": { | ||
| "type": "object", | ||
| "additionalProperties": false, | ||
| "properties": { | ||
| "selected_tags": { | ||
| "type": "array", | ||
| "items": { | ||
| "type": "string" | ||
| }, | ||
| "description": "Selected performance test tags; empty array means none required" | ||
| }, | ||
| "reasoning": { | ||
| "type": "string", | ||
| "description": "Reasoning for performance test selection" | ||
| } | ||
| }, | ||
| "required": ["selected_tags", "reasoning"] | ||
| }, | ||
| "label": { | ||
| "type": "string", | ||
| "description": "Suggested risk label (e.g., risk-low, risk-medium, risk-high)" | ||
| } | ||
| }, | ||
| "required": [ | ||
| "selected_tags", | ||
| "risk_level", | ||
| "confidence", | ||
| "reasoning", | ||
| "performance_tests", | ||
| "label" | ||
| ] | ||
| } |
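
The schema above requires all six top-level keys, forbids extra properties at both levels, restricts `risk_level` to three values, and bounds `confidence` to 0..100. The hand-rolled check below illustrates those constraints using only the standard library. It is a sketch, not a replacement for a real JSON Schema validator, and `validate_selection` is a hypothetical name.

```python
from numbers import Real

TOP_LEVEL_KEYS = {
    "selected_tags", "risk_level", "confidence",
    "reasoning", "performance_tests", "label",
}
PERF_KEYS = {"selected_tags", "reasoning"}
RISK_LEVELS = ("low", "medium", "high")


def _is_string_list(value) -> bool:
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def _check_keys(obj: dict, allowed: set, where: str) -> list[str]:
    """All allowed keys are required, and nothing else may appear."""
    present = set(obj)
    errors = [f"missing key: {where}{k}" for k in sorted(allowed - present)]
    errors += [f"unexpected key: {where}{k}" for k in sorted(present - allowed)]
    return errors


def validate_selection(doc) -> list[str]:
    """Return a list of violations; an empty list means the document conforms."""
    if not isinstance(doc, dict):
        return ["root must be an object"]
    errors = _check_keys(doc, TOP_LEVEL_KEYS, "")
    if "selected_tags" in doc and not _is_string_list(doc["selected_tags"]):
        errors.append("selected_tags must be an array of strings")
    if "risk_level" in doc and doc["risk_level"] not in RISK_LEVELS:
        errors.append("risk_level must be one of low, medium, high")
    if "confidence" in doc:
        c = doc["confidence"]
        # bool is a subclass of int in Python, but it is not a JSON number.
        if isinstance(c, bool) or not isinstance(c, Real) or not 0 <= c <= 100:
            errors.append("confidence must be a number between 0 and 100")
    for key in ("reasoning", "label"):
        if key in doc and not isinstance(doc[key], str):
            errors.append(f"{key} must be a string")
    perf = doc.get("performance_tests")
    if "performance_tests" in doc:
        if not isinstance(perf, dict):
            errors.append("performance_tests must be an object")
        else:
            errors += _check_keys(perf, PERF_KEYS, "performance_tests.")
            if "selected_tags" in perf and not _is_string_list(perf["selected_tags"]):
                errors.append("performance_tests.selected_tags must be an array of strings")
            if "reasoning" in perf and not isinstance(perf["reasoning"], str):
                errors.append("performance_tests.reasoning must be a string")
    return errors
```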
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,43 @@ | ||
| [ | ||
| { | ||
| "name": "controller-version-update", | ||
| "description": "@metamask controller package version updated in package.json", | ||
| "trigger": { | ||
| "type": "diffPattern", | ||
| "file": "package.json", | ||
| "diffPattern": "@metamask\\/[^\"\\n]*controller[^\"\\n]*\"" | ||
| }, | ||
| "result": { | ||
| "selected_tags": [ | ||
| "SmokeAccounts", | ||
| "SmokeConfirmations", | ||
| "SmokeIdentity", | ||
| "SmokeNetworkAbstractions", | ||
| "SmokeNetworkExpansion", | ||
| "SmokeTrade", | ||
| "SmokeWalletPlatform", | ||
| "SmokeCard", | ||
| "SmokePerps", | ||
| "SmokeRamps", | ||
| "SmokeMultiChainAPI", | ||
| "SmokePredictions", | ||
| "FlaskBuildTests" | ||
| ], | ||
| "risk_level": "high", | ||
| "label": "risk-high", | ||
| "performance_tests": { | ||
| "selected_tags": [ | ||
| "@PerformanceAccountList", | ||
| "@PerformanceOnboarding", | ||
| "@PerformanceLogin", | ||
| "@PerformanceSwaps", | ||
| "@PerformanceLaunch", | ||
| "@PerformanceAssetLoading", | ||
| "@PerformancePredict", | ||
| "@PerformancePreps" | ||
| ], | ||
| "reasoning": "Hard rule: package-level controller updates can broadly impact runtime behavior. Running all performance tests." | ||
| } | ||
| } | ||
| } | ||
| ] |
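
The `diffPattern` above, once JSON-unescaped, is the regular expression `@metamask\/[^"\n]*controller[^"\n]*"`. The sketch below exercises it against sample `package.json` diff lines. Restricting the search to added or removed lines is an assumption, as is the helper name `touches_controller_package`; the package versions in the usage example are made up.

```python
import re

# The JSON string from the rule above, unescaped into a raw Python pattern.
DIFF_PATTERN = re.compile(r'@metamask\/[^"\n]*controller[^"\n]*"')


def touches_controller_package(package_json_diff: str) -> bool:
    """Return True if an added or removed line names an @metamask controller package."""
    for line in package_json_diff.splitlines():
        # Assumption: only changed lines count, not unchanged context lines
        # or the "+++"/"---" file headers of a unified diff.
        is_changed = line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
        if is_changed and DIFF_PATTERN.search(line):
            return True
    return False
```

For example, a diff containing `+    "@metamask/transaction-controller": "^61.0.0",` would trigger the rule, while a bump of `@metamask/utils` would not, because the character class `[^"\n]*` cannot cross the closing quote of the package name. The match is case-sensitive, so only a lowercase `controller` in the package name counts.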
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| id: select-tags | ||
| description: Select E2E and performance test tags based on regression risk | ||
| finalizeToolName: finalize_select_tags | ||
| outputFile: select-tags.json |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| AVAILABLE E2E TEST TAGS (the only valid E2E tags): | ||
| | ||
| - SmokeAccounts: account security, SRP reveal/export, multi-account management | ||
| - SmokeConfirmations: transaction/signature confirmations, approvals, gas flows | ||
| - SmokeIdentity: profile sync, account/contact sync flows | ||
| - SmokeNetworkAbstractions: network manager, chain permissions, selector flows | ||
| - SmokeNetworkExpansion: Solana and multi-chain provider behavior | ||
| - SmokeTrade: swap/bridge/staking trade flows | ||
| - SmokeWalletPlatform: trending, activity, core wallet platform behavior | ||
| - SmokeCard: card home/add-funds/management integration | ||
| - SmokePerps: perps add-funds and account trading flows | ||
| - SmokeRamps: fiat on-ramp/off-ramp flows | ||
| - SmokeMultiChainAPI: CAIP-25 wallet session APIs and permission updates | ||
| - SmokePredictions: prediction lifecycle and related balances/activities | ||
| - FlaskBuildTests: Snaps functionality and Flask-specific behavior | ||
| | ||
| TAG DEPENDENCY RULES: | ||
| | ||
| - If selecting `SmokeTrade` for swap/bridge flows, also include `SmokeConfirmations`. | ||
| - If selecting `SmokeNetworkExpansion` for Solana signing/transactions, also include `SmokeConfirmations`. | ||
| - For Snaps-related changes, include `FlaskBuildTests`. | ||
| | ||
| AVAILABLE PERFORMANCE TAGS: | ||
| | ||
| - @PerformanceAccountList: account selector/list rendering and dismissal performance | ||
| - @PerformanceOnboarding: wallet setup/onboarding performance | ||
| - @PerformanceLogin: unlock/login/session restoration performance | ||
| - @PerformanceSwaps: swap flow performance | ||
| - @PerformanceLaunch: cold/warm launch startup performance | ||
| - @PerformanceAssetLoading: token/NFT/balance loading performance | ||
| - @PerformancePredict: prediction market loading and interaction performance | ||
| - @PerformancePreps: perps market and order flow performance | ||
| | ||
| PERFORMANCE SELECTION NOTES: | ||
| | ||
| - Use an empty array when no meaningful performance risk is introduced. | ||
| - For broad shared-surface changes or uncertainty, be conservative and include relevant performance tags. |
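
The tag dependency rules above are written as guidance for the model, but the first two could also be enforced as a post-processing step. The sketch below shows one way to do that. It simplifies the rules by applying them unconditionally (the prompt scopes them to swap/bridge and Solana signing flows), and `DEPENDENCIES` and `expand_tags` are hypothetical names.

```python
# Simplified encoding of the first two dependency rules above.
# Assumption: the dependency is applied whenever the tag is selected,
# regardless of which flow motivated the selection.
DEPENDENCIES = {
    "SmokeTrade": ["SmokeConfirmations"],
    "SmokeNetworkExpansion": ["SmokeConfirmations"],
}


def expand_tags(selected: list[str]) -> list[str]:
    """Add tags required by the dependency rules, preserving order without duplicates."""
    result = list(dict.fromkeys(selected))  # de-duplicate, keep first occurrence
    for tag in list(result):
        for dependency in DEPENDENCIES.get(tag, []):
            if dependency not in result:
                result.append(dependency)
    return result
```

The third rule (Snaps-related changes imply `FlaskBuildTests`) depends on the content of the change rather than on another tag, so it is left out of this mapping.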
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| You are an expert in E2E testing for MetaMask Mobile. | ||
| | ||
| GOAL: | ||
| Select only the E2E and performance test tags needed to validate this PR while minimizing regression risk. | ||
| | ||
| {{skills_section}} | ||
| | ||
| {{reasoning_section}} | ||
| | ||
| {{tools_section}} | ||
| | ||
| {{confidence_guidance}} | ||
| | ||
| {{critical_patterns}} | ||
| | ||
| RISK ASSESSMENT: | ||
| | ||
| - Low: minimal regression likelihood, narrow or non-behavioral changes | ||
| - Medium: moderate regression likelihood, behavior changed in bounded areas | ||
| - High: high regression likelihood, shared flows, infra, or risky architectural surfaces changed | ||
| | ||
| GUIDANCE: | ||
| | ||
| - Selecting all E2E tags is valid when uncertainty is high. | ||
| - Selecting no E2E tags is valid when changes are clearly unrelated to app/runtime behavior. | ||
| - Changes only in `wdio/` or `tests/performance/` usually do not require Detox E2E tags unless app code is also changed. | ||
| - Be conservative when PR touches testing infrastructure, workflows, fixtures, page objects, or broad shared components. | ||
| - `FlaskBuildTests` is for Snaps functionality and Flask-specific behavior. | ||
| - Prefer several independent tool calls in each iteration so you can investigate thoroughly before finalizing. | ||
| - Do not exceed {{max_iterations}} iterations. | ||
| | ||
| PERFORMANCE TEST GUIDANCE: | ||
| Select performance tags when changes can impact: | ||
| | ||
| - rendering and interaction responsiveness | ||
| - startup/login/account/network loading paths | ||
| - state management and data-fetch heavy flows | ||
| - swap/trade/predict/perps critical journeys | ||
| - performance test infrastructure itself |
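
The system prompt above relies on `{{name}}` placeholders such as `{{skills_section}}` and `{{max_iterations}}`. A minimal sketch of how such a template could be rendered is below. The `render` function and its fail-on-missing behavior are assumptions; the analyzer action's real templating code is not part of this diff.

```python
import re

# Matches {{name}} where name is letters, digits, or underscores.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, values: dict[str, str]) -> str:
    """Replace every {{name}} placeholder, raising if a value is missing."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            # Assumption: a missing value is an error rather than an empty string,
            # so a half-rendered prompt never reaches the model.
            raise KeyError(f"no value for placeholder {key!r}")
        return values[key]

    return PLACEHOLDER.sub(substitute, template)
```

Passing a function to `re.sub` means the replacement text is inserted literally, so values containing backslashes are not reinterpreted as escape sequences.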
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| Analyze this PR and determine: | ||
| | ||
| 1. which Detox E2E tags must run | ||
| 2. which performance tags must run | ||
| | ||
| Use only tags from the catalog below. | ||
| | ||
| {{prompt_context}} | ||
| | ||
| {{changed_files}} | ||
| | ||
| PRELOADED CRITICAL DIFF SUMMARY: | ||
| {{change_summary}} | ||
| | ||
| Return your decision by calling `{{finalize_tool_name}}`. | ||
| | ||
| Requirements before finalizing: | ||
| | ||
| - Validate selected tags cover likely impacted user flows and shared dependencies. | ||
| - Use `performance_tests.selected_tags` as an empty array when performance testing is not needed. | ||
| - Keep reasoning specific to regression risk and potential bug introduction. |
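
The prompt above tells the model to use only tags from the catalog. Since a model can still return a tag that does not exist, a defensive check on the output may be useful. The sketch below separates known from unknown tags. `split_known_tags` is a hypothetical helper, the catalog copy is limited to the performance tags listed in this PR, and `@PerformanceScrolling` in the usage note is an invented tag used only to show the rejection path.

```python
# Performance tags as spelled in the catalog file in this PR.
PERFORMANCE_TAGS = {
    "@PerformanceAccountList",
    "@PerformanceOnboarding",
    "@PerformanceLogin",
    "@PerformanceSwaps",
    "@PerformanceLaunch",
    "@PerformanceAssetLoading",
    "@PerformancePredict",
    "@PerformancePreps",
}


def split_known_tags(selected: list[str], catalog: set[str]) -> tuple[list[str], list[str]]:
    """Return (known, unknown) tags, each in the order the model produced them."""
    known = [tag for tag in selected if tag in catalog]
    unknown = [tag for tag in selected if tag not in catalog]
    return known, unknown
```

For instance, `split_known_tags(["@PerformanceLogin", "@PerformanceScrolling"], PERFORMANCE_TAGS)` keeps the first tag and reports the second as unknown. The comparison is exact, so a tag must match the catalog spelling character for character.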