Integration test Semantic Tool filtering policy#14113
Conversation
…through the gateway to a WireMock-backed AI service. Verify embeddings requests, semantic tool filtering, tool-pruning behavior, and cache reuse across repeated invocations. Also add the supporting test resources and mock provider/API setup required for the guardrail test flow.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdded guardrail test fixtures, a comprehensive guardrail deployment TOML, and a new end-to-end integration test that validates Semantic Tool Filtering against a WireMock-backed Mistral mock AI; plus a trivial CI workflow newline change. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Gateway as APIM Gateway
participant Embed as Embeddings API (WireMock)
participant Chat as Chat Completions (WireMock)
Client->>Gateway: POST /v1/chat/completions (with all tools)
Gateway->>Chat: Forward chat request (all tools)
Chat-->>Gateway: Mock chat response
Gateway-->>Client: Return response
Note over Gateway: SemanticToolFiltering policy attached
Client->>Gateway: POST /v1/chat/completions (with all tools)
Gateway->>Embed: Request embeddings for tool texts
Embed-->>Gateway: Return embeddings
Gateway->>Gateway: Compute similarities → select top N tools
Gateway->>Chat: Forward chat request (filtered tools)
Chat-->>Gateway: Mock chat response
Gateway-->>Client: Return response
Note over Gateway: Subsequent Request (Cache Hit)
Client->>Gateway: POST /v1/chat/completions (with all tools)
Gateway->>Gateway: Use cached embeddings (no external call)
Gateway->>Chat: Forward chat request (filtered tools)
Chat-->>Gateway: Mock chat response
Gateway-->>Client: Return response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (7)
all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/testng.xml (1)
274-274: Consider isolatingApplicationBlockSubscriptionTestCasefrom the shared store suite for stability.
ApplicationBlockSubscriptionTestCaseinitializes fixed resources (test-user,test-app,BlockAPITest) in@BeforeClass(all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/application/ApplicationBlockSubscriptionTestCase.java), so placing it inside the broaderapim-store-testsblock increases state-collision risk. A dedicated<test>block or unique resource naming would be safer.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/testng.xml` at line 274, The ApplicationBlockSubscriptionTestCase is sharing stateful fixed resources (test-user, test-app, BlockAPITest) and should be isolated; update testng.xml to move the <class name="org.wso2.am.integration.tests.application.ApplicationBlockSubscriptionTestCase"/> out of the shared apim-store-tests suite into its own <test> block, or modify the test class (org.wso2.am.integration.tests.application.ApplicationBlockSubscriptionTestCase) to use unique, randomized resource names in its `@BeforeClass` setup (or teardown) to avoid collisions—locate the `@BeforeClass` setup in ApplicationBlockSubscriptionTestCase and either change resource names to include a unique suffix or create a separate testng <test> entry for that class.all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/mockrequestbody.json (1)
1-82: Avoid maintaining two identical guardrail payload fixtures.This payload is duplicated in
all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/mistral-payload.json. Keeping a single source of truth will reduce drift in future assertions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/mockrequestbody.json` around lines 1 - 82, The two JSON fixtures (mockrequestbody.json and mistral-payload.json) are duplicates; remove the redundant file and make tests reference a single canonical fixture instead. Pick one fixture to keep (e.g., mistral-payload.json), delete the other (mockrequestbody.json), and update any test/resource consumers that import or load mockrequestbody.json to load the retained file instead; ensure any build/test fixtures or CI references are adjusted to the single file to avoid drift.all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/GuardrailTestCase.java (5)
469-472: Remove commented-out debug code.Commented-out sleep statements (lines 470-471) should be removed to keep the codebase clean.
Suggested fix
String gatewayUrl = getAPIInvocationURLHttp(API_CONTEXT, API_VERSION) + CHAT_RESOURCE; HttpResponse response = invokeGatewayWithRetryOnNotFound(gatewayUrl, headers, requestPayload, 12, 5000L); - // Add a wait - // Thread.sleep(5000); assertEquals(response.getResponseCode(), HttpStatus.SC_OK,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/GuardrailTestCase.java` around lines 469 - 472, Remove the commented-out debug sleep lines in the GuardrailTestCase test method: delete the two commented lines "// Add a wait" and "// Thread.sleep(5000);" that appear immediately after the call to invokeGatewayWithRetryOnNotFound(gatewayUrl, headers, requestPayload, 12, 5000L) in the org.wso2.am.integration.tests.guardrail.GuardrailTestCase class so the test contains only the real invocation and the subsequent assertion assertEquals(response.getResponseCode(), HttpStatus.SC_OK, ...).
529-533: Remove commented-out debug code.Similar to the previous test, remove the commented-out sleep statements.
Suggested fix
String gatewayUrl = getAPIInvocationURLHttp(API_CONTEXT, API_VERSION) + CHAT_RESOURCE; HttpResponse response = invokeGatewayWithRetryOnNotFound(gatewayUrl, headers, requestPayload, 12, 5000L); - // Add a wait - // Thread.sleep(5000); assertEquals(response.getResponseCode(), HttpStatus.SC_OK,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/GuardrailTestCase.java` around lines 529 - 533, Remove the leftover commented-out debug sleep: delete the "// Add a wait" comment and the commented "Thread.sleep(5000);" line near the invocation of invokeGatewayWithRetryOnNotFound where gatewayUrl is built (usage of getAPIInvocationURLHttp(API_CONTEXT, API_VERSION) + CHAT_RESOURCE and invokeGatewayWithRetryOnNotFound). Ensure no other commented sleep lines remain in GuardrailTestCase.java around that assertion block so the test contains only active code and assertions.
478-484: UseassertFalseinstead of double-negativeassertTrue(!...).The assertion at line 482 uses a double-negative pattern which reduces readability.
Suggested fix
- assertTrue(!embeddingRequests.isEmpty(), + assertFalse(embeddingRequests.isEmpty(), "Expected at least one embeddings request when Semantic Tool Filtering policy is applied");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/GuardrailTestCase.java` around lines 478 - 484, The assertion uses a double-negative; replace the assertTrue(!embeddingRequests.isEmpty(), ...) with assertFalse(embeddingRequests.isEmpty(), "Expected at least one embeddings request when Semantic Tool Filtering policy is applied") to improve readability. Locate the test where embeddingRequests is obtained via wireMockServer.findAll(postRequestedFor(urlEqualTo(BACKEND_PATH + EMBEDDINGS_RESOURCE))) and where logAllMockServerRequests() is called, and update the assertion there (method/variable names: embeddingRequests, logAllMockServerRequests, BACKEND_PATH, EMBEDDINGS_RESOURCE).
437-450: Remove hardcoded sleep and debug logging remnants.
The
Thread.sleep(10000)is a test smell - it makes tests slow and potentially flaky. Consider using a polling/retry mechanism withwaitForAPIDeploymentSync(already called at line 409) or similar utility.The log statements with
###===###prefix appear to be debug artifacts that should be removed or converted to uselog.debug()level.Line 399 has commented-out sleep code that should be removed.
Suggested cleanup
- // Thread.sleep(5000); assertTrue(operationPolicyAttached, "Unable to find POST " + CHAT_RESOURCE + " operation to attach Semantic Tool Filtering policy");- Thread.sleep(10000); - // Retrieve and log the complete API definition to verify policy parameters - HttpResponse finalAPIResponse = restAPIPublisher.getAPI(aiApiId); - assertEquals(finalAPIResponse.getResponseCode(), HttpStatus.SC_OK, - "Failed to retrieve final AI API definition for logging"); - - APIDTO finalApiDto = new Gson().fromJson(finalAPIResponse.getData(), APIDTO.class); - log.info("###===### ===== COMPLETE API DEFINITION AFTER POLICY ATTACHMENT ===== ###"); - log.info("###===### API ID: " + finalApiDto.getId()); - log.info("###===### API Name: " + finalApiDto.getName()); - log.info("###===### API Version: " + finalApiDto.getVersion()); + // If additional sync time is needed, use a polling utility instead of fixed sleep🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/GuardrailTestCase.java` around lines 437 - 450, Remove the hardcoded Thread.sleep(10000) and the debug log.info lines prefixed with "###===###" and the commented-out sleep; instead use the existing deployment wait utility (e.g., waitForAPIDeploymentSync or a polling/retry loop) to wait for the API to be ready before calling restAPIPublisher.getAPI(aiApiId), and either delete the APIDTO debug logging or demote those log.info calls to log.debug so only meaningful assertions (like the finalAPIResponse status/assertEquals) remain. Ensure no leftover commented sleep remains.
880-896: Remove unused methodgetEmbeddingResponseForText().This method is never called. The WireMock stubs in
startMockBackend()(lines 603-634) handle embedding response matching and selection directly using request matchers, making this method redundant.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/GuardrailTestCase.java` around lines 880 - 896, Remove the unused helper method getEmbeddingResponseForText() from the GuardrailTestCase class: it is never invoked and duplicate logic is already handled by the WireMock stubs in startMockBackend() which use request matchers and mockEmbeddingEntries; delete the private method getEmbeddingResponseForText(String inputText) and any related unused references to buildEmbeddingOnlyResponse() or buildMistralEmptyEmbeddingResponse() only if they become unused as a result.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/GuardrailTestCase.java`:
- Around line 544-546: Update the assertion message in GuardrailTestCase's
assertTrue call so the typo "chache" is corrected to "cache": locate the
assertTrue(...) that checks embeddingRequests.size() and change the message
string to read "Expected no embeddings requests when cache applied. Only user
query should trigger embedding request. Embedding requests: " +
embeddingRequests.size() so the log uses the correct word "cache".
- Around line 782-801: Update the mismatched Javadoc for the method
invokeGatewayWithRetryOnNotFound to describe its actual behavior (invokes the
gateway URL with headers and payload, retries up to maxAttempts when receiving
HTTP 404 NOT_FOUND with retryIntervalMillis between attempts, returns
HttpResponse, and propagates exceptions), and correct the `@param` and `@throws`
tags to match parameters gatewayUrl, headers, payload, maxAttempts,
retryIntervalMillis and the thrown Exception; replace the current "Writes
content to a temporary file" comment with this accurate description so the
Javadoc aligns with the invokeGatewayWithRetryOnNotFound implementation.
- Around line 867-878: The method logAllMockServerRequests() computes
formattedTime and increments index but never logs anything; update it to emit a
clear log per ServeEvent (e.g., using the test class logger or System.out)
including index, request.getMethod().getName(), request.getUrl(), formattedTime,
request.getBodyAsString() and any headers you care about; ensure you use
wireMockServer.getAllServeEvents() and the ServeEvent -> LoggedRequest objects,
remove the unused index increment if you prefer zero-based counting, and keep
null-safe handling for request.getLoggedDate() (already done) so each iteration
produces a readable log entry instead of being a no-op.
- Around line 803-865: The private helper method
assertSemanticToolFilteringMediatorDeployedToGateway() is unused dead code;
either call it from the test that attaches the policy
(testAttachSemanticToolFilteringPolicyToAiApi()) after the policy attachment and
deployment steps to verify the mediator on the gateway, or delete the method
entirely; if integrating, invoke
assertSemanticToolFilteringMediatorDeployedToGateway() at the end of
testAttachSemanticToolFilteringPolicyToAiApi() and ensure any required fields
(gatewayContextMgt, keyManagerContext, superTenantKeyManagerContext,
SEMANTIC_TOOL_FILTERING_POLICY_CLASS, API_CONTEXT, API_VERSION) are initialized
in that test’s setup so the helper can run correctly.
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/testng.txt`:
- Around line 25-27: Update the testng suite parameter so the
apim-integration-tests-api-common suite uses the same group as the Guardrail
tests; change the <parameter name="group" value="group1"/> in the test entry for
test name "apim-integration-tests-api-common" to value="group4" so it aligns
with org.wso2.am.integration.tests.guardrail.GuardrailTestCase and avoids
divergent execution paths.
---
Nitpick comments:
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/GuardrailTestCase.java`:
- Around line 469-472: Remove the commented-out debug sleep lines in the
GuardrailTestCase test method: delete the two commented lines "// Add a wait"
and "// Thread.sleep(5000);" that appear immediately after the call to
invokeGatewayWithRetryOnNotFound(gatewayUrl, headers, requestPayload, 12, 5000L)
in the org.wso2.am.integration.tests.guardrail.GuardrailTestCase class so the
test contains only the real invocation and the subsequent assertion
assertEquals(response.getResponseCode(), HttpStatus.SC_OK, ...).
- Around line 529-533: Remove the leftover commented-out debug sleep: delete the
"// Add a wait" comment and the commented "Thread.sleep(5000);" line near the
invocation of invokeGatewayWithRetryOnNotFound where gatewayUrl is built (usage
of getAPIInvocationURLHttp(API_CONTEXT, API_VERSION) + CHAT_RESOURCE and
invokeGatewayWithRetryOnNotFound). Ensure no other commented sleep lines remain
in GuardrailTestCase.java around that assertion block so the test contains only
active code and assertions.
- Around line 478-484: The assertion uses a double-negative; replace the
assertTrue(!embeddingRequests.isEmpty(), ...) with
assertFalse(embeddingRequests.isEmpty(), "Expected at least one embeddings
request when Semantic Tool Filtering policy is applied") to improve readability.
Locate the test where embeddingRequests is obtained via
wireMockServer.findAll(postRequestedFor(urlEqualTo(BACKEND_PATH +
EMBEDDINGS_RESOURCE))) and where logAllMockServerRequests() is called, and
update the assertion there (method/variable names: embeddingRequests,
logAllMockServerRequests, BACKEND_PATH, EMBEDDINGS_RESOURCE).
- Around line 437-450: Remove the hardcoded Thread.sleep(10000) and the debug
log.info lines prefixed with "###===###" and the commented-out sleep; instead
use the existing deployment wait utility (e.g., waitForAPIDeploymentSync or a
polling/retry loop) to wait for the API to be ready before calling
restAPIPublisher.getAPI(aiApiId), and either delete the APIDTO debug logging or
demote those log.info calls to log.debug so only meaningful assertions (like the
finalAPIResponse status/assertEquals) remain. Ensure no leftover commented sleep
remains.
- Around line 880-896: Remove the unused helper method
getEmbeddingResponseForText() from the GuardrailTestCase class: it is never
invoked and duplicate logic is already handled by the WireMock stubs in
startMockBackend() which use request matchers and mockEmbeddingEntries; delete
the private method getEmbeddingResponseForText(String inputText) and any related
unused references to buildEmbeddingOnlyResponse() or
buildMistralEmptyEmbeddingResponse() only if they become unused as a result.
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/mockrequestbody.json`:
- Around line 1-82: The two JSON fixtures (mockrequestbody.json and
mistral-payload.json) are duplicates; remove the redundant file and make tests
reference a single canonical fixture instead. Pick one fixture to keep (e.g.,
mistral-payload.json), delete the other (mockrequestbody.json), and update any
test/resource consumers that import or load mockrequestbody.json to load the
retained file instead; ensure any build/test fixtures or CI references are
adjusted to the single file to avoid drift.
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/testng.xml`:
- Line 274: The ApplicationBlockSubscriptionTestCase is sharing stateful fixed
resources (test-user, test-app, BlockAPITest) and should be isolated; update
testng.xml to move the <class
name="org.wso2.am.integration.tests.application.ApplicationBlockSubscriptionTestCase"/>
out of the shared apim-store-tests suite into its own <test> block, or modify
the test class
(org.wso2.am.integration.tests.application.ApplicationBlockSubscriptionTestCase)
to use unique, randomized resource names in its `@BeforeClass` setup (or teardown)
to avoid collisions—locate the `@BeforeClass` setup in
ApplicationBlockSubscriptionTestCase and either change resource names to include
a unique suffix or create a separate testng <test> entry for that class.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7f3da95f-dd21-48eb-9297-79759b0ccfb1
📒 Files selected for processing (11)
.github/workflows/maven.ymlall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/GuardrailTestCase.javaall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/testng.txtall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/ai-service-provider-config-no-auth.jsonall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/deployment.tomlall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/mistral-def.jsonall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/mistral-payload.jsonall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/mistral-response.jsonall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/mockembeddings.jsonall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/guardrail/mockrequestbody.jsonall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/testng.xml
… testng.xml adding new group for guardrail tests
6bb6d37 to
ddb6edf
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java`:
- Line 151: The test registers PROVIDER_CONFIG_FILE
("ai-service-provider-config-no-auth.json") but the embeddings stubs require an
Authorization: Bearer header, causing successful unauthenticated calls to miss
the 200 stub and hit the 400 fallback; update the embeddings HTTP stubs used in
SemanticToolFilteringTestCase (and the other stub blocks around the other
occurrences) to allow requests without an Authorization header (or explicitly
match an empty/absent Authorization) so that unauthenticated requests from the
no-auth provider configuration return the expected 200 responses, or
alternatively switch the test to use an auth-enabled provider config and token
if you prefer; ensure adjustments are made to the stub matchers referenced in
the test (the embeddings stubs) so they align with PROVIDER_CONFIG_FILE's
no-auth behavior.
- Line 121: Replace the hard-coded MOCK_BACKEND_PORT constant with a dynamic
WireMock port: remove or stop using private static final int MOCK_BACKEND_PORT =
18080 and instead start your WireMockServer with a dynamic port (e.g., new
WireMockServer(WireMockConfiguration.options().dynamicPort())), call
wireMockServer.start(), then assign mockPort = wireMockServer.port() (or
retrieve the dynamic port where mockPort is used) so all downstream URLs
continue to be built from mockPort; update the same change where
MOCK_BACKEND_PORT is referenced elsewhere (around the other occurrences
~588-592) in SemanticToolFilteringTestCase.
- Around line 593-601: The WireMock stub for wireMockServer.stubFor that mocks
POST to BACKEND_PATH + CHAT_RESOURCE currently requires REQUESTED_TOOL_TEXT in
the request body, which breaks tests when ranking prunes that tool; remove the
strict withRequestBody(containing(REQUESTED_TOOL_TEXT)) constraint (or replace
it with a more permissive/optional matcher) so the stub only asserts the
required QUERY snippet (REQUESTED_QUERY_TEXT_SNIPPET) and other stable structure
checks, leaving tool presence optional while still returning mockResponse.
- Around line 539-546: The current assertion only checks POST count and can
false-pass for batched embeddings; modify the test in
SemanticToolFilteringTestCase to inspect the actual embedding request payloads
(use the existing embeddingRequests =
wireMockServer.findAll(postRequestedFor(...)) and iterate over each
LoggedRequest.getBodyAsString()) and assert that none of the requests contain
tool embedding inputs (or tool identifiers) while exactly one request contains
the user query; i.e., replace the size-based assert with payload-based checks
that verify tool texts/IDs are absent from embedding request bodies and only the
user query triggered an embeddings call, and keep logging via
logAllMockServerRequests() for diagnostics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1caec53b-f133-4b61-b40a-f9bdd25a51ec
📒 Files selected for processing (2)
all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.javaall-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/resources/testng.xml
…and remove obsolete verification method
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java (2)
811-821:⚠️ Potential issue | 🟡 Minor
logAllMockServerRequests()is still a no-op.Lines 811-821 format timestamps and increment
index, but nothing is logged or persisted. The calls at Lines 322, 481, and 543 therefore add no diagnostics when this test fails.Based on learnings: In
SemanticToolFilteringTestCase.java, the methodlogAllMockServerRequests()is intended to store/record WireMock serve events rather than act as a no-op.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java` around lines 811 - 821, The method logAllMockServerRequests() currently formats timestamps and increments index but does not record anything; update it to iterate wireMockServer.getAllServeEvents() and persist each ServeEvent by writing a clear, timestamped entry (using the existing formattedTime and index) to the test diagnostics (e.g., test logger or a test artifacts file) including request.getMethod().getName(), request.getUrl(), request.getBodyAsString(), request.getHeaders() and serveEvent.getResponse().getStatus() so the data from serveEvents/request/formattedTime/index is actually logged or saved for later inspection; ensure you use the test logger instance (or a dedicated file writer) and handle nulls for request.getLoggedDate() and empty bodies.
594-601:⚠️ Potential issue | 🟠 MajorDon’t make the shared chat stub depend on one specific tool surviving filtering.
Lines 594-598 back the baseline and filtered invocations with the same stub. Requiring
REQUESTED_TOOL_TEXTmeans a legitimate ranking change that prunes that tool turns the filtered-path request into a WireMock miss/404 instead of letting the test validate the filtered payload explicitly.🧪 Proposed fix
wireMockServer.stubFor(WireMock.post(urlEqualTo(BACKEND_PATH + CHAT_RESOURCE)) .withRequestBody(matchingJsonPath("$.contents[0].parts[0].text")) .withRequestBody(containing(REQUESTED_QUERY_TEXT_SNIPPET)) - .withRequestBody(containing(REQUESTED_TOOL_TEXT)) .willReturn(aResponse() .withStatus(200) .withHeader("Content-Type", "application/json") .withBody(mockResponse)));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java` around lines 594 - 601, The shared WireMock stub for chat requests is too strict: it requires REQUESTED_TOOL_TEXT so if filtering removes that tool the filtered-path request becomes a miss; update the stub(s) used by SemanticToolFilteringTestCase so the baseline and filtered invocations both match reliably — either remove the .withRequestBody(containing(REQUESTED_TOOL_TEXT)) constraint from the existing wireMockServer.stubFor(...) that matches BACKEND_PATH + CHAT_RESOURCE (leaving matchingJsonPath("$.contents[0].parts[0].text") and containing(REQUESTED_QUERY_TEXT_SNIPPET)), or register two separate stubs for the baseline and filtered flows (one that includes REQUESTED_TOOL_TEXT for the baseline and one that does not for the filtered case), ensuring both return mockResponse; locate references to wireMockServer.stubFor, BACKEND_PATH, CHAT_RESOURCE, REQUESTED_QUERY_TEXT_SNIPPET, REQUESTED_TOOL_TEXT, and mockResponse to apply the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java`:
- Around line 297-301: The direct single gateway call using
HTTPSClientUtils.doPost(gatewayUrl, headers, requestPayload) can fail
transiently after publish; replace it with the existing retry helper
invokeGatewayWithRetryOnNotFound(...) to retry on 404s—construct the same
gatewayUrl via getAPIInvocationURLHttp(API_CONTEXT, API_VERSION) + CHAT_RESOURCE
and pass the same headers and requestPayload into
invokeGatewayWithRetryOnNotFound so the initial invocation uses the retry logic
instead of a one-off doPost.
---
Duplicate comments:
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java`:
- Around line 811-821: The method logAllMockServerRequests() currently formats
timestamps and increments index but does not record anything; update it to
iterate wireMockServer.getAllServeEvents() and persist each ServeEvent by
writing a clear, timestamped entry (using the existing formattedTime and index)
to the test diagnostics (e.g., test logger or a test artifacts file) including
request.getMethod().getName(), request.getUrl(), request.getBodyAsString(),
request.getHeaders() and serveEvent.getResponse().getStatus() so the data from
serveEvents/request/formattedTime/index is actually logged or saved for later
inspection; ensure you use the test logger instance (or a dedicated file writer)
and handle nulls for request.getLoggedDate() and empty bodies.
- Around line 594-601: The shared WireMock stub for chat requests is too strict:
it requires REQUESTED_TOOL_TEXT so if filtering removes that tool the
filtered-path request becomes a miss; update the stub(s) used by
SemanticToolFilteringTestCase so the baseline and filtered invocations both
match reliably — either remove the
.withRequestBody(containing(REQUESTED_TOOL_TEXT)) constraint from the existing
wireMockServer.stubFor(...) that matches BACKEND_PATH + CHAT_RESOURCE (leaving
matchingJsonPath("$.contents[0].parts[0].text") and
containing(REQUESTED_QUERY_TEXT_SNIPPET)), or register two separate stubs for
the baseline and filtered flows (one that includes REQUESTED_TOOL_TEXT for the
baseline and one that does not for the filtered case), ensuring both return
mockResponse; locate references to wireMockServer.stubFor, BACKEND_PATH,
CHAT_RESOURCE, REQUESTED_QUERY_TEXT_SNIPPET, REQUESTED_TOOL_TEXT, and
mockResponse to apply the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 489901b0-3798-47a6-bb35-deed88cd00ba
📒 Files selected for processing (1)
all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #14113 +/- ##
============================================
+ Coverage 11.37% 19.05% +7.68%
- Complexity 898 1419 +521
============================================
Files 361 361
Lines 17718 17718
Branches 1896 1896
============================================
+ Hits 2015 3377 +1362
+ Misses 15671 14301 -1370
- Partials 32 40 +8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…oper policy attachment verification
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java (2)
297-301:⚠️ Potential issue | 🟠 MajorUse the existing retry helper for the first gateway invocation too.
Line 301 still does a one-shot POST immediately after publish. The same class already needed
invokeGatewayWithRetryOnNotFound(...)for deployment lag later, so the baseline step can fail in the same transient 404 window.Suggested change
- HttpResponse response = HTTPSClientUtils.doPost(gatewayUrl, headers, requestPayload); + HttpResponse response = invokeGatewayWithRetryOnNotFound( + gatewayUrl, headers, requestPayload, 12, 5000L);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java` around lines 297 - 301, The initial gateway invocation currently does a single POST via HTTPSClientUtils.doPost(gatewayUrl, headers, requestPayload) which can fail due to deployment lag; replace that one-shot call with the existing retry helper invokeGatewayWithRetryOnNotFound(...) used later in this class so the baseline invocation uses the same retry-on-404 behavior (keep using gatewayUrl built from getAPIInvocationURLHttp(API_CONTEXT, API_VERSION) + CHAT_RESOURCE and the same headers/requestPayload when calling invokeGatewayWithRetryOnNotFound).
811-822:⚠️ Potential issue | 🟡 Minor
logAllMockServerRequests()is still a no-op.The helper computes timestamps and discards them, so these call sites record nothing when diagnostics are needed most. Either persist the serve events as intended or remove the helper to avoid a misleading hook. Based on learnings:
logAllMockServerRequests()is intended to store/record WireMock serve events rather than emit debug logs.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java` around lines 811 - 822, logAllMockServerRequests() currently iterates ServeEvent objects from wireMockServer.getAllServeEvents() but does nothing with them; implement it to persist or emit the events instead of being a no-op. Collect each ServeEvent/LoggedRequest (use ServeEvent and request.getLoggedDate()/request.getUrl()/request.getMethod()/request.getBodyAsString() as needed), format a readable entry with the timestamp and index, and either append to a class-level List<ServeEvent> (e.g., recordedServeEvents) for later assertions or write them to the test logger (e.g., logger.info(...)) so diagnostics are available; if you prefer not to record at all, remove the logAllMockServerRequests() method and its call sites. Ensure the method name logAllMockServerRequests(), the call to wireMockServer.getAllServeEvents(), and the ServeEvent/LoggedRequest usages are updated accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java`:
- Around line 437-447: Replace the fixed Thread.sleep(20000) in
SemanticToolFilteringTestCase with a bounded polling readiness check: repeatedly
call restAPIPublisher.getAPI(aiApiId) (or another API/state-check method) at a
short interval (e.g., 500-1000ms) until the response shows the policy has been
applied (inspect finalAPIResponse.getResponseCode() == HttpStatus.SC_OK and/or
parse APIDTO from finalAPIResponse.getData() to validate the expected policy
fields) or a configurable timeout (e.g., 30s) is reached; after the loop assert
success and log the final APIDTO as before (use aiApiId,
restAPIPublisher.getAPI, finalAPIResponse, APIDTO) so slow CI agents won’t cause
flaky tests.
---
Duplicate comments:
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java`:
- Around line 297-301: The initial gateway invocation currently does a single
POST via HTTPSClientUtils.doPost(gatewayUrl, headers, requestPayload) which can
fail due to deployment lag; replace that one-shot call with the existing retry
helper invokeGatewayWithRetryOnNotFound(...) used later in this class so the
baseline invocation uses the same retry-on-404 behavior (keep using gatewayUrl
built from getAPIInvocationURLHttp(API_CONTEXT, API_VERSION) + CHAT_RESOURCE and
the same headers/requestPayload when calling invokeGatewayWithRetryOnNotFound).
- Around line 811-822: logAllMockServerRequests() currently iterates ServeEvent
objects from wireMockServer.getAllServeEvents() but does nothing with them;
implement it to persist or emit the events instead of being a no-op. Collect
each ServeEvent/LoggedRequest (use ServeEvent and
request.getLoggedDate()/request.getUrl()/request.getMethod()/request.getBodyAsString()
as needed), format a readable entry with the timestamp and index, and either
append to a class-level List<ServeEvent> (e.g., recordedServeEvents) for later
assertions or write them to the test logger (e.g., logger.info(...)) so
diagnostics are available; if you prefer not to record at all, remove the
logAllMockServerRequests() method and its call sites. Ensure the method name
logAllMockServerRequests(), the call to wireMockServer.getAllServeEvents(), and
the ServeEvent/LoggedRequest usages are updated accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5f0629c8-64a5-45ac-8927-579a6398e8b8
📒 Files selected for processing (1)
all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java
…handling and introduce observation tracking for semantic filtering application
There was a problem hiding this comment.
♻️ Duplicate comments (3)
all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java (3)
314-314:⚠️ Potential issue | 🟠 MajorUse the retry helper for the first gateway invocation.
Line 314 still does a one-shot
doPost(...)immediately after publish. If the gateway is still syncing the deployment, this baseline call can fail before the rest of the retry-aware checks even run.🔁 Proposed fix
- HttpResponse response = HTTPSClientUtils.doPost(gatewayUrl, headers, requestPayload); + HttpResponse response = invokeGatewayWithRetryOnNotFound( + gatewayUrl, headers, requestPayload, 12, 5000L);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java` at line 314, Replace the one-shot call to HTTPSClientUtils.doPost(gatewayUrl, headers, requestPayload) with the test-suite's retry helper so the first gateway invocation is retried until the deployment is up; specifically, locate the direct invocation (HttpResponse response = HTTPSClientUtils.doPost(gatewayUrl, headers, requestPayload);) and wrap it with the existing retry helper used elsewhere in this test class to repeatedly call HTTPSClientUtils.doPost(gatewayUrl, headers, requestPayload) until a successful response or timeout, then assign the resulting HttpResponse to response.
914-924:⚠️ Potential issue | 🟡 Minor
logAllMockServerRequests()currently records nothing.Lines 917-923 compute request metadata, but nothing is persisted, returned, or attached to test output. That makes this helper a no-op, so the diagnostics disappear exactly when these tests fail.
Based on learnings,
logAllMockServerRequests()is intended to store/record WireMock serve events rather than emit debug logs.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java` around lines 914 - 924, The helper logAllMockServerRequests() is a no-op because it builds request metadata but never records or outputs it; update the method to iterate serveEvents (from wireMockServer.getAllServeEvents()), build a formatted line for each LoggedRequest using formattedTime, request.getUrl(), request.getMethod().getName(), request.getBodyAsString() (and any headers you need), append each line to a StringBuilder or List<String>, and then persist/output the results (e.g., call logger.info(...) or Reporter.log(...) or write to a test-output file) so the serve events are actually recorded when tests fail; keep the existing variable names (serveEvents, request, formattedTime) so the change is localized to logAllMockServerRequests().
610-614:⚠️ Potential issue | 🟠 MajorDon't make the shared chat stub depend on one specific tool surviving filtering.
This is the only
/chat/completionsstub, but it currently requiresREQUESTED_TOOL_TEXTto be present. Once Semantic Tool Filtering is enabled, a correct ranking can legitimately prune that tool and cause the stub to miss even though the gateway behavior is correct. The baseline assertion at Lines 325-328 already validates the original payload.🧩 Proposed fix
wireMockServer.stubFor(WireMock.post(urlEqualTo(BACKEND_PATH + CHAT_RESOURCE)) .withRequestBody(matchingJsonPath("$.contents[0].parts[0].text")) .withRequestBody(containing(REQUESTED_QUERY_TEXT_SNIPPET)) - .withRequestBody(containing(REQUESTED_TOOL_TEXT)) .willReturn(aResponse() .withStatus(200) .withHeader("Content-Type", "application/json") .withBody(mockResponse)));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java` around lines 610 - 614, The shared WireMock stub for POST to BACKEND_PATH + CHAT_RESOURCE (created via wireMockServer.stubFor) should not require REQUESTED_TOOL_TEXT to be present because Semantic Tool Filtering may remove that tool; remove the .withRequestBody(containing(REQUESTED_TOOL_TEXT)) matcher from the stub setup so the stub only asserts the query text (REQUESTED_QUERY_TEXT_SNIPPET) and the parts text matcher, relying on the existing baseline assertion to validate the original payload instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java`:
- Line 314: Replace the one-shot call to HTTPSClientUtils.doPost(gatewayUrl,
headers, requestPayload) with the test-suite's retry helper so the first gateway
invocation is retried until the deployment is up; specifically, locate the
direct invocation (HttpResponse response = HTTPSClientUtils.doPost(gatewayUrl,
headers, requestPayload);) and wrap it with the existing retry helper used
elsewhere in this test class to repeatedly call
HTTPSClientUtils.doPost(gatewayUrl, headers, requestPayload) until a successful
response or timeout, then assign the resulting HttpResponse to response.
- Around line 914-924: The helper logAllMockServerRequests() is a no-op because
it builds request metadata but never records or outputs it; update the method to
iterate serveEvents (from wireMockServer.getAllServeEvents()), build a formatted
line for each LoggedRequest using formattedTime, request.getUrl(),
request.getMethod().getName(), request.getBodyAsString() (and any headers you
need), append each line to a StringBuilder or List<String>, and then
persist/output the results (e.g., call logger.info(...) or Reporter.log(...) or
write to a test-output file) so the serve events are actually recorded when
tests fail; keep the existing variable names (serveEvents, request,
formattedTime) so the change is localized to logAllMockServerRequests().
- Around line 610-614: The shared WireMock stub for POST to BACKEND_PATH +
CHAT_RESOURCE (created via wireMockServer.stubFor) should not require
REQUESTED_TOOL_TEXT to be present because Semantic Tool Filtering may remove
that tool; remove the .withRequestBody(containing(REQUESTED_TOOL_TEXT)) matcher
from the stub setup so the stub only asserts the query text
(REQUESTED_QUERY_TEXT_SNIPPET) and the parts text matcher, relying on the
existing baseline assertion to validate the original payload instead.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3565455d-9c5b-4157-928d-fc313bb147a9
📒 Files selected for processing (1)
all-in-one-apim/modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/am/integration/tests/guardrail/SemanticToolFilteringTestCase.java
Add Guardrail integration tests for AI request forwarding and Semantic Tool Filtering
Summary
What changed
How to test locally
Test case flow
Summary by CodeRabbit
New Features
Tests
Chores