Skip to content

Refactor e2e instrumentation tests to use step templates#4765

Draft
swiatekm wants to merge 11 commits intoopen-telemetry:mainfrom
swiatekm:claude/fix-e2e-tests-CQP0x
Draft

Refactor e2e instrumentation tests to use step templates#4765
swiatekm wants to merge 11 commits intoopen-telemetry:mainfrom
swiatekm:claude/fix-e2e-tests-CQP0x

Conversation

@swiatekm
Copy link
Copy Markdown
Contributor

Refactor instrumentation e2e tests to eliminate duplicated inline steps by extracting them into shared Chainsaw step templates. This dramatically reduces boilerplate and makes the test suite easier to maintain.

Changes

  • Created 4 new shared step templates in tests/step-templates/:
    • app-http-request.yaml - Makes HTTP request to app pod via kubectl proxy
    • wait-for-telemetry.yaml - Waits for telemetry data in collector pod logs
    • validate-collector-metrics.yaml - Validates collector has received spans AND metric points (> 1)
    • validate-collector-spans.yaml - Validates collector has received spans only (> 0)
  • Converted 18 instrumentation e2e tests to use shared templates via test-level bindings:
    • instrumentation-apache-httpd, instrumentation-apache-multicontainer
    • instrumentation-dotnet, instrumentation-dotnet-multicontainer, instrumentation-dotnet-musl
    • instrumentation-go
    • instrumentation-java, instrumentation-java-multicontainer, instrumentation-java-other-ns, instrumentation-java-tls
    • instrumentation-nginx, instrumentation-nginx-contnr-secctx, instrumentation-nginx-multicontainer
    • instrumentation-nodejs, instrumentation-nodejs-multicontainer, instrumentation-nodejs-volume
    • instrumentation-python, instrumentation-python-multicontainer, instrumentation-python-musl
  • Deleted 3 redundant Python-specific local templates (app-request.yaml, wait-for-telemetry-data.yaml, check-telemetry-data.yaml)
  • Fixed Chainsaw expression syntax issues in templates (using ($binding) for JMESPath expressions, ${ENV_VAR} for environment variables)

Design decisions

  • Java "Wait for telemetry" stays inline: Java tests compute a version-dependent metric name at runtime (process.runtime.jvm.memory.usage vs jvm.memory.used based on instrumentation major version), which can't be expressed as a static template binding
  • Two validation templates: Tests that only produce spans (apache, nginx, go, python-base) use validate-collector-spans.yaml (> 0), while tests that produce both spans and metrics use validate-collector-metrics.yaml (> 1)
  • Multicontainer tests reuse templates twice: Both the initial and post-update verification rounds use the same shared templates

This change reduces code duplication across instrumentation tests by
creating reusable Chainsaw step templates:

- get-pod-by-label.yaml: Parameterized pod lookup by label selector
- app-http-request.yaml: Generic HTTP request to app via kubectl proxy
- wait-for-telemetry.yaml: Wait for telemetry data in collector logs
- validate-collector-metrics.yaml: Fetch and validate collector metrics

Refactored tests:
- instrumentation-java (123 -> 83 lines, -40 lines)
- instrumentation-nodejs (107 -> 59 lines, -48 lines)
- instrumentation-dotnet (107 -> 59 lines, -48 lines)
- instrumentation-go (110 -> 66 lines, -44 lines)

Total reduction: 180 lines of duplicated code removed

These templates follow the pattern established by the Python test and
can be reused across all instrumentation tests, improving maintainability
and consistency.
- Remove complex conditional logic from validate-collector-metrics template
  that was causing test failures (boolean && and || operators in assertions)
- Remove invalid 'if' condition in catch block (not supported by Chainsaw)
- Simplify template to always check all 4 metrics (spans + metric points)
- Revert Go test to inline validation since it only checks spans (not metric points)
- Remove checkSpans/checkMetricPoints bindings from Java, NodeJS, and DotNet tests

The template now follows the same pattern as the Python test's
check-telemetry-data.yaml which is proven to work.
Chainsaw does not support binding variables in the timeout field.
Changed from 'timeout: ($scriptTimeout)' to 'timeout: 2m' to fix
error: time: invalid duration "($scriptTimeout)"

The timeout field must be a static duration value.
…code

- Restore original `>` comparison in validate-collector-metrics.yaml
  (was incorrectly changed to `>=`, loosening assertions from "at least 2"
  to "at least 1")
- Fix relative path in wait-for-telemetry.yaml: use `../test-e2e-apps/`
  instead of `../../test-e2e-apps/` since the template lives in
  `tests/step-templates/`, not `tests/e2e-instrumentation/instrumentation-*/`
- Remove unused get-pod-by-label.yaml template (dead code, functionality
  already embedded in app-http-request.yaml)
- Add missing trailing newline in instrumentation-java/chainsaw-test.yaml

https://claude.ai/code/session_01PCRpSx4oMfvuXX1GpDDvjJ
Three root causes fixed:

1. wait-for-telemetry.yaml: Script content path was relative to the
   template directory, but chainsaw resolves it relative to the test
   directory. Changed from ../test-e2e-apps/ to ../../test-e2e-apps/.

2. app-http-request.yaml and validate-collector-metrics.yaml: Used
   ${LABEL_SELECTOR} env var references in command args, but chainsaw
   resolves ${...} in args against bindings, not process env vars set
   via the env field. Replaced with ($binding) references directly.

3. validate-collector-metrics.yaml: Added shell: /bin/bash for inline
   script using {1..12} brace expansion (chainsaw defaults to sh -c).
   Replaced unproven ($binding) substitution in JMESPath assertion
   keys with inline literals matching original test code.

https://claude.ai/code/session_018bX4ykoWumDzPztPxXc4xn
Chainsaw resolves ${...} bindings in command args but may not process
($...) JMESPath expressions in that context. Changed ($podLabelSelector)
and ($collectorLabelSelector) to ${podLabelSelector} and
${collectorLabelSelector} in kubectl args, matching the proven pattern
used in the Python instrumentation test templates.

https://claude.ai/code/session_018bX4ykoWumDzPztPxXc4xn
Remove default bindings from step templates and define all bindings at
the test level (spec.bindings) to avoid template binding shadowing issues.
This follows the pattern already used by the Python instrumentation tests.

https://claude.ai/code/session_018bX4ykoWumDzPztPxXc4xn
Command args use environment.Expand() which requires ${ENV_VAR} with
env entries, not direct binding references. Proxy Expression fields
(port, path) use Expression.Value() which only handles ($...) JMESPath
syntax, not ${...}.

- app-http-request: add env entry mapping binding to env var for label
  selector in command args; use ($appPort) and ($appPath) JMESPath
  syntax for proxy port and path Expression fields
- validate-collector-metrics: add env entry mapping binding to env var
  for collector label selector in command args

https://claude.ai/code/session_018bX4ykoWumDzPztPxXc4xn
The previous CI run passed unit tests with the same Go code. The unit
test failure on this commit is likely flaky since only YAML test
template files were changed.

https://claude.ai/code/session_018bX4ykoWumDzPztPxXc4xn
Replace inline request/telemetry/validation steps across 15 instrumentation
e2e tests with shared step templates. Creates a new validate-collector-spans
template for spans-only validation, and converts all tests to use the shared
app-http-request, wait-for-telemetry, and validate-collector-metrics/spans
templates via test-level bindings. Removes Python-specific local templates
that are now redundant. Net reduction of ~1,000 lines.

https://claude.ai/code/session_018bX4ykoWumDzPztPxXc4xn
@swiatekm swiatekm added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Feb 23, 2026
@frzifus
Copy link
Copy Markdown
Member

frzifus commented Mar 5, 2026

cc @IshwarKanse

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Skip Changelog PRs that do not require a CHANGELOG.md entry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants