GenAI Utils | Adding Embedding metrics #4377

Open
shuningc wants to merge 6 commits into open-telemetry:main from shuningc:addingEmbeddingMetric

Conversation

@shuningc
Contributor

Description

This PR adds metrics and events telemetry for [EmbeddingInvocation], aligning with OpenTelemetry semantic conventions. Previously, embedding invocations emitted only spans; now they also emit duration metrics and operation details events.

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Added 3 new tests for embedding metrics:
[test_stop_embedding_records_duration_only] - verifies duration is recorded but token metrics are NOT
[test_stop_embedding_records_duration_with_additional_attributes] - verifies server address, port, custom attributes, and response model are included
[test_fail_embedding_records_error_and_duration] - verifies error path records error.type and duration
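
The behaviour these tests describe can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's implementation: `FakeHistogram` and `record_duration` are hypothetical stand-ins for the real instrument and handler code, and only the `error.type` attribute name is taken from the description above.

```python
from typing import Dict, List, Optional, Tuple


class FakeHistogram:
    """Hypothetical in-memory stand-in for a metrics histogram."""

    def __init__(self) -> None:
        self.points: List[Tuple[float, Dict[str, object]]] = []

    def record(self, value: float, attributes: Dict[str, object]) -> None:
        # Copy the attributes so later mutation by the caller has no effect.
        self.points.append((value, dict(attributes)))


def record_duration(
    histogram: FakeHistogram,
    start_s: float,
    end_s: float,
    attributes: Dict[str, object],
    error_type: Optional[str] = None,
) -> None:
    """Record one duration point; add error.type only on the failure path."""
    point_attributes = dict(attributes)
    if error_type is not None:
        point_attributes["error.type"] = error_type
    histogram.record(end_s - start_s, point_attributes)


duration = FakeHistogram()
base = {"gen_ai.operation.name": "embeddings"}
record_duration(duration, 10.0, 10.5, base)  # success path
record_duration(duration, 20.0, 20.25, base, error_type="TimeoutError")  # failure path
```

Both paths record a duration point; only the failed call carries `error.type`.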

Does This PR Require a Core Repo Change?

  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added

@shuningc shuningc requested a review from a team as a code owner March 30, 2026 12:52
@shuningc shuningc marked this pull request as draft March 30, 2026 12:52
@shuningc shuningc marked this pull request as ready for review March 30, 2026 12:53
self._record_embedding_metrics(
invocation, span, error_type=error_type
self._record_metrics(invocation, span, error_type=error_type)
_maybe_emit_embedding_event(
Member

@lmolkova lmolkova Mar 31, 2026

I don't think we have an embedding event documented in semantic conventions. The one we have (gen_ai.client.inference.operation.details https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-events.md#event-gen_aiclientinferenceoperationdetails) is for inference.

I believe the only reason we wanted an event for it is to capture large content, but we never added any of the content attributes to the embedding span - https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-spans.md#embeddings

So I think we should not report any events for embeddings

Contributor Author

Removed event emission for the Embedding type

@xrmx xrmx added the gen-ai Related to generative AI label Apr 1, 2026
@xrmx xrmx moved this to Reviewed PRs that need fixes in Python PR digest Apr 1, 2026
@shuningc shuningc changed the title from "GenAI Utils | Adding Embedding metrics and events" to "GenAI Utils | Adding Embedding metrics" Apr 2, 2026
attributes: Dict[str, AttributeValue] = {}

# Set attributes using getattr for fields that may not exist on base class
operation_name = getattr(invocation, "operation_name", None)
Member

is there a case when operation name is not available in the base model? it should not be

Contributor Author

Removed None check

if request_model:
    attributes[GenAI.GEN_AI_REQUEST_MODEL] = request_model

provider = getattr(invocation, "provider", None)
Member

is there a case when provider is not available? it should always be available

Contributor Author

Removed None check

if server_port is not None:
    attributes[server_attributes.SERVER_PORT] = server_port

metric_attributes = getattr(invocation, "metric_attributes", None)
Member

should we make metric_attributes part of base operation definition?

Contributor Author

Moved to base operation definition, together with operation name and provider.
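
For readers following the thread, here is a rough sketch of what moving these fields onto the base definition could look like. It assumes a dataclass-style base type; `GenAIInvocation` and `build_metric_attributes` are hypothetical names, the attribute keys are written as plain strings rather than the semconv constants used in the diff, and letting custom `metric_attributes` be applied last is an assumption rather than something the PR states.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class GenAIInvocation:
    """Hypothetical base type; the real class and field set may differ."""

    operation_name: str  # always present, so no getattr/None check is needed
    provider: str  # always present as well
    request_model: Optional[str] = None
    server_address: Optional[str] = None
    server_port: Optional[int] = None
    metric_attributes: Dict[str, Any] = field(default_factory=dict)


def build_metric_attributes(invocation: GenAIInvocation) -> Dict[str, Any]:
    """Collect metric attributes by direct field access on the base type."""
    attributes: Dict[str, Any] = {
        "gen_ai.operation.name": invocation.operation_name,
        "gen_ai.provider.name": invocation.provider,
    }
    if invocation.request_model:
        attributes["gen_ai.request.model"] = invocation.request_model
    if invocation.server_address:
        attributes["server.address"] = invocation.server_address
    if invocation.server_port is not None:
        attributes["server.port"] = invocation.server_port
    # Assumption: caller-supplied attributes are merged last.
    attributes.update(invocation.metric_attributes)
    return attributes


embedding = GenAIInvocation(
    operation_name="embeddings",
    provider="example-provider",
    request_model="example-embedding-model",
    server_port=443,
    metric_attributes={"custom.key": "value"},
)
embedding_attributes = build_metric_attributes(embedding)
```

With the fields declared on the base type, the optional ones are the only ones that still need a presence check.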

context=span_context,
)
# Only record token metrics for LLMInvocation
if isinstance(invocation, LLMInvocation):
Member

why only LLM? we should still report input token usage for embeddings

Contributor Author

Added input token metrics for embeddings
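
One way to get this behaviour without an `isinstance` check is to key off which token counts are present, roughly as sketched below. This is an illustration only: `TokenUsage` and `token_points` are hypothetical names, and the sketch assumes an embedding invocation simply leaves its output token count unset.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TokenUsage:
    """Hypothetical view of the token counts carried by an invocation."""

    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None  # assumed to stay None for embeddings


def token_points(
    usage: TokenUsage, attributes: Dict[str, str]
) -> List[Tuple[int, Dict[str, str]]]:
    """Return one (value, attributes) pair per token count that is set."""
    points: List[Tuple[int, Dict[str, str]]] = []
    if usage.input_tokens is not None:
        points.append(
            (usage.input_tokens, {**attributes, "gen_ai.token.type": "input"})
        )
    if usage.output_tokens is not None:
        points.append(
            (usage.output_tokens, {**attributes, "gen_ai.token.type": "output"})
        )
    return points


embedding_points = token_points(
    TokenUsage(input_tokens=12), {"gen_ai.operation.name": "embeddings"}
)
chat_points = token_points(
    TokenUsage(input_tokens=30, output_tokens=8), {"gen_ai.operation.name": "chat"}
)
```

Here the embedding call yields a single input-token point, while the chat call yields both input and output points.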

@shuningc shuningc force-pushed the addingEmbeddingMetric branch from e6216e4 to 68d328f Compare April 8, 2026 00:12
Labels

gen-ai Related to generative AI

Projects

Status: Reviewed PRs that need fixes

5 participants