Feature - add missing jvm thread metrics #17135
Open
JackN5y
wants to merge
3
commits into
open-telemetry:main
from
JackN5y:feature/add-missing-jvm-thread-metrics
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,138 @@ | ||
| ## 7. JVM Runtime Metrics | ||
|
|
||
| The OpenTelemetry Java Agent automatically collects JVM runtime metrics via JMX. These metrics are enabled by default and provide insight into JVM health — memory, threads, CPU, GC, and class loading. | ||
|
|
||
| ### Stable Metrics (Enabled by Default) | ||
|
|
||
| These metrics follow the [OpenTelemetry JVM semantic conventions](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/runtime/jvm-metrics.md) and are always collected: | ||
|
|
||
| | OTel Metric Name | Prometheus Name | Description | | ||
| |---|---|---| | ||
| | `jvm.memory.used` | `jvm_memory_used_bytes` | Measure of memory used | | ||
| | `jvm.memory.committed` | `jvm_memory_committed_bytes` | Measure of memory committed | | ||
| | `jvm.memory.limit` | `jvm_memory_limit_bytes` | Measure of max memory available | | ||
| | `jvm.memory.used_after_last_gc` | `jvm_memory_used_after_last_gc_bytes` | Memory used after last GC | | ||
| | `jvm.thread.count` | `jvm_thread_count` | Number of executing platform threads | | ||
| | `jvm.class.loaded` | `jvm_class_loaded_total` | Total number of classes loaded | | ||
| | `jvm.class.unloaded` | `jvm_class_unloaded_total` | Total number of classes unloaded | | ||
| | `jvm.class.count` | `jvm_class_count` | Current number of loaded classes | | ||
| | `jvm.cpu.time` | `jvm_cpu_time_seconds_total` | CPU time used by the JVM process | | ||
| | `jvm.cpu.count` | `jvm_cpu_count` | Number of available processors | | ||
| | `jvm.gc.duration` | `jvm_gc_duration_seconds` | Duration of GC pauses | | ||
|
|
||
| > **Note:** The Prometheus exporter automatically converts OTel metric names from dots (`.`) to underscores (`_`) and appends unit suffixes like `_bytes`, `_total`, `_seconds`. | ||
|
|
||
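The renaming described in the note can be sketched in a few lines of Java. This is a simplified illustration, not the exporter's actual code: the class name `PrometheusNameSketch`, the method `toPrometheusName`, and the three-entry unit table are assumptions made for the example, and the real exporter handles many more units and edge cases.

```java
import java.util.Map;

/** Simplified, illustrative model of the OTel-to-Prometheus name mapping. */
final class PrometheusNameSketch {

  // Assumed subset of unit-to-suffix mappings; the real exporter knows many more units.
  private static final Map<String, String> UNIT_SUFFIXES =
      Map.of("By", "bytes", "s", "seconds", "1", "ratio");

  static String toPrometheusName(String otelName, String unit, boolean monotonicCounter) {
    // Dots become underscores.
    String name = otelName.replace('.', '_');

    // Annotation-style units such as "{thread}" have no entry in the table and add no suffix.
    String suffix = UNIT_SUFFIXES.get(unit);
    if (suffix != null && !name.endsWith("_" + suffix)) {
      name = name + "_" + suffix;
    }

    // Monotonic counters get a trailing "_total".
    if (monotonicCounter && !name.endsWith("_total")) {
      name = name + "_total";
    }
    return name;
  }

  public static void main(String[] args) {
    System.out.println(toPrometheusName("jvm.memory.used", "By", false));
    System.out.println(toPrometheusName("jvm.cpu.time", "s", true));
    System.out.println(toPrometheusName("jvm.thread.count", "{thread}", false));
  }
}
```

With these inputs the sketch reproduces the Prometheus names listed in the table above, which is a quick way to sanity-check a dashboard query against an OTel metric name.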
| ### Experimental Metrics | ||
|
|
||
| Additional JVM metrics are available but must be explicitly enabled. These are considered experimental and may change in future releases. | ||
|
|
||
| #### How to Enable | ||
|
|
||
| ```bash | ||
| # Command line | ||
| java -javaagent:opentelemetry-javaagent.jar \ | ||
| -Dotel.instrumentation.runtime-telemetry.emit-experimental-telemetry=true \ | ||
| -jar my-application.jar | ||
|
|
||
| # Environment variable | ||
| export OTEL_INSTRUMENTATION_RUNTIME_TELEMETRY_EMIT_EXPERIMENTAL_TELEMETRY=true | ||
| ``` | ||
|
|
||
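The system property and the environment variable above are two spellings of the same setting: the environment variable is the property name upper-cased, with dots and dashes replaced by underscores. A minimal sketch of that rule, using a hypothetical helper class `EnvVarNameSketch` that is not part of the agent:

```java
import java.util.Locale;

/** Illustrative helper: derives the environment variable name from a system property name. */
final class EnvVarNameSketch {

  static String toEnvVarName(String systemProperty) {
    // Upper-case with a fixed locale, then map '.' and '-' to '_'.
    return systemProperty.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
  }

  public static void main(String[] args) {
    System.out.println(
        toEnvVarName("otel.instrumentation.runtime-telemetry.emit-experimental-telemetry"));
  }
}
```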
| #### Experimental Metrics List | ||
|
|
||
| | OTel Metric Name | Prometheus Name | Description | | ||
| |---|---|---| | ||
| | `jvm.memory.init` | `jvm_memory_init_bytes` | Initial memory pool size | | ||
| | `jvm.buffer.memory.used` | `jvm_buffer_memory_used_bytes` | Memory used by buffers | | ||
| | `jvm.buffer.memory.limit` | `jvm_buffer_memory_limit_bytes` | Total memory capacity of buffers | | ||
| | `jvm.buffer.count` | `jvm_buffer_count` | Number of buffers in the pool | | ||
| | `jvm.system.cpu.load_1m` | `jvm_system_cpu_load_1m` | System CPU load average (1 min) | | ||
| | `jvm.system.cpu.utilization` | `jvm_system_cpu_utilization_ratio` | System CPU utilization | | ||
| | `jvm.file_descriptor.count` | `jvm_file_descriptor_count` | Number of open file descriptors | | ||
| | `jvm.thread.deadlock.count` | `jvm_thread_deadlock_count` | Threads in deadlock (monitors + ownable synchronizers) | | ||
| | `jvm.thread.monitor_deadlock.count` | `jvm_thread_monitor_deadlock_count` | Threads in deadlock (monitors only) | | ||
|
|
||
| ### Deadlock Detection Metrics | ||
|
|
||
| The two deadlock metrics (`jvm.thread.deadlock.count` and `jvm.thread.monitor_deadlock.count`) use `ThreadMXBean.findDeadlockedThreads()` and `ThreadMXBean.findMonitorDeadlockedThreads()` — JMX **operations** that cannot be expressed via standard JMX YAML rules (which only support reading MBean attributes). | ||
|
|
||
| | Metric | What It Detects | | ||
| |---|---| | ||
| | `jvm.thread.deadlock.count` | Deadlocks involving object monitors (`synchronized` blocks), ownable synchronizers such as `java.util.concurrent` locks (e.g., `ReentrantLock`), or a mix of the two | | ||
| | `jvm.thread.monitor_deadlock.count` | Deadlocks involving **only** `synchronized` blocks (object monitor locks) | | ||
|
|
||
| These are equivalent to Prometheus client_java's `jvm_threads_deadlocked` and `jvm_threads_deadlocked_monitor` gauges. | ||
|
|
||
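The difference between the two metrics is easiest to see with a deliberate deadlock. The sketch below is a standalone demo, not part of this change: it deadlocks two daemon threads on a pair of `ReentrantLock`s and then queries the same two `ThreadMXBean` operations. The class name `DeadlockDemo` and the 5 second polling timeout are assumptions for the example, and it assumes a JVM where `ThreadMXBean.isSynchronizerUsageSupported()` is true (as on HotSpot).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

/** Demo only: deliberately deadlocks two daemon threads on java.util.concurrent locks. */
final class DeadlockDemo {

  /** Returns {threads in any deadlock, threads in monitor-only deadlock}. */
  static int[] deadlockTwoThreadsAndCount() {
    ReentrantLock lockA = new ReentrantLock();
    ReentrantLock lockB = new ReentrantLock();
    CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

    // Opposite acquisition order guarantees a cycle once both threads hold their first lock.
    startDaemon(lockA, lockB, bothHoldFirstLock);
    startDaemon(lockB, lockA, bothHoldFirstLock);

    ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    long deadline = System.nanoTime() + 5_000_000_000L; // give up after 5 seconds
    long[] all = threads.findDeadlockedThreads();
    while (all == null && System.nanoTime() < deadline) {
      try {
        Thread.sleep(20);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
      all = threads.findDeadlockedThreads();
    }
    long[] monitorOnly = threads.findMonitorDeadlockedThreads();
    return new int[] {length(all), length(monitorOnly)};
  }

  private static void startDaemon(ReentrantLock first, ReentrantLock second, CountDownLatch latch) {
    Thread thread =
        new Thread(
            () -> {
              first.lock();
              latch.countDown();
              try {
                latch.await(); // wait until the other thread holds its first lock too
                second.lock(); // never succeeds: the other thread holds this lock
              } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
              }
            });
    thread.setDaemon(true); // daemon threads do not keep the JVM alive
    thread.start();
  }

  // Both operations return null when no deadlock is found.
  private static int length(long[] ids) {
    return ids == null ? 0 : ids.length;
  }

  public static void main(String[] args) {
    int[] counts = deadlockTwoThreadsAndCount();
    System.out.println("findDeadlockedThreads: " + counts[0]);
    System.out.println("findMonitorDeadlockedThreads: " + counts[1]);
  }
}
```

Because the cycle consists only of `ReentrantLock`s, the first count should report both threads while the monitor-only count should stay at zero. Swapping the locks for `synchronized` blocks would be expected to make both counts non-zero.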
| #### Verifying Deadlock Metrics | ||
|
|
||
| If you have the Prometheus exporter enabled, you can verify the metrics: | ||
|
|
||
| ```bash | ||
| # Check for deadlock metrics | ||
| curl -s http://localhost:9464/metrics | grep jvm_thread_deadlock | ||
|
|
||
| # Expected output (0 means no deadlocks — which is healthy): | ||
| # HELP jvm_thread_deadlock_count Number of platform threads that are in deadlock... | ||
| # TYPE jvm_thread_deadlock_count gauge | ||
| # jvm_thread_deadlock_count{...} 0.0 | ||
| # HELP jvm_thread_monitor_deadlock_count Number of platform threads that are in deadlock... | ||
| # TYPE jvm_thread_monitor_deadlock_count gauge | ||
| # jvm_thread_monitor_deadlock_count{...} 0.0 | ||
| ``` | ||
|
|
||
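The same check can be automated, for example in a smoke test that scrapes the endpoint and inspects the value. The following is a minimal sketch of reading one sample from Prometheus text output that has already been fetched into a string. The class `PrometheusTextSketch` is hypothetical, and the parser is intentionally naive: it ignores timestamps and does not handle special values such as `+Inf`.

```java
import java.util.OptionalDouble;

/** Naive, illustrative reader for a single sample in Prometheus text exposition output. */
final class PrometheusTextSketch {

  static OptionalDouble firstSampleValue(String expositionText, String metricName) {
    for (String line : expositionText.split("\n")) {
      String trimmed = line.trim();
      // Skip blank lines and "# HELP" / "# TYPE" comment lines.
      if (trimmed.isEmpty() || trimmed.startsWith("#") || !trimmed.startsWith(metricName)) {
        continue;
      }
      // The name must be followed by '{' (labels) or ' ' (value); otherwise it is a longer name.
      if (trimmed.length() == metricName.length()) {
        continue;
      }
      char next = trimmed.charAt(metricName.length());
      if (next != '{' && next != ' ') {
        continue;
      }
      String afterLabels =
          next == '{'
              ? trimmed.substring(trimmed.lastIndexOf('}') + 1)
              : trimmed.substring(metricName.length());
      // The first whitespace-separated field after the labels is the sample value.
      String[] fields = afterLabels.trim().split("\\s+");
      return OptionalDouble.of(Double.parseDouble(fields[0]));
    }
    return OptionalDouble.empty();
  }
}
```

A caller would treat an empty result as "metric not exported" (for example, experimental telemetry not enabled) and any value above zero as a detected deadlock.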
| #### Alerting on Deadlocks | ||
|
|
||
| These metrics are particularly useful for alerting. A value greater than 0 indicates a deadlock in the JVM: | ||
|
|
||
| ```yaml | ||
| # Example Prometheus alert rule | ||
| groups: | ||
|   - name: jvm_deadlock_alerts | ||
|     rules: | ||
|       - alert: JvmThreadDeadlock | ||
|         expr: jvm_thread_deadlock_count > 0 | ||
|         for: 1m | ||
|         labels: | ||
|           severity: critical | ||
|         annotations: | ||
|           summary: "JVM thread deadlock detected on {{ $labels.instance }}" | ||
|           description: "{{ $value }} threads are in deadlock. Immediate investigation required." | ||
| ``` | ||
|
|
||
| ### Complete Example with All JVM Metrics Enabled | ||
|
|
||
| ```bash | ||
| java -javaagent:/path/to/opentelemetry-javaagent.jar \ | ||
| -Dotel.service.name=my-java-service \ | ||
| -Dotel.exporter.otlp.endpoint=http://localhost:4317 \ | ||
| -Dotel.exporter.otlp.protocol=grpc \ | ||
| -Dotel.metrics.exporter=otlp \ | ||
| -Dotel.logs.exporter=otlp \ | ||
| -Dotel.traces.exporter=otlp \ | ||
| -Dotel.instrumentation.runtime-telemetry.emit-experimental-telemetry=true \ | ||
| -jar my-application.jar | ||
| ``` | ||
|
|
||
| For JBoss/WildFly, add to `standalone.conf`: | ||
|
|
||
| ```bash | ||
| # ========================================= | ||
| # OpenTelemetry Java Agent Configuration | ||
| # with experimental JVM metrics enabled | ||
| # ========================================= | ||
|
|
||
| OTEL_AGENT_PATH="/opt/jboss/opentelemetry-javaagent.jar" | ||
|
|
||
| export OTEL_SERVICE_NAME="jboss-application" | ||
| export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" | ||
| export OTEL_EXPORTER_OTLP_PROTOCOL="grpc" | ||
| export OTEL_METRICS_EXPORTER="otlp" | ||
| export OTEL_LOGS_EXPORTER="otlp" | ||
| export OTEL_TRACES_EXPORTER="otlp" | ||
|
|
||
| # Enable experimental JVM metrics (buffer pools, CPU utilization, | ||
| # file descriptors, deadlock detection) | ||
| export OTEL_INSTRUMENTATION_RUNTIME_TELEMETRY_EMIT_EXPERIMENTAL_TELEMETRY=true | ||
|
|
||
| JAVA_OPTS="$JAVA_OPTS -javaagent:$OTEL_AGENT_PATH" | ||
| ``` |
68 changes: 68 additions & 0 deletions
...a/io/opentelemetry/instrumentation/runtimemetrics/java8/internal/ExperimentalThreads.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| /* | ||
| * Copyright The OpenTelemetry Authors | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| */ | ||
|
|
||
| package io.opentelemetry.instrumentation.runtimemetrics.java8.internal; | ||
|
|
||
| import io.opentelemetry.api.OpenTelemetry; | ||
| import io.opentelemetry.api.metrics.Meter; | ||
| import java.lang.management.ManagementFactory; | ||
| import java.lang.management.ThreadMXBean; | ||
| import java.util.ArrayList; | ||
| import java.util.List; | ||
|
|
||
| /** | ||
| * Registers measurements that generate experimental metrics about JVM threads. | ||
| * | ||
| * <p>This class is internal and is hence not for public use. Its APIs are unstable and can change | ||
| * at any time. | ||
| * | ||
| * @deprecated Use {@link io.opentelemetry.instrumentation.runtimemetrics.java8.RuntimeMetrics} | ||
| * instead, and configure metric views to select specific metrics. | ||
| */ | ||
| @Deprecated | ||
| public final class ExperimentalThreads { | ||
|
|
||
| /** Register observers for java runtime experimental thread metrics. */ | ||
| public static List<AutoCloseable> registerObservers(OpenTelemetry openTelemetry) { | ||
| return registerObservers(openTelemetry, ManagementFactory.getThreadMXBean()); | ||
| } | ||
|
|
||
| // Visible for testing | ||
| static List<AutoCloseable> registerObservers( | ||
| OpenTelemetry openTelemetry, ThreadMXBean threadBean) { | ||
| Meter meter = JmxRuntimeMetricsUtil.getMeter(openTelemetry); | ||
| List<AutoCloseable> observables = new ArrayList<>(); | ||
|
|
||
| observables.add( | ||
| meter | ||
| .upDownCounterBuilder("jvm.thread.deadlock.count") | ||
| .setDescription( | ||
| "Number of platform threads that are in deadlock waiting to acquire object monitors or ownable synchronizers.") | ||
| .setUnit("{thread}") | ||
| .buildWithCallback( | ||
| measurement -> | ||
| measurement.record( | ||
| nullSafeArrayLength(threadBean.findDeadlockedThreads())))); | ||
|
|
||
| observables.add( | ||
| meter | ||
| .upDownCounterBuilder("jvm.thread.monitor_deadlock.count") | ||
| .setDescription( | ||
| "Number of platform threads that are in deadlock waiting to acquire object monitors.") | ||
| .setUnit("{thread}") | ||
| .buildWithCallback( | ||
| measurement -> | ||
| measurement.record( | ||
| nullSafeArrayLength(threadBean.findMonitorDeadlockedThreads())))); | ||
|
|
||
| return observables; | ||
| } | ||
|
|
||
| private static long nullSafeArrayLength(long[] array) { | ||
| return array == null ? 0 : array.length; | ||
| } | ||
|
|
||
| private ExperimentalThreads() {} | ||
| } |
99 changes: 99 additions & 0 deletions
.../opentelemetry/instrumentation/runtimemetrics/java8/internal/ExperimentalThreadsTest.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,99 @@ | ||
| /* | ||
| * Copyright The OpenTelemetry Authors | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| */ | ||
|
|
||
| package io.opentelemetry.instrumentation.runtimemetrics.java8.internal; | ||
|
|
||
| import static io.opentelemetry.instrumentation.runtimemetrics.java8.ScopeUtil.EXPECTED_SCOPE; | ||
| import static io.opentelemetry.sdk.testing.assertj.OpenTelemetryAssertions.assertThat; | ||
| import static org.mockito.Mockito.when; | ||
|
|
||
| import io.opentelemetry.instrumentation.testing.junit.InstrumentationExtension; | ||
| import io.opentelemetry.instrumentation.testing.junit.LibraryInstrumentationExtension; | ||
| import java.lang.management.ThreadMXBean; | ||
| import org.junit.jupiter.api.Test; | ||
| import org.junit.jupiter.api.extension.ExtendWith; | ||
| import org.junit.jupiter.api.extension.RegisterExtension; | ||
| import org.mockito.Mock; | ||
| import org.mockito.junit.jupiter.MockitoExtension; | ||
|
|
||
| @SuppressWarnings("deprecation") // until ExperimentalThreads is renamed | ||
| @ExtendWith(MockitoExtension.class) | ||
| class ExperimentalThreadsTest { | ||
|
|
||
| @RegisterExtension | ||
| static final InstrumentationExtension testing = LibraryInstrumentationExtension.create(); | ||
|
|
||
| @Mock private ThreadMXBean threadBean; | ||
|
|
||
| @Test | ||
| void registerObservers_DeadlockedThreads() { | ||
| when(threadBean.findDeadlockedThreads()).thenReturn(new long[] {1, 2, 3}); | ||
| when(threadBean.findMonitorDeadlockedThreads()).thenReturn(new long[] {4, 5}); | ||
|
|
||
| ExperimentalThreads.registerObservers(testing.getOpenTelemetry(), threadBean); | ||
|
|
||
| testing.waitAndAssertMetrics( | ||
| "io.opentelemetry.runtime-telemetry-java8", | ||
| "jvm.thread.deadlock.count", | ||
| metrics -> | ||
| metrics.anySatisfy( | ||
| metricData -> | ||
| assertThat(metricData) | ||
| .hasInstrumentationScope(EXPECTED_SCOPE) | ||
| .hasDescription( | ||
| "Number of platform threads that are in deadlock waiting to acquire object monitors or ownable synchronizers.") | ||
| .hasUnit("{thread}") | ||
| .hasLongSumSatisfying( | ||
| sum -> | ||
| sum.isNotMonotonic() | ||
| .hasPointsSatisfying(point -> point.hasValue(3))))); | ||
|
|
||
| testing.waitAndAssertMetrics( | ||
| "io.opentelemetry.runtime-telemetry-java8", | ||
| "jvm.thread.monitor_deadlock.count", | ||
| metrics -> | ||
| metrics.anySatisfy( | ||
| metricData -> | ||
| assertThat(metricData) | ||
| .hasInstrumentationScope(EXPECTED_SCOPE) | ||
| .hasDescription( | ||
| "Number of platform threads that are in deadlock waiting to acquire object monitors.") | ||
| .hasUnit("{thread}") | ||
| .hasLongSumSatisfying( | ||
| sum -> | ||
| sum.isNotMonotonic() | ||
| .hasPointsSatisfying(point -> point.hasValue(2))))); | ||
| } | ||
|
|
||
| @Test | ||
| void registerObservers_NoDeadlockedThreads() { | ||
| when(threadBean.findDeadlockedThreads()).thenReturn(null); | ||
| when(threadBean.findMonitorDeadlockedThreads()).thenReturn(null); | ||
|
|
||
| ExperimentalThreads.registerObservers(testing.getOpenTelemetry(), threadBean); | ||
|
|
||
| testing.waitAndAssertMetrics( | ||
| "io.opentelemetry.runtime-telemetry-java8", | ||
| "jvm.thread.deadlock.count", | ||
| metrics -> | ||
| metrics.anySatisfy( | ||
| metricData -> | ||
| assertThat(metricData) | ||
| .hasLongSumSatisfying( | ||
| sum -> | ||
| sum.hasPointsSatisfying(point -> point.hasValue(0))))); | ||
|
|
||
| testing.waitAndAssertMetrics( | ||
| "io.opentelemetry.runtime-telemetry-java8", | ||
| "jvm.thread.monitor_deadlock.count", | ||
| metrics -> | ||
| metrics.anySatisfy( | ||
| metricData -> | ||
| assertThat(metricData) | ||
| .hasLongSumSatisfying( | ||
| sum -> | ||
| sum.hasPointsSatisfying(point -> point.hasValue(0))))); | ||
| } | ||
| } |
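The test above stubs `ThreadMXBean` with Mockito. For readers who want to try the null handling without the project's test dependencies, roughly the same stub can be built with a JDK dynamic proxy. This is a sketch under that assumption, not a proposed change to the test: `ThreadBeanStubSketch` is a hypothetical class, and its `nullSafeArrayLength` is a copy of the private helper in `ExperimentalThreads`.

```java
import java.lang.management.ThreadMXBean;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

/** Illustrative JDK-only stand-in for the Mockito stub used in the test. */
final class ThreadBeanStubSketch {

  /** Builds a ThreadMXBean that answers only the two deadlock operations. */
  static ThreadMXBean stub(long[] deadlocked, long[] monitorDeadlocked) {
    InvocationHandler handler =
        (proxy, method, args) -> {
          switch (method.getName()) {
            case "findDeadlockedThreads":
              return deadlocked;
            case "findMonitorDeadlockedThreads":
              return monitorDeadlocked;
            default:
              // Any other ThreadMXBean method is out of scope for this stub.
              throw new UnsupportedOperationException(method.getName());
          }
        };
    return (ThreadMXBean)
        Proxy.newProxyInstance(
            ThreadBeanStubSketch.class.getClassLoader(),
            new Class<?>[] {ThreadMXBean.class},
            handler);
  }

  // Same logic as the private helper in ExperimentalThreads: null means "no deadlock".
  static long nullSafeArrayLength(long[] array) {
    return array == null ? 0 : array.length;
  }

  public static void main(String[] args) {
    ThreadMXBean bean = stub(new long[] {1, 2, 3}, null);
    System.out.println(nullSafeArrayLength(bean.findDeadlockedThreads()));
    System.out.println(nullSafeArrayLength(bean.findMonitorDeadlockedThreads()));
  }
}
```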
this is already covered with the `jvm.thread.count` metric with the [`jvm.thread.daemon`](https://opentelemetry.io/docs/specs/semconv/registry/attributes/jvm/) boolean attribute. Unfortunately, the `daemon` status can't be accessed through the JMX interface; however, it is provided when the metrics are captured with the `runtime-telemetry` module.
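For context on the daemon breakdown mentioned in this comment, in-process code can read each thread's daemon flag directly. The sketch below shows one way to count live threads by daemon status using only the JDK. Whether the `runtime-telemetry` module computes it this way is an assumption, and `DaemonThreadCountSketch` is a hypothetical class.

```java
import java.util.Map;
import java.util.stream.Collectors;

/** Illustrative only: counts live threads by daemon status from inside the JVM. */
final class DaemonThreadCountSketch {

  /** Key true holds the daemon thread count, key false the non-daemon count. */
  static Map<Boolean, Long> countByDaemonStatus() {
    // getAllStackTraces() returns a snapshot keyed by every live thread.
    return Thread.getAllStackTraces().keySet().stream()
        .collect(Collectors.partitioningBy(Thread::isDaemon, Collectors.counting()));
  }

  public static void main(String[] args) {
    Map<Boolean, Long> counts = countByDaemonStatus();
    System.out.println("daemon=" + counts.get(true) + " non-daemon=" + counts.get(false));
  }
}
```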