Add sentinel_peer_info metric for Sentinel peer discovery#1097
Add sentinel_peer_info metric for Sentinel peer discovery#1097oliver006 merged 9 commits intooliver006:masterfrom
Conversation
|
Thanks for the PR - I'll try to review this in the coming days. One thing I noticed - I don't see a single new test for this. Can you please make sure that new functionality has complete code coverage? |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1097 +/- ##
==========================================
+ Coverage 81.88% 82.11% +0.22%
==========================================
Files 20 20
Lines 2700 2728 +28
==========================================
+ Hits 2211 2240 +29
Misses 370 370
+ Partials 119 118 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Coverage Report for CI Build 24058978132Warning Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes. Coverage increased (+0.2%) to 85.849%Details
Uncovered Changes
Coverage RegressionsNo coverage regressions found. Coverage Stats
💛 - Coveralls |
|
quick bump - are you still planning on finishing this? |
Sorry, I’ve been quite busy lately and had actually forgotten about this. But I’ve added the missing part now. That said, regarding the high-cardinality metrics, I think it would be good for us to discuss this a bit further. The reason I included run_id and the IP/port information is to help capture certain My intention was to make the cluster state more transparent and easier to observe. That said, I completely agree that high cardinality is a real concern. If the metrics are stored in a system like Prometheus, which does not handle high-cardinality data very well, it could lead to significant write amplification. Do you think it’s necessary for us to expose this information? If not, I could also add an enable flag so that it is disabled by default. |
|
Thanks for getting back to this PR!
I don't think it's needed, want to just go without it for now? |
|
fwiw, tests failed |
I added the InclSentinelPeerInfo flag, defaulting to false. The sentinel_peer_info metric is only registered and emitted when it is set to true. I also fixed the unit test issue from this round, so it should work properly now. One thing to call out: I removed log.SetLevel(log.DebugLevel) from exporter/info_test.go, since after TestKeyspaceStringParser runs, it leaves the log level set to debug for all following tests, which causes a large amount of log output. By the way, I can’t trigger CI from my side, so could you please help run it? Thanks! |
| if v, ok := sentinelDetailMap["runid"]; ok { | ||
| runid = v | ||
| } | ||
| flags := "" |
There was a problem hiding this comment.
let's use this down in line 175 and onwards instead of extracting it again down there
exporter/sentinels.go
Outdated
| masterOkSentinels := 1 | ||
| masterName := "" | ||
| masterAddr := "" | ||
| if len(labels) >= 2 { |
There was a problem hiding this comment.
Do we still want to emit the metric if labels is empty?
(plus we outright use labels down in 197 for the other metric)
would be good to be consistent here and maybe just return if labels is empty?
exporter/exporter.go
Outdated
| e.metricDescriptions[k] = newMetricDescr(opts.Namespace, k, desc.txt, desc.lbls) | ||
| } | ||
|
|
||
| if !opts.InclSentinelPeerInfo { |
There was a problem hiding this comment.
I don't think we need to delete the description (and we don't do it for any other flags), just not emitting the metric should be enough?
Add
sentinel_peer_infometric for Sentinel peer discoverySummary
When the exporter is pointed at a Redis Sentinel, it now exposes a new metric
sentinel_peer_infothat lists other Sentinel peers discovered viaSENTINEL SENTINELS. This allows discovering all Sentinel addresses by scraping a single Sentinel instance.Motivation
SENTINEL SENTINELS <master_name>returns the other Sentinels monitoring that master. Each peer is emitted as one time series, so there is no overlap or overwriting when multiple Sentinels are monitored.Changes
New metric:
sentinel_peer_infomaster_name,master_address,name,ip,port,runid,flags.down_after_millisecondsorvoted_leader).Example output
Testing
sentinel_master_ok_sentinels.