Declare run_only_one as a scheme argument for geoipupdate_input #84
Background
----------

The geoip database updater is a modular input. In a search head cluster we want it to run on every member so each member downloads its own copy of the MaxMind databases: the geoip search command runs with local=true and needs the .mmdb files present on the search head that executes it. To get that, the input ships run_only_one = false in inputs.conf, and 1.1.1 additionally added run_only_one to the inputs.conf.spec.

Despite that, the input was reported as not running on all search heads on Splunk Cloud Victoria -- it behaved as if run_only_one defaulted to true (run on only a single member).

Why setting it in inputs.conf was not enough
--------------------------------------------

run_only_one is a server-side (splunkd) setting. It is not referenced anywhere in the Splunk Python SDK (splunklib.modularinput) or in the UCC add-on generator (addonfactory-ucc-generator), so nothing on the client side reads or acts on it. Splunk's configuration system stores arbitrary keys, so run_only_one = false is persisted and shows up in btool, but being present in inputs.conf is not the same as the input subsystem acting on it.

For a modular input, the parameters splunkd recognizes come from the input's scheme. The SDK serializes each declared Argument into the scheme's <endpoint><args> list (splunklib/modularinput/scheme.py), and UCC treats only name, interval, index, and sourcetype as built-in fields that do not need declaring (commands/build.py field_allow_list); every other field, including run_only_one, is emitted as a scheme argument (templates/input.template). So Splunk's own tooling expresses a non-standard input setting by declaring it as a scheme argument.

Splunk's own add-ons follow the same pattern. The Splunk Add-on for CrowdStrike FDR declares run_only_one as a scheme argument in every one of its modular inputs, and its inputs.conf.spec documents the Victoria semantics: run_only_one = false runs one input instance on each search head, while run_only_one = true runs a single instance for the whole cluster.

Change
------

Declare run_only_one as a scheme argument in the modular input's runtime get_scheme (GeoIPUpdateScript), which is the scheme splunkd actually consumes. The module also keeps a separate, dependency-free dict scheme on GeoIPUpdateInput so the module can be imported and unit tested without the Splunk runtime; that dict and its unit test are updated to match so the two representations stay in sync. run_only_one = false remains set in inputs.conf.

Caveat
------

The actual enforcement of run_only_one lives inside splunkd, which is closed source, and the platform inputs.conf spec still marks the setting "currently not supported / under development" (it is only implemented on Splunk Cloud Victoria). This change matches Splunk's own tooling and add-ons and is the only structural difference between this input and one where run_only_one is honored, but it should be confirmed empirically on a Victoria stack, e.g. by running "splunk btool inputs list geoipupdate_input://default --debug" on each member and by verifying the input process runs on every member.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
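The requirement that the dict scheme and the runtime scheme stay in sync can be guarded by a small check. The following is only a minimal sketch, not the add-on's actual code: DICT_SCHEME, its field names, runtime_argument_names, and schemes_in_sync are hypothetical stand-ins, and the real module's layout may differ.

```python
# Hypothetical dependency-free scheme, in the spirit of the dict kept on
# GeoIPUpdateInput. The keys and layout are assumptions for illustration.
DICT_SCHEME = {
    "title": "geoipupdate_input",
    "args": [
        {"name": "run_only_one", "required_on_create": False},
    ],
}


def runtime_argument_names():
    """Stand-in for the names declared via scheme.add_argument(...) in the
    runtime get_scheme; hard-coded so the sketch needs no Splunk SDK."""
    return ["run_only_one"]


def dict_argument_names(scheme):
    """Return the argument names declared in the dict scheme, in order."""
    return [arg["name"] for arg in scheme["args"]]


def schemes_in_sync(scheme, runtime_names):
    """True when both representations declare the same set of arguments."""
    return set(dict_argument_names(scheme)) == set(runtime_names)
```

A unit test along these lines would fail as soon as an argument is added to one representation but not the other, which is the drift the Change section is meant to prevent.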
CodeRabbit review: No actionable comments were generated in the recent review. Review profile: ASSERTIVE. Files selected for processing: 8 (1 ignored due to path filters). Estimated code review effort: 2 (Simple), about 10 minutes. Pre-merge checks: 5 passed.
Code Review
This pull request bumps the application version to 1.1.2 and declares run_only_one as a scheme argument in the geoipupdate_input modular input's Python scheme. This ensures that Splunk honors the setting, particularly on Splunk Cloud Victoria. The reviewer suggests explicitly setting the data_type of the run_only_one argument to Argument.data_type_boolean to ensure Splunk correctly validates the parameter type.
    # the inputs.conf run_only_one value to be honored; with
    # run_only_one = false, each search head cluster member runs the
    # input and downloads its own databases.
    scheme.add_argument(Argument("run_only_one", required_on_create=False))
Since run_only_one is a boolean parameter (as declared in inputs.conf.spec), it is recommended to explicitly set its data_type to Argument.data_type_boolean when adding it to the scheme. This ensures Splunk correctly recognizes and validates the parameter type rather than defaulting to a string.
Suggested change:

    - scheme.add_argument(Argument("run_only_one", required_on_create=False))
    + scheme.add_argument(Argument("run_only_one", data_type=Argument.data_type_boolean, required_on_create=False))
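To make the difference concrete, here is a rough sketch of how the declared data type might show up in the scheme XML that splunkd reads. It does not use splunklib; build_arg and build_scheme are hypothetical helpers, and the element layout is an approximation of the scheme's <endpoint><args> structure for illustration only.

```python
import xml.etree.ElementTree as ET


def build_arg(name, data_type="string", required_on_create=False):
    """Build an <arg> element in the general shape of a modular input
    scheme argument. Hypothetical helper; layout is an approximation."""
    arg = ET.Element("arg", {"name": name})
    ET.SubElement(arg, "data_type").text = data_type
    ET.SubElement(arg, "required_on_create").text = str(required_on_create).lower()
    return arg


def build_scheme(title, args):
    """Wrap the given <arg> elements in <scheme><endpoint><args>."""
    scheme = ET.Element("scheme")
    ET.SubElement(scheme, "title").text = title
    endpoint = ET.SubElement(scheme, "endpoint")
    args_el = ET.SubElement(endpoint, "args")
    args_el.extend(args)
    return scheme


# Without an explicit data_type the argument is declared as a string.
default_arg = build_arg("run_only_one")
# With the reviewer's suggestion it is declared as a boolean instead.
boolean_arg = build_arg("run_only_one", data_type="boolean")

scheme_xml = ET.tostring(
    build_scheme("geoipupdate_input", [boolean_arg]), encoding="unicode"
)
```

Whether splunkd treats the two declarations differently for run_only_one is not something this sketch can show; like the rest of the change, that would need to be confirmed on a Victoria stack.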