Add scripted-input variant of the database updater by horgh · Pull Request #88 · maxmind/geoip-splunk

horgh · 2026-06-30T15:48:10Z

On Splunk Cloud Victoria, the geoipupdate_input modular input only runs on one search head cluster member, so other members never download the .mmdb files and the geoip command fails on them. Prior attempts to make the modular input run on every member (run_only_one in inputs.conf.spec and as a scheme argument) did not fix it.

In the Splunk community Slack, it was suggested trying a scripted input, which may run on every SHC member. As an easily-revertible experiment, disable the modular ://default instance and add a [script://] stanza pointing at a thin wrapper (geoipupdate_script.py) that reads a session key from stdin (provided via passAuth) and calls the shared run_database_update(). The core update logic in geoipupdate_input.py is unchanged.

This is unverified: Splunk's docs confirm the mechanics (passAuth -> stdin session key, python.required honored for [script://]) but do not state that scripted inputs run on every SHC member, so it is pending a customer test.

Summary by CodeRabbit

New Features
- Added an alternate hourly update mechanism for GeoIP database updates in Splunk Cloud using a scripted-input approach.
- Updated configuration to support cluster-friendly execution while keeping updates functional.
Bug Fixes
- Disabled the default hourly update input by default during validation.
- Improved behavior and logging when the required session input is missing or blank.
Documentation
- Updated “Automatic Updates” guidance and added instructions for switching/reverting update modes.
Tests
- Added coverage for scripted-input session handling and update invocation behavior.

coderabbitai · 2026-06-30T15:48:49Z

📝 Walkthrough

Walkthrough

Adds geoipupdate_script.py, a scripted-input wrapper that reads a session key from stdin and calls run_database_update(). The modular input default instance is disabled in inputs.conf and replaced with a script:// stanza. Tests and docs were updated for the new path.

Changes

Scripted Input Variant for GeoIP Updater

Layer / File(s)	Summary
Scripted entry point and inputs wiring `geoip/package/bin/geoipupdate_script.py`, `geoip/package/default/inputs.conf`	The new script reads a session key from stdin, logs and exits on empty input, calls `run_database_update(session_key)` on valid input, and is wired in `inputs.conf` alongside the disabled default modular input.
stdin handling tests `tests/geoipupdate_script_test.py`	The test module sets up imports for `geoipupdate_script` and verifies the valid, empty, and whitespace stdin cases with mocked logging and update calls.
Docs and changelog updates `CLAUDE.md`, `CHANGELOG.md`	`CLAUDE.md` updates the automatic update description and adds the scripted-input variant section; `CHANGELOG.md` adds the unreleased `1.1.3` entry.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

oschwald

Poem

🐇 I hop through stdin, key in tow,
A script now starts the update flow.
One stanza sleeps, one stanza sings,
And every search head does its things.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: adding a scripted-input variant of the database updater.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch wstorey/scripted-input

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

gemini-code-assist

Code Review

This pull request introduces a scripted-input variant (geoipupdate_script.py) as an alternative to the modular input to test whether it runs on all search head cluster members in Splunk Cloud. The modular input is disabled by default, and the scripted input is enabled in inputs.conf. Feedback suggests improving the portability of the scripted input configuration by using a relative path (./bin/geoipupdate_script.py) instead of a hardcoded absolute path, and updating the documentation in CLAUDE.md to reflect this change.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CLAUDE.md`:
- Around line 65-85: The new scripted-input notes conflict with the existing
Search Head Cluster guidance in CLAUDE.md, so update the “Automatic Updates”
section to match the current behavior and remove the claim that run_only_one =
false guarantees every SHC node downloads its own databases. Make the documented
SHC update behavior consistent with the scripted-input experiment description,
and keep the guidance in sync using the relevant CLAUDE.md sections about
automatic updates and the scripted input variant.

In `@tests/geoipupdate_script_test.py`:
- Around line 15-20: The import-time path/module patching in
geoipupdate_script_test.py is leaking the fake solnlib into other tests because
sys.path and sys.modules are mutated during collection. Move the setup into a
scoped fixture that uses monkeypatch.syspath_prepend() and monkeypatch.setitem()
for the mocked modules, then import or reload geoipupdate_script inside that
fixture so each test gets an isolated environment.
- Around line 51-103: The main stdin-handling tests for geoipupdate_script.main
currently verify update and logging behavior but do not assert the wrapper stays
silent on stdout. Update the tests around geoipupdate_script.main to use capsys
and assert no stdout output on both the successful session-key path and the
empty/whitespace stdin paths, while still checking run_database_update and the
logger calls remain unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: abdf0623-8b4c-4131-b44a-059f2a86ead5

📥 Commits

Reviewing files that changed from the base of the PR and between 2a42f85 and f70a104.

📒 Files selected for processing (5)

CHANGELOG.md
CLAUDE.md
geoip/package/bin/geoipupdate_script.py
geoip/package/default/inputs.conf
tests/geoipupdate_script_test.py

coderabbitai · 2026-06-30T15:53:09Z

+# Add the lib and bin directories to the path for testing
+repo_root = Path(__file__).parent.parent
+lib_dir = repo_root / "geoip" / "package" / "lib"
+bin_dir = repo_root / "geoip" / "package" / "bin"
+sys.path.insert(0, str(lib_dir))
+sys.path.insert(0, str(bin_dir))


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Keep the import-time mocks scoped to each test.

This module mutates sys.path and sys.modules during collection, so the fake solnlib can leak into unrelated tests after this file is imported. Move the path/module patching into a fixture with monkeypatch.syspath_prepend() / monkeypatch.setitem(...) and import or reload geoipupdate_script inside that fixture instead.

Suggested direction

+import importlib +import pytest - -# Add the lib and bin directories to the path for testing -repo_root = Path(__file__).parent.parent -lib_dir = repo_root / "geoip" / "package" / "lib" -bin_dir = repo_root / "geoip" / "package" / "bin" -sys.path.insert(0, str(lib_dir)) -sys.path.insert(0, str(bin_dir)) +repo_root = Path(__file__).parent.parent +lib_dir = repo_root / "geoip" / "package" / "lib" +bin_dir = repo_root / "geoip" / "package" / "bin" @@ -mock_solnlib = MagicMock() -mock_solnlib.conf_manager = mock_conf_manager -mock_solnlib.soln_exceptions = mock_soln_exceptions -sys.modules["solnlib"] = mock_solnlib -sys.modules["solnlib.conf_manager"] = mock_conf_manager -sys.modules["solnlib.soln_exceptions"] = mock_soln_exceptions - -# Now we can import geoipupdate_script since solnlib is mocked -import geoipupdate_script # noqa: E402 # type: ignore[import-not-found] +@pytest.fixture +def geoipupdate_script_module(monkeypatch: MonkeyPatch): + mock_solnlib = MagicMock() + mock_solnlib.conf_manager = mock_conf_manager + mock_solnlib.soln_exceptions = mock_soln_exceptions + monkeypatch.syspath_prepend(str(lib_dir)) + monkeypatch.syspath_prepend(str(bin_dir)) + monkeypatch.setitem(sys.modules, "solnlib", mock_solnlib) + monkeypatch.setitem(sys.modules, "solnlib.conf_manager", mock_conf_manager) + monkeypatch.setitem(sys.modules, "solnlib.soln_exceptions", mock_soln_exceptions) + import geoipupdate_script + return importlib.reload(geoipupdate_script)

Also applies to: 40-48

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/geoipupdate_script_test.py` around lines 15 - 20, The import-time path/module patching in geoipupdate_script_test.py is leaking the fake solnlib into other tests because sys.path and sys.modules are mutated during collection. Move the setup into a scoped fixture that uses monkeypatch.syspath_prepend() and monkeypatch.setitem() for the mocked modules, then import or reload geoipupdate_script inside that fixture so each test gets an isolated environment.

On Splunk Cloud Victoria, the geoipupdate_input modular input only runs on one search head cluster member, so other members never download the .mmdb files and the geoip command fails on them. Prior attempts to make the modular input run on every member (run_only_one in inputs.conf.spec and as a scheme argument) did not fix it. In the Splunk community Slack, it was suggested trying a scripted input, which may run on every SHC member. As an easily-revertible experiment, disable the modular ://default instance and add a [script://] stanza pointing at a thin wrapper (geoipupdate_script.py) that reads a session key from stdin (provided via passAuth) and calls the shared run_database_update(). The core update logic in geoipupdate_input.py is unchanged. This is unverified: Splunk's docs confirm the mechanics (passAuth -> stdin session key, python.required honored for [script://]) but do not state that scripted inputs run on every SHC member, so it is pending a customer test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

oschwald

This looks fine to me for a release to test out whether it helps. Claude had a few comments, but they seem intentional or inconsequential.

oschwald · 2026-06-30T16:25:36Z


 The `geoipupdate_input` modular input downloads and updates databases automatically:
- A default input (`geoipupdate_input://default`) is enabled out of the box (no UI, runs in background)
+- A default input (`geoipupdate_input://default`) provides the update logic (no UI, runs in background)


Claude Code, on behalf of Greg:

This one-line softening ("provides the update logic") isn't enough — the rest of this Automatic Updates section still reads as present-tense fact while the modular input is now disabled = 1:

L55: "The geoipupdate_input modular input downloads and updates databases automatically" — it no longer does; the scripted wrapper does.

L60: "Default update interval is 3600 seconds" — the active path now runs at interval = 60.

L61: "run_only_one = false ... ensures each node in a SHC downloads its own databases" — this directly contradicts the new section's premise (that it only runs on one member, which is why the experiment exists).

Suggest adding a forward-pointer at the top of this section and qualifying these bullets as the modular input's intended behavior when enabled.

Yeah, we can clean this up more if we keep this. I don't think we'll keep it though.

oschwald · 2026-06-30T16:25:36Z

+# The leading "./" resolves relative to this app's directory, so the path keeps
+# working regardless of the installed app folder name (Splunk's documented form).
+[script://./bin/geoipupdate_script.py]
+interval = 60


Claude Code, on behalf of Greg:

interval = 60 is 60x the documented/modular value (3600s). pygeoipupdate does hash/up-to-date checks so it rarely re-downloads, but this still polls MaxMind every minute per SHC member. If 60 was chosen for fast feedback during the experiment, a one-line comment saying so would keep it from reading as a leftover from local testing — otherwise consider aligning to 3600.

Oh, this one I actually meant to leave at 3600.

oschwald · 2026-06-30T16:25:36Z

+# For Splunk Free, we apparently must use "admin", otherwise we get no session
+# key when running the input. For Splunk Enterprise (and I think Splunk Cloud),
+# we can use splunk-system-user.
+passAuth = splunk-system-user


Claude Code, on behalf of Greg:

This comment describes passAuth = admin but the stanza uses splunk-system-user, and it's doubly hedged ("apparently", "I think"). It documents an out-of-scope, unverified path (the experiment targets Splunk Cloud Victoria) — the kind of comment that rots. Suggest removing it, or rewording to a verified, de-hedged statement if the Free-vs-Enterprise distinction is worth keeping.

Yeah, I am not sure about exact behaviour unfortunately. The comment describes how I think it works based on slack searches. I think what we have is the best thing to try.

I didn't intend to commit this as 60. I had it 60 for testing.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@geoip/package/bin/geoipupdate_script.py`:
- Around line 24-25: The stdin parsing in geoipupdate_script.py is treating the
full passAuth payload as the session key, so update the input handling to
extract the token from the sessionKey= prefix before calling
run_database_update(). Locate the logic around the raw stdin read and
session_key assignment, and make sure only the token value is kept after
stripping whitespace and removing the sessionKey= marker.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 18dad701-3a70-46a5-bf8b-8632a4c2b2e0

📥 Commits

Reviewing files that changed from the base of the PR and between f70a104 and 2a6228e.

📒 Files selected for processing (5)

CHANGELOG.md
CLAUDE.md
geoip/package/bin/geoipupdate_script.py
geoip/package/default/inputs.conf
tests/geoipupdate_script_test.py

coderabbitai · 2026-06-30T19:04:38Z

+    raw = sys.stdin.read()
+    session_key = raw.strip()


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Splunk scripted input passAuth stdin format sessionKey= prefix or raw token

💡 Result:

When you enable passAuth in your Splunk scripted input configuration, splunkd provides a session key to your script via stdin [1][2]. The exact format delivered is sessionKey=YOUR_TOKEN_HERE [3]. To use this token in your script, you must read the line from stdin and parse it to remove the sessionKey= prefix [3][1]. Common implementation patterns include: 1. Reading the stream: Since the token is delivered as the first (and often only) line, you can capture it using standard input reading methods [1]. 2. Parsing the prefix: Because the input arrives exactly as sessionKey=, your script should programmatically strip the sessionKey= string to extract the raw token [3]. A simple approach in Python is to use.replace('sessionKey=', '') or string slicing if you are certain of the format [1]. 3. Utilization: Once extracted, this raw session key can be used to authenticate requests to the Splunk REST API [1][2]. Note that this behavior applies to traditional scripted inputs [2]. If you are developing Modular Inputs, the configuration and session key are passed to your script via stdin in XML format, which is parsed differently [4][5][6]. In those cases, the session key is contained within a <session_key> XML tag, which should be parsed using an XML library rather than manual string manipulation [4].

Citations:

1: https://community.splunk.com/t5/Splunk-Dev/Scripted-Input-Python-SDK-passAuth-Not-Working/m-p/147796

2: https://community.splunk.com/t5/Alerting/Use-an-alert-to-trigger-scripted-input/m-p/98136

3: https://community.splunk.com/t5/Security/How-to-get-session-key-auth-token-for-mod-input-script/m-p/561575

4: https://help.splunk.com/en/splunk-cloud-platform/developing-views-and-apps-for-splunk-web/10.1.2507/modular-inputs/create-modular-inputs

5: https://help.splunk.com/en/splunk-enterprise/developing-views-and-apps-for-splunk-web/9.0/modular-inputs/create-modular-inputs

6: https://help.splunk.com/en/splunk-cloud-platform/developing-views-and-apps-for-splunk-web/10.3.2512/modular-inputs/modular-inputs-examples

🏁 Script executed:

#!/bin/bash set -euo pipefail # Inspect the target file and locate related parsing logic. git ls-files geoip/package/bin/geoipupdate_script.py wc -l geoip/package/bin/geoipupdate_script.py cat -n geoip/package/bin/geoipupdate_script.py | sed -n '1,220p' # Search for any handling of sessionKey/passAuth in the repo. rg -n "sessionKey|passAuth|stdin" geoip -S

Repository: maxmind/geoip-splunk

Length of output: 3098

Parse the sessionKey= prefix from stdin
passAuth sends sessionKey=<token> on stdin for scripted inputs, so raw.strip() passes the literal prefix through to run_database_update() and turns a valid token into an auth failure. Strip the sessionKey= prefix before using the value.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@geoip/package/bin/geoipupdate_script.py` around lines 24 - 25, The stdin parsing in geoipupdate_script.py is treating the full passAuth payload as the session key, so update the input handling to extract the token from the sessionKey= prefix before calling run_database_update(). Locate the logic around the raw stdin read and session_key assignment, and make sure only the token value is kept after stripping whitespace and removing the sessionKey= marker.

gemini-code-assist Bot reviewed Jun 30, 2026

View reviewed changes

Comment thread geoip/package/default/inputs.conf Outdated

Comment thread CLAUDE.md Outdated

coderabbitai Bot reviewed Jun 30, 2026

View reviewed changes

horgh force-pushed the wstorey/scripted-input branch from f70a104 to 8b26cd0 Compare June 30, 2026 15:58

oschwald previously approved these changes Jun 30, 2026

View reviewed changes

Increase interval to 3600

2a6228e

I didn't intend to commit this as 60. I had it 60 for testing.

horgh dismissed oschwald’s stale review via 2a6228e June 30, 2026 19:01

coderabbitai Bot reviewed Jun 30, 2026

View reviewed changes

oschwald approved these changes Jun 30, 2026

View reviewed changes

oschwald merged commit d69c898 into main Jun 30, 2026
10 checks passed

oschwald deleted the wstorey/scripted-input branch June 30, 2026 20:44

Uh oh!

Conversation

horgh commented Jun 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

oschwald left a comment

Choose a reason for hiding this comment

Uh oh!

oschwald Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

horgh Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

oschwald Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

horgh Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

oschwald Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

horgh Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

horgh commented Jun 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 30, 2026 •

edited

Loading

oschwald Jun 30, 2026 •

edited

Loading

oschwald Jun 30, 2026 •

edited

Loading

oschwald Jun 30, 2026 •

edited

Loading