Skip to content

fix(errorreport): redact installer secrets from crash reports#48

Merged
jigar-arc10 merged 3 commits into
mainfrom
jigar/akhn-243-scrub-error-report-tokens
Jun 22, 2026
Merged

fix(errorreport): redact installer secrets from crash reports#48
jigar-arc10 merged 3 commits into
mainfrom
jigar/akhn-243-scrub-error-report-tokens

Conversation

@jigar-arc10

Copy link
Copy Markdown
Collaborator

The installer journal is captured verbatim into crash reports via recent_syslog(re.compile(".")), so the installation key / install_id / auth tokens could leak into a shared report. Added redact_secrets() (value-based redaction of the on-disk key + install_id, plus targeted Authorization: Bearer / DPoP / JWT patterns) and wrapped the journal capture in it.

Refs AKHN-243

@shimpa1 shimpa1 assigned shimpa1 and jigar-arc10 and unassigned shimpa1 Jun 18, 2026
@shimpa1

shimpa1 commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

hi @jigar-arc10

⏺ Yes — PR #48 narrows the leak but does not fully close it. I traced the whole crash-report assembly. Here's what's actually in a report and what's covered.

What the report attaches (and what #48 redacts)

_bg_add_info + _bg_attach_hook (errorreport.py:162, :452) build a report from many fields. Only InstallerJournal is passed through redact_secrets (errorreport.py:171). Everything registered via
note_file_for_apport is attached verbatim by attach_file_if_exists (errorreport.py:456) with no redaction:

  • InstallerServerLog → /var/log/installer/subiquity-server-debug.log
  • InstallerServerLogInfo → …-info.log (server.py:257-258)
  • BootstrapLog, CurtinLog, curtin configs, NetplanConfig, etc.

What's actually secret in those un-redacted files

The homenode-token controller logs the key/install_id truncated, not in full:

  • homenode_token.py:122 → "check_token_GET called with token: %s", token[:10] + "..."
  • :137 → "Installation key validation successful: %s", token[:10] + "..."
  • :144 → install_id[:10] + "..."

Those records land in both the journal and the server log file. So:

  • First 10 chars of the installation key + install_id DO leak via InstallerServerLog/InstallerServerLogInfo, which redact_secrets never touches. For a 5-word key, 10 chars ≈ the first word-and-a-half — partial,
    not the whole secret, but it meaningfully shrinks brute-force space and fingerprints the install.
  • Worse, redact_secrets does value-based replacement of the full on-disk value (text.replace(value, …)). A 10-char prefix is a different string, so even in the now-"redacted" InstallerJournal the truncated
    prefix survives. The redaction only catches the full key if the full key ever appears verbatim in the journal.

What does NOT leak (good news)

  • Full installation key: the only place it sits in cleartext is /var/log/installer/autoinstall-user-data (install.py:178, written via make_autoinstall()). That file is 0o400 root and is not registered with
    note_file_for_apport, so it's never attached to the report. The full autoinstall yaml is also write_file'd to cloud_autoinstall_path but only log.debug("autoinstall found in cloud-config") is logged — the body
    isn't (server.py:880-894). So the full key stays out of both logs and the report. ✓
  • Kernel cmdline: only carries akash.auto-config/akash.from-windows flags; the key comes from install-config.json/TUI, never /proc/cmdline. So ProcKernelCmdLine is clean. ✓
  • Auth Bearer/DPoP/JWT tokens: these are minted by helper.sh during first-boot (post-reboot, installed system) — subiquity is gone by then, so they never appear in a subiquity crash report. The three regexes in
    fix(errorreport): redact installer secrets from crash reports #48 are guarding against secrets that don't actually occur in this report. Harmless, but they're not where the residual risk is.

Verdict

The real residual exposure after #48 is the truncated (10-char) key/install_id prefix in the attached InstallerServerLog* files, plus the structural fact that redaction is applied to one field instead of the
whole report.

Two concrete fixes, in order of value:

  1. Stop logging even the truncated prefix. Change homenode_token.py:122/137/144 to log a length or a salted hash, not token[:10]. This removes the secret at the source so it can't reach any attachment.
  2. Apply redact_secrets to the whole report, not just the journal — e.g. run it over each string field (or over the registered log file contents) before self.pr.write() / upload, rather than wrapping only
    recent_syslog(...). That also future-proofs against a new log line leaking the value verbatim.

#48 is still a net improvement and safe to merge — but if the ticket's goal is "secrets can't escape in a crash report," it isn't fully met yet. I'd raise fix #1 as a blocking comment since it's a one-line
change that closes the actual remaining vector.

check_token_GET logged token[:10]/install_id[:10]. The server debug/info
logs are attached to crash reports verbatim (InstallerServerLog*), and
redact_secrets runs only on the journal and matches only the full on-disk
value — so a logged prefix escaped unredacted. Log lengths instead; the
secret never reaches any attachment.

Refs AKHN-243
PR #47 (AKHN-240, 0600 perms) added test_homenode_token.py with
TestTokenFilePerms; this branch added it with TestCheckTokenLogging.
Combined both classes into one file with unified imports. homenode_token.py
auto-merged (0600 _write_private + length-only logging coexist).
@jigar-arc10 jigar-arc10 merged commit 20832e4 into main Jun 22, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants