ScrubbingOptions extra_patterns: scrub-message leaks the matched credential substring #1909

@msrdic

Bug Report

Summary

When using logfire.ScrubbingOptions(extra_patterns=[...]), Logfire replaces matched attribute values with [Scrubbed due to '<reason>']. The <reason> field literally contains the matched substring, exposing the credential the scrubbing was meant to hide.

Reproduction

import logfire

logfire.configure(
    scrubbing=logfire.ScrubbingOptions(
        # regex string matching embedded URL credentials
        extra_patterns=[r"://[^:@/]+:[^@/]+@"]
    )
)

with logfire.span("connect", config_url="postgresql://admin:s3cr3t_pass@db.internal:5432/mydb"):
    pass

Inspect the exported span. The config_url attribute will contain:

[Scrubbed due to '://admin:s3cr3t_pass@']

The credential admin:s3cr3t_pass is visible in the scrub marker itself.
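The leak can be checked mechanically without a Logfire backend. The following is a minimal sketch using only the standard library: `marker_leaks` is a hypothetical helper (not part of logfire), and the marker string is copied from the report above rather than regenerated.

```python
import re

# Pattern and attribute value copied from the reproduction above.
URL_CREDENTIALS = re.compile(r"://[^:@/]+:[^@/]+@")
CONFIG_URL = "postgresql://admin:s3cr3t_pass@db.internal:5432/mydb"


def marker_leaks(marker: str, original: str, pattern: re.Pattern[str]) -> bool:
    """Return True if any substring of `original` matched by `pattern` appears in `marker`.

    Hypothetical helper for audits and regression tests; not a logfire API.
    """
    return any(m.group(0) in marker for m in pattern.finditer(original))


# Marker text as reported in this issue.
reported_marker = "[Scrubbed due to '://admin:s3cr3t_pass@']"

print(marker_leaks(reported_marker, CONFIG_URL, URL_CREDENTIALS))  # True
print(marker_leaks("[Scrubbed]", CONFIG_URL, URL_CREDENTIALS))     # False
```

A check like this could serve as a regression test once the marker format changes.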

Expected Behaviour

The scrub marker must not echo the matched text. Acceptable alternatives:

| Format | Notes |
|---|---|
| [Scrubbed] | Simplest |
| [Scrubbed: matched sensitive pattern] | Adds context without leaking the value |
| [Scrubbed: pattern matched at offset N, length M] | Structural info only |
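To make the three formats concrete, here is a rough sketch of how each could be built from a standard `re.Match` object. The function names are illustrative assumptions, not logfire internals.

```python
import re


def marker_plain() -> str:
    """Simplest marker: no information about the match at all."""
    return "[Scrubbed]"


def marker_generic() -> str:
    """Marker with context but no value-derived content."""
    return "[Scrubbed: matched sensitive pattern]"


def marker_structural(match: re.Match[str]) -> str:
    """Marker exposing only where the match occurred and how long it was."""
    length = match.end() - match.start()
    return f"[Scrubbed: pattern matched at offset {match.start()}, length {length}]"


pattern = re.compile(r"://[^:@/]+:[^@/]+@")
value = "postgresql://admin:s3cr3t_pass@db.internal:5432/mydb"
match = pattern.search(value)
assert match is not None

print(marker_structural(match))
# [Scrubbed: pattern matched at offset 10, length 21]
```

Note that the structural variant still reveals the combined length of the matched text, which may or may not be acceptable depending on the threat model.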

Why This Matters

The scrub message is often written to the same telemetry backend the operator reads to investigate incidents. An operator with read access to traces sees [Scrubbed due to '://user:password@db/…'] — the credential is right there, defeating the purpose of scrubbing. This is worse than no scrubbing: it draws the eye to a well-formatted credential.

This was discovered during a production logging audit. Attributes like db.connection_string (auto-instrumented by SQLAlchemy via OpenTelemetry) and config URL fields passed explicitly to spans were leaking URL-embedded credentials through this mechanism.

Suggested Fix

In the scrub-message construction, remove the matched substring from the replacement string. Something like:

# Before (leaks the matched text)
replacement = f"[Scrubbed due to '{matched_text}']"

# After (safe)
replacement = "[Scrubbed: matched sensitive pattern]"

If the reason string is meant to identify which pattern matched, use the pattern index or a user-supplied label — never the matched value itself.
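One possible shape for the label-based approach, sketched independently of logfire's actual code: `LabeledPattern` and `scrub_value` are hypothetical names introduced here for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledPattern:
    """A regex paired with a user-supplied, non-sensitive label (hypothetical type)."""
    label: str
    pattern: re.Pattern[str]


def scrub_value(value: str, patterns: list[LabeledPattern]) -> str:
    """Replace `value` with a marker naming the first matching pattern.

    The marker is built from the pattern's index and label only,
    never from the matched text.
    """
    for index, labeled in enumerate(patterns):
        if labeled.pattern.search(value):
            return f"[Scrubbed: pattern #{index} ({labeled.label})]"
    return value


PATTERNS = [
    LabeledPattern("url-credentials", re.compile(r"://[^:@/]+:[^@/]+@")),
]

print(scrub_value("postgresql://admin:s3cr3t_pass@db.internal:5432/mydb", PATTERNS))
# [Scrubbed: pattern #0 (url-credentials)]
```

The operator can still tell which rule fired, while the marker contains nothing derived from the attribute value.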

Workaround

Avoid extra_patterns for URL-credential fields passed as span attributes. Instead, strip the credential before it reaches the span, for example with request_attributes_mapper (FastAPI/ASGI), or supply a scrubbing callback that returns a fixed placeholder such as [REDACTED] so the default marker is never generated. For attributes set by OTel auto-instrumentation (db.connection_string, http.url), the scrub-message leak still applies until this is fixed upstream, so treat [Scrubbed due to …] markers as unreliable evidence of redaction.
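The callback workaround might look roughly like the sketch below. It assumes that `ScrubbingOptions` accepts a `callback` argument and that a non-None return value replaces the scrubbed attribute value in place of the default marker; both assumptions should be checked against the installed logfire version. `redact_callback` and `configure_logfire_with_callback` are illustrative names.

```python
from typing import Any


def redact_callback(match: Any) -> str:
    """Return a fixed placeholder for every scrubbing match.

    `match` is whatever object logfire passes to the callback; it is
    deliberately ignored so nothing from the matched value is echoed.
    """
    return "[REDACTED]"


def configure_logfire_with_callback() -> None:
    """Wire the callback into logfire (call this once at application startup)."""
    import logfire  # imported lazily so the helper above stays dependency-free

    logfire.configure(
        scrubbing=logfire.ScrubbingOptions(
            extra_patterns=[r"://[^:@/]+:[^@/]+@"],
            callback=redact_callback,  # assumed parameter; see lead-in
        )
    )
```

Because the callback ignores its argument, it applies the same placeholder to every match, including matches from the default patterns; a narrower callback could inspect the match before deciding.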

Environment

  • Affects all versions of logfire that support extra_patterns in ScrubbingOptions
  • Python 3.11+
