Commit f44528b
ANE-1036: Glob file matching for exclusion filters in .fossa.yml (#1703)
* ANE-1036: Support glob patterns in .fossa.yml path filters
`paths.only` and `paths.exclude` entries that contain `*`, `?`, or `[`
are now parsed as System.FilePattern globs via the existing Data.Glob
wrapper. Entries without glob metacharacters keep their prior
"directory and all children" semantics, so this change is
backward-compatible.
Adds a PathFilter sum type at the config layer, threads a parallel
list of glob patterns through FilterCombination, and extends
pathAllowed / applyComb to include-or-exclude directories whose
relative path matches a glob. Matching normalizes the trailing slash
that Path.toString appends to Dir paths so patterns like
`node_modules/*` match as users expect.
Docs and fossa-yml.v3.schema.json updated.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Fix Windows glob path matching and link changelog entry
Normalize backslashes to forward slashes before glob matching so
user-supplied patterns like `node_modules/*` match the backslash-
separated paths produced by `Path Rel Dir` on Windows.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Add glob filter test coverage for ?, character classes, root-level globs
Cover '?' wildcards, '[...]' character classes, root-anchored single-segment
globs, an explicit trailing-slash normalization regression guard, and a
four-way mix of include/exclude globs and concrete paths.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Document System.FilePattern '?'/'[...]' literal-match limitation in tests
System.FilePattern only implements `*` and `**`; `?` and `[...]` are
matched literally rather than as wildcards/character classes. The two
new tests asserted wildcard semantics and were red on CI. Flip the
expectations so they document the actual behavior and serve as a
regression guard if the engine ever gains those features.
* Normalize backslashes in glob patterns; correct ?/[] doc semantics
Extend the Windows portability fix from db3921b (which normalized the
path side) to also normalize the user-supplied pattern side. A Windows
user typing `node_modules\*` in `.fossa.yml` now gets the same glob as
`node_modules/*`. The shared normalization is lifted into a top-level
`normalizeSlashes` helper used by both `FromJSON PathFilter` and
`globMatchesDir`.
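A minimal sketch of what such a helper could look like. Only the name `normalizeSlashes` comes from this commit; the `String` signature and the body are assumptions for illustration, and the real helper may operate on `Text` instead.

```haskell
-- Assumed shape of the shared helper: rewrite every backslash to a
-- forward slash so Windows-style patterns and paths compare equal to
-- their POSIX-style spelling.
normalizeSlashes :: String -> String
normalizeSlashes = map toForward
  where
    toForward c = if c == '\\' then '/' else c
```

Under this sketch, `normalizeSlashes "node_modules\\*"` evaluates to `"node_modules/*"`, and a string that already uses forward slashes is returned unchanged.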
Also correct the glob-pattern documentation: it previously claimed
`?` matches a single character, but `System.FilePattern` only
implements `*` and `**` (test/Discovery/FiltersSpec.hs already asserts
this). The doc now says `?` and `[...]` are matched as literals and
notes that backslashes in patterns are normalized.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Limit glob trigger to `*`; drop unreachable `?`/`[]` literal tests
System.FilePattern only implements `*` and `**` — `?` and `[...]` are
matched as literal characters, not single-character wildcards or
character classes (per commit 51f4f66). Routing strings containing `?`
or `[` to the glob branch was therefore a no-op for matching, and on
Windows it silently swallowed `parseRelDir` errors for `?` (a reserved
NTFS character) by producing a glob that could never match a real path.
Shrink the trigger to a simple `*`-in-string check inline. Strings with
`?` or `[` now go through `parseRelDir` like any other concrete path.
The two `FiltersSpec` tests that documented FilePattern's literal
handling of `?`/`[]` go away — they tested an internal-engine quirk that
is no longer reachable through the user-facing config. Doc + JSON
schema updated to match (and the `*` description is fixed: "any
sequence of characters within a single path segment", not the
inaccurate "a single path segment").
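The narrowed trigger can be sketched roughly as follows. The `PathFilter` type here is a simplified, hypothetical stand-in (the real one wraps a parsed relative directory and a `Data.Glob` value, not raw strings), and `classify` is an illustrative name that does not appear in the codebase.

```haskell
-- Hypothetical, String-based stand-in for the PathFilter sum type.
data PathFilter
  = ConcretePath String -- "this directory and all children" semantics
  | GlobPattern String  -- handed to the glob engine
  deriving (Eq, Show)

-- Only a literal '*' routes an entry to the glob branch. Entries that
-- contain '?' or '[' stay on the concrete-path branch, where the real
-- code would run them through parseRelDir.
classify :: String -> PathFilter
classify entry =
  if '*' `elem` entry
    then GlobPattern entry
    else ConcretePath entry
```

With this sketch, `classify "vendor/**"` yields `GlobPattern "vendor/**"`, while `classify "src/[gen]"` yields `ConcretePath "src/[gen]"`.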
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Doc: show concrete example directories for each glob pattern
Add a short bulleted list under the YAML example illustrating what each
pattern actually matches in a real tree (deep Go vendoring, a scoped
npm package's transitive plugin tree, and a generated-proto subtree).
The list also calls out that `build/generated/*` is anchored at the
root and that the walker prunes the matched directory's whole subtree.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Echo active path filters at analyze startup
Walker-level prunes from `paths.only`/`paths.exclude` short-circuit
discovery before any strategy sees the excluded directory, so the user
gets no log line and no "Skipping ..." trail telling them why a
project they expected didn't appear in the analyze summary. The
existing post-summary note even points at `fossa list-targets` as a
workaround, but that command deliberately ignores all filters.
Add a small `logActivePathFilters` helper invoked once at the start of
`analyze` that prints the configured include/exclude paths and globs
(skipping empty kinds). Output for a `.fossa.yml` containing
`paths.exclude: ["**/zip/**"]` is now:
    [INFO] Active exclude glob filters: **/zip/**
Per-prune logging would be more direct but requires propagating
`Has Logger sig m` through `walkWithFilters'`, `simpleDiscover`, and
every strategy's `findProjects`/`discover` (~35 files). Saving that
for a follow-up; this gives the user the single piece of information
needed to map a missing project back to a configured filter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Surface walker-pruned subtrees and add walker filter test coverage
Walker-level path-filter prunes were previously silent: pathFilterIntercept
short-circuits before any strategy reaches the directory, and there was no
log trail explaining why a project the user expected didn't appear in the
analyze summary. The post-summary note even pointed at `fossa list-targets`
as a workaround, which deliberately ignores all filters.
Wire `Has Logger sig m` through `walkWithFilters'` and `pathFilterIntercept`
so the walker can speak. Per-prune log lines fire at debug level (one per
strategy, ~28 strategies = noisy at info). Add `enumeratePrunedSubtrees`,
a one-shot pre-discovery walk that returns the list of subtrees the filter
will reject; analyze invokes it once before strategies run and logs each
pruned path at info level. Result for a `.fossa.yml` with
`paths.exclude: ["**/zip/**"]`:
    Active exclude glob filters: **/zip/**
    Skipping path "zip/" (excluded by paths filter)
The Has Logger ripple touches every strategy that uses walkWithFilters'
(~32 single-line constraint sites, ~7 multi-line). Each carrier already
provides Logger via DiscoverTaskEffs, so the change is purely a constraint
propagation — no new effects, no runtime cost.
Add three Walker spec tests (test/Discovery/WalkSpec.hs):
- include-path filter: mirror of the existing exclude test, asserts the
  walker accepts ancestors + included subtree and prunes siblings.
- WalkSkipSome merge: strategy returns WalkSkipSome ["a"], filter
  excludes "b", both should be pruned. Catches the
  `pathFilterIntercept`/`skipDisallowed` merge logic.
- YAML-to-walker end-to-end: parses a YAML config string with a glob
  exclude, runs it through `collectConfigFileFilters`, executes the
  walker, asserts pruning. Catches "globs parse but never reach the
  walker" wiring regressions, the exact class of bug we hit earlier.
Cannot exercise the new tests locally because the test binary's startup
reads test/Container/testdata/emptypath.tar (a git-LFS pointer not
materialized in this environment); CI's Linux/macOS/Windows jobs will
validate. Library and test-binary builds pass with no warnings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Revert Has Logger walker propagation; gate prune-enumeration walk
Per-prune `logDebug` events from `pathFilterIntercept` were the only thing
the 39-file `Has Logger sig m` ripple bought us. The user-facing value —
"each pruned subtree shows up once at info" — comes from
`enumeratePrunedSubtrees` + `logPrunedSubtrees` in `App.Fossa.Analyze`,
and that path doesn't need the constraint propagated: `walk'` works
without `Logger`, and the logging happens in `analyze` where Logger was
always in scope.
Revert the propagation in `walkWithFilters'` and `pathFilterIntercept`
back to the prior `Applicative m, Monoid o` shape, and revert all 39
strategy files (and their `Effect.Logger` import additions) to master.
Keep `enumeratePrunedSubtrees`, `logPrunedSubtrees`, and the
"Active filters" startup line — they're the user-visible win.
Add a no-filters short-circuit to `logPrunedSubtrees` so we don't pay
for an extra walk when the user has no `paths.only`/`paths.exclude`
entries configured.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix hlint hits in FiltersSpec backslash-normalization test
CI's hlint job flagged two restricted patterns I introduced:
- Text.pack should be toText (project-wide convention).
- bare error is a restricted function in this codebase.
Replace Text.pack with toText, and change the failure branch from
error to Nothing so parse returns Maybe PathFilter. The shouldBe
assertion still works: it now compares two Maybe PathFilter values,
both of which are Just _ for valid input.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix duplicate viaShow import in Analyze.hs
CI's -Werror=unused-imports flagged the existing
`Prettyprinter.viaShow` as unused because the same name was also
imported (and used) from Effect.Logger when I added it for
logPrunedSubtrees. Drop my Effect.Logger.viaShow addition; the
Prettyprinter one was already in scope and is what's actually used.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Condense changelog entry for 3.17.6 to one line
Drop the long-form description of glob semantics and the analyze
visibility entry; "now accept glob patterns" plus the PR link gives
readers everything they need, and full docs live in
docs/references/files/fossa-yml.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Address CodeRabbit review: include-glob reachability + JSON shape + filter scope
Three issues from CodeRabbit:
1. Startup logs used `filters`, but discovery uses `discoveryFilters`
   (which is `mempty` under `--no-discovery-exclusion`). Compute
   `discoveryFilters` once near the top of `analyze` and use it for
   `logActivePathFilters` / `logPrunedSubtrees` so the log line agrees
   with what discovery actually applies.
2. `FilterCombination`'s ToJSON used `genericToEncoding defaultOptions`,
   which emits the new `_combinedPathGlobs` field as `[]` even when no
   globs are configured, so every serialized payload changes shape vs
   pre-glob-support runs. Replace with a hand-written `toEncoding` that
   omits the field when the list is empty.
3. Include globs only matched the directory itself; ancestors-on-the-way
   to a match and descendants-after-a-match weren't allowed. So
   `paths.only: ["apps/*"]` rejected `apps/`, the walker never
   descended, and every project under `apps/` was silently dropped.
   Add `isParentOfIncludedGlob` (path's segments are a prefix of the
   glob's literal directory prefix, i.e. the segments before the first
   `*`) and `isChildOfIncludedGlob` (any proper ancestor of `path`
   matches a glob) to `pathAllowed`. Cover both with new tests in
   `FiltersSpec.hs`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Allow any path as ancestor when include-glob's literal prefix is empty
The new "treats a leading '**' include glob as accepting any ancestor"
test was failing on CI. With include `**/service/**`, the walker has
no idea which subtree contains a `service/` directory and must descend
everywhere on the way down. My initial pathSegmentsPrefixOf check
returned False whenever the path was non-empty and the glob's literal
directory prefix was empty, so the walker refused to descend at all.
Fix: when the literal prefix is empty (because the glob's first
segment is wildcarded — `**`, `*`, `*foo`, etc.), accept any path as
a candidate ancestor. The actual match still has to fire via
isIncludedByGlob when the walker reaches a real match. Costs extra
walking when the user writes a leading-wildcard include, but that's
the only correct behavior — there's no way to know up front where a
match lives.
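A sketch of the ancestor check described above, assuming paths and globs are plain '/'-separated strings. `isParentOfIncludedGlob` is named in this commit, but its signature here, along with the `segments` and `literalPrefix` helpers, is a hypothetical simplification rather than the actual implementation.

```haskell
import Data.List (isPrefixOf)

-- Split a '/'-separated relative path or pattern into non-empty segments.
segments :: String -> [String]
segments s = case break (== '/') s of
  (seg, []) -> keep seg []
  (seg, _ : rest) -> keep seg (segments rest)
  where
    keep seg acc = if null seg then acc else seg : acc

-- Leading segments of a glob that contain no '*'.
literalPrefix :: String -> [String]
literalPrefix = takeWhile (notElem '*') . segments

-- Could the walker still reach a match by descending through this path?
-- An empty literal prefix (leading wildcard) accepts every path.
isParentOfIncludedGlob :: String -> String -> Bool
isParentOfIncludedGlob glob path =
  null prefix || (segments path `isPrefixOf` prefix)
  where
    prefix = literalPrefix glob
```

In this sketch, `isParentOfIncludedGlob "apps/*" "apps/"` is `True` and `isParentOfIncludedGlob "apps/*" "libs/"` is `False`, while `isParentOfIncludedGlob "**/service/**" "libs/"` is `True` because the literal prefix is empty.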
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Address CodeRabbit nitpicks: stdlib isPrefixOf + drop list comprehension
Two style fixes from CodeRabbit's second review:
1. `pathSegmentsPrefixOf`'s local `isPrefixOfList` was a hand-rolled
   reimplementation of `Data.List.isPrefixOf`. Drop the where-clause
   and import the standard one (Data.List was already imported).
2. `properAncestors` used a list comprehension to build prefix lists,
   which the project's coding guidelines disallow ("avoid partial
   functions, list comprehensions, and match guards"). Replace with
   `map (`take` segs) [1 .. length segs - 1]`.
Both behavior-equivalent.
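For illustration, the comprehension-free form of `properAncestors` can be exercised on its own. The `[String]` segment representation and the `isChildOf` helper are assumptions made for this sketch, not the code in the repository.

```haskell
-- Every proper, non-empty prefix of a path's segment list.
-- Same result as: [take n segs | n <- [1 .. length segs - 1]]
properAncestors :: [String] -> [[String]]
properAncestors segs = map (`take` segs) [1 .. length segs - 1]

-- Hypothetical helper: a path is a child of an included glob when some
-- proper ancestor satisfies the supplied match predicate.
isChildOf :: ([String] -> Bool) -> [String] -> Bool
isChildOf matches = any matches . properAncestors
```

Here `properAncestors ["a", "b", "c"]` evaluates to `[["a"], ["a", "b"]]`, and both the empty list and a single-segment list yield `[]` because the range `[1 .. length segs - 1]` is empty.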
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Drop logActivePathFilters per review feedback
Reviewer flagged the startup "Active <kind> filters: ..." log lines
as redundant with the user's own .fossa.yml — and at info level, more
noise than signal. The actually-useful info is "what got pruned in
this run," which logPrunedSubtrees already provides.
Remove logActivePathFilters and its call site, plus the imports it
was the only consumer of (Data.Glob.unGlob, Data.Text, Text.intercalate).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Use unsnoc instead of double-reverse in trimTrailingSlash
Per review feedback. The original `case reverse s of '/' : rest -> reverse rest`
walks the list twice and allocates an intermediate reversed copy.
Replacing with an unsnoc-based version traverses once.
Project supports `base >= 4.15` and `Data.List.unsnoc` is base-4.19+
(GHC 9.8), so I can't import it directly. Inline a tiny helper that
matches the stdlib implementation.
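A sketch of the single-traversal approach, assuming `trimTrailingSlash` works on `String`. The local `unsnoc` below is written in the spirit of the base implementation but is not copied from it (it is slightly stricter), and the exact code in the commit may differ.

```haskell
-- Local unsnoc, since Data.List.unsnoc is only available from base-4.19.
-- Splits a list into its initial part and its last element, if any.
unsnoc :: [a] -> Maybe ([a], a)
unsnoc = foldr step Nothing
  where
    step x Nothing = Just ([], x)
    step x (Just (initial, final)) = Just (x : initial, final)

-- Drop one trailing '/' if present; leave everything else untouched.
trimTrailingSlash :: String -> String
trimTrailingSlash s = case unsnoc s of
  Just (rest, '/') -> rest
  _ -> s
```

With this sketch, `trimTrailingSlash "node_modules/"` gives `"node_modules"`, and inputs without a trailing slash (including the empty string) come back unchanged.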
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Rename glob ancestor-reachability test + clarify the comment
Match the existing concrete-path "should include all parents" naming
convention, and rewrite the comment so it's clear pathAllowed is the
walker's traversal predicate (not a "this path matches the filter"
check). The previous wording made it look like the assertions were
about glob matches.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 56cc319 commit f44528b
10 files changed
Lines changed: 558 additions & 28 deletions
File tree
- docs/references/files
- src
- App/Fossa
- Config
- Discovery
- test/Discovery