Add user-facing documentation around fingerprinting snippet-scans (#1576)
* Add user-facing documentation around fingerprinting snippet-scans, add a proper `analyze --help` message around `x-snippet-scan`:
```
--x-snippet-scan
Experimental flag to enable snippet scanning to identify open source code snippets using fingerprinting.
```
* Provide Ficus with a sensible default endpoint
* Add default endpoint to pass to ficus
docs/references/subcommands/analyze.md
74 lines changed: 74 additions & 0 deletions
@@ -142,6 +142,80 @@ In addition to the [standard flags](#specifying-fossa-project-details), the anal
|`--strict`| Enforces strict analysis to ensure the most accurate results by rejecting fallbacks. When run with `--static-only-analysis`, the most optimal static strategy will be applied without fallbacks. |
### Snippet Scanning
Snippet scanning identifies potential open source code snippets within your first-party source code by comparing file fingerprints against FOSSA's knowledge base. This feature helps detect code that may have been copied from open source projects.
|`--x-snippet-scan`| Enable snippet scanning during analysis. This experimental feature fingerprints your source files and checks them against FOSSA's snippet database via ScanOSS integration. |
#### How Snippet Scanning Works

When `--x-snippet-scan` is enabled, the CLI:

1. **Hashes Files First**: Creates CRC64 hashes of all source files to identify which files need fingerprinting.
2. **Checks Necessity of Fingerprinting**: Checks with FOSSA servers to determine which file hashes are already known.
3. **Fingerprints New or Changed Files**: Uses the Ficus fingerprinting engine (written in Rust) to create cryptographic fingerprints only for files not previously seen.
4. **Filters Content**: By default, skips directories like `.git/` and other hidden directories. Exclude patterns from `vendoredDependencies.licenseScanPathFilters.exclude` in `.fossa.yml` are also applied (documented further below).
5. **Uploads Fingerprints**: Sends only the fingerprints to FOSSA's servers via ScanOSS integration.
6. **Receives Matches**: Gets back information about any matching open source components.
7. **Uploads Match Contents**: For files that have matches, uploads source code content temporarily to FOSSA servers.
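
The first two steps above (hash everything, then fingerprint only what the server has not seen) can be sketched roughly as follows. This is an illustrative sketch, not the CLI's implementation: the CRC-64 variant (ECMA-182 polynomial, zero initial value) is an assumption, and `files_needing_fingerprints` and `known_hashes` are hypothetical names standing in for the server round trip.

```python
# Assumption: CRC-64 with the ECMA-182 polynomial and a zero initial value.
# The variant actually used by the CLI is not stated in the documentation.
CRC64_POLY = 0x42F0E1EBA9EA3693
MASK64 = 0xFFFFFFFFFFFFFFFF


def crc64(data: bytes) -> int:
    """Bitwise (slow but simple) CRC-64 over a byte string."""
    crc = 0
    for byte in data:
        crc ^= byte << 56
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ CRC64_POLY) & MASK64
            else:
                crc = (crc << 1) & MASK64
    return crc


def files_needing_fingerprints(contents: dict[str, bytes],
                               known_hashes: set[int]) -> list[str]:
    """Return paths whose content hash is not already known.

    `known_hashes` is a hypothetical stand-in for the set of hashes the
    server reports as already seen.
    """
    return sorted(path for path, data in contents.items()
                  if crc64(data) not in known_hashes)
```

Only the paths returned by such a check would go on to the more expensive fingerprinting step, which is why the hash comparison comes first.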

#### Data Sent to FOSSA

**For Performance Optimization:**

- CRC64 hashes of all files, to avoid re-fingerprinting unchanged files.

**For Fingerprinting:**

- ScanOSS-compatible fingerprints of source code to identify matches.

**For Matched Files Only:**

- The actual source code content of files that contain snippet matches.

#### Data Retention

- **File Fingerprints**: Stored permanently for caching and performance optimization.
- **Source Code Content**: Stored temporarily for 30 days and then automatically deleted.
- **CRC64 Hashes**: Used to detect unchanged files. With 2^64 possible values, the likelihood of a CRC64 collision is extremely low.
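
To put "extremely low" in rough numbers, the standard birthday-bound approximation can be evaluated for a given number of files. This is a back-of-the-envelope sketch that assumes hash values are uniformly distributed; `collision_probability` is a hypothetical helper, not part of the CLI.

```python
import math


def collision_probability(num_files: int, hash_bits: int = 64) -> float:
    """Approximate chance that any two of `num_files` hashes collide.

    Uses the birthday bound p ~= 1 - exp(-n(n-1) / (2 * 2^bits)),
    assuming uniformly distributed hash values.
    """
    space = 2.0 ** hash_bits
    exponent = num_files * (num_files - 1) / (2.0 * space)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-exponent)


# One million hashed files: the probability is on the order of 1e-8.
print(collision_probability(1_000_000))
```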

#### Directory Filtering

By default, snippet scanning excludes common non-production directories and follows `.gitignore` patterns:

- Hidden directories.
- Globs as directed by `.gitignore` files.
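
As a rough illustration of the hidden-directory rule only, the sketch below treats a file as excluded when any of its parent directories starts with a dot. The function names are hypothetical, the exact rule Ficus applies is an assumption, and `.gitignore` handling is deliberately left out.

```python
from pathlib import PurePosixPath


def in_hidden_directory(path: str) -> bool:
    """True if any parent directory of `path` starts with a dot.

    Assumption: only directories are considered here, so a dotfile at the
    repository root is not treated as excluded by this sketch.
    """
    parent_parts = PurePosixPath(path).parent.parts
    return any(part.startswith(".") and part not in (".", "..")
               for part in parent_parts)


def visible_files(paths: list[str]) -> list[str]:
    """Keep only the paths that are not inside a hidden directory."""
    return [p for p in paths if not in_hidden_directory(p)]
```

For instance, `visible_files([".git/config", "src/main.c", "src/.cache/x.o"])` keeps only `src/main.c`.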

#### Custom Exclude Filtering

You can customize which files and directories are excluded from snippet scanning by configuring exclude filters in your `.fossa.yml` file. Note that snippet scanning currently only supports exclude patterns, not `only` patterns.

For example:

```yaml
version: 3
vendoredDependencies:
  licenseScanPathFilters:
    exclude:
      - "**/test/**"
      - "**/tests/**"
      - "**/spec/**"
      - "**/node_modules/**"
      - "**/dist/**"
      - "**/build/**"
      - "**/*.test.js"
      - "**/*.spec.ts"
```

**Important Notes:**

- Snippet scanning only uses the `exclude` filters from `licenseScanPathFilters`; `only` filters are ignored for this use case.
- Path filters use standard glob patterns (e.g., `**/*` for recursive matching, `*` for single-directory matching).
- The configuration goes in the `vendoredDependencies.licenseScanPathFilters.exclude` section.
- These exclude patterns are passed directly to the Ficus fingerprinting engine as `--exclude` arguments.
- Default exclusions (hidden files, `.gitignore` patterns) are applied in addition to custom excludes.
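
The mapping from configuration to `--exclude` arguments might look something like the sketch below. It works on an already-parsed configuration dictionary; `exclude_args` is a hypothetical helper, and the assumption that each pattern is passed as a separate `--exclude <pattern>` pair (rather than, say, `--exclude=<pattern>`) is not confirmed by the documentation.

```python
def exclude_args(config: dict) -> list[str]:
    """Build `--exclude` arguments from a parsed `.fossa.yml` mapping.

    Only `exclude` patterns are used; any `only` patterns are ignored,
    matching the behaviour described in the notes above.
    """
    vendored = config.get("vendoredDependencies") or {}
    filters = vendored.get("licenseScanPathFilters") or {}
    args: list[str] = []
    for pattern in filters.get("exclude") or []:
        # Assumption: one `--exclude` flag per pattern.
        args += ["--exclude", pattern]
    return args


config = {
    "version": 3,
    "vendoredDependencies": {
        "licenseScanPathFilters": {
            "only": ["src/**"],  # ignored for snippet scanning
            "exclude": ["**/test/**", "**/dist/**"],
        }
    },
}
print(exclude_args(config))
```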
### Experimental Options
_Important: For support and other general information, refer to the [experimental options overview](../experimental/README.md) before using experimental options._
<*> flagOpt StrictMode (applyFossaStyle <> long "strict"<> stringToHelpDoc "Enforces strict analysis to ensure the most accurate results by rejecting fallbacks.")
- <*> switch (applyFossaStyle <> long "x-snippet-scan" <> help "Enable ficus snippet scanning")
+ <*> switch (applyFossaStyle <> long "x-snippet-scan" <> stringToHelpDoc "Experimental flag to enable snippet scanning to identify open source code snippets using fingerprinting.")