fix(secrets): report all multiline regex matches per file, not just first occurrence #7483
a41c75b to 29952e0
| for mm in multiline_matches: | ||
| mm = self._extract_real_regex_match(mm) | ||
| for match_obj in multiline_regex.finditer(file_content): | ||
| mm = self._extract_real_regex_match(cast(Tuple[str], match_obj.groups()) or match_obj.group(0)) |
Can you change this to a more readable variable name?
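For readers following the hunk above, the difference between stopping at the first occurrence and iterating with finditer can be sketched in isolation. This is a minimal sketch, not the checkov implementation: the `BEGIN_SECRET ... END_SECRET` pattern, the sample text, and both helper names are assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical pattern for illustration only; not a real checkov detector.
MULTILINE_REGEX = re.compile(r"BEGIN_SECRET\n(.*?)\nEND_SECRET", re.DOTALL)


def first_match_only(file_content: str) -> List[str]:
    """Return at most one captured value: the first occurrence in the file."""
    match_obj = MULTILINE_REGEX.search(file_content)
    return [match_obj.group(1)] if match_obj else []


def all_matches(file_content: str) -> List[str]:
    """Return one captured value per non-overlapping occurrence in the file."""
    return [match_obj.group(1) for match_obj in MULTILINE_REGEX.finditer(file_content)]


# Two secrets in the same file, separated by an unrelated line.
sample = (
    "BEGIN_SECRET\nabc\nEND_SECRET\n"
    "unrelated line\n"
    "BEGIN_SECRET\nxyz\nEND_SECRET\n"
)

print(first_match_only(sample))
print(all_matches(sample))
```

Because finditer hands back match objects rather than strings, each occurrence also carries its own start() offset, which is what makes per-occurrence line numbers possible.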
| # which is the most meaningful trigger line (e.g. "BEGIN PRIVATE KEY"). | ||
| if '\n' not in mm: | ||
| inner_offset = match_obj.group(0).find(mm) | ||
| mm_offset = match_obj.start() + (inner_offset if inner_offset >= 0 else 0) |
If mm_offset equals zero, shouldn't we just skip this iteration?
inner_offset == 0 is OK: it means the secret was found at the first character of the match.
I added a safety continue for inner_offset < 0 (not found at all).
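The offset handling discussed in this thread can be illustrated with a small standalone sketch. It assumes a hypothetical pattern whose captured group is the single-line trigger `BEGIN_SECRET`, and a made-up file layout with secrets starting on lines 1 and 8; the function name `locate_matches` is also an assumption and does not come from the PR.

```python
import re
from typing import List, Tuple

# Hypothetical pattern: the captured group is the single-line trigger text.
PATTERN = re.compile(r"(BEGIN_SECRET)\n.*?\nEND_SECRET", re.DOTALL)


def locate_matches(file_content: str) -> List[Tuple[str, int]]:
    """Return (captured text, 1-based line number) for every occurrence."""
    results = []
    for match_obj in PATTERN.finditer(file_content):
        captured = match_obj.group(1)
        # Position of the captured text inside the full match.
        inner_offset = match_obj.group(0).find(captured)
        if inner_offset < 0:
            # Captured text not found in the full match: skip rather than guess.
            continue
        # An offset of 0 is valid: the text sits at the very start.
        absolute_offset = match_obj.start() + inner_offset
        # Line number = newlines before the offset, plus one.
        line_number = file_content.count("\n", 0, absolute_offset) + 1
        results.append((captured, line_number))
    return results


# Assumed layout: first secret on lines 1-4, filler on 5-7, second from line 8.
sample = (
    "BEGIN_SECRET\nline two\nline three\nEND_SECRET\n"
    "a\nb\nc\n"
    "BEGIN_SECRET\npayload\nEND_SECRET\n"
)

print(locate_matches(sample))
```

In this sample the first occurrence has both `match_obj.start()` and `inner_offset` equal to zero and is still reported on line 1, which is why only the negative case is skipped.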
| lines = sorted(c.file_line_range[0] for c in interesting_failed_checks) | ||
| # The committed fix reports the prerun match line (BEGIN_SECRET) for multiline captured values. | ||
| # First secret: BEGIN_SECRET is on line 1, second: BEGIN_SECRET is on line 8. | ||
| # On main (before fix), both fall back to line 1 (the first prerun match line). |
Can we remove "On main" and just leave "before fix"?
| runner = Runner() | ||
| report = runner.run(root_folder=valid_dir_path, | ||
| report = runner.run(root_folder=None, | ||
| files=[valid_dir_path + "/Dockerfile.mine"], |
What's /Dockerfile.mine?
This is the original file name used by the test. I specified it explicitly here because I added another file to the folder; before the change, the test scanned the entire folder.
| where the captured group spans multiple real lines, both must be detected with | ||
| correct and distinct line numbers. | ||
|
|
||
| On main (before fix), find_line_number() cannot find a multiline substring in any |
Also remove "On main" here.
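The limitation described in the docstring above can be reproduced with a small sketch. Both functions below are hypothetical stand-ins written for illustration; `find_line_linewise` is not checkov's find_line_number(), whose actual implementation is not shown in this PR excerpt. The assumption is only that a search performed one line at a time cannot contain a value that itself includes a newline.

```python
from typing import Optional


def find_line_linewise(file_content: str, needle: str) -> Optional[int]:
    """Hypothetical per-line search: check each line separately."""
    for line_number, line in enumerate(file_content.splitlines(), start=1):
        if needle in line:
            return line_number
    return None


def find_line_by_offset(file_content: str, needle: str) -> Optional[int]:
    """Search the whole text, then convert the character offset to a line."""
    offset = file_content.find(needle)
    if offset < 0:
        return None
    return file_content.count("\n", 0, offset) + 1


content = "key = |\n  part-one\n  part-two\nend\n"
multiline_value = "part-one\n  part-two"

# No single line contains the embedded newline, so the per-line search fails.
print(find_line_linewise(content, multiline_value))
# Searching the whole text finds the value and maps it to its first line.
print(find_line_by_offset(content, multiline_value))
```

A caller that treats the failed per-line lookup as "use a default line" would report every multiline secret at the same location, which matches the behaviour the test comments describe for the code before the fix.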
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes # (issue)