fix: hash skill binaries by mmilanta · Pull Request #255 · snyk/agent-scan

mmilanta · 2026-04-02T10:26:31Z

No description provided.

qodo-merge-etso · 2026-04-02T10:26:45Z

Review Summary by Qodo

Hash binary skill files with SHA256

✨ Enhancement

Walkthroughs

Description

• Add SHA256 hashing for binary skill files
• Replace generic message with hash-based identification
• Enable binary file content verification and comparison

Diagram

flowchart LR
  A["Binary File Read Error"] -->|"Generate SHA256 hash"| B["Hash-based Content"]
  B -->|"Replace generic message"| C["Binary file. Hash: {hexdigest}"]

File Changes

1. src/agent_scan/skill_client.py ✨ Enhancement +4/-1

Add SHA256 hashing for binary skill files

• Import hashlib module for SHA256 hashing
• Catch UnicodeDecodeError and compute SHA256 hash of binary files
• Replace generic "Binary file. No content available." message with hash-based identifier
• Enable binary file content verification through hash comparison

src/agent_scan/skill_client.py

qodo-merge-etso · 2026-04-02T10:26:46Z

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (2) 📎 Requirement gaps (0) 🎨 UX Issues (0)

1. traverse_skill_tree() too long 📘 Rule violation ⚙ Maintainability

Description

traverse_skill_tree() is ~60 lines long, exceeding the ≤40 SLOC limit, making it harder to
maintain and safely extend. This PR modifies the function further without extracting helpers.

Code

src/agent_scan/skill_client.py[R123-131]

                    content = f.read()
            except UnicodeDecodeError:
                logger.exception(f"Error reading file: {file}. The file is not a bianry")
-                content = "Binary file. No content available."
+                with open(os.path.expanduser(full_path), "rb") as f:
+                    content_hash = hashlib.sha256(f.read()).hexdigest()
+                content = f"Binary file. Hash: {content_hash}"
            resources.append(
                Resource(
                    name=file,

Evidence
Compliance requires functions to be ≤40 lines; traverse_skill_tree() spans roughly lines 78-137
(~60 lines) and was modified in this PR in the binary-file handling block.
Rule 45: Limit function length to ≤ 40 lines (SLOC)
src/agent_scan/skill_client.py[78-137]
src/agent_scan/skill_client.py[121-135]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`traverse_skill_tree()` exceeds the 40-line function length limit and was further expanded/modified in this PR.

## Issue Context
Large functions are harder to test and maintain; this function currently handles traversal, file-type branching, text reading, binary handling, and object construction.

## Fix Focus Areas
- src/agent_scan/skill_client.py[78-137]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. No regression test for hash 📘 Rule violation ☼ Reliability

Description

The PR changes binary resource descriptions to include a SHA-256 hash, but no test asserts this new
behavior. Without an assertion, the change can silently regress.

Code

src/agent_scan/skill_client.py[R124-128]

            except UnicodeDecodeError:
                logger.exception(f"Error reading file: {file}. The file is not a bianry")
-                content = "Binary file. No content available."
+                with open(os.path.expanduser(full_path), "rb") as f:
+                    content_hash = hashlib.sha256(f.read()).hexdigest()
+                content = f"Binary file. Hash: {content_hash}"

Evidence
Compliance requires tests for changed code paths; the PR changes the binary-file path to compute and
embed a hash, while the existing test only calls inspect_skill(...) without assertions about the
binary description output.
Rule 4: Every change must include automated tests; bug fixes add a regression test
src/agent_scan/skill_client.py[124-128]
tests/v4compatibility/test_inspect.py[119-121]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Add a regression/unit test that asserts binary files are represented with a deterministic SHA-256 hash in `Resource.description`.

## Issue Context
The code now sets `content = f"Binary file. Hash: {content_hash}"` for binary files; existing tests exercise skill inspection but do not assert this new output.

## Fix Focus Areas
- src/agent_scan/skill_client.py[121-128]
- tests/v4compatibility/test_inspect.py[119-121]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

3. Binary hash errors unhandled 🐞 Bug ☼ Reliability

Description

In traverse_skill_tree(), the fallback path for binary files opens and reads the file in binary mode
without any error handling, so PermissionError/ENOENT/etc. will bubble up and fail the entire skill
scan. Previously the UnicodeDecodeError fallback always produced a placeholder string and continued.

Code

src/agent_scan/skill_client.py[R126-128]

+                with open(os.path.expanduser(full_path), "rb") as f:
+                    content_hash = hashlib.sha256(f.read()).hexdigest()
+                content = f"Binary file. Hash: {content_hash}"

Evidence

The try/except only guards the UTF-8 open/read; the subsequent binary open+hash is outside any try,
so any I/O failure during hashing will abort traversal. This traversal runs as part of
inspect_skill() when scanning a skill.

src/agent_scan/skill_client.py[120-135]
src/agent_scan/skill_client.py[30-74]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`traverse_skill_tree()` catches `UnicodeDecodeError` from the UTF-8 read, but then performs a second `open(..., "rb")` + read/hash without a protective `try/except`. Any `OSError` (e.g., `PermissionError`, `FileNotFoundError` due to concurrent changes) will propagate and fail the scan.

### Issue Context
This code runs during skill scanning (`inspect_skill()` calls `traverse_skill_tree()`), so a single unreadable binary should not abort the whole scan.

### Fix Focus Areas
- src/agent_scan/skill_client.py[120-135]

### Expected fix
- Wrap the binary open/hash in a `try/except (OSError, Exception)`.
- On failure, log (prefer `logger.warning`/`logger.exception` with clear message) and set `content` to a safe placeholder like `"Binary file. Hash unavailable."` so traversal continues.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

4. Unbounded binary file reads 🐞 Bug ➹ Performance

Description

Binary hashing uses hashlib.sha256(f.read()), which reads the entire file into memory at once and
can significantly slow scans or cause high memory usage when skills include large binaries. This
happens for every non-.md and non-script asset in the skill directory.

Code

src/agent_scan/skill_client.py[R126-127]

+                with open(os.path.expanduser(full_path), "rb") as f:
+                    content_hash = hashlib.sha256(f.read()).hexdigest()

Evidence

The traversal iterates all files and, for non-text assets, computes a SHA-256 by calling f.read()
with no size bound, which reads the entire binary into RAM in one shot.

src/agent_scan/skill_client.py[85-135]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The binary hashing implementation does `hashlib.sha256(f.read()).hexdigest()`, which loads the whole file into memory. Large files can cause memory spikes and slow scans.

### Issue Context
This code is in `traverse_skill_tree()` and may process many assets during a scan.

### Fix Focus Areas
- src/agent_scan/skill_client.py[124-128]

### Expected fix
- Replace `hashlib.sha256(f.read())` with a streaming hash update loop, e.g.:
 - `h = hashlib.sha256()`
 - `for chunk in iter(lambda: f.read(1024 * 1024), b""):`
 - `h.update(chunk)`
 - `content_hash = h.hexdigest()`
- (Optional) Consider an upper bound / early-exit policy if hashing very large binaries is not required for the scan.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

ⓘ The new review experience is currently in Beta. Learn more

qodo-merge-etso · 2026-04-02T10:28:48Z

src/agent_scan/skill_client.py

                    content = f.read()
            except UnicodeDecodeError:
                logger.exception(f"Error reading file: {file}. The file is not a bianry")
-                content = "Binary file. No content available."
+                with open(os.path.expanduser(full_path), "rb") as f:
+                    content_hash = hashlib.sha256(f.read()).hexdigest()
+                content = f"Binary file. Hash: {content_hash}"
            resources.append(
                Resource(
                    name=file,


1. traverse_skill_tree() too long 📘 Rule violation ⚙ Maintainability

traverse_skill_tree() is ~60 lines long, exceeding the ≤40 SLOC limit, making it harder to maintain and safely extend. This PR modifies the function further without extracting helpers.

Agent Prompt

## Issue description `traverse_skill_tree()` exceeds the 40-line function length limit and was further expanded/modified in this PR. ## Issue Context Large functions are harder to test and maintain; this function currently handles traversal, file-type branching, text reading, binary handling, and object construction. ## Fix Focus Areas - src/agent_scan/skill_client.py[78-137]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-merge-etso · 2026-04-02T10:28:48Z

src/agent_scan/skill_client.py

            except UnicodeDecodeError:
                logger.exception(f"Error reading file: {file}. The file is not a bianry")
-                content = "Binary file. No content available."
+                with open(os.path.expanduser(full_path), "rb") as f:
+                    content_hash = hashlib.sha256(f.read()).hexdigest()
+                content = f"Binary file. Hash: {content_hash}"


2. No regression test for hash 📘 Rule violation ☼ Reliability

The PR changes binary resource descriptions to include a SHA-256 hash, but no test asserts this new behavior. Without an assertion, the change can silently regress.

Agent Prompt

## Issue description Add a regression/unit test that asserts binary files are represented with a deterministic SHA-256 hash in `Resource.description`. ## Issue Context The code now sets `content = f"Binary file. Hash: {content_hash}"` for binary files; existing tests exercise skill inspection but do not assert this new output. ## Fix Focus Areas - src/agent_scan/skill_client.py[121-128] - tests/v4compatibility/test_inspect.py[119-121]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-merge-etso · 2026-04-02T10:28:48Z

src/agent_scan/skill_client.py

+                with open(os.path.expanduser(full_path), "rb") as f:
+                    content_hash = hashlib.sha256(f.read()).hexdigest()
+                content = f"Binary file. Hash: {content_hash}"


3. Binary hash errors unhandled 🐞 Bug ☼ Reliability

In traverse_skill_tree(), the fallback path for binary files opens and reads the file in binary mode without any error handling, so PermissionError/ENOENT/etc. will bubble up and fail the entire skill scan. Previously the UnicodeDecodeError fallback always produced a placeholder string and continued.

Agent Prompt

### Issue description `traverse_skill_tree()` catches `UnicodeDecodeError` from the UTF-8 read, but then performs a second `open(..., "rb")` + read/hash without a protective `try/except`. Any `OSError` (e.g., `PermissionError`, `FileNotFoundError` due to concurrent changes) will propagate and fail the scan. ### Issue Context This code runs during skill scanning (`inspect_skill()` calls `traverse_skill_tree()`), so a single unreadable binary should not abort the whole scan. ### Fix Focus Areas - src/agent_scan/skill_client.py[120-135] ### Expected fix - Wrap the binary open/hash in a `try/except (OSError, Exception)`. - On failure, log (prefer `logger.warning`/`logger.exception` with clear message) and set `content` to a safe placeholder like `"Binary file. Hash unavailable."` so traversal continues.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

fix: hash skill binaries

cc2a1f9

mmilanta requested a review from a team as a code owner April 2, 2026 10:26

qodo-merge-etso bot reviewed Apr 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: hash skill binaries#255

fix: hash skill binaries#255
mmilanta wants to merge 1 commit intomainfrom
fix/hash-binaries-in-skills

mmilanta commented Apr 2, 2026

Uh oh!

qodo-merge-etso bot commented Apr 2, 2026

Uh oh!

qodo-merge-etso bot commented Apr 2, 2026 •

edited

Loading

Uh oh!

qodo-merge-etso bot Apr 2, 2026

Uh oh!

qodo-merge-etso bot Apr 2, 2026

Uh oh!

qodo-merge-etso bot Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mmilanta commented Apr 2, 2026

Uh oh!

qodo-merge-etso bot commented Apr 2, 2026

Review Summary by Qodo

Walkthroughs

File Changes

Uh oh!

qodo-merge-etso bot commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Uh oh!

qodo-merge-etso bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

qodo-merge-etso bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

qodo-merge-etso bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

qodo-merge-etso bot commented Apr 2, 2026 •

edited

Loading