Fix: inconsistency between FuzzIntrospector API and YAML files #1183 #1220

Open
dev-kvt wants to merge 1 commit into google:main from dev-kvt:fix/inconsistency-between-fuzzintrospector-api-and-yaml-files-1183

dev-kvt commented Jan 31, 2026

Fix: Dynamic resolution for mismatched API/YAML function signatures (#1183)
Summary
This PR introduces a dynamic signature resolution strategy within ContextRetriever. It addresses critical discrepancies between static benchmark definitions (YAML) and the FuzzIntrospector (FI) API, ensuring accurate context retrieval for LLM prompt generation.

Root Cause Analysis
The oss-fuzz-gen pipeline uses the function signatures defined in benchmark-sets/*.yaml to query the FI API. However, the signature strings in these static YAML files have drifted from the representation FuzzIntrospector stores internally. For example:

YAML: GObex * g_obex_new(GIOChannel *, GObexTransportType, gssize, gssize)

FI API: GObex *g_obex_new(GIOChannel *io, GObexTransportType transport_type, ...)

Because the FI API matches signatures by exact string comparison, these semantically equivalent signatures do not match. Source code and cross-reference queries then return 404 errors, so the LLM does not receive the context it needs during prompt construction.
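
To illustrate the mismatch, here is a minimal sketch that compares the two signatures from the example above. `extract_function_name` is a hypothetical helper written only for this illustration and is not part of oss-fuzz-gen or FuzzIntrospector. The FI signature is kept elided exactly as quoted.

```python
import re

# Signatures copied from the example above; the FI one is elided ("...") in the source.
yaml_sig = "GObex * g_obex_new(GIOChannel *, GObexTransportType, gssize, gssize)"
fi_sig = "GObex *g_obex_new(GIOChannel *io, GObexTransportType transport_type, ...)"


def extract_function_name(signature: str) -> str:
    """Hypothetical helper: return the identifier right before the first '('."""
    match = re.search(r"([A-Za-z_]\w*)\s*\(", signature)
    if match is None:
        raise ValueError(f"no function name found in {signature!r}")
    return match.group(1)


# An exact string comparison, as a strict lookup would do, treats them as different.
signatures_match = yaml_sig == fi_sig

# The function name is identical in both, so it can serve as a stable lookup key.
names_match = extract_function_name(yaml_sig) == extract_function_name(fi_sig)
```

The signature strings differ in spacing and parameter names, while the function name is the same in both, which is why the lookup in this PR is keyed on the name.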

Proposed Solution
I have implemented a priority-based lookup mechanism that treats the FuzzIntrospector API as the canonical source of truth for signatures, while retaining the YAML configuration as a secondary fallback.

Workflow:

1. Query: The system attempts to resolve the canonical signature via the FI API using the function name, which remains stable across versions.

2. Canonicalization: If a signature is returned, it is cached as the _real_function_signature and used for all subsequent API operations (cross-references, headers, and implementation retrieval).

3. Graceful Degradation: If the API returns no result or is unreachable, the system falls back to the original signature provided in the benchmark YAML.
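
The workflow above can be sketched as a standalone function. This is an illustration under stated assumptions, not the code in this PR: `resolve_signature`, the `lookup` callable, the stub functions, and the `demo_*` values are all hypothetical, and the sketch assumes an unreachable API surfaces as an exception.

```python
from typing import Callable, Optional


def resolve_signature(
    project: str,
    function_name: str,
    yaml_signature: str,
    lookup: Callable[[str, str], Optional[str]],
) -> str:
    """Prefer the signature returned by `lookup`; fall back to the YAML one."""
    try:
        canonical = lookup(project, function_name)
    except Exception:  # assumption: an unreachable API raises an exception
        canonical = None
    # An empty string or None both count as "no result".
    return canonical or yaml_signature


# Hypothetical stubs standing in for the three outcomes of an API query.
def lookup_found(project: str, function_name: str) -> Optional[str]:
    return "int demo_fn(int value)"


def lookup_empty(project: str, function_name: str) -> Optional[str]:
    return ""


def lookup_unreachable(project: str, function_name: str) -> Optional[str]:
    raise ConnectionError("API unreachable")


YAML_SIG = "int demo_fn(int)"
from_api = resolve_signature("demo_project", "demo_fn", YAML_SIG, lookup_found)
from_yaml_empty = resolve_signature("demo_project", "demo_fn", YAML_SIG, lookup_empty)
from_yaml_error = resolve_signature("demo_project", "demo_fn", YAML_SIG, lookup_unreachable)
```

Only the first call returns the API signature; the other two fall back to the YAML value.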

Implementation Details
File: data_prep/project_context/context_introspector.py

Centralized Resolution: Introduced the _get_real_function_signature method to abstract the lookup and caching logic.

Integration: Refactored _get_function_implementation, _get_xrefs_to_function, and _get_embeddable_declaration to utilize this resolved signature instead of accessing the raw benchmark data directly.

```python
def _get_real_function_signature(self) -> str:
    """Resolves the authoritative function signature.

    Prioritizes the FuzzIntrospector API representation; falls back to
    the benchmark configuration if the API is unreachable or returns None.
    """
    # Return the cached value if the signature was already resolved.
    if self._real_function_signature:
        return self._real_function_signature

    # Attempt to resolve the canonical signature via the API.
    project = self._benchmark.project
    func_name = self._benchmark.function_name
    canonical_sig = introspector.query_introspector_function_signature(
        project, func_name)

    # Set the source of truth: API result first, YAML signature otherwise.
    if canonical_sig:
        self._real_function_signature = canonical_sig
    else:
        self._real_function_signature = self._benchmark.function_signature

    return self._real_function_signature
```
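
As a rough check of the caching behavior described above, the following sketch uses a stand-in class and a counting stub. `SignatureResolver`, `counting_lookup`, and the `demo_*` values are hypothetical names for illustration; the real logic lives in ContextRetriever and calls the introspector module.

```python
from typing import Callable, List, Optional, Tuple


class SignatureResolver:
    """Hypothetical stand-in for the caching part of the resolution logic."""

    def __init__(
        self,
        project: str,
        function_name: str,
        yaml_signature: str,
        lookup: Callable[[str, str], Optional[str]],
    ) -> None:
        self._project = project
        self._function_name = function_name
        self._yaml_signature = yaml_signature
        self._lookup = lookup
        self._real_function_signature = ""  # empty until first resolution

    def get_real_function_signature(self) -> str:
        # Reuse the cached value so the API is queried at most once.
        if self._real_function_signature:
            return self._real_function_signature
        canonical = self._lookup(self._project, self._function_name)
        self._real_function_signature = canonical or self._yaml_signature
        return self._real_function_signature


calls: List[Tuple[str, str]] = []


def counting_lookup(project: str, function_name: str) -> Optional[str]:
    """Stub lookup that records every invocation."""
    calls.append((project, function_name))
    return "int demo_fn(int value)"


resolver = SignatureResolver(
    "demo_project", "demo_fn", "int demo_fn(int)", counting_lookup)
first = resolver.get_real_function_signature()
second = resolver.get_real_function_signature()
```

Both calls return the same signature, and `calls` holds a single entry, so repeated retrievals (cross-references, headers, implementation) would share one API query.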
