Historical Data Management OSS-Fuzz SDK Implementation #1150
Open
zewei-wang wants to merge 8 commits into google:main
- Add comprehensive data models for build, crash, corpus, and coverage history
- Implement HistoricalSummary model for aggregated statistics
- Add specialized error classes for SDK configuration and validation
- Include proper type hints and Pydantic validation

- Extend storage adapters with history-specific functionality
- Add support for time-series data storage and retrieval
- Implement environment variable utilities for configuration
- Improve error handling and logging in storage operations

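The environment-variable configuration mentioned in this commit could look roughly like the sketch below. The variable names (`OSSFUZZ_HISTORY_STORAGE_BACKEND`, `OSSFUZZ_HISTORY_STORAGE_PATH`), the defaults, and the `HistoryStorageConfig` type are hypothetical and not taken from the PR's `env_vars.py`; the sketch only illustrates the general pattern of reading storage settings with validated fallbacks.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical variable names; the real ones are defined in the PR's env_vars.py.
BACKEND_VAR = "OSSFUZZ_HISTORY_STORAGE_BACKEND"
PATH_VAR = "OSSFUZZ_HISTORY_STORAGE_PATH"

# Assumed backend identifiers (the PR mentions file and GCS implementations).
SUPPORTED_BACKENDS = ("local", "gcs")


@dataclass(frozen=True)
class HistoryStorageConfig:
    backend: str
    path: str


def load_history_storage_config(
    env: Optional[Mapping[str, str]] = None,
) -> HistoryStorageConfig:
    """Read storage settings from the environment, falling back to defaults."""
    env = os.environ if env is None else env
    backend = env.get(BACKEND_VAR, "local").strip().lower()
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"Unsupported history storage backend: {backend!r}")
    path = env.get(PATH_VAR, "./ossfuzz_history")
    return HistoryStorageConfig(backend=backend, path=path)
```

Accepting an explicit mapping instead of always reading `os.environ` keeps the function easy to test without patching process state.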
- Add abstract HistoryManager base class with common functionality
- Implement BuildHistoryManager for build statistics and trends
- Add CoverageHistoryManager for coverage data analysis
- Include data validation and storage abstraction
- Add comprehensive logging and error handling

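As a rough illustration of the base-class-plus-specialization design described in this commit, the sketch below shows an abstract manager that owns the store/retrieve logic while a subclass supplies the category, validation, and statistics. Only the class names `HistoryManager` and `BuildHistoryManager` come from the PR; every method name, the record shape, and the `InMemoryStorage` stand-in are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class InMemoryStorage:
    """Hypothetical stand-in for a storage backend; keeps records per key."""

    def __init__(self) -> None:
        self._records: Dict[str, List[Dict[str, Any]]] = {}

    def append(self, key: str, record: Dict[str, Any]) -> None:
        self._records.setdefault(key, []).append(record)

    def read(self, key: str) -> List[Dict[str, Any]]:
        return list(self._records.get(key, []))


class HistoryManager(ABC):
    """Shared store/retrieve logic; subclasses define category and validation."""

    def __init__(self, storage: InMemoryStorage, project: str) -> None:
        self.storage = storage
        self.project = project

    @property
    @abstractmethod
    def category(self) -> str:
        """Name of the history category, e.g. 'build'."""

    @abstractmethod
    def validate(self, record: Dict[str, Any]) -> None:
        """Raise ValueError if the record is malformed."""

    def store(self, record: Dict[str, Any]) -> None:
        self.validate(record)
        self.storage.append(f"{self.project}/{self.category}", record)

    def history(self) -> List[Dict[str, Any]]:
        return self.storage.read(f"{self.project}/{self.category}")


class BuildHistoryManager(HistoryManager):
    @property
    def category(self) -> str:
        return "build"

    def validate(self, record: Dict[str, Any]) -> None:
        if not isinstance(record.get("success"), bool):
            raise ValueError("build record needs a boolean 'success' field")

    def success_rate(self) -> float:
        """Fraction of stored builds that succeeded (0.0 if none stored)."""
        builds = self.history()
        if not builds:
            return 0.0
        return sum(1 for b in builds if b["success"]) / len(builds)
```

Keeping validation abstract means each manager can reject malformed records before they reach storage, whichever backend is plugged in.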
- Implement CorpusHistoryManager for corpus growth analysis
- Add CrashHistoryManager for crash tracking and statistics
- Include duplicate detection and data validation
- Complete the historical data management infrastructure

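The PR does not say how duplicate crashes are detected. One common approach, sketched here purely as an assumption, is to hash the crash type together with the top few stack frames and treat crashes with the same hash as duplicates. The function names, the record fields (`type`, `stack`), and the choice of three frames are all hypothetical.

```python
import hashlib
from typing import Any, Dict, Iterable, List, Sequence


def crash_signature(crash_type: str, stack_frames: Sequence[str], top_n: int = 3) -> str:
    """Hash the crash type plus the top `top_n` frames into a stable signature."""
    key = "\n".join([crash_type, *stack_frames[:top_n]])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def deduplicate(crashes: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first crash seen for each signature, preserving input order."""
    seen = set()
    unique = []
    for crash in crashes:
        sig = crash_signature(crash["type"], crash["stack"])
        if sig not in seen:
            seen.add(sig)
            unique.append(crash)
    return unique
```

Limiting the signature to the top frames trades precision for robustness: crashes that differ only deep in the call stack collapse into one entry.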
- Add OSSFuzzSDK class as main entry point for historical data
- Implement project report generation and analysis features
- Add fuzzing efficiency analysis and health scoring
- Include environment configuration and error handling
- Provide unified interface for all history managers

- Export OSSFuzzSDK and history managers in package __init__
- Add data models and error classes to public API
- Maintain backward compatibility with existing exports
- Complete integration of historical data functionality

- Add test suite for OSSFuzzSDK main functionality
- Include tests for all history managers (build, crash, corpus, coverage)
- Test configuration, error handling, and edge cases
- Ensure proper integration with storage and data validation
- Add mocking for external dependencies

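For readers unfamiliar with the mocking approach this commit mentions, here is a minimal sketch using the standard library's `unittest.mock`. The `CoverageReporter` class, its method, and the record shape are hypothetical and exist only for the example; they are not part of the PR's test suite.

```python
import unittest
from unittest import mock


class CoverageReporter:
    """Hypothetical consumer of a storage backend, used only for this sketch."""

    def __init__(self, storage):
        self.storage = storage

    def latest_line_coverage(self, project: str) -> float:
        records = self.storage.read(f"{project}/coverage")
        if not records:
            raise LookupError(f"no coverage history for {project}")
        return records[-1]["line_coverage"]


class CoverageReporterTest(unittest.TestCase):

    def test_returns_most_recent_record(self):
        # The mock replaces the real storage backend (file, GCS, ...).
        storage = mock.Mock()
        storage.read.return_value = [
            {"line_coverage": 41.0},
            {"line_coverage": 57.5},
        ]
        reporter = CoverageReporter(storage)
        self.assertEqual(reporter.latest_line_coverage("libpng"), 57.5)
        storage.read.assert_called_once_with("libpng/coverage")

    def test_empty_history_raises(self):
        storage = mock.Mock()
        storage.read.return_value = []
        with self.assertRaises(LookupError):
            CoverageReporter(storage).latest_line_coverage("libpng")
```

Because the storage object is injected, the tests never touch the filesystem or a cloud bucket and can still assert on the exact key that was requested.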
/gcbrun exp -n zewei -m vertex_ai_gemini-2-5-flash-chat -ag -b quick-test
Pull Request Overview
This PR implements a comprehensive Historical Data SDK for OSS-Fuzz, providing unified access to historical fuzzing data with specialized managers for builds, crashes, corpus, and coverage analysis. The implementation includes storage infrastructure, data models, and extensive testing capabilities.
Key Changes:
- Introduces the main OSSFuzzSDK facade class for unified historical data access
- Adds specialized history managers for builds, crashes, corpus, and coverage data
- Extends storage infrastructure with history-specific operations and multiple backend support
Reviewed Changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| ossfuzz_py/utils/env_vars.py | Adds environment variables for historical data storage configuration |
| ossfuzz_py/unittests/test_local_builder_pipeline.py | Updates path resolution for benchmark YAML file |
| ossfuzz_py/unittests/test_historical_data_sdk.py | Comprehensive test suite for new SDK functionality |
| ossfuzz_py/unittests/test_cloud_builder_pipeline.py | Updates path resolution for benchmark YAML file |
| ossfuzz_py/history/*.py | New history manager classes and base functionality |
| ossfuzz_py/errors/*.py | New error types for historical data operations |
| ossfuzz_py/data/storage_*.py | Extended storage infrastructure with history operations |
| ossfuzz_py/core/ossfuzz_sdk.py | Main SDK facade implementation |
| ossfuzz_py/core/data_models.py | New data models for historical data structures |
| ossfuzz_py/__init__.py | Updates to public API exports |
…atures
- Update cloud builder pipeline tests for new SDK integration
- Modify local builder pipeline tests to work with enhanced functionality
- Ensure backward compatibility and proper error handling
- Fix any test conflicts with new historical data features
0a02fd8 to 29484f1
/gcbrun exp -n zewei -m vertex_ai_gemini-2-5-flash-chat -ag -b quick-test
Summary
This PR introduces a comprehensive Historical Data SDK for OSS-Fuzz, providing a unified interface for accessing, storing, and analyzing historical fuzzing data. The SDK enables researchers and developers to track fuzzing progress over time, analyze trends, and generate detailed reports across builds, crashes, corpus, and coverage data.
Features
- OSSFuzzSDK class providing unified access to all historical data functionality
- BuildHistoryManager: build history, success rates, and artifacts
- CrashHistoryManager: crash data, deduplication, and analysis
- CorpusHistoryManager: corpus growth, statistics, and effectiveness
- CoverageHistoryManager: coverage data, trends, and reporting
- StorageManager: unified storage backend management
- StorageAdapter: abstract interface with file and GCS implementations

Testing
- test_historical_data_sdk.py with comprehensive coverage of the new SDK functionality
- Updated existing tests (test_cloud_builder_pipeline.py, test_local_builder_pipeline.py) to use proper path resolution

Breaking Changes
None. This is a purely additive feature that extends the existing SDK without modifying existing APIs or functionality.