fix: avoid settings validation during ingestion service import#413
Open
DivyamTalwar wants to merge 1 commit into
Open
fix: avoid settings validation during ingestion service import#413DivyamTalwar wants to merge 1 commit into
DivyamTalwar wants to merge 1 commit into
Conversation
Importing IngestionService during pytest collection previously resolved settings immediately and could require POSTGRES_URI before tests ran. Move settings validation to service construction, pass the already-resolved startup/worker settings into production instances, keep module-level settings attribute access lazy, preserve direct proxy overrides through a validated settings reconstruction, and add regression coverage for the import and constructor contracts. Constraint: Default local config selects Postgres and validates POSTGRES_URI when get_settings() runs. Rejected: Test-only os.environ defaulting | mutates process-wide configuration during import. Rejected: Requiring test callers to replace the whole module settings object | breaks existing direct-attribute monkeypatch seams. Rejected: Returning the mutable settings proxy from construction | can bypass full settings validation when partial overrides exist. Rejected: model_copy(update=...) for override merges | Pydantic does not validate update values. Confidence: high Scope-risk: moderate Directive: Keep broader import-time settings cleanup out of this PR unless covered by targeted tests. Tested: uv run pre-commit run isort --files core/services/ingestion_service.py core/services_init.py core/workers/ingestion_worker.py core/tests/unit/test_ingestion_colpali_rendering.py core/tests/unit/test_ingestion_service_settings_import.py; uv run pre-commit run black --files core/services/ingestion_service.py core/services_init.py core/workers/ingestion_worker.py core/tests/unit/test_ingestion_colpali_rendering.py core/tests/unit/test_ingestion_service_settings_import.py; uv run pre-commit run ruff --files core/services/ingestion_service.py core/services_init.py core/workers/ingestion_worker.py core/tests/unit/test_ingestion_colpali_rendering.py core/tests/unit/test_ingestion_service_settings_import.py; env -u POSTGRES_URI uv run pytest -q core/tests/unit/test_ingestion_service_settings_import.py; env -u POSTGRES_URI uv run pytest --collect-only -q core/tests/unit/test_ingestion_service_settings_import.py core/tests/unit/test_ingestion_colpali_rendering.py; env -u POSTGRES_URI uv run pytest -q core/tests/unit/test_ingestion_service_settings_import.py core/tests/unit/test_ingestion_colpali_rendering.py; env -u POSTGRES_URI uv run pytest -q core/tests/unit/test_ingestion_service_settings_import.py core/tests/unit/test_ingestion_colpali_rendering.py core/tests/unit/test_ingestion_service_metadata_update.py; env -u POSTGRES_URI uv run pytest --collect-only -q Not-tested: Full test execution; includes integration/live-service tests that require external services.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
core.services.ingestion_serviceresolved project settings at module import time. Because the default local config uses the Postgres provider, importingIngestionServiceduring pytest collection could requirePOSTGRES_URIbefore any ColPali rendering test ran.This PR removes that import-time settings validation from the service module while preserving early settings validation for real service instances.
IngestionServicenow stores a resolved settings reference during construction, app startup and ingestion workers pass their already-resolved settings object, and the module-levelsettingssymbol remains available for lazy attribute access.This is intentionally scoped to the ingestion service import path. Test collection still has an existing LiteLLM model-pricing HTTP fetch from other imports; this PR does not attempt to make the whole collection path offline.
Changes
settings = get_settings()binding incore/services/ingestion_service.pywith a lazy compatibility proxy.IngestionServiceinstance during construction.core/services_init.pyandcore/workers/ingestion_worker.py.COLPALI_PDF_DPIonce per PDF/office conversion path so a single document uses a stable DPI value.IngestionService.Why this matters
Full pytest collection is a basic contributor workflow. These rendering unit tests should be collectible under the default local config without requiring
POSTGRES_URI.Constructor resolution keeps configuration validation before ingestion/update side effects in normal app and worker paths. The lazy module-level proxy keeps attribute access through the old
ingestion_service.settingsseam available without forcing configuration validation during import.Production paths now rely on injected or resolved settings during
IngestionServiceconstruction. The module-level proxy remains a compatibility and test override seam, not the preferred runtime configuration source.Tests
Commands run:
uv run pre-commit run isort --files core/services/ingestion_service.py core/services_init.py core/workers/ingestion_worker.py core/tests/unit/test_ingestion_colpali_rendering.py core/tests/unit/test_ingestion_service_settings_import.py- passeduv run pre-commit run black --files core/services/ingestion_service.py core/services_init.py core/workers/ingestion_worker.py core/tests/unit/test_ingestion_colpali_rendering.py core/tests/unit/test_ingestion_service_settings_import.py- passeduv run pre-commit run ruff --files core/services/ingestion_service.py core/services_init.py core/workers/ingestion_worker.py core/tests/unit/test_ingestion_colpali_rendering.py core/tests/unit/test_ingestion_service_settings_import.py- passedenv -u POSTGRES_URI uv run pytest -q core/tests/unit/test_ingestion_service_settings_import.py- passed, 8 tests passedenv -u POSTGRES_URI uv run pytest --collect-only -q core/tests/unit/test_ingestion_service_settings_import.py core/tests/unit/test_ingestion_colpali_rendering.py- passed, 11 tests collectedenv -u POSTGRES_URI uv run pytest -q core/tests/unit/test_ingestion_service_settings_import.py core/tests/unit/test_ingestion_colpali_rendering.py- passed, 11 tests passedenv -u POSTGRES_URI uv run pytest -q core/tests/unit/test_ingestion_service_settings_import.py core/tests/unit/test_ingestion_colpali_rendering.py core/tests/unit/test_ingestion_service_metadata_update.py- passed, 17 tests passedenv -u POSTGRES_URI uv run pytest --collect-only -q- passed, 253 tests collectedNot run:
Risk
Risk level: medium
This changes where
IngestionServicestores settings: instance methods now readself.settingsinstead of a module-global settings object. Startup and worker construction pass the same resolved settings object they already use, and direct construction still resolves settings during__init__unless tests explicitly inject a settings object. Reverting the service constructor change and the two construction call sites restores the previous behavior.Issue
No existing issue. This was discovered during repository analysis.
Notes for maintainers
This PR intentionally addresses only the
ingestion_serviceimport-time settings dependency needed by the ColPali rendering tests. Other import-time settings patterns, such ascore/services/document_service.pyand route/module-level settings singletons, are left out of scope and are not tracked by this PR.Collection still logs a LiteLLM model-pricing HTTP fetch from existing imports; this PR does not attempt to make the whole collection path offline.