Codebase review: prioritized recommendations across architecture, testing, CI, deps, CLI, docs & performance

# Microsoft Olive — Read-Only Codebase Review

> Review date: 2026-06-12
> Review scope: `/home/justinchu/dev/Olive` (`olive-ai 0.11.0.dev0`)
> Review method: Eight-dimension parallel read-only static analysis (Architecture / Code Quality / Testing / Documentation / CLI / Dependencies / CI-CD / Performance). This report de-duplicates and synthesizes findings from all eight analyses into a unified priority ranking.
> Hard constraint: Strictly read-only. No files in the repository were modified during this review.

This is a **read-only static-analysis review**. No files in the repository were modified during this review.

---

## Executive Summary

Olive is a well-organized, clearly abstracted multi-backend model optimization toolkit. The core abstractions — Pass / Engine / System / Evaluator / Search — have well-defined inheritance contracts, a consistent Pydantic v2 configuration system, and a CLI entry point strongly synchronized with the code through argparse and auto-generated docs. However, the eight-dimension review uncovered several systemic issues concentrated in three areas: **weak quality safeguards**, **layer-coupling leakage**, and **unreliable user-visible behavior**.

**Overall health**: Above-average design paired with below-average quality guardrails. Without adding CI test gates, fixing the `or True` permanent skip, and tightening dependency upper bounds, a single upgrade of numpy / torch / transformers could cause production-grade regressions that go undetected.

**Top 5 High-Priority Issues (ranked by risk)**:

1. **Test suite barely wired into CI**: Of 184 test files, only 1 (`test/cli/test_cli_test_model_smoke.py`) runs in GitHub Actions. All core modules — passes / engine / search / evaluator / model / systems — have zero CI coverage. Any regression will only be discovered through user reports. (findings-tests item 1, findings-ci §2.1)
2. **Critical dependencies have no upper bounds + ruff target contradicts python_requires**: `requirements.txt` lists `numpy / torch / transformers / onnx / onnxscript` without upper bounds, yet test files already document known breaking changes at `onnx<1.21.0`, `torch<2.11.0`, `transformers<5.4.0`, `onnxscript<0.6.1`; meanwhile `pyproject.toml` ruff `target-version="py39"` contradicts `setup.py` `python_requires=">=3.10"`, silently invalidating 3.10+-specific lint rules. (findings-deps H1/H2/M1/M2, L3)
3. **Permanently skipped tests coexist with production code marked "incorrect"**: `test_slicegpt.py:16` uses `or True` to force-skip and hide failures; `graph_surgeries.py:170,184` `RenameInputs` / `RenameOutputs` carry a "This is incorrect" TODO yet remain registered in the surgeon table and are user-accessible; the `auto-opt` deprecation warning is **never shown** because the default `--log_level=3` (ERROR) is higher than WARNING. All three share the same pattern: users and developers cannot see "known-bad" state. (findings-tests item 3, findings-quality item 8, findings-cli §10.1)
4. **Critical failures in `olive/cache.py` are silently swallowed**: `cache_model` / `load_model` / `cache_run` and other critical paths use `except Exception: logger.exception(...)` and then **continue executing** rather than re-raising, creating a "cache miss" illusion and hard-to-diagnose downstream errors; `cache.py` also creates a reverse dependency by importing `olive.model.ModelConfig` and `olive.passes.onnx.common.resave_model`, breaking the layering. (findings-quality item 2, findings-arch item 2)
5. **`systems → workflows` and `passes → evaluator` reverse dependencies**: `DockerSystem` directly imports `olive.workflows.run.config.RunConfig`; `workflow_runner.py` calls `from olive.workflows import run`; two ONNX passes (`inc_quantization.py`, `session_params_tuning.py`) construct `OliveEvaluatorConfig` directly inside `_run_for_config` to invoke the evaluator. Both reverse-dependency patterns prevent systems / passes from being tested or evolved independently. (findings-arch items 1 and 3)

All high / medium / low priority recommendations are listed below, grouped by dimension. Related issues (e.g., multiple missing `auto-opt` deprecation notices, conflicting `onnxscript` bounds, CLI snake/kebab mixing) are merged into single items retaining all `file:line` references.

---

## High-Priority Recommendations

> Inclusion criteria: correctness issues / security / user-blocking / data loss / CI gaps / severe developer-experience barriers.

### Architecture

#### [Quick Fix] Disable or fix Surgeons marked "incorrect" in `graph_surgeries.py`

- **Location**:
  - `olive/passes/onnx/graph_surgeries.py:170-195` (`RenameInputs` / `RenameOutputs` class definitions, carrying `# TODO(anyone): This is incorrect, remove or fix`)
  - `olive/passes/onnx/graph_surgeries.py:3273-3278` (registered in the surgeon table, user-accessible via configuration)
- **Current State**: Both Surgeon subclasses have inline comments admitting they are "incorrect" yet remain exposed to users as valid surgeons.
- **Impact**: If a user specifies `surgeon: "RenameInputs"`, a known-broken code path executes without error, silently producing an incorrect ONNX model.
- **Recommended Action**: Immediately raise `NotImplementedError("RenameInputs is known to be incorrect; see TODO at line 170")` inside `_run_for_config`, or remove the entries from the surgeon registry; restore once fixed.

#### [Refactor] Eliminate the reverse dependency from `olive.systems` to `olive.workflows`

- **Location**: `olive/systems/docker/docker_system.py:22` (`from olive.workflows.run.config import RunConfig`), `olive/systems/docker/workflow_runner.py:8` (`from olive.workflows import run as olive_run`)
- **Current State**: The systems layer directly imports the workflows layer, forming a systems ↔ workflows circular dependency.
- **Impact**: `systems` cannot be tested or released independently of `workflows`; any workflows refactor breaks systems.
- **Recommended Action**: Define a `DockerRunSpec` (containing only runtime-agnostic fields like host/target/accelerator/resources) inside `systems/`; let `workflows` convert `RunConfig` to `DockerRunSpec` before passing it in; move `workflow_runner.py` from `systems/docker/` to `workflows/` or `cli/`.

#### [Refactor] Eliminate the reverse dependency from `olive/passes` to `olive/evaluator`

- **Location**:
  - `olive/passes/onnx/inc_quantization.py:19-21`
  - `olive/passes/onnx/session_params_tuning.py:17-19`
  - Both passes construct `OliveEvaluatorConfig` directly inside `_run_for_config`
- **Current State**: The pass layer (which should only perform model transformations) directly invokes evaluation logic, violating the "passes are pure transformations" contract and is inconsistent with other pass interfaces.
- **Impact**: Both passes cannot be reused or unit-tested without evaluator configuration; there is no unified mechanism for adding evaluation-feedback-driven passes.
- **Recommended Action**: Add an `evaluation_callback: Optional[Callable]` parameter to the `Pass` base class, injected by the `Engine` at dispatch time; passes obtain evaluation results via the callback and are prohibited from directly `import olive.evaluator`.

#### [Refactor] Eliminate `olive/cache.py` reverse dependencies on domain layers

- **Location**: `olive/cache.py:22` (`from olive.model.config.model_config import ModelConfig`), `olive/cache.py:483` (lazy import `from olive.passes.onnx.common import resave_model`), `olive/model/handler/mixin/mlflow.py:10` (`from olive.cache import OliveCache`)
- **Current State**: `cache.py` should be infrastructure, yet it depends on `ModelConfig` and a concrete pass; inversely, `MLFlowTransformersMixin` depends on `OliveCache.from_cache_env()` to read the cache directory from environment variables.
- **Impact**: Cache cannot be tested in isolation; MLFlow handler has an implicit runtime environment variable dependency, producing obscure initialization failure messages.
- **Recommended Action**: Replace the `resave_model` call with a protocol/hook injected by the caller (`engine`); have `cache_model` accept a serialized dict instead of the `ModelConfig` type; move MLFlow model-merge logic into `Engine.initialize()` or a pre-processing step in `LocalSystem.run_pass`, or switch to an explicit `cache_dir: Path` parameter.

### Code Quality

#### [Quick Fix] Multiple critical failures silently swallowed in `olive/cache.py`

- **Location**: `olive/cache.py:188, 202, 229, 242, 266, 279, 679, 684, 701, 774`
- **Current State**: `cache_model` / `load_model` / `cache_run` / `load_run_from_model_id` and other critical paths use `except Exception: logger.exception(...)` and then **continue executing** without re-raising.
- **Impact**: Cache write failures are invisible to users; downstream code continues running on the assumption the cache succeeded, producing hard-to-diagnose secondary errors.
- **Recommended Action**: Change cache **write** operations to `logger.warning` and expose write status in return values / callers; true exceptions in **read** operations (not `FileNotFoundError`) must propagate upward; document in an `EXCEPTIONS_TO_RAISE` whitelist.

#### [Quick Fix] Bare exception catch in `_run_search` loses critical context

- **Location**: `olive/engine/engine.py:437-446`
- **Current State**: In search mode, any exception from `_run_passes` is caught by a global `except Exception` and only logged via `logger.warning(...)` **without `exc_info=True`**.
- **Impact**: Bugs during search (memory errors, user script exceptions) are silenced; users only see "No output model produced" instead of the real error.
- **Recommended Action**: Use `logger.warning(..., exc_info=True)`; explicitly document the `EXCEPTIONS_TO_RAISE` list (ensure `OlivePassError` and similar always propagate through).

#### [Refactor] ~90% of the 4144-line `hadamard_utils.py` is hand-written matrix literals

- **Location**: `olive/passes/pytorch/hadamard_utils.py` (4144 lines; `get_had172()` 1207 lines, `get_had156()` 939 lines, `get_had140()` 843 lines, `get_had108()` 435 lines — all `+1, -1, ...` literal data)
- **Current State**: The largest file in the repository; its content is static data incorrectly embedded as code.
- **Impact**: Slows IDE/lint time; misleads reviewers; makes matrix content impossible to diff; adds cold-start overhead.
- **Recommended Action**: Serialize the matrices as `.npy` or `.safetensors`, ship them as package data, and load via `np.load(importlib.resources.files(...))`. Over 90% of the source file content can be removed.

### Testing

#### [Quick Fix] `or True` in `test_slicegpt.py` permanently skips tests and hides real failures

- **Location**: `test/passes/pytorch/test_slicegpt.py:16-17`
  ```python
  @pytest.mark.skipif(
      (sys.version_info < (3, 10) and not torch.cuda.is_available()) or True,
      reason="requires python3.10 or higher",
  )
  ```
- **Current State**: `or True` makes the condition always true; the comment `# TODO(team): Failed in pipeline (linux gpu). Need to investigate.` shows this is a workaround to hide failures.
- **Impact**: Zero effective test coverage for the `SliceGPT` pass; the pattern may be copied into other files.
- **Recommended Action**: Immediately investigate and fix the root cause; if GPU is required, change to `@pytest.mark.skipif(not torch.cuda.is_available(), ...)`; if network is required, replace with a local small model; remove `or True` after fixing.

#### [Refactor] Add a `test-unit.yml` workflow covering core modules

- **Location**: `.github/workflows/test-model-fast.yml:39` only runs `pytest test/cli/test_cli_test_model_smoke.py`; the remaining 183 test files are not executed in any CI.
- **Current State**: passes / engine / search / model / evaluator / systems / hardware — all core directories have zero CI protection.
- **Impact**: Any regression can only be discovered through manual testing or production exposure.
- **Recommended Action**: Add `.github/workflows/test-unit.yml` covering at least the hardware/network-dependency-free tests in `test/engine/`, `test/search/`, `test/model/`, `test/evaluator/`, `test/hardware/`, `test/passes/onnx/`; use `workflow_dispatch` or a separate matrix for GPU/QNN/OpenVINO suites.

#### [Quick Fix] Three locations perform direct HF downloads with no skip guard

- **Location**:
  - `test/passes/onnx/test_extract_adapters.py:37` (`AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-LlamaForCausalLM")`, module scope)
  - `test/passes/onnx/test_nvmo_graph_surgery.py:34` (`AutoConfig.from_pretrained("Qwen/Qwen2.5-0.5B")`, ~700 MB)
  - `test/cli/test_cli.py:652` (HF download with no skip guard)
- **Current State**: When the network is unavailable, tests emit a connection error rather than skipping; large model downloads significantly slow CI.
- **Recommended Action**: Register a `@pytest.mark.network` marker (see CLI/CI items) and uniformly guard with `skipif(not _network_available())`; prefer local synthetic models (see `test_cli_test_model_smoke.py::_save_local_tiny_llama`).

### Documentation

#### [Quick Fix] `docs/source/conf.py` version is inconsistent with the actual package version

- **Location**: `docs/source/conf.py:22` (`version = "0.12.1"`) vs `olive/version.py:1` (`__version__ = "0.11.0.dev0"`)
- **Current State**: Hard-coded version number is one minor version ahead of the actual package version.
- **Impact**: The version switcher shows the wrong version; documentation does not match release artifacts; users and contributors reference incorrect APIs.
- **Recommended Action**: Switch to dynamic reading: `from olive.version import __version__ as version`, and set `release = version` accordingly.

#### [Quick Fix] `docs/architecture.md` Pass base class API has changed but documentation has not been updated

- **Location**: `docs/architecture.md:47-48` (example uses `@abstractmethod def default_config`) vs `olive/passes/olive_pass.py:318` (actual: `_default_config`, private method with different signature)
- **Impact**: New contributors following the documentation to implement a custom Pass will get an "abstract method not implemented" error.
- **Recommended Action**: Update the example to `_default_config(cls, accelerator_spec)`; also update `docs/architecture.md:79-87` — the outdated list of only 9 passes should be replaced with a link to `reference/pass.rst`; fix the Unicode em-dash at `docs/architecture.md:363` (`python –m olive.workflows.run` → `olive run --config`).

#### [Quick Fix] Mark `auto-opt` as deprecated at all entry points and fix the default log level

- **Location**:
  - `olive/cli/auto_opt.py:184` emits deprecation warning via `logger.warning(...)`
  - `olive/cli/base.py:394` default `--log_level=3` (ERROR) is above WARNING (2), so **the warning is never shown by default**
  - `docs/source/how-to/cli/cli-auto-opt.md:1` does not note the deprecation
  - `docs/source/reference/cli.rst:91-92` does not note the deprecation
  - `notebooks/olive_quickstart.ipynb:42` still installs `olive-ai[auto-opt]`
- **Current State**: The command is marked deprecated, but users see no indication of this at any of three entry points (CLI runtime, documentation, notebook).
- **Impact**: Users continue learning and integrating a deprecated command; widespread breakage when it is eventually removed.
- **Recommended Action**: ① Change `logger.warning` to `logger.error` or print `print("DEPRECATED: use 'olive optimize'", file=sys.stderr)` at subcommand registration; ② add `> ⚠️ Deprecated: use \`olive optimize\`` at the top of `cli-auto-opt.md` and `cli.rst`; ③ insert a deprecation notice at the top of `--help` output; ④ update the quickstart notebook to use `olive optimize`.

### CLI / UX

#### [Quick Fix] `--num-splits` and other parameters globally mix kebab-case and snake_case

- **Location**:
  - `olive/cli/auto_opt.py:143-170` (kebab: `--num-splits`, `--cost-model`, etc.)
  - `olive/cli/optimize.py:100-184` (snake: `--num_split`, `--dim_param`, etc.; synonym parameters named differently from `auto_opt.py`)
  - `olive/cli/run.py:25-41` (mixed in same file: `--run-config`, `--package-config` kebab; `--list_required_packages` snake)
  - `olive/cli/base.py:649-667` (`add_shared_cache_options` uses `--account_name`/`--container_name`) vs `olive/cli/shared_cache.py:37-44` (`--account`/`--container`)
- **Current State**: The same concept uses different separators across different commands and even within the same file; Azure credential parameter names differ between two implementations — an observable bug.
- **Impact**: Users switching between commands must consult help every time; scripts are hard to reuse; Azure credential confusion is a real, observable bug.
- **Recommended Action**: Standardize on snake_case (consistent with the majority of existing CLI); use argparse `add_argument(..., dest=...)` to retain kebab-case aliases for backward compatibility, printing a `DeprecationWarning` during the transition period. Standardize Azure credentials as `--account_name` / `--container_name`.

#### [Quick Fix] `type=bool` in `capture-onnx-graph` does not work as expected

- **Location**: `olive/cli/capture_onnx.py:152-163` (`--exclude_embeds`, `--exclude_lm_head`, `--enable_cuda_graph`, all `type=bool, default=False`)
- **Current State**: argparse's `type=bool` evaluates any non-empty string as `True` (`bool("false") == True`), so `--exclude_embeds false` still yields `True`.
- **Impact**: Users believe they are disabling a feature when in fact it is being enabled — silently incorrect behavior.
- **Recommended Action**: Change to `action="store_true"`, or define an explicit `def str2bool(v): ...` that accepts `true/false/0/1`.

#### [Quick Fix] CLI `ValueError` thrown directly causes users to see tracebacks

- **Location**: `olive/cli/base.py:217-218, 375`, `olive/cli/auto_opt.py:248-249, 300`, `olive/cli/optimize.py:260-290`, `olive/cli/capture_onnx.py:237-241`, `olive/cli/run_pass.py:134`
- **Current State**: Argument validation errors raise `ValueError`, triggering a full Python traceback and exit(1); this is mixed with argparse's `parser.error()` (exit 2).
- **Impact**: Poor user experience; scripts cannot distinguish argument errors from runtime errors via exit code.
- **Recommended Action**: Standardize argument validation to use `parser.error("...")`; retain `ValueError` only for genuine runtime errors.

#### [Quick Fix] Mixed `print()` and `logger.*` in CLI; multiple debug prints pollute stdout

- **Location**: 40+ `print` calls throughout `olive/cli/`, most notably:
  - `olive/cli/quantize.py:145` `print(f"pass list: {pass_list}")`
  - `olive/cli/quantize.py:184` `print(f"selected pass configs: {passes_dict}")`
  - `olive/cli/quantize.py:129-132` uses `print` for warnings
  - `olive/cli/run_pass.py:198-200` uses `print` for errors then calls exit(0)
  - `olive/cli/base.py:107-125` mixes `print` and `logger` in the same function
- **Current State**: Debug-level output cannot be controlled via `--log_level`; error output goes to stdout instead of stderr; hard to parse in pipelines.
- **Recommended Action**: Standardize: use `print` for user-facing success/progress/result paths; use `logger.*` (going to stderr) for diagnostics/warnings/errors; remove obvious debug prints; the error path in `run_pass.py:198-200` should call `sys.exit(1)`.

### Dependencies / Packaging

#### [Quick Fix] Add upper bounds to critical dependencies in `requirements.txt`

- **Location**: `requirements.txt:1-14` and `test/requirements-test.txt:18,24,40`, `test/requirements-test-gpu.txt:4`, `test/requirements-test-openvino.txt:6,7`
- **Current State**: Production `requirements.txt` lists `numpy`, `torch`, `transformers`, `onnx`, `onnxscript` without upper bounds; tests already document known breaking changes at `onnx<1.21.0` (symlink issue), `torch<2.11.0`, `transformers<5.4.0`, `onnxscript<0.6.1`.
- **Impact**: Any upstream major release may cause a fresh `pip install olive-ai` to fail immediately; poor reproducibility in user environments.
- **Recommended Action**: Sync the test-documented upper bounds into production `requirements.txt` (e.g., `numpy<2`, `onnx<1.21.0`, `torch<2.11.0`, `transformers<5.4.0`, `onnxscript>=0.5.3,<0.6.1`); explicitly raise bounds via dependabot/manual PR in each subsequent release.

#### [Quick Fix] Fix the inconsistency between `ruff target-version` and `python_requires`

- **Location**: `pyproject.toml:84` (`target-version = "py39"`) vs `setup.py:77` (`python_requires=">=3.10"`); and comments at `pyproject.toml:135, 176`
- **Current State**: ruff still checks against py39, missing lint rules specific to 3.10+ syntax (e.g., stricter `B905` patterns, PEP 604 union types).
- **Impact**: Lint false negatives; certain anti-patterns may slip through.
- **Recommended Action**: Change to `target-version = "py310"`; remove legacy comments advising to "avoid" 3.10 constructs and re-audit the `ignore` list.

#### [Quick Fix] Remove or update the uninstallable `tf` extra (`tensorflow==1.15.0`)

- **Location**: `extra_dependencies.tf` in `olive/olive_config.json`, `olive/model/handler/tensorflow.py:31` (`load_model` directly raises `NotImplementedError`), `olive/cli/configure_qualcomm_sdk.py:21`, `olive/platform_sdk/qualcomm/configure/configure.py:84`
- **Current State**: TF 1.15.0 does not support Python 3.10+; it cannot be installed under the current `python_requires`; `TensorFlowModelHandler` itself is not implemented.
- **Impact**: `pip install olive-ai[tf]` will always fail, signaling to users that "this extra is dead."
- **Recommended Action**: Remove the `tf` extra, or update it to modern TF (only if actually needed); synchronously delete `TensorFlowModelHandler` or mark it deprecated.

### CI/CD

#### [Quick Fix] All GitHub Actions references use unpinned SHAs (supply-chain risk)

- **Location**: 10+ mutable tags across 4 workflows: `lint.yml:19, 21, 32, 73`, `codeql.yml:51, 66, 80, 85`, `docs.yml:30`, `test-model-fast.yml:21`
- **Current State**: All actions use mutable tags (`@v3` / `@v4` / `@v5`), including two third-party actions (`reviewdog/action-misspell@v1`, `advanced-security/dismiss-alerts@v1`).
- **Impact**: If a tag is hijacked or an account is compromised, arbitrary code can be executed on the runner — a standard GitHub Actions supply-chain attack vector.
- **Recommended Action**: Pin all actions to SHA (e.g., `uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2`); add `.github/dependabot.yml` to enable automatic updates for github-actions and pip.

#### [Refactor] Enable pip caching and dependency security scanning

- **Location**: `lint.yml:49-53`, `test-model-fast.yml:23-27`, `docs.yml:32-36` (all missing `cache: pip`); no `pip-audit`/`safety` in `.github/workflows/`
- **Current State**: Every CI run re-downloads ML dependencies (5–15 minutes); no dependency vulnerability scanning; CodeQL only scans source code.
- **Recommended Action**: Add `cache: "pip"` and `cache-dependency-path: ...` to all `setup-python@v5` steps; add `pip-audit -r requirements.txt` to the scheduled trigger in `codeql.yml`.

### Performance

#### [Refactor] Passes / search points are strictly serial — no multi-core utilization in search mode

- **Location**: `olive/engine/engine.py:620-631` (`_run_passes`), `olive/engine/engine.py:409-451` (`_run_search`), `olive/search/search_strategy.py:210-257` (synchronous generator)
- **Current State**: All passes within a single workflow are executed serially; in search mode each search point is evaluated completely serially, waiting for `record_feedback_signal` before generating the next.
- **Impact**: Workflows with 5+ passes × hundreds of search points have end-to-end time that is purely additive; multi-core CPUs sit idle.
- **Recommended Action**: In search mode, introduce `concurrent.futures.ProcessPoolExecutor` for batch evaluation of independent search points at the same level; introduce an async feedback mechanism so the sampler can submit batches simultaneously.

#### [Refactor] `PythonEnvironmentSystem` spawns a new subprocess for every pass / evaluation

- **Location**: `olive/systems/python_environment/python_environment_system.py:61-93`
- **Current State**: Every `run_pass` / `evaluate_model` creates a temporary directory, writes JSON, calls `subprocess.run` to start a new Python interpreter, reads JSON, and deletes the directory.
- **Impact**: In search mode with hundreds of iterations, each incurring 200ms–1s process startup overhead; cumulative JSON I/O is significant.
- **Recommended Action**: Implement a persistent worker process (e.g., using `multiprocessing.Process` + Queue or ZeroMQ) that communicates via stdin/stdout/IPC and reuses the interpreter across iterations.

---

## Medium-Priority Recommendations

### Architecture / Code Quality

- **[Quick Fix] `cache.py:216` `pass_name: int` type annotation is incorrect**: The call site (`engine.py:739`) actually passes a `str`; change to `pass_name: str`.
- **[Quick Fix] `Engine.__init__` negation-style parameter naming is inconsistent**: `engine.py:59,82` has `no_artifacts: bool=False` immediately assigned to internal `skip_saving_artifacts`; change to positive naming `save_artifacts: bool=True` or align the parameter name with the attribute name.
- **[Refactor] `olive/evaluator/olive_evaluator.py` is a 2059-line god module**: Contains 10 classes (`OliveEvaluator`, per-backend evaluators, `OliveEvaluatorConfig`); split by backend into `base.py`, `onnx.py`, `pytorch.py`, `openvino.py`, `qnn.py`, `lm.py`, `mteb.py`, with `__init__.py` maintaining compatible exports.
- **[Refactor] `olive/engine/engine.py` is 844 lines with 4 TODOs**: Mixes scheduling, caching, search expansion, evaluation, and packaging responsibilities; extract `Scheduler` / `EvaluationRunner` following the existing `FootprintNode`/`WorkflowOutput` approach.
- **[Refactor] `olive/common/utils.py` is a 721-line / 38-function grab-bag**: Mixes hashing, subprocess, torch tensor, file I/O, and HF integration; split by responsibility into `hashing.py`/`io_utils.py`/`torch_utils.py`/`dict_utils.py`; move `hf_repo_exists` to `olive/common/hf/`.
- **[Refactor] `olive/cli/base.py` is 768 lines mixing ABC with utility functions**: Extract `add_input_model_options`, `add_hf_test_model_config`, `get_input_model_config`, and similar into `olive/cli/utils.py`/`olive/cli/model_options.py`; keep only `BaseOliveCLICommand` in `base.py`.
- **[Quick Fix] `olive/passes/onnx/common.py` is depended on across backends**: `olive/passes/openvino/quantization.py:17` and `olive/passes/quark_quantizer/quark_quantization.py:21` import `model_proto_to_file` / `model_proto_to_olive_model` / `get_external_data_config`; move backend-agnostic `get_external_data_config` to `passes/common/`; keep ONNX-specific utilities in `passes/onnx/common.py`.
- **[Quick Fix] PyTorch backend pass names lack a consistent prefix**: `olive/passes/pytorch/` contains `Gptq`, `Rtn`, `LoRA`, `AutoAWQQuantizer`, `GptqQuantizer` with inconsistent naming; uniformly add a `Pytorch` prefix (matching the ONNX backend convention) or add a lint check in `Pass.__init_subclass__`.
- **[Refactor] `olive.search` depends on `MetricResult`**: `olive/search/search_results.py:12`, `olive/search/search_strategy.py:23`, `olive/search/samplers/search_sampler.py:14` (though `TYPE_CHECKING`); define `ScoreMap = dict[str, float]` or a Protocol; have `Engine` convert `MetricResult` before passing it in.
- **[Quick Fix] `olive.platform_sdk` has a reverse dependency on `olive.cli`**: `olive/platform_sdk/qualcomm/configure/__main__.py:9` calls `from olive.cli.launcher import legacy_call`; inline the needed logic or define an independent `run_configure()` in `platform_sdk`.
- **[Quick Fix] Adapter handling code is duplicated in 3 places**: `olive/passes/pytorch/autoawq.py:131-138`, `autogptq.py:119-131`, `gptqmodel.py:119-128` are nearly identical; extract `detach_adapter(model, backend_name) -> tuple[ModelHandler, Optional[str]]` into `olive/passes/pytorch/common.py`.
- **[Refactor] `HfModelHandler` has a 5-level MRO plus dead `composite_parent` code**: `olive/model/handler/hf.py:29`, `olive/model/handler/pytorch.py:25` already carry `# pylint: disable=too-many-ancestors`; `olive/model/handler/base.py:45` `composite_parent` has no consumers anywhere in the codebase. Change `DummyInputsMixin` and similar to composition (`handler.dummy_input_provider: DummyInputProvider`); delete `composite_parent`.

### Testing

- **[Refactor] `test_python_environment_system.py` creates a real venv and runs `pip install` inside a fixture**: `test/systems/python_environment/test_python_environment_system.py:34-52`; mock `run_subprocess` or move to `test/systems/integration/` and add `@pytest.mark.integration`.
- **[Refactor] `test_extract_adapters.py` module-scope fixture runs an end-to-end pipeline**: `test/passes/onnx/test_extract_adapters.py:27-65` includes HF download + PEFT + ONNX export; replace with a local synthetic model, or mark as `@pytest.mark.integration`.
- **[Quick Fix] `pyproject.toml` registers only 1 marker and does not enable strict mode**: Currently only `openvino`; add `gpu`, `cuda`, `network`, `qnn`, `qairt`, `integration`, and enable `--strict-markers`.
- **[Refactor] `olive/telemetry/` and `olive/platform_sdk/qualcomm/` have zero test coverage**: Telemetry covers a user-privacy-critical path; add `test/telemetry/test_telemetry.py` with at minimum disable/enable toggle coverage and "no send when disabled"; remove the empty `test/snpe/` shell or add a README, and add mock-SDK unit tests in `test/passes/qairt/` or a new `test/platform_sdk/`.
- **[Refactor] `test/conftest.py:15-30` autouse session fixture slows startup**: Imports `transformers` and `datasets`, modifies `transformers.utils.hub.TRANSFORMERS_CACHE` globally; change to non-autouse explicit dependency, or configure via the `TRANSFORMERS_CACHE` environment variable in CI.
- **[Refactor] `requirements-test.txt` mixes in platform-specific heavy dependencies**: `cupy-cuda12x`, `nncf>=2.19.0`, `openvino>=2025.4.1`, `neural-compressor<2.4` in the base file cause install failures on CPU/AMD/Apple Silicon; split into a minimal base set plus `requirements-test-{gpu,openvino,...}.txt`.

### Documentation

- **[Quick Fix] `cli-auto-opt.md` and `reference/cli.rst:91-92` do not note the deprecation**: (Already merged with the CLI high-priority item.)
- **[Quick Fix] `how-to/index.md:8` mentions `onnx-graph-capture` but no corresponding how-to page exists**: Add `cli-capture-onnx-graph.md` or remove the mention from the index.
- **[Quick Fix] `NEWS.md` was last updated in Feb 2025**: README contains Aug/Oct 2025 entries; backfill v0.8.0, v0.9.0, v0.9.2 release entries for Mar–Oct 2025.
- **[Refactor] Improve docstring coverage across core modules**: `engine.py:49` (`Engine.__init__`), `engine.py:241` (`run_accelerator`), `pass_config.py:98` (`BasePassConfig`), `onnx.py:45` (`OnnxModelHandler.__init__`) all lack docstrings; prioritize public abstract methods.
- **[Quick Fix] CONTRIBUTING.md / AGENTS.md lack docs build instructions**: Add `cd docs && pip install -r requirements.txt && make html`.
- **[Quick Fix] `docs.yml:41-42` inlines `onnxruntime`/`torch` without pinned versions**: Move to `docs/requirements.txt` with pinned versions.

### CLI / UX

- **[Quick Fix] `--log_level` accepts integer 0–4 with no `choices=`**: `olive/cli/base.py:390-398`; change to string-based `["debug","info","warning","error","critical"]` or at minimum add `choices=range(5)`; also introduce `-v`/`-q` Boolean shortcut flags (consistent with git/pip/uv).
- **[Quick Fix] `run` command lacks `--dry_run`**: `olive/cli/run.py` does not call `add_save_config_file_options`; inconsistent experience compared to `optimize` and other commands.
- **[Quick Fix] `quantize --algorithm` help text is semantically incorrect**: `olive/cli/quantize.py:65-66` says "List of", but the parameter accepts only a single value; change to "The quantization algorithm to use."
- **[Quick Fix] `tune-session-params` help references non-existent parameters**: `olive/cli/session_params_tuning.py:30-36` mentions `--model` / `--hf_model_name`, but the actual parameter is `--model_name_or_path`.
- **[Quick Fix] `--disable_force_evaluate_other_eps` is a double negative**: `olive/cli/session_params_tuning.py:78-84`; change to `--evaluate_all_eps` (positive) with inverted default.
- **[Refactor] `olive --help` has no grouping**: `olive/cli/launcher.py:37-59` lists 19 subcommands flat, with `auto-opt` registered before `optimize`; group by "core workflows / model conversion / adapter management / utilities", with deprecated commands listed last.
- **[Quick Fix] `@action` decorator placement is inconsistent**: `olive/cli/run_pass.py:22-23` decorates the class, while other commands decorate the `run()` method (e.g., `finetune.py:82`); standardize to `run()`.
- **[Quick Fix] `diffusion_lora.py` bypasses `add_input_model_options`**: `olive/cli/diffusion_lora.py:34-47` defines custom `-m/--model_name_or_path`, missing `--trust_remote_code` and `--test`; use the shared helper.
- **[Quick Fix] `--memory` help does not mention supported string formats**: `olive/cli/base.py:704-709` says "in bytes", but `type=AcceleratorSpec.str_to_int_memory` accepts `"4GB"`; add an example.

### Dependencies / Packaging

- **[Quick Fix] `setup.py:84` `package_data` references non-existent package `olive.auto_optimizer`**: Dead entry; remove or confirm the package exists.
- **[Quick Fix] `onnxscript` upper and lower bounds conflict**: Production `requirements.txt:4` `>=0.5.3` with no upper bound; tests `test/requirements-test.txt:24` `<0.6.1`; unify by tightening.
- **[Quick Fix] `openvino` extras and test versions are misaligned**: `olive_config.json` specifies `openvino>=2025.4.1`, `nncf>=2.19.0`; `test/requirements-test-openvino.txt:6,7` specifies `openvino>=2025.3.0`, `nncf>=2.18.0`; align them.
- **[Quick Fix] Missing aggregate extras (`all` / `dev`)**: Makes one-command CI and developer installs inconvenient.
- **[Refactor] `finetune`/`bnb`/`lora` extras overlap**: `finetune ≈ bnb + lora + optimum` is not expressed using `olive-ai[bnb,lora]` syntax; high maintenance cost to keep in sync.

### CI/CD

- **[Quick Fix] `docs.yml` is missing a `permissions:` declaration**: Add explicit `permissions: contents: read`; also add `permissions: {}` at the top level of each workflow as a default, elevating at the job level as needed.
- **[Quick Fix] `optional-lint` job has excess `security-events: write`**: `.github/workflows/lint.yml:16-17`; that job does not upload SARIF; remove under the principle of least privilege.
- **[Quick Fix] `lint.yml`/`test-model-fast.yml`/`docs.yml` do not trigger on `rel-*`**: Release candidate branches are not running tests or building docs; explicitly add `rel-*`.
- **[Refactor] Introduce a platform/Python version matrix**: Currently all workflows run on `ubuntu-latest` + Python 3.12; add `windows-latest` and `python-version: ["3.10","3.12"]` (at minimum in lint and smoke tests).
- **[Refactor] Introduce `pip-audit` and mypy/pyright static type checking**: Add `pip-audit -r requirements.txt` to the scheduled trigger in `codeql.yml`; incrementally enable `mypy --ignore-missing-imports olive/` in lint.
- **[Quick Fix] pre-commit and CI lint check sets are inconsistent**: `PYLINT`, `TOML-SORT`, `EDITORCONFIG-CHECKER`, `NOQA` only in CI; `check-yaml`, `absolufy-imports` only in pre-commit; unify by having pre-commit invoke `lintrunner`.

### Performance

- **[Refactor] `engine.py:306, 398` unnecessary `model_dump()` + `validate_config()` round-trip in search hot path**: Use `pass_config.model_copy()` or pass by reference directly.
- **[Refactor] `footprint.py:91-102, 230-254` every `record()` call triggers a full `_resolve_metrics()` — O(n²)**: Change to lazy evaluation (call only before `create_pareto_frontier`) or incremental computation.
- **[Refactor] `footprint.py:270-303` Pareto frontier uses a double loop — O(n²)**: Noticeable at larger search spaces; replace with a divide-and-conquer Pareto algorithm or sorting-based approach.
- **[Refactor] `DockerSystem.run_workflow` destroys the container on every run**: `olive/systems/docker/docker_system.py:108-142`; support container reuse.
- **[Quick Fix] `local.py:46` `ModelConfig.from_json(output_model.to_json())` is an unnecessary round-trip**: Return `output_model.to_model_config()` directly.
- **[Refactor] `onnx.load(self.model_path)` loads large models fully into memory**: `olive/model/handler/onnx.py:110-111`; in large LLM scenarios every pass does a full load; prefer `load_external_data=False` in lightweight inspection paths.
- **[Refactor] Multiple `onnx.load` calls on the same file inside quantization passes**: `olive/passes/onnx/quantization.py:265, 477, 510, 540`; pass `ModelProto` within the pass instead of repeatedly re-reading from disk.
- **[Quick Fix] `cache.py:185, 227, 263` JSON `indent=4` increases I/O**: Use `indent=None` in production mode.
- **[Quick Fix] `cache.py:290` cache_id is only 8 hex characters**: 32-bit space has collision risk in shared caches; extend to 16 characters or add a pass prefix.

---

## Low-Priority Recommendations

### Code Quality / Style

- **`pyproject.toml` globally disables all complexity rules in pylint**: `too-many-arguments / branches / locals / nested-blocks / return-statements / statements`, `broad-exception-caught`, `unused-argument` are all globally disabled; exempt per module at most, not globally, to catch newly introduced violations. The same applies to ruff `ignore`: `D100-D107` fully off, `RUF013` off due to a "Python 3.10" comment (contradicting target=py39), `PLW0603`/`B028` marked "temp" but now permanent.
- **`olive/passes/onnx/conversion.py:41` file-level `# pylint: disable=W0212`**: Scope is too broad; change to inline suppression.
- **`olive/common/utils.py:76` internal variable `md5_hash` in `hash_string` but the algorithm is SHA-256**: Misleading; rename to `sha256_hash` or `hasher` (same issue in `hash_io_stream`/`hash_file`/`hash_dict`).
- **100+ untracked TODO/FIXME items**: `olive/engine/engine.py:560`, `olive/passes/pytorch/lora.py:360` (functional bug `num_items_in_batch`), `olive/passes/onnx/graph_surgeries.py:170,184` ("incorrect"), three similar adapter issues in `olive/passes/pytorch/*`; enable ruff `TD003` to require each TODO to link to an issue.
- **`olive/engine/engine.py:462-485` `_dump_run_history` temporarily mutates global verbosity**: Anti-pattern (thread-unsafe); control via module logger level or a parameter instead.
- **Return type annotation coverage across passes/ is ~33%**: Annotate public and abstract methods with `->`.
- **Multiple `_run_for_config` implementations exceed 200 lines**: `diffusers/lora.py:231 _train_sd` (356), `_train_flux` (316), `onnx/mnb_to_qdq.py:89` (271), `openvino/optimum_intel.py:349` (246), `onnx/split.py:42` (211), `onnx/model_builder.py:210` (208), `pytorch/selective_mixed_precision.py:394 compute_module_stats` (207); extract `_prepare_*`/`_post_process_*` private methods.
- **`olive/data/config.py:24, 44` `DataConfig` has 4 fixed component fields**: Adding a new data component type requires changing the class; follow `pass_config.py` and change to `components: dict[str, DataComponentConfig]`.

### Testing

- **Engine / Workflow test names do not follow AGENTS.md convention**: `test/engine/test_engine.py:46,76,93,112`, `test/workflows/test_workflow_run.py` use `test_register`, `test_run`, etc., lacking "expected behavior"; rename to `test_<func>_<expected>[_when_<cond>]`.
- **Key subdirectories lack conftest files**: `test/engine/`, `test/search/`, `test/model/`, `test/systems/`; convert factory functions in `test/utils.py` to conftest fixtures.
- **`test_aimet_quantization.py` uses `skipif(CUDA_AVAILABLE)` to exclude GPU environments**: `test/passes/onnx/test_aimet_quantization.py:144,197,223,269`; use a `TEST_DEVICE` environment variable or a dedicated marker instead.
- **`test/assets/` contains only `user_script.py`**: No shared ONNX files or golden outputs; commit common dummy ONNX files as static assets to reduce runtime generation.
- **`olive/exception.py` / `olive/logging.py` have no tests**: Add `test/test_exception.py` / `test/test_logging.py`.

### Documentation

- **Minor README / getting-started issues**: `README.md:98` "the the"; `README.md:71-79` instructions mention Llama-3.2-1B-Instruct but the actual command uses Qwen2.5-0.5B-Instruct; `docs/source/getting-started/getting-started.md:55` "60secs" is a significant underestimate.
- **`notebooks/README.md:3,6` typos**: "ocmmand" → "command", "hot wo" → "how to".
- **`docs/architecture.md` is not in the Sphinx toctree**: Users cannot find it from the documentation site; move into `docs/source/` or add to the toctree.
- **`docs/source/conf.py:65` `suppress_warnings = ["myst.xref_missing"]`**: Suppresses missing internal cross-reference warnings, potentially masking broken links; at least keep a version with this enabled in CI.
- **`docs/source/index.md:30` `features/index` links to `.rst` while others use `.md`**: Style inconsistency.
- **No Python API example notebooks**: All existing notebooks are CLI-based.

### CLI / Exit Codes

- **Exit codes are not standardized**: Argument errors use `raise ValueError` (exit 1) and `parser.error()` (exit 2) inconsistently; scripts cannot distinguish them.
- **`run_pass.py:183-200` `_list_passes` has no `--json` format**: Inconvenient for tool integration.
- **Internal single-letter pass keys leak into `--save_config_file` output**: `capture_onnx.py:340-347` (`"c"/"m"/"b"/"f"`), `generate_adapter.py:89` (`"e"`), `finetune.py:159` (`"f"`); change to readable key names.
- **`--modality choices=["text"]` has only one option**: `olive/cli/optimize.py:155-163`; hide or mark as internal.
- **`session_params_tuning.py:149` prompts to "set log_level to 1" but 1 is INFO; debugging requires DEBUG (0)**.

### Dependencies / Packaging

- **`pyproject.toml` lacks `[build-system]` and `[project]`**: Not PEP 517/518 compliant; inconvenient for dependabot/`pip inspect`.
- **`triton` is a hard dependency in `bnb`/`finetune` extras**: Linux + CUDA only; will fail in non-GPU environments.
- **`hf-xet` is a core dependency**: It is a performance acceleration package, not a functional dependency; consider moving to an optional extra.
- **`MANIFEST.in` does not explicitly include olive data files**: sdist may miss data files (wheel is fine via `package_data`).
- **`setup.py:47-48` is missing macOS classifier**: Olive is pure Python and can run on macOS.
- **`setup.py:40` "3 - Alpha"**: Consider advancing to Beta.
- **`test/requirements-test.txt:14` `neural-compressor<2.4` has no explanatory comment**.

### CI/CD / Performance / Other

- **`.github/workflows/test-model-fast.yml` does not upload failure artifacts**: Logs are destroyed when the runner exits; add `actions/upload-artifact@v4` with `if: failure()`.
- **`lint.yml:72` SARIF upload has `continue-on-error: true`**: Silent failure may silently drop security scan results.
- **`docs.yml:43` `make linkcheck` may block PRs due to unavailable external links**: Consider running only on schedule trigger or adding `--ignore` patterns.
- **`quantization.py:306-337` `deepcopy` on a module-level config dict**: Minor overhead; can be changed to a shallow copy.
- **`cache.py:795-806` evaluation cache is keyed only on model_id, without an evaluator configuration hash**: If a user changes the evaluation dataset but the pass configuration is unchanged, the cache does not invalidate; introduce an evaluator hash.
- **`search/search_results.py:36` uses `None`-padded sparse list**: Change to `dict[int, tuple]`.
- **`optuna_sampler.py:73-78` `while True` loop skips duplicate points**: May produce many wasted trials when the search space is nearly exhausted.
- **`OnnxQuantization._initialize()` depends on destructor to clean up temporary directory**: `quantization.py:289`; exception paths may leak; change to a contextmanager or try/finally.
- **`olive/passes/openvino/optimum_intel.py:275` silently `except ImportError: pass`**: Users see an `AttributeError` instead of a "missing dependency" message.
- **`telemetry_extensions.py:57, 126` `@action` uses `inspect.stack()`**: Currently only decorates `Engine.run()` so risk is limited; future expansion will require session-level caching.

---

## Appendix: Per-Dimension Investigation Summaries

### Architecture (findings-arch.md)
Olive has clear inheritance contracts and a Pydantic v2 configuration foundation across the Pass / Engine / System / Evaluator / Search / Model layers — a reasonable overall structure for a mid-to-large project. The main architectural debt concentrates in reverse dependencies (`systems → workflows`, `cache → model/passes`, `passes → evaluator`, `platform_sdk → cli`, `model/handler/mixin → cache`) and god modules (`olive_evaluator.py` at 2059 lines, `engine.py` at 844 lines, `cli/base.py` at 768 lines, `common/utils.py` at 721 lines). The `HfModelHandler` 5-level MRO and dead `composite_parent` code need cleanup. Full 14 findings in `findings-arch.md`.

### Code Quality (findings-quality.md)
Key metrics: largest single function is 1207 lines (`get_had172`), 68 bare `except Exception` catches, 100+ TODO/FIXME items, pylint globally disabling 37 complexity rules. The most critical issues are critical failures silently swallowed in `cache.py`, missing `exc_info` in `_run_search` exception handling, and the matrix data embedded as code in `hadamard_utils.py`. The CLI layer mixes `print`/`logger`, and the "incorrect" surgeons in `graph_surgeries.py` remain production-accessible. Full 17 findings in `findings-quality.md`.

### Testing (findings-tests.md)
Of 184 test files, **only 1 runs in CI**; `test_slicegpt.py` has an `or True` permanent skip; `test_python_environment_system.py` and `test_extract_adapters.py` are end-to-end integration tests disguised as unit tests; 3 locations perform HF downloads with no skip guard; `telemetry` / `platform_sdk` have zero tests; `pyproject.toml` registers only the `openvino` marker. Engine/Workflow test method names do not follow AGENTS.md convention. Full 15 findings in `findings-tests.md`.

### Documentation (findings-docs.md)
README + Sphinx main structure is good (CLI docs strongly synchronized with code via `sphinxarg.ext`), but there are 3 P0 issues: `conf.py` version 0.12.1 vs actual 0.11.0.dev0; `architecture.md` Pass base class API has changed (`default_config` → `_default_config`) without being updated; `auto-opt` deprecated status is not communicated at any of three entry points (runtime warning filtered by default log_level, `cli-auto-opt.md`, `reference/cli.rst`). Docstring coverage is 22–47%. Full list in `findings-docs.md`.

### CLI (findings-cli.md)
The biggest issue is inconsistent parameter naming style (kebab-case and snake_case mixed across different commands and even within the same file, with the Azure credential parameter names `--account_name` vs `--account` being a real observable bug); `type=bool` in `capture-onnx-graph` does not work as expected; multiple `raise ValueError` calls expose raw tracebacks; debug `print` calls in `quantize.py` pollute stdout; `--log_level` uses integers without a `--verbose/--quiet` shortcut; `olive --help` lists 19 subcommands flat with no grouping. Full 10 sections in `findings-cli.md`.

### Dependencies / Packaging (findings-deps.md)
H1: ruff `target-version="py39"` contradicts `python_requires=">=3.10"`; H2: `numpy/torch/transformers/onnx` have no production upper bounds but tests document known breaking changes; H3: `tf` extra `tensorflow==1.15.0` cannot be installed on Python 3.10+. Medium-priority issues: `onnxscript` bounds conflict, `openvino` extras/test version mismatch, dead `olive.auto_optimizer` entry in `setup.py` `package_data`, missing aggregate `all` extra. Full 11 sections in `findings-deps.md`.

### CI/CD (findings-ci.md)
4 workflows (lint/codeql/docs/test-model-fast). Core issues: unit tests are almost entirely absent from CI (183/184 not running); all GitHub Actions references use unpinned mutable tags (supply-chain risk); no Dependabot; no platform/Python version matrix (all ubuntu+3.12); no pip caching; no static type checking or dependency security scanning; pre-commit and CI lint check sets are inconsistent. Full 12 sections in `findings-ci.md`.

### Performance (findings-perf.md)
13 performance anti-patterns, with 3 high-severity: search/pass execution is fully serial with no parallel evaluation, `PythonEnvironmentSystem` spawns a new subprocess per iteration, ONNX models are loaded fully into memory with no streaming. Medium-severity issues include Pydantic hot-path round-trips, O(n²) Pareto frontier, and repeated `onnx.load` calls inside quantization passes. Adding new Passes / Systems / Samplers is low-cost; the extensibility design is sound. Telemetry concurrency design is correct. Full list in `findings-perf.md`.

---

*Report generated: 2026-06-12 · Strictly read-only review. No modifications were made to the Olive repository. All artifacts are stored in the session files/ directory.*


Uh oh!

Codebase review: prioritized recommendations across architecture, testing, CI, deps, CLI, docs & performance #2516

Description

Microsoft Olive — Read-Only Codebase Review

Executive Summary

High-Priority Recommendations

Architecture

[Quick Fix] Disable or fix Surgeons marked "incorrect" in graph_surgeries.py

[Refactor] Eliminate the reverse dependency from olive.systems to olive.workflows

[Refactor] Eliminate the reverse dependency from olive/passes to olive/evaluator

[Refactor] Eliminate olive/cache.py reverse dependencies on domain layers

Code Quality

[Quick Fix] Multiple critical failures silently swallowed in olive/cache.py

[Quick Fix] Bare exception catch in _run_search loses critical context

[Refactor] ~90% of the 4144-line hadamard_utils.py is hand-written matrix literals

Testing

[Quick Fix] or True in test_slicegpt.py permanently skips tests and hides real failures

[Refactor] Add a test-unit.yml workflow covering core modules

[Quick Fix] Three locations perform direct HF downloads with no skip guard

Documentation

[Quick Fix] docs/source/conf.py version is inconsistent with the actual package version

[Quick Fix] docs/architecture.md Pass base class API has changed but documentation has not been updated

[Quick Fix] Mark auto-opt as deprecated at all entry points and fix the default log level

CLI / UX

[Quick Fix] --num-splits and other parameters globally mix kebab-case and snake_case

[Quick Fix] type=bool in capture-onnx-graph does not work as expected

[Quick Fix] CLI ValueError thrown directly causes users to see tracebacks

[Quick Fix] Mixed print() and logger.* in CLI; multiple debug prints pollute stdout

Dependencies / Packaging

[Quick Fix] Add upper bounds to critical dependencies in requirements.txt

[Quick Fix] Fix the inconsistency between ruff target-version and python_requires

[Quick Fix] Remove or update the uninstallable tf extra (tensorflow==1.15.0)

CI/CD

[Quick Fix] All GitHub Actions references use unpinned SHAs (supply-chain risk)

[Refactor] Enable pip caching and dependency security scanning

Performance

[Refactor] Passes / search points are strictly serial — no multi-core utilization in search mode

[Refactor] PythonEnvironmentSystem spawns a new subprocess for every pass / evaluation

Medium-Priority Recommendations

Architecture / Code Quality

Testing

Documentation

CLI / UX

Dependencies / Packaging

CI/CD

Performance

Low-Priority Recommendations

Code Quality / Style

Testing

Documentation

CLI / Exit Codes

Dependencies / Packaging

CI/CD / Performance / Other

Appendix: Per-Dimension Investigation Summaries

Architecture (findings-arch.md)

Code Quality (findings-quality.md)

Testing (findings-tests.md)

Documentation (findings-docs.md)

CLI (findings-cli.md)

Dependencies / Packaging (findings-deps.md)

CI/CD (findings-ci.md)

Performance (findings-perf.md)

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

[Quick Fix] Disable or fix Surgeons marked "incorrect" in `graph_surgeries.py`

[Refactor] Eliminate the reverse dependency from `olive.systems` to `olive.workflows`

[Refactor] Eliminate the reverse dependency from `olive/passes` to `olive/evaluator`

[Refactor] Eliminate `olive/cache.py` reverse dependencies on domain layers

[Quick Fix] Multiple critical failures silently swallowed in `olive/cache.py`

[Quick Fix] Bare exception catch in `_run_search` loses critical context

[Refactor] ~90% of the 4144-line `hadamard_utils.py` is hand-written matrix literals

[Quick Fix] `or True` in `test_slicegpt.py` permanently skips tests and hides real failures

[Refactor] Add a `test-unit.yml` workflow covering core modules

[Quick Fix] `docs/source/conf.py` version is inconsistent with the actual package version

[Quick Fix] `docs/architecture.md` Pass base class API has changed but documentation has not been updated

[Quick Fix] Mark `auto-opt` as deprecated at all entry points and fix the default log level

[Quick Fix] `--num-splits` and other parameters globally mix kebab-case and snake_case

[Quick Fix] `type=bool` in `capture-onnx-graph` does not work as expected

[Quick Fix] CLI `ValueError` thrown directly causes users to see tracebacks

[Quick Fix] Mixed `print()` and `logger.*` in CLI; multiple debug prints pollute stdout

[Quick Fix] Add upper bounds to critical dependencies in `requirements.txt`

[Quick Fix] Fix the inconsistency between `ruff target-version` and `python_requires`

[Quick Fix] Remove or update the uninstallable `tf` extra (`tensorflow==1.15.0`)

[Refactor] `PythonEnvironmentSystem` spawns a new subprocess for every pass / evaluation