Add CPU quality regression tests (ResNet50, Qwen3-0.6B, Whisper-tiny.en)#177

Open
roberto-laudani wants to merge 4 commits into iree-org:main from roberto-laudani:add-resnet50-regression

@roberto-laudani roberto-laudani commented Jun 18, 2026

Contributor

Adds end-to-end CPU quality regression tests for microsoft/resnet-50, Qwen/Qwen3-0.6B, and openai/whisper-tiny.en. Each test compiles the committed model.mlir with IREE, fetches the externalized parameters from the Hugging Face Hub (pinned by revision), runs on llvm-cpu, and compares the output logits against the committed reference.
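For readers unfamiliar with this kind of check, the final comparison step could look roughly like the sketch below. It is only an illustration: the function name `logits_match`, the tolerance values, and the sample numbers are hypothetical and are not taken from this PR, whose actual comparison code is not shown here.

```python
import math
from typing import Sequence


def logits_match(
    actual: Sequence[float],
    expected: Sequence[float],
    rtol: float = 1e-4,  # hypothetical relative tolerance
    atol: float = 1e-5,  # hypothetical absolute tolerance
) -> bool:
    """Return True when every logit is within tolerance of the reference."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rtol, abs_tol=atol)
        for a, e in zip(actual, expected)
    )


# Made-up values standing in for a committed reference and two runs.
reference = [0.12, -1.50, 3.25]
close_run = [0.120001, -1.500002, 3.250003]
drifted_run = [0.12, -1.50, 3.40]

print(logits_match(close_run, reference))
print(logits_match(drifted_run, reference))
```

A tolerance-based comparison like this accepts small floating-point differences between runs while still flagging a real numerical drift in the compiled model.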

@roberto-laudani roberto-laudani marked this pull request as draft June 18, 2026 10:01
@AGindinson AGindinson marked this pull request as ready for review June 18, 2026 11:09
@AGindinson AGindinson requested review from Manewing and chrsmcgrr June 18, 2026 11:09
Signed-off-by: Roberto Laudani <laudani@roofline.ai>
@roberto-laudani roberto-laudani force-pushed the add-resnet50-regression branch from 4f4a653 to 928276b Compare June 18, 2026 11:11
@roberto-laudani

Contributor Author

The red "Test Torch Models" check is the pre-existing scheduled_unet_compstat_cpu binary-size failure, which is tracked in #178 and unrelated to this PR. The resnet50 test added here passes.

@roberto-laudani roberto-laudani changed the title Add ResNet50 CPU quality regression test Add CPU quality regression tests (ResNet50, Qwen3-0.6B) Jun 23, 2026
Signed-off-by: Roberto Laudani <laudani@roofline.ai>
@roberto-laudani roberto-laudani changed the title Add CPU quality regression tests (ResNet50, Qwen3-0.6B) Add CPU quality regression tests (ResNet50, Qwen3-0.6B, Whisper-tiny.en) Jun 23, 2026
@roberto-laudani roberto-laudani force-pushed the add-resnet50-regression branch from d9e707d to ffe38ee Compare June 23, 2026 08:25
Signed-off-by: Roberto Laudani <laudani@roofline.ai>
@roberto-laudani roberto-laudani force-pushed the add-resnet50-regression branch from ffe38ee to 9fb9b06 Compare June 23, 2026 08:29
@roberto-laudani roberto-laudani deleted the add-resnet50-regression branch June 23, 2026 09:02
@roberto-laudani roberto-laudani restored the add-resnet50-regression branch June 23, 2026 09:06
@chrsmcgrr chrsmcgrr requested a review from AGindinson June 23, 2026 09:17

@chrsmcgrr chrsmcgrr left a comment

Looks good! Would it also be possible to host the binary files on HF? We want to avoid checking in binary files because they can't be diffed, and changing them later adds a new copy to the history.

args = parser.parse_args()
args.out_dir.mkdir(parents=True, exist_ok=True)

print(f"Loading model {MODEL_ID}@{MODEL_REVISION} (f32)")

Suggested change
- print(f"Loading model {MODEL_ID}@{MODEL_REVISION} (f32)")
+ print(f"Loading model {MODEL_ID}@{MODEL_REVISION} (f32)", file=sys.stderr)

For scripts and binaries, if it's not output and just logging info, use stderr. Stdout is reserved for tool output.
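A minimal sketch of that convention, assuming a small helper is acceptable in the script. The `log` helper, the model identifier, and the `result: ok` line are hypothetical and only illustrate the split between the two streams. Note that `file=sys.stderr` requires `import sys` in the script.

```python
import sys


def log(message: str) -> None:
    """Write a diagnostic message to stderr so stdout stays clean."""
    print(message, file=sys.stderr)


def main() -> None:
    # Progress and status information goes to stderr.
    log("Loading model example/model@abc123 (f32)")  # hypothetical IDs
    # Only the tool's actual result goes to stdout.
    print("result: ok")


if __name__ == "__main__":
    main()
```

With this split, piping the script into another tool passes along only the result line, while the progress messages still show up in the terminal or CI log.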

2 participants