
feat(studio): add vLLM 4-bit export via Auto-Round (AWQ/GPTQ) #4837

Draft
OnePunchMonk wants to merge 4 commits into unslothai:main from OnePunchMonk:feat/vllm-4bit-export

@OnePunchMonk (Contributor) commented Apr 3, 2026

Closes #4761

  • Add save_to_vllm_4bit() in unsloth/save.py using Intel Auto-Round
  • Add ExportVllm4bitRequest Pydantic schema with format/bits/group_size
  • Add export_vllm_4bit() to ExportBackend, Orchestrator, and Worker
  • Add POST /api/export/export/vllm4bit route
  • Add exportVllm4bit() frontend API client
  • Add vllm4bit export method + VLLM_QUANT_OPTIONS constants
  • Add 4-bit format picker UI in export-page.tsx
  • Fix missing ExportVllm4bitRequest import in routes/export.py
  • Fix missing VLLM_QUANT_OPTIONS import in export-page.tsx
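
For readers who haven't opened the diff, here is a rough sketch of what a request carrying format/bits/group_size might validate. This is not the PR's code: the PR describes a Pydantic schema, while this sketch swaps in a stdlib dataclass so it runs without dependencies. The class name `Vllm4bitExportOptions`, the helper `to_payload`, the allowed formats, and all defaults are assumptions made for illustration; only the three field names come from the description above.

```python
from dataclasses import dataclass

# Hypothetical allowed values; the PR's VLLM_QUANT_OPTIONS may differ.
ALLOWED_FORMATS = ("awq", "gptq")


@dataclass(frozen=True)
class Vllm4bitExportOptions:
    """Illustrative stand-in for ExportVllm4bitRequest (not the PR's schema)."""

    format: str = "awq"    # assumed default
    bits: int = 4          # assumed default
    group_size: int = 128  # assumed default

    def __post_init__(self) -> None:
        # Reject anything outside the formats named in the PR title.
        if self.format not in ALLOWED_FORMATS:
            raise ValueError(
                f"format must be one of {ALLOWED_FORMATS}, got {self.format!r}"
            )
        # This sketch only models the 4-bit case the PR is about.
        if self.bits != 4:
            raise ValueError("this sketch only models 4-bit export")
        if self.group_size <= 0:
            raise ValueError("group_size must be a positive integer")


def to_payload(options: Vllm4bitExportOptions) -> dict:
    """Build the JSON body a client might POST to the export route."""
    return {
        "format": options.format,
        "bits": options.bits,
        "group_size": options.group_size,
    }
```

Validating at construction time means a bad format fails before any quantization work starts, which is the same reason a Pydantic schema would sit in front of the route.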

@OnePunchMonk marked this pull request as ready for review April 4, 2026 11:18

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dcf58747b6

  import type { TrainingMethod } from "@/types/training";

- export type ExportMethod = "merged" | "lora" | "gguf";
+ export type ExportMethod = "merged" | "lora" | "gguf" | "vllm4bit";
Collaborator

Q: Isn't this a subset of the "merged" case?

@rolandtannous marked this pull request as draft April 6, 2026 08:26
@rolandtannous (Collaborator) commented

@OnePunchMonk please rename methods, classes, etc. to be more Auto-Round specific: for example, ExportAutoRound4bit instead of ExportVllm4bit, and so on.

@OnePunchMonk force-pushed the feat/vllm-4bit-export branch from dcf5874 to c6997bc April 7, 2026 15:49

Development

Successfully merging this pull request may close these issues.

[Feature]Request: 4-bit Quantized Export (AWQ/GPTQ) for vLLM in Unsloth Studio

3 participants