docs: add scheduling framework developer guide with example plugins #2782
gyliu513 wants to merge 1 commit into kubernetes-sigs:main
What type of PR is this?
/kind documentation
What this PR does / why we need it:
Summary
Adds a developer guide and working examples that demonstrate how to extend the
scheduling framework with custom out-of-tree plugins and how to build a custom EPP
binary.
What's included
- `examples/scheduler/`: A standalone demo (`make run-example EXAMPLE=scheduler`) that wires together four custom plugin implementations covering every extension point in the scheduling pipeline:
  - Filter: `ModelAffinityFilter` keeps only endpoints serving the requested model.
  - Scorer: `LeastLoadedScorer` prefers endpoints with more spare model capacity.
  - Picker: `TopKRandomPicker` selects randomly from the top-K highest-scored candidates.
  - ProfileHandler: `LoggingProfileHandler` adds tracing output to the scheduling lifecycle.
  - `plugins/register.go`: Centralised `RegisterAllPlugins()` following the same explicit-registration pattern as llm-d-inference-scheduler.
- `examples/custom-epp/`: Shows how to build a production EPP binary that embeds out-of-tree plugins (`make build-example EXAMPLE=custom-epp`).
- `examples/README.md`: Comprehensive developer guide covering:
  - `EndpointPickerConfig` example referencing custom plugins.
  - `InferencePool` selection diagram.
- `Makefile`: Two new targets:
  - `make run-example EXAMPLE=<name>`: run an example locally.
  - `make build-example EXAMPLE=<name>`: build an example binary to `bin/`.

Which issue(s) this PR fixes:
Fixes #856
Does this PR introduce a user-facing change?:
/cc @ahg-g