
docs: add scheduling framework developer guide with example plugins#2782

Open
gyliu513 wants to merge 1 commit into kubernetes-sigs:main from gyliu513:scheduler-guide

gyliu513 commented Apr 4, 2026

What type of PR is this?

/kind documentation

What this PR does / why we need it:

Summary

Adds a developer guide and working examples that demonstrate how to extend the
scheduling framework with custom out-of-tree plugins and how to build a custom EPP
binary.

What's included

  • examples/scheduler/ — A standalone demo (make run-example EXAMPLE=scheduler)
    that wires together four custom plugin implementations covering every extension
    point in the scheduling pipeline:
    • Filter: ModelAffinityFilter keeps only endpoints serving the requested model.
    • Scorer: LeastLoadedScorer prefers endpoints with more spare model capacity.
    • Picker: TopKRandomPicker selects randomly from the top-K highest-scored candidates.
    • ProfileHandler: LoggingProfileHandler adds tracing output to the scheduling lifecycle.
    • plugins/register.go — Centralised RegisterAllPlugins() following the same
      explicit-registration pattern as llm-d-inference-scheduler.
  • examples/custom-epp/ — Shows how to build a production EPP binary that
    embeds out-of-tree plugins (make build-example EXAMPLE=custom-epp).
  • examples/README.md — Comprehensive developer guide covering:
    • Scheduling architecture (Filter → Score → Pick pipeline).
    • Step-by-step instructions for writing each plugin type.
    • How to build and deploy a custom EPP (in-tree and out-of-tree patterns).
    • YAML EndpointPickerConfig example referencing custom plugins.
    • Multi-EPP deployment with InferencePool selection diagram.
  • Makefile — Two new targets:
    • make run-example EXAMPLE=<name> — run an example locally.
    • make build-example EXAMPLE=<name> — build an example binary to bin/.

Which issue(s) this PR fixes:

Fixes #856

Does this PR introduce a user-facing change?:

NONE

/cc @ahg-g

@k8s-ci-robot k8s-ci-robot requested a review from ahg-g April 4, 2026 15:22
@k8s-ci-robot k8s-ci-robot added the kind/documentation Categorizes issue or PR as related to documentation. label Apr 4, 2026
netlify bot commented Apr 4, 2026

Deploy Preview for gateway-api-inference-extension ready!

Latest commit: acef1c2
Latest deploy log: https://app.netlify.com/projects/gateway-api-inference-extension/deploys/69d137787897b1000841249f
Deploy Preview: https://deploy-preview-2782--gateway-api-inference-extension.netlify.app

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gyliu513
Once this PR has been reviewed and has the lgtm label, please assign ahg-g for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 4, 2026
@k8s-ci-robot
Contributor

Hi @gyliu513. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 4, 2026

Labels

  • cncf-cla: yes. Indicates the PR's author has signed the CNCF CLA.
  • kind/documentation. Categorizes issue or PR as related to documentation.
  • needs-ok-to-test. Indicates a PR that requires an org member to verify it is safe to test.
  • size/XXL. Denotes a PR that changes 1000+ lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Developer guide for the scheduling framework

2 participants