feat(provider): add read-only Kubernetes provider by bussyjd · Pull Request #102 · 0xff-ai/omnifs

bussyjd · 2026-06-09T16:35:20Z

Summary

Adds omnifs-provider-kubernetes: a read-only WASM provider that projects a
Kubernetes cluster as a browsable filesystem. Resource types — including CRDs —
are discovered live from the API server, so the tree reflects whatever the
cluster actually serves, at the versions it serves.

/namespaces/<ns>/<type>/<name>/{manifest.yaml,manifest.json,status.yaml,events.txt}
/namespaces/<ns>/pods/<name>/logs/<container>.log
/cluster/<type>/<name>/{manifest.yaml,manifest.json,status.yaml}

Standard tooling works on every leaf: cat, grep -r, find, diff, tar,
tail.
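
The layout above maps onto the standard Kubernetes REST paths: the core group is served under /api/<version>, named groups under /apis/<group>/<version>. The sketch below illustrates that mapping only. `ResourceType` and `api_path` are hypothetical names chosen for this example, not the provider's actual types.

```rust
/// Hypothetical description of a discovered resource type
/// (illustrative only; not the provider's real type).
pub struct ResourceType<'a> {
    /// API group; the empty string denotes the core group.
    pub group: &'a str,
    pub version: &'a str,
    pub plural: &'a str,
    pub namespaced: bool,
}

/// Build the API server path for a collection (`name == None`)
/// or for a single object (`name == Some(..)`).
pub fn api_path(rt: &ResourceType, namespace: Option<&str>, name: Option<&str>) -> String {
    // Core group: /api/<version>; named groups: /apis/<group>/<version>.
    let mut path = if rt.group.is_empty() {
        format!("/api/{}", rt.version)
    } else {
        format!("/apis/{}/{}", rt.group, rt.version)
    };
    // Only namespaced types carry a /namespaces/<ns> segment.
    if rt.namespaced {
        if let Some(ns) = namespace {
            path.push_str("/namespaces/");
            path.push_str(ns);
        }
    }
    path.push('/');
    path.push_str(rt.plural);
    if let Some(n) = name {
        path.push('/');
        path.push_str(n);
    }
    path
}
```

Under these assumptions, /namespaces/default/deployments/web/manifest.yaml would be backed by a GET on /apis/apps/v1/namespaces/default/deployments/web, and /cluster/nodes/<node> by /api/v1/nodes/<node>.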

Transport & key design decisions

- Transport-agnostic over the host callout. Every request is
  cx.http().get(HttpEndpoint::build_url(...)). The recommended endpoint is a
  local kubectl proxy --unix-socket, the same unix: callout transport the
  Docker provider uses. kubectl terminates TLS and injects the active context's
  credentials, so the provider issues plain HTTP, never handles a token, and
  works against any cluster kubectl can reach (mTLS / EKS-GKE exec plugins /
  OIDC / custom CA all handled upstream). An https:// API server with
  system-trust TLS + bearer token also works. No host changes: the host
  grants the socket automatically from config.endpoint
  (materialize_runtime_capabilities).
- One mount = one pinned cluster/context. The FS does not change when you
  kubectl config use-context; to browse another cluster, add another mount.
  This respects the per-mount credential binding and the read-only read model
  (a writable "switch context" control file was rejected as it violates both).
- Resource-as-directory, so status, events, and pod logs have a home
  and the manifest leaf keeps an honest stat/wc -c.
- Optional hide_empty_types: list only types that currently have instances
  (batched limit=1 probes via the SDK's join_all); empty types stay
  navigable via lookup.

How correctness was ensured (methodology)

This was built grounded-first and verified adversarially, not from memory:

- Research + codebase analysis (multi-agent workflow). Fetched current
  Kubernetes API/kubeconfig docs and analyzed the omnifs SDK router, host
  callout/capability path, caching model, and auth, producing a grounded
  design with the genuine open decisions (context handling, transport, resource
  representation) surfaced as options.
- Independently verified the feasibility crux by reading the host code: the
  shared HTTPS client has no custom-CA/mTLS and the capability checker denies
  private IPs. That is why the kubectl proxy unix-socket transport was
  chosen (it sidesteps both with zero host changes), rather than assuming a
  native HTTPS client would work.
- Implemented against the real SDK contract: every
  router/projection/capture/HTTP API was matched to the actual source
  (db/docker/github providers as templates), not guessed.
- Adversarial correctness review (4 dimensions: routing/listing,
  discovery/HTTP, manifest/host-load, SDK-contract; each finding independently
  verified): 0 confirmed bugs.

kubectl / client-go parity validation. Cloned kubernetes/client-go and
kubernetes/kubectl (current) and ran an 18-agent match-matrix with
per-finding verification against the upstream source and its tests. This
confirmed three real divergences, which are fixed here, and caught one
false positive (an agent claimed kubectl get doesn't strip
managedFields; verified that NewGetPrintFlags composes cli-runtime's
default --show-managed-fields=false path — so the strip is correct and was
kept).

Fixes derived directly from upstream contracts:

| Behavior | Upstream reference | Result |
| --- | --- | --- |
| Event field selector | `event_expansion.go` `GetFieldSelector` → name+namespace+kind+uid | was name-only → now full selector (no cross-kind / prior-incarnation leakage) |
| Object cleaning | `kubectl get` strips only `managedFields`, keeps last-applied | was stripping both → now managedFields only |
| Discovery versions | client-go `ServerPreferredResources` (all versions, preferred wins) | was preferred-only → now all versions, preferred-first (CRDs served only in a non-preferred version now surface) |
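
As an illustration of the first row, a full event field selector can be assembled as below. This is a minimal sketch, not the provider's code: `escape_value` and `event_field_selector` are hypothetical helper names, skipping empty fields is an assumption of this example, and the escaping follows the rule used by apimachinery's `fields.EscapeValue` (backslash, comma, and equals are each prefixed with a backslash). URL-encoding the result for the query string is left out.

```rust
/// Escape a field-selector value: `\`, `,` and `=` are each
/// prefixed with a backslash, all other characters pass through.
pub fn escape_value(v: &str) -> String {
    let mut out = String::with_capacity(v.len());
    for c in v.chars() {
        if matches!(c, '\\' | ',' | '=') {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Build an `involvedObject.*` selector from name, namespace, kind and uid.
/// Empty values are skipped (an assumption made for this sketch).
pub fn event_field_selector(name: &str, namespace: &str, kind: &str, uid: &str) -> String {
    [("name", name), ("namespace", namespace), ("kind", kind), ("uid", uid)]
        .iter()
        .filter(|(_, v)| !v.is_empty())
        .map(|(k, v)| format!("involvedObject.{}={}", k, escape_value(v)))
        .collect::<Vec<_>>()
        .join(",")
}
```

Including kind and uid is what keeps a same-named object of another kind, or an earlier object with the same name, from contributing events.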

Validated

- cargo test: 15 unit tests, covering a route-seal test (proves the
  literal-pods logs routes coexist with the {rtype} capture without ambiguity,
  which is otherwise only caught at runtime), capture parsing/traversal guards,
  discovery scope/subresource/collision/multi-version dedup, the event field
  selector, and event rendering.
- cargo clippy --target wasm32-wasip2 -- -D warnings: clean.
- Release component build (wasm32-wasip2).

Out of scope / follow-ups (documented in the README)

- No live watch: reads are point-in-time; re-cat re-fetches. Polling-based
  invalidation (periodic LIST keyed by resourceVersion → cache invalidation,
  via the runtime refresh-interval/timer-tick) is the natural follow-up.
- Pod logs are a current snapshot per container (= kubectl logs <pod> -c);
  follow/--previous/--timestamps need the ranged/volatile file path.
- describe.txt omitted (a faithful per-kind describe renderer is large;
  manifest/status/events cover the same data).
- Listings issue a single unpaginated LIST (returns the full collection, so
  no silent truncation); chunked listing (limit/continue) is a follow-up for
  very large namespaces.
- Live-cluster integration testing across versions (omnifs dev +
  kubectl proxy against kind clusters at several minor versions) is not
  expressible as a unit test and remains the integration follow-up.
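
For reference, the chunked-listing follow-up could take roughly the shape below. This is a sketch under stated assumptions, not part of this PR: `Page` and `list_all` are hypothetical names, `fetch` stands in for whatever callout performs the HTTP request, and the only Kubernetes facts relied on are the `limit` and `continue` query parameters and the `metadata.continue` token returned with each page.

```rust
/// One page of a chunked LIST: item names plus the `metadata.continue`
/// token (None or empty when the collection is exhausted).
pub struct Page {
    pub items: Vec<String>,
    pub next: Option<String>,
}

/// Drain a collection page by page. `fetch` receives the query string for
/// each request and returns the parsed page (hypothetical callback).
pub fn list_all<F, E>(limit: u32, mut fetch: F) -> Result<Vec<String>, E>
where
    F: FnMut(&str) -> Result<Page, E>,
{
    let mut all = Vec::new();
    let mut token: Option<String> = None;
    loop {
        // A real implementation would URL-encode the continue token.
        let query = match &token {
            Some(t) => format!("limit={}&continue={}", limit, t),
            None => format!("limit={}", limit),
        };
        let page = fetch(&query)?;
        all.extend(page.items);
        match page.next {
            Some(t) if !t.is_empty() => token = Some(t),
            _ => return Ok(all),
        }
    }
}
```

Any error from a page aborts the whole listing, so a partial result is never returned as if it were complete.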

A WASM provider that projects a Kubernetes cluster as a browsable, read-only filesystem. Resource types (including CRDs) are discovered live from the API server, so the tree reflects whatever the cluster serves.

Layout (resource-as-directory):

/namespaces/<ns>/<type>/<name>/{manifest.yaml,manifest.json,status.yaml,events.txt}
/namespaces/<ns>/pods/<name>/logs/<container>.log
/cluster/<type>/<name>/{manifest.yaml,manifest.json,status.yaml}

Transport: the provider is transport-agnostic over the host callout. The recommended endpoint is a local `kubectl proxy --unix-socket` (the same `unix:` callout transport the Docker provider uses): kubectl terminates TLS and injects the active-context credentials, so the provider issues plain HTTP, never handles a token, and works against any cluster kubectl can reach (mTLS, EKS/GKE exec plugins, OIDC, custom CA all handled upstream). An `https://` API server with system-trust TLS + bearer token also works. The host grants the socket automatically from `config.endpoint`; no host changes.

Design decisions: one mount = one pinned cluster/context (the FS does not change on `kubectl config use-context`; add a mount per cluster), matching the per-mount credential model and read-only read model.

Correctness was validated against the upstream kubectl/client-go source and tests; the contract-derived behaviors:

- Discovery walks /api/v1 + every group, querying each group's versions preferred-first (matches client-go ServerPreferredResources): a multi-version resource resolves to its preferred version while a resource present only in a non-preferred version still surfaces.
- events.txt filters by involvedObject.{name,namespace,kind,uid} (matches event_expansion.go GetFieldSelector), so a same-named object of another kind, or a prior incarnation, doesn't leak events.
- manifest.{yaml,json} strip only metadata.managedFields (as `kubectl get` does by default since v1.21) and preserve the last-applied-configuration annotation, matching `kubectl get` output.
- Plural collisions across groups disambiguate to <plural>.<group>.

Optional `hide_empty_types` config: list only resource types with at least one instance (batched limit=1 probes); empty types stay navigable via lookup.

Validated: cargo nextest/test (15 unit tests incl. a route-seal test and kubectl-parity selector/discovery tests), wasm32-wasip2 clippy -D warnings, release component build. Live-cluster validation across versions (omnifs dev + kubectl proxy against kind) remains the integration follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

bussyjd · 2026-06-09T16:51:02Z

Tests & validation

Unit tests (15 — inline `#[cfg(test)]`)

The PR does include tests; they're inline #[cfg(test)] mod tests (12 in src/api.rs, 3 in src/lib.rs), run by cargo test, not a separate tests/ dir — which is why they may have looked absent. Inline is the only convention that fits: providers compile to wasm32-wasip2 and can't execute on the host test harness (per AGENTS.md, WASM tests build but only --no-run), so host-runnable tests live inline. Every provider in the repo that has tests does it this way; for context this is currently the most-tested provider here (others range 0–5).

They guard behavior that would otherwise fail only at runtime or silently diverge from kubectl — not plumbing:

routes_seal_without_ambiguity — the generated provider calls router.seal() on first use, so route ambiguity is otherwise a runtime failure. This proves the literal pods logs routes coexist with the {rtype} capture without overlap.
Discovery — scope classification, subresource (name containing /) filtering, group-qualified plural collisions, and multi-version preferred-first dedup (a resource in several versions resolves to the preferred one; a resource present only in a non-preferred version still surfaces).
event_field_selector_matches_kubectl_fields — the involvedObject.{name,namespace,kind,uid} selector matching client-go's GetFieldSelector.
Capture parsing — path-traversal / separator rejection; <container>.log stem extraction.
Manifest cleaning — managedFields stripped, last-applied-configuration kept (matching kubectl get); panic-safety on malformed shapes.
Rendering — event table formatting, .status extraction, container enumeration order.

How correctness was established

Wire behaviors were validated against the upstream kubernetes/client-go + kubernetes/kubectl source and their tests (not from memory). That produced the three fixes in this PR: the full event field selector, managedFields-only cleaning (kubectl get keeps last-applied), and all-versions/preferred-first discovery.
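
The all-versions/preferred-first rule can be sketched as a first-wins walk over each group's versions. The input shape (`GroupDiscovery`) and the function name are hypothetical simplifications for this example; the real discovery documents carry more fields.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified discovery input: one API group with its served
/// versions listed preferred-first, each with the resource names it serves.
pub struct GroupDiscovery<'a> {
    pub group: &'a str,
    pub versions: Vec<(&'a str, Vec<&'a str>)>,
}

/// Resolve every (group, plural) pair to the first version that serves it.
/// Because versions are walked preferred-first, a resource served in several
/// versions lands on the preferred one, while a resource served only in a
/// non-preferred version is still recorded.
pub fn resolve_versions<'a>(
    groups: &[GroupDiscovery<'a>],
) -> HashMap<(&'a str, &'a str), &'a str> {
    let mut resolved = HashMap::new();
    for g in groups {
        for (version, resources) in &g.versions {
            for resource in resources {
                // Names containing '/' are subresources (e.g. "pods/log").
                if resource.contains('/') {
                    continue;
                }
                // First writer wins, so earlier (preferred) versions stick.
                resolved.entry((g.group, *resource)).or_insert(*version);
            }
        }
    }
    resolved
}
```

A preferred-only walk would stop after the first version of each group and miss any resource that exists only in a later one, which is the divergence the fix addresses.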

Live cluster run

Validated against a live k3s v1.35.1 cluster (k3d) by pointing the provider's transport — a read-only kubectl proxy --unix-socket — at it and issuing the provider's exact requests. (curl --unix-socket mirrors how the host turns the provider's unix://<hex>/path URLs into socket calls.)

| Check | Result |
| --- | --- |
| Transport to a loopback + admin client-cert API server | ✅ reached via the proxy (the case native HTTPS can't do) |
| Discovery `/api/v1` + `/apis` + per-group | ✅ parses (39 core resources, 28 groups) |
| Real multi-version groups (`autoscaling`, `gateway.networking.k8s.io`, `monitoring.coreos.com`) | ✅ confirms the preferred-first fix is load-bearing, not theoretical |
| Namespace + collection listings | ✅ resolve to names |
| Object GET carries `managedFields` and `last-applied` | ✅ validates both cleaning decisions on a real object |
| StatefulSet manifest + `status` | ✅ |
| Pod logs `…/pods/<p>/log?container=…&tailLines=…` | ✅ |
| `hide_empty_types` `limit=1` probe (empty → 0, non-empty → 1) | ✅ |
| Event selector accepted (cluster had 0 live events; returns cleanly) | ✅ |

The cluster was only ever read; the proxy was torn down afterward.

Not yet covered: the host↔WASM FUSE mount end-to-end (omnifs dev), which needs Kubernetes added to the dev built-ins + the proxy socket bind-mounted into the dev container — tracked as the integration follow-up.

How to actually use it

Run a read-only proxy against your chosen context. kubectl handles TLS / mTLS / exec plugins / OIDC / context selection, so the provider never sees a token:
```
kubectl proxy --unix-socket=/run/omnifs/k8s.sock \
  --reject-methods='POST,PUT,PATCH,DELETE'
```

Mount config — the host grants the socket automatically from config.endpoint (no separate capabilities.unix_sockets entry needed):

{
  "provider": "omnifs_provider_kubernetes.wasm",
  "mount": "k8s",
  "config": { "endpoint": "unix:///run/omnifs/k8s.sock", "hide_empty_types": false }
}

Browse with ordinary tools:

cat  /omnifs/k8s/namespaces/<ns>/deployments/<name>/manifest.yaml
cat  /omnifs/k8s/namespaces/<ns>/deployments/<name>/status.yaml
tail /omnifs/k8s/namespaces/<ns>/pods/<pod>/logs/<container>.log
cat  /omnifs/k8s/cluster/nodes/<node>/manifest.yaml
grep -rh 'image:' /omnifs/k8s/namespaces/<ns>/deployments

/namespaces/<ns> lists every namespaced type (CRDs included via live discovery); /cluster the cluster-scoped ones. Set hide_empty_types: true to list only types that currently have instances.
One mount = one pinned cluster/context; the FS does not change on kubectl config use-context — add a mount per cluster.
An https:// endpoint also works for clusters reachable with system-trust TLS + a bearer token in mount auth; the unix:// + kubectl proxy path is recommended and the only one that reaches local/mTLS/exec-plugin clusters today.

Full details, FS layout, and limitations are in providers/kubernetes/README.md.

raulk · 2026-06-09T17:26:57Z

Oh, hey! Thanks for the contribution ❤️ Taking a look

Findings from an adversarial review plus a full FUSE end-to-end run against a live k3s cluster (kubectl proxy over a unix socket inside the dev container):

- Pod logs 406'd through kubectl proxy: the apiserver's content negotiation rejects `Accept: text/plain` on the log subresource even though it streams text. Send `Accept: */*` (what curl/kubectl effectively send). Found only by the live FUSE run; fixed and re-verified (app/init/sidecar logs, tail, wc).
- Live listings are now `open` (non-exhaustive) instead of `exhaustive`, so every readdir re-lists from the API instead of freezing the first enumeration in the host's no-TTL dirent cache. Verified live: a pod created after the first `ls` appears on the next one. This also keeps the hide_empty_types contract honest (hidden types stay resolvable).
- Root discovery failures (/api/v1, /apis) now propagate instead of being swallowed: a transient error there must not be cached for the session as a half-empty catalog (which could also invert bare-plural collision naming). Per-group-version failures are still skipped, and all group-version fetches run in one batched callout round.
- Path segments reject URL metacharacters (`%`, `?`, `#`, control chars): `cat 'pods/x?watch=true/...'` used to smuggle a query through the raw URL path (holding a watch open); now ENOENT. `%` is forbidden by Kubernetes' own ValidatePathSegmentName, so nothing legal is lost; RBAC names with `:` keep working (verified live).
- Removed the five DirIntent::Lookup existence-check branches: the router resolves capture-dir lookups statically before handlers run, so they were dead code (and the only unvalidated URL interpolation).
- Object reads project their siblings: manifest.yaml/manifest.json/status.yaml render from one GET and preload the other two (<=64 KiB inline cap); events.txt preloads all three from the object it already fetches for the uid.
- Event field-selector values escape `\` `,` `=` like kubectl's fields.EscapeValue; recurring events.k8s.io events read series{count,lastObservedTime} like kubectl's printer.
- Dropped the dead Rc<RefCell<...>> around the discovery cache in favor of cx.state_mut.
- README/manifest: corrected the https-transport claim (no bearer-token injection exists in v1), documented that the proxy socket must be reachable inside the runtime container, the always-visible pods scaffolding entry under hide_empty_types, the YAML 1.1 quoting divergence from kubectl, and unbounded pod-log reads.

Live verification: manifest.json is byte-identical (jq -S) to kubectl get -o json against k3s v1.34.1; CRDs surface automatically; grep -r/find/du/wc/stat/md5sum/cp/diff/tar all behave through FUSE; nonexistent objects/types and metachar names return ENOENT.

bussyjd · 2026-06-09T22:46:25Z

Final review pass: live FUSE end-to-end + adversarial review → `68918af`

This PR got a last full review: three independent review passes over the diff (SDK-contract, Kubernetes-semantics, repo-conventions) plus — for the first time — a complete FUSE end-to-end run against a live cluster: a disposable k3s v1.34.1 container, kubectl proxy --unix-socket running inside the omnifs dev container on the exact socket the mount expects, and the whole surface exercised with real shell tools through /omnifs/k8s.

The bug only the live run could catch

cat pods/<pod>/logs/<container>.log failed with EINVAL. The apiserver's content negotiation rejects Accept: text/plain on the log subresource with 406 Not Acceptable — even though the endpoint streams text — and 4xx maps to invalid-input → EINVAL. Earlier validation probed the HTTP contract with curl (whose default is Accept: */*), so the provider's own request shape was never exercised. Fixed to send Accept: */*; re-verified live (app/init/sidecar logs, tail, wc -l).

Review findings fixed in `68918af`

Listings no longer freeze. All live listings were exhaustive, which the host caches with no TTL — after one ls, new pods were invisible for the session. Now open: every readdir re-lists. Verified live: pod created after first ls appears on the next; a Terminating pod stays visible exactly as long as kubectl get pods shows it.
URL metachar injection closed. cat 'pods/x?watch=true/manifest.yaml' used to smuggle a query through the raw URL path (e.g. holding a watch open). Segments now reject % ? # and control chars — fail-closed ENOENT, verified live. Nothing legal is lost: % is forbidden by k8s' own ValidatePathSegmentName, and system:* RBAC names keep working (verified live).
Root discovery failures propagate. A transient error on /api/v1 or /apis was swallowed and the half-empty catalog cached for the session (which could also invert bare-plural collision naming, worst case handing a pods CRD the logs/ subtree). Roots now fail loudly and retry on next browse; per-group-version failures are still skipped; all group-version fetches now run in one batched callout round.
Dead lookup branches removed. The router resolves capture-dir lookups statically before handlers run, so the five DirIntent::Lookup existence checks never executed — and carried the only unvalidated URL interpolation. Deleted.
Siblings project from one fetch (repo rule: "project all data you have already fetched"): reading any of manifest.yaml/manifest.json/status.yaml preloads the other two from the same GET (≤64 KiB inline cap); events.txt preloads all three from the object it already fetches for the uid.
kubectl parity tightened: event field-selector values escape \ , = like fields.EscapeValue; recurring events.k8s.io events read series{count,lastObservedTime} like kubectl's printer (was COUNT 1 at first-occurrence time).
Docs corrected: the https+bearer-token claim was wrong (v1 injects no Authorization header — unix:// is the supported transport); the proxy socket must be reachable inside the runtime container; pods is always listed under hide_empty_types (it anchors the logs/ scaffolding); YAML 1.1 quoting divergence from kubectl (yes/no unquoted) documented.
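
The segment check described in the metachar finding can be sketched as a single predicate. This is an illustrative reconstruction, not the code in `68918af`: the function name is hypothetical, and rejecting `.`, `..` and `/` is included here because Kubernetes' ValidatePathSegmentName forbids them alongside `%`.

```rust
/// Return true only if `seg` is safe to interpolate as one URL path segment.
/// Hypothetical helper; the provider's real guard may differ in detail.
pub fn is_safe_segment(seg: &str) -> bool {
    // Empty names and the relative-path names can never be object names.
    if seg.is_empty() || seg == "." || seg == ".." {
        return false;
    }
    // '/' would add a path level; '%', '?' and '#' would be interpreted by
    // URL parsing; control characters have no place in a resource name.
    !seg
        .chars()
        .any(|c| matches!(c, '/' | '%' | '?' | '#') || c.is_control())
}
```

A caller would map a `false` result to ENOENT before any request is built, which is the fail-closed behavior reported above. Names such as system:node pass because `:` is not in the rejected set.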

Live verification summary (k3s v1.34.1, through FUSE)

| Check | Result |
| --- | --- |
| `manifest.json` vs `kubectl get -o json` (`jq -S` both) | byte-identical |
| managedFields stripped, last-applied kept | ✓ |
| CRD created live (`widgets.example.io`) | appears automatically, CR readable |
| Pod logs: app / terminated init / sidecar, `tail`, `wc` | ✓ (after 406 fix) |
| `events.txt` | real scheduler events, kubectl-style table |
| Listing freshness (create/delete between `ls`) | mirrors API exactly |
| `grep -r`, `find`, `du`, `wc`, `stat`, `md5sum` (stable), `cp`+`diff`, `tar` | ✓ (tar warns "file changed" from the host's learned-size promotion; pre-existing host behavior for `Size::Unknown` files, all providers) |
| `system:node` ClusterRole (colon name), `/cluster/nodes` status | ✓ |
| Nonexistent object / type / `x?watch=true` | ENOENT, ENOENT, ENOENT |
| No-socket mount in `omnifs dev` | container healthy, clean `Network` error on browse |

Full check suite is green: cargo fmt, host tests (17 in this provider), wasm check/clippy -D warnings/test --no-run across all omnifs-provider-*/omnifs-tool-*, host workspace clippy + tests, and omnifs dev -y + smoke harness.

One note for maintainers: CI never ran on this PR (fork PR, workflow concluded action_required) — it needs a maintainer to approve the workflow run.

bussyjd wants to merge 2 commits into 0xff-ai:main from bussyjd:feat/kubernetes-provider
