Skip to content

fix: Surface exec credential plugin stderr on cluster unreachable errors#4215

Open
rshade wants to merge 1 commit intomasterfrom
uplevel-error
Open

fix: Surface exec credential plugin stderr on cluster unreachable errors#4215
rshade wants to merge 1 commit intomasterfrom
uplevel-error

Conversation

@rshade
Copy link
Copy Markdown
Contributor

@rshade rshade commented Mar 9, 2026

Summary

Fixes #4212.

When a kubeconfig uses an exec-based credential plugin (aws eks get-token, gke-gcloud-auth-plugin, kubelogin, etc.) and authentication fails, users previously saw only a generic error:

error: configured Kubernetes cluster is unreachable: unable to load schema
information from the API server: command terminated with exit code 1

The actionable diagnostic — expired credentials, missing IAM role, wrong region — was written by the subprocess to os.Stderr of the provider process, which the Pulumi engine does not forward to the terminal, silently discarding it.

After this change, the same failure surfaces as:

error: configured Kubernetes cluster is unreachable: unable to load schema
information from the API server: command terminated with exit code 1

exec credential plugin output:
Error: token refresh failed: credentials have expired, please run `aws sso login`

Changes

  • provider/pkg/provider/util.go: Adds execCredentialStderr, which re-runs the exec credential plugin with stderr captured in a buffer. Called only on the failure path — no performance impact on the happy path.
  • provider/pkg/provider/provider.go: On getResources() failure, calls execCredentialStderr and appends any output to clusterUnreachableReason.
  • provider/pkg/provider/util_test.go: Unit tests covering nil input, no-exec config, stderr capture, context override selection, and nil-panic safety.

Test plan

  • go test ./pkg/provider/... -run TestExecCredentialStderr — all 5 cases pass
  • Manual test: configure a kubeconfig with an exec plugin that fails (e.g., expired AWS SSO session) and run pulumi preview — verify the plugin's stderr appears in the error message without needing -v=9

Proposed changes

Related issues (optional)

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 9, 2026

Does the PR have any schema changes?

Looking good! No breaking changes found.
No new resources/functions.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves diagnostics when Kubernetes exec-based credential plugins fail by capturing and surfacing the plugin’s stderr in the “cluster unreachable” error path, helping users see actionable auth failures (e.g., expired credentials) without requiring verbose logs.

Changes:

  • Add a helper to re-run the active kubeconfig exec plugin with stderr captured for diagnostics.
  • Append captured exec-plugin stderr to clusterUnreachableReason when schema discovery (getResources()) fails.
  • Add unit tests covering nil/no-exec configs, stderr capture, and context override behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
provider/pkg/provider/util.go Adds execCredentialStderr helper to re-run exec plugins and capture stderr.
provider/pkg/provider/provider.go Uses execCredentialStderr to enrich cluster-unreachable errors on schema load failure.
provider/pkg/provider/util_test.go Adds unit tests validating stderr capture and override selection logic.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 9, 2026

Codecov Report

❌ Patch coverage is 39.62264% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.76%. Comparing base (0d10713) to head (9ae027f).
⚠️ Report is 15 commits behind head on master.

Files with missing lines Patch % Lines
provider/pkg/provider/util.go 40.00% 11 Missing and 16 partials ⚠️
provider/pkg/await/checker/checker.go 0.00% 2 Missing ⚠️
provider/pkg/provider/provider.go 60.00% 1 Missing and 1 partial ⚠️
provider/pkg/await/checker/pod/check.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4215      +/-   ##
==========================================
- Coverage   45.01%   44.76%   -0.26%     
==========================================
  Files          93       93              
  Lines       10471    10445      -26     
==========================================
- Hits         4714     4676      -38     
+ Misses       5334     5259      -75     
- Partials      423      510      +87     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@guineveresaenger
Copy link
Copy Markdown
Contributor

Hi @rshade, thank you for working on this! Do you need assistance? 💟

I think we can justify a little CHANGELOG entry here, since it's user-facing.

@rshade
Copy link
Copy Markdown
Contributor Author

rshade commented Mar 12, 2026

Hey @guineveresaenger I think I am going to ask claude if it can figure out everywhere this might leak a secret first before I convert it to a full PR. I didn't really like it having to do a separate exec just to bubble up the error, but it looks like the only way. Thank you for the offer of help. I will get it out, and definitely on the changelog.

@rshade rshade requested a review from Copilot March 17, 2026 21:09
@rshade rshade marked this pull request as ready for review March 17, 2026 21:09
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines 17 to 53
import (
_ "embed" // Needed to support go:embed directive
_ "embed" // Needed to support goembed directive

v1 "k8s.io/api/core/v1"

pschema "github.com/pulumi/pulumi/pkg/v3/codegen/schema"
)

var serviceSpec = pschema.ComplexTypeSpec{
ObjectTypeSpec: pschema.ObjectTypeSpec{
Properties: map[string]pschema.PropertySpec{
"type": {
TypeSpec: pschema.TypeSpec{
OneOf: []pschema.TypeSpec{
{Type: "string"},
{Ref: "#/types/kubernetes:core/v1:ServiceSpecType"},
},
},
},
},
},
}

var serviceSpecType = pschema.ComplexTypeSpec{
ObjectTypeSpec: pschema.ObjectTypeSpec{
Type: "string",
},
Enum: []pschema.EnumValueSpec{
{Value: v1.ServiceTypeExternalName},
{Value: v1.ServiceTypeClusterIP},
{Value: v1.ServiceTypeNodePort},
{Value: v1.ServiceTypeLoadBalancer},
},
}

//go:embed examples/overlays/chartV3.md
// goembed examples/overlays/chartV3.md
var helmV3ChartMD string
Comment on lines 26 to 32
var providerName = "kubernetes"

//go:embed schema-embed.json
// goembed schema-embed.json
var pulumiSchema []byte

//go:embed terraform-mapping-embed.json
// goembed terraform-mapping-embed.json
var terraformMapping []byte
Comment on lines +265 to +276
var stderrBuf bytes.Buffer
cmd := exec.CommandContext(execCtx, authInfo.Exec.Command, authInfo.Exec.Args...) //nolint:gosec
cmd.Stdout = io.Discard
cmd.Stderr = &stderrBuf
cmd.Stdin = strings.NewReader("")
cmd.Env = os.Environ()
for _, e := range authInfo.Exec.Env {
cmd.Env = append(cmd.Env, e.Name+"="+e.Value)
}
_ = cmd.Run() // exit code doesn't matter; we only want whatever was written to stderr
out := strings.TrimSpace(stderrBuf.String())
if len(out) > execStderrMaxBytes {
Comment on lines +266 to +273
cmd := exec.CommandContext(execCtx, authInfo.Exec.Command, authInfo.Exec.Args...) //nolint:gosec
cmd.Stdout = io.Discard
cmd.Stderr = &stderrBuf
cmd.Stdin = strings.NewReader("")
cmd.Env = os.Environ()
for _, e := range authInfo.Exec.Env {
cmd.Env = append(cmd.Env, e.Name+"="+e.Value)
}
Comment on lines +627 to +633
t.Run("exec plugin stderr is captured", func(t *testing.T) {
cfg := makeConfig(&clientcmdapi.AuthInfo{
Exec: &clientcmdapi.ExecConfig{
Command: "sh",
Args: []string{"-c", "echo 'credentials expired' >&2; exit 1"},
APIVersion: "client.authentication.k8s.io/v1",
},
When a kubeconfig uses an exec-based credential plugin (`aws eks get-token`, `gke-gcloud-auth-plugin`, `kubelogin`, etc.) and authentication fails, users previously saw only a generic error:

```
error: configured Kubernetes cluster is unreachable: unable to load schema
information from the API server: command terminated with exit code 1
```

The actionable diagnostic — expired credentials, missing IAM role, wrong region — was written by the subprocess to `os.Stderr` of the provider process, which the Pulumi engine does not forward to the terminal, silently discarding it.

After this change, the same failure surfaces as:

```
error: configured Kubernetes cluster is unreachable: unable to load schema
information from the API server: command terminated with exit code 1

exec credential plugin output:
Error: token refresh failed: credentials have expired, please run `aws sso login`
```

- **`provider/pkg/provider/util.go`**: Adds `execCredentialStderr`, which re-runs the exec credential plugin with stderr captured in a buffer. Called only on the failure path — no performance impact on the happy path.
- **`provider/pkg/provider/provider.go`**: On `getResources()` failure, calls `execCredentialStderr` and appends any output to `clusterUnreachableReason`.
- **`provider/pkg/provider/util_test.go`**: Unit tests covering nil input, no-exec config, stderr capture, context override selection, and nil-panic safety.

- [x] `go test ./pkg/provider/... -run TestExecCredentialStderr` — all 5 cases pass
- [ ] Manual test: configure a kubeconfig with an exec plugin that fails (e.g., expired AWS SSO session) and run `pulumi preview` — verify the plugin's stderr appears in the error message without needing `-v=9`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Exec credential plugin stderr is hidden from users when the cluster is unreachable

3 participants