Problem
k8gb's IP discovery assumes the CoreDNS Service / Ingress LB status reflects publicly routable IPs. On bare-metal clusters behind 1:1 static NAT — a common colo / on-prem topology — this assumption breaks:
- The cluster's CoreDNS Service (or fallback Ingress with k8gb.io/ip-source=true) only ever sees the private node IPs assigned by the cluster's L4 LB (e.g. klipper-lb on k3s, MetalLB in L2 mode, kube-vip).
- The colo perimeter does 1:1 SNAT/DNAT between private node IPs and a set of public IPs. The cluster itself has no awareness of those public IPs.
- k8gb publishes the private IPs as the NS glue records (gslb-ns-<geoTag>-<loadBalancedZone> A records in EdgeDNS) and as localtargets-<host> records inside each Gslb's DNSEndpoint.
- External resolvers follow the NS delegation, hit the private IPs in the glue, and time out.
Reproduction
- Stand up any bare-metal-style k8s with a node-IP-based LoadBalancer controller (k3d, kind, k3s with klipper-lb).
- Install k8gb with coredns.serviceType=LoadBalancer and any EdgeDNS provider (Cloudflare / Route53).
- Configure 1:1 NAT between node IPs and public IPs at the network perimeter (or simulate by treating node IPs as "private").
- Observe that the gslb-ns-<geoTag>-<loadBalancedZone> A records in EdgeDNS contain node IPs.
Why existing knobs don't solve it
| Approach | Why it doesn't work on bare-metal NAT |
| --- | --- |
| coredns.serviceType: LoadBalancer reading Status.LoadBalancer.Ingress[].IP | Bare-metal LBs assign private node IPs to that field |
| Status.LoadBalancer.Ingress[].Hostname (FQDN lookup path in extractIPFromLB) | klipper-lb doesn't populate Hostname; no annotation to inject one |
| Ingress fallback with k8gb.io/ip-source=true | Same root cause: bare-metal Ingress status only has private IPs |
| Per-Gslb k8gb.io/exposed-ip-addresses annotation | Only overrides the per-Gslb localtargets / A records; does NOT fix the NS glue records published to EdgeDNS |
The last point matters: even if every Gslb annotates itself with public IPs, the NS delegation in EdgeDNS still points to private IPs, so the whole chain is broken before resolvers ever reach the cluster's CoreDNS.
Proposed enhancement
Add a cluster-level override:
- New chart value k8gb.edgeDNSPublicIPs ([]string, default empty)
- New env var EDGE_DNS_PUBLIC_IPS (comma-separated)
- When set, k8gb uses these IPs in place of the discovered Service / Ingress IPs at the zone-delegation boundary, so both the EdgeDNS NS glue records AND the per-Gslb localtargets-* / final A records use the override.
Suggested insertion point
controllers/zones/wrapper.go, in ZoneDelegationWrapper.GetDetail(). This covers both the ExternalDNS and Infoblox providers, since both consume ExtendedZoneDelegation.LocalCoreDNSExposedIPs. Doing it here keeps the IPResolver unchanged and avoids special-casing provider code.
Alternatives considered
- Per-Gslb annotation only — partial fix; doesn't reach EdgeDNS glue records.
- External tooling to rewrite records post-publish — fights k8gb's TXT-registry ownership and creates reconcile churn.
- Custom CoreDNS Service hostname — requires the bare-metal LB to support hostname annotations; klipper-lb doesn't, and patching this in-cluster is brittle.
Use case
Bare-metal / colo BYOC clusters where the cluster sees only RFC1918 IPs but is exposed via static 1:1 NAT on the perimeter. Common in datacenter deployments where MetalLB / Cilium LB-IPAM with a public CIDR isn't an option (e.g. Cilium migration in progress, or perimeter team owns the NAT).
I have a working patch and can open a PR if the approach is acceptable.