Reverse DNS look-ups are inconsistent #2160
Description
I have deployed etcd-operator with helm and have the following cluster spec:
apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: coredns-etcd-cluster
spec:
  size: 3
As I understand it from the documentation here, etcd with TLS enabled does a reverse lookup on the IP address of the connecting etcd pod to check that the incoming request is valid. However, when I run nslookup <PEER_IP_ADDR> from an etcd pod, I get inconsistent results:
/ # nslookup 10.11.3.99
nslookup: can't resolve '(null)': Name does not resolve
Name: 10.11.3.99
Address 1: 10.11.3.99 10-11-3-99.coredns-etcd-cluster-client.dns.svc.cluster.local
/ # nslookup 10.11.3.99
nslookup: can't resolve '(null)': Name does not resolve
Name: 10.11.3.99
Address 1: 10.11.3.99 coredns-etcd-cluster-t9rjxhtc96.coredns-etcd-cluster.dns.svc.cluster.local
About half the time, the reverse lookup returns the client service DNS name, of the form pod-ip.coredns-etcd-cluster-client.*, instead of the peer service name. This causes peer TLS communication to fail, because that name is not of the form *.coredns-etcd-cluster.*.
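To make the mismatch concrete, here is a minimal Python sketch of single-label wildcard matching applied to the two names returned above. It is only an approximation of the check, not etcd's actual code, and the SAN pattern *.coredns-etcd-cluster.dns.svc.cluster.local is an assumption modelled on the DNSNames in the Cilium log; matches_wildcard is a hypothetical helper.

```python
def matches_wildcard(hostname: str, pattern: str) -> bool:
    """Return True if hostname matches a SAN-style pattern.

    Simplified rule (an assumption, not etcd's implementation):
    a leading '*' stands for exactly one DNS label, and all
    remaining labels must be equal, ignoring case.
    """
    host_labels = hostname.rstrip(".").lower().split(".")
    pattern_labels = pattern.rstrip(".").lower().split(".")
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == "*":
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


# Assumed SAN pattern for the peer certificate (not taken from the issue).
PEER_SAN = "*.coredns-etcd-cluster.dns.svc.cluster.local"

# The two names returned by nslookup for 10.11.3.99.
peer_name = "coredns-etcd-cluster-t9rjxhtc96.coredns-etcd-cluster.dns.svc.cluster.local"
client_name = "10-11-3-99.coredns-etcd-cluster-client.dns.svc.cluster.local"

peer_ok = matches_wildcard(peer_name, PEER_SAN)
client_ok = matches_wildcard(client_name, PEER_SAN)
```

Under these assumptions the headless peer service name matches the pattern and the client service name does not, which is consistent with connections being rejected only when the reverse lookup happens to return the client service record.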
I first discovered this on a newly created k8s cluster (v1.17.2) when trying to deploy Cilium with the managed etcd. Cilium internally uses the etcd-operator to create its etcd cluster, and I saw the etcd pod logs flooded with these messages:
2020-02-14 03:29:44.693313 I | embed: rejected connection from "10.11.4.148:53696" (error "tls: \"10.11.4.148\" does not match any of DNSNames [\"*.cilium-etcd.kube-system.svc\" \"*.cilium-etcd.kube-system.svc.cluster.local\"]", ServerName "cilium-etcd-8svdg9rhbc.cilium-etcd.kube-system.svc", IPAddresses [], DNSNames ["*.cilium-etcd.kube-system.svc" "*.cilium-etcd.kube-system.svc.cluster.local"])
So I created my own etcd-operator deployment and confirmed that, from one etcd pod, repeated reverse lookups of a peer etcd pod's IP address return different values.
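For anyone trying to reproduce this, the repeated lookup can be scripted. The following is a rough Python sketch, assuming Python is available somewhere that uses the cluster DNS (the etcd image itself may not ship it); tally_reverse_lookups is a hypothetical helper, and the resolver argument exists only so the function can be exercised without a network.

```python
import socket
from collections import Counter


def tally_reverse_lookups(ip, attempts=20, resolver=socket.gethostbyaddr):
    """Reverse-resolve ip several times and count each name returned.

    resolver must behave like socket.gethostbyaddr, returning a
    (hostname, aliases, addresses) tuple. Lookup failures are counted
    under an "<error: ...>" key instead of being raised.
    """
    counts = Counter()
    for _ in range(attempts):
        try:
            name, _aliases, _addresses = resolver(ip)
        except OSError as exc:  # socket.herror and socket.gaierror are OSError subclasses
            name = f"<error: {exc}>"
        counts[name] += 1
    return counts
```

Calling tally_reverse_lookups("10.11.3.99") from inside the cluster and printing the resulting counter should show more than one key if the PTR answer alternates as described above. Local resolver caching, if present, could hide the effect, so a single key does not prove the records are consistent.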
The only time that the reverse DNS lookup is consistent is when the pod is looking up its own DNS name since it is written into /etc/hosts.
Could somebody please try to reproduce this and help determine whether the issue lies with how etcd-operator creates the etcd pods?