This repository was archived by the owner on Mar 28, 2020. It is now read-only.

Reverse DNS look-ups are inconsistent #2160

@benfuu

I have deployed etcd-operator with helm and have the following cluster spec:

apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: coredns-etcd-cluster
spec:
  size: 3

From my understanding of the documentation here, etcd with TLS enabled does a reverse DNS lookup on the IP address of the connecting etcd pod to check whether the incoming request is valid. However, when I run nslookup <PEER_IP_ADDR> from an etcd pod, I get inconsistent results:

/ # nslookup 10.11.3.99
nslookup: can't resolve '(null)': Name does not resolve

Name:      10.11.3.99
Address 1: 10.11.3.99 10-11-3-99.coredns-etcd-cluster-client.dns.svc.cluster.local
/ # nslookup 10.11.3.99
nslookup: can't resolve '(null)': Name does not resolve

Name:      10.11.3.99
Address 1: 10.11.3.99 coredns-etcd-cluster-t9rjxhtc96.coredns-etcd-cluster.dns.svc.cluster.local

Half the time, the reverse lookup returns the client service DNS name, of the form pod-ip.coredns-etcd-cluster-client.*, which is the wrong one. This causes peer TLS communication to fail, since that name is not of the form *.coredns-etcd-cluster.*.
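To illustrate why the client-service name is rejected, here is a minimal sketch of wildcard DNS name matching. It is a simplified rule (the wildcard must be the entire left-most label and covers exactly one label), not etcd's actual verification code. The pattern `*.coredns-etcd-cluster.dns.svc.cluster.local` is an assumption, modelled on the DNSNames in the Cilium log further down; the function name `matches_dns_san` is hypothetical.

```python
def matches_dns_san(hostname: str, pattern: str) -> bool:
    """Simplified certificate DNS-name check.

    A "*" is honoured only as the whole left-most label and stands in for
    exactly one non-empty label. Everything else must match exactly
    (case-insensitively).
    """
    host_labels = hostname.rstrip(".").lower().split(".")
    pattern_labels = pattern.rstrip(".").lower().split(".")

    # A wildcard never spans multiple labels, so the label counts must agree.
    if len(host_labels) != len(pattern_labels):
        return False

    if pattern_labels[0] == "*":
        return host_labels[0] != "" and host_labels[1:] == pattern_labels[1:]

    return host_labels == pattern_labels


# Assumed SAN pattern, by analogy with the Cilium certificate's DNSNames.
PEER_PATTERN = "*.coredns-etcd-cluster.dns.svc.cluster.local"

# The two names returned by nslookup in this issue.
peer_name = "coredns-etcd-cluster-t9rjxhtc96.coredns-etcd-cluster.dns.svc.cluster.local"
client_name = "10-11-3-99.coredns-etcd-cluster-client.dns.svc.cluster.local"

print(matches_dns_san(peer_name, PEER_PATTERN))    # True
print(matches_dns_san(client_name, PEER_PATTERN))  # False
```

Under that assumption, the second label of the client-service name is `coredns-etcd-cluster-client` rather than `coredns-etcd-cluster`, so no single-label wildcard can make it match.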

I first discovered this on a newly created k8s cluster (v1.17.2) when trying to deploy Cilium with the managed etcd. Cilium internally uses etcd-operator to create its etcd cluster, and I saw the etcd pod logs flooded with these messages:

2020-02-14 03:29:44.693313 I | embed: rejected connection from "10.11.4.148:53696" (error "tls: \"10.11.4.148\" does not match any of DNSNames [\"*.cilium-etcd.kube-system.svc\" \"*.cilium-etcd.kube-system.svc.cluster.local\"]", ServerName "cilium-etcd-8svdg9rhbc.cilium-etcd.kube-system.svc", IPAddresses [], DNSNames ["*.cilium-etcd.kube-system.svc" "*.cilium-etcd.kube-system.svc.cluster.local"])

So I created my own etcd-operator deployment and confirmed that, from one etcd pod, a reverse lookup for the IP address of a peer etcd pod returns different values from one run to the next.
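For anyone trying to reproduce this, the following is a rough sketch that repeats the reverse lookup and tallies the distinct names returned. It assumes a pod or debug container in the same namespace that has Python available (the etcd image itself may not), and it goes through the system resolver via `socket.gethostbyaddr`, which may behave differently from busybox nslookup. The function name `tally_reverse_lookups` and its `resolver` parameter are hypothetical, added so the lookup can be swapped out.

```python
import socket
from collections import Counter


def tally_reverse_lookups(ip, attempts=20, resolver=socket.gethostbyaddr):
    """Reverse-resolve `ip` several times and count each distinct answer.

    `resolver` must behave like socket.gethostbyaddr, i.e. return a
    (hostname, aliases, addresses) tuple. Lookup failures are counted
    under an "<error: ...>" key instead of aborting the run.
    """
    counts = Counter()
    for _ in range(attempts):
        try:
            hostname, _aliases, _addresses = resolver(ip)
        except (socket.herror, socket.gaierror) as exc:
            hostname = f"<error: {exc}>"
        counts[hostname] += 1
    return counts
```

Calling `tally_reverse_lookups("10.11.3.99")` and printing the result should show a single key if reverse DNS is consistent, and two or more keys if the behaviour described here is present.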

The only time the reverse DNS lookup is consistent is when a pod looks up its own IP address, since its own DNS name is written into /etc/hosts.

Can somebody please help investigate whether this can be replicated, and whether the issue lies with how etcd-operator creates the etcd pods?
