Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
33 changes: 19 additions & 14 deletions logger/readers/udp_reader.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,11 @@

import logging
import socket
<<<<<<< HEAD
=======
import sys
import struct
>>>>>>> cd2d034 (Move to using socket.IPADDR_ANY as the fallback interface for multicast read, and remove redundant IP_MULTICAST_IF sockopt)

from logger.readers.reader import Reader # noqa: E402

Expand Down Expand Up @@ -93,14 +98,6 @@ def __init__(self, interface=None, port=None, mc_group=None,
if mc_group:
# resolve once in constructor
mc_group = socket.gethostbyname(mc_group)
if not interface:
# multicast needs to specify interface to use, so let's pick a
# sane default
#
# NOTE: This means hostname cannot be an alias to localhost, or
# you won't be able to send IGMP packets correctly.
#
self.interface = socket.gethostbyname(socket.gethostname())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this was written, we always set the multicast interface using self.interface, but we try to guess a sane default if it wasn't specified. I was trying to avoid using INADDR_ANY by default for this because if your system has multiple ethernet interfaces, the OS will "pick one", the behavior of which is undefined as far as I know, almost definitely varies by OS, and probably not what you want at least half the time.

If what you'd like to see happen is let the OS pick if we haven't specified (which is admittedly probably more likely to be expected by developers), instead of making this resolv-your-hostname guess, then I'd say we should simply set self.interface = socket.INADDR_ANY right here and leave the existing IP_ADD_MEMBERSHIP setsockopt code below alone.

But then, if we're changing the class's behavior, you have to update the docstring at the top of init as well.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting and I'd like to understand better how our environments differ, because at least in my environment, I would never expect the current implementation to provide a sane default, but would usually expect the proposed implementation to.

From my side, we are deploying on an Ubuntu server. As you would expect for a 'normal' Ubuntu / Debian server, there is an entry in /etc/hosts like 127.0.1.1 my-example-hostname. As our /etc/nsswitch.conf configuration is also left to defaults, /etc/hosts will be preferred over DNS as a source for host resolution. So, if I call socket.gethostbyname(socket.gethostname()) on my system, I am going to be returned 127.0.1.1, which is not a usable interface for multicast because it's in the loopback range.

In contrast, notwithstanding that the OS might not pick the right or best interface if I specify socket.INADDR_ANY, it should at least pick a non-loopback interface, and (I won't pretend to have a great understanding of what's happening under the hood, but) an interface that it expects to be able to receive on from the specified multicast group.

Curious how this reads from your side, is there anything obviously different in our environments that would account for our different experiences?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting and I'd like to understand better how our environments differ, because at least in my environment, I would never expect the current implementation to provide a sane default, but would usually expect the proposed implementation to.

From my side, we are deploying on an Ubuntu server. As you would expect for a 'normal' Ubuntu / Debian server, there is an entry in /etc/hosts like 127.0.1.1 my-example-hostname. As our /etc/nsswitch.conf configuration is also left to defaults, /etc/hosts will be preferred over DNS as a source for host resolution. So, if I call socket.gethostbyname(socket.gethostname()) on my system, I am going to be returned 127.0.1.1, which is not a usable interface for multicast because it's in the loopback range.

This is practically a religious debate among old-school Linux users. ;-) Traditionally, RHEL does NOT set the hostname to an alias in /ec/hosts because servers should be able to resolve their IPs (and it's actually considered INCORECT for your hostname to resolve to localhost). In contrast, Fedora sets a localhost-ish alias to the hostname in /etc/hosts just like Ubuntu now does, because these distros cater to home/mobile users. Ubuntu Server does the same thing as regular Ubuntu (pretty sure), but then they're practically the same product compared to the RHEL/Fedora split.

So, in a RHEL environment (at least as of a few years ago), it's perfectly safe to assume your hostname resolves to your actual IP address. I primarily use Ubuntu (now, because new job was all Debian/Ubuntu), but can't stand the localhost alias and remove it on my servers (I've run down multiple troubleshooting rabbit holes over the years that went off the rails because of bad assumptions being made because the hostname is resolving to localhost). I even go so far as to patch the OpenRVDAS install script to NOT make that alias (pretty sure it still does that).

That being said, years removed from writing this code, I agree you're right: the use case for my resolve-hostname-to-guess is overcomplicated and is going to definitely be wrong on a non-trivial percentage of distros. Just using INADDR_ANY will at least be correct all the time on systems with only a single NIC. Defaulting to INADDR_ANY seems much more "general case".

In contrast, notwithstanding that the OS might not pick the right or best interface if I specify socket.INADDR_ANY, it should at least pick a non-loopback interface, and (I won't pretend to have a great understanding of what's happening under the hood, but) an interface that it expects to be able to receive on from the specified multicast group.

The problem is that in an environment with multiple network interfaces, the kernel CANNOT make a smart choice. Multicast traffic doesn't follow normal routing rules, so it's going to do something silly but hopefully at least deterministic (e.g., always pick the interface with the lowest MAC address). So if you have 2 interfaces, say eth0 on 10.1.10.0/24 and eth1 on 192.168.1.0/24, and INADDR_ANY happens to cause eth1 to be used, your IGMP messages go to 192.168.1.255. But if the actual multicast server is sitting on 10.1.10.0/24, it won't get your IGMP message at all... and you'll get no multicast traffic even though you "joined the group".

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation @veggiemike , I've never worked with RHEL so that's news to me but useful to know! And point well taken RE: kernel not being able to make a smart choice between multiple NICs.

Notwithstanding, I'm taking away that you're happy to adopt INADDR_ANY as the default case all things considered (correct me if I'm wrong)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. 😎


self.mc_group = mc_group

Expand Down Expand Up @@ -154,8 +151,6 @@ def _open_socket(self):
"specify the interface to use by passing its IP address as the "
"`interface` parameter. (You can ignore this message if you're "
"actually just doing loopback testing.)")
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
socket.inet_aton(self.interface))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really do need set the multicast interface. Otherwise people running this on multi-homed machines won't be able to specify which interface to receive multicast traffic on.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, now that I've said that, I think you're right... you only have to do the IP_MULTICAST_IF call if you're sending multicast traffic. The 2nd param to the IP_ADD_MEMBERSHIP call below is how we specify the interface to receive on.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my understanding as well -- IP_MULTICAST_IF for sending, IP_ADD_MEMBERSHIP for receiving 👍


# join the group via IGMP
#
Expand All @@ -164,11 +159,21 @@ def _open_socket(self):
# could use struct.pack("4s4s", ...) to create a struct to
# pass into setsockopt()
#
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
socket.inet_aton(self.mc_group) + socket.inet_aton(self.interface))
if self.interface:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested above, just blindly use self.interface as originally written w/out any conditional code. We've either been passed in a value or already assigned it to INADDR_ANY.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm totally happy in principle to avoid having conditional code here, however it's a little fiddlier than just setting self.interface = socket.INADDR_ANY in the constructor, because the existing code here calls inet_aton on the value, which we can't pass socket.INADDR_ANY to. We could set self.interface = socket.inet_aton(interface) or socket.INADDR_ANY in the constructor to avoid conditional logic here.

An advantage of the conditional code is that it leaves your battle-tested implementation in place for the one path, and uses our Investigator-battle-tested implementation for the other path.

No strong feelings about this one though, happy to do what the crowd is happy with :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm totally happy in principle to avoid having conditional code here, however it's a little fiddlier than just setting self.interface = socket.INADDR_ANY in the constructor, because the existing code here calls inet_aton on the value, which we can't pass socket.INADDR_ANY to. We could set self.interface = socket.inet_aton(interface) or socket.INADDR_ANY in the constructor to avoid conditional logic here.

Yep, I missed that. I still like setting it while checking params at the top of the function and then just using it, just make sure it's the same data type. So something like this:

        if interface:
            # resolve once in constructor                                                                                               
            self.interface = socket.inet_aton(socket.gethostbyname(interface))
        else:
            self.interface = socket.INADDR_ANY

Then further down we can just use self.interface consistently w/out calling inet_aton.

While we're at it, we should also prep self.mc_group in the constructor if specified (i.e., make sure if inet_aton is gonna fail, it happens during construction). So stick something like this right after we set self.interface:

        if mc_group:
            # resolve once in constructor
            self.mc_group = socket.inet_aton(socket.gethostbyname(mc_group))
        else:
            self.mc_group = None

That way both self.interface and self.mc_group are ready to get passed into getsockopt calls after construction.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with what you've suggested here, thanks -- slightly under the pump with other work but will aim to push a commit soon!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I know the feeling.

sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
socket.inet_aton(self.mc_group) + socket.inet_aton(self.interface))
else:
mreq = struct.pack("4sl", socket.inet_aton(self.mc_group), socket.INADDR_ANY)
sock.setsockopt(
socket.IPPROTO_IP,
socket.IP_ADD_MEMBERSHIP,
mreq
)


# bind to mc_group:port

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent here, even though you don't have to bind to the mc_group IP, was to hopefully have netstat -pnave --udp output display the mc_group IP instead of INADDR_ANY. That being said, I don't recall if it actually had that effect. It does work, though, and is simply more explicit than binding to INADDR_ANY.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, happy to roll this back 👍

sock.bind((self.mc_group, self.port))
# bind
# don't need to specify mc_group address because we have already joined above
sock.bind(('', self.port))

else:
# broadcast or unicast, bind to specificed interface
Expand Down