-
Notifications
You must be signed in to change notification settings - Fork 30
Update default UDP reader interface setup (INADDR_ANY), clarify bind and sockopts logic #552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,6 +2,7 @@ | |
|
|
||
| import logging | ||
| import socket | ||
| import struct | ||
|
|
||
| from logger.readers.reader import Reader # noqa: E402 | ||
|
|
||
|
|
@@ -93,14 +94,6 @@ def __init__(self, interface=None, port=None, mc_group=None, | |
| if mc_group: | ||
| # resolve once in constructor | ||
| mc_group = socket.gethostbyname(mc_group) | ||
| if not interface: | ||
| # multicast needs to specify interface to use, so let's pick a | ||
| # sane default | ||
| # | ||
| # NOTE: This means hostname cannot be an alias to localhost, or | ||
| # you won't be able to send IGMP packets correctly. | ||
| # | ||
| self.interface = socket.gethostbyname(socket.gethostname()) | ||
|
|
||
| self.mc_group = mc_group | ||
|
|
||
|
|
@@ -154,8 +147,6 @@ def _open_socket(self): | |
| "specify the interface to use by passing its IP address as the " | ||
| "`interface` parameter. (You can ignore this message if you're " | ||
| "actually just doing loopback testing.)") | ||
| sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, | ||
| socket.inet_aton(self.interface)) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We really do need set the multicast interface. Otherwise people running this on multi-homed machines won't be able to specify which interface to receive multicast traffic on.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually, now that I've said that, I think you're right... you only have to do the IP_MULTICAST_IF call if you're sending multicast traffic. The 2nd param to the IP_ADD_MEMBERSHIP call below is how we specify the interface to receive on.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's my understanding as well -- IP_MULTICAST_IF for sending, IP_ADD_MEMBERSHIP for receiving 👍 |
||
|
|
||
| # join the group via IGMP | ||
| # | ||
|
|
@@ -164,11 +155,21 @@ def _open_socket(self): | |
| # could use struct.pack("4s4s", ...) to create a struct to | ||
| # pass into setsockopt() | ||
| # | ||
| sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, | ||
| socket.inet_aton(self.mc_group) + socket.inet_aton(self.interface)) | ||
| if self.interface: | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As suggested above, just blindly use
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm totally happy in principle to avoid having conditional code here, however it's a little fiddlier than just setting An advantage of the conditional code is that it leaves your battle-tested implementation in place for the one path, and uses our Investigator-battle-tested implementation for the other path. No strong feelings about this one though, happy to do what the crowd is happy with :)
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yep, I missed that. I still like setting it while checking params at the top of the function and then just using it, just make sure it's the same data type. So something like this: if interface:
# resolve once in constructor
self.interface = socket.inet_aton(socket.gethostbyname(interface))
else:
self.interface = socket.INADDR_ANYThen further down we can just use While we're at it, we should also prep if mc_group:
# resolve once in constructor
self.mc_group = socket.inet_aton(socket.gethostbyname(mc_group))
else:
self.mc_group = NoneThat way both
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agree with what you've suggested here, thanks -- slightly under the pump with other work but will aim to push a commit soon!
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No problem, I know the feeling. |
||
| sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, | ||
| socket.inet_aton(self.mc_group) + socket.inet_aton(self.interface)) | ||
| else: | ||
| mreq = struct.pack("4sl", socket.inet_aton(self.mc_group), socket.INADDR_ANY) | ||
| sock.setsockopt( | ||
| socket.IPPROTO_IP, | ||
| socket.IP_ADD_MEMBERSHIP, | ||
| mreq | ||
| ) | ||
|
|
||
|
|
||
| # bind to mc_group:port | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The intent here, even though you don't have to bind to the mc_group IP, was to hopefully have
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fair enough, happy to roll this back 👍 |
||
| sock.bind((self.mc_group, self.port)) | ||
| # bind | ||
| # don't need to specify mc_group address because we have already joined above | ||
| sock.bind(('', self.port)) | ||
|
|
||
| else: | ||
| # broadcast or unicast, bind to specificed interface | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way this was written, we always set the multicast interface using self.interface, but we try to guess a sane default if it wasn't specified. I was trying to avoid using INADDR_ANY by default for this because if your system has multiple ethernet interfaces, the OS will "pick one", the behavior of which is undefined as far as I know, almost definitely varies by OS, and probably not what you want at least half the time.
If what you'd like to see happen is let the OS pick if we haven't specified (which is admittedly probably more likely to be expected by developers), instead of making this resolv-your-hostname guess, then I'd say we should simply set
self.interface = socket.INADDR_ANYright here and leave the existing IP_ADD_MEMBERSHIP setsockopt code below alone.But then, if we're changing the class's behavior, you have to update the docstring at the top of init as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting and I'd like to understand better how our environments differ, because at least in my environment, I would never expect the current implementation to provide a sane default, but would usually expect the proposed implementation to.
From my side, we are deploying on an Ubuntu server. As you would expect for a 'normal' Ubuntu / Debian server, there is an entry in
/etc/hostslike127.0.1.1 my-example-hostname. As our/etc/nsswitch.confconfiguration is also left to defaults,/etc/hostswill be preferred over DNS as a source for host resolution. So, if I callsocket.gethostbyname(socket.gethostname())on my system, I am going to be returned127.0.1.1, which is not a usable interface for multicast because it's in the loopback range.In contrast, notwithstanding that the OS might not pick the right or best interface if I specify socket.INADDR_ANY, it should at least pick a non-loopback interface, and (I won't pretend to have a great understanding of what's happening under the hood, but) an interface that it expects to be able to receive on from the specified multicast group.
Curious how this reads from your side, is there anything obviously different in our environments that would account for our different experiences?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is practically a religious debate among old-school Linux users. ;-) Traditionally, RHEL does NOT set the hostname to an alias in
/ec/hostsbecause servers should be able to resolve their IPs (and it's actually considered INCORECT for your hostname to resolve to localhost). In contrast, Fedora sets a localhost-ish alias to the hostname in/etc/hostsjust like Ubuntu now does, because these distros cater to home/mobile users. Ubuntu Server does the same thing as regular Ubuntu (pretty sure), but then they're practically the same product compared to the RHEL/Fedora split.So, in a RHEL environment (at least as of a few years ago), it's perfectly safe to assume your hostname resolves to your actual IP address. I primarily use Ubuntu (now, because new job was all Debian/Ubuntu), but can't stand the localhost alias and remove it on my servers (I've run down multiple troubleshooting rabbit holes over the years that went off the rails because of bad assumptions being made because the hostname is resolving to localhost). I even go so far as to patch the OpenRVDAS install script to NOT make that alias (pretty sure it still does that).
That being said, years removed from writing this code, I agree you're right: the use case for my resolve-hostname-to-guess is overcomplicated and is going to definitely be wrong on a non-trivial percentage of distros. Just using INADDR_ANY will at least be correct all the time on systems with only a single NIC. Defaulting to INADDR_ANY seems much more "general case".
The problem is that in an environment with multiple network interfaces, the kernel CANNOT make a smart choice. Multicast traffic doesn't follow normal routing rules, so it's going to do something silly but hopefully at least deterministic (e.g., always pick the interface with the lowest MAC address). So if you have 2 interfaces, say eth0 on 10.1.10.0/24 and eth1 on 192.168.1.0/24, and INADDR_ANY happens to cause eth1 to be used, your IGMP messages go to 192.168.1.255. But if the actual multicast server is sitting on 10.1.10.0/24, it won't get your IGMP message at all... and you'll get no multicast traffic even though you "joined the group".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation @veggiemike , I've never worked with RHEL so that's news to me but useful to know! And point well taken RE: kernel not being able to make a smart choice between multiple NICs.
Notwithstanding, I'm taking away that you're happy to adopt
INADDR_ANYas the default case all things considered (correct me if I'm wrong)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely. 😎