Skip to content

Update default UDP reader interface setup (INADDR_ANY), clarify bind and sockopts logic#552

Open
ce-rvi wants to merge 3 commits into
OceanDataTools:devfrom
ce-rvi:update-udp-reader-default-setup
Open

Update default UDP reader interface setup (INADDR_ANY), clarify bind and sockopts logic#552
ce-rvi wants to merge 3 commits into
OceanDataTools:devfrom
ce-rvi:update-udp-reader-default-setup

Conversation

@ce-rvi

@ce-rvi ce-rvi commented May 4, 2026

Copy link
Copy Markdown

Hi folks,

Opening this PR with some suggested changes to the UDP reader, see details below. Feedback welcomed, cheers!

  • This PR changes how the UDP reader selects a default interface for reading multicast traffic where one is not specified to the constructor.
    • Currently self.interface = socket.gethostbyname(socket.gethostname()) is used to derive a default value for the interface. This will resolve to 127.0.1.1 on many Linux machines which is not a useable interface for reading UDP multicast traffic.
    • The proposed alternative is to use socket.INADDR_ANY, which will let the OS pick a sensible default. This is the approach we currently use on RV Investigator (but would like to realign our reader implementation with what's upstream).

Additionally, this PR makes some minor changes in the interest of clarity:

  • Changes the socket bind for multicast from sock.bind((self.mc_group, self.port)) to sock.bind(('', self.port)). To my understanding there is no need to specify the group in the bind once the IGMP group has been joined with the preceding logic.
  • Removes the logic setting the socket.IP_MULTICAST_IF sockopt for multicast. This option only controls which interface is used for multicast send, and this class does not need to send.

ce-rvi added 2 commits May 4, 2026 17:36
…st read, and remove redundant IP_MULTICAST_IF sockopt

Signed-off-by: Charlie Evans <charles.evans@csiro.au>
Signed-off-by: Charlie Evans <charles.evans@csiro.au>
@davidpablocohn

Copy link
Copy Markdown
Collaborator

Thanks for this! I'd love to have @veggiemike sanity check this, as he's much closer to where the rubber meets the road with respect to UDP.

Comment thread logger/readers/udp_reader.py Outdated
@ce-rvi

ce-rvi commented May 12, 2026

Copy link
Copy Markdown
Author

Thanks for this! I'd love to have @veggiemike sanity check this, as he's much closer to where the rubber meets the road with respect to UDP.

Thanks David. I'd definitely appreciate @veggiemike 's perspective as there was obviously a reasoning behind defaulting the interface in this way for multicast which would be great to understand.

@veggiemike

Copy link
Copy Markdown
Contributor

FYI, I'm planning to review this sometime this week... just struggling to find the time. ;-)

@veggiemike

veggiemike commented May 23, 2026

Copy link
Copy Markdown
Contributor

Thanks for this! I'd love to have @veggiemike sanity check this, as he's much closer to where the rubber meets the road with respect to UDP.

Thanks David. I'd definitely appreciate @veggiemike 's perspective as there was obviously a reasoning behind defaulting the interface in this way for multicast which would be great to understand.

This is traditionally how I've done it on other projects, in an attempt to guess a "correct" multicast interface to use for the IGMP requests, hopefully limiting what has to get set in a config file. Since multicast traffic doesn't follow normal routing principles, you have to explicitly specify which ethernet interface to use when joining the IGMP group. Doesn't matter when you only have one, but does when you have multiple (at least, it did when I tested this thoroughly years ago).

Could alternatively default to localhost if unspecified, which is completely invalid for anything other than basic testing, and force users to specify their outgoing multicast interface all the time INADDR_ANY as you're suggesting and let the OS choose, but this "resolve your hostname" thing proved to be correct most of the time on the systems I was working on.

(Not sure why my brain jumped off the track and started talking about localhost up there... ignore that)

"`interface` parameter. (You can ignore this message if you're "
"actually just doing loopback testing.)")
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
socket.inet_aton(self.interface))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really do need set the multicast interface. Otherwise people running this on multi-homed machines won't be able to specify which interface to receive multicast traffic on.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, now that I've said that, I think you're right... you only have to do the IP_MULTICAST_IF call if you're sending multicast traffic. The 2nd param to the IP_ADD_MEMBERSHIP call below is how we specify the interface to receive on.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my understanding as well -- IP_MULTICAST_IF for sending, IP_ADD_MEMBERSHIP for receiving 👍

)


# bind to mc_group:port

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent here, even though you don't have to bind to the mc_group IP, was to hopefully have netstat -pnave --udp output display the mc_group IP instead of INADDR_ANY. That being said, I don't recall if it actually had that effect. It does work, though, and is simply more explicit than binding to INADDR_ANY.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, happy to roll this back 👍

# NOTE: This means hostname cannot be an alias to localhost, or
# you won't be able to send IGMP packets correctly.
#
self.interface = socket.gethostbyname(socket.gethostname())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this was written, we always set the multicast interface using self.interface, but we try to guess a sane default if it wasn't specified. I was trying to avoid using INADDR_ANY by default for this because if your system has multiple ethernet interfaces, the OS will "pick one", the behavior of which is undefined as far as I know, almost definitely varies by OS, and probably not what you want at least half the time.

If what you'd like to see happen is let the OS pick if we haven't specified (which is admittedly probably more likely to be expected by developers), instead of making this resolv-your-hostname guess, then I'd say we should simply set self.interface = socket.INADDR_ANY right here and leave the existing IP_ADD_MEMBERSHIP setsockopt code below alone.

But then, if we're changing the class's behavior, you have to update the docstring at the top of init as well.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting and I'd like to understand better how our environments differ, because at least in my environment, I would never expect the current implementation to provide a sane default, but would usually expect the proposed implementation to.

From my side, we are deploying on an Ubuntu server. As you would expect for a 'normal' Ubuntu / Debian server, there is an entry in /etc/hosts like 127.0.1.1 my-example-hostname. As our /etc/nsswitch.conf configuration is also left to defaults, /etc/hosts will be preferred over DNS as a source for host resolution. So, if I call socket.gethostbyname(socket.gethostname()) on my system, I am going to be returned 127.0.1.1, which is not a usable interface for multicast because it's in the loopback range.

In contrast, notwithstanding that the OS might not pick the right or best interface if I specify socket.INADDR_ANY, it should at least pick a non-loopback interface, and (I won't pretend to have a great understanding of what's happening under the hood, but) an interface that it expects to be able to receive on from the specified multicast group.

Curious how this reads from your side, is there anything obviously different in our environments that would account for our different experiences?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting and I'd like to understand better how our environments differ, because at least in my environment, I would never expect the current implementation to provide a sane default, but would usually expect the proposed implementation to.

From my side, we are deploying on an Ubuntu server. As you would expect for a 'normal' Ubuntu / Debian server, there is an entry in /etc/hosts like 127.0.1.1 my-example-hostname. As our /etc/nsswitch.conf configuration is also left to defaults, /etc/hosts will be preferred over DNS as a source for host resolution. So, if I call socket.gethostbyname(socket.gethostname()) on my system, I am going to be returned 127.0.1.1, which is not a usable interface for multicast because it's in the loopback range.

This is practically a religious debate among old-school Linux users. ;-) Traditionally, RHEL does NOT set the hostname to an alias in /ec/hosts because servers should be able to resolve their IPs (and it's actually considered INCORECT for your hostname to resolve to localhost). In contrast, Fedora sets a localhost-ish alias to the hostname in /etc/hosts just like Ubuntu now does, because these distros cater to home/mobile users. Ubuntu Server does the same thing as regular Ubuntu (pretty sure), but then they're practically the same product compared to the RHEL/Fedora split.

So, in a RHEL environment (at least as of a few years ago), it's perfectly safe to assume your hostname resolves to your actual IP address. I primarily use Ubuntu (now, because new job was all Debian/Ubuntu), but can't stand the localhost alias and remove it on my servers (I've run down multiple troubleshooting rabbit holes over the years that went off the rails because of bad assumptions being made because the hostname is resolving to localhost). I even go so far as to patch the OpenRVDAS install script to NOT make that alias (pretty sure it still does that).

That being said, years removed from writing this code, I agree you're right: the use case for my resolve-hostname-to-guess is overcomplicated and is going to definitely be wrong on a non-trivial percentage of distros. Just using INADDR_ANY will at least be correct all the time on systems with only a single NIC. Defaulting to INADDR_ANY seems much more "general case".

In contrast, notwithstanding that the OS might not pick the right or best interface if I specify socket.INADDR_ANY, it should at least pick a non-loopback interface, and (I won't pretend to have a great understanding of what's happening under the hood, but) an interface that it expects to be able to receive on from the specified multicast group.

The problem is that in an environment with multiple network interfaces, the kernel CANNOT make a smart choice. Multicast traffic doesn't follow normal routing rules, so it's going to do something silly but hopefully at least deterministic (e.g., always pick the interface with the lowest MAC address). So if you have 2 interfaces, say eth0 on 10.1.10.0/24 and eth1 on 192.168.1.0/24, and INADDR_ANY happens to cause eth1 to be used, your IGMP messages go to 192.168.1.255. But if the actual multicast server is sitting on 10.1.10.0/24, it won't get your IGMP message at all... and you'll get no multicast traffic even though you "joined the group".

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation @veggiemike , I've never worked with RHEL so that's news to me but useful to know! And point well taken RE: kernel not being able to make a smart choice between multiple NICs.

Notwithstanding, I'm taking away that you're happy to adopt INADDR_ANY as the default case all things considered (correct me if I'm wrong)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. 😎

Comment thread logger/readers/udp_reader.py Outdated
#
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
socket.inet_aton(self.mc_group) + socket.inet_aton(self.interface))
if self.interface:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested above, just blindly use self.interface as originally written w/out any conditional code. We've either been passed in a value or already assigned it to INADDR_ANY.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm totally happy in principle to avoid having conditional code here, however it's a little fiddlier than just setting self.interface = socket.INADDR_ANY in the constructor, because the existing code here calls inet_aton on the value, which we can't pass socket.INADDR_ANY to. We could set self.interface = socket.inet_aton(interface) or socket.INADDR_ANY in the constructor to avoid conditional logic here.

An advantage of the conditional code is that it leaves your battle-tested implementation in place for the one path, and uses our Investigator-battle-tested implementation for the other path.

No strong feelings about this one though, happy to do what the crowd is happy with :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm totally happy in principle to avoid having conditional code here, however it's a little fiddlier than just setting self.interface = socket.INADDR_ANY in the constructor, because the existing code here calls inet_aton on the value, which we can't pass socket.INADDR_ANY to. We could set self.interface = socket.inet_aton(interface) or socket.INADDR_ANY in the constructor to avoid conditional logic here.

Yep, I missed that. I still like setting it while checking params at the top of the function and then just using it, just make sure it's the same data type. So something like this:

        if interface:
            # resolve once in constructor                                                                                               
            self.interface = socket.inet_aton(socket.gethostbyname(interface))
        else:
            self.interface = socket.INADDR_ANY

Then further down we can just use self.interface consistently w/out calling inet_aton.

While we're at it, we should also prep self.mc_group in the constructor if specified (i.e., make sure if inet_aton is gonna fail, it happens during construction). So stick something like this right after we set self.interface:

        if mc_group:
            # resolve once in constructor
            self.mc_group = socket.inet_aton(socket.gethostbyname(mc_group))
        else:
            self.mc_group = None

That way both self.interface and self.mc_group are ready to get passed into getsockopt calls after construction.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with what you've suggested here, thanks -- slightly under the pump with other work but will aim to push a commit soon!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I know the feeling.

@veggiemike

Copy link
Copy Markdown
Contributor

And lastly, apologies for taking so long to look at this. Saying "I've been busy" would be a gross understatement. 🤣

@ce-rvi

ce-rvi commented Jun 1, 2026

Copy link
Copy Markdown
Author

Hi @veggiemike, no worries and thanks for finding the time!

I've replied to your comments individually.

@veggiemike

Copy link
Copy Markdown
Contributor

Cheers from somewhere near Hawaii!

@davidpablocohn-claude-bot

Copy link
Copy Markdown
Contributor

Hi all - checking in on whether we're close to having this resolved...

…uctor. Use inet_ntoa to convert addresses back to strings for sock.bind calls.

Signed-off-by: Charlie Evans <charles.evans@csiro.au>
@ce-rvi

ce-rvi commented Jun 23, 2026

Copy link
Copy Markdown
Author

@veggiemike apols again for the delay, I've been pulled every which way this month! Would you mind taking another look at this in light of my latest commit? I've moved all the conditional mc_group / interface logic up into the constructor as discussed which I think works nicely. It did have the unintended consequence of complicating the sock.bind calls at the bottom of _open_socket because as far as I know, bind only takes the address in string form ... not a showstopper, I've just converted them back as required with socket.inet_ntoa, but let me know if any objections. Similarly, I had to change the default unicast 'interface' value from '' to socket.inet_aton('0.0.0.0') to make self.interface type-consistent across the board, but this is functionally equivalent to my understanding.

Thanks in advance if you've time to take a look!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants