Wrong source IP for reply if 'ip-address' is not specified

Hi,

If I don’t specify the IP addresses on which NSD should bind, the source
address of the reply is the one attached to the outgoing interface
instead of the one the request was sent to.

I use NSD 4.1.18 on FreeBSD 11.1-STABLE r326743.

morvan ~ # ip a s eth0 | grep inet
    inet 89.234.186.5/32 brd 89.234.186.5 scope global eth0
    inet6 2a00:5884::5/64 scope global
    inet6 fe80::6465:64ff:fe62:6331/64 scope link
morvan ~ # dig -t TXT hostname.as112.net @blackhole-1.iana.org
;; reply from unexpected source: 2a00:5884:0:100::1:10#53, expected 2620:4f:8000::6#53
;; reply from unexpected source: 89.234.186.134#53, expected 192.175.48.6#53
^Cmorvan ~ #

root@as112:~ # ifconfig vtnet0.102 | grep inet
        inet 89.234.186.134 netmask 0xfffffff8 broadcast 89.234.186.135
        inet6 fe80::8074:b5ff:fe78:d83c%vtnet0.102 prefixlen 64 scopeid 0x5
        inet6 2a00:5884:0:100::1:10 prefixlen 112
root@as112:~ # ifconfig lo1 | grep inet
        inet 192.175.48.1 netmask 0xffffff00
        inet 192.175.48.6 netmask 0xffffff00
        inet 192.175.48.42 netmask 0xffffff00
        inet 192.31.196.1 netmask 0xffffff00
        inet6 2620:4f:8000::6 prefixlen 64
        inet6 2620:4f:8000::42 prefixlen 64
        inet6 2001:4:112::1 prefixlen 64
        inet6 2620:4f:8000::1 prefixlen 64
root@as112:~ # route -6 -n get 2a00:5884::5
   route to: 2a00:5884::5
destination: 2a00:5884::
       mask: ffff:ffff::
    gateway: 2a00:5884:0:100::1:2
        fib: 0
  interface: vtnet0.102
      flags: <UP,GATEWAY,DONE,PROTO1>
recvpipe sendpipe ssthresh rtt,msec mtu weight expire
       0 0 0 0 1500 1 0
root@as112:~ # route -n get 89.234.186.5
   route to: 89.234.186.5
destination: 89.234.186.0
       mask: 255.255.255.0
    gateway: 89.234.186.130
        fib: 0
  interface: vtnet0.102
      flags: <UP,GATEWAY,DONE,PROTO1>
recvpipe sendpipe ssthresh rtt,msec mtu weight expire
       0 0 0 0 1500 1 0

09:37:35.342718 IP6 2a00:5884::5.44418 > 2620:4f:8000::6.53: 16577+ [1au] TXT? hostname.as112.net. (59)
09:37:35.343173 IP6 2a00:5884:0:100::1:10.53 > 2a00:5884::5.44418: 16577*- 4/2/1 TXT "grifon" "Rennes, FR", TXT "See http://www.as112.net/ for more information.", TXT "See https://monitoring.
09:37:36.343048 IP 89.234.186.5.41908 > 192.175.48.6.53: 16577+ [1au] TXT? hostname.as112.net. (59)
09:37:36.343261 IP 89.234.186.134.53 > 89.234.186.5.41908: 16577*- 4/2/1 TXT "grifon" "Rennes, FR", TXT "See http://www.as112.net/ for more information.", TXT "See https://monitoring.grifon.f

So, the request is addressed to 2620:4f:8000::6 but the reply comes from
2a00:5884:0:100::1:10 (and, for IPv4, 192.175.48.6 versus
89.234.186.134).
However, if I specify the addresses in nsd.conf, everything works:

morvan ~ # dig -t TXT hostname.as112.net @blackhole-1.iana.org

; <<>> DiG 9.11.1-P3 <<>> -t TXT hostname.as112.net @blackhole-1.iana.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29606
;; flags: qr aa rd; QUERY: 1, ANSWER: 4, AUTHORITY: 2, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;hostname.as112.net. IN TXT

;; ANSWER SECTION:
hostname.as112.net. 604800 IN TXT "grifon" "Rennes, FR"
hostname.as112.net. 604800 IN TXT "See http://www.as112.net/ for more information."
hostname.as112.net. 604800 IN TXT "See https://monitoring.grifon.fr/munin/grifon.fr/as112.grifon.fr/index.html#dns for statistics."
hostname.as112.net. 604800 IN TXT "Unicast IP: 89.234.186.134"

;; AUTHORITY SECTION:
hostname.as112.net. 604800 IN NS blackhole-2.iana.org.
hostname.as112.net. 604800 IN NS blackhole-1.iana.org.

;; Query time: 24 msec
;; SERVER: 2620:4f:8000::6#53(2620:4f:8000::6)
;; WHEN: Wed Dec 13 09:38:02 CET 2017
;; MSG SIZE rcvd: 344

09:38:02.559512 IP6 2a00:5884::5.35278 > 2620:4f:8000::6.53: 29606+ [1au] TXT? hostname.as112.net. (59)
09:38:02.582280 IP6 2620:4f:8000::6.53 > 2a00:5884::5.35278: 29606*- 4/2/1 TXT "grifon" "Rennes, FR", TXT "See http://www.as112.net/ for more information.", TXT "See https://monitoring.grifon

The complete NSD configuration is available here:
https://www.swordarmor.fr/le-noeud-as112-chez-grifon-et-breizh-ix.html#nsd.conf
(the article is in French, but the configuration is commented in English)

I don’t know whether this is already known, or whether it is considered
an issue at all.

Regards,

Hi Alarig,

> If I don’t specify the IP addresses on which NSD should bind, the source
> address of the reply is the one attached to the outgoing interface
> instead of the one the request was sent to.

This is normal behaviour. On a server with multiple interfaces and
addresses, it is best if you explicitly specify all the addresses to
which NSD should bind.

Regards,
Anand Buddhdev
RIPE NCC

Hi Anand,

We have a different opinion on what is "normal behaviour". I believe the
normal behaviour is to reply from the IP address on which you received
the packet, e.g. using:

err = setsockopt(s, SOL_IP, IP_PKTINFO, &opt, sizeof(opt));

or

err = setsockopt(s, IPPROTO_IP, IP_RECVDSTADDR, &opt, sizeof(opt));

For example: https://github.com/libreswan/libreswan/blob/master/programs/pluto/udpfromto.c
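
As a rough illustration of the two calls above, a helper along these lines could pick whichever option the platform offers. This is only a sketch under stated assumptions: the name enable_dstaddr is hypothetical, IPPROTO_IP is used as the portable spelling of SOL_IP, and the IPv6 branch (IPV6_RECVPKTINFO, from RFC 3542) is an addition that is not mentioned above.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: ask the kernel to report, as ancillary data, the
 * destination address of every datagram received on socket s.
 * family is AF_INET or AF_INET6. Returns 0 on success, -1 on failure. */
int enable_dstaddr(int s, int family)
{
    int on = 1;

    if (family == AF_INET6) {
#if defined(IPV6_RECVPKTINFO)
        /* RFC 3542 option name */
        return setsockopt(s, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on));
#else
        return -1;
#endif
    }
#if defined(IP_PKTINFO)
    /* Linux and other systems that provide IP_PKTINFO */
    return setsockopt(s, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
#elif defined(IP_RECVDSTADDR)
    /* FreeBSD and other BSDs */
    return setsockopt(s, IPPROTO_IP, IP_RECVDSTADDR, &on, sizeof(on));
#else
    return -1;
#endif
}
```

Enabling the option only makes the destination address available to recvmsg(2); the reply still has to be sent with that address as its source.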

I assumed nsd would do this....

Paul

Hi Paul,

> We have a different opinion on what is "normal behaviour". I believe the
> normal behaviour is to reply from the IP address on which you received
> the packet, e.g. using:
>
> err = setsockopt(s, SOL_IP, IP_PKTINFO, &opt, sizeof(opt));
>
> or
>
> err = setsockopt(s, IPPROTO_IP, IP_RECVDSTADDR, &opt, sizeof(opt));

I don't know if these options are available in non-Linux socket
implementations; if they are not, that is probably the reason NSD
doesn't use them.
But I'm sure Wouter can comment more definitively.

I know the questions will come, so let me try to anticipate them and
answer them. Someone might ask why this isn't necessary with BIND. This
is because BIND attempts to detect the capability of the OS it's running
on, and compensate for the cases where these advanced options are not
present. This may make it easier for an operator, but at the expense of
more code complexity. I really do prefer NSD's simpler approach.

Note also that this case is very clearly documented. From nsd.conf:

ip-address: <ip4 or ip6>[@port]
    NSD will bind to the listed ip-address. Can be given multiple times
    to bind multiple ip-addresses. Optionally, a port number can be
    given. If none are given NSD listens to the wildcard
    interface. Same as commandline option -a. For servers with multiple
    IP addresses that can be used to send traffic to the internet, list
    them one by one, or the source address of replies could be wrong.

    This is because if the udp socket associates a source address of
    0.0.0.0 then the kernel picks an ip-address with which to send to
    the internet, and it picks the wrong one. Typically needed for
    anycast instances. Use ip-transparent to be able to list
    addresses that turn on later (typical for certain load-balancing).
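
As an illustration for the setup described at the start of this thread, the explicit listing could look roughly like the fragment below. The addresses are taken from the lo1 output in the original report; this is an assumed sketch, not the poster's actual nsd.conf.

```
server:
    ip-address: 192.175.48.1
    ip-address: 192.175.48.6
    ip-address: 192.175.48.42
    ip-address: 192.31.196.1
    ip-address: 2620:4f:8000::1
    ip-address: 2620:4f:8000::6
    ip-address: 2620:4f:8000::42
    ip-address: 2001:4:112::1
```

With one ip-address line per anycast address, each reply leaves from the address of the socket that received the query.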

Regards,
Anand

> > We have a different opinion on what is "normal behaviour". I believe the
> > normal behaviour is to reply from the IP address on which you received
> > the packet, e.g. using:
> >
> > err = setsockopt(s, SOL_IP, IP_PKTINFO, &opt, sizeof(opt));
> >
> > or
> >
> > err = setsockopt(s, IPPROTO_IP, IP_RECVDSTADDR, &opt, sizeof(opt));
>
> I don't know if these options are available in non-Linux socket
> implementations; if they are not, that is probably the reason NSD
> doesn't use them.

Yes they are, but apparently on FreeBSD you use IP_RECVDSTADDR.

https://www.freebsd.org/cgi/man.cgi?query=ip&sektion=4&manpath=FreeBSD+9.0-RELEASE

      If the IP_RECVDSTADDR option is enabled on a SOCK_DGRAM socket, the
      recvmsg(2) call will return the destination IP address for a UDP
      datagram. The msg_control field in the msghdr structure points to a
      buffer that contains a cmsghdr structure followed by the IP address.
      The cmsghdr fields have the following values:

      cmsg_len = sizeof(struct in_addr)
      cmsg_level = IPPROTO_IP
      cmsg_type = IP_RECVDSTADDR

      The source address to be used for outgoing UDP datagrams on a socket
      that is not bound to a specific IP address can be specified as
      ancillary data with a type code of IP_SENDSRCADDR. The msg_control
      field in the msghdr structure should point to a buffer that contains
      a cmsghdr structure followed by the IP address. The cmsghdr fields
      should have the following values:

      [...]
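
Putting the two halves of that manual page excerpt together, a receive/reply pair might look like the sketch below. The function names recv_with_dst and send_from are hypothetical, only IPv4 is handled, and the IP_PKTINFO branch is the Linux counterpart rather than anything the FreeBSD page documents. The socket is assumed to have the receive option enabled already.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc: expose struct in_pktinfo */
#endif
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control-message buffer, aligned for struct cmsghdr. */
union ctlbuf { struct cmsghdr align; char buf[128]; };

/* Receive one datagram on s. Stores the sender in *peer and the local
 * address the datagram was sent to in *dst. Returns the payload length,
 * or -1 on error or if the kernel reported no destination address. */
ssize_t recv_with_dst(int s, void *buf, size_t len,
                      struct sockaddr_in *peer, struct in_addr *dst)
{
    union ctlbuf ctl;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    struct cmsghdr *c;
    int found = 0;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = peer;
    msg.msg_namelen = sizeof(*peer);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);
    if ((n = recvmsg(s, &msg, 0)) < 0)
        return -1;
    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level != IPPROTO_IP)
            continue;
#if defined(IP_PKTINFO)
        if (c->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo pi;
            memcpy(&pi, CMSG_DATA(c), sizeof(pi));
            *dst = pi.ipi_addr;      /* destination from the IP header */
            found = 1;
        }
#elif defined(IP_RECVDSTADDR)
        if (c->cmsg_type == IP_RECVDSTADDR) {
            memcpy(dst, CMSG_DATA(c), sizeof(*dst));
            found = 1;
        }
#endif
    }
    return found ? n : -1;
}

/* Send len bytes to *peer, asking the kernel to use *src as the source. */
ssize_t send_from(int s, const void *buf, size_t len,
                  const struct sockaddr_in *peer, const struct in_addr *src)
{
    union ctlbuf ctl;
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg;
    struct cmsghdr *c;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_name = (void *)peer;
    msg.msg_namelen = sizeof(*peer);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
#if defined(IP_PKTINFO)
    struct in_pktinfo pi;
    memset(&pi, 0, sizeof(pi));
    pi.ipi_spec_dst = *src;          /* local address to send from */
    msg.msg_controllen = CMSG_SPACE(sizeof(pi));
    c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = IPPROTO_IP;
    c->cmsg_type = IP_PKTINFO;
    c->cmsg_len = CMSG_LEN(sizeof(pi));
    memcpy(CMSG_DATA(c), &pi, sizeof(pi));
#elif defined(IP_SENDSRCADDR)
    msg.msg_controllen = CMSG_SPACE(sizeof(*src));
    c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = IPPROTO_IP;
    c->cmsg_type = IP_SENDSRCADDR;
    c->cmsg_len = CMSG_LEN(sizeof(*src));
    memcpy(CMSG_DATA(c), src, sizeof(*src));
#else
    (void)c; (void)src;              /* no way to choose the source here */
    msg.msg_control = NULL;
#endif
    return sendmsg(s, &msg, 0);
}
```

A server loop would call recv_with_dst(), build its answer, and hand the reported dst straight to send_from(), so that a query sent to 192.175.48.6 is answered from 192.175.48.6 even on a wildcard socket.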

There is really no excuse to answer using the wrong IP :P

> I know the questions will come, so let me try to anticipate them and
> answer them. Someone might ask why this isn't necessary with BIND. This
> is because BIND attempts to detect the capability of the OS it's running
> on, and compensate for the cases where these advanced options are not
> present. This may make it easier for an operator, but at the expense of
> more code complexity. I really do prefer NSD's simpler approach.

BIND, PowerDNS, Unbound and Knot use these mechanisms. Someone should
really just fix it for NSD as well.

Paul

> > > We have a different opinion on what is "normal behaviour". I believe the
> > > normal behaviour is to reply from the IP address on which you received
> > > the packet, e.g. using:
> > >
> > > err = setsockopt(s, SOL_IP, IP_PKTINFO, &opt, sizeof(opt));
> > >
> > > or
> > >
> > > err = setsockopt(s, IPPROTO_IP, IP_RECVDSTADDR, &opt, sizeof(opt));
> >
> > I don't know if these options are available in non-Linux socket
> > implementations; if they are not, that is probably the reason NSD
> > doesn't use them.
>
> Yes they are, but apparently on FreeBSD you use IP_RECVDSTADDR.
>
> https://www.freebsd.org/cgi/man.cgi?query=ip&sektion=4&manpath=FreeBSD+9.0-RELEASE
>
>      If the IP_RECVDSTADDR option is enabled on a SOCK_DGRAM socket, the
>      recvmsg(2) call will return the destination IP address for a UDP
>      datagram. The msg_control field in the msghdr structure points to a
>      buffer that contains a cmsghdr structure followed by the IP address.
>      The cmsghdr fields have the following values:
>
>      cmsg_len = sizeof(struct in_addr)
>      cmsg_level = IPPROTO_IP
>      cmsg_type = IP_RECVDSTADDR
>
>      The source address to be used for outgoing UDP datagrams on a socket
>      that is not bound to a specific IP address can be specified as
>      ancillary data with a type code of IP_SENDSRCADDR. The msg_control
>      field in the msghdr structure should point to a buffer that contains
>      a cmsghdr structure followed by the IP address. The cmsghdr fields
>      should have the following values:
>
>      [...]
>
> There is really no excuse to answer using the wrong IP :P

It is long-established and expected behaviour for UDP unless the socket
is bound to a specific address. IP_PKTINFO/IP_SENDSRCADDR are relatively
new, where they are supported at all (I have a feeling macOS might still
not have IP_SENDSRCADDR).

> > I know the questions will come, so let me try to anticipate them and
> > answer them. Someone might ask why this isn't necessary with BIND. This
> > is because BIND attempts to detect the capability of the OS it's running
> > on, and compensate for the cases where these advanced options are not
> > present. This may make it easier for an operator, but at the expense of
> > more code complexity. I really do prefer NSD's simpler approach.
>
> BIND, PowerDNS, Unbound and Knot use these mechanisms. Someone should
> really just fix it for NSD as well.

You are correct about Unbound, PowerDNS and Knot. BIND doesn't
use these mechanisms at all; it looks up the local addresses and binds
to every one it finds unless instructed otherwise.
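
The approach attributed to BIND here can be sketched with getifaddrs(3): walk the local addresses and bind one UDP socket to each, so every reply naturally leaves from the address its socket is bound to. This is an assumed illustration, not BIND's actual code. The name bind_all_local is hypothetical, and a real server would also have to notice addresses that appear or disappear later.

```c
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Bind one UDP socket per local IPv4/IPv6 address on the given port
 * (host byte order; 0 lets the kernel choose a port per socket).
 * Stores up to max descriptors in fds. Returns the number of sockets
 * bound, or -1 if the address list could not be read. Addresses that
 * cannot be bound are skipped. */
int bind_all_local(unsigned short port, int *fds, int max)
{
    struct ifaddrs *list, *ifa;
    int n = 0;

    if (getifaddrs(&list) != 0)
        return -1;
    for (ifa = list; ifa != NULL && n < max; ifa = ifa->ifa_next) {
        socklen_t len;
        int s;

        if (ifa->ifa_addr == NULL)
            continue;
        if (ifa->ifa_addr->sa_family == AF_INET) {
            len = sizeof(struct sockaddr_in);
            ((struct sockaddr_in *)ifa->ifa_addr)->sin_port = htons(port);
        } else if (ifa->ifa_addr->sa_family == AF_INET6) {
            len = sizeof(struct sockaddr_in6);
            ((struct sockaddr_in6 *)ifa->ifa_addr)->sin6_port = htons(port);
        } else {
            continue;                /* not an IP address (e.g. link layer) */
        }
        s = socket(ifa->ifa_addr->sa_family, SOCK_DGRAM, 0);
        if (s < 0)
            continue;
        if (bind(s, ifa->ifa_addr, len) != 0) {
            close(s);
            continue;
        }
        fds[n++] = s;
    }
    freeifaddrs(list);
    return n;
}
```

The trade-off against the ancillary-data approach is one socket per address instead of one wildcard socket, in exchange for not depending on IP_PKTINFO or IP_SENDSRCADDR.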

Hi Paul and Anand,