[PATCH] performance increase with SO_REUSEPORT on Linux 3.9+

Hi,

There is a SO_REUSEPORT socket option available on Linux 3.9+, described
in this LWN article:

    https://lwn.net/Articles/542629/

It is of particular interest to DNS servers, since it allows more evenly
distributing incoming queries among multiple UDP server sockets bound to
the same port:

    As with TCP, SO_REUSEPORT allows multiple UDP sockets to be bound to
    the same port. This facility could, for example, be useful in a DNS
    server operating over UDP. With SO_REUSEPORT, each thread could use
    recv() on its own socket to accept datagrams arriving on the port.
    The traditional approach is that all threads would compete to
    perform recv() calls on a single shared socket. As with the second
    of the traditional TCP scenarios described above, this can lead to
    unbalanced loads across the threads. By contrast, SO_REUSEPORT
    distributes datagrams evenly across all of the receiving threads.

    Tom noted that the traditional SO_REUSEADDR socket option already
    allows multiple UDP sockets to be bound to, and accept datagrams on,
    the same UDP port. However, by contrast with SO_REUSEPORT,
    SO_REUSEADDR does not prevent port hijacking and does not distribute
    datagrams evenly across the receiving threads.

There is also a slide deck available here:

    http://domsch.com/linux/lpc2010/Scaling_techniques_for_servers_with_high_connection%20rates.pdf

I've written a patch for Unbound that enables per-thread server sockets
with SO_REUSEPORT on Linux. It must be explicitly requested with
"so-reuseport: yes" because Linux < 3.9 doesn't support the socket
option (setsockopt() will fail with ENOPROTOOPT). Because the option has
different semantics on Linux than on other platforms with a similarly
named socket option, it has no effect on non-Linux platforms.
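
To illustrate the per-thread socket idea outside of Unbound's C code,
here is a minimal Python sketch, assuming a kernel and Python build that
expose SO_REUSEPORT. The helper name open_reuseport_sockets is
hypothetical and is not taken from the patch; it only shows that several
UDP sockets can be bound to one port when each sets the option before
bind().

```python
import socket


def open_reuseport_sockets(host, port, count):
    """Open `count` UDP sockets that are all bound to the same (host, port).

    Hypothetical helper for illustration only; it is not Unbound code.
    """
    socks = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            socks.append(s)
            # The option must be set on every socket before bind().
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
            s.bind((host, port))
            # If port 0 was requested, reuse whatever port the kernel chose.
            port = s.getsockname()[1]
    except OSError:
        for s in socks:
            s.close()
        raise
    return socks
```

Each worker thread would then call recvfrom() on its own socket from the
returned list instead of competing for a single shared socket.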

Under my simplistic 100% cache hit microbenchmark, Unbound with
SO_REUSEPORT support running on Linux 3.12 achieves somewhat lower CPU
utilization at all query rates except the very highest (I suspect this
is because the patched Unbound is delivering a slightly higher response
rate). E.g., at 810Kq/s from the traffic generator, unbound trunk
(r3036) consumes 75% of the system's 4 CPUs, while unbound with
SO_REUSEPORT consumes 66%. I've attached a plot showing the benchmark
data graphically.

(attachments)

0001-Add-so-reuseport-option-to-enable-SO_REUSEPORT-on-li.patch (18.9 KB)
plot-unbound-soreuseport.png

Hi Robert,

> Hi,
>
> There is a SO_REUSEPORT socket option available on Linux 3.9+,
> described in this LWN article:

Thank you very much for this patch. I have applied it.
(small changes in header comments, and documentation entry).

The only thing I wonder is whether it wouldn't be better to make this
the default setting. It could then be disabled by mentioning it in the
config file? This would still only apply to Linux systems (unless we
had some way for a similar effect on other systems, and then we have
to rename this linux-reuseport?)

Best regards,
   Wouter

W.C.A. Wijngaards wrote:

> Hi Robert,
>
> > Hi,
> >
> > There is a SO_REUSEPORT socket option available on Linux 3.9+,
> > described in this LWN article:
>
> Thank you very much for this patch. I have applied it.
> (small changes in header comments, and documentation entry).
>
> The only thing I wonder is whether it wouldn't be better to make this
> the default setting. It could then be disabled by mentioning it in the
> config file? This would still only apply to Linux systems (unless we
> had some way for a similar effect on other systems, and then we have
> to rename this linux-reuseport?)
>
> Best regards,
>    Wouter

Hi, Wouter:

You could try to enable Linux/SO_REUSEPORT support if it is available,
by detecting whether __linux__ and SO_REUSEPORT are defined at compile
time. But it's possible for SO_REUSEPORT to be defined at compile time
but not usable at run time (e.g., new headers with an old kernel, as can
happen during a Linux distribution upgrade where the new userland is
temporarily running with the previous kernel), in which case
setsockopt() will fail with ENOPROTOOPT.
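
The gap between "defined" and "usable" can be sketched in Python, where
the attribute check plays the role of the compile-time test and the
setsockopt() call is the run-time test. This is only an illustration
under those assumptions; reuseport_usable is a hypothetical name and not
part of the patch.

```python
import errno
import socket


def reuseport_usable():
    """Return True if SO_REUSEPORT is defined and the running kernel accepts it.

    Hypothetical probe for illustration only; it is not Unbound code.
    """
    opt = getattr(socket, "SO_REUSEPORT", None)
    if opt is None:
        # Analogous to the constant being absent from the headers.
        return False
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        try:
            probe.setsockopt(socket.SOL_SOCKET, opt, 1)
        except OSError as exc:
            if exc.errno == errno.ENOPROTOOPT:
                # Defined in the headers, but rejected by the running kernel.
                return False
            raise
    return True
```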

With the current version of this patch, the daemon will fail to start if
it tries to set SO_REUSEPORT and setsockopt() fails. So I would not
recommend making this the default just yet.

It would be possible to detect if SO_REUSEPORT is not available at run
time and fall back to only using a single listening socket. That would
require a little more coordination between the daemon code and the code
that initializes the sockets but it's certainly possible.
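
A rough Python sketch of that fallback, assuming the only error worth
falling back on is ENOPROTOOPT. The function open_listeners and its
return shape are hypothetical and do not describe how the daemon code is
actually organized.

```python
import errno
import socket


def open_listeners(host, port, num_threads):
    """Return (sockets, per_thread).

    Tries one SO_REUSEPORT socket per thread and falls back to a single
    shared socket if the option is missing or rejected with ENOPROTOOPT.
    Hypothetical helper for illustration only; it is not Unbound code.
    """
    opt = getattr(socket, "SO_REUSEPORT", None)
    if opt is not None and num_threads > 1:
        socks = []
        try:
            for _ in range(num_threads):
                s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                socks.append(s)
                s.setsockopt(socket.SOL_SOCKET, opt, 1)
                s.bind((host, port))
                # Keep later sockets on the port the first one got.
                port = s.getsockname()[1]
            return socks, True
        except OSError as exc:
            for s in socks:
                s.close()
            if exc.errno != errno.ENOPROTOOPT:
                raise
    # Fallback: one socket that all threads share.
    shared = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        shared.bind((host, port))
    except OSError:
        shared.close()
        raise
    return [shared], False
```

The caller has to handle both shapes of the result, which is the extra
coordination mentioned above: either each thread gets its own socket, or
all threads share the single one.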

Hi Robert,

> It would be possible to detect if SO_REUSEPORT is not available at
> run time and fall back to only using a single listening socket.
> That would require a little more coordination between the daemon
> code and the code that initializes the sockets but it's certainly
> possible.

This is implemented: it detects whether SO_REUSEPORT is available and
uses it if so. You can thus enable this option on multiple platforms,
but it will take effect only on Linux platforms with the right kernel;
on the others it falls back to using the existing setup. This means you
can have the same config (and perhaps binary) on multiple machines.

Best regards,
   Wouter