I have a number of kvm instances running debian where unbound 1.7.1
fails.
Many of these instances run whichever kernel was current when I first
leased them, and do not support newer kernels.
(Others look on the fs for a kernel to kexec, but not all do.)
Debian of course compiles unbound on a kernel which supports
getrandom(2), but many of mine do not.
Unlike 1.6, 1.7 fails on such a machine, raising SIGKILL rather than
reading /dev/urandom.
It looks like getentropy_urandom() only needs CAN_REFERENCE_MAIN
defined, which getentropy_getrandom also needs, but still
getentropy_urandom() is ignored.
Deb's packaging makes no changes to that part of the code.
I've started work on an LD_PRELOAD lib to emulate getrandom(2) by
reading from urandom(4). Other than that, does anyone have any thoughts
on why this started breaking with 1.7.1?
James Cloos via Unbound-users <unbound-users@unbound.net> writes:
> I have a number of kvm instances running debian where unbound 1.7.1
> fails.
An LD_PRELOAD lib which implements getentropy(3) by read(2)ing
urandom(4) solved the bug.
Unbound *always* should fall back to urandom(4) when getentropy(3)
results in ENOSYS, even when compiled against a kernel which advertises
support for getrandom(2).
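
For readers who want to try the same workaround, here is a minimal sketch of what such an LD_PRELOAD shim could look like. It is not the library mentioned above; the file layout and error handling are assumptions, and it only illustrates the idea of serving getentropy(3) from /dev/urandom.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical shim: when preloaded, this definition is found before the
 * libc one, so callers of getentropy(3) read /dev/urandom instead of
 * issuing getrandom(2).
 */
int getentropy(void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t off = 0;
    int fd;

    if (len > 256) {            /* getentropy(3) rejects larger requests */
        errno = EIO;
        return -1;
    }

    do {
        fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    } while (fd == -1 && errno == EINTR);
    if (fd == -1)
        return -1;

    while (off < len) {
        ssize_t n = read(fd, p + off, len - off);
        if (n > 0)
            off += (size_t)n;
        else if (n == 0 || errno != EINTR)
            break;              /* EOF or hard error */
    }
    close(fd);

    if (off != len) {
        errno = EIO;
        return -1;
    }
    return 0;
}
```

Assuming the file is saved as getentropy_shim.c (a made-up name), it could be built with something like `cc -shared -fPIC -o libgetentropy.so getentropy_shim.c` and loaded with `LD_PRELOAD=./libgetentropy.so unbound -d`.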
But Unbound does that! It falls back to that when the other results in
ENOSYS.
What could be happening is that configure detects arc4random. If that
is the case, Unbound calls that arc4random. And then this library call
has to call getentropy, and it could be that that code does not
fall back? You can check the configure output for the arc4random check,
or afterwards look in config.log or config.h (whether HAVE_ARC4RANDOM is
defined or not).
If that is not the case, then we'd need to add debug log output
to see what happens.
Unbound *always* should fall back to urandom(4) when getentropy(3)
results in ENOSYS, even when compiled against a kernel which advertises
support for getrandom(2).
> But Unbound does that!
> It falls back to that when the other results in ENOSYS.
In the strace there is no attempt to open(2) "/dev/urandom".
Once the call to glibc's getentropy(3) fails, the process immediately
raises SIGKILL.
> What could be happening is that configure detects arc4random. If that
> is the case, Unbound calls that arc4random. And then this library call
> has to call getentropy, and it could be that that code does not
> fall back?
Indeed, compat/arc4random.c:_rs_stir calls getentropy and does not fall back.
> You can check the configure output for the arc4random check.
> Or afterwards in config.log or config.h (HAVE_ARC4RANDOM is defined or
> not defined).
I don't have the configure output; this is debian's compile.
> If that is not the case, then we'd need to add debug log output
> to see what happens.
What config is needed for that? verbosity:5 ? I don't see anything
else documented in unbound.conf(5).
Unbound *always* should fall back to urandom(4) when getentropy(3)
results in ENOSYS, even when compiled against a kernel which advertises
support for getrandom(2).
> But Unbound does that!
> It falls back to that when the other results in ENOSYS.
In the strace there is no attempt to open(2) "/dev/urandom".
Once the call to glibc's getentropy(3) fails, the process immediately
raises SIGKILL.
But in your strace you have a call to getrandom() that fails.
So, I would like to implement the fallback, but where? getrandom?
arc4random? getentropy? The strace says getrandom is called and gets
ENOSYS.
In compat/getentropy_linux.c that would result in a fallback to
getentropy_urandom(). That would not fail, and thus not raise SIGKILL.
So perhaps we are executing different code. What code? Depends on
configure time detection.
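
To make the expected behaviour concrete, here is a small sketch of the "try getrandom, fall back to /dev/urandom on ENOSYS" flow described above. It is not the code from compat/getentropy_linux.c; the names fill_entropy, read_urandom and the entropy_source enum are made up for illustration, and it reports which path was taken so the result can be compared with an strace.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type: which source actually delivered the bytes. */
enum entropy_source { SRC_FAILED = 0, SRC_GETRANDOM, SRC_URANDOM };

/* Fill buf from /dev/urandom; returns 0 on success, -1 on failure. */
static int read_urandom(void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t off = 0;
    int fd = open("/dev/urandom", O_RDONLY);

    if (fd == -1)
        return -1;
    while (off < len) {
        ssize_t n = read(fd, p + off, len - off);
        if (n > 0)
            off += (size_t)n;
        else if (n == 0 || errno != EINTR)
            break;                      /* EOF or hard error */
    }
    close(fd);
    return off == len ? 0 : -1;
}

/* Try getrandom(2) first; only ENOSYS triggers the urandom fallback. */
enum entropy_source fill_entropy(void *buf, size_t len)
{
    if (len > 256) {                    /* same limit as getentropy(3) */
        errno = EIO;
        return SRC_FAILED;
    }
#ifdef SYS_getrandom
    long n;
    do {
        n = syscall(SYS_getrandom, buf, len, 0);
    } while (n == -1 && errno == EINTR);
    if (n == (long)len)
        return SRC_GETRANDOM;
    if (n != -1 || errno != ENOSYS)
        return SRC_FAILED;              /* short read or another error */
#endif
    return read_urandom(buf, len) == 0 ? SRC_URANDOM : SRC_FAILED;
}
```

On a kernel without getrandom(2), fill_entropy() in this sketch would return SRC_URANDOM, and an strace of the caller would show an open(2) of "/dev/urandom" right after the failed syscall.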
> What could be happening is that configure detects arc4random. If that
> is the case, Unbound calls that arc4random. And then this library call
> has to call getentropy, and it could be that that code does not
> fall back?
Indeed, compat/arc4random.c:_rs_stir calls getentropy and does not fall back.
Yes, it does that. Should it fall back for some glibc implementations?
> You can check the configure output for the arc4random check.
> Or afterwards in config.log or config.h (HAVE_ARC4RANDOM is defined or
> not defined).
I don't have the configure output; this is debian's compile.
This is why I want to find out what happens, and what happened at
configure time. Not sure, but can you run ltrace? It should intercept
the library call and tell us whether unbound is calling a function from
glibc (or perhaps from some other library, i.e. a mis-exported internal
compat function from another program). Then we'd know more about which
code in unbound gets executed.
And then I can find out where the fallback code needs to be. I can also
make a patch for you in _rs_stir, but you'd need to compile to try it
(and when you're doing that, tell me what configure says at that compile
about which library functions are available and which are not).
> I don't have the configure output; this is debian's compile
I'll try to recompile the Debian package to catch configure output ... @James: which Debian version?
Here is a patch you can use to fix the problem for him; it calls the
urandom fallback when ENOSYS is returned in _rs_stir().
Hopefully that will alleviate the problems. The configure output that
would match this patch is one where getentropy is provided by the
system, not arc4random, and that getentropy then returns ENOSYS.
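
The patch itself is not reproduced in this thread. As a rough illustration of the idea only, a seeding routine with an ENOSYS fallback could look like the sketch below. All names here (entropy_fn, seed_with_fallback, urandom_fill, fake_enosys, fake_eio) are hypothetical, and the primary source is passed in as a function pointer so that the ENOSYS path can be exercised on a machine whose kernel does support getrandom(2).

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Same shape as getentropy(3): 0 on success, -1 with errno set on failure. */
typedef int (*entropy_fn)(void *buf, size_t len);

/* Fallback source: fill buf from /dev/urandom. */
int urandom_fill(void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t off = 0;
    int fd = open("/dev/urandom", O_RDONLY);

    if (fd == -1)
        return -1;
    while (off < len) {
        ssize_t n = read(fd, p + off, len - off);
        if (n > 0)
            off += (size_t)n;
        else if (n == 0 || errno != EINTR)
            break;                  /* EOF or hard error */
    }
    close(fd);
    if (off != len) {
        errno = EIO;
        return -1;
    }
    return 0;
}

/* Use the primary source; fall back to urandom only when it says ENOSYS. */
int seed_with_fallback(entropy_fn primary, void *seed, size_t len)
{
    if (primary(seed, len) == 0)
        return 0;
    if (errno != ENOSYS)
        return -1;                  /* genuine failure: caller decides */
    return urandom_fill(seed, len);
}

/* Stub that behaves like getentropy(3) on a kernel without getrandom(2). */
int fake_enosys(void *buf, size_t len)
{
    (void)buf;
    (void)len;
    errno = ENOSYS;
    return -1;
}

/* Stub that fails for a reason which must not trigger the fallback. */
int fake_eio(void *buf, size_t len)
{
    (void)buf;
    (void)len;
    errno = EIO;
    return -1;
}
```

In this sketch, seed_with_fallback(fake_enosys, seed, sizeof seed) succeeds through /dev/urandom, while the fake_eio variant still reports failure, so only the ENOSYS case changes behaviour.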
> I don't have the configure output; this is debian's compile
> I'll try to recompile the Debian package to catch configure output ... @James: which Debian version?
I didn't see that until just now; I had replied initially to the copies
I got parallel to the list, and so missed your replies (and the proposed
patch).