Riddle me this: why does one machine fail to start SSL for the control channel?

So, I've got two nearly identical machines, with nearly identical configurations, both running unbound-1.2.0, yet only one of them will start unbound when the server-key-file and server-cert-file are not readable by the user unbound runs as after it gets started. The errors are appended below. (They are extremely unhelpful! I almost had to ktrace to find out which file(s) it was actually complaining about.)

By "nearly identical configurations" I mean I've copied the unbound.conf file from the working machine and only changed the interface and outgoing-interface settings.

Everything else on the two machines is identical, including all file and directory permissions. There are two other client machines, also with very similar configurations, where unbound starts without problems. Only this one machine is giving the error.

It's almost as if on one machine unbound is switching to the unprivileged user earlier than on the other.

The only other thing I can think of is something to do with the secondary group privileges the process somehow retains.

Perhaps unbound is not making a call to initgroups(3) or setgroups(2) to clear secondary group privileges and thus retains rights to the "wheel" group, which can in fact read the *.key and *.pem files.
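One way to test this theory is to look at the supplementary groups a process actually holds. The following is only a minimal sketch in C, not code from unbound; the function name print_supplementary_groups is made up for illustration. It uses getgroups(2), which reports the supplementary group list of the calling process.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of unbound): print the effective IDs and
 * the supplementary group IDs of the calling process.
 * Returns the number of supplementary groups, or -1 on error.
 * Note: POSIX leaves it unspecified whether the effective GID itself
 * appears in the list returned by getgroups(2).
 */
int print_supplementary_groups(FILE *out)
{
    /* A size of 0 asks only for the number of groups. */
    int n = getgroups(0, NULL);
    if (n < 0)
        return -1;

    gid_t *list = malloc(((size_t)n + 1) * sizeof(gid_t));
    if (list == NULL)
        return -1;

    if (n > 0)
        n = getgroups(n, list);
    if (n < 0) {
        free(list);
        return -1;
    }

    fprintf(out, "euid=%ld egid=%ld groups=",
            (long)geteuid(), (long)getegid());
    for (int i = 0; i < n; i++)
        fprintf(out, "%s%ld", i ? "," : "", (long)list[i]);
    fputc('\n', out);

    free(list);
    return n;
}
```

Called from a daemon right after it has switched user, this would show whether a group such as "wheel" is still in the list.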

I'm not sure what's different on the machine where unbound fails, but I do note that root does not have any additional groups set, while on the machines where it works, root has additional groups set, including "wheel":

[works] # id
uid=0(root) gid=0(wheel) groups=0(wheel),5(operator),20(staff),31(guest)

[fails] # id
uid=0(root) gid=0(wheel)

If this is the significant difference then indeed unbound-1.2.0 is failing to use setgroups(2) or initgroups(3) or best of all setusercontext(3) to ensure the unprivileged process dumps _all_ unnecessary privileges; and then of course it also needs to have already opened all privileged files prior to dropping privileges.
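The ordering described above can be sketched roughly as follows. This is an illustration under assumptions, not unbound's actual code: the function names drop_privileges and open_then_drop are hypothetical, and a real daemon would open every privileged file (not just one) before dropping. The sketch uses initgroups(3), setgid(2) and setuid(2); when called as root, the latter two change the real, effective and saved IDs.

```c
#define _DEFAULT_SOURCE   /* exposes initgroups(3) on glibc */
#include <sys/types.h>
#include <errno.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: permanently switch to `username`.
 * The order matters: supplementary groups first, then the GID,
 * and the UID last, because after setuid() the process no longer
 * has the right to change its groups.
 * Returns 0 on success, -1 on failure.
 */
int drop_privileges(const char *username)
{
    struct passwd *pw = getpwnam(username);
    if (pw == NULL) {
        fprintf(stderr, "user '%s' not found\n", username);
        return -1;
    }
    /* Replace inherited supplementary groups with the user's own. */
    if (initgroups(pw->pw_name, pw->pw_gid) != 0) {
        perror("initgroups");
        return -1;
    }
    if (setgid(pw->pw_gid) != 0) {
        perror("setgid");
        return -1;
    }
    if (setuid(pw->pw_uid) != 0) {
        perror("setuid");
        return -1;
    }
    return 0;
}

/*
 * Hypothetical helper: open a privileged file while still privileged,
 * naming the file in any error message, and only then drop privileges.
 * Returns the open stream, or NULL on failure.
 */
FILE *open_then_drop(const char *key_path, const char *username)
{
    FILE *key = fopen(key_path, "r");
    if (key == NULL) {
        fprintf(stderr, "cannot open %s: %s\n", key_path, strerror(errno));
        return NULL;
    }
    if (drop_privileges(username) != 0) {
        fclose(key);
        return NULL;
    }
    return key;
}
```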

Hi Greg,

Thank you for this report. It looks like it is indeed the groups. I did
not know about secondary group privileges. I'll put it on my todo list
to print the SSL filenames, and to call initgroups(3).

Unbound 1.2.0 calls setresgid but not initgroups.

Please note that the server needs 3 files to operate:
server-key-file, server-cert-file and
control-cert-file (unbound_control.pem). It uses that last one to
authenticate the client.

Best regards,
   Wouter

Greg A. Woods; Planix, Inc. wrote:

I have a strange case on a CentOS machine: if unbound is started at
boot via the initscripts, it fails, but if I log in and run
'service unbound start', it starts fine.

There isn't much in the logs, even with verbosity:4

Feb 10 21:35:29 resolver unbound: [1607:0] error: Error setting up SSL_CTX key and cert crypto error:0200100D:system library:fopen:Permission denied
Feb 10 21:35:29 resolver unbound: [1607:0] error: and additionally crypto error:20074002:BIO routines:FILE_CTRL:system lib
Feb 10 21:35:29 resolver unbound: [1607:0] error: and additionally crypto error:140AD002:SSL routines:SSL_CTX_use_certificate_file:system lib
Feb 10 21:35:29 resolver unbound: [1607:0] fatal error: Could not initialize main thread

Paul

I forgot to add the startup logs before it switched to syslog:

Starting unbound: [1234322215] unbound[2721:0] debug: creating udp6 socket :: 53
[1234322215] unbound[2721:0] debug: creating tcp6 socket :: 53
[1234322215] unbound[2721:0] debug: creating udp4 socket 0.0.0.0 53
[1234322216] unbound[2721:0] debug: creating tcp4 socket 0.0.0.0 53
[1234322216] unbound[2721:0] debug: creating tcp6 socket ::1 953
[1234322216] unbound[2721:0] debug: creating tcp4 socket 127.0.0.1 953
[1234322216] unbound[2721:0] debug: switching log to syslog

Paul

Hi Paul,

This is 1.2.0, I think? There are fixes for it in 1.2.1.

In 1.2.1 I added initgroups(3), so that it drops secondary group
permissions. These might have stuck around from earlier (root or user)
permissions. These secondary group permissions may be the difference
between init and login for you.

Also, 1.2.1 prints the filename that is the problem.

I suggest you chown/chmod the files for remote-control (cert and key
files). One of these files gets 'Permission denied'. Make them owned
by the unbound user and readable by that user.
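To check a file against that advice, something like the following sketch could be used. It is an illustration only: the function name owner_can_read is hypothetical, and it looks at the file itself, not at the search permissions of the directories leading to it.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: report whether `path` is owned by `uid` and has
 * the owner-read bit set.
 * Returns 1 if both hold, 0 if not, and -1 if the file cannot be
 * examined (the path is included in the error message).
 */
int owner_can_read(const char *path, uid_t uid)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        fprintf(stderr, "%s: %s\n", path, strerror(errno));
        return -1;
    }
    return st.st_uid == uid && (st.st_mode & S_IRUSR) != 0;
}
```

The uid to pass would be that of the user unbound runs as, for example the pw_uid field returned by getpwnam(3).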

Best regards,
   Wouter

Paul Wouters wrote:

Thanks for that!

It might be even better to use setusercontext(3) instead on those platforms which have such a function (all the BSDs at least).

That way login.conf(5) could be used properly to set things like user resource limits for the process.
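A rough sketch of that idea, under several assumptions: the function name become_user and the HAVE_SETUSERCONTEXT macro are made up here, the platform test is a guess at where <login_cap.h> is available, and the fallback path is simply initgroups(3) plus setgid(2)/setuid(2). On FreeBSD and NetBSD, setusercontext(3) lives in libutil, so linking needs -lutil. This is not the code that went into unbound.

```c
#define _DEFAULT_SOURCE   /* exposes initgroups(3) on glibc */
#include <sys/types.h>
#include <grp.h>
#include <pwd.h>
#include <unistd.h>

/* Assumption: these BSDs provide setusercontext(3) via <login_cap.h>. */
#if defined(__FreeBSD__) || defined(__NetBSD__) || \
    defined(__OpenBSD__) || defined(__DragonFly__)
#include <login_cap.h>
#define HAVE_SETUSERCONTEXT 1   /* hypothetical feature macro */
#endif

/*
 * Hypothetical helper: switch to `username`, applying login.conf(5)
 * settings where the platform supports it.
 * Returns 0 on success, -1 on failure.
 */
int become_user(const char *username)
{
    struct passwd *pw = getpwnam(username);
    if (pw == NULL)
        return -1;

#ifdef HAVE_SETUSERCONTEXT
    /* Sets groups (including initgroups), resource limits from the
     * user's login class, and finally the user ID. */
    if (setusercontext(NULL, pw, pw->pw_uid,
                       LOGIN_SETGROUP | LOGIN_SETRESOURCES |
                       LOGIN_SETUSER) != 0)
        return -1;
#else
    /* Fallback: drop supplementary groups, then GID, then UID. */
    if (initgroups(pw->pw_name, pw->pw_gid) != 0)
        return -1;
    if (setgid(pw->pw_gid) != 0)
        return -1;
    if (setuid(pw->pw_uid) != 0)
        return -1;
#endif
    return 0;
}
```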

Hi Greg,

Implemented in the subversion trunk.
Thanks for the help with the user credential system calls.

Best regards,
   Wouter

Greg A. Woods wrote: