frequent crashes / restarts with unbound 1.9.0

Hello unbound users,

I am new to unbound and am running an open resolver (a project of
digitalcourage.de against censorship attempts in Germany[1]; we have
taken measures outside of unbound against DNS amplification).

It seems that unbound is frequently crashing / restarting. Once the
service is started, nothing indicates a crash, but the average time.up
(from stats) never even reaches 120s (our monitoring interval). RAM usage
stays very low and the caches are not populated.
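
For anyone who wants to reproduce the check: a minimal sketch of how
time.up can be compared against the monitoring interval. It assumes the
key=value output format of "unbound-control stats_noreset"; the function
names and the sample values are hypothetical and only for illustration,
this is not our actual monitoring script.

```python
import subprocess


def parse_stats(text):
    """Parse key=value lines as printed by unbound-control stats_noreset."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip anything that is not a key=value line
        try:
            stats[key.strip()] = float(value)
        except ValueError:
            pass  # ignore non-numeric values
    return stats


def restarted_within(stats, interval_s):
    """True if the server came up less than interval_s seconds ago."""
    return stats["time.up"] < interval_s


def read_stats():
    """Query the running server (assumes unbound-control is configured)."""
    out = subprocess.run(
        ["unbound-control", "stats_noreset"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_stats(out)


# Illustrative sample, not real output from our server.
sample = (
    "total.num.queries=180000\n"
    "time.now=1598473910.123456\n"
    "time.up=95.500000\n"
    "time.elapsed=95.500000\n"
)
print(restarted_within(parse_stats(sample), 120))
```

If restarted_within() is true on every poll, the server never survived a
full monitoring interval, which is what we were seeing.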

After removing all "experimental" features from our config (see below)
the situation got better. Now sometimes time.up reaches 1000s. At night
time.up goes up to higher values, so I think server load plays some role
in this strange behavior. Any ideas?

Platform details:
* Debian 10 (stable), Kernel 4.19.0-10-amd64
* unbound 1.9.0-2+deb10u2
* iptables, limiting UDP output (on public NS-interface)

Hardware & stats:
* VM with 4 VCPUs, 6 GB RAM
* num.query.tls: 60/sec
* total.num.queries: 1500/sec
* load average: 2.67, 3.27, 3.33

-- config --------------------------------------------------------------

server:
  interface: 46.182.19.48@53
  interface: 2a02:2970:1002::18@53

  interface: 46.182.19.48@853
  interface: 2a02:2970:1002::18@853
  tls-additional-port: 853
  tls-service-key: /path/dns2.digitalcourage.de/privkey.pem
  tls-service-pem: /path/dns2.digitalcourage.de/fullchain.pem

  outgoing-interface: [some IPv4]

  hide-identity: yes
  hide-version: yes

  prefetch: no

  # public access
  access-control: 0.0.0.0/0 allow
  access-control: ::/0 allow

  verbosity: 1
  logfile: /var/log/unbound.log
  
  # platform / scaling / performance
  num-threads: 4
  msg-cache-slabs: 4
  rrset-cache-slabs: 4
  infra-cache-slabs: 4
  key-cache-slabs: 4
  msg-cache-size: 1024m
  rrset-cache-size: 2048m
  # with libevent
  outgoing-range: 8192
  num-queries-per-thread: 4096
  incoming-num-tcp: 1000
  outgoing-num-tcp: 1000
  
  root-hints: root.hints
  
  deny-any: yes
  minimal-responses: yes

  statistics-interval: 0
  extended-statistics: yes
  
remote-control:
  control-enable: yes
  control-use-cert: no

Some additional information:

/var/log/unbound.log showing a crash / restart ----------------------
...
[1598473907] unbound[32484:3] notice: ssl handshake failed IP port 35524
[1598473908] unbound[32484:3] error: ssl handshake failed crypto
error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
[1598473908] unbound[32484:3] notice: ssl handshake failed IP port 35536
[1598473908] unbound[32484:3] error: ssl handshake failed crypto
error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
[1598473908] unbound[32484:3] notice: ssl handshake failed IP port 35546
[1598473909] unbound[4285:0] notice: init module 0: subnet
[1598473909] unbound[4285:0] notice: init module 1: validator
[1598473909] unbound[4285:0] notice: init module 2: iterator
[1598473909] unbound[4285:0] info: start of service (unbound 1.9.0).
[1598473910] unbound[4285:0] info: generate keytag query _ta-4f66. NULL IN
[1598473910] unbound[4285:3] info: generate keytag query _ta-4f66. NULL IN
[1598473910] unbound[4285:2] info: generate keytag query _ta-4f66. NULL IN
[1598473910] unbound[4285:1] info: generate keytag query _ta-4f66. NULL IN
[1598473910] unbound[4285:3] error: ssl handshake failed crypto
error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
[1598473910] unbound[4285:3] notice: ssl handshake failed IP port 35566
[1598473916] unbound[4285:2] error: ssl handshake failed crypto
error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
[1598473916] unbound[4285:2] notice: ssl handshake failed IP port 35670
...
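
The restart in the excerpt above is visible as a change of the PID inside
"unbound[PID:thread]". A minimal sketch for finding such points in a
larger log, assuming the line format shown above; the function name is
hypothetical and the sample lines are copied from the excerpt.

```python
import re

# "[timestamp] unbound[pid:thread] level: message"
LOG_LINE = re.compile(r"^\[(\d+)\] unbound\[(\d+):(\d+)\] (\w+): (.*)$")


def find_pid_changes(lines):
    """Return (timestamp, old_pid, new_pid) wherever the logging PID changes."""
    changes = []
    last_pid = None
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # wrapped continuation lines and "..." do not match
        ts, pid = int(m.group(1)), int(m.group(2))
        if last_pid is not None and pid != last_pid:
            changes.append((ts, last_pid, pid))
        last_pid = pid
    return changes


sample_log = """\
[1598473908] unbound[32484:3] error: ssl handshake failed crypto
error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
[1598473908] unbound[32484:3] notice: ssl handshake failed IP port 35546
[1598473909] unbound[4285:0] notice: init module 0: subnet
[1598473909] unbound[4285:0] info: start of service (unbound 1.9.0).
[1598473910] unbound[4285:3] notice: ssl handshake failed IP port 35566
""".splitlines()

for ts, old_pid, new_pid in find_pid_changes(sample_log):
    print(f"{ts}: pid {old_pid} -> {new_pid}")
```

The differences between consecutive timestamps returned by this function
give a rough uptime per process, which can be compared with time.up.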

Try unbound 1.11.0-1~bpo10+1 from backports.

Hello,

thank you for your help (via private messages). We are now running
unbound 1.11 from backports. It has been running for nearly a day
without a crash.

Regards,
Georg