Unbound 1.13.0rc1 pre-release

Hi,

Unbound 1.13.0rc1 pre-release is available:
https://nlnetlabs.nl/downloads/unbound/unbound-1.13.0rc1.tar.gz
sha256 a55e8b5dfc290867017e7fbb75f1023ee2f6234943f870a5c24694b0908d7c17
pgp https://nlnetlabs.nl/downloads/unbound/unbound-1.13.0rc1.tar.gz.asc
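
A download can be checked against the sha256 value above before it is unpacked. The following is a minimal Python sketch, assuming the tarball has been saved locally; the helper names sha256_of_file and matches_published are hypothetical and not part of any Unbound tooling.

```python
import hashlib

# Published value, copied from the announcement above.
EXPECTED = "a55e8b5dfc290867017e7fbb75f1023ee2f6234943f870a5c24694b0908d7c17"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected=EXPECTED):
    """True if the file's digest equals the published digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_published("unbound-1.13.0rc1.tar.gz") should return True for an intact download. The pgp signature listed above is a separate check and is not covered by this sketch.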

This version changes UDP sockets to use connect(), which slows down
potential ICMP side channel leakage. The fix is controlled by the
option udp-connect: yes, and it is enabled by default.
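
To illustrate what connect() on a UDP socket means at the socket API level, here is a small Python sketch. It is not Unbound's code (Unbound is written in C), and the function name open_connected_udp is hypothetical. A connected UDP socket has one fixed peer, and the kernel only delivers datagrams from that peer to it.

```python
import socket


def open_connected_udp(dest_ip, dest_port):
    """Create a UDP socket and connect() it to a single destination.

    No packet is sent by connect() on a datagram socket; it only
    records the peer address in the kernel.
    """
    family = socket.AF_INET6 if ":" in dest_ip else socket.AF_INET
    sock = socket.socket(family, socket.SOCK_DGRAM)
    try:
        sock.connect((dest_ip, dest_port))
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    s = open_connected_udp("127.0.0.1", 53)
    # After connect(), send() can be used instead of sendto().
    print(s.getpeername())
    s.close()
```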

Additionally, CVE-2020-28935 is fixed. This solves a problem where the
pidfile is altered by a symlink; with the fix, it fails if a symlink is
encountered.
See https://nlnetlabs.nl/downloads/unbound/CVE-2020-28935.txt for more
information.
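
One common way to make a file write fail when the path is a symlink is the O_NOFOLLOW open flag. The sketch below shows that general technique in Python; it is an assumption for illustration only, not a description of the actual Unbound patch, and write_pidfile is a hypothetical helper.

```python
import os


def write_pidfile(path, pid):
    """Write a pid to path, refusing to follow a symlink at path.

    os.O_NOFOLLOW is available on POSIX systems (not on Windows).
    If the final path component is a symlink, os.open raises OSError
    and the link target is left untouched.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_NOFOLLOW
    fd = os.open(path, flags, 0o644)
    with os.fdopen(fd, "w") as handle:
        handle.write("%d\n" % pid)
```

The caller can treat the OSError as a fatal configuration problem, so a planted symlink results in an error instead of a modified target file.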

New features are upstream TCP and TLS query reuse, where a channel is
reused for several queries, and http-notls-downstream: yesno for
unencrypted DoH, useful for back end support servers. The option
infra-keep-probing can be used to probe hosts that are down more
frequently.
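
Sending several queries over one TCP or TLS channel relies on the standard DNS-over-TCP framing, where every message is preceded by a two-byte length (RFC 1035, section 4.2.2). A minimal sketch of that framing follows; the helper names are hypothetical, the payloads are treated as opaque bytes, and this is not Unbound's implementation.

```python
import struct


def frame(message):
    """Prefix a DNS message with its two-byte big-endian length."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS message too large for TCP framing")
    return struct.pack("!H", len(message)) + message


def split_frames(buffer):
    """Split received bytes into complete messages.

    Returns (messages, remainder), where remainder holds the bytes of
    a message that has not fully arrived yet.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break  # incomplete message, wait for more data
        start = offset + 2
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]
```

Because replies on a shared stream may arrive in a different order than the queries were sent, a real client also has to match each reply to its query, for instance by the DNS message ID.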

The options edns-client-string and edns-client-string-opcode can be used
to add an EDNS option with the specified string in queries towards
servers, with the servers specified by IP address. It replaces the
edns-client-tag option.
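
On the wire, an EDNS option consists of an option code, a length and the option data (RFC 6891). The following sketch shows how a string could be packed into such an option. The code 65001 is only an example value from the local/experimental range, and both function names are hypothetical.

```python
import struct


def encode_edns_option(code, data):
    """Encode one EDNS option: 16-bit code, 16-bit length, then data."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("option code must fit in 16 bits")
    if len(data) > 0xFFFF:
        raise ValueError("option data too long")
    return struct.pack("!HH", code, len(data)) + data


def client_string_option(text, code=65001):
    """Pack a client string as the data of an EDNS option (assumed UTF-8)."""
    return encode_edns_option(code, text.encode("utf-8"))
```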

Features
- Pass the comm_reply information to the inplace_cb_reply* functions
  during the mesh state and update the documentation on that.
- Fix #330: [Feature request] Add unencrypted DNS over HTTPS support.
  This adds the option http-notls-downstream: yesno to change that,
  and the dohclient test code has the -n option.
- Merge PR #228 : infra-keep-probing option to probe hosts that are
  down. Add infra-keep-probing: yes option. Hosts that are down are
  probed more frequently.
  With the option turned on, it probes about every 120 seconds,
  eventually after exponential backoff, and it stays that way as long
  as traffic keeps up for the domain. It probes one at a time, e.g.
  one query is allowed to probe, and other queries within that 120
  second interval are turned away.
- Merge PR #313 from Ralph Dolmans: Replace edns-client-tag with
  edns-client-string option.
- Merge PR #283 : Stream reuse. This implements upstream stream
  reuse for performing several queries over the same TCP or TLS
  channel.
- Fix to connect() to UDP destinations, default turned on,
  this lowers vulnerability to ICMP side channels.
  Option to toggle udp-connect, default is enabled.
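
The infra-keep-probing behaviour described in the list above (one query may probe a down host, other queries in the same 120 second interval are turned away) can be pictured as a simple gate. This is a sketch under assumptions: the class name is hypothetical, the exponential backoff phase is not modelled, and it is not taken from the Unbound sources.

```python
PROBE_INTERVAL = 120.0  # seconds, the approximate value from the notes


class ProbeGate:
    """Allow one probe per interval towards a host that is marked down."""

    def __init__(self, interval=PROBE_INTERVAL):
        self.interval = interval
        self.next_probe_at = 0.0

    def allow(self, now):
        """Return True if a query arriving at time `now` may probe."""
        if now < self.next_probe_at:
            return False  # another query already probed in this interval
        self.next_probe_at = now + self.interval
        return True
```

The time is passed in as an argument, so the gate can be exercised without waiting for real time to pass.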

Bug Fixes
- Fix #319: potential memory leak on config failure, in rpz config.
- Fix dnstap socket and the chroot not applied properly to the dnstap
  socket path.
- Fix warning in libnss compile, nss_buf2dsa is not used without DSA.
- Fix #323: unbound testsuite fails on mock build in systemd-nspawn
  if systemd support is built.
- Fix for python reply callback to see mesh state reply_list member,
  it only removes it briefly for the commpoint call so that it does
  not drop it and attempt to modify the reply list during reply.
- Fix that if there are on reply callbacks, those are called per
  reply and a new message created if that was modified by the call.
- Free up auth zone parse region after use for lookup of host
- Merge PR #326 from netblue30: DoH: implement content-length
  header field.
- DoH content length, simplify code, remove declaration after
  statement and fix cast warning.
- Fix that if there are reply callbacks for the given rcode, those
  are called per reply and a new message created if that was modified
  by the call.
- Fix that the out of order TCP processing does not limit the
  number of outstanding queries over a connection.
- Fix python documentation warning on functions.rst inplace_cb_reply.
- Log ip address when http session recv fails, eg. due to tls fail.
- Fix to set the tcp handler event toggle flag back to default when
  the handler structure is reused.
- Clean the fix for out of order TCP processing limits on number
  of queries. It was tested to work.
- Fix that http settings have colon in set_option, for
  http-endpoint, http-max-streams, http-query-buffer-size,
  http-response-buffer-size, and http-nodelay.
- Fix memory leak of https port string when reading config.
- local-zone regional allocations outside of chunk
- Merge PR #324 from James Renken: Add modern X.509v3 extensions to
  unbound-control TLS certificates.
- Fix for PR #324 to attach the x509v3 extensions to the client
  certificate.
- Fix #327: net/if.h check fails on some darwin versions; contribution by
  Joshua Root.
- Fix #320: potential memory corruption due to size miscomputation upon
  custom region alloc init.
- Fix #333: Unbound Segmentation Fault w/ log_info Functions From
  Python Mod.
- Fix that minimal-responses does not remove addresses from a priming
  query response.
- In man page note that tls-cert-bundle is read before permission
  drop and chroot.
- Fix #341: fixing a possible memory leak.
- Fix memory leak after fix for possible memory leak failure.
- Fix #343: Fail to build --with-libnghttp2 with error: 'SSIZE_MAX'
  undeclared.
- Fix for #303 CVE-2020-28935 : Fix that symlink does not interfere
  with chown of pidfile.
- Fix #347: IP_DONTFRAG broken on Apple xcode 12.2.
- Fix #350: with the AF_NETLINK permission, to fix 1.12.0 error:
  failed to list interfaces: getifaddrs: Address family not
  supported by protocol.
- Merge #351 from dvzrv: Add AF_NETLINK to set of allowed socket
  address families.
- iana portlist updated.

Best regards, Wouter

Hmmmm.

Built successfully, but it does not work.

dig ya.ru

; <<>> DiG 9.11.13 <<>> ya.ru
;; global options: +cmd
;; connection timed out; no servers could be reached

This is with my previous configuration.

On 24.11.2020 20:28, Wouter Wijngaards via Unbound-users wrote:

Hi Yuri,

Can you tell me what the logs look like with verbosity 5 or so?

I do not recall your previous configuration, was there anything
particular about it?

Best regards, Wouter

Hi Wouter,

Sorry this update just crashes...

This is the tail of the log file - I have a complete log if required.

BTW does this new release include a fix for my RPZ issue that George was looking at?

24/11/2020 15:41:08 unbound.exe[13260:2] info: sending query: www.tm.lg.prod.aadmsa.akadns.net. A IN
24/11/2020 15:41:08 unbound.exe[13260:2] debug: sending to target: <.> 8.8.8.8#853
24/11/2020 15:41:08 unbound.exe[13260:2] debug: dnssec status: not expected
24/11/2020 15:41:08 unbound.exe[13260:2] debug: pending_tcp_query
24/11/2020 15:41:08 unbound.exe[13260:2] debug: reuse_tcp_find
24/11/2020 15:41:08 unbound.exe[13260:2] debug: reuse_tcp_find: num reuse streams 45
24/11/2020 15:41:08 unbound.exe[13260:2] debug: reuse_tcp_find check inexact match
24/11/2020 15:41:08 unbound.exe[13260:2] debug: reuse_tcp_close_oldest
24/11/2020 15:41:08 unbound.exe[13260:2] debug: decommission_pending_tcp
24/11/2020 15:41:08 unbound.exe[13260:2] debug: bio_cb 3, before read
24/11/2020 15:41:08 unbound.exe[13260:2] debug: bio_cb 131, return read
24/11/2020 15:41:08 unbound.exe[13260:2] debug: bio_cb 6, before read
24/11/2020 15:41:08 unbound.exe[13260:2] debug: bio_cb 134, return read
24/11/2020 15:41:08 unbound.exe[13260:2] debug: bio_cb 1, before write
24/11/2020 15:41:08 unbound.exe[13260:2] debug: comm_point_close of 808: event_del
24/11/2020 15:41:08 unbound.exe[13260:2] debug: event_del 00000000075033A0 added=1 fd=808 tv=1606232526623 EV_READ EV_TIMEOUT
24/11/2020 15:41:08 unbound.exe[13260:2] debug: winsock: tcp wouldblock EV_READ
24/11/2020 15:41:08 unbound.exe[13260:2] debug: winsock: tcp wouldblock EV_WRITE
24/11/2020 15:41:08 unbound.exe[13260:2] debug: close fd 808
24/11/2020 15:41:08 unbound.exe[13260:2] debug: reuse_tcp_remove_tree_list
24/11/2020 15:41:08 unbound.exe[13260:2] fatal error: util/rbtree.c:324: change_child_ptr: assertion child->parent == old || child->parent == new failed

Confirmed. It crashes repeatedly with this stack trace:

(dbx) where
current thread: t@2
=>[1] reuse_cmp(0xfffffd7ffc5fed08, 0x0, 0x28, 0x0, 0xfffffd7ffe6f5000, 0x379616265040000), at 0x4eb85e
[2] rbtree_find_less_equal(), at 0x4dc2e9
[3] 0x4eba25(), at 0x4eba25
[4] pending_tcp_query(), at 0x50b69f
[5] outnet_serviced_query(), at 0x50d2f3
[6] worker_send_query(), at 0x5339cd
[7] 0x525fbd(), at 0x525fbd
[8] 0x527672(), at 0x527672
[9] iter_operate(), at 0x545b0f
[10] mesh_run(), at 0x4b702e
[11] mesh_new_client(), at 0x4c7099
[12] worker_handle_request(), at 0x4e5e3e
[13] comm_point_udp_callback(), at 0x50553b
[14] 0x59f010(), at 0x59f010
[15] 0x59f4b7(), at 0x59f4b7
[16] 0x5a37d8(), at 0x5a37d8
[17] comm_base_dispatch(), at 0x505744
[18] 0x52aa5a(), at 0x52aa5a
[19] _thr_setup(), at 0xfffffd7ffef5dbab
[20] _lwp_start(), at 0xfffffd7ffef5dde0

On 24.11.2020 21:45, RayG via Unbound-users wrote:

./unittest fails on i686 on all Linux variants I build for.

assertion failure testcode/unitregional.c:68

Hi Wouter,

This is the entry in the event log on Windows 20H2 (19042.630).

I have 3 of them and they are all the same.

Faulting application name: unbound.exe, version: 1.13.0.1, time stamp: 0x5fbd149c
Faulting module name: unbound.exe, version: 1.13.0.1, time stamp: 0x5fbd149c
Exception code: 0xc0000005
Fault offset: 0x00000000000a2326
Faulting process ID: 0x412c
Faulting application start time: 0x01d6c277643f681a
Faulting application path: C:\Program Files\Unbound\unbound.exe
Faulting module path: C:\Program Files\Unbound\unbound.exe
Report ID: a79bec8f-f4a2-49fa-ad58-6de5bbaefe3c
Faulting package full name:
Faulting package-relative application ID:

A debug build produced the following stack trace:

t@3 (l@3) terminated by signal SEGV (no mapping at the fault address)
Current function is reuse_cmp_addrportssl
144 r = sockaddr_cmp(&r1->addr, r1->addrlen, &r2->addr, r2->addrlen);
(dbx) where
current thread: t@3
=>[1] reuse_cmp_addrportssl(key1 = 0xfffffd7ffc1ff268, key2 = (nil)), line 144 in "outside_network.c"
[2] reuse_cmp(key1 = 0xfffffd7ffc1ff268, key2 = (nil)), line 160 in "outside_network.c"
[3] rbtree_find_less_equal(rbtree = 0xfffffd7ffe6e0300, key = 0xfffffd7ffc1ff268, result = 0xfffffd7ffc1fefe8), line 527 in "rbtree.c"
[4] reuse_tcp_find(outnet = 0xfffffd7ffe6e0200, addr = 0xfffffd7ffe6f6258, addrlen = 16U, use_ssl = 1), line 480 in "outside_network.c"
[5] pending_tcp_query(sq = 0xfffffd7ffe6f6200, packet = 0xfffffd7fc12180c0, timeout = 3000, callback = 0x57adec = &serviced_tcp_callback(struct comm_point *c, void *arg, int error, struct comm_reply *rep), callback_arg = 0xfffffd7ffe6f6200), line 2056 in "outside_network.c"
[6] serviced_tcp_send(sq = 0xfffffd7ffe6f6200, buff = 0xfffffd7fc12180c0), line 2767 in "outside_network.c"
[7] outnet_serviced_query(outnet = 0xfffffd7ffe6e0200, qinfo = 0xfffffd7fa1294580, flags = 256U, dnssec = 32768, want_dnssec = 0, nocaps = 0, tcp_upstream = 0, ssl_upstream = 1, tls_auth_name = 0xfffffd7fa1294958 "cloudflare-dns.com", addr = 0xfffffd7fa1294840, addrlen = 16U, zone = 0xfffffd7fa1294820 "", zonelen = 1U, qstate = 0xfffffd7fa1294088, callback = 0x4c1e16 = &worker_handle_service_reply(), callback_arg = 0xfffffd7fa1295b40, buff = 0xfffffd7fc12180c0, env = 0xfffffd7ffd815540), line 2998 in "outside_network.c"
[8] worker_send_query(qinfo = 0xfffffd7fa1294580, flags = 256U, dnssec = 32768, want_dnssec = 0, nocaps = 0, addr = 0xfffffd7fa1294840, addrlen = 16U, zone =0xfffffd7fa1294820 "", zonelen = 1U, ssl_upstream = 1, tls_auth_name = 0xfffffd7fa1294958 "cloudflare-dns.com", q = 0xfffffd7fa1294088), line 2001 in "worker.c"
[9] processQueryTargets(qstate = 0xfffffd7fa1294088, iq = 0xfffffd7fa1294480,ie = 0xfffffd7ffe28b100, id = 1), line 2600 in "iterator.c"
[10] iter_handle(qstate = 0xfffffd7fa1294088, iq = 0xfffffd7fa1294480, ie = 0xfffffd7ffe28b100, id = 1), line 3634 in "iterator.c"
[11] process_request(qstate = 0xfffffd7fa1294088, iq = 0xfffffd7fa1294480, ie= 0xfffffd7ffe28b100, id = 1), line 3677 in "iterator.c"
[12] iter_operate(qstate = 0xfffffd7fa1294088, event = module_event_pass, id = 1, outbound = (nil)), line 3889 in "iterator.c"
[13] mesh_run(mesh = 0xfffffd7ffe6f0600, mstate = 0xfffffd7fa1294038, ev = module_event_pass, e = (nil)), line 1699 in "mesh.c"
[14] mesh_new_client(mesh = 0xfffffd7ffe6f0600, qinfo = 0xfffffd7ffc1ffb90, cinfo = (nil), qflags = 288U, edns = 0xfffffd7ffc1ffb70, rep = 0xfffffd7ffc1ffc70, qid = 48945U), line 585 in "mesh.c"
[15] worker_handle_request(c = 0xfffffd7fca6e2800, arg = 0xfffffd7ffd814000, error = 0, repinfo = 0xfffffd7ffc1ffc70), line 1565 in "worker.c"
[16] comm_point_udp_callback(fd = 3, event = 2, arg = 0xfffffd7fca6e2800), line 716 in "netevent.c"
[17] event_process_active_single_queue(), at 0x5d1680
[18] event_process_active(), at 0x5d1b27
[19] 42(), at 0x5d5e48
[20] ub_event_base_dispatch(base = 0xfffffd7fe5a00400), line 280 in "ub_event.c"
[21] comm_base_dispatch(b = 0xfffffd7ffe015d40), line 246 in "netevent.c"
[22] worker_work(worker = 0xfffffd7ffd814000), line 1941 in "worker.c"
[23] thread_start(arg = 0xfffffd7ffd814000), line 540 in "daemon.c"
[24] _thr_setup(), at 0xfffffd7ffef5dbab
[25] _lwp_start(), at 0xfffffd7ffef5dde0

On 24.11.2020 20:28, Wouter Wijngaards via Unbound-users wrote:

Hi RayG,

Thanks for the detail! It looks like the sort order at item removal is
not the same as when the item was inserted. This is caused by the
removal code running in the wrong order, which happens with this order
of closing.

There is a commit that fixes it, I think (I have not reproduced the
case, but it looks like the code should exactly fix it), in
https://github.com/NLnetLabs/unbound/commit/978d3840dc6a28d634b1a184a62645663b679175

Best regards, Wouter
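
The kind of problem described here, where an item's sort position at removal differs from the one it had at insertion, can be shown with a small sorted index. This is only an analogy in Python with hypothetical names; Unbound uses an rbtree in C, and the sketch does not reproduce the actual bug or its fix.

```python
import bisect


class Stream:
    """Stand-in for a reusable upstream connection."""

    def __init__(self, addr, port):
        self.addr = addr
        self.port = port

    def key(self):
        return (self.addr, self.port)


class SortedIndex:
    """Keeps streams ordered by the key they had when inserted."""

    def __init__(self):
        self._keys = []
        self._items = []

    def __len__(self):
        return len(self._items)

    def insert(self, stream):
        key = stream.key()
        pos = bisect.bisect_left(self._keys, key)
        self._keys.insert(pos, key)
        self._items.insert(pos, stream)

    def remove(self, stream):
        """Look the stream up by its current key and delete it."""
        key = stream.key()
        pos = bisect.bisect_left(self._keys, key)
        while pos < len(self._keys) and self._keys[pos] == key:
            if self._items[pos] is stream:
                del self._keys[pos]
                del self._items[pos]
                return True
            pos += 1
        return False  # key changed since insert, the entry is not found
```

Removing a stream first and resetting its fields afterwards works. Resetting the fields first makes the lookup search in the wrong place, and the stale entry stays in the index.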

Hi Tuomo,

This should be fixed in commit
https://github.com/NLnetLabs/unbound/commit/4e8a1ede3b41e9b4a93cb664fbc3f48b1521f62b

Best regards, Wouter

No. Still crashes.

On 24.11.2020 22:07, Wouter Wijngaards via Unbound-users wrote:

Different stacktrace:

(dbx) where
current thread: t@7
=>[1] evmap_io_del_(0xfffffd7ff9e00400, 0x3c, 0xfffffd7f9dd81800, 0xfffffd7f9ed3a680, 0x9, 0xfffffd7ffe2dfe80), at 0x5a768f
[2] event_del_nolock_(), at 0x59f854
[3] event_del(), at 0x5a0af4
[4] comm_point_listen_for_rw(), at 0x5503fb
[5] 0x5023fe(), at 0x5023fe
[6] comm_point_tcp_handle_callback(), at 0x5037ca
[7] 0x59f0d0(), at 0x59f0d0
[8] 0x59f577(), at 0x59f577
[9] 0x5a3898(), at 0x5a3898
[10] comm_base_dispatch(), at 0x5056e4
[11] 0x52ab1a(), at 0x52ab1a
[12] _thr_setup(), at 0xfffffd7ffef5dbab
[13] _lwp_start(), at 0xfffffd7ffef5dde0
(dbx)

On 24.11.2020 22:07, Wouter Wijngaards via Unbound-users wrote:

Hi Yuri,

This looks like it is in the TLS handshake routine, and I do not really
expect any issues here, so I wonder if it is because on Solaris there is
an older version of libevent (if that is what is happening for you).

You could also use --without-libevent or --with-libevent=no to use an
internal select() based event mechanism. Perhaps that is useful.

Otherwise I would like more details in the stack trace, e.g. line
numbers and debug symbols, or perhaps a verbosity trace, or some way
to reproduce it.

Best regards, Wouter

Wwwwwwwww. libevent is as fresh as possible, built from source: v2.1.12.

On 24.11.2020 22:35, Wouter Wijngaards via Unbound-users wrote:

Indeed. I added two post 1.13.0rc1 changes and now it works.

It was too early to say that - now it segfaulted.

The log gives this hint:

kernel: traps: unbound[18976] general protection ip:7f5fde56a471
sp:7fff060abb80 error:0 in libssl.so.1.0.2k[7f5fde525000+67000]

./unittest fails on i686 on all Linux variants I build for.

assertion failure testcode/unitregional.c:68

This should be fixed in commit
https://github.com/NLnetLabs/unbound/commit/4e8a1ede3b41e9b4a93cb664fbc3f48b1521f62b

I can confirm that this also fixes it on NetBSD/i386 8.0, the
selftest and all the unit tests now all succeed (I also added
that TLS-related patch from earlier).

Regards,

- Håvard

Hi Yuri,

If this is similar to an older issue
(https://web.archive.org/web/20200919090000/https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=4227)
then the fix in
https://github.com/NLnetLabs/unbound/commit/d05c259458364ac7705d030d1106c4041956d908
may do it.

Best regards, Wouter