• intermittent failures and queries sent over TCP

    From David Newman@dnewman@networktest.com to bind-users on Tue Aug 18 17:34:41 2020
    From Newsgroup: comp.protocols.dns.bind

    bind 9.11.5.P4 on Debian 10

    Greetings. I recently had to migrate a nameserver from FreeBSD to
    Debian. It works fine most of the time but I've noticed a few
    intermittent resolution failures.

    After "gmail.com" failed to resolve I took a packet capture using
    tcpdump to listen to the result of the command "dig -t mx gmail.com" and
    here's what I found:

    1. The query itself went out over UDP, and the UDP responses pointed to
    Google's nameservers.

    2. Nearly 200 attempts to reach root servers over TCP, followed
    immediately by RST messages from the root servers.

    Some time later, gmail.com started resolving successfully again, clearing
    up the issue for now.

    AFAIK there's nothing in the BIND configs that would force the use of
    TCP queries. I checked the docs for various TCP options and didn't see
    any applied here. I don't know if the TCP queries are related to the
    gmail.com resolution failure but I suspect they are (and in any event
    inability to reach root servers is a problem).

    This server is authoritative for several domains. It gets its zones from
    a hidden primary. The system's firewall permits inbound TCP and UDP
    traffic on port 53 and AFAIK does not block outbound UDP (the firewall
    is nftables, which is new to me, but since I see UDP queries in the
    packet capture I think it works).

    What would cause the server to send queries over TCP?
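For context on the question above: one well-known reason a resolver switches to TCP is that a UDP response arrives with the truncation (TC) bit set in the DNS header, which tells the client to retry over TCP (RFC 1035, RFC 7766). Nothing in this thread confirms that this is what triggered the TCP queries here, so the Python below is only a minimal sketch of the mechanism. The helper names `is_truncated` and `frame_for_tcp` are made up for illustration and are not part of BIND.

```python
import struct

# TC (truncation) bit within the 16-bit flags field of the DNS header (RFC 1035).
TC_FLAG = 0x0200


def is_truncated(message: bytes) -> bool:
    """Return True if the TC bit is set in a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes; message is too short")
    # The first two 16-bit fields of the header are the message ID and the flags.
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_FLAG)


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte length, as DNS over TCP requires."""
    return struct.pack("!H", len(message)) + message


if __name__ == "__main__":
    # Hand-built 12-byte header: ID 0x1234, flags 0x8200 (QR and TC set), one question.
    truncated_header = struct.pack("!6H", 0x1234, 0x8200, 1, 0, 0, 0)
    print(is_truncated(truncated_header))
```

A resolver that sees the TC bit on a UDP answer would normally resend the same question over TCP, using the length-prefixed framing shown in `frame_for_tcp`.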

    Thanks in advance for troubleshooting clues.


    dn



    CONFIG FILES

    (named.conf is just pointers to .local and .options and .default-zones)

    // named.conf.local

    acl "xfer" {
        // redacted -- a list of IPv4 and IPv6 addresses I trust
    };

    controls {
        inet 127.0.0.1 port 953 allow { 127.0.0.1; };
    };

    logging {
        channel simple_log {
            file "/var/log/named/named.log" versions 30 size 1m;
            severity info;
            print-time yes;
            print-severity yes;
            print-category yes;
        };
        category default { simple_log; };
        category update { simple_log; };
        category update-security { simple_log; };
        category security { simple_log; };
        category queries { simple_log; };
        category lame-servers { null; };
    };

    zone "example1.org" in {
        type slave;
        file "example1.org.bak";
        masters { 198.18.0.53; }; // not the real address
        allow-query { any; };
        allow-transfer { xfer; };
    };

    zone "example2.org" in {
        type slave;
        file "example2.org.bak";
        masters { 198.18.0.53; }; // not the real address
        allow-query { any; };
        allow-transfer { xfer; };
    };

    // etc.


    // named.conf.options

    acl "trusted" {
        // redacted -- a list of IPv4 and IPv6 addresses I trust
    };

    options {
        directory "/var/cache/bind";
        pid-file "/var/run/named/named.pid";
        statistics-file "/var/run/named/named.stats";
        transfer-format many-answers;
        masterfile-format text;
        max-transfer-time-in 60;
        allow-query { any; };
        allow-recursion { trusted; };
        allow-query-cache { trusted; };
        allow-transfer { xfer; };
        version none;

        disable-empty-zone "255.255.255.255.IN-ADDR.ARPA";
        disable-empty-zone "0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.IP6.ARPA";
        disable-empty-zone "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.IP6.ARPA";

        querylog yes;
    };
    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From David Newman@dnewman@networktest.com to Mark Andrews on Tue Aug 18 18:12:45 2020
    From Newsgroup: comp.protocols.dns.bind

    On 8/18/20 5:55 PM, Mark Andrews wrote:

    > If you are getting RST responses check your firewall settings. RST is often forged
    > when TCP is blocked. The root servers normally accept TCP connections.
    >
    > % dig +tcp gmail.com @a.root-servers.net +dnssec

    Bingo. This query failed before I added a rule to the upstream firewall
    to allow outbound TCP queries, and it works now.

    Thanks!

    dn
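The check that settled this thread, whether an outbound TCP connection to port 53 succeeds or is reset, can also be scripted. The Python below is a rough sketch rather than a replacement for `dig +tcp`: it only tests the TCP handshake and sends no DNS query. The function name `probe_tcp` and the 3-second default timeout are assumptions chosen for illustration.

```python
import socket


def probe_tcp(host: str, port: int = 53, timeout: float = 3.0) -> str:
    """Try a TCP handshake and report 'open', 'refused', 'timeout', or an error."""
    try:
        # create_connection resolves the name and completes the three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The SYN was answered with an RST, either by the target or by a
        # firewall forging one on its behalf.
        return "refused"
    except socket.timeout:
        # No answer at all, which usually means packets are silently dropped.
        return "timeout"
    except OSError as exc:
        # Anything else, such as a name resolution failure or no route to host.
        return f"error: {exc}"


if __name__ == "__main__":
    # a.root-servers.net is the server used in the dig command quoted above.
    print(probe_tcp("a.root-servers.net"))
```

A "refused" result from a server that normally accepts TCP, as the root servers do, points at something on the path resetting the connection, which matches what the packet capture showed before the upstream firewall rule was added.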
