DNS Resolution Path: Stub to Recursive to Authoritative

A DNS query traverses multiple actors before returning an answer: stub resolver, recursive resolver, and a chain of authoritative servers (root, TLD, domain). Each hop introduces latency, caching decisions, and potential failure modes. Understanding this path is essential for diagnosing resolution delays, debugging SERVFAIL responses, and architecting systems that depend on DNS availability.

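[Sequence diagram. Participants: Application, Stub Resolver, Recursive Resolver, Root Server, TLD Server, Authoritative Server]
  1. Application → Stub Resolver: gethostbyname("api.example.com")
  2. Stub Resolver → Recursive Resolver: Query (RD=1)
  3. Recursive Resolver: Cache miss
  4. Recursive Resolver → Root Server: Query for api.example.com (iterative)
  5. Root Server → Recursive Resolver: Referral to .com TLD servers
  6. Recursive Resolver → TLD Server: Query for api.example.com
  7. TLD Server → Recursive Resolver: Referral to example.com NS
  8. Recursive Resolver → Authoritative Server: Query for api.example.com
  9. Authoritative Server → Recursive Resolver: A record (TTL=300)
  10. Recursive Resolver → Stub Resolver: Answer + cache
  11. Stub Resolver → Application: IP address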
End-to-end DNS resolution flow showing iterative queries from recursive resolver to authoritative chain.

DNS resolution is a hierarchical delegation system. The stub resolver on your machine forwards queries to a recursive resolver, which iteratively walks the namespace tree—root servers delegate to TLD servers, which delegate to authoritative servers for the target domain. Each response is cached according to its TTL (Time To Live), and subsequent queries for the same name hit cache until expiry.

Key mental model:

  • Stub: Forwards queries, minimal caching, sets RD (Recursion Desired) flag
  • Recursive: Performs iterative resolution, maintains the cache of answers and referrals, enforces TTLs
  • Authoritative: Holds zone data, returns definitive answers or referrals
  • Caching: Dominates real-world latency; a cached response returns in <1ms, uncached may take 100-400ms
  • Failure modes: NXDOMAIN (domain doesn’t exist), SERVFAIL (upstream failure), REFUSED (policy rejection), timeouts (network/server issues)

The stub resolver is the DNS client library on your machine: gethostbyname(), getaddrinfo(), or whatever resolver the system configures through /etc/resolv.conf. RFC 1034 Section 5.3.1 describes it as a resolver that does not perform full resolution itself and instead relies on a name server that supports recursive queries.

Behavior:

  • Sets the RD (Recursion Desired) flag in outgoing queries
  • Forwards queries to configured recursive resolvers (typically 1-3 servers)
  • Validates that RA (Recursion Available) is set in responses
  • Maintains minimal cache (OS-dependent, typically seconds to minutes)
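To make the first bullet concrete, here is a minimal sketch of the wire-format query a stub might send, using only the Python standard library. The helper names (encode_name, build_query) and the query ID are illustrative assumptions, not part of any resolver library.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234, rd: bool = True) -> bytes:
    """Build a one-question DNS query; qtype 1 is A, QCLASS 1 is IN."""
    flags = 0x0100 if rd else 0x0000  # RD is bit 8 of the 16-bit flags word
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)
    return header + question


packet = build_query("api.example.com")
(flags,) = struct.unpack("!H", packet[2:4])
print(f"RD set: {bool(flags & 0x0100)}, length: {len(packet)} bytes")
```

The sketch only constructs the packet; sending it over UDP to a configured resolver and handling the response are left out.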

Timeout behavior varies by platform:

| Platform | Default Timeout | Retries | Notes |
| --- | --- | --- | --- |
| Linux (glibc) | 5 seconds | 2 | Configurable via options timeout:N in resolv.conf |
| macOS | 5 seconds | 2 | mDNSResponder adds complexity |
| Windows | 1 second | 2 | Per-server, then cycles |

The stub resolver is intentionally simple. It offloads complexity to the recursive resolver, which has the resources to cache, validate DNSSEC, and handle iterative resolution.

The recursive resolver (also called a full-service resolver or caching resolver) performs the actual work of walking the DNS tree. Per RFC 9499: “A server operating in recursive mode receives DNS queries and either responds from local cache or sends queries to other servers to get final answers.”

Key characteristics:

  • Maintains a cache of previously resolved records
  • Performs iterative queries to authoritative servers
  • Enforces TTLs and negative caching
  • May validate DNSSEC signatures
  • Implements rate limiting, prefetching, and serve-stale policies

Design decision—why iterative, not recursive all the way down?

RFC 1034 specifies that recursive resolvers use iterative queries to authoritative servers. The recursive resolver asks “where should I go next?” and follows referrals itself, rather than asking each server to do the full resolution. This design:

  • Optimizes caching: The recursive resolver sees all intermediate referrals and caches them
  • Limits trust: Authoritative servers only answer for their zones, not arbitrary queries
  • Reduces load on authoritative servers: They don’t need to implement recursive logic

Popular recursive resolvers include BIND, Unbound, PowerDNS Recursor, and public services like Google Public DNS (8.8.8.8), Cloudflare (1.1.1.1), and Quad9 (9.9.9.9).

Authoritative servers hold the definitive DNS records for zones they serve. They respond to queries with either:

  1. Authoritative answer: AA (Authoritative Answer) flag set, contains the requested record
  2. Referral: NS records pointing to child zone nameservers (for delegated subdomains)
  3. Negative response: NXDOMAIN or NODATA, with SOA record in Authority section for negative caching

The authoritative hierarchy:

. (root)
├── com. (TLD)
│ └── example.com. (domain)
│ └── api.example.com. (subdomain)
└── org. (TLD)
└── example.org.

Each level delegates to the next. Root servers know TLD servers; TLD servers know domain nameservers; domain nameservers know their subdomains.
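The delegation path for a name can be derived mechanically from its labels. The following sketch (the function name delegation_chain is an assumption made for illustration) lists the zones a resolver may pass through, from the root down. Real zones do not always have a cut at every label, so this is an upper bound on the delegation points rather than a guarantee.

```python
def delegation_chain(name: str) -> list[str]:
    """Return candidate zones from the root down to the full name."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]
    # Walk from the rightmost label (the TLD) toward the leftmost.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(delegation_chain("api.example.com"))
# ['.', 'com.', 'example.com.', 'api.example.com.']
```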

Root servers are the entry point when the recursive resolver has no cached data. There are 13 logical root servers (A through M), operated by 12 independent organizations. As of 2024, over 1,700 physical instances exist worldwide, all using anycast for geographic distribution.

TLD servers handle generic TLDs (.com, .org, .net) and country-code TLDs (.uk, .de, .jp). They return referrals to the authoritative nameservers for second-level domains.

When a recursive resolver receives a query for api.example.com with an empty cache, it performs iterative resolution:

Before the first query, the resolver loads root hints—a file containing the names and IP addresses of root servers. RFC 9609 specifies the priming process:

  1. Resolver sends: QNAME=".", QTYPE=NS to a randomly selected root server address
  2. Root server responds with authoritative NS records for the root zone
  3. Resolver caches these records, replacing the hints

This priming query ensures the resolver has accurate root server data, not stale hints.

Query: api.example.com. IN A
From: Recursive resolver
To: Root server (e.g., 198.41.0.4, A-root)
Response:
Header: RCODE=NOERROR, AA=0 (not authoritative for this name)
Authority section:
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
...
Additional section:
a.gtld-servers.net. 172800 IN A 192.5.6.30
...

The root server returns a referral—NS records for the .com TLD along with glue records (A/AAAA records for the nameservers themselves).

Why glue records? If a.gtld-servers.net is the nameserver for .net, you’d need to resolve .net to find a.gtld-servers.net, creating a circular dependency. Glue records break this cycle by embedding IP addresses directly in the referral.
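A rough way to check whether a delegation needs glue is to test whether the nameserver's name sits at or below the zone being delegated. The sketch below assumes that simple in-zone rule; the function name needs_glue and the hostname ns1.dnsprovider.net are hypothetical.

```python
def needs_glue(zone: str, nameserver: str) -> bool:
    """True when the nameserver name lies inside the delegated zone."""
    zone = zone.rstrip(".").lower()
    ns = nameserver.rstrip(".").lower()
    if zone == "":
        return True  # every name lies under the root zone
    return ns == zone or ns.endswith("." + zone)


print(needs_glue("example.com", "ns1.example.com"))      # True
print(needs_glue("example.com", "ns1.dnsprovider.net"))  # False
```

In the second case the resolver can look up the nameserver's address through an independent delegation, so no circular dependency arises.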

Query: api.example.com. IN A
From: Recursive resolver
To: a.gtld-servers.net (192.5.6.30)
Response:
Header: RCODE=NOERROR, AA=0
Authority section:
example.com. 172800 IN NS ns1.example.com.
example.com. 172800 IN NS ns2.example.com.
Additional section:
ns1.example.com. 172800 IN A 93.184.216.34
...

The TLD server returns another referral, pointing to the authoritative nameservers for example.com.

Query: api.example.com. IN A
From: Recursive resolver
To: ns1.example.com (93.184.216.34)
Response:
Header: RCODE=NOERROR, AA=1 (authoritative)
Answer section:
api.example.com. 300 IN A 93.184.216.50

The authoritative server returns the final answer with the AA flag set. The recursive resolver caches this record for 300 seconds (the TTL) and returns it to the stub resolver.
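The three hops above can be sketched as a loop that follows referrals until a server answers authoritatively. This is a toy model under stated assumptions: the SERVERS table is a hypothetical in-memory stand-in for real servers, there is no network I/O, caching, or retry handling, and a name with no matching referral is simply treated as a name error.

```python
# Hypothetical in-memory stand-ins for real servers; names and layout are illustrative only.
SERVERS = {
    "root": {"referrals": {"com.": "tld-com"}, "answers": {}},
    "tld-com": {"referrals": {"example.com.": "ns1-example"}, "answers": {}},
    "ns1-example": {"referrals": {}, "answers": {"api.example.com.": ("93.184.216.50", 300)}},
}


def query(server: str, qname: str) -> dict:
    """Answer like a non-recursive server: final answer, referral, or name error."""
    data = SERVERS[server]
    if qname in data["answers"]:
        address, ttl = data["answers"][qname]
        return {"aa": True, "rcode": "NOERROR", "answer": address, "ttl": ttl}
    for zone, next_server in data["referrals"].items():
        if qname == zone or qname.endswith("." + zone):
            return {"aa": False, "rcode": "NOERROR", "referral": next_server}
    return {"aa": True, "rcode": "NXDOMAIN", "answer": None, "ttl": 0}


def resolve(qname: str, max_hops: int = 10):
    """Follow referrals from the root until a server answers authoritatively."""
    server, path = "root", []
    for _ in range(max_hops):
        path.append(server)
        response = query(server, qname)
        if response["aa"]:
            return response["rcode"], response["answer"], path
        server = response["referral"]
    raise RuntimeError("referral chain too long")


print(resolve("api.example.com."))
# ('NOERROR', '93.184.216.50', ['root', 'tld-com', 'ns1-example'])
```

The max_hops guard mirrors the loop protection a real resolver needs against referral cycles.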

| Hop | Typical Latency | Notes |
| --- | --- | --- |
| Stub → Recursive | 1-50ms | LAN or ISP network |
| Cache lookup | <1ms | In-memory hash table |
| Recursive → Root | 10-30ms | Anycast, well-distributed |
| Recursive → TLD | 10-50ms | .com has many anycast instances |
| Recursive → Authoritative | 10-200ms | Depends on server location |
| Total (uncached) | 50-400ms | Varies significantly |
| Total (cached) | 1-50ms | Cache hit at recursive |
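The latency figures above combine into an expected lookup time once a cache hit ratio is known. The sketch below is a simple weighted average; the hit ratios and the 10ms/200ms inputs are illustrative assumptions picked from within the ranges listed, not measurements.

```python
def expected_latency_ms(hit_ratio: float, cached_ms: float, uncached_ms: float) -> float:
    """Weighted average of cached and uncached lookup latency."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cached_ms + (1.0 - hit_ratio) * uncached_ms


# Assumed inputs: 10 ms for a cache hit, 200 ms for a full uncached walk.
for ratio in (0.50, 0.90, 0.99):
    print(f"hit ratio {ratio:.2f}: {expected_latency_ms(ratio, 10, 200):.1f} ms")
```

Even a modest improvement in hit ratio moves the average far more than shaving milliseconds off any single hop.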

TTL (Time To Live) is a 32-bit field specifying the maximum duration a record may be cached, in seconds; RFC 2181 restricts valid values to 0 through 2,147,483,647. Per RFC 1035, TTL “specifies the time interval that the resource record may be cached before the source of the information should again be consulted.”

Key behaviors:

  • Zero TTL: Use only for the current transaction; do not cache
  • TTL countdown: Cached records decrement TTL; at zero, the entry expires
  • TTL cap: RFC 8767 recommends capping at 604,800 seconds (7 days) to limit stale data risk
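The three behaviors above fit in a small cache class. This is a minimal sketch, not how any particular resolver stores records: the class name TtlCache is an assumption, and the injectable clock exists only so the countdown can be demonstrated without waiting.

```python
import time

MAX_TTL = 604_800  # 7-day cap, the value quoted above from RFC 8767


class TtlCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, absolute expiry time)

    def put(self, key, value, ttl: int) -> None:
        if ttl <= 0:
            return  # zero TTL: usable for the current transaction only
        self._entries[key] = (value, self._clock() + min(ttl, MAX_TTL))

    def get(self, key):
        """Return (value, remaining_ttl), or None if missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        remaining = expires_at - self._clock()
        if remaining <= 0:
            del self._entries[key]
            return None
        return value, int(remaining)


now = [0.0]  # fake clock so the countdown is visible immediately
cache = TtlCache(clock=lambda: now[0])
cache.put(("api.example.com.", "A"), "93.184.216.50", 300)
now[0] = 100.0
print(cache.get(("api.example.com.", "A")))  # ('93.184.216.50', 200)
```

Storing an absolute expiry and computing the remaining TTL on read avoids having to decrement every entry on a timer.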

TTL at each layer:

| Layer | Behavior | Typical Values |
| --- | --- | --- |
| Authoritative | Sets TTL in zone file; value is constant | 300-86400 seconds |
| Recursive | Caches with countdown; honors TTL from response | Respects authoritative TTL |
| Stub | Minimal caching, OS-dependent | Seconds to minutes |

Negative responses (NXDOMAIN, NODATA) are cached to reduce load on authoritative servers. RFC 2308 specifies that the negative cache TTL is:

Negative TTL = min(SOA.MINIMUM, SOA TTL)

Authoritative servers MUST include the SOA record in the Authority section of negative responses. Without it, the negative response SHOULD NOT be cached.
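As a small worked example of the rule above, the following sketch returns the negative cache TTL, or None when the response carried no SOA and therefore should not be cached. The function name and the sample values (SOA TTL 3600, MINIMUM 900) are illustrative assumptions.

```python
from typing import Optional, Tuple


def negative_cache_ttl(soa: Optional[Tuple[int, int]]) -> Optional[int]:
    """soa is (ttl_of_soa_record, soa_minimum_field), or None if absent."""
    if soa is None:
        return None  # no SOA in the Authority section: do not cache
    soa_ttl, soa_minimum = soa
    return min(soa_ttl, soa_minimum)


print(negative_cache_ttl((3600, 900)))  # 900
print(negative_cache_ttl(None))         # None
```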

Historical note: The SOA MINIMUM field originally specified the minimum TTL for all zone records. RFC 2308 repurposed it specifically for negative caching.

RFC 9520 (December 2023) introduces mandatory caching of resolution failures—situations where no useful response was received (timeouts, SERVFAIL from all servers). This prevents “query storms” where failed authoritative servers receive 10x normal load from retrying resolvers.

Distinction:

  • NXDOMAIN/NODATA: Not failures—the server provided a useful (negative) answer
  • Resolution failure: No useful response received; cache to prevent retry floods

Modern resolvers prefetch records before TTL expiry to eliminate cache-miss latency for popular domains:

| Resolver | Trigger Condition | Eligibility |
| --- | --- | --- |
| BIND | 2 seconds remaining TTL | Original TTL > 9 seconds |
| Unbound | 10% of original TTL remaining | All records |
| PowerDNS | Configurable percentage | Popular domains |
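The trigger conditions in the table can be expressed as simple predicates. These are sketches modeled only on the values listed above; the function names are hypothetical, and real resolvers evaluate prefetch when a client query hits the cache and apply further conditions not shown here.

```python
def bind_style_prefetch(original_ttl: int, remaining_ttl: int,
                        trigger: int = 2, eligibility: int = 9) -> bool:
    """Refresh when few seconds remain, but only for records with a long enough TTL."""
    return original_ttl > eligibility and remaining_ttl <= trigger


def unbound_style_prefetch(original_ttl: int, remaining_ttl: int,
                           percent: int = 10) -> bool:
    """Refresh when the remaining TTL falls to a percentage of the original."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return remaining_ttl * 100 <= original_ttl * percent


print(bind_style_prefetch(300, 2))      # True
print(bind_style_prefetch(5, 1))        # False: original TTL too short to qualify
print(unbound_style_prefetch(300, 30))  # True
```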

When an authoritative server is unreachable, RFC 8767 (serve-stale) allows resolvers to answer with expired cache data rather than returning SERVFAIL:

  • Stale response TTL: 30 seconds (recommended)
  • Maximum stale timer: Configurable upper bound on how long past-TTL data is served
  • Resolver continues refresh attempts in background

This improves resilience during authoritative outages at the cost of potentially serving outdated data.
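The serve-stale decision can be sketched as a function of the entry's expiry time and whether the upstream is reachable. The 30-second stale answer TTL comes from the list above; the one-day maximum stale window is an assumed configuration value, and the function name and return labels are hypothetical.

```python
STALE_ANSWER_TTL = 30  # TTL placed on a stale answer, per the recommendation above


def answer_from_cache(expires_at: float, now: float, upstream_ok: bool,
                      max_stale: float = 86_400):
    """Classify how a cached record should be used at time `now`."""
    if now < expires_at:
        return ("fresh", int(expires_at - now))
    if upstream_ok:
        return ("refresh", None)  # expired, but a normal re-query can proceed
    if now - expires_at <= max_stale:
        return ("stale", STALE_ANSWER_TTL)  # serve expired data with a short TTL
    return ("servfail", None)  # too old to serve, nothing better available


print(answer_from_cache(expires_at=100, now=150, upstream_ok=False))  # ('stale', 30)
```

The short TTL on stale answers keeps downstream caches from holding outdated data long after the authoritative servers recover.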

DNS responses include a 4-bit RCODE (Response Code) field. The primary values from RFC 1035:

| RCODE | Name | Meaning |
| --- | --- | --- |
| 0 | NOERROR | Query succeeded (may or may not have answer data) |
| 1 | FORMERR | Server couldn’t parse the query |
| 2 | SERVFAIL | Server failed to complete the query |
| 3 | NXDOMAIN | Domain does not exist |
| 4 | NOTIMP | Query type not implemented |
| 5 | REFUSED | Server refuses to answer (policy) |
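The RCODE occupies the low 4 bits of the 16-bit flags word in the DNS header (bytes 2 and 3). A minimal sketch of extracting it follows; the enum covers only the six values in the table, so any other code raises ValueError in this simplified version, and the sample header is hand-built rather than captured from a real server.

```python
import struct
from enum import IntEnum


class Rcode(IntEnum):
    NOERROR = 0
    FORMERR = 1
    SERVFAIL = 2
    NXDOMAIN = 3
    NOTIMP = 4
    REFUSED = 5


def rcode_from_header(header: bytes) -> Rcode:
    """Read the RCODE from the first 12 bytes of a DNS message."""
    (flags,) = struct.unpack("!H", header[2:4])
    return Rcode(flags & 0x000F)


# Hand-built header: QR=1, RD=1, RA=1, RCODE=3, one question.
sample = struct.pack("!HHHHHH", 0x1234, 0x8183, 1, 0, 0, 0)
print(rcode_from_header(sample).name)  # NXDOMAIN
```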

NXDOMAIN (Non-Existent Domain)

The queried name does not exist anywhere in DNS. The authoritative server for the parent zone confirms non-existence.

$ dig nonexistent.example.com
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN

RFC 8020 clarifies NXDOMAIN behavior: if a resolver receives NXDOMAIN for foo.example.com, it SHOULD treat every name at or below that node, such as bar.foo.example.com, as nonexistent too (the NXDOMAIN cut), reducing unnecessary queries.
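An NXDOMAIN cut check amounts to asking whether the query name, or any of its ancestors, is already cached as nonexistent. The sketch below assumes a plain set of lowercase names without trailing dots as the negative cache; the function name is hypothetical and TTL expiry is omitted.

```python
def covered_by_nxdomain_cut(qname: str, nxdomain_names: set) -> bool:
    """True if qname or one of its ancestors is cached as NXDOMAIN."""
    labels = qname.rstrip(".").lower().split(".")
    # Check the name itself, then each parent: a.b.c, b.c, c.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in nxdomain_names:
            return True
    return False


negative_cache = {"foo.example.com"}
print(covered_by_nxdomain_cut("bar.foo.example.com", negative_cache))  # True
print(covered_by_nxdomain_cut("other.example.com", negative_cache))    # False
```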

SERVFAIL (Server Failure)

The recursive resolver couldn’t complete resolution. Common causes:

  • All authoritative servers timed out
  • DNSSEC validation failed
  • Lame delegation (NS records point to servers that don’t serve the zone)
  • Upstream server returned malformed response

SERVFAIL is the catch-all for “something went wrong.” Debugging requires checking resolver logs.

REFUSED

The server refuses to answer based on policy:

  • Query from unauthorized IP (ACL restriction)
  • Rate limiting triggered
  • Recursive query to authoritative-only server

Timeout (no response)

No RCODE—the query never received a response. Causes:

  • Network connectivity issues
  • Firewall blocking UDP/53 or TCP/53
  • Server overloaded or crashed
  • Anycast routing issues

RFC 1035 leaves retry logic to implementations. Typical patterns:

  1. Send query to first configured server
  2. Wait timeout period (1-5 seconds)
  3. No response → try next server
  4. Cycle through all servers
  5. End of cycle → double timeout (exponential backoff)
  6. Maximum retries (typically 2-4)

Early termination: Any NXDOMAIN response stops retries—it’s a definitive negative answer.
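The retry pattern above implies a worst-case wait that is easy to underestimate. This sketch computes the attempt schedule for the cycle-then-double pattern as described; actual stub resolvers differ in whether and how they back off, and the server addresses are documentation-range placeholders.

```python
def retry_schedule(servers, initial_timeout: float = 5.0, cycles: int = 2):
    """List (server, timeout) attempts: cycle through servers, doubling per cycle."""
    schedule = []
    timeout = initial_timeout
    for _ in range(cycles):
        for server in servers:
            schedule.append((server, timeout))
        timeout *= 2  # exponential backoff at the end of each full cycle
    return schedule


def worst_case_wait(schedule) -> float:
    """Total time spent if every attempt times out."""
    return sum(timeout for _, timeout in schedule)


plan = retry_schedule(["192.0.2.1", "192.0.2.2"])
print(plan)
print(worst_case_wait(plan))  # 30.0
```

With two servers and a 5-second initial timeout, an application can block for 30 seconds before seeing a failure, which is why unreachable resolvers produce such visible stalls.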

  1. Cache misses: Dominant factor. Cold cache resolution takes 100-400ms; cached response <1ms
  2. Geographic distance: Authoritative servers on another continent add 100-200ms RTT
  3. Packet loss: Triggers retries with exponential backoff; one lost packet can add seconds
  4. Lame delegations: NS records pointing to non-responsive servers waste query attempts
  5. DNSSEC validation: Additional queries for DNSKEY and DS records

All root servers and most major TLD/public resolvers use anycast—multiple servers share a single IP address, and BGP routes queries to the topologically nearest instance.

Benefits:

  • Reduces RTT by routing to nearby instances
  • Distributes DDoS traffic across instances
  • Improves availability (instance failure → traffic reroutes)

Caveat: “Nearest” means closest in BGP routing terms (AS-path length and operator policy), not geographic distance. A query from Tokyo might route to a Singapore instance even if there’s one in Tokyo.

| Strategy | Mechanism | Trade-off |
| --- | --- | --- |
| Aggressive caching | Longer TTLs | Slower propagation of changes |
| Prefetching | Refresh before expiry | Additional background queries |
| Serve-stale | Return expired data during outages | Risk of stale data |
| Multiple authoritative servers | Geographic distribution | Operational complexity |
| Low TTL during migrations | 60-300 seconds temporarily | Higher authoritative load |

Browsers speculatively resolve hostnames found in page links:

  • Reduces perceived latency by parallelizing DNS with page load
  • Disabled by default on HTTPS pages (privacy concern)
  • Controlled via X-DNS-Prefetch-Control header or <link rel="dns-prefetch">
$ dig api.example.com
; <<>> DiG 9.18.18 <<>> api.example.com
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12345
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; ANSWER SECTION:
api.example.com. 300 IN A 93.184.216.50
;; Query time: 23 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)

Key fields:

  • status: NOERROR — Query succeeded
  • flags: qr rd ra — Query Response, Recursion Desired, Recursion Available
  • Query time: 23 msec — Round-trip to recursive resolver
$ dig +trace api.example.com

Trace mode bypasses the recursive resolver and performs iterative resolution directly, showing each hop:

. 518400 IN NS a.root-servers.net.
...
com. 172800 IN NS a.gtld-servers.net.
...
example.com. 172800 IN NS ns1.example.com.
...
api.example.com. 300 IN A 93.184.216.50

This reveals which server returned each referral and helps identify where resolution stalls.

$ dig @ns1.example.com api.example.com

Query a specific nameserver directly. Useful for verifying authoritative server configuration or comparing responses across replicas.

$ dig +norecurse @8.8.8.8 api.example.com

The +norecurse flag clears the RD bit, so the resolver answers only from data it already holds. If the record isn’t cached, you’ll get a referral or an empty response.

$ dig +dnssec example.com

Sets the DO (DNSSEC OK) bit so the response includes DNSSEC records such as RRSIG signatures. Check the ad (Authenticated Data) flag in the response header: if set, the resolver validated the DNSSEC chain.

DoH encapsulates DNS queries in HTTPS, providing:

  • Confidentiality: TLS encrypts the query and response
  • Integrity: TLS prevents tampering
  • Authentication: Server certificate validates resolver identity

DoH uses the application/dns-message media type with standard DNS wire format. It integrates with HTTP caching—HTTP freshness lifetime MUST be ≤ smallest Answer TTL.
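For GET requests, RFC 8484 carries the wire-format query in a dns parameter encoded as base64url with padding removed, and suggests a message ID of 0 so identical queries produce identical, cache-friendly URLs. The sketch below only builds such a URL; the helper names are assumptions, the resolver URL is the placeholder host used in the RFC's own example, and no HTTP request is made.

```python
import base64
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = 1) -> bytes:
    """One-question query with ID 0 and RD set; QCLASS 1 is IN."""
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    return header + encode_name(name) + struct.pack("!HH", qtype, 1)


def doh_get_url(resolver_url: str, name: str, qtype: int = 1) -> str:
    """Return a DoH GET URL with the query base64url-encoded, padding stripped."""
    encoded = base64.urlsafe_b64encode(build_query(name, qtype)).rstrip(b"=")
    return f"{resolver_url}?dns={encoded.decode('ascii')}"


print(doh_get_url("https://dnsserver.example.net/dns-query", "www.example.com"))
# https://dnsserver.example.net/dns-query?dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB
```

A client would send this with an Accept: application/dns-message header and parse the response body as an ordinary DNS message.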

Trade-offs:

  • Pro: Bypasses network-level DNS interception/filtering
  • Con: Centralizes DNS at browser-configured resolver (often Cloudflare or Google)
  • Con: Breaks enterprise DNS policies and split-horizon setups

DoT runs DNS over TLS on port 853. Unlike DoH, it’s a dedicated protocol, not tunneled through HTTP.

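Design decision: RFC 7858 makes port 853 the default for DoT (another port is allowed only by mutual agreement between client and server) and prohibits running DoT on port 53. This separation reduces downgrade attack risk but makes DoT easier to block at the network level.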

DNSSEC provides cryptographic authentication of DNS responses:

  1. Zone operator signs records with private key
  2. Signatures published as RRSIG records
  3. Public key published as DNSKEY record
  4. Parent zone publishes DS record (hash of child’s DNSKEY)
  5. Resolver follows chain of trust from root to target
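Step 4 can be illustrated by computing a SHA-256 DS digest: the hash covers the owner name in canonical (lowercase) wire format followed by the DNSKEY RDATA (flags, protocol, algorithm, public key), as defined in RFC 4034. The sketch uses placeholder key bytes rather than a real key, so the printed digest is not a valid DS record for any zone, and the function names are illustrative.

```python
import hashlib
import struct


def canonical_name(name: str) -> bytes:
    """Lowercased wire-format name: length-prefixed labels plus a zero byte."""
    out = b""
    for label in name.rstrip(".").lower().split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def ds_digest_sha256(owner: str, flags: int, protocol: int,
                     algorithm: int, public_key: bytes) -> str:
    """Hex SHA-256 digest over owner name and DNSKEY RDATA."""
    rdata = struct.pack("!HBB", flags, protocol, algorithm) + public_key
    return hashlib.sha256(canonical_name(owner) + rdata).hexdigest()


# Placeholder key material: flags 257 (key-signing key), protocol 3, algorithm 13.
fake_key = bytes(range(64))
print(ds_digest_sha256("example.com.", 257, 3, 13, fake_key))
```

A validating resolver performs the same computation on the child's DNSKEY and compares it with the digest in the parent's DS record; a mismatch breaks the chain of trust.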

NSEC/NSEC3 provide authenticated denial—proof that a name doesn’t exist, preventing attackers from forging NXDOMAIN responses.

Adoption: DNSSEC is widely deployed at TLDs but inconsistently at domain level. Validation failures result in SERVFAIL, which can break resolution for misconfigured zones.

DNS resolution is deceptively simple on the surface—a name goes in, an IP comes out. The underlying system is a distributed, hierarchical database with caching at every layer. Performance depends on cache hit rates; reliability depends on redundant authoritative servers and proper delegation. When debugging DNS issues, trace the path: stub → recursive → root → TLD → authoritative. Check TTLs, verify RCODE, and use +trace to pinpoint where resolution fails.

  • TCP/IP networking fundamentals
  • Basic command-line familiarity (dig, nslookup)
  • Understanding of client-server architecture

| Term | Definition |
| --- | --- |
| Stub resolver | DNS client library that forwards queries to a recursive resolver |
| Recursive resolver | Server that performs iterative resolution and maintains cache |
| Authoritative server | Server that holds definitive records for a zone |
| TTL | Time To Live; seconds a record may be cached |
| RCODE | Response Code; 4-bit field indicating query result |
| Glue record | A/AAAA record embedded in referral to break circular dependencies |
| Anycast | Routing technique where multiple servers share one IP address |
| DNSSEC | DNS Security Extensions; cryptographic authentication of DNS data |
| DoH | DNS over HTTPS (RFC 8484) |
| DoT | DNS over TLS (RFC 7858) |

  • DNS resolution flows from stub → recursive → root → TLD → authoritative, following referrals down the namespace tree
  • Caching at the recursive resolver dominates latency; TTLs control cache lifetime
  • Negative responses (NXDOMAIN, NODATA) are cached using SOA.MINIMUM
  • SERVFAIL indicates resolution failure; use dig +trace to identify the failing hop
  • Modern DNS adds encryption (DoH, DoT) and authentication (DNSSEC)
  • Anycast distributes load and reduces latency for root/TLD servers
