RECALL Operations and Development Guide

Overview

This guide documents RECALL, a recursive caching DNS resolver. Its source files are listed in section 11.

It is intended for both operators (deployment and troubleshooting) and developers (architecture, resolution semantics, and scripting API).

1. Quick Start

1.1 Runtime entry point

The standalone RECALL app entry point is defined in buildgen/apps/recall.lua and runs:

local recall = require("recall")
math.randomseed(os.time())
local srv, err = recall.new()
if not srv then
	print("failed to init RECALL: " .. tostring(err))
	os.exit(-1)
end
srv:run()

1.2 Config file location

RECALL reads config JSON from:

  1. RECALL_CONFIG_FILE environment variable, if set

  2. /etc/recall/config.json otherwise
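The lookup order above amounts to a simple fallback. A minimal sketch, assuming a hypothetical helper named config_path (the actual loader in src/recall/recall.lua may be structured differently):

```lua
-- Hypothetical helper: returns the config path following the documented
-- order (environment variable first, then the default location).
local DEFAULT_CONFIG = "/etc/recall/config.json"

local function config_path(getenv)
    getenv = getenv or os.getenv
    local from_env = getenv("RECALL_CONFIG_FILE")
    -- Assumption: an empty value is treated the same as an unset variable.
    if from_env and from_env ~= "" then
        return from_env
    end
    return DEFAULT_CONFIG
end

print(config_path())
```

The getenv parameter exists only so the fallback can be exercised without touching the real environment.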

1.3 Minimal config

{
  "ip": "127.0.0.1",
  "port": 5353
}

This starts RECALL on port 5353, performing full iterative resolution from IANA root servers on cache misses and caching results in memory.

To enable persistent caching with MNEME (embedded database):

{
  "ip": "127.0.0.1",
  "port": 5353,
  "cache": { "mneme": true }
}

The MNEME database file defaults to /var/cache/recall/dns.mneme. MNEME is also required for the scripting engine (section 7).

1.4 Test query

dig @127.0.0.1 -p 5353 example.com A

2. Full Configuration Reference

All defaults are defined in src/recall/recall.lua.

2.1 Top-level config

Key                  Default    Notes
ip                   127.0.0.1  Bind address
port                 53         Bind port (UDP and TCP)
timeout              5          Per-query socket timeout for iterative resolution (seconds)
edns_buffer_size     4096       EDNS advertised UDP payload size
tcp_timeout          10         TCP session idle timeout (seconds)
tcp_fork_limit       64         Max concurrent TCP coroutines
tcp_per_ip_limit     4          Max concurrent TCP connections per source IP
max_udp_size         4096       Max UDP datagram size to receive
log_level            info       Logger level passed to std.logger
dnssec               false      Enable DNSSEC validation for iterative resolution
dnssec_reject_bogus  false      Return SERVFAIL for DNSSEC-bogus responses (requires dnssec)

2.2 Cache config (cache)

Key                Default                     Notes
min_ttl            30                          Minimum TTL floor (seconds)
max_ttl            86400                       Maximum TTL cap (seconds)
max_negative_ttl   300                         Negative response (NXDOMAIN/NODATA) TTL cap
mneme              false                       Use embedded MNEME database for persistent caching
mneme_path         /var/cache/recall/dns.mneme MNEME database file path
compact_threshold  0.5                         Free-page ratio that triggers auto-compaction
compact_interval   3600                        Minimum seconds between compaction checks

Cache config is passed to dns.cache.new(). See DNS.md for cache backend details. Backend priority: MNEME (if mneme = true) > in-memory.
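The TTL settings above act as a clamp on the TTL carried by each record. The sketch below illustrates the arithmetic only; clamp_ttl is a hypothetical name, and whether min_ttl also applies to negative entries is an assumption (here it does not):

```lua
-- Hypothetical helper mirroring the cache TTL settings from section 2.2.
local DEFAULTS = { min_ttl = 30, max_ttl = 86400, max_negative_ttl = 300 }

local function clamp_ttl(ttl, cfg, negative)
    cfg = cfg or {}
    if negative then
        -- Assumption: negative entries are only capped, not raised to min_ttl.
        return math.min(ttl, cfg.max_negative_ttl or DEFAULTS.max_negative_ttl)
    end
    local lo = cfg.min_ttl or DEFAULTS.min_ttl
    local hi = cfg.max_ttl or DEFAULTS.max_ttl
    return math.max(lo, math.min(ttl, hi))
end

print(clamp_ttl(5), clamp_ttl(7200), clamp_ttl(3600, nil, true))
```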

2.4 Forward zones config (forward_zones)

Map of zone names to zone configuration objects:

{
  "forward_zones": {
    "internal.corp.": {
      "servers": ["10.0.0.10", "10.0.0.11"],
      "timeout": 3
    },
    "dev.example.com.": {
      "servers": ["10.120.0.10:5353"],
      "timeout": 2,
      "retries": 3
    },
    "secure.corp.": {
      "servers": ["1.1.1.1", "1.0.0.1"],
      "use_tls": true,
      "server_name": "cloudflare-dns.com"
    }
  }
}

Per-zone fields:

Key          Default            Notes
servers      required           Upstream nameserver list
timeout      top-level timeout  Per-query timeout for this zone
retries      2                  Retry attempts
use_tls      false              Use DNS-over-TLS (port 853) for this zone
server_name  server address     TLS SNI hostname
cafile       system CA bundle   Path to CA certificate bundle
capath       -                  Path to CA certificate directory
verify       true               Enable TLS certificate verification

Server entries support port specification as "host:port" strings or {"host": "10.120.0.10", "port": 5353} tables. When use_tls is enabled, servers default to port 853 instead of 53.
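The two accepted server entry forms and the port defaults can be illustrated with a small normalizer. This is a sketch under stated assumptions: normalize_server is a hypothetical name, and only IPv4 addresses and hostnames are handled (IPv6 literals would need bracket parsing that is not shown here).

```lua
-- Hypothetical helper: turns a forward-zone server entry into {host, port}.
local function normalize_server(entry, use_tls)
    local default_port = use_tls and 853 or 53
    if type(entry) == "table" then
        return { host = entry.host, port = entry.port or default_port }
    end
    -- "host:port" string form; IPv6 literals are not handled in this sketch.
    local host, port = entry:match("^(.-):(%d+)$")
    if host then
        return { host = host, port = tonumber(port) }
    end
    return { host = entry, port = default_port }
end

local s = normalize_server("10.120.0.10:5353")
print(s.host, s.port)
```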

Forward zone queries use dns.client with RD=1 (recursive requests to the configured forwarders), not iterative resolution. Each zone gets its own dns.client instance sharing the global cache. Zone matching is longest-match: if both corp. and dev.corp. are configured, queries for app.dev.corp. route to the dev.corp. forwarder.

Root zone forwarding

The root zone "." can be used as a catch-all forward zone to route all queries to specific upstream resolvers, completely replacing iterative resolution:

{
  "forward_zones": {
    ".": {
      "servers": ["1.1.1.1", "1.0.0.1"],
      "use_tls": true,
      "server_name": "cloudflare-dns.com"
    }
  }
}

When "." is configured, all queries that don't match a more specific forward zone are sent to its servers. Forward zone matches are terminal — if the upstream forwarder fails, SERVFAIL is returned without falling back to iterative resolution. More specific zones still take priority due to longest-match-first ordering.

2.5 Scripting config (scripting)

Key               Default  Notes
enabled           false    Enable script engine
subdomains        []       List of subdomains with dynamic scripts
cache_ttl         60       TTL for caching script-generated records
script_cache_ttl  300      In-memory compiled function cache TTL (seconds)

{
  "scripting": {
    "enabled": true,
    "subdomains": ["dyn.example.com."],
    "cache_ttl": 60,
    "script_cache_ttl": 300
  }
}

See section 7 for the scripting API.

2.6 Rate limit config (rate_limit)

Key                 Default  Notes
enabled             true     Enable UDP rate limiting
queries_per_second  50       Token refill rate per source IP
burst               100      Token bucket capacity (max burst)
cleanup_interval    15       Stale bucket GC interval (seconds)
max_buckets         50000    Maximum tracked source IPs; new IPs rejected when full

{
  "rate_limit": {
    "enabled": true,
    "queries_per_second": 50,
    "burst": 100,
    "cleanup_interval": 15,
    "max_buckets": 50000
  }
}

Rate limiting applies only to UDP queries. TCP connections are structurally limited by tcp_fork_limit (global) and tcp_per_ip_limit (per source IP). See section 5 for details.

3. MNEME Storage

When cache.mneme is enabled, RECALL stores all persistent data in a single MNEME database file (default /var/cache/recall/dns.mneme) with two keyspaces:

Keyspace  Key format           Value                  TTL
cache     <qname>:<qtype>      JSON {records: [...]}  DNS record TTL (clamped)
cache     NEG:<qname>:<qtype>  JSON {rcode, soa}      Negative TTL from SOA
scripts   <subdomain>          Lua source code        None (persistent)

Auto-compaction

MNEME's copy-on-write B-tree accumulates free pages as DNS records expire and get replaced. RECALL periodically checks the free-page ratio (free_pages / page_count) and compacts when it exceeds compact_threshold (default 0.5). Compaction rewrites the database to a new file, reclaiming space, then atomically replaces the original. The minimum interval between compaction checks is compact_interval seconds (default 3600).
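The compaction trigger described above reduces to two checks: enough time has passed since the last check, and the free-page ratio exceeds the threshold. A minimal sketch, assuming a hypothetical should_compact function and a stats table with free_pages and page_count fields (how RECALL obtains these numbers from MNEME is not shown in this guide):

```lua
-- Hypothetical decision function for auto-compaction.
local function should_compact(stats, cfg, now, last_check)
    local interval = cfg.compact_interval or 3600
    local threshold = cfg.compact_threshold or 0.5
    -- Skip the check entirely until the minimum interval has elapsed.
    if now - last_check < interval then
        return false
    end
    -- Guard against division by zero on an empty database.
    if stats.page_count == 0 then
        return false
    end
    return stats.free_pages / stats.page_count > threshold
end

print(should_compact({ free_pages = 60, page_count = 100 }, {}, 4000, 0))
```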

4. Architecture

recall (manager)
└── recall.listener     — LEV async event loop (epoll + coroutines)
    ├── recall.handler  — request pipeline per query
    │   ├── recall.scripting  — dynamic script execution
    │   ├── recall.forward    — forward zone routing
    │   └── recall.resolver   — iterative resolution (dns.iter + dns.cache)
    ├── [UDP coroutines] — one per incoming query
    ├── [TCP coroutines] — one per TCP connection
    └── dns.cache        — TTL cache (in-memory or MNEME)

4.1 Process model

The manager forks a single listener process. The listener runs inside lev.run(), using epoll-based async I/O with Lua coroutines for concurrency. UDP queries are received in a main loop, and each query spawns a detached coroutine for processing. TCP connections are accepted in a dedicated coroutine, and each accepted connection spawns its own handler coroutine. All upstream DNS queries (iterative resolution, forward zones) use LEV transports (dns.transport.udp, dns.transport.tcp). MNEME caching is embedded, so cache operations involve no network I/O.

4.2 Query validation

The handler validates each incoming query before resolution. Checks run in order:

  1. Decode failure → silently dropped (no response)

  2. Not a query (QR=1) → silently dropped

  3. Non-standard opcode (OPCODE != 0) → NOTIMP

  4. EDNS version > 0 → BADVERS (extended rcode via OPT record)

  5. Empty question section → FORMERR

  6. Non-IN class (qclass != 1) → REFUSED

  7. Invalid domain name → FORMERR
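The ordered checks above can be read as a chain of early returns. The following is an illustrative sketch only: the decoded message shape (qr, opcode, edns_version, questions) and the is_valid_name rule (non-empty labels of at most 63 bytes) are assumptions, not the handler's actual data model.

```lua
-- Hypothetical name check: non-empty labels of at most 63 bytes each.
local function is_valid_name(name)
    if type(name) ~= "string" or name == "" or #name > 255 then
        return false
    end
    if name == "." then
        return true
    end
    local trimmed = name:gsub("%.$", "")
    for label in (trimmed .. "."):gmatch("([^.]*)%.") do
        if #label == 0 or #label > 63 then
            return false
        end
    end
    return true
end

-- Returns "drop", an RCODE name, or "ok", applying the checks in order.
-- msg is nil when decoding failed (assumed representation).
local function validate(msg)
    if msg == nil then return "drop" end
    if msg.qr == 1 then return "drop" end
    if msg.opcode ~= 0 then return "NOTIMP" end
    if msg.edns_version and msg.edns_version > 0 then return "BADVERS" end
    local q = msg.questions and msg.questions[1]
    if not q then return "FORMERR" end
    if q.qclass ~= 1 then return "REFUSED" end
    if not is_valid_name(q.qname) then return "FORMERR" end
    return "ok"
end

print(validate({ qr = 0, opcode = 0, questions = { { qname = "example.com.", qclass = 1 } } }))
```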

4.3 Query routing

Queries that pass validation are routed through resolution paths in order:

  1. Scripted subdomains — if qname matches a configured scripting subdomain

  2. Forward zones — if qname matches a configured forward zone (longest match)

  3. RFC1918 reverse zone guard — if qname falls under a private reverse zone (10.in-addr.arpa., 168.192.in-addr.arpa., 16-31.172.in-addr.arpa.), return NXDOMAIN with a synthetic SOA immediately instead of querying public root servers

  4. Iterative resolution — full walk from IANA root servers via dns.iter.trace()

The first path that produces a result (or a definitive negative like NXDOMAIN) is used. Scripted responses that return nil fall through to forward zones, then to the RFC1918 guard, then to iterative resolution. To resolve private reverse lookups against an internal DNS server, configure a forward zone for the relevant .in-addr.arpa. zone — forward zones (step 2) take priority over the guard (step 3).
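The routing order and the private reverse zone guard can be sketched as follows. The function names and the handler table h (is_scripted, run_script, forward_zone) are hypothetical stand-ins for the real modules; the guard treats the 172 range as 16.172.in-addr.arpa. through 31.172.in-addr.arpa.

```lua
-- Hypothetical guard: true if qname falls under an RFC1918 reverse zone.
local function is_private_reverse(qname)
    local name = qname:lower()
    if name:sub(-1) ~= "." then name = name .. "." end
    local dotted = "." .. name
    for _, suffix in ipairs({ ".10.in-addr.arpa.", ".168.192.in-addr.arpa." }) do
        if dotted:sub(-#suffix) == suffix then
            return true
        end
    end
    -- 172.16.0.0/12 covers second octets 16 through 31.
    local second = dotted:match("%.(%d+)%.172%.in%-addr%.arpa%.$")
    local n = second and tonumber(second)
    return n ~= nil and n >= 16 and n <= 31
end

-- Returns the name of the path that would handle the query.
local function route(qname, qtype, h)
    if h.is_scripted(qname) then
        local records = h.run_script(qname, qtype)
        -- A nil script result falls through to the next path.
        if records then return "script", records end
    end
    if h.forward_zone(qname) then return "forward" end
    if is_private_reverse(qname) then return "rfc1918-guard" end
    return "iterative"
end
```

Because the forward zone check runs before the guard, a forward zone configured for a private reverse zone wins, which matches the ordering described above.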

5. Rate Limiting

5.1 UDP rate limiting

RECALL uses a token bucket algorithm to rate-limit UDP queries per source IP. Each source IP gets an independent bucket with a configurable refill rate (queries_per_second, default 50) and capacity (burst, default 100). Packets that arrive when the bucket is empty are silently dropped — no REFUSED or SERVFAIL response is sent.

Rate limiting is checked before decoding the DNS message, minimizing CPU cost for dropped packets.

Stale buckets (idle longer than burst / queries_per_second seconds) are garbage-collected every cleanup_interval seconds (default 15).
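A token bucket of the kind described above can be sketched in a few lines. This is a generic illustration of the algorithm with the documented defaults, not RECALL's actual limiter; the Limiter name, its methods, and the explicit now argument are assumptions made for the example.

```lua
-- Illustrative per-source-IP token bucket limiter.
local Limiter = {}
Limiter.__index = Limiter

function Limiter.new(cfg)
    cfg = cfg or {}
    return setmetatable({
        rate = cfg.queries_per_second or 50,
        burst = cfg.burst or 100,
        max_buckets = cfg.max_buckets or 50000,
        buckets = {},
        count = 0,
    }, Limiter)
end

-- Returns true if the packet may be processed, false if it should be dropped.
function Limiter:allow(ip, now)
    local b = self.buckets[ip]
    if not b then
        -- New source IPs are rejected once the table is full.
        if self.count >= self.max_buckets then return false end
        b = { tokens = self.burst, last = now }
        self.buckets[ip] = b
        self.count = self.count + 1
    end
    -- Refill according to elapsed time, capped at the bucket capacity.
    b.tokens = math.min(self.burst, b.tokens + (now - b.last) * self.rate)
    b.last = now
    if b.tokens < 1 then return false end
    b.tokens = b.tokens - 1
    return true
end

-- Removes buckets idle longer than burst / rate seconds (a full refill).
function Limiter:cleanup(now)
    local idle = self.burst / self.rate
    for ip, b in pairs(self.buckets) do
        if now - b.last > idle then
            self.buckets[ip] = nil
            self.count = self.count - 1
        end
    end
end
```

A bucket idle for burst / queries_per_second seconds has refilled completely, so dropping it loses no state: a fresh bucket for the same IP would start full anyway.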

5.2 TCP connection limits

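TCP connections are limited structurally rather than with token buckets: tcp_fork_limit (default 64) caps the number of concurrent TCP coroutines globally, and tcp_per_ip_limit (default 4) caps concurrent connections per source IP. Idle sessions are closed after tcp_timeout seconds (default 10).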
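Connection caps like tcp_fork_limit and tcp_per_ip_limit can be enforced with two counters. The sketch below is an assumption about how such a gate might look; new_tcp_gate, acquire, and release are hypothetical names and do not come from the listener source.

```lua
-- Hypothetical admission gate for TCP connections.
local function new_tcp_gate(cfg)
    cfg = cfg or {}
    local fork_limit = cfg.tcp_fork_limit or 64
    local per_ip_limit = cfg.tcp_per_ip_limit or 4
    local total, per_ip = 0, {}
    local gate = {}

    -- Call on accept; returns false if the connection should be refused.
    function gate.acquire(ip)
        local n = per_ip[ip] or 0
        if total >= fork_limit or n >= per_ip_limit then
            return false
        end
        total = total + 1
        per_ip[ip] = n + 1
        return true
    end

    -- Call when the handler coroutine for a connection finishes.
    function gate.release(ip)
        local n = per_ip[ip]
        if not n then return end
        total = total - 1
        per_ip[ip] = (n > 1) and (n - 1) or nil
    end

    return gate
end
```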

6. Resolution Semantics

6.1 Iterative resolver (recall.resolver)

The resolver wraps dns.iter.trace() with dns.cache for positive and negative caching.

Resolution flow:

  1. Check positive cache → return on hit

  2. Check negative cache → return NXDOMAIN/NODATA on hit

  3. Call iter.trace(qname, qtype) → walk from root servers to authoritative answer

  4. Process trace result:

    • answer: collect CNAME chain from intermediate steps, prepend to answer records, cache combined result, return

    • nxdomain: cache negative with SOA, return nil, "NXDOMAIN", soa

    • nodata: cache negative with SOA, return nil, "NODATA", soa

    • error: return nil, error_description

CNAME chain handling: when iter.trace() follows CNAMEs (restarting from root for each target), the resolver collects CNAME records from intermediate trace steps and prepends them to the final answer. Clients receive the full CNAME chain in the answer section, matching standard resolver behavior.
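The resolution flow and CNAME prepending can be sketched with a stubbed trace function. Everything about the data shapes here is an assumption made for illustration: the result fields (status, steps, cnames, records, soa, error), the cache as two plain tables, and the resolve name itself. The real dns.iter.trace() return value may differ.

```lua
-- Illustrative resolver flow: positive cache, negative cache, then trace.
-- cache is assumed to be { pos = {}, neg = {} }; trace is injected.
local function resolve(qname, qtype, cache, trace)
    local key = qname .. ":" .. qtype
    local hit = cache.pos[key]
    if hit then return hit end
    local neg = cache.neg[key]
    if neg then return nil, neg.rcode, neg.soa end

    local r = trace(qname, qtype)
    if r.status == "answer" then
        local out = {}
        -- CNAME records from intermediate steps go first, in order.
        for _, step in ipairs(r.steps or {}) do
            for _, rec in ipairs(step.cnames or {}) do
                out[#out + 1] = rec
            end
        end
        for _, rec in ipairs(r.records) do
            out[#out + 1] = rec
        end
        cache.pos[key] = out
        return out
    elseif r.status == "nxdomain" or r.status == "nodata" then
        local rcode = r.status:upper()
        cache.neg[key] = { rcode = rcode, soa = r.soa }
        return nil, rcode, r.soa
    end
    return nil, r.error or "resolution failed"
end
```

TTL handling is omitted to keep the flow visible; in practice cached entries would expire according to the clamped TTLs from section 2.2.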

6.2 DNSSEC validation

When dnssec is enabled, the iterative resolver performs chain-of-trust validation from the IANA root trust anchor through each delegation to the authoritative answer. At each step, DS records from referrals are collected, DNSKEY records are fetched from zone apex servers, and RRSIG signatures are verified cryptographically.

Each resolution step receives a validation status:

Status         Meaning
secure         Chain of trust validated from root to answer
insecure       Zone is provably unsigned (no DS in parent delegation)
bogus          Signatures present but verification failed
indeterminate  Cannot determine (unsupported algorithm, missing data)

Supported DNSSEC algorithms: RSA/SHA-256 (8), RSA/SHA-512 (10), ECDSA P-256 (13), ECDSA P-384 (14), ED25519 (15).

When dnssec_reject_bogus is also enabled, queries that resolve with a bogus DNSSEC status return SERVFAIL instead of the answer. This prevents clients from receiving responses that fail signature verification. Unsigned zones (insecure) are passed through normally.
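The interaction between the validation status and the two DNSSEC settings can be summarized as a small policy function. This is a sketch of the documented behavior only; apply_dnssec_policy is a hypothetical name, and the returned log level follows the logging rule stated in this section (warn for bogus, debug otherwise).

```lua
-- Hypothetical policy: returns "SERVFAIL" or "pass", plus a log level.
local function apply_dnssec_policy(status, cfg)
    local level = (status == "bogus") and "warn" or "debug"
    -- Rejection requires both settings; dnssec_reject_bogus alone has no effect.
    local reject = cfg.dnssec and cfg.dnssec_reject_bogus and status == "bogus"
    return (reject and "SERVFAIL" or "pass"), level
end

print(apply_dnssec_policy("bogus", { dnssec = true, dnssec_reject_bogus = true }))
```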

DNSSEC validation adds one extra DNSKEY query per delegation level (root, TLD, authoritative zone). The validation status is logged at debug level (or warn level for bogus results).

6.3 Forward zone resolution (recall.forward)

Forward zones use dns.client instances configured with the zone's upstream servers. Queries are sent with RD=1 (recursion desired). The dns.client provides retry, failover, CNAME following, and integrated caching through the shared cache instance.

Zone matching is longest-match by label count. A query for app.dev.corp. matches dev.corp. before corp. if both are configured.
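Longest-match lookup can be implemented by stripping one leading label at a time until a configured zone is found. The sketch below assumes a hypothetical match_zone function and a zones table keyed by fully qualified zone name; the actual index in recall.forward may be built differently.

```lua
-- Illustrative longest-match lookup over a table keyed by zone name.
-- Returns the matching zone name, or nil if no zone (not even ".") matches.
local function match_zone(qname, zones)
    local name = qname:lower()
    if name:sub(-1) ~= "." then name = name .. "." end
    while true do
        if zones[name] then return name end
        if name == "." then return nil end
        -- Strip the leftmost label; an empty remainder means the root.
        local rest = name:match("^[^.]+%.(.*)$")
        name = (rest == nil or rest == "") and "." or rest
    end
end

local zones = { ["corp."] = true, ["dev.corp."] = true }
print(match_zone("app.dev.corp.", zones))
```

Walking from the most specific name toward the root guarantees that the first hit is the zone with the most labels, so no sorting of the configured zones is needed.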

6.4 Response construction

All responses set:

Response types:

Condition            RCODE     Answer   Authority
Records found        NOERROR   records  empty
Name not found       NXDOMAIN  empty    SOA if available
No records for type  NOERROR   empty    SOA if available
Resolution failure   SERVFAIL  empty    empty
Malformed query      FORMERR   empty    empty

6.5 UDP truncation

When a UDP response exceeds the client's advertised EDNS buffer size (or 512 bytes without EDNS), the response is re-encoded with TC=1 set and the answer, authority, and additional sections stripped. The client is expected to retry over TCP.
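The truncation rule can be sketched as a size check followed by a stripped copy of the message. The message table shape (tc, answers, authority, additional) and both function names are assumptions for illustration. Treating advertised sizes below 512 as 512 follows RFC 6891; whether RECALL applies that floor is not stated in this guide.

```lua
-- Effective UDP payload limit for a client (nil means no EDNS).
local function udp_limit(client_edns_size)
    if not client_edns_size then return 512 end
    -- RFC 6891 floor; applying it here is an assumption about RECALL.
    return math.max(512, client_edns_size)
end

-- Returns the message to send and whether it was truncated.
local function truncate_if_needed(msg, encoded_len, client_edns_size)
    if encoded_len <= udp_limit(client_edns_size) then
        return msg, false
    end
    -- Shallow copy so the original (for example, a cached answer) is untouched.
    local out = {}
    for k, v in pairs(msg) do out[k] = v end
    out.tc = true
    out.answers, out.authority, out.additional = {}, {}, {}
    return out, true
end
```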

7. Scripting API

Dynamic Lua scripts stored in MNEME generate DNS responses at query time. Scripts are loaded from the scripts keyspace in the MNEME database, keyed by subdomain name. Scripting requires cache.mneme = true.

7.1 Storing scripts

cat << 'EOF' > my_script.lua
if qtype ~= dns.TYPE.A then return nil end
local hosts = { "10.0.1.1", "10.0.1.2", "10.0.1.3" }
local idx = (os.time() % #hosts) + 1
return { dns.a_record(qname, hosts[idx], 30) }
EOF

mneme set --db /var/cache/recall/dns.mneme -K scripts -t file example.com. my_script.lua

7.2 Script environment

Scripts execute in a sandboxed environment with setfenv. Available globals:

Query context:

Variable  Type    Description
qname     string  Normalized FQDN with trailing dot
qtype     number  Numeric type code

DNS record constructors:

Function                                                   Arguments                                        Record type
dns.a_record(name, address, ttl)                           name, IPv4 string, TTL                           A
dns.aaaa_record(name, address, ttl)                        name, IPv6 string, TTL                           AAAA
dns.cname_record(name, target, ttl)                        name, target FQDN, TTL                           CNAME
dns.txt_record(name, text, ttl)                            name, text string, TTL                           TXT
dns.mx_record(name, exchange, preference, ttl)             name, exchange FQDN, pref, TTL                   MX
dns.srv_record(name, target, port, priority, weight, ttl)  name, target FQDN, port, priority, weight, TTL   SRV
dns.ns_record(name, nsdname, ttl)                          name, NS FQDN, TTL                               NS
dns.ptr_record(name, ptrdname, ttl)                        name, PTR FQDN, TTL                              PTR

All TTL arguments default to 60 seconds when omitted. The dns.TYPE table is available for type comparisons (dns.TYPE.A, dns.TYPE.AAAA, etc.).

Safe builtins:

Not available: io, os.execute, require, dofile, loadfile, loadstring, debug, coroutine.
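Sandboxing with a restricted environment table can be sketched as follows. This is not the code in recall/scripting.lua: the exact set of exposed builtins, the stub dns table, and the record field names are assumptions for illustration. The compile helper uses setfenv where it exists (Lua 5.1, LuaJIT) and falls back to the env argument of load on newer Lua versions.

```lua
-- Compile a chunk so that its globals resolve only through env.
local function compile(src, env)
    if setfenv then
        local fn, err = loadstring(src, "=script")
        if not fn then return nil, err end
        return setfenv(fn, env)
    end
    return load(src, "=script", "t", env)
end

-- Assumed environment: query context, a stub dns table, a few safe builtins.
local function make_env(qname, qtype)
    return {
        qname = qname,
        qtype = qtype,
        dns = {
            TYPE = { A = 1, AAAA = 28, TXT = 16 },
            -- Stand-in constructor; the real record layout is not shown here.
            a_record = function(name, address, ttl)
                return { name = name, type = 1, address = address, ttl = ttl or 60 }
            end,
        },
        os = { time = os.time },
        string = string, math = math, table = table,
        pairs = pairs, ipairs = ipairs,
        tostring = tostring, tonumber = tonumber, type = type,
    }
end

-- Runs a script and returns its records, or nil plus an error message.
local function run_script(src, qname, qtype)
    local fn, err = compile(src, make_env(qname, qtype))
    if not fn then return nil, err end
    local ok, result = pcall(fn)
    if not ok then return nil, result end
    return result
end
```

Names that are absent from the environment table, such as io or require, simply evaluate to nil inside the script.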

7.3 Script return value

Scripts must return an array of record tables (as produced by the dns.*_record() constructors), or nil to fall through to forward zone / iterative resolution.

-- Return A records
return { dns.a_record(qname, "10.0.1.1", 60) }

-- Return nil to fall through
if qtype ~= dns.TYPE.A then return nil end

7.4 Script caching

Compiled script functions are cached in memory for script_cache_ttl seconds (default 300). After expiry, the script source is re-fetched from MNEME and recompiled. This allows hot-reloading scripts without restarting RECALL.

Script-generated records are cached in the DNS cache for cache_ttl seconds (default 60).
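The compiled-function cache behaves like a small TTL map in front of the script store. A minimal sketch, assuming a hypothetical new_script_cache factory with injected fetch_source, compile, and clock functions (the injection is only there to make the expiry logic easy to see and test):

```lua
-- Illustrative TTL cache for compiled script functions.
-- fetch_source(subdomain) returns Lua source or nil; compile(src) returns a function.
local function new_script_cache(fetch_source, compile, ttl, clock)
    ttl = ttl or 300
    clock = clock or os.time
    local entries = {}

    return function(subdomain)
        local now = clock()
        local e = entries[subdomain]
        if e and now < e.expires then
            return e.fn
        end
        -- Expired or missing: re-fetch the source and recompile.
        local src = fetch_source(subdomain)
        if not src then
            entries[subdomain] = nil
            return nil
        end
        local fn, err = compile(src)
        if not fn then return nil, err end
        entries[subdomain] = { fn = fn, expires = now + ttl }
        return fn
    end
end
```

With the default of 300 seconds, an updated script is picked up at most five minutes after it is written to the store.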

7.5 Example scripts

Round-robin A records:

if qtype ~= dns.TYPE.A then return nil end
local hosts = { "10.0.1.1", "10.0.1.2", "10.0.1.3" }
local idx = (os.time() % #hosts) + 1
return { dns.a_record(qname, hosts[idx], 30) }

Wildcard subdomain routing:

if qtype ~= dns.TYPE.A then return nil end
local label = qname:match("^([^.]+)%.")
if label == "web" then
    return { dns.a_record(qname, "10.0.2.1", 60) }
elseif label == "api" then
    return { dns.a_record(qname, "10.0.2.2", 60) }
end
return nil

TXT record with timestamp:

if qtype ~= dns.TYPE.TXT then return nil end
return { dns.txt_record(qname, "generated at " .. os.time(), 10) }

8. Failure-Mode Responses

RCODE            Trigger
FORMERR          Missing question section, invalid domain name
SERVFAIL         Iterative resolution failure, forward zone timeout, DNSSEC bogus (when dnssec_reject_bogus enabled)
NXDOMAIN         Domain does not exist (from authoritative server or cache), or reverse lookup for RFC1918 private network with no forward zone configured
NOERROR (empty)  Name exists but no records of requested type (NODATA)
NOTIMP           Non-standard opcode (OPCODE != 0)
BADVERS          EDNS version > 0 (extended rcode via OPT record)
REFUSED          Query class is not IN (1)

Non-query messages (QR=1) are silently dropped. Malformed UDP packets that fail to decode are silently dropped.

9. Observability

RECALL uses structured logging via std.logger. Log events include:

10. Build

The standalone recall binary is built via buildgen/apps/recall.lua. Dependencies:

11. Source Files

File                                   Purpose
src/recall/recall.lua                  Manager: config loading, process spawning, child reaping
src/recall/recall/listener.lua         LEV async event loop (UDP + TCP + coroutines)
src/recall/recall/handler.lua          Request pipeline: decode, route, resolve, encode
src/recall/recall/resolver.lua         Iterative resolver wrapping dns.iter + dns.cache
src/recall/recall/forward.lua          Forward zone index and dns.client wrappers
src/recall/recall/scripting.lua        Dynamic Lua script loading and sandboxed execution
buildgen/apps/recall.lua               Standalone binary definition
buildgen/entrypoints/recall/start.lua  Entry point