Multi-VLAN DHCP+DNS home network setup with dnsmasq
Last week my ISP had an outage, and I discovered to my dismay that local DNS resolution on my home network stopped working too. My ISP having outages is nothing new, but the broken local DNS is, as the whole point of setting it up in the first place was to allow local services to keep working during the not-infrequent ISP oopsies.
My home network is largely run over Unifi hardware, with a Gateway Pro acting as, well, the network’s gateway. It comes with its own local DHCP and DNS setup based on dnsmasq, which has worked fairly well in the past. It’s not entirely clear whether this new failure mode was introduced in a recent update, though as I said, I’ve weathered ISP outages before without noticing the issue.
Looking a bit into it, it appears that the dnsmasq instance responsible for local DHCP and DNS is configured to use the gateway’s WAN interface, which seems a little strange, with the result that Unifi’s WAN fallback mechanism seems to disable it when the upstream connection breaks (thus breaking DNS as well).
I did a fair amount of searching and found a few mentions of similar issues from other people, though there didn’t seem to be a solution.
I decided that instead of trying to fix the issue locally on the gateway itself I would just deploy my own dnsmasq installation on separate hardware to prevent future updates from breaking it again.
Easier said than done.
Mo VLANs Mo Problems
Installing dnsmasq on my own host was easy enough, but I soon ran into some issues that didn’t exist in the previous setup.
You see, my network has a number of VLANs to keep local hosts tidy and organized. For example, one VLAN (with disabled Internet access) hosts all of my IoT tchotchkes which helps prevent major security issues, and the host running Home Assistant is connected to both the IoT VLAN for managing these devices, as well as the main network to allow users to interact with it.
With the Gateway Pro setup the Home Assistant host would get two DHCP leases, one for each VLAN, and more critically, its domain name would resolve to one or the other IP address depending on which VLAN the DNS query came from.
This didn’t work in my naive setup, as it turns out that a single dnsmasq
instance can only associate a particular hostname with a single DHCP lease file
entry, which means that the multi-VLAN host would always resolve to a single IP
address regardless of the source VLAN (with only one lease having the correct
hostname in the lease database, and the others listing * as the hostname).
The solution was running separate dnsmasq instances for DHCP (one per VLAN) plus
a dedicated dnsmasq instance for DNS only, which is actually pretty much how the
Unifi gateway setup works as well. The DHCP instances never listen on port 53;
they just hand out addresses and use dnsmasq’s dhcp-script hook to write
individual /etc/hosts-compatible files into a shared directory.
The DNS instance then uses hostsdir=/run/dnsmasq/hosts.d to pick up all those
per-IP hosts files and resolve them. Since each file is keyed by IP address
rather than hostname, there’s no conflict when the same device has leases on
multiple VLANs: you just get multiple files, one per IP address, for the same host.
The pieces
Each DHCP instance is enabled as a systemd template unit
(dnsmasq-dhcp@<VLAN>.service) so they can be managed independently, and each
instance’s configuration is generated from my Ansible inventory.
It looks something like this (with some cruft removed for brevity):
domain=example.com
interface=
port=0
dhcp-authoritative
dhcp-leasefile=/var/lib/dnsmasq/leases-
dhcp-script=/usr/local/bin/dhcp-script.sh
script-on-renewal
The blank values above are where the template variables go: one renders to the
netif's name, the other to the actual netif device. Also note port=0, which
means this instance only does DHCP, never DNS.
The dhcp-script.sh hook runs whenever a lease is added, renewed, or deleted,
and writes one file per IP address into /run/dnsmasq/hosts.d/, containing the
full hostname (with domain) plus the short name.
I couldn’t find an example of a similar setup anywhere, and Unifi’s dhcp-script seems to be a binary, so I couldn’t really inspect what it does. I ended up vibecoding my own script.
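For reference, here is a minimal sketch of what such a hook can look like; it is an illustration, not the exact script. It assumes dnsmasq's documented calling convention (action, MAC address, IP address and, when known, hostname as positional arguments) and the DNSMASQ_DOMAIN environment variable. The HOSTS_DIR override, the update_hosts_file function name and the example.com fallback are made up for illustration.

```shell
#!/bin/sh
# Sketch of a dhcp-script hook: one hosts file per leased IP address.
# dnsmasq invokes the hook as: <action> <mac> <ip> [hostname]
# HOSTS_DIR and the example.com fallback are assumptions for illustration.
HOSTS_DIR="${HOSTS_DIR:-/run/dnsmasq/hosts.d}"

update_hosts_file() {
    action="$1"
    ip="$3"
    name="${4:-}"
    domain="${DNSMASQ_DOMAIN:-example.com}"
    file="$HOSTS_DIR/$ip"

    case "$action" in
        add|old)
            # No usable hostname: make sure no stale entry is left behind.
            if [ -z "$name" ] || [ "$name" = "*" ]; then
                rm -f "$file"
                return 0
            fi
            mkdir -p "$HOSTS_DIR" || return 1
            # One line per file: <ip> <full hostname> <short name>
            printf '%s %s.%s %s\n' "$ip" "$name" "$domain" "$name" > "$file"
            ;;
        del)
            rm -f "$file"
            ;;
        *)
            # Any other action dnsmasq may pass is ignored.
            ;;
    esac
}

# Only act when called with the arguments dnsmasq passes to the hook.
if [ "$#" -ge 3 ]; then
    update_hosts_file "$@"
fi
```

Run by hand as `HOSTS_DIR=/tmp/hosts.d ./dhcp-script.sh add aa:bb:cc:dd:ee:ff 192.168.10.5 hass` (with DNSMASQ_DOMAIN unset), this writes /tmp/hosts.d/192.168.10.5 containing the single line `192.168.10.5 hass.example.com hass`. Keying the file name on the IP address is what lets the same host appear once per VLAN without the entries overwriting each other.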
And for completeness, the DNS-only instance config is as follows:
domain=example.com
bind-dynamic
bogus-priv
hostsdir=/run/dnsmasq/hosts.d
localise-queries
no-hosts
no-resolv
local=/example.com/
server=127.0.0.1#5053
The hostsdir directive makes dnsmasq load all files from that directory as if
they were static host entries, and localise-queries ensures that when a name has
several addresses, each client gets the one on the subnet of the interface its
query arrived on.
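To see which names this actually matters for, the hosts directory can be scanned for hostnames that appear with more than one address. A small sketch, assuming each file holds a single line of the form `<ip> <full hostname> <short name>` as described above; list_multihomed is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print every full hostname that has more than one address in the
# hosts directory. list_multihomed is a hypothetical helper; the default
# directory matches the hostsdir used by the DNS instance.
list_multihomed() {
    dir="${1:-/run/dnsmasq/hosts.d}"
    # Concatenate all per-IP files, then group addresses by full hostname.
    cat "$dir"/* 2>/dev/null | awk '
        NF >= 2 && $1 !~ /^#/ { ips[$2] = ips[$2] " " $1; n[$2]++ }
        END { for (h in n) if (n[h] > 1) print h ":" ips[h] }
    '
}
```

Calling `list_multihomed` with no argument reads /run/dnsmasq/hosts.d; a host with leases on two VLANs shows up as one line with both of its addresses.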
dnscrypt-proxy
Finally, upstream DNS resolution is handled by dnscrypt-proxy (largely to get all
the nice modern goodies like DNS over HTTPS), which I run as a systemd
socket-activated service on 127.0.0.1:5053.
The default Debian package listens on 127.0.2.1:53, so I just had to override
the socket unit file to change the listen address and port:
[Socket]
ListenStream=
ListenDatagram=
ListenStream=127.0.0.1:5053
ListenDatagram=127.0.0.1:5053
This avoids conflicting with dnsmasq’s own DNS on port 53.
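One way to apply that override without an interactive `systemctl edit` is to write the drop-in file directly. A minimal sketch, assuming the unit is named dnscrypt-proxy.socket as in the Debian package and that drop-ins live under /etc/systemd/system/<unit>.d/; the write_socket_override function and the override.conf file name are made up for illustration.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that moves dnscrypt-proxy to 127.0.0.1:5053.
# write_socket_override is a hypothetical helper; the default directory
# assumes the Debian unit name dnscrypt-proxy.socket.
write_socket_override() {
    dir="${1:-/etc/systemd/system/dnscrypt-proxy.socket.d}"
    mkdir -p "$dir" || return 1
    # The empty Listen* lines clear the packaged addresses before new ones
    # are added; without them the old listeners would be kept as well.
    cat > "$dir/override.conf" <<'EOF'
[Socket]
ListenStream=
ListenDatagram=
ListenStream=127.0.0.1:5053
ListenDatagram=127.0.0.1:5053
EOF
}
```

After running `write_socket_override` as root, `systemctl daemon-reload` followed by `systemctl restart dnscrypt-proxy.socket` should make the socket listen on the new address.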
Anyways, I hope this will be useful to the unfortunate souls (human or otherwise) that decide that hosting their own dnsmasq is “probably just going to take 10 minutes” in the future.