It doesn't really pull its weight: it adds 577 KB to the binary and
is rarely useful.
Also, we now have static IPs and other connectivity paths coming
soon enough.
Updates #5853
Updates #1278
Updates tailscale/corp#32168
Change-Id: If336fed00a9c9ae9745419e6d81f7de6da6f7275
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This fixes a flaky test which has been occasionally timing out in CI.
In particular, this test times out if `watchFile` receives multiple
notifications from inotify before we cancel the test context. We block
processing the second notification, because we've stopped listening to
the `callbackDone` channel.
This patch changes the test so we only send on the first notification.
Testing this locally with `stress` confirms that the test is no longer
flaky.
Fixes#17172
Updates #14699
Signed-off-by: Alex Chan <alexc@tailscale.com>
The Tracker was using direct callbacks to ipnlocal. This PR moves those
to be triggered via the eventbus.
Additionally, the eventbus is now closed on exit from tailscaled
explicitly, and health is now a SubSystem in tsd.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This is step 4 of making syspolicy a build-time feature.
This adds a policyclient.Get() accessor to return the correct
implementation to use: either the real one, or the no-op one. (A third
type, a static one for testing, also exists, so in general a
policyclient.Client should be plumbed around and not always fetched
via policyclient.Get whenever possible, especially if tests need to use
alternate syspolicy)
Updates #16998
Updates #12614
Change-Id: Iaf19670744a596d5918acfa744f5db4564272978
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Step 3 in the series. See earlier cc532efc2000 and d05e6dc09e.
This step moves some types into a new leaf "ptype" package out of the
big "settings" package. The policyclient.Client will later get new
methods to return those things (as well as Duration and Uint64, which
weren't done at the time of the earlier prototype).
Updates #16998
Updates #12614
Change-Id: I4d72d8079de3b5351ed602eaa72863372bd474a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is step 2 of ~4, breaking up #14720 into reviewable chunks, with
the aim to make syspolicy be a build-time configurable feature.
Step 1 was #16984.
In this second step, the util/syspolicy/policyclient package is added
with the policyclient.Client interface. This is the interface that's
always present (regardless of build tags), and is what code around the
tree uses to ask syspolicy/MDM questions.
There are two implementations of policyclient.Client for now:
1) NoPolicyClient, which only returns default values.
2) the unexported, temporary 'globalSyspolicy', which is implemented
in terms of the global functions we wish to later eliminate.
This then starts to plumb around the policyclient.Client to most callers.
Future changes will plumb it more. When the last of the global func
callers are gone, then we can unexport the global functions and make a
proper policyclient.Client type and constructor in the syspolicy
package, removing the globalSyspolicy impl out of tsd.
The final change will sprinkle build tags in a few more places and
lock it in with dependency tests to make sure the dependencies don't
later creep back in.
Updates #16998
Updates #12614
Change-Id: Ib2c93d15c15c1f2b981464099177cd492d50391c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is step 1 of ~3, breaking up #14720 into reviewable chunks, with
the aim to make syspolicy be a build-time configurable feature.
In this first (very noisy) step, all the syspolicy string key
constants move to a new constant-only (code-free) package. This will
make future steps more reviewable, without this movement noise.
There are no code or behavior changes here.
The future steps of this series can be seen in #14720: removing global
funcs from syspolicy resolution and using an interface that's plumbed
around instead. Then adding build tags.
Updates #12614
Change-Id: If73bf2c28b9c9b1a408fe868b0b6a25b03eeabd1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Resolver.PreferGo didn't used to work on Windows.
It was fixed in 2022, though. (https://github.com/golang/go/issues/33097)
Updates #5161
Change-Id: I4e1aeff220ebd6adc8a14f781664fa6a2068b48c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
fixestailscale/corp#25612
We now keep track of any dns configurations which we could not
compile. This gives RecompileDNSConfig a configuration to
attempt to recompile and apply when the OS pokes us to indicate
that the interface dns servers have changed/updated. The manager config
will remain unset until we have the required information to compile
it correctly which should eliminate the problematic SERVFAIL
responses (especially on macOS 15).
This also removes the missingUpstreamRecovery func in the forwarder
which is no longer required now that we have proper error handling
and recovery manager and the client.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
In this PR, we make DNS registration behavior configurable via the EnableDNSRegistration policy setting.
We keep the default behavior unchanged, but allow admins to either enforce DNS registration and dynamic
DNS updates for the Tailscale interface, or prevent Tailscale from modifying the settings configured in
the network adapter's properties or by other means.
Updates #14917
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we make the "user-dial-routes" behavior default on all platforms except for iOS and Android.
It can be disabled by setting the TS_DNS_FORWARD_USE_ROUTES envknob to 0 or false.
Updates #12027
Updates #13837
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Android is Linux, but doesn't use Linux DNS managers (or D-Bus).
Updates #12614
Change-Id: I487802ac74a259cd5d2480ac26f7faa17ca8d1c3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds netx.DialFunc, unifying a type we have a bazillion other
places, giving it now a nice short name that's clickable in
editors, etc.
That highlighted that my earlier move (03b47a55c7956) of stuff from
nettest into netx moved too much: it also dragged along the memnet
impl, meaning all users of netx.DialFunc who just wanted netx for the
type definition were instead also pulling in all of memnet.
So move the memnet implementation netx.Network into memnet, a package
we already had.
Then use netx.DialFunc in a bunch of places. I'm sure I missed some.
And plenty remain in other repos, to be updated later.
Updates tailscale/corp#27636
Change-Id: I7296cd4591218e8624e214f8c70dab05fb884e95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
updates tailscale/corp#27145
We require a means to trigger a recompilation of the DNS configuration
to pick up new nameservers for platforms where we blend the interface
nameservers from the OS into our DNS config.
Notably, on Darwin, the only API we have at our disposal will, in rare instances,
return a transient error when querying the interface nameservers on a link change if
they have not been set when we get the AF_ROUTE messages for the link
update.
There's a corresponding change in corp for Darwin clients, to track
the interface namservers during NEPathMonitor events, and call this
when the nameservers change.
This will also fix the slightly more obscure bug of changing nameservers
while tailscaled is running. That change can now be reflected in
magicDNS without having to stop the client.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
We build up maps of both the existing MagicDNS configuration in hosts
and the desired MagicDNS configuration, compare the two, and only
write out a new one if there are changes. The comparison doesn't need
to be perfect, as the occasional false-positive is fine, but this
should greatly reduce rewrites of the hosts file.
I also changed the hosts updating code to remove the CRLF/LF conversion
stuff, and use Fprintf instead of Frintln to let us write those inline.
Updates #14428
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
We still use josharian/native (hi @josharian!) via
netlink, but I also sent https://github.com/mdlayher/netlink/pull/220
Updates #8632
Change-Id: I2eedcb7facb36ec894aee7f152c8a1f56d7fc8ba
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Importing the ~deprecated golang.org/x/exp/maps as "xmaps" to not
shadow the std "maps" was getting ugly.
And using slices.Collect on an iterator is verbose & allocates more.
So copy (x)maps.Keys+Values into our slicesx package instead.
Updates #cleanup
Updates #12912
Updates #14514 (pulled out of that change)
Change-Id: I5e68d12729934de93cf4a9cd87c367645f86123a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
During resolv.conf update, old 'search' lines are cleared but '\n' is not
deleted, leaving behind a new blank line on every update.
This adds 's' flag to regexp, so '\n' is included in the match and deleted when
old lines are cleared.
Also, insert missing `\n` when updated 'search' line is appended to resolv.conf.
Signed-off-by: Renato Aguiar <renato@renatoaguiar.net>
If multiple upstream DNS servers are available, quad-100 sends requests to all of them
and forwards the first successful response, if any. If no successful responses are received,
it propagates the first failure from any of them.
This PR adds some test coverage for these scenarios.
Updates #13571
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We currently have two executions paths where (*forwarder).forwardWithDestChan
returns nil, rather than an error, without sending a DNS response to responseChan.
These paths are accompanied by a comment that reads:
// Returning an error will cause an internal retry, there is
// nothing we can do if parsing failed. Just drop the packet.
But it is not (or no longer longer) accurate: returning an error from forwardWithDestChan
does not currently cause a retry.
Moreover, although these paths are currently unreachable due to implementation details,
if (*forwarder).forwardWithDestChan were to return nil without sending a response to
responseChan, it would cause a deadlock at one call site and a panic at another.
Therefore, we update (*forwarder).forwardWithDestChan to return errors in those two paths
and remove comments that were no longer accurate and misleading.
Updates #cleanup
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
If a DoH server returns an HTTP server error, rather than a SERVFAIL within
a successful HTTP response, we should handle it in the same way as SERVFAIL.
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
As per the docstring, (*forwarder).forwardWithDestChan should either send to responseChan
and returns nil, or returns a non-nil error (without sending to the channel).
However, this does not hold when all upstream DNS servers replied with an error.
We've been handling this special error path in (*Resolver).Query but not in (*Resolver).HandlePeerDNSQuery.
As a result, SERVFAIL responses from upstream servers were being converted into HTTP 503 responses,
instead of being properly forwarded as SERVFAIL within a successful HTTP response, as per RFC 8484, section 4.2.1:
A successful HTTP response with a 2xx status code (see Section 6.3 of [RFC7231]) is used for any valid DNS response,
regardless of the DNS response code. For example, a successful 2xx HTTP status code is used even with a DNS message
whose DNS response code indicates failure, such as SERVFAIL or NXDOMAIN.
In this PR we fix (*forwarder).forwardWithDestChan to no longer return an error when it sends a response to responseChan,
and remove the special handling in (*Resolver).Query, as it is no longer necessary.
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
Updates tailscale/tailscale#6148
This is the result of some observations we made today with @raggi. The DNS over HTTPS client currently doesn't cap the number of connections it uses, either in-use or idle. A burst of DNS queries will open multiple connections. Idle connections remain open for 30 seconds (this interval is defined in the dohTransportTimeout constant). For DoH providers like NextDNS which send keep-alives, this means the cellular modem will remain up more than expected to send ACKs if any keep-alives are received while a connection remains idle during those 30 seconds. We can set the IdleConnTimeout to 10 seconds to ensure an idle connection is terminated if no other DNS queries come in after 10 seconds. Additionally, we can cap the number of connections to 1. This ensures that at all times there is only one open DoH connection, either active or idle. If idle, it will be terminated within 10 seconds from the last query.
We also observed all the DoH providers we support are capable of TLS 1.3. We can force this TLS version to reduce the number of packets sent/received each time a TLS connection is established.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/tailscale#13326
Adds a CLI subcommand to perform DNS queries using the internal DNS forwarder and observe its internals (namely, which upstream resolvers are being used).
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
mdnsResponder at least as of macOS Sequoia does not find NXDOMAIN
responses to these dns-sd PTR queries acceptable unless they include the
question section in the response. This was found debugging #13511, once
we turned on additional diagnostic reporting from mdnsResponder we
witnessed:
```
Received unacceptable 12-byte response from 100.100.100.100 over UDP via utun6/27 -- id: 0x7F41 (32577), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 0/0/0/0,
```
If the response includes a question section, the resposnes are
acceptable, e.g.:
```
Received acceptable 59-byte response from 8.8.8.8 over UDP via en0/17 -- id: 0x2E55 (11861), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 1/0/0/0,
```
This may be contributing to an issue under diagnosis in #13511 wherein
some combination of conditions results in mdnsResponder no longer
answering DNS queries correctly to applications on the system for
extended periods of time (multiple minutes), while dig against quad-100
provides correct responses for those same domains. If additional debug
logging is enabled in mdnsResponder we see it reporting:
```
Penalizing server 100.100.100.100 for 60 seconds
```
It is also possible that the reason that macOS & iOS never "stopped
spamming" these queries is that they have never been replied to with
acceptable responses. It is not clear if this special case handling of
dns-sd PTR queries was ever beneficial, and given this evidence may have
always been harmful. If we subsequently observe that the queries settle
down now that they have acceptable responses, we should remove these
special cases - making upstream queries very occasionally isn't a lot of
battery, so we should be better off having to maintain less special
cases and avoid bugs of this class.
Updates #2442
Updates #3025
Updates #3363
Updates #3594
Updates #13511
Signed-off-by: James Tucker <james@tailscale.com>
We already disable dynamic updates by setting DisableDynamicUpdate to 1 for the Tailscale interface.
However, this does not prevent non-dynamic DNS registration from happening when `ipconfig /registerdns`
runs and in similar scenarios. Notably, dns/windowsManager.SetDNS runs `ipconfig /registerdns`,
triggering DNS registration for all interfaces that do not explicitly disable it.
In this PR, we update dns/windowsManager.disableDynamicUpdates to also set RegistrationEnabled to 0.
Fixes#13411
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Discovered this while investigating the following issue; I think it's
unrelated, but might as well fix it. Also, add a test helper for
checking things that have an IsZero method using the reflect package.
Updates tailscale/support-escalations#55
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I57b7adde43bcef9483763b561da173b4c35f49e2
Updates tailscale/tailscale#13326
This PR begins implementing a `tailscale dns` command group in the Tailscale CLI. It provides an initial implementation of `tailscale dns status` which dumps the state of the internal DNS forwarder.
Two new endpoints were added in LocalAPI to support the CLI functionality:
- `/netmap`: dumps a copy of the last received network map (because the CLI shouldn't have to listen to the ipn bus for a copy)
- `/dns-osconfig`: dumps the OS DNS configuration (this will be very handy for the UI clients as well, as they currently do not display this information)
My plan is to implement other subcommands mentioned in tailscale/tailscale#13326, such as `query`, in later PRs.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/tailscale#177
It appears that the OSS distribution of `tailscaled` is currently unable to get the current system base DNS configuration, as GetBaseConfig() in manager_darwin.go is unimplemented. This PR adds a basic implementation that reads the current values in `/etc/resolv.conf`, to at least unblock DNS resolution via Quad100 if `--accept-dns` is enabled.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
`DNS unavailable` was marked as a high severity warning. On Android (and other platforms), these trigger a system notification. Here we reduce the severity level to medium. A medium severity warning will still display the warning icon on platforms with a tray icon because of the `ImpactsConnectivity=true` flag being set here, but it won't show a notification anymore. If people enter an area with bad cellular reception, they're bound to receive so many of these notifications and we need to reduce notification fatigue.
Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
Troubleshooting DNS resolution issues often requires additional information.
This PR expands the effect of the TS_DEBUG_DNS_FORWARD_SEND envknob to forwarder.forwardWithDestChan,
and includes the request type, domain name length, and the first 3 bytes of the domain's SHA-256 hash in the output.
Fixes#13070
Signed-off-by: Nick Khyl <nickk@tailscale.com>
fixes tailscale#12968
The dns manager cleanup func was getting passed a nil
health tracker, which will panic. Fixed to pass it
the system health tracker.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
updates tailscale/corp#21823
Misconfigured, broken, or blocked DNS will often present as
"internet is broken'" to the end user. This plumbs the health tracker
into the dns manager and forwarder and adds a health warning
with a 5 second delay that is raised on failures in the forwarder and
lowered on successes.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This commit truncates any additional information (mainly hostnames) that's passed to controlD via DOH URL in DoHIPsOfBase.
This change is to make sure only resolverID is passed to controlDv6Gen but not the additional information.
Updates: #7946
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>