This required sharing the dropped packet metric between two packages
(tstun and magicsock), so I've moved its definition to util/usermetric.
Updates tailscale/corp#22075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
In an environment with unstable latency, such as upstream bufferbloat,
there are cases where a full netcheck could drop the prior preferred
DERP (likely home DERP) from future netcheck probe plans. This will then
likely result in a home DERP having a missing sample on the next
incremental netcheck, ultimately resulting in a home DERP move.
This change does not fix our overall response to highly unstable
latency, but it is an incremental improvement to prevent single spurious
samples during a full netcheck from alone triggering a flapping
condition, as now the prior changes to include historical latency will
still provide the desired resistance, and the home DERP should not move
unless latency is consistently worse over a 5 minute period.
Note that there is a nomenclature and semantics issue remaining in the
difference between a report preferred DERP and a home DERP. A report
preferred DERP is aspirational, it is what will be picked as a home DERP
if a home DERP connection needs to be established. A nodes home DERP may
be different than a recent preferred DERP, in which case a lot of
netcheck logic is fallible. In future enhancements much of the DERP move
logic should move to consider the home DERP, rather than recent report
preferred DERP.
Updates #8603
Updates #13969
Signed-off-by: James Tucker <james@tailscale.com>
During resolv.conf update, old 'search' lines are cleared but '\n' is not
deleted, leaving behind a new blank line on every update.
This adds 's' flag to regexp, so '\n' is included in the match and deleted when
old lines are cleared.
Also, insert missing `\n` when updated 'search' line is appended to resolv.conf.
Signed-off-by: Renato Aguiar <renato@renatoaguiar.net>
This allows us to print the time that a netcheck was run, which is
useful in debugging.
Updates #10972
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id48d30d4eb6d5208efb2b1526a71d83fe7f9320b
Few changes to resolve TODOs in the code:
- Instead of using a hardcoded IP, get it from the netmap.
- Use 100.100.100.100 as the gateway IP
- Use the /10 CGNAT range instead of a random /24
Updates #2589
Signed-off-by: Maisem Ali <maisem@tailscale.com>
It had bit-rotted likely during the transition to vector io in
76389d8baf. Tested on Ubuntu 24.04
by creating a netns and doing the DHCP dance to get an IP.
Updates #2589
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Updates tailscale/tailscale#13839
Adds a new blockblame package which can detect common MITM SSL certificates used by network appliances. We use this in `tlsdial` to display a dedicated health warning when we cannot connect to control, and a network appliance MITM attack is detected.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
GetReport() may have side effects when the caller enforces a deadline
that is shorter than ReportTimeout.
Updates #13783
Updates #13394
Signed-off-by: Jordan Whited <jordan@tailscale.com>
connstats currently increments the packet counter whenever it is called
to store a length of data, however when udp batch sending was introduced
we pass the length for a series of packages, and it is only incremented
ones, making it count wrongly if we are on a platform supporting udp
batches.
Updates tailscale/corp#22075
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
If multiple upstream DNS servers are available, quad-100 sends requests to all of them
and forwards the first successful response, if any. If no successful responses are received,
it propagates the first failure from any of them.
This PR adds some test coverage for these scenarios.
Updates #13571
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We currently have two executions paths where (*forwarder).forwardWithDestChan
returns nil, rather than an error, without sending a DNS response to responseChan.
These paths are accompanied by a comment that reads:
// Returning an error will cause an internal retry, there is
// nothing we can do if parsing failed. Just drop the packet.
But it is not (or no longer longer) accurate: returning an error from forwardWithDestChan
does not currently cause a retry.
Moreover, although these paths are currently unreachable due to implementation details,
if (*forwarder).forwardWithDestChan were to return nil without sending a response to
responseChan, it would cause a deadlock at one call site and a panic at another.
Therefore, we update (*forwarder).forwardWithDestChan to return errors in those two paths
and remove comments that were no longer accurate and misleading.
Updates #cleanup
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
If a DoH server returns an HTTP server error, rather than a SERVFAIL within
a successful HTTP response, we should handle it in the same way as SERVFAIL.
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
As per the docstring, (*forwarder).forwardWithDestChan should either send to responseChan
and returns nil, or returns a non-nil error (without sending to the channel).
However, this does not hold when all upstream DNS servers replied with an error.
We've been handling this special error path in (*Resolver).Query but not in (*Resolver).HandlePeerDNSQuery.
As a result, SERVFAIL responses from upstream servers were being converted into HTTP 503 responses,
instead of being properly forwarded as SERVFAIL within a successful HTTP response, as per RFC 8484, section 4.2.1:
A successful HTTP response with a 2xx status code (see Section 6.3 of [RFC7231]) is used for any valid DNS response,
regardless of the DNS response code. For example, a successful 2xx HTTP status code is used even with a DNS message
whose DNS response code indicates failure, such as SERVFAIL or NXDOMAIN.
In this PR we fix (*forwarder).forwardWithDestChan to no longer return an error when it sends a response to responseChan,
and remove the special handling in (*Resolver).Query, as it is no longer necessary.
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
Updates tailscale/tailscale#6148
This is the result of some observations we made today with @raggi. The DNS over HTTPS client currently doesn't cap the number of connections it uses, either in-use or idle. A burst of DNS queries will open multiple connections. Idle connections remain open for 30 seconds (this interval is defined in the dohTransportTimeout constant). For DoH providers like NextDNS which send keep-alives, this means the cellular modem will remain up more than expected to send ACKs if any keep-alives are received while a connection remains idle during those 30 seconds. We can set the IdleConnTimeout to 10 seconds to ensure an idle connection is terminated if no other DNS queries come in after 10 seconds. Additionally, we can cap the number of connections to 1. This ensures that at all times there is only one open DoH connection, either active or idle. If idle, it will be terminated within 10 seconds from the last query.
We also observed all the DoH providers we support are capable of TLS 1.3. We can force this TLS version to reduce the number of packets sent/received each time a TLS connection is established.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
I noticed while debugging a test failure elsewhere that our failure
logs (when verbosity is cranked up) were uselessly attributing dial
failures to failure to dial an invalid IP address (this IPv6 address
we didn't have), rather than showing me the actual IPv4 connection
failure.
Updates #13597 (tangentially)
Change-Id: I45ffbefbc7e25ebfb15768006413a705b941dae5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/tailscale#1634
Updates tailscale/tailscale#13265
Captive portal detection uses a custom `net.Dialer` in its `http.Client`. This custom Dialer ensures that the socket is bound specifically to the Wi-Fi interface. This is crucial because without it, if any default routes are set, the outgoing requests for detecting a captive portal would bypass Wi-Fi and go through the default route instead.
The Dialer did not have a Timeout property configured, so the default system timeout was applied. This caused issues in #13265, where we attempted to make captive portal detection requests over an IPsec interface used for Wi-Fi Calling. The call to `connect()` would fail and remain blocked until the system timeout (approximately 1 minute) was reached.
In #13598, I simply excluded the IPsec interface from captive portal detection. This was a quick and safe mitigation for the issue. This PR is a follow-up to make the process more robust, by setting a 3 seconds timeout on any connection establishment on any interface (this is the same timeout interval we were already setting on the HTTP client).
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
We were previously not checking that the external IP that we got back
from a UPnP portmap was a valid endpoint; add minimal validation that
this endpoint is something that is routeable by another host.
Updates tailscale/corp#23538
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id9649e7683394aced326d5348f4caa24d0efd532
Updates tailscale/tailscale#1634
Logs from some iOS users indicate that we're pointlessly performing captive portal detection on certain interfaces named ipsec*. These are tunnels with the cellular carrier that do not offer Internet access, and are only used to provide internet calling functionality (VoLTE / VoWiFi).
```
attempting to do captive portal detection on interface ipsec1
attempting to do captive portal detection on interface ipsec6
```
This PR excludes interfaces with the `ipsec` prefix from captive portal detection.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
this commit changes usermetrics to be non-global, this is a building
block for correct metrics if a go process runs multiple tsnets or
in tests.
Updates #13420
Updates tailscale/corp#22075
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Updates tailscale/tailscale#13326
Adds a CLI subcommand to perform DNS queries using the internal DNS forwarder and observe its internals (namely, which upstream resolvers are being used).
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
mdnsResponder at least as of macOS Sequoia does not find NXDOMAIN
responses to these dns-sd PTR queries acceptable unless they include the
question section in the response. This was found debugging #13511, once
we turned on additional diagnostic reporting from mdnsResponder we
witnessed:
```
Received unacceptable 12-byte response from 100.100.100.100 over UDP via utun6/27 -- id: 0x7F41 (32577), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 0/0/0/0,
```
If the response includes a question section, the resposnes are
acceptable, e.g.:
```
Received acceptable 59-byte response from 8.8.8.8 over UDP via en0/17 -- id: 0x2E55 (11861), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 1/0/0/0,
```
This may be contributing to an issue under diagnosis in #13511 wherein
some combination of conditions results in mdnsResponder no longer
answering DNS queries correctly to applications on the system for
extended periods of time (multiple minutes), while dig against quad-100
provides correct responses for those same domains. If additional debug
logging is enabled in mdnsResponder we see it reporting:
```
Penalizing server 100.100.100.100 for 60 seconds
```
It is also possible that the reason that macOS & iOS never "stopped
spamming" these queries is that they have never been replied to with
acceptable responses. It is not clear if this special case handling of
dns-sd PTR queries was ever beneficial, and given this evidence may have
always been harmful. If we subsequently observe that the queries settle
down now that they have acceptable responses, we should remove these
special cases - making upstream queries very occasionally isn't a lot of
battery, so we should be better off having to maintain less special
cases and avoid bugs of this class.
Updates #2442
Updates #3025
Updates #3363
Updates #3594
Updates #13511
Signed-off-by: James Tucker <james@tailscale.com>
netcheck.Client.GetReport() applies its own deadlines. This 2s deadline
was causing GetReport() to never fall back to HTTPS/ICMP measurements
as it was shorter than netcheck.stunProbeTimeout, leaving no time
for fallbacks.
Updates #13394
Updates #6187
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We already disable dynamic updates by setting DisableDynamicUpdate to 1 for the Tailscale interface.
However, this does not prevent non-dynamic DNS registration from happening when `ipconfig /registerdns`
runs and in similar scenarios. Notably, dns/windowsManager.SetDNS runs `ipconfig /registerdns`,
triggering DNS registration for all interfaces that do not explicitly disable it.
In this PR, we update dns/windowsManager.disableDynamicUpdates to also set RegistrationEnabled to 0.
Fixes#13411
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Disable TCP & UDP GRO if the probe fails.
torvalds/linux@e269d79c7d broke virtio_net
TCP & UDP GRO causing GRO writes to return EINVAL. The bug was then
resolved later in
torvalds/linux@89add40066. The offending
commit was pulled into various LTS releases.
Updates #13041
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Discovered this while investigating the following issue; I think it's
unrelated, but might as well fix it. Also, add a test helper for
checking things that have an IsZero method using the reflect package.
Updates tailscale/support-escalations#55
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I57b7adde43bcef9483763b561da173b4c35f49e2
Updates tailscale/tailscale#13326
This PR begins implementing a `tailscale dns` command group in the Tailscale CLI. It provides an initial implementation of `tailscale dns status` which dumps the state of the internal DNS forwarder.
Two new endpoints were added in LocalAPI to support the CLI functionality:
- `/netmap`: dumps a copy of the last received network map (because the CLI shouldn't have to listen to the ipn bus for a copy)
- `/dns-osconfig`: dumps the OS DNS configuration (this will be very handy for the UI clients as well, as they currently do not display this information)
My plan is to implement other subcommands mentioned in tailscale/tailscale#13326, such as `query`, in later PRs.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/tailscale#177
It appears that the OSS distribution of `tailscaled` is currently unable to get the current system base DNS configuration, as GetBaseConfig() in manager_darwin.go is unimplemented. This PR adds a basic implementation that reads the current values in `/etc/resolv.conf`, to at least unblock DNS resolution via Quad100 if `--accept-dns` is enabled.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Previously, despite what the commit said, we were using a raw IP socket
that was *not* an AF_PACKET socket, and thus was subject to the host
firewall rules. Switch to using a real AF_PACKET socket to actually get
the functionality we want.
Updates #13140
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If657daeeda9ab8d967e75a4f049c66e2bca54b78
This commit adds a new usermetric package and wires
up metrics across the tailscale client.
Updates tailscale/corp#22075
Co-authored-by: Anton Tolchanov <anton@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
In prep for updating to new staticcheck required for Go 1.23.
Updates #12912
Change-Id: If77892a023b79c6fa798f936fc80428fd4ce0673
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In 2f27319baf we disabled GRO due to a
data race around concurrent calls to tstun.Wrapper.Write(). This commit
refactors GRO to be thread-safe, and re-enables it on Linux.
This refactor now carries a GRO type across tstun and netstack APIs
with a lifetime that is scoped to a single tstun.Wrapper.Write() call.
In 25f0a3fc8f we used build tags to
prevent importation of gVisor's GRO package on iOS as at the time we
believed it was contributing to additional memory usage on that
platform. It wasn't, so this commit simplifies and removes those
build tags.
Updates tailscale/corp#22353
Updates tailscale/corp#22125
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
`DNS unavailable` was marked as a high severity warning. On Android (and other platforms), these trigger a system notification. Here we reduce the severity level to medium. A medium severity warning will still display the warning icon on platforms with a tray icon because of the `ImpactsConnectivity=true` flag being set here, but it won't show a notification anymore. If people enter an area with bad cellular reception, they're bound to receive so many of these notifications and we need to reduce notification fatigue.
Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
Coder has just adopted nhooyr/websocket which unfortunately changes the import path.
`github.com/coder/coder` imports `tailscale.com/net/wsconn` which was still pointing
to `nhooyr.io/websocket`, but this change updates it.
See https://coder.com/blog/websocket
Updates #13154
Change-Id: I3dec6512472b14eae337ae22c5bcc1e3758888d5
Signed-off-by: Kyle Carberry <kyle@carberry.com>
In particular, tests showing that #3824 works. But that test doesn't
actually work yet; it only gets a DERP connection. (why?)
Updates #13038
Change-Id: Ie1fd1b6a38d4e90fae7e72a0b9a142a95f0b2e8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Troubleshooting DNS resolution issues often requires additional information.
This PR expands the effect of the TS_DEBUG_DNS_FORWARD_SEND envknob to forwarder.forwardWithDestChan,
and includes the request type, domain name length, and the first 3 bytes of the domain's SHA-256 hash in the output.
Fixes#13070
Signed-off-by: Nick Khyl <nickk@tailscale.com>
I noticed a few places with custom http.Transport where we are not
closing idle connections when transport is no longer used.
Updates tailscale/corp#21609
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
We were copying 12 out of the 16 bytes which meant that
the 1:1 NAT required would only work if the last 4 bytes
happened to match between the new and old address, something
that our tests accidentally had. Fix it by copying the full
16 bytes and make the tests also verify the addr and use rand
addresses.
Updates #9511
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Updates tailscale/tailscale#1634
This PR optimizes captive portal detection on Android and iOS by excluding cellular data interfaces (`pdp*` and `rmnet`). As cellular networks do not present captive portals, frequent network switches between Wi-Fi and cellular would otherwise trigger captive detection unnecessarily, causing battery drain.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This commit implements TCP GRO for packets being written to gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported IP checksum functions.
gVisor is updated in order to make use of newly exported
stack.PacketBuffer GRO logic.
TCP throughput towards gVisor, i.e. TUN write direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement, sometimes as high as 2x. High bandwidth-delay product
paths remain receive window limited, bottlenecked by gVisor's default
TCP receive socket buffer size. This will be addressed in a follow-on
commit.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.
The first result is from commit 57856fc without TCP GRO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec 20 sender
[ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec receiver
The second result is from this commit with TCP GRO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec 20 sender
[ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec receiver
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
I updated the address parsing stuff to return a specific error for
unspecified hosts passed as empty strings, and look for that
when logging errors. I explicitly did not make parseAddress return a
netip.Addr containing an unspecified address because at this layer,
in the absence of any host, we don't necessarily know the address
family we're dealing with.
For the purposes of this code I think this is fine, at least until
we implement #12588.
Fixes#12979
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
It seems some security software or macOS itself might be MITMing TLS
(for ScreenTime?), so don't warn unless it fails x509 validation
against system roots.
Updates #3198
Change-Id: I6ea381b5bb6385b3d51da4a1468c0d803236b7bf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.
A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.
TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.
The first result is from commit 57856fc without TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender
[ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver
The second result is from this commit with TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fixestailscale/tailscale#12973
Updates tailscale/tailscale#1634
There was a logic issue in the captive detection code we shipped in https://github.com/tailscale/tailscale/pull/12707.
Assume a captive portal has been detected, and the user notified. Upon switching to another Wi-Fi that does *not* have a captive portal, we were issuing a signal to interrupt any pending captive detection attempt. However, we were not also setting the `captive-portal-detected` warnable to healthy. The result was that any "captive portal detected" alert would not be cleared from the UI.
Also fixes a broken log statement value.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
fixes tailscale#12968
The dns manager cleanup func was getting passed a nil
health tracker, which will panic. Fixed to pass it
the system health tracker.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
updates tailscale/corp#21823
Misconfigured, broken, or blocked DNS will often present as
"internet is broken'" to the end user. This plumbs the health tracker
into the dns manager and forwarder and adds a health warning
with a 5 second delay that is raised on failures in the forwarder and
lowered on successes.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Updates tailscale/corp#21949
As discussed with @raggi, this PR updates the static DERPMap embedded in the client to reflect the availability of HTTP on the DERP servers run by Tailscale.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/tailscale#1634
This PR introduces a new `captive-portal-detected` Warnable which is set to an unhealthy state whenever a captive portal is detected on the local network, preventing Tailscale from connecting.
ipn/ipnlocal: fix captive portal loop shutdown
Change-Id: I7cafdbce68463a16260091bcec1741501a070c95
net/captivedetection: fix mutex misuse
ipn/ipnlocal: ensure that we don't fail to start the timer
Change-Id: I3e43fb19264d793e8707c5031c0898e48e3e7465
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This commit truncates any additional information (mainly hostnames) that's passed to controlD via DOH URL in DoHIPsOfBase.
This change is to make sure only resolverID is passed to controlDv6Gen but not the additional information.
Updates: #7946
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
This adds a package with GP-related functions and types to be used in the future PRs.
It also updates nrptRuleDatabase to use the new package instead of its own gpNotificationWatcher implementation.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/corp#20677
The recover function wasn't getting set in the benchmark
tests. Default changed to an empty func.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
And some misc doc tweaks for idiomatic Go style.
Updates #cleanup
Change-Id: I3ca45f78aaca037f433538b847fd6a9571a2d918
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This turns the checklocks workflow into a real check, and adds
annotations to a few basic packages as a starting point.
Updates #12625
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2b0185bae05a843b5257980fc6bde732b1bdd93f
Previously, if we had a umask set (e.g. 0027) that prevented creating a
world-readable file, /etc/resolv.conf would be created without the o+r
bit and thus other users may be unable to resolve DNS.
Since a umask only applies to file creation, chmod the file after
creation and before renaming it to ensure that it has the appropriate
permissions.
Updates #12609
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2a05d64f4f3a8ee8683a70be17a7da0e70933137
Fixestailscale/corp#20677
Replaces the original attempt to rectify this (by injecting a netMon
event) which was both heavy handed, and missed cases where the
netMon event was "minor".
On apple platforms, the fetching the interface's nameservers can
and does return an empty list in certain situations. Apple's API
in particular is very limiting here. The header hints at notifications
for dns changes which would let us react ahead of time, but it's all
private APIs.
To avoid remaining in the state where we end up with no
nameservers but we absolutely need them, we'll react
to a lack of upstream nameservers by attempting to re-query
the OS.
We'll rate limit this to space out the attempts. It seems relatively
harmless to attempt a reconfig every 5 seconds (triggered
by an incoming query) if the network is in this broken state.
Missing nameservers might possibly be a persistent condition
(vs a transient error), but that would also imply that something
out of our control is badly misconfigured.
Tested by randomly returning [] for the nameservers. When switching
between Wifi networks, or cell->wifi, this will randomly trigger
the bug, and we appear to reliably heal the DNS state.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This is implemented via GetBestInterfaceEx. Should we encounter errors
or fail to resolve a valid, non-Tailscale interface, we fall back to
returning the index for the default interface instead.
Fixes#12551
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
I meant to do this in the earlier change and had a git fail.
To atone, add a test too while I'm here.
Updates #12486
Updates #12507
Change-Id: I4943b454a2530cb5047636f37136aa2898d2ffc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed we were allocating these every time when they could just
share the same memory. Rather than document ownership, just lock it
down with a view.
I was considering doing all of the fields but decided to just do this
one first as test to see how infectious it became. Conclusion: not
very.
Updates #cleanup (while working towards tailscale/corp#20514)
Change-Id: I8ce08519de0c9a53f20292adfbecd970fe362de0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For pprof cosmetic/confusion reasons more than performance, but it
might have tiny speed benefit.
Updates #12486
Change-Id: I40e03714f3afa3a7e7f5e1fa99b81c7e889b91b6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So profiles show more useful names than just func1, func2, func3, etc.
There will still be func1 on them all, but the symbol before will say
what the lookup type is.
Updates #12486
Change-Id: I910b024a7861394eb83d07f5a899eae338cb1f22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This moves NewContainsIPFunc from tsaddr to new ipset package.
And wgengine/filter types gets split into wgengine/filter/filtertype,
so netmap (and thus the CLI, etc) doesn't need to bring in ipset,
bart, etc.
Then add a test making sure the CLI deps don't regress.
Updates #1278
Change-Id: Ia246d6d9502bbefbdeacc4aef1bed9c8b24f54d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
NewContainsIPFunc was previously documented as performing poorly if
there were many netip.Prefixes to search over. As such, we never it used it
in such cases.
This updates it to use bart at a certain threshold (over 6 prefixes,
currently), at which point the bart lookup overhead pays off.
This is currently kinda useless because we're not using it. But now we
can and get wins elsewhere. And we can remove the caveat in the docs.
goos: darwin
goarch: arm64
pkg: tailscale.com/net/tsaddr
│ before │ after │
│ sec/op │ sec/op vs base │
NewContainsIPFunc/empty-8 2.215n ± 11% 2.239n ± 1% +1.08% (p=0.022 n=10)
NewContainsIPFunc/cidr-list-1-8 17.44n ± 0% 17.59n ± 6% +0.89% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-2-8 27.85n ± 0% 28.13n ± 1% +1.01% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-3-8 36.05n ± 0% 36.56n ± 13% +1.41% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-4-8 43.73n ± 0% 44.38n ± 1% +1.50% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-5-8 51.61n ± 2% 51.75n ± 0% ~ (p=0.101 n=10)
NewContainsIPFunc/cidr-list-10-8 95.65n ± 0% 68.92n ± 0% -27.94% (p=0.000 n=10)
NewContainsIPFunc/one-ip-8 4.466n ± 0% 4.469n ± 1% ~ (p=0.491 n=10)
NewContainsIPFunc/two-ip-8 8.002n ± 1% 7.997n ± 4% ~ (p=0.697 n=10)
NewContainsIPFunc/three-ip-8 27.98n ± 1% 27.75n ± 0% -0.82% (p=0.012 n=10)
geomean 19.60n 19.07n -2.71%
Updates #12486
Change-Id: I2e2320cc4384f875f41721374da536bab995c1ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Without this rule, Windows 8.1 and newer devices issue parallel DNS requests to DNS servers
associated with all network adapters, even when "Override local DNS" is enabled and/or
a Mullvad exit node is being used, resulting in DNS leaks.
This also adds "disable-local-dns-override-via-nrpt" nodeAttr that can be used to disable
the new behavior if needed.
Fixestailscale/corp#20718
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/tailscale#4136
This PR is the first round of work to move from encoding health warnings as strings and use structured data instead. The current health package revolves around the idea of Subsystems. Each subsystem can have (or not have) a Go error associated with it. The overall health of the backend is given by the concatenation of all these errors.
This PR polishes the concept of Warnable introduced by @bradfitz a few weeks ago. Each Warnable is a component of the backend (for instance, things like 'dns' or 'magicsock' are Warnables). Each Warnable has a unique identifying code. A Warnable is an entity we can warn the user about, by setting (or unsetting) a WarningState for it. Warnables have:
- an identifying Code, so that the GUI can track them as their WarningStates come and go
- a Title, which the GUIs can use to tell the user what component of the backend is broken
- a Text, which is a function that is called with a set of Args to generate a more detailed error message to explain the unhappy state
Additionally, this PR also begins to send Warnables and their WarningStates through LocalAPI to the clients, using ipn.Notify messages. An ipn.Notify is only issued when a warning is added or removed from the Tracker.
In a next PR, we'll get rid of subsystems entirely, and we'll start using structured warnings for all errors affecting the backend functionality.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This commit introduces a userspace program for managing an experimental
eBPF XDP STUN server program. derp/xdp contains the eBPF pseudo-C along
with a Go pkg for loading it and exporting its metrics.
cmd/xdpderper is a package main user of derp/xdp.
Updates tailscale/corp#20689
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fixestailscale/corp#20677
On macOS sleep/wake, we're encountering a condition where reconfigure the network
a little bit too quickly - before apple has set the nameservers for our interface.
This results in a persistent condition where we have no upstream resolver and
fail all forwarded DNS queries.
No upstream nameservers is a legitimate configuration, and we have no (good) way
of determining when Apple is ready - but if we need to forward a query, and we
have no nameservers, then something has gone badly wrong and the network is
very broken.
A simple fix here is to simply inject a netMon event, which will go through the
configuration dance again when we hit the SERVFAIL condition.
Tested by artificially/randomly returning [] for the list of nameservers in the bespoke
ipn-bridge code responsible for getting the nameservers.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
As an alterative to #11935 using #12003.
Updates #11935
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I05f643fe812ceeaec5f266e78e3e529cab3a1ac3
When we're starting child processes on Windows that are CLI programs that
don't need to output to a console, we should pass in DETACHED_PROCESS as a
CreationFlag on SysProcAttr. This prevents the OS from even creating a console
for the child (and paying the associated time/space penalty for new conhost
processes). This is more efficient than letting the OS create the console
window and then subsequently trying to hide it, which we were doing at a few
callsites.
Fixes#12270
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
As quad-100 is an authoritative server for 4via6 domains, it should always return responses
with a response code of 0 (indicating no error) when resolving records for these domains.
If there's no resource record of the specified type (e.g. A), it should return a response
with an empty answer section rather than NXDomain. Such a response indicates that there
is at least one RR of a different type (e.g., AAAA), suggesting the Windows stub resolver
to look for it.
Fixestailscale/corp#20767
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Add a new TS_EXPERIMENTAL_ENABLE_FORWARDING_OPTIMIZATIONS env var
that can be set for tailscale/tailscale container running as
a subnet router or exit node to enable UDP GRO forwarding
for improved performance.
See https://tailscale.com/kb/1320/performance-best-practices#linux-optimizations-for-subnet-routers-and-exit-nodes
This is currently considered an experimental approach;
the configuration support is partially to allow further experimentation
with containerized environments to evaluate the performance
improvements.
Updates tailscale/tailscale#12295
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Updates corp#15802.
Adds the ability for control to disable the recently added change that uses split DNS in more cases on iOS. This will allow us to disable the feature if it leads to regression in production. We plan to remove this knob once we've verified that the feature works properly.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This bug was introduced in e6b84f215 (May 2020) but was only used in
tests when stringifying probeProto values on failure so it wasn't
noticed for a long time.
But then it was moved into non-test code in 8450a18aa (Jun 2024) and I
didn't notice during the code movement that it was wrong. It's still
only used in failure paths in logs, but having wrong/ambiguous
debugging information isn't the best.
Whoops.
Updates tailscale/corp#20654
Change-Id: I296c727ed1c292a04db7b46ecc05c07fc1abc774
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds a new ListenPacket function on tsnet.Server
which acts mostly like `net.ListenPacket`.
Unlike `Server.Listen`, this requires listening on a
specific IP and does not automatically listen on both
V4 and V6 addresses of the Server when the IP is unspecified.
To test this, it also adds UDP support to tsdial.Dialer.UserDial
and plumbs it through the localapi. Then an associated test
to make sure the UDP functionality works from both sides.
Updates #12182
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Updates https://github.com/tailscale/corp/issues/15802.
On iOS exclusively, this PR adds logic to use a split DNS configuration in more cases, with the goal of improving battery life. Acting as the global DNS resolver on iOS should be avoided, as it leads to frequent wakes of IPNExtension.
We try to determine if we can have Tailscale only handle DNS queries for resources inside the tailnet, that is, all routes in the DNS configuration do not require a custom resolver (this is the case for app connectors, for instance).
If so, we set all Routes as MatchDomains. This enables a split DNS configuration which will help preserve battery life. Effectively, for the average Tailscale user who only relies on MagicDNS to resolve *.ts.net domains, this means that Tailscale DNS will only be used for those domains.
This PR doesn't affect users with Override Local DNS enabled. For these users, there should be no difference and Tailscale will continue acting as a global DNS resolver.
Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
Palo Alto reported interpreting hairpin probes as LAND attacks, and the
firewalls may be responding to this by shutting down otherwise in use NAT sessions
prematurely. We don't currently make use of the outcome of the hairpin
probes, and they contribute to other user confusion with e.g. the
AirPort Extreme hairpin session workaround. We decided in response to
remove the whole probe feature as a result.
Updates #188
Updates tailscale/corp#19106
Updates tailscale/corp#19116
Signed-off-by: James Tucker <james@tailscale.com>
Palo Alto firewalls have a typically hard NAT, but also have a mode
called Persistent DIPP that is supposed to provide consistent port
mapping suitable for STUN resolution of public ports. Persistent DIPP
works initially on most Palo Alto firewalls, but some models/software
versions have a bug which this works around.
The bug symptom presents as follows:
- STUN sessions resolve a consistent public IP:port to start with
- Much later netchecks report the same IP:Port for a subset of
sessions, most often the users active DERP, and/or the port related
to sustained traffic.
- The broader set of DERPs in a full netcheck will now consistently
observe a new IP:Port.
- After this point of observation, new inbound connections will only
succeed to the new IP:Port observed, and existing/old sessions will
only work to the old binding.
In this patch we now advertise the lowest latency global endpoint
discovered as we always have, but in addition any global endpoints that
are observed more than once in a single netcheck report. This should
provide viable endpoints for potential connection establishment across
a NAT with this behavior.
Updates tailscale/corp#19106
Signed-off-by: James Tucker <james@tailscale.com>
In this commit I updated the Ipv6 range we use to generate Control D DOH ip, we were using the NextDNSRanges to generate Control D DOH ip, updated to use the correct range.
Updates: #7946
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
In a configuration where the local node (ip1) has a different IP (ip2)
that it uses to communicate with a peer (ip3) we would do UDP flow
tracking on the `ip2->ip3` tuple. When we receive the response from
the peer `ip3->ip2` we would dnat it back to `ip3->ip1` which would
then not match the flow track state and the packet would get dropped.
To fix this, we should do flow tracking on the `ip1->ip3` tuple instead
of `ip2->ip3` which requires doing SNAT after the running filterPacketOutboundToWireGuard.
Updates tailscale/corp#19971, tailscale/corp#8020
Signed-off-by: Maisem Ali <maisem@tailscale.com>
It was documented as such but seems to have been dropped in a
refactor, restore the behavior. This brings down the time it
takes to run a single integration test by 2s which adds up
quite a bit.
Updates tailscale/corp#19786
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This adds a new bool that can be sent down from control
to do jailing on the client side. Previously this would
only be done from control by modifying the packet filter
we sent down to clients. This would result in a lot of
additional work/CPU on control, we could instead just
do this on the client. This has always been a TODO which
we keep putting off, might as well do it now.
Updates tailscale/corp#19623
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This plumbs a packet filter for jailed nodes through to the
tstun.Wrapper; the filter for a jailed node is equivalent to a "shields
up" filter. Currently a no-op as there is no way for control to
tell the client whether a peer is jailed.
Updates tailscale/corp#19623
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Change-Id: I5ccc5f00e197fde15dd567485b2a99d8254391ad
Now that tsdial.Dialer.UserDial has been updated to honor the configured routes
and dial external network addresses without going through Tailscale, while also being
able to dial a node/subnet router on the tailnet, we can start using UserDial to forward
DNS requests. This is primarily needed for DNS over TCP when forwarding requests
to internal DNS servers, but we also update getKnownDoHClientForProvider to use it.
Updates tailscale/corp#18725
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This refactors the peerConfig struct to allow storing more
details about a peer and not just the masq addresses. To be
used in a follow up change.
As a side effect, this also makes the DNAT logic on the inbound
packet stricter. Previously it would only match against the packets
dst IP, not it also takes the src IP into consideration. The beahvior
is at parity with the SNAT case.
Updates tailscale/corp#19623
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Change-Id: I5f40802bebbf0f055436eb8824e4511d0052772d
We'd like to use tsdial.Dialer.UserDial instead of SystemDial for DNS over TCP.
This is primarily necessary to properly dial internal DNS servers accessible
over Tailscale and subnet routes. However, to avoid issues when switching
between Wi-Fi and cellular, we need to ensure that we don't retain connections
to any external addresses on the old interface. Therefore, we need to determine
which dialer to use internally based on the configured routes.
This plumbs routes and localRoutes from router.Config to tsdial.Dialer,
and updates UserDial to use either the peer dialer or the system dialer,
depending on the network address and the configured routes.
Updates tailscale/corp#18725
Fixes#4529
Signed-off-by: Nick Khyl <nickk@tailscale.com>
While debugging a failing test in airplane mode on macOS, I noticed
netcheck logspam about ICMP socket creation permission denied errors.
Apparently macOS just can't do those, or at least not in airplane
mode. Not worth spamming about.
Updates #cleanup
Change-Id: I302620cfd3c8eabb25202d7eef040c01bd8a843c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To aid in debugging exactly what's going wrong, instead of the
not-particularly-useful "dns udp query: context deadline exceeded" error
that we currently get.
Updates #3786
Updates #10768
Updates #11620
(etc.)
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76334bf0681a8a2c72c90700f636c4174931432c
So that we can use this for additional, non-NAT configuration without it
being confusing.
Updates #cleanup
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1658d59c9824217917a94ee76d2d08f0a682986f
This was a holdover from the older, pre-BART days and is no longer
necessary.
Updates #cleanup
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I71b892bab1898077767b9ff51cef33d59c08faf8
Updates tailscale/corp#18960
Tests in corp called us using the wrong logging calls. Removed.
This is logged downstream anyway.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Updates tailscale/corp#18960
iOS uses Apple's NetworkMonitor to track the default interface and
there's no reason we shouldn't also use this on macOS, for the same
reasons noted in the comments for why this change was made on iOS.
This eliminates the need to load and parse the routing table when
querying the defaultRouter() in almost all cases.
A slight modification here (on both platforms) to fallback to the default
BSD logic in the unhappy-path rather than making assumptions that
may not hold. If netmon is eventually parsing AF_ROUTE and able
to give a consistently correct answer for the default interface index,
we can fall back to that and eliminate the Swift dependency.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Certain device drivers (e.g. vxlan, geneve) do not properly handle
coalesced UDP packets later in the stack, resulting in packet loss.
Updates #11026
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In prep for most of the package funcs in net/interfaces to become
methods in a long-lived netmon.Monitor that can cache things. (Many
of the funcs are very heavy to call regularly, whereas the long-lived
netmon.Monitor can subscribe to things from the OS and remember
answers to questions it's asked regularly later)
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: Ie4e8dedb70136af2d611b990b865a822cd1797e5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
... in prep for merging the net/interfaces package into net/netmon.
This is a no-op change that updates a bunch of the API signatures ahead of
a future change to actually move things (and remove the type alias)
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: I477613388f09389214db0d77ccf24a65bff2199c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached. But first (this change and others)
we need to make sure the one netmon.Monitor is plumbed everywhere.
Some notable bits:
* tsdial.NewDialer is added, taking a now-required netmon
* because a tsdial.Dialer always has a netmon, anything taking both
a Dialer and a NetMon is now redundant; take only the Dialer and
get the NetMon from that if/when needed.
* netmon.NewStatic is added, primarily for tests
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This has been a TODO for ages. Time to do it.
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached.
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: I60fc6508cd2d8d079260bda371fc08b6318bcaf1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I'm working on moving all network state queries to be on
netmon.Monitor, removing old APIs.
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: If0de137e0e2e145520f69e258597fb89cf39a2a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds a health.Tracker to tsd.System, accessible via
a new tsd.System.HealthTracker method.
In the future, that new method will return a tsd.System-specific
HealthTracker, so multiple tsnet.Servers in the same process are
isolated. For now, though, it just always returns the temporary
health.Global value. That permits incremental plumbing over a number
of changes. When the second to last health.Global reference is gone,
then the tsd.System.HealthTracker implementation can return a private
Tracker.
The primary plumbing this does is adding it to LocalBackend and its
dozen and change health calls. A few misc other callers are also
plumbed. Subsequent changes will flesh out other parts of the tree
(magicsock, controlclient, etc).
Updates #11874
Updates #4136
Change-Id: Id51e73cfc8a39110425b6dc19d18b3975eac75ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously it was both metadata about the class of warnable item as
well as the value.
Now it's only metadata and the value is per-Tracker.
Updates #11874
Updates #4136
Change-Id: Ia1ed1b6c95d34bc5aae36cffdb04279e6ba77015
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This moves most of the health package global variables to a new
`health.Tracker` type.
But then rather than plumbing the Tracker in tsd.System everywhere,
this only goes halfway and makes one new global Tracker
(`health.Global`) that all the existing callers now use.
A future change will eliminate that global.
Updates #11874
Updates #4136
Change-Id: I6ee27e0b2e35f68cb38fecdb3b2dc4c3f2e09d68
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If this happens, it results in us pessimistically closing more
connections than might be necessary, but is more correct since we won't
"miss" a change to the default route interface and keep trying to send
data over a nonexistent interface, or one that can't reach the internet.
Updates tailscale/corp#19124
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia0b8b04cb8cdcb0da0155fd08751c9dccba62c1a
-Move Android impl into interfaces_android.go
-Instead of using ip route to get the interface name, use the one passed in by Android (ip route is restricted in Android 13+ per termux/termux-app#2993)
Follow-up will be to do the same for router
Fixestailscale/corp#19215Fixestailscale/corp#19124
Signed-off-by: kari-ts <kari@tailscale.com>
This ensures that we close the underlying connection(s) when a major
link change happens. If we don't do this, on mobile platforms switching
between WiFi and cellular can result in leftover connections in the
http.Client's connection pool which are bound to the "wrong" interface.
Updates #10821
Updates tailscale/corp#19124
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibd51ce2efcaf4bd68e14f6fdeded61d4e99f9a01
This wasn't previously handling the case where an interface in s2 was
removed and not present in s1, and would cause the Equal method to
incorrectly return that the states were equal.
Updates tailscale/corp#19124
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I3af22bc631015d1ddd0a1d01bfdf312161b9532d
It should've been deleted in 11ece02f52.
Updates #9040
Change-Id: If8a136bdb6c82804af658c9d2b0a8c63ce02d509
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
At least in userspace-networking mode.
Fixes#11361
Change-Id: I78d33f0f7e05fe9e9ee95b97c99b593f8fe498f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were being too aggressive when deciding whether to write our NRPT rules
to the local registry key or the group policy registry key.
After once again reviewing the document which calls itself a spec
(see issue), it is clear that the presence of the DnsPolicyConfig subkey
is the important part, not the presence of values set in the DNSClient
subkey. Furthermore, a footnote indicates that the presence of
DnsPolicyConfig in the GPO key will always override its counterpart in
the local key. The implication of this is important: we may unconditionally
write our NRPT rules to the local key. We copy our rules to the policy
key only when it contains NRPT rules belonging to somebody other than us.
Fixes https://github.com/tailscale/corp/issues/19071
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This removes a potentially increased boot delay for certain boot
topologies where they block on ExecStartPre that may have socket
activation dependencies on other system services (such as
systemd-resolved and NetworkManager).
Also rename cleanup to clean up in affected/immediately nearby places
per code review commentary.
Fixes#11599
Signed-off-by: James Tucker <james@tailscale.com>
At least in the case of dialing a Tailscale IP.
Updates #4529
Change-Id: I9fd667d088a14aec4a56e23aabc2b1ffddafa3fe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates #7946
[@bradfitz fixed up version of #8417]
Change-Id: I1dbf6fa8d525b25c0d7ad5c559a7f937c3cd142a
Signed-off-by: alexelisenko <39712468+alexelisenko@users.noreply.github.com>
Signed-off-by: Alex Paguis <alex@windscribe.com>
The netcheck package and the magicksock package coordinate via the
health package, but both sides have time based heuristics through
indirect dependencies. These were misaligned, so the implemented
heuristic aimed at reducing DERP moves while there is active traffic
were non-operational about 3/5ths of the time.
It is problematic to setup a good test for this integration presently,
so instead I added comment breadcrumbs along with the initial fix.
Updates #8603
Signed-off-by: James Tucker <james@tailscale.com>