10476 Commits

Author SHA1 Message Date
Brad Fitzpatrick
00a08ea86d control/tsp: add lite map update support
Updates #12542
Updates tailscale/corp#40088

Change-Id: Idb4526f1bf1f3f424d6fb3d7e34ebe89a474b57b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-17 04:19:50 -07:00
Tom Proctor
c2da563fef
tstest/integration/vms: skip cloud-init package updates (#19443)
The package updates started getting really slow yesterday. We can do
better, but attempt a band aid fix for now, as the test is failing about
a third of the time on PR CI.

Updates tailscale/corp#40465

Change-Id: Icf53292ba83dd1ed76b9bdf9fb94a8f6fb448c07

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2026-04-17 10:39:47 +01:00
Brad Fitzpatrick
50d7176333 control/tsp, cmd/tsp: add low-level Tailscale protocol client and tool
Add a new control/tsp package providing a client for speaking the
Tailscale protocol to a coordination server over Noise, along with a
cmd/tsp binary exposing it as a low-level composable tool for
generating keys, registering nodes, and issuing map requests.

Previously developed out-of-tree at github.com/bradfitz/tsp; imported
here without git history.

Updates #12542

Change-Id: I6ad21143c4aefe8939d4a46ae65b2184173bf69f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-16 20:00:25 -07:00
Jordan Whited
69572c7435 derp/derpserver: add rate limit config metrics
Updates tailscale/corp#40421

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-16 12:48:41 -07:00
Michael Ben-Ami
1dc08f4d41 appc,feature/conn25: prevent clients from forwarding DNS requests and
modifying DNS responses for domains they are also connectors for

For Connectors 2025, determine if a client is configured as a
connector and what domains it is a connector for. When acting as a
client, don't install Split DNS routes to other connectors for those
domains, and don't alter DNS responses for those domains. The responses
are forwarded back to the original client, which in turn does the alteration,
swapping the real IP for a Magic IP.

A client is also a connector for a domain if it has tags that overlap
with tags in the configured policy, and --advertise-connector=true
in the prefs (not in the self-node Hostinfo from the netmap). We use the prefs
as the source of truth because control only gets a copy from the prefs, and
may drift. And the AppConnector field is currently zeroed out in the
self-node Hostinfo from control.

The extension adds a ProfileStateChange hook to process prefs changes,
and the config type is split into prefs and nodeview sub-configs.

Fixes tailscale/corp#39317

Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-04-16 09:41:54 -04:00
Alex Chan
4f47c3c93d ipn/ipnlocal: log AUM hash on startup as base32, not hex
Before:

    tka initialized at head 325557575a59525354484e4a534f494b4c4e56575435583737564b5036584c4d4c335534554255344c344c36484c5a444a323341

After:

    tka initialized at head 2UWWZYRSTHNJSOIKLNVWT5X77VKP6XLML3U4UBU4L4L6HLZDJ23A

Printing the AUM hash as hex makes it difficult to compare to other AUM
hashes; stringifying it will make it consistent with other printing.

Updates #cleanup

Change-Id: Ic1e23a9ce6a71a53cff7d2190f9fa06eb838ab89
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-16 13:45:29 +01:00
Alex Valiushko
d3ba1480f5
magicsock: invalidate endpoint on trust timeout (#19415)
Endpoint's best address was cleared on trustBestAddrUntil expiry
only if it was a udprelay connection. This generalizes invalidation
to also cover direct UDP.

Trust deadline is checked in two cases:

On disco ping timeout from the endpoint's best address.
Traffic goes DERP-only, heartbeats to the old address stop.
The discovery pings are still in flight, handled by the following.

On disco ping success from an alternative. BestAddr switches to the
working path, trust refreshed, eager discovery stops. The still
in flight pongs are handled by betterAddr().

Updates #19407


Change-Id: Ic41ed18edb4a6e4350a2d49271ba01566a6a6964

Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
2026-04-15 19:22:07 -07:00
Brad Fitzpatrick
b39ee0445d util/httpm: open .git/index to defeat Go test caching
TestUsedConsistently shells out to git grep to find forbidden
http.Method* uses across the repo. Since the test itself doesn't
open any repo files, Go's test cache considers it unchanged
between commits and serves stale passing results even when new
violations are introduced.

Fix by opening .git/index, which makes Go's test cache track it
as an input. The index file changes on git reset, checkout, pull,
etc., so the cache is properly invalidated when moving between
commits.

Updates tailscale/corp#40359

Change-Id: If1497b992a545351bdd68cff279d60f5591fe70b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-15 15:44:19 -07:00
David Bond
eea39eaf52
cmd/k8s-operator: add affinity rules to DNSConfig (#19360)
This commit modifies the `DNSConfig` custom resource to allow the
user to specify affinity rules on the nameserver pods.

Updates: https://github.com/tailscale/tailscale/issues/18556

Signed-off-by: David Bond <davidsbond93@gmail.com>
2026-04-15 22:39:04 +01:00
Jonathan Nobels
acc43356c6
control/controlclient: enable request signatures on macOS (#19317)
fixes tailscale/corp#39422

Updates tailscale/certstore for properly macOS support and
builds the request signing support into macOS builds.  iOS and builds
that do not use cGo are omitted.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-04-15 14:11:14 -04:00
M. J. Fromberger
1e4934659b
ipn/ipnlocal: discard cached netmaps upon panic during SetNetworkMap (#19414)
For debugging purposes, unstable builds will sometimes intentionally panic for
unexpected behaviours. We observed such a panic after loading a cached netmap,
but because we had a valid cached map, the client was unable to recover on its
own and the operator had to manually reset the cache.

As a defensive hedge, when netmap caching is enabled, check for a panic during
installation of a net network map: If one occurs, discard any cached netmaps
before letting the panic unwind, so that we do not lose the panic itself, but
reduce the need for manual intervention.

Updates #12639
Updates tailscale/corp#27300

Change-Id: I0436889c6bdc2fa728c9cb83630cd7b00a72ce68
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-04-15 11:07:42 -07:00
Anton Tolchanov
958bcda5bf control/controlclient: handle 429 responses during node registration
If we get a 429 response during node registration, use the `Retry-After`
header for backoff instead of the regular exponential backoff.

The rate limiter error is propagated to the user, just like other
registration errors are, e.g.

```
$ tailscale up
backend error: node registration rate limited; will retry after 57s
exit status 1
```

Updates tailscale/corp#39533

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2026-04-15 18:54:08 +01:00
Jordan Whited
d8190e0de5 derp/derpserver: implement hierarchical token bucket rate limiting
By adding a server-global parent bucket. Per-client rate limiting is
subject to the parent bucket if global rate limiting is enabled.

This implementation is experimental, and all related APIs should be
considered unstable.

Updates tailscale/corp#40291

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-15 09:06:03 -07:00
Tom Meadows
5eb0b4be31
cmd/containerboot,cmd/k8s-proxy,kube: add authkey renewal to k8s-proxy (#19221)
* kube/authkey,cmd/containerboot: extract shared auth key reissue package

Move auth key reissue logic (set marker, wait for new key, clear marker,
read config) into a shared kube/authkey package and update containerboot
to use it. No behaviour change.

Updates #14080

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

* kube/authkey,kube/state,cmd/containerboot: preserve device_id across restarts

Stop clearing device_id, device_fqdn, and device_ips from state on startup.
These keys are now preserved across restarts so the operator can track
device identity. Expand ClearReissueAuthKey to clear device state and
tailscaled profile data when performing a full auth key reissue.

Updates #14080

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

* cmd/containerboot: use root context for auth key reissue wait

Pass the root context instead of bootCtx to setAndWaitForAuthKeyReissue.
The 60-second bootCtx timeout was cancelling the reissue wait before the
operator had time to respond, causing the pod to crash-loop.

Updates #14080

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

* cmd/k8s-proxy: add auth key renewal support

Add auth key reissue handling to k8s-proxy, mirroring containerboot.
When the proxy detects an auth failure (login-state health warning or
NeedsLogin state), it disconnects from control, signals the operator
via the state Secret, waits for a new key, clears stale state, and
exits so Kubernetes restarts the pod with the new key.

A health watcher goroutine runs alongside ts.Up() to short-circuit
the startup timeout on terminal auth failures.

Updates #14080

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

---------

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2026-04-15 16:13:46 +01:00
Brad Fitzpatrick
dbf468740b control/controlclient: add patchify miss stats
Add an opt-in metrics.LabelMap tracking why patchifyPeer fails to
convert a PeersChanged entry into a PeersChangedPatch. The stats are
gated behind the TS_DEBUG_PATCHIFY_PEER_MISS envknob so there is zero
overhead in normal operation.

peerChangeDiff now takes an optional onFalse callback that is called
with the field name on every non-patchable return path. When the
envknob is off, nil is passed and replaced with a no-op at the top of
peerChangeDiff.

The resulting metric renders as:

    counter_patchify_miss{why="Hostinfo"} 2
    counter_patchify_miss{why="peer_not_found"} 1170

Updates tailscale/corp#40088

Change-Id: I2d4b9074bf42ec03ab296c0629a54106bafa873e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-15 08:05:57 -07:00
Claus Lensbøl
61c95f409c
control/controlclient: accept key if last seen on exist node is absent (#19402)
On some nodes (found via natlab), the existing nodes last seen could be
unset. For these cases, we would want to accept the key and write a last
seen. This was breaking the cached netmap natlab tests.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-15 03:53:40 -04:00
Avery Pennarun
effbe67fe3 wgengine/magicsock: remove pickPort, use port 0 to avoid TOCTOU race
pickPort would bind a UDP socket on :0 to get a free port, close
the socket, then hope to rebind to the same port in NewConn. This
is a TOCTOU race that can cause flaky test failures when another
process grabs the port in between.

Instead, pass Port: 0 to NewConn and let the OS assign the port
atomically, then read back the assigned port via conn.LocalPort().

Fixes #19409

Change-Id: Ie44b599fb93c361e29a05f2171ad747c46f82b7a
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-14 18:08:47 -07:00
Naman Sood
6301a6ce4b
util/linuxfw,wgengine/router: allow incoming CGNAT range traffic with nodeattr
Clients with the newly added node attribute
`"disable-linux-cgnat-drop-rule"` will not automatically drop inbound
traffic on non-Tailscale network interfaces with the source IP in the
CGNAT IP range. This is an initial proof-of-concept for enabling
connectivity with off-Tailnet CGNAT endpoints.

Fixes tailscale/corp#36270.

Signed-off-by: Naman Sood <mail@nsood.in>
2026-04-14 16:45:06 -04:00
Fernando Serboncini
5834058269
wgengine: replace reflect.DeepEqual with typed Equal for maybeReconfigInputs (#19365)
reflect.DeepEqual is expensive and allocates heavily. Replace it with
a field-by-field comparison that does zero allocations.

Adds tests and benchmarks for the new Equal method.

Fixes #19363

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-04-14 13:16:21 -04:00
Brad Fitzpatrick
943b426038 util/linuxfw: fix nil deref in nftables chain check
Fix a panic in getOrCreateChain when the kernel lacks nftables support
(CONFIG_NF_TABLES). When the nftables netlink connection fails, chain
objects returned by getChainFromTable can have nil Hooknum and Priority
fields. Dereferencing these caused tailscaled to SIGSEGV during router
configuration, which manifested as tailscaled silently crashing ~13
seconds after "tailscale up" on arm64 gokrazy (whose kernel.arm64
build doesn't include nftables).

Updates #13038

Change-Id: I14433616da5ed57895cad37038921fb4f79c3534
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 07:45:01 -07:00
Brad Fitzpatrick
a0a8fae856 tstest/integration: use linkat to hardlink test binaries on Linux
Use linkat via /proc/self/fd with AT_SYMLINK_FOLLOW to create a
hardlink of the test binary instead of copying it. This avoids
copying ~50MB+ binaries into each test's temp directory, making
test setup faster and reducing disk I/O.

The simpler os.Link(b.Path, ret.Path) can't be used here because
the source binary lives in the first test's TempDir, which may be
cleaned up before later tests call CopyTo. The open FD keeps the
inode alive after the path is deleted, but os.Link needs a valid
path. (See also b9f468240f which tried os.Link but is racy for
this reason.)

The /proc/self/fd approach works without elevated privileges,
unlike AT_EMPTY_PATH which requires CAP_DAC_READ_SEARCH. If the
linkat fails for any reason (e.g. cross-filesystem temp dirs), it
falls back to the existing full-copy path.

Fixes #19397

Change-Id: I4b1f97f7e63a9ae9e09dce36dfbdd1f6cff92320
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 07:13:10 -07:00
Avery Pennarun
621dc9cf1b tstest: fix kernel version parsing for Debian-style version strings
The kernel version parser used strings.Cut with "-" to handle versions
like "5.4.0-76-generic", but Debian uses "+" in versions like
"6.12.41+deb13-amd64".

Use strings.IndexAny to find the first "-" or "+" and truncate there.

Fixes TestKernelVersion on Debian systems.

Fixes #19395

Change-Id: I70e5f95682d54baf908e51f9f4b51c130b00aaaa
Co-Authored-By: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-14 07:11:44 -07:00
Brad Fitzpatrick
6aa10576c9 wgengine/magicsock: deflake TestTwoDevicePing compare-metrics-stats
The compare-metrics-stats subtest reset two independent counting
systems (physical connection counters and expvar.Int user metrics)
non-atomically. Background WireGuard keepalives arriving between the
resets could increment one system but not the other, causing
off-by-one packet/byte mismatches in either direction.

Replace the reset-then-compare pattern with snapshot-and-delta:
snapshot both systems before pings, snapshot again after, and compare
the deltas. This eliminates the non-atomic reset window entirely.
As a belt-and-suspenders safety net, tolerate a difference of exactly
one packet (and corresponding bytes) from a stray keepalive that
could still arrive in the narrow window between the two snapshots.

flakestress passes with ~5900 runs (~2800 without -race, ~3100 with
-race) but it also passed previously too. This is an annoying one to
repro.

Fixes #11762

Change-Id: I3447ad67e71c8146e85eed38b7a665033ef9e284
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 06:57:24 -07:00
Brad Fitzpatrick
49eb1b5d26 net/dns: fix TestDNSTrampleRecovery failure under flakestress
The test had two problems:

1. runFileWatcher passed hardcoded "/etc/" to the inotify watcher,
   but the test filesystem uses a temp directory prefix. The watcher
   was watching the real /etc/, never seeing the test's file writes.

2. The test's watchFile used gonotify.NewDirWatcher which creates
   goroutines that block on real inotify syscalls. These don't work
   inside synctest's fake-time bubble. The test only passed standalone
   by accident: gonotify walks /etc/ on startup producing fake events
   that happened to trigger trample detection at the right time.

Fix the path issue by adding ActualPath to the wholeFileFS interface,
which translates logical paths (like "/etc/resolv.conf") to real
filesystem paths (respecting any test prefix). Use it in
runFileWatcher so the inotify watch targets the correct directory.

Replace gonotify in the test with a one-shot timer that synctest can
advance through fake time, reliably triggering the trample check.

Fixes #19400

Change-Id: Idb252881ec24d0ab3b3c1d154dbdaf532db837d4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 06:55:35 -07:00
Claus Lensbøl
27f1d4c15d
control/controlclient: improve filter on netmap updates (#19308)
The previous filters would allow for a handful of subtle issues such as
updating the last seen date when the key or online status had not
changed, and making online keys unconditionally make an engine update.

These have been fixed along side making no change updates from TSMP into
a no-op for the engine so we don't have to reconfigure.

A bunch of additional testing has been added as well.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-14 08:43:07 -04:00
Patrick O'Doherty
0afaa29503 go.mod: upgrade go-git to v5.17.1
Partially resolve govulncheck warnings in OSS and corp.

Updates #cleanup

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2026-04-13 21:10:57 -07:00
Jordan Whited
75819aeed0 derp/derpserver: increase minimum token bucket size
And cap WaitN calls to prevent token bucket errors. Frame length is
inclusive of DERP key for FrameSendPacket frames.

Updates tailscale/corp#40171

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-13 19:30:31 -07:00
Avery Pennarun
ab74ea0a67 tstest/integration: clear SSH_CLIENT env to prevent false positive detection
When running integration tests over SSH (e.g., in remote development
environments), the SSH_CLIENT environment variable is set. This causes
isSSHOverTailscale() to incorrectly detect an SSH session and change
behavior.

Clear SSH_CLIENT in the test node environment to prevent these false
positives.

Fixes #19393

Change-Id: I1411abf0be9704cce37051476efb04d59beed386
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 18:53:07 -07:00
Brad Fitzpatrick
9fbe4b3ed2 all: fix six tests that failed with -count=2
Avery found a bunch of tests that fail with -count=2.

Updates tailscale/corp#40176 (tracks making our CI detect them)

Change-Id: Ie3e4398070dd92e4fe0146badddf1254749cca20
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 18:52:57 -07:00
James Tucker
13d5370951 .gitignore: explicitly include tool/go.exe
Updates #19255

Signed-off-by: James Tucker <james@tailscale.com>
2026-04-13 18:44:59 -07:00
Brad Fitzpatrick
a97850f7e2 cmd/derper: fix TestLookupMetric to pass when run alone
TestLookupMetric was added in e8d140654 (2023-08-17) without
initializing the dnsCache and dnsCacheBytes globals. When run in
isolation, handleBootstrapDNS writes a nil body (from the
uninitialized dnsCacheBytes), causing getBootstrapDNS to fail
decoding an empty response with EOF.

Add a setDNSCache test helper that stores the dnsEntryMap, marshals
dnsCacheBytes, and registers a t.Cleanup to nil both out, so tests
that forget to call it will hit the dnsCache-nil fatal in
getBootstrapDNS rather than silently depending on prior test state.

Also add AssertNotParallel and a dnsCache-nil fatal check to
getBootstrapDNS, the central helper all bootstrap DNS tests flow
through, to prevent future tests from running in parallel (they
all mutate package-level DNS caches and metrics) and to give a
clear error if a test forgets to initialize the DNS caches.

Fixes #19388

Change-Id: I8ad454ec6026c71f13ecfa14d25925df5478b908
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 17:20:43 -07:00
Brad Fitzpatrick
7dcb378875 tstest/integration/nat, tstest/natlab/vnet: fix natlab test flake
The natlab-integrationtest CI job frequently flakes by exhausting its
3m go test timeout. The root cause is that the QEMU VMs run under
pure software emulation (TCG) with no KVM. Under TCG, the guest
kernel's timer calibration busy-loops are at the mercy of host CPU
scheduling. When two VMs boot simultaneously on a 2-core CI runner,
one VM's calibration gets starved and produces wrong results, leaving
the kernel with broken timers that prevent it from ever completing
boot — even after the other VM finishes and frees up CPU.

Additionally, the microvm machine type doesn't provide HPET hardware,
but the kernel command line specified clocksource=hpet. And the VM
image build (make natlab) ran inside the test itself, consuming most
of the 3m timeout budget before the actual test started.

Fix by:

 - Enabling KVM when /dev/kvm is available, so timer calibration
   uses real hardware timers unaffected by host CPU scheduling.

 - Adding a CI step to set /dev/kvm permissions on the GitHub
   Actions runner (ubuntu-latest provides KVM but needs a udev rule).

 - Pre-building the VM image in a separate CI step so it doesn't
   cut into the go test -timeout budget.

 - Replacing the hardcoded 60s context timeout with one derived from
   t.Deadline(), so the test uses the full -timeout budget.

 - Adding VM boot progress detection (AwaitFirstPacket) and QMP
   diagnostics, so boot failures produce clear errors instead of
   opaque "context deadline exceeded" messages.

With KVM enabled, the test passes reliably even on a single CPU core
with 3 parallel workers — a scenario that was 100% broken under TCG.

Fixes #18906

Change-Id: I4c87631a9c9678d185b9f30cb05c0f7bfa9f5c62
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 16:34:15 -07:00
Brad Fitzpatrick
dbd19e4b65 tstest: add AssertNotParallel helper
For tests to loudly declare (and panic on violation) when they're doing
something that's not safe in a parallel test.

Fixes #19385

Change-Id: If79693b0c235c146871a05ed74fa9ea75bb500f9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 16:14:33 -07:00
Brad Fitzpatrick
50b8cfbde2 wgengine/netstack: fix data race on in-flight connection test globals
The maxInFlightConnectionAttemptsForTest and
maxInFlightConnectionAttemptsPerClientForTest globals were plain ints
read by background gVisor TCP handler goroutines (via
wrapTCPProtocolHandler) and written by tstest.Replace cleanup in
TestTCPForwardLimits_PerClient. When a gVisor goroutine outlived the
test cleanup window, the race detector caught the unsynchronized
access.

The race-prone code was introduced in c5abbcd4b4d8 (2024-02-26,
"wgengine/netstack: add a per-client limit for in-flight TCP
forwards") which added both the plain int globals and the
TestTCPForwardLimits_PerClient test that writes them via
tstest.Replace. It is not obvious why this has only recently started
being detected as a data race; likely some combination of gVisor
version bumps, Go toolchain scheduler changes, and additional
TCP-injecting subtests (e.g. 03461ea7f, 2026-01-30) increased
goroutine churn enough to hit the window.

Change both globals to atomic.Int32 and replace tstest.Replace (which
does non-atomic *target = old on cleanup) with explicit Store/Cleanup
pairs.

Fixes #19118

Change-Id: Id26ba6fbfb2e4ade319976db80af8e16c7c8778e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 15:24:35 -07:00
Brad Fitzpatrick
6500d3c3f8 cmd/containerboot: mark TestContainerBoot as flaky
Updates #19380

Change-Id: Ib1be53836e37224265d10abd0c2213644ea54d64
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 15:21:42 -07:00
Brad Fitzpatrick
9dfe7875fd version: show tailscale/go toolchain git hash in version output
When built with the Tailscale Go toolchain, include the toolchain's
git revision in the version output. The non-JSON output shows the
first 10 hex digits:

  go version: go1.26.2 (tailscale/go dfe2a5fd8e)

The JSON output includes the full hash as "tailscaleGoGitHash", or
omits the field when not using tsgo.

The toolchain rev is read via a separate sync.OnceValue rather than
piggybacking on getEmbeddedInfo, because that function discards all
data when VCS fields are absent (e.g. in test binaries), while the
tailscale.toolchain.rev setting is still present.

Also add a CI-only test verifying tailscaleToolchainRev is non-empty
when built with the tailscale_go build tag.

Fixes #19374

Change-Id: Ied0b16d7aead5471d8c614c30cba8b0dcf80c691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 15:20:56 -07:00
Brad Fitzpatrick
5a7ef4a533 ipn/ipnlocal: mark TestStateMachineSeamless as flaky
Updates #19377

Change-Id: I7dbf5b954effbfa821339e79d02d8a6e46d2862a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 15:19:42 -07:00
Adriano Sela Aviles
4ce1643929 types/netmap,tailcfg: update documentation for Services cap
Updates tailscale/corp#40052

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-04-13 14:36:48 -07:00
Brad Fitzpatrick
e2fa9ff140 ssh/tailssh: speed up SSH integration tests
Parallelize the SSH integration tests across OS targets and reduce
per-container overhead:

- CI: use GitHub Actions matrix strategy to run all 4 OS containers
  (ubuntu:focal, ubuntu:jammy, ubuntu:noble, alpine:latest) in parallel
  instead of sequentially (~4x wall-clock improvement)

- Makefile: run docker builds in parallel for local dev too

- Dockerfile: consolidate ~20 separate RUN commands into 5 (one per
  test phase), eliminating Docker layer overhead. Combine test binary
  invocations where no state mutation is needed between them. Fix a bug
  where TestDoDropPrivileges was silently not being run (was passed as a
  second positional arg to -test.run instead of using regex alternation).

- TestMain: replace tail -F + 2s sleep with synchronous log read,
  eliminating 2s overhead per test binary invocation. Set debugTest once
  in TestMain instead of redundantly in each test function.

- session.read(): close channel on EOF so non-shell tests return
  immediately instead of waiting for the 1s silence timeout.

Updates #19244

Change-Id: I2cc8588964fbce0dd7b654fb94e7ff33440b8584
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 14:18:27 -07:00
License Updater
cfed69f3ed licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2026-04-13 12:47:58 -07:00
Jordan Whited
929ad51be0 cmd/derper: mark rate-config flag as experimental and unstable
Updates tailscale/corp#38509

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-13 12:24:59 -07:00
Adriano Sela Aviles
21880457eb ipn/localapi,client/local: add services over localapi
Updates tailscale/corp#40052

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-04-13 11:47:23 -07:00
Brad Fitzpatrick
aa9a76cf30 ssh/tailssh: gofmt
I'm not sure how this file got into the repo without gofmt.

Maybe gofmt rules changed in some Go release?

Updates #cleanup

Change-Id: Ia8bd46e29f116f7fbfca11be80c8ef48699cd9f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 11:09:13 -07:00
Brad Fitzpatrick
d5341fd60c tailscaleroot: add test that tsgo rev is in Go build cache keys
Verify that GODEBUG=gocachehash=1 output from ./tool/go includes the
git revision from go.toolchain.rev, ensuring that bumping the Tailscale
Go fork (without a Go version number change) properly invalidates the
build cache.

The test only runs in CI or when the current Go binary is the Tailscale
toolchain (GOROOT contains /.cache/tsgo/), so open source contributors
using stock Go aren't forced to download tsgo.

Fixes tailscale/corp#36589

Change-Id: Ia98d3a3aa8c7fa67f9a0293066fa02a1997dcb95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 10:17:22 -07:00
Adriano Sela Aviles
4fcce6000d
tailcfg,types/netmap: add (visible) Services to SelfNode Caps (#19335)
Updates #40052

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-04-13 08:48:02 -07:00
Brad Fitzpatrick
674f866ecc tstest/tailmac: add headless mode for automated VM testing
Add a --headless flag to the Host.app Run subcommand for running
macOS VMs without a GUI, enabling use from test frameworks.

Key changes:

  - HostCli.swift: When --headless is set, run the VM via VMController
    + RunLoop.main.run() instead of NSApplicationMain. Using the
    RunLoop (not dispatchMain) is required because VZ framework
    callbacks depend on RunLoop sources.

  - VMController.swift: Add headless parameter to createVirtualMachine
    that configures a single socket-based NIC (no NAT NIC). This
    matches the NIC configuration used when creating/saving VMs, so
    saved state restoration works correctly. A NIC count mismatch
    causes VZ to silently fail to execute guest code.

  - TailMacConfigHelper.swift: Clean up socket network device logging.

  - Config.swift: Move VM storage from ~/VM.bundle to
    ~/.cache/tailscale/vmtest/macos/.

  - TailMac.swift: Fix dispatchMain→RunLoop.main.run() in the create
    command (same VZ RunLoop requirement).

Updates #13038

Change-Id: Iea51c043aa92e8fc6257139b9f0e2e7677072fa2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-11 12:50:53 -07:00
Brad Fitzpatrick
0e8ae9d60c gokrazy: add arm64 natlab appliance image support
Add natlabapp.arm64 config and gokrazydeps.go for building a gokrazy
natlab appliance image targeting arm64 (Apple Silicon). This is the
arm64 counterpart to the existing natlabapp (amd64) used by vmtest.

The arm64 image uses github.com/gokrazy/kernel.arm64 and is built
with "make natlab-arm64" in the gokrazy directory.

Updates #13038

Change-Id: I0e1f8e5840083a5de5954f2cf46e3babec129d96
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 16:57:19 -07:00
Brad Fitzpatrick
cf59a6fb23 .github, tool/listpkgs: automatically find tests which use tstest.RequireRoot
Updates tailscale/corp#40007

Change-Id: I677d3d9e276cb6633a14ac07e4b58ea08e52fac4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 16:22:05 -07:00
Mike O'Driscoll
ca5db865b4
cmd/derper,derp: add --rate-config file with SIGHUP reload (#19314)
Add a --rate-config flag pointing to a JSON file for per-client receive
rate limits (bytes/sec and burst bytes). The config is reloaded on SIGHUP,
updating all existing client connections live. The --per-client-rate-limit
and --per-client-rate-burst flags are removed in favor of the config file.

In derpserver, rate limiting uses an atomic.Pointer[xrate.Limiter] per
client: nil when unlimited or mesh (zero overhead), non-nil when
rate-limited.

Document that clientSet.activeClient Store operations require Server.mu.

Updates tailscale/corp#38509

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-04-10 18:37:54 -04:00
Amal Bansode
b4c0d67f8b wgengine/router/osrouter: fix privileged tests missing fake netfilter runner
These test failures were never caught by CI because the package in question
was missing from our privileged tests list. tailscale/corp#40007 covers improving
our process around this.

Fixes #19316

Signed-off-by: Amal Bansode <amal@tailscale.com>
2026-04-10 14:51:55 -07:00