10462 Commits

Author SHA1 Message Date
M. J. Fromberger
bc8095df38 ipn/ipnlocal: disable netmap caching for ios
Updates #todo

Change-Id: I3efa627729de23c00022dfc46493ab94921aa68c
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-04-15 10:09:04 -07:00
Claus Lensbøl
61c95f409c
control/controlclient: accept key if last seen on exist node is absent (#19402)
On some nodes (found via natlab), the existing node's last-seen time could be
unset. In these cases, we want to accept the key and write a last-seen
time. This was breaking the cached netmap natlab tests.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-15 03:53:40 -04:00
Avery Pennarun
effbe67fe3 wgengine/magicsock: remove pickPort, use port 0 to avoid TOCTOU race
pickPort would bind a UDP socket on :0 to get a free port, close
the socket, then hope to rebind to the same port in NewConn. This
is a TOCTOU race that can cause flaky test failures when another
process grabs the port in between.

Instead, pass Port: 0 to NewConn and let the OS assign the port
atomically, then read back the assigned port via conn.LocalPort().
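
A minimal sketch of the bind-to-port-0 pattern using only the standard library. It does not use magicsock's NewConn or conn.LocalPort(); the helper name listenOnFreePort is hypothetical and the loopback address is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// listenOnFreePort binds a UDP socket to port 0 so the OS picks a free port,
// and returns the still-open socket together with the port it was assigned.
// Because the socket is never closed and reopened, there is no window in
// which another process can take the port.
func listenOnFreePort() (*net.UDPConn, int, error) {
	conn, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 0})
	if err != nil {
		return nil, 0, err
	}
	port := conn.LocalAddr().(*net.UDPAddr).Port
	return conn, port, nil
}

func main() {
	conn, port, err := listenOnFreePort()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer conn.Close()
	fmt.Println("OS-assigned port:", port)
}
```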

Fixes #19409

Change-Id: Ie44b599fb93c361e29a05f2171ad747c46f82b7a
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-14 18:08:47 -07:00
Naman Sood
6301a6ce4b
util/linuxfw,wgengine/router: allow incoming CGNAT range traffic with nodeattr
Clients with the newly added node attribute
`"disable-linux-cgnat-drop-rule"` will not automatically drop inbound
traffic on non-Tailscale network interfaces with the source IP in the
CGNAT IP range. This is an initial proof-of-concept for enabling
connectivity with off-Tailnet CGNAT endpoints.

Fixes tailscale/corp#36270.

Signed-off-by: Naman Sood <mail@nsood.in>
2026-04-14 16:45:06 -04:00
Fernando Serboncini
5834058269
wgengine: replace reflect.DeepEqual with typed Equal for maybeReconfigInputs (#19365)
reflect.DeepEqual is expensive and allocates heavily. Replace it with
a field-by-field comparison that does zero allocations.
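
As a rough illustration of the idea, here is a typed Equal method on a hypothetical struct. The type reconfigInputs and its fields are assumptions for this sketch and are not the real maybeReconfigInputs.

```go
package main

import (
	"fmt"
	"slices"
)

// reconfigInputs is a hypothetical stand-in for a struct of engine inputs.
type reconfigInputs struct {
	ListenPort uint16
	PeerKeys   []string
}

// Equal compares the two values field by field. It uses no reflection and
// performs no allocations, unlike reflect.DeepEqual.
func (a *reconfigInputs) Equal(b *reconfigInputs) bool {
	if a == nil || b == nil {
		return a == b
	}
	return a.ListenPort == b.ListenPort && slices.Equal(a.PeerKeys, b.PeerKeys)
}

func main() {
	a := &reconfigInputs{ListenPort: 41641, PeerKeys: []string{"k1", "k2"}}
	b := &reconfigInputs{ListenPort: 41641, PeerKeys: []string{"k1", "k2"}}
	fmt.Println(a.Equal(b))
}
```

One behavioral difference to keep in mind with this approach: slices.Equal treats a nil slice and an empty slice as equal, while reflect.DeepEqual does not.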

Adds tests and benchmarks for the new Equal method.

Fixes #19363

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-04-14 13:16:21 -04:00
Brad Fitzpatrick
943b426038 util/linuxfw: fix nil deref in nftables chain check
Fix a panic in getOrCreateChain when the kernel lacks nftables support
(CONFIG_NF_TABLES). When the nftables netlink connection fails, chain
objects returned by getChainFromTable can have nil Hooknum and Priority
fields. Dereferencing these caused tailscaled to SIGSEGV during router
configuration, which manifested as tailscaled silently crashing ~13
seconds after "tailscale up" on arm64 gokrazy (whose kernel.arm64
build doesn't include nftables).

Updates #13038

Change-Id: I14433616da5ed57895cad37038921fb4f79c3534
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 07:45:01 -07:00
Brad Fitzpatrick
a0a8fae856 tstest/integration: use linkat to hardlink test binaries on Linux
Use linkat via /proc/self/fd with AT_SYMLINK_FOLLOW to create a
hardlink of the test binary instead of copying it. This avoids
copying ~50MB+ binaries into each test's temp directory, making
test setup faster and reducing disk I/O.

The simpler os.Link(b.Path, ret.Path) can't be used here because
the source binary lives in the first test's TempDir, which may be
cleaned up before later tests call CopyTo. The open FD keeps the
inode alive after the path is deleted, but os.Link needs a valid
path. (See also b9f468240f which tried os.Link but is racy for
this reason.)

The /proc/self/fd approach works without elevated privileges,
unlike AT_EMPTY_PATH which requires CAP_DAC_READ_SEARCH. If the
linkat fails for any reason (e.g. cross-filesystem temp dirs), it
falls back to the existing full-copy path.

Fixes #19397

Change-Id: I4b1f97f7e63a9ae9e09dce36dfbdd1f6cff92320
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 07:13:10 -07:00
Avery Pennarun
621dc9cf1b tstest: fix kernel version parsing for Debian-style version strings
The kernel version parser used strings.Cut with "-" to handle versions
like "5.4.0-76-generic", but Debian uses "+" in versions like
"6.12.41+deb13-amd64".

Use strings.IndexAny to find the first "-" or "+" and truncate there.
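
A small sketch of the truncation step described above. The function name trimKernelSuffix is hypothetical; the real parser in tstest may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// trimKernelSuffix drops everything from the first "-" or "+" onward,
// leaving only the dotted numeric part of a kernel release string.
func trimKernelSuffix(release string) string {
	if i := strings.IndexAny(release, "-+"); i >= 0 {
		return release[:i]
	}
	return release
}

func main() {
	for _, r := range []string{"5.4.0-76-generic", "6.12.41+deb13-amd64"} {
		fmt.Println(r, "=>", trimKernelSuffix(r))
	}
}
```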

Fixes TestKernelVersion on Debian systems.

Fixes #19395

Change-Id: I70e5f95682d54baf908e51f9f4b51c130b00aaaa
Co-Authored-By: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-14 07:11:44 -07:00
Brad Fitzpatrick
6aa10576c9 wgengine/magicsock: deflake TestTwoDevicePing compare-metrics-stats
The compare-metrics-stats subtest reset two independent counting
systems (physical connection counters and expvar.Int user metrics)
non-atomically. Background WireGuard keepalives arriving between the
resets could increment one system but not the other, causing
off-by-one packet/byte mismatches in either direction.

Replace the reset-then-compare pattern with snapshot-and-delta:
snapshot both systems before pings, snapshot again after, and compare
the deltas. This eliminates the non-atomic reset window entirely.
As a belt-and-suspenders safety net, tolerate a difference of exactly
one packet (and corresponding bytes) from a stray keepalive that
could still arrive in the narrow window between the two snapshots.
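
The snapshot-and-delta comparison can be sketched as follows. The counters type and both helpers are hypothetical, and this sketch does not check the byte count when the packet counts differ by one, since the size of a stray keepalive is not stated here.

```go
package main

import "fmt"

// counters is a hypothetical snapshot of one counting system.
type counters struct{ packets, bytes int64 }

// sub returns the change between an earlier snapshot and this one.
func (c counters) sub(prev counters) counters {
	return counters{c.packets - prev.packets, c.bytes - prev.bytes}
}

// deltasAgree reports whether two deltas match exactly, or differ by
// exactly one packet (the tolerated stray keepalive).
func deltasAgree(a, b counters) bool {
	switch a.packets - b.packets {
	case 0:
		return a.bytes == b.bytes
	case 1, -1:
		return true
	}
	return false
}

func main() {
	physBefore, physAfter := counters{10, 1000}, counters{15, 1500}
	userBefore, userAfter := counters{200, 9000}, counters{205, 9500}
	fmt.Println(deltasAgree(physAfter.sub(physBefore), userAfter.sub(userBefore)))
}
```

Because neither system is reset, the absolute values of the two counters never need to match; only the change over the measured interval is compared.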

flakestress passes with ~5900 runs (~2800 without -race, ~3100 with
-race), but it passed previously too. This is an annoying one to
repro.

Fixes #11762

Change-Id: I3447ad67e71c8146e85eed38b7a665033ef9e284
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 06:57:24 -07:00
Brad Fitzpatrick
49eb1b5d26 net/dns: fix TestDNSTrampleRecovery failure under flakestress
The test had two problems:

1. runFileWatcher passed hardcoded "/etc/" to the inotify watcher,
   but the test filesystem uses a temp directory prefix. The watcher
   was watching the real /etc/, never seeing the test's file writes.

2. The test's watchFile used gonotify.NewDirWatcher which creates
   goroutines that block on real inotify syscalls. These don't work
   inside synctest's fake-time bubble. The test only passed standalone
   by accident: gonotify walks /etc/ on startup producing fake events
   that happened to trigger trample detection at the right time.

Fix the path issue by adding ActualPath to the wholeFileFS interface,
which translates logical paths (like "/etc/resolv.conf") to real
filesystem paths (respecting any test prefix). Use it in
runFileWatcher so the inotify watch targets the correct directory.

Replace gonotify in the test with a one-shot timer that synctest can
advance through fake time, reliably triggering the trample check.

Fixes #19400

Change-Id: Idb252881ec24d0ab3b3c1d154dbdaf532db837d4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 06:55:35 -07:00
Claus Lensbøl
27f1d4c15d
control/controlclient: improve filter on netmap updates (#19308)
The previous filters would allow for a handful of subtle issues such as
updating the last seen date when the key or online status had not
changed, and making online keys unconditionally make an engine update.

These have been fixed, alongside making no-change updates from TSMP into
a no-op for the engine so we don't have to reconfigure.

A bunch of additional testing has been added as well.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-14 08:43:07 -04:00
Patrick O'Doherty
0afaa29503 go.mod: upgrade go-git to v5.17.1
Partially resolve govulncheck warnings in OSS and corp.

Updates #cleanup

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2026-04-13 21:10:57 -07:00
Jordan Whited
75819aeed0 derp/derpserver: increase minimum token bucket size
And cap WaitN calls to prevent token bucket errors. Frame length is
inclusive of DERP key for FrameSendPacket frames.

Updates tailscale/corp#40171

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-13 19:30:31 -07:00
Avery Pennarun
ab74ea0a67 tstest/integration: clear SSH_CLIENT env to prevent false positive detection
When running integration tests over SSH (e.g., in remote development
environments), the SSH_CLIENT environment variable is set. This causes
isSSHOverTailscale() to incorrectly detect an SSH session and change
behavior.

Clear SSH_CLIENT in the test node environment to prevent these false
positives.

Fixes #19393

Change-Id: I1411abf0be9704cce37051476efb04d59beed386
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 18:53:07 -07:00
Brad Fitzpatrick
9fbe4b3ed2 all: fix six tests that failed with -count=2
Avery found a bunch of tests that fail with -count=2.

Updates tailscale/corp#40176 (tracks making our CI detect them)

Change-Id: Ie3e4398070dd92e4fe0146badddf1254749cca20
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 18:52:57 -07:00
James Tucker
13d5370951 .gitignore: explicitly include tool/go.exe
Updates #19255

Signed-off-by: James Tucker <james@tailscale.com>
2026-04-13 18:44:59 -07:00
Brad Fitzpatrick
a97850f7e2 cmd/derper: fix TestLookupMetric to pass when run alone
TestLookupMetric was added in e8d140654 (2023-08-17) without
initializing the dnsCache and dnsCacheBytes globals. When run in
isolation, handleBootstrapDNS writes a nil body (from the
uninitialized dnsCacheBytes), causing getBootstrapDNS to fail
decoding an empty response with EOF.

Add a setDNSCache test helper that stores the dnsEntryMap, marshals
dnsCacheBytes, and registers a t.Cleanup to nil both out, so tests
that forget to call it will hit the dnsCache-nil fatal in
getBootstrapDNS rather than silently depending on prior test state.

Also add AssertNotParallel and a dnsCache-nil fatal check to
getBootstrapDNS, the central helper all bootstrap DNS tests flow
through, to prevent future tests from running in parallel (they
all mutate package-level DNS caches and metrics) and to give a
clear error if a test forgets to initialize the DNS caches.

Fixes #19388

Change-Id: I8ad454ec6026c71f13ecfa14d25925df5478b908
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 17:20:43 -07:00
Brad Fitzpatrick
7dcb378875 tstest/integration/nat, tstest/natlab/vnet: fix natlab test flake
The natlab-integrationtest CI job frequently flakes by exhausting its
3m go test timeout. The root cause is that the QEMU VMs run under
pure software emulation (TCG) with no KVM. Under TCG, the guest
kernel's timer calibration busy-loops are at the mercy of host CPU
scheduling. When two VMs boot simultaneously on a 2-core CI runner,
one VM's calibration gets starved and produces wrong results, leaving
the kernel with broken timers that prevent it from ever completing
boot — even after the other VM finishes and frees up CPU.

Additionally, the microvm machine type doesn't provide HPET hardware,
but the kernel command line specified clocksource=hpet. And the VM
image build (make natlab) ran inside the test itself, consuming most
of the 3m timeout budget before the actual test started.

Fix by:

 - Enabling KVM when /dev/kvm is available, so timer calibration
   uses real hardware timers unaffected by host CPU scheduling.

 - Adding a CI step to set /dev/kvm permissions on the GitHub
   Actions runner (ubuntu-latest provides KVM but needs a udev rule).

 - Pre-building the VM image in a separate CI step so it doesn't
   cut into the go test -timeout budget.

 - Replacing the hardcoded 60s context timeout with one derived from
   t.Deadline(), so the test uses the full -timeout budget.

 - Adding VM boot progress detection (AwaitFirstPacket) and QMP
   diagnostics, so boot failures produce clear errors instead of
   opaque "context deadline exceeded" messages.
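
For the t.Deadline() item in the list above, a minimal sketch of deriving a timeout from the test deadline. The helper timeoutFor, the grace period, and the fallback value are all assumptions for illustration; the actual test code may compute this differently.

```go
package main

import (
	"fmt"
	"time"
)

// timeoutFor returns how long a test step may run: the time remaining until
// the test deadline minus a grace period for cleanup, or fallback when the
// test has no deadline (for example with -timeout=0).
//
// In a test it would be used roughly like this:
//
//	deadline, ok := t.Deadline()
//	d := timeoutFor(time.Until(deadline), ok, time.Minute, 5*time.Second)
//	ctx, cancel := context.WithTimeout(context.Background(), d)
//	defer cancel()
func timeoutFor(remaining time.Duration, hasDeadline bool, fallback, grace time.Duration) time.Duration {
	if !hasDeadline {
		return fallback
	}
	if remaining <= grace {
		return remaining
	}
	return remaining - grace
}

func main() {
	fmt.Println(timeoutFor(3*time.Minute, true, time.Minute, 5*time.Second))
	fmt.Println(timeoutFor(0, false, time.Minute, 5*time.Second))
}
```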

With KVM enabled, the test passes reliably even on a single CPU core
with 3 parallel workers — a scenario that was 100% broken under TCG.

Fixes #18906

Change-Id: I4c87631a9c9678d185b9f30cb05c0f7bfa9f5c62
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 16:34:15 -07:00
Brad Fitzpatrick
dbd19e4b65 tstest: add AssertNotParallel helper
For tests to loudly declare (and panic on violation) when they're doing
something that's not safe in a parallel test.

Fixes #19385

Change-Id: If79693b0c235c146871a05ed74fa9ea75bb500f9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 16:14:33 -07:00
Brad Fitzpatrick
50b8cfbde2 wgengine/netstack: fix data race on in-flight connection test globals
The maxInFlightConnectionAttemptsForTest and
maxInFlightConnectionAttemptsPerClientForTest globals were plain ints
read by background gVisor TCP handler goroutines (via
wrapTCPProtocolHandler) and written by tstest.Replace cleanup in
TestTCPForwardLimits_PerClient. When a gVisor goroutine outlived the
test cleanup window, the race detector caught the unsynchronized
access.

The race-prone code was introduced in c5abbcd4b4d8 (2024-02-26,
"wgengine/netstack: add a per-client limit for in-flight TCP
forwards") which added both the plain int globals and the
TestTCPForwardLimits_PerClient test that writes them via
tstest.Replace. It is not obvious why this has only recently started
being detected as a data race; likely some combination of gVisor
version bumps, Go toolchain scheduler changes, and additional
TCP-injecting subtests (e.g. 03461ea7f, 2026-01-30) increased
goroutine churn enough to hit the window.

Change both globals to atomic.Int32 and replace tstest.Replace (which
does non-atomic *target = old on cleanup) with explicit Store/Cleanup
pairs.
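
A sketch of the explicit Store/Cleanup pairing on an atomic.Int32. The variable maxInFlightForTest and the helper setForTest are hypothetical names; registerCleanup stands in for t.Cleanup so the sketch runs outside a test.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// maxInFlightForTest is a hypothetical test-only override that background
// goroutines may read at any time, so every access must be atomic.
var maxInFlightForTest atomic.Int32

// setForTest atomically installs v and registers a cleanup that atomically
// restores the previous value. In a real test, pass t.Cleanup.
func setForTest(registerCleanup func(func()), v int32) {
	old := maxInFlightForTest.Swap(v)
	registerCleanup(func() { maxInFlightForTest.Store(old) })
}

func main() {
	var cleanups []func()
	setForTest(func(f func()) { cleanups = append(cleanups, f) }, 3)
	fmt.Println("during test:", maxInFlightForTest.Load())
	for _, f := range cleanups {
		f()
	}
	fmt.Println("after cleanup:", maxInFlightForTest.Load())
}
```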

Fixes #19118

Change-Id: Id26ba6fbfb2e4ade319976db80af8e16c7c8778e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 15:24:35 -07:00
Brad Fitzpatrick
6500d3c3f8 cmd/containerboot: mark TestContainerBoot as flaky
Updates #19380

Change-Id: Ib1be53836e37224265d10abd0c2213644ea54d64
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 15:21:42 -07:00
Brad Fitzpatrick
9dfe7875fd version: show tailscale/go toolchain git hash in version output
When built with the Tailscale Go toolchain, include the toolchain's
git revision in the version output. The non-JSON output shows the
first 10 hex digits:

  go version: go1.26.2 (tailscale/go dfe2a5fd8e)

The JSON output includes the full hash as "tailscaleGoGitHash", or
omits the field when not using tsgo.

The toolchain rev is read via a separate sync.OnceValue rather than
piggybacking on getEmbeddedInfo, because that function discards all
data when VCS fields are absent (e.g. in test binaries), while the
tailscale.toolchain.rev setting is still present.

Also add a CI-only test verifying tailscaleToolchainRev is non-empty
when built with the tailscale_go build tag.

Fixes #19374

Change-Id: Ied0b16d7aead5471d8c614c30cba8b0dcf80c691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 15:20:56 -07:00
Brad Fitzpatrick
5a7ef4a533 ipn/ipnlocal: mark TestStateMachineSeamless as flaky
Updates #19377

Change-Id: I7dbf5b954effbfa821339e79d02d8a6e46d2862a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 15:19:42 -07:00
Adriano Sela Aviles
4ce1643929 types/netmap,tailcfg: update documentation for Services cap
Updates tailscale/corp#40052

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-04-13 14:36:48 -07:00
Brad Fitzpatrick
e2fa9ff140 ssh/tailssh: speed up SSH integration tests
Parallelize the SSH integration tests across OS targets and reduce
per-container overhead:

- CI: use GitHub Actions matrix strategy to run all 4 OS containers
  (ubuntu:focal, ubuntu:jammy, ubuntu:noble, alpine:latest) in parallel
  instead of sequentially (~4x wall-clock improvement)

- Makefile: run docker builds in parallel for local dev too

- Dockerfile: consolidate ~20 separate RUN commands into 5 (one per
  test phase), eliminating Docker layer overhead. Combine test binary
  invocations where no state mutation is needed between them. Fix a bug
  where TestDoDropPrivileges was silently not being run (was passed as a
  second positional arg to -test.run instead of using regex alternation).

- TestMain: replace tail -F + 2s sleep with synchronous log read,
  eliminating 2s overhead per test binary invocation. Set debugTest once
  in TestMain instead of redundantly in each test function.

- session.read(): close channel on EOF so non-shell tests return
  immediately instead of waiting for the 1s silence timeout.
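
On the -test.run fix in the Dockerfile item above: selecting two tests takes a single pattern with regex alternation. The sketch below only approximates how a run pattern selects top-level test names by matching an unanchored regular expression; TestExampleOther and TestUnrelated are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// selected returns the names matched by pattern, approximating how a
// -test.run style regular expression selects top-level tests.
func selected(pattern string, names []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, n := range names {
		if re.MatchString(n) {
			out = append(out, n)
		}
	}
	return out, nil
}

func main() {
	names := []string{"TestDoDropPrivileges", "TestExampleOther", "TestUnrelated"}
	// One pattern with alternation selects both tests.
	got, err := selected("^(TestExampleOther|TestDoDropPrivileges)$", names)
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(got)
}
```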

Updates #19244

Change-Id: I2cc8588964fbce0dd7b654fb94e7ff33440b8584
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 14:18:27 -07:00
License Updater
cfed69f3ed licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2026-04-13 12:47:58 -07:00
Jordan Whited
929ad51be0 cmd/derper: mark rate-config flag as experimental and unstable
Updates tailscale/corp#38509

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-13 12:24:59 -07:00
Adriano Sela Aviles
21880457eb ipn/localapi,client/local: add services over localapi
Updates tailscale/corp#40052

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-04-13 11:47:23 -07:00
Brad Fitzpatrick
aa9a76cf30 ssh/tailssh: gofmt
I'm not sure how this file got into the repo without gofmt.

Maybe gofmt rules changed in some Go release?

Updates #cleanup

Change-Id: Ia8bd46e29f116f7fbfca11be80c8ef48699cd9f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 11:09:13 -07:00
Brad Fitzpatrick
d5341fd60c tailscaleroot: add test that tsgo rev is in Go build cache keys
Verify that GODEBUG=gocachehash=1 output from ./tool/go includes the
git revision from go.toolchain.rev, ensuring that bumping the Tailscale
Go fork (without a Go version number change) properly invalidates the
build cache.

The test only runs in CI or when the current Go binary is the Tailscale
toolchain (GOROOT contains /.cache/tsgo/), so open source contributors
using stock Go aren't forced to download tsgo.

Fixes tailscale/corp#36589

Change-Id: Ia98d3a3aa8c7fa67f9a0293066fa02a1997dcb95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 10:17:22 -07:00
Adriano Sela Aviles
4fcce6000d
tailcfg,types/netmap: add (visible) Services to SelfNode Caps (#19335)
Updates #40052

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-04-13 08:48:02 -07:00
Brad Fitzpatrick
674f866ecc tstest/tailmac: add headless mode for automated VM testing
Add a --headless flag to the Host.app Run subcommand for running
macOS VMs without a GUI, enabling use from test frameworks.

Key changes:

  - HostCli.swift: When --headless is set, run the VM via VMController
    + RunLoop.main.run() instead of NSApplicationMain. Using the
    RunLoop (not dispatchMain) is required because VZ framework
    callbacks depend on RunLoop sources.

  - VMController.swift: Add headless parameter to createVirtualMachine
    that configures a single socket-based NIC (no NAT NIC). This
    matches the NIC configuration used when creating/saving VMs, so
    saved state restoration works correctly. A NIC count mismatch
    causes VZ to silently fail to execute guest code.

  - TailMacConfigHelper.swift: Clean up socket network device logging.

  - Config.swift: Move VM storage from ~/VM.bundle to
    ~/.cache/tailscale/vmtest/macos/.

  - TailMac.swift: Fix dispatchMain→RunLoop.main.run() in the create
    command (same VZ RunLoop requirement).

Updates #13038

Change-Id: Iea51c043aa92e8fc6257139b9f0e2e7677072fa2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-11 12:50:53 -07:00
Brad Fitzpatrick
0e8ae9d60c gokrazy: add arm64 natlab appliance image support
Add natlabapp.arm64 config and gokrazydeps.go for building a gokrazy
natlab appliance image targeting arm64 (Apple Silicon). This is the
arm64 counterpart to the existing natlabapp (amd64) used by vmtest.

The arm64 image uses github.com/gokrazy/kernel.arm64 and is built
with "make natlab-arm64" in the gokrazy directory.

Updates #13038

Change-Id: I0e1f8e5840083a5de5954f2cf46e3babec129d96
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 16:57:19 -07:00
Brad Fitzpatrick
cf59a6fb23 .github, tool/listpkgs: automatically find tests which use tstest.RequireRoot
Updates tailscale/corp#40007

Change-Id: I677d3d9e276cb6633a14ac07e4b58ea08e52fac4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 16:22:05 -07:00
Mike O'Driscoll
ca5db865b4
cmd/derper,derp: add --rate-config file with SIGHUP reload (#19314)
Add a --rate-config flag pointing to a JSON file for per-client receive
rate limits (bytes/sec and burst bytes). The config is reloaded on SIGHUP,
updating all existing client connections live. The --per-client-rate-limit
and --per-client-rate-burst flags are removed in favor of the config file.

In derpserver, rate limiting uses an atomic.Pointer[xrate.Limiter] per
client: nil when unlimited or mesh (zero overhead), non-nil when
rate-limited.

Document that clientSet.activeClient Store operations require Server.mu.

Updates tailscale/corp#38509

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-04-10 18:37:54 -04:00
Amal Bansode
b4c0d67f8b wgengine/router/osrouter: fix privileged tests missing fake netfilter runner
These test failures were never caught by CI because the package in question
was missing from our privileged tests list. tailscale/corp#40007 covers improving
our process around this.

Fixes #19316

Signed-off-by: Amal Bansode <amal@tailscale.com>
2026-04-10 14:51:55 -07:00
Brad Fitzpatrick
5e81840b57 tstest: add RequireRoot helper
Start using a common helper for tests to declare that they require root.

This is step 1. A later step will then make this helper track which tests were
skipped so a subsequent pass will run these tests as root.

Updates tailscale/corp#40007

Change-Id: I4979e1def0fa3691d38c83f48c89aaa443e7f62e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 10:48:50 -07:00
Alex Chan
399f048332 tka: Revert "improve logging for Compact and Commit operations"
This reverts commit b25920dfc07452833895ad00b42db7e581b3cec8.

The `log.Printf` messages are causing panics in corp, in particular:

> panic: please use tailscale.com/logger.Logf instead of the log package

Fixing the TKA code to plumb through a logger properly is going to be
a hassle, so for now remove these logs to unblock merges to corp.

Updates tailscale/corp#39455

Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-10 17:13:23 +01:00
Alex Chan
1ff369a261 tka: keep the CompactionDefaults alongside the other limits
Updates #cleanup

Change-Id: Ib5e481d5a9c7ec7ac3e6b3913909ab1bf21d7a4d
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-10 16:06:23 +01:00
Jonathan Nobels
03c3551ee5
ipn/ipnlocal: add netmap mutations to the ipn bus (#19120)
ipn/local: add netmap mutations to the ipn bus

updates tailscale/tailscale#1909

This adds a new NotifyWatchOpt that allows watchers to
receive PeerChange events (derived from node mutations)
on the IPN bus in lieu of a complete netmap. We'll continue
to send the full netmap for any map response that includes it,
but for mutations, sending PeerChange events gives the client
the option to manage its own models more selectively and cuts
way down on JSON serialization overhead.

On chatty tailnets, this will vastly reduce the amount of
chatter on the bus.

This change should be backwards compatible; it is
purely additive. Clients that subscribe to NotifyNetmap will
get the full netmap for every delta.  New clients can
omit that and instead opt into NotifyPeerChanges.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-04-09 15:45:41 -04:00
Fernando Serboncini
6b7caaf7ee
cmd/k8s-operator: set PreferDualStack on ProxyGroup egress services (#19194)
On dual-stack clusters defaulting to IPv6, the ProxyGroup egress
service only got an IPv6 address, which causes request failures.
Individual egress proxies already set PreferDualStack correctly.

Fixes: #18768

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-04-09 13:33:39 -04:00
Andrew Dunham
27e6fed0c1 ssh/tailssh: fix default PATH for Debian
Validated against a modern Debian install, fixes a typo.

Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I7b26012f54dbd2f0f9fea98722e8edc2fe97645a
2026-04-09 11:57:40 -04:00
Brad Fitzpatrick
dca1d8eea1 tstest/natlab: add TestSubnetRouterFreeBSD with FreeBSD cloud image support
As a warm-up to making natlab support multiple operating systems,
start with an easy one (in that it's also Unixy and open source like
Linux) and add FreeBSD 15.0 as a VM OS option for the vmtest
integration test framework, and add TestSubnetRouterFreeBSD which
tests subnet routing through a FreeBSD VM (Gokrazy → FreeBSD →
Gokrazy).

Key changes:
- Add FreeBSD150 OSImage using the official FreeBSD 15.0
  BASIC-CLOUDINIT cloud image (xz-compressed qcow2)
- Add GOOS()/IsFreeBSD() methods to OSImage for cross-compilation
  and OS-specific behavior
- Handle xz-compressed image downloads in ensureImage
- Refactor compileBinaries into compileBinariesForOS to support
  multiple GOOS targets (linux, freebsd), with binaries registered
  at <goos>/<name> paths on the file server VIP
- Add FreeBSD-specific cloud-init (nuageinit) user-data generation:
  string-form runcmd (nuageinit doesn't support YAML arrays),
  fetch(1) instead of curl, FreeBSD sysctl names for IP forwarding,
  mkdir /usr/local/bin, PATH setup for tta
- Skip network-config in cidata ISO for FreeBSD (DHCP via rc.conf)

Updates tailscale/tailscale#13038

Change-Id: Ibeb4f7d02659d5cd8e3a7c3a66ee7b1a92a0110d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-09 07:49:07 -07:00
David Bond
85d6ba9473
cmd/k8s-operator: migrate to tailscale-client-go-v2 (#19010)
This commit modifies the kubernetes operator to use the `tailscale-client-go-v2`
package instead of the internal tailscale client it was previously using. This
now gives us the ability to expand out custom resources and features as they
become available via the API module.

The tailnet reconciler has also been modified to manage clients as tailnets
are created and removed, providing each subsequent reconciler with a single
`ClientProvider` that obtains a tailscale client for the respective tailnet
by name, or the operator's default when presented with a blank string.

Fixes: https://github.com/tailscale/corp/issues/38418

Signed-off-by: David Bond <davidsbond93@gmail.com>
2026-04-09 14:39:46 +01:00
Alex Chan
b25920dfc0 tka: improve logging for Compact and Commit operations
Log whenever we:

* Commit an AUM which was previously soft-deleted (which we don't expect
  to happen in practice, and may indicate an issue with our sync code)
* Purge AUMs during a Compact operation.
* Successfully commit AUMs as part of a bootstrap or sync operation.

All three logs mention `tka` for ease of discoverability.

Updates tailscale/corp#39455

Change-Id: I2b07bb0ef075877f40ec34b80bb668be59e1cdc3
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-09 13:06:39 +01:00
Brad Fitzpatrick
ec0b23a21f vmtest: add VM-based integration test framework
Add tstest/natlab/vmtest, a high-level framework for running multi-VM
integration tests with mixed OS types (gokrazy + Ubuntu/Debian cloud
images) connected via natlab's vnet virtual network.

The vmtest package provides:
  - Env type that orchestrates vnet, QEMU processes, and agent connections
  - OS image support (Gokrazy, Ubuntu2404, Debian12) with download/cache
  - QEMU launch per OS type (microvm for gokrazy, q35+KVM for cloud)
  - Cloud-init seed ISO generation with network-config for multi-NIC
  - Cross-compilation of test binaries for cloud VMs
  - Debug SSH NIC on cloud VMs for interactive debugging
  - Test helpers: ApproveRoutes, HTTPGet, TailscalePing, DumpStatus,
    WaitForPeerRoute, SSHExec

TTA enhancements (cmd/tta):
  - Parameterize /up (accept-routes, advertise-routes, snat-subnet-routes)
  - Add /set, /start-webserver, /http-get endpoints
  - /http-get uses local.Client.UserDial for Tailscale-routed requests
  - Fix /ping for non-gokrazy systems

TestSubnetRouter exercises a 3-VM subnet router scenario:
  client (gokrazy) → subnet-router (Ubuntu, dual-NIC) → backend (gokrazy)
  Verifies HTTP access to the backend webserver through the Tailscale
  subnet route. Passes in ~30 seconds.

Updates tailscale/tailscale#13038

Change-Id: I165b64af241d37f5f5870e796a52502fc56146fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 17:24:18 -07:00
Jason O'Donnell
d948b78b23
tsweb: add TS_DEBUG_TRUSTED_CIDRS envknob to debug (#19283)
Add a new envknob that allows connections from trusted CIDR ranges
to access debug endpoints without Tailscale authentication. This is
useful for in-cluster scrapers like Prometheus that are not on a
tailnet, do not have static IP addresses and cannot use debug keys.
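
A minimal sketch of a trusted-CIDR check with net/netip. The comma-separated format and the helper names parseCIDRs and isTrusted are assumptions for illustration; the commit message does not specify how tsweb parses TS_DEBUG_TRUSTED_CIDRS.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseCIDRs parses a comma-separated list of CIDR prefixes.
// The comma-separated format is an assumption of this sketch.
func parseCIDRs(list string) ([]netip.Prefix, error) {
	var out []netip.Prefix
	for _, s := range strings.Split(list, ",") {
		s = strings.TrimSpace(s)
		if s == "" {
			continue
		}
		p, err := netip.ParsePrefix(s)
		if err != nil {
			return nil, fmt.Errorf("bad CIDR %q: %w", s, err)
		}
		out = append(out, p.Masked())
	}
	return out, nil
}

// isTrusted reports whether the remote IP falls within any trusted prefix.
// Unparseable addresses are never trusted.
func isTrusted(trusted []netip.Prefix, remote string) bool {
	addr, err := netip.ParseAddr(remote)
	if err != nil {
		return false
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	for _, p := range trusted {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	trusted, err := parseCIDRs("10.0.0.0/8, fd00::/8")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(isTrusted(trusted, "10.1.2.3"), isTrusted(trusted, "192.168.1.1"))
}
```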

Fixes #19282

Signed-off-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
2026-04-08 18:47:52 -04:00
Brad Fitzpatrick
647deed2d9 misc: add install-git-hooks.go and git hook for Change-Id tracking
Add misc/install-git-hooks.go and misc/git_hook/ to the OSS repo,
adapted from the corp repo. The primary motivation is Change-Id
generation in commit messages, which provides a persistent identifier
for a change across cherry-picks between branches.

The installer uses "git rev-parse --git-common-dir" instead of go-git
to find the hooks directory, avoiding a new direct dependency while
still supporting worktrees.

Hooks included:
- commit-msg: adds Change-Id trailer
- pre-commit: blocks NOCOMMIT / DO NOT SUBMIT markers
- pre-push: blocks local-directory replace directives in go.mod
- post-checkout: warns when the hook binary is outdated

Also update docs/commit-messages.md to reflect that Change-Id is no
longer optional in the OSS repo.

Updates tailscale/corp#39860

Change-Id: I09066b889118840c0ec6995cc03a9cf464740ffa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 15:10:53 -07:00
Nathan Perry
33cd8ea86b tool/goexe: refactor to use windows_sys
Updates #19255

Signed-off-by: Nathan Perry <nathan@tailscale.com>
Change-Id: Idf69f23b5a61417d5fa3638a276d64856a6a6964
2026-04-08 14:08:30 -07:00
Brad Fitzpatrick
8a9840d6a8 tool: replace go.cmd with a 19KB Rust go.exe wrapper
go.cmd used cmd.exe to invoke PowerShell, which mangled arguments:
cmd.exe treats ^ as an escape character (so -run "^$" became -run "$",
running all tests instead of none) and = signs also caused issues in
the PowerShell→cmd.exe argument passing layer.

Replace it with a tiny no_std Rust binary (19KB, 32-bit x86 for
universal Windows compat: x86/x64/ARM64) that directly invokes the
Tailscale Go toolchain via CreateProcessW. The raw command line from
GetCommandLineW is passed through to CreateProcessW with only argv[0]
replaced, so arguments are never parsed or re-escaped.

The binary also handles first-run toolchain download natively using
curl.exe and tar.exe (both ship with Windows 10+), so PowerShell is
no longer required for normal operation. The PowerShell fallback is
only used for the rare TS_USE_GOCROSS=1 path.

PowerShell prefers go.exe over go.cmd when resolving ./tool/go, so
this is a drop-in replacement.

With go.exe in place, the CI can use the natural -bench=. -benchtime=1x
-run="^$" flags directly.

Also removes tool/go-win.ps1 which is now unused.

Updates #19255

Change-Id: I80da23285b74796e7694b89cff29a9fa0eaa6281
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 14:08:30 -07:00