2687 Commits

Author SHA1 Message Date
Brad Fitzpatrick
f289f7e77c tstest/natlab/vmtest,cmd/tta: add TestSiteToSite
Verifies that site-to-site Tailscale subnet routing with
--snat-subnet-routes=false preserves the original source IP
end-to-end.

Topology: two sites, each with a Linux subnet router on a NATted WAN
plus an internal LAN, and a non-Tailscale backend on each LAN. Backends
are given static routes pointing to their local subnet router for the
remote site's prefix; an HTTP GET from backend-a to backend-b over
Tailscale returns a body containing backend-a's LAN IP.

Adds the supporting vmtest.SNATSubnetRoutes NodeOption and plumbs
snat-subnet-routes through TTA's /up handler. The webserver started by
vmtest.WebServer now also echoes the remote IP, for the preservation
assertion.

Adds a /add-route TTA endpoint (Linux-only for now) and a vmtest
Env.AddRoute helper so the test can install the backend static routes
through TTA rather than needing a host SSH key and debug NIC.

ensureGokrazy now always rebuilds the natlab qcow2 (once per test
process, via sync.Once) so the test picks up the new TTA and webserver
behavior.

This is pulled out of a larger pending change that adds FreeBSD
site-to-site subnet routing support; figured we should have at least
the Linux test covering what works today.

Updates #5573

Change-Id: I881c55b0f118ac9094546b5fbe68dddf179bb042
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-22 12:11:30 -07:00
Fernando Serboncini
81fbcc1ac8
cmd/tsnet-proxy: add tsnet-based port proxy tool (#19468)
Exposes a local port on the tailnet under a chosen hostname. Raw TCP by
default; --http or --https reverse-proxy with Tailscale-User-* identity
headers from WhoIs, matching tailscaled's serve header conventions.

Useful as a one-shot to put a dev server on the tailnet.

Fixes #19467

Change-Id: I79f63cfbbedf7e40cf0f1f51cbae8df86ae90cdf

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-04-22 13:34:18 -04:00
James 'zofrex' Sanderson
ffae275d4d
ipn/ipnlocal,tailcfg: add /debug/tka c2n endpoint (#19198)
Updates tailscale/corp#35015

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
2026-04-20 16:00:03 +01:00
BeckyPauley
b239e92eb6
cmd/k8s-operator: add e2e test setup and l7 ingress test for multi-tailnet (#19426)
This change adds setup for a second tailnet to enable multi-tailnet e2e
tests. When running against devcontrol, a second tailnet is created via the
API. Otherwise, credentials are read from SECOND_TS_API_CLIENT_SECRET.

Also adds an l7 HA Ingress test for multi-tailnet.

Fixes tailscale/corp#37498

Signed-off-by: Becky Pauley <becky@tailscale.com>
2026-04-17 17:03:25 +01:00
Andrew Dunham
d52ae45e9b cmd/cloner: deep-clone pointer elements in map-of-slice values
The cloner's codegen for map[K][]*V fields was doing a shallow
append (copying pointer values) instead of cloning each element.
This meant that cloned structs aliased the original's pointed-to
values through the map's slice entries.

Mirror the existing standalone-slice logic that checks
ContainsPointers(sliceType.Elem()) and generates per-element
cloning for pointer, interface, and struct types.

Regenerate net/dns and tailcfg which both had affected
map[...][]*dnstype.Resolver fields.

Fixes #19284

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2026-04-17 11:36:05 -04:00
Bjorn Stange
47ecbe5845
cmd/k8s-operator: add priorityClassName support to helm chart (#19236)
Expose priorityClassName in the operator Helm chart values so that
users can configure the operator deployment with a Kubernetes
PriorityClass. This prevents the operator pods from being preempted
by lower-priority workloads.

Fixes #19235

Signed-off-by: Bjorn Stange <bjorn.stange@expel.io>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 12:57:12 +01:00
Brad Fitzpatrick
50d7176333 control/tsp, cmd/tsp: add low-level Tailscale protocol client and tool
Add a new control/tsp package providing a client for speaking the
Tailscale protocol to a coordination server over Noise, along with a
cmd/tsp binary exposing it as a low-level composable tool for
generating keys, registering nodes, and issuing map requests.

Previously developed out-of-tree at github.com/bradfitz/tsp; imported
here without git history.

Updates #12542

Change-Id: I6ad21143c4aefe8939d4a46ae65b2184173bf69f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-16 20:00:25 -07:00
David Bond
eea39eaf52
cmd/k8s-operator: add affinity rules to DNSConfig (#19360)
This commit modifies the `DNSConfig` custom resource to allow the
user to specify affinity rules on the nameserver pods.

Updates: https://github.com/tailscale/tailscale/issues/18556

Signed-off-by: David Bond <davidsbond93@gmail.com>
2026-04-15 22:39:04 +01:00
Jonathan Nobels
acc43356c6
control/controlclient: enable request signatures on macOS (#19317)
fixes tailscale/corp#39422

Updates tailscale/certstore for properly macOS support and
builds the request signing support into macOS builds.  iOS and builds
that do not use cGo are omitted.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-04-15 14:11:14 -04:00
Tom Meadows
5eb0b4be31
cmd/containerboot,cmd/k8s-proxy,kube: add authkey renewal to k8s-proxy (#19221)
* kube/authkey,cmd/containerboot: extract shared auth key reissue package

Move auth key reissue logic (set marker, wait for new key, clear marker,
read config) into a shared kube/authkey package and update containerboot
to use it. No behaviour change.

Updates #14080

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

* kube/authkey,kube/state,cmd/containerboot: preserve device_id across restarts

Stop clearing device_id, device_fqdn, and device_ips from state on startup.
These keys are now preserved across restarts so the operator can track
device identity. Expand ClearReissueAuthKey to clear device state and
tailscaled profile data when performing a full auth key reissue.

Updates #14080

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

* cmd/containerboot: use root context for auth key reissue wait

Pass the root context instead of bootCtx to setAndWaitForAuthKeyReissue.
The 60-second bootCtx timeout was cancelling the reissue wait before the
operator had time to respond, causing the pod to crash-loop.

Updates #14080

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

* cmd/k8s-proxy: add auth key renewal support

Add auth key reissue handling to k8s-proxy, mirroring containerboot.
When the proxy detects an auth failure (login-state health warning or
NeedsLogin state), it disconnects from control, signals the operator
via the state Secret, waits for a new key, clears stale state, and
exits so Kubernetes restarts the pod with the new key.

A health watcher goroutine runs alongside ts.Up() to short-circuit
the startup timeout on terminal auth failures.

Updates #14080

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

---------

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2026-04-15 16:13:46 +01:00
Brad Fitzpatrick
9fbe4b3ed2 all: fix six tests that failed with -count=2
Avery found a bunch of tests that fail with -count=2.

Updates tailscale/corp#40176 (tracks making our CI detect them)

Change-Id: Ie3e4398070dd92e4fe0146badddf1254749cca20
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 18:52:57 -07:00
Brad Fitzpatrick
a97850f7e2 cmd/derper: fix TestLookupMetric to pass when run alone
TestLookupMetric was added in e8d140654 (2023-08-17) without
initializing the dnsCache and dnsCacheBytes globals. When run in
isolation, handleBootstrapDNS writes a nil body (from the
uninitialized dnsCacheBytes), causing getBootstrapDNS to fail
decoding an empty response with EOF.

Add a setDNSCache test helper that stores the dnsEntryMap, marshals
dnsCacheBytes, and registers a t.Cleanup to nil both out, so tests
that forget to call it will hit the dnsCache-nil fatal in
getBootstrapDNS rather than silently depending on prior test state.

Also add AssertNotParallel and a dnsCache-nil fatal check to
getBootstrapDNS, the central helper all bootstrap DNS tests flow
through, to prevent future tests from running in parallel (they
all mutate package-level DNS caches and metrics) and to give a
clear error if a test forgets to initialize the DNS caches.

Fixes #19388

Change-Id: I8ad454ec6026c71f13ecfa14d25925df5478b908
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 17:20:43 -07:00
Brad Fitzpatrick
6500d3c3f8 cmd/containerboot: mark TestContainerBoot as flaky
Updates #19380

Change-Id: Ib1be53836e37224265d10abd0c2213644ea54d64
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 15:21:42 -07:00
Jordan Whited
929ad51be0 cmd/derper: mark rate-config flag as experimental and unstable
Updates tailscale/corp#38509

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-13 12:24:59 -07:00
Mike O'Driscoll
ca5db865b4
cmd/derper,derp: add --rate-config file with SIGHUP reload (#19314)
Add a --rate-config flag pointing to a JSON file for per-client receive
rate limits (bytes/sec and burst bytes). The config is reloaded on SIGHUP,
updating all existing client connections live. The --per-client-rate-limit
and --per-client-rate-burst flags are removed in favor of the config file.

In derpserver, rate limiting uses an atomic.Pointer[xrate.Limiter] per
client: nil when unlimited or mesh (zero overhead), non-nil when
rate-limited.

Document that clientSet.activeClient Store operations require Server.mu.

Updates tailscale/corp#38509

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-04-10 18:37:54 -04:00
Fernando Serboncini
6b7caaf7ee
cmd/k8s-operator: set PreferDualStack on ProxyGroup egress services (#19194)
On dual-stack clusters defaulting to IPv6, the ProxyGroup egress
service only got an IPv6 address, which causes request failures.
Individual egress proxies already set PreferDualStack correctly.

Fixes: #18768

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-04-09 13:33:39 -04:00
David Bond
85d6ba9473
cmd/k8s-operator: migrate to tailscale-client-go-v2 (#19010)
This commit modifies the kubernetes operator to use the `tailscale-client-go-v2`
package instead of the internal tailscale client it was previously using. This
now gives us the ability to expand out custom resources and features as they
become available via the API module.

The tailnet reconciler has also been modified to manage clients as tailnets
are created and removed, providing each subsequent reconciler with a single
`ClientProvider` that obtains a tailscale client for the respective tailnet
by name, or the operator's default when presented with a blank string.

Fixes: https://github.com/tailscale/corp/issues/38418

Signed-off-by: David Bond <davidsbond93@gmail.com>
2026-04-09 14:39:46 +01:00
Brad Fitzpatrick
ec0b23a21f vmtest: add VM-based integration test framework
Add tstest/natlab/vmtest, a high-level framework for running multi-VM
integration tests with mixed OS types (gokrazy + Ubuntu/Debian cloud
images) connected via natlab's vnet virtual network.

The vmtest package provides:
  - Env type that orchestrates vnet, QEMU processes, and agent connections
  - OS image support (Gokrazy, Ubuntu2404, Debian12) with download/cache
  - QEMU launch per OS type (microvm for gokrazy, q35+KVM for cloud)
  - Cloud-init seed ISO generation with network-config for multi-NIC
  - Cross-compilation of test binaries for cloud VMs
  - Debug SSH NIC on cloud VMs for interactive debugging
  - Test helpers: ApproveRoutes, HTTPGet, TailscalePing, DumpStatus,
    WaitForPeerRoute, SSHExec

TTA enhancements (cmd/tta):
  - Parameterize /up (accept-routes, advertise-routes, snat-subnet-routes)
  - Add /set, /start-webserver, /http-get endpoints
  - /http-get uses local.Client.UserDial for Tailscale-routed requests
  - Fix /ping for non-gokrazy systems

TestSubnetRouter exercises a 3-VM subnet router scenario:
  client (gokrazy) → subnet-router (Ubuntu, dual-NIC) → backend (gokrazy)
  Verifies HTTP access to the backend webserver through the Tailscale
  subnet route. Passes in ~30 seconds.

Updates tailscale/tailscale#13038

Change-Id: I165b64af241d37f5f5870e796a52502fc56146fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 17:24:18 -07:00
Brad Fitzpatrick
a182b864ac tsd, all: add Sys.ExtraRootCAs, plumb through TLS dial paths
Add ExtraRootCAs *x509.CertPool to tsd.System and plumb it through
the control client, noise transport, DERP, and wgengine layers so
that platforms like Android can inject user-installed CA certificates
into Go's TLS verification.

tlsdial.Config now honors base.RootCAs as additional trusted roots,
tried after system roots and before the baked-in LetsEncrypt fallback.
SetConfigExpectedCert gets the same treatment for domain-fronted DERP.

The Android client will set sys.ExtraRootCAs with a pool built from
x509.SystemCertPool + user-installed certs obtained via the Android
KeyStore API, replacing the current SSL_CERT_DIR environment variable
approach.

Updates #8085

Change-Id: Iecce0fd140cd5aa0331b124e55a7045e24d8e0c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-07 18:10:54 -07:00
Doug Bryant
8df8e9cb6e cmd/containerboot: rate-limit IPN bus netmap notifications
CPU profiling a containerboot subnet router on a large tailnet showed
roughly 40% of CPU spent in serveWatchIPNBus JSON-encoding the full
netmap on every update. containerboot only reads SelfNode fields from
those notifications (and does a peer lookup when TailnetTargetFQDN is
set), so it does not need every intermediate netmap delta.

Set ipn.NotifyRateLimit on all three WatchIPNBus calls so netmap
notifications are coalesced to one per 3s. Initial-state delivery is
unaffected since the rateLimitingBusSender flushes the first send
immediately.

Updates #cleanup

Signed-off-by: Doug Bryant <dougbryant@anthropic.com>
2026-04-07 16:03:15 -07:00
Mike O'Driscoll
e689283ebd
derp/derpserver: add per-connection receive rate limiting (#19222)
Add server-side per-client bandwidth enforcement using TCP backpressure.
When configured, the server calls WaitN after reading each DERP frame,
which delays the next read, fills the TCP receive buffer, shrinks
the TCP window, and naturally throttles the sender — no packets are dropped.

- Rate limiting is on the receive (inbound) side, which is what an abusive
  client controls
- Mesh peers are exempt since they are trusted infrastructure
- The burst size is at least MaxPacketSize (64KB) to ensure a
  single max-size frame can always be processed

Also refactors sclient to store a context.Context directly instead of a
done channel, which simplifies the rate limiter's WaitN call.

Flags added to cmd/derper:
  --per-client-rate-limit (bytes/sec, default 0 = unlimited)
  --per-client-rate-burst (bytes, default 0 = 2x rate limit)

Example for 10Mbps: --per-client-rate-limit=1250000

Updates #38509

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-04-07 18:40:41 -04:00
Brad Fitzpatrick
8a7e160a6e ipn/desktop: move behind feature/condregister
Move the ipn/desktop blank import from cmd/tailscaled/tailscaled_windows.go
into feature/condregister/maybe_desktop_sessions.go, consistent with how
all other modular features are registered. tailscaled already imports
feature/condregister, so it still gets ipn/desktop on Windows.

Updates #12614

Change-Id: I92418c4bf0e67f0ab40542e47584762ac0ffa2b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-07 11:37:47 -07:00
Brad Fitzpatrick
1b5b43787c ipn/localapi, cli, clientmetric: add ipnbus feature tag; fix omit.go stub
Add a new "ipnbus" build feature tag so the watch-ipn-bus LocalAPI
endpoint can be independently controlled, rather than being gated
behind HasDebug || HasServe. Minimal/embedded builds that omit both
debug and serve were getting 404s on watch-ipn-bus, breaking
"tailscale up --authkey=..." and other CLI flows that depend on
WatchIPNBus.

In the CLI, check buildfeatures.HasIPNBus before attempting to watch
the IPN bus in "tailscale up"/"tailscale login", and exit early with
an informational message when the feature is omitted.

Also add the missing NewCounterFunc stub to clientmetric/omit.go,
which caused compilation errors when building with
ts_omit_clientmetrics and netstack enabled.

Fixes #19240

Change-Id: I2e3c69a72fc50fa02542b91b8a54859618a463d1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-07 10:22:37 -07:00
Kristoffer Dalby
dd3b613787 ssh: replace tempfork with tailscale/gliderssh
Brings in a newer version of Gliderlabs SSH with added socket forwarding support.

Fixes #12409
Fixes #5295

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2026-04-07 11:59:38 +01:00
Brad Fitzpatrick
86f42ea87b cmd/cloner, cmd/viewer: handle named map/slice types with Clone/View methods
The cloner and viewer code generators didn't handle named types
with basic underlying types (map/slice) that have their own Clone
or View methods. For example, a type like:

    type Map map[string]any
    func (m Map) Clone() Map { ... }
    func (m Map) View() MapView { ... }

When used as a struct field, the cloner would descend into the
underlying map[string]any and fail because it can't clone the any
(interface{}) value type. Similarly, the viewer would try to create
a MapFnOf view and fail.

Fix the cloner to check for a Clone method on the named type
before falling through to the underlying type handling.

Fix the viewer to check for a View method on named map/slice types,
so the type author can provide a purpose-built safe view that
doesn't leak raw any values. Named map/slice types without a View
method fall through to normal handling, which correctly rejects
types like map[string]any as unsupported.

Updates tailscale/corp#39502 (needed by tailscale/corp#39594)

Change-Id: Iaef0192a221e02b4b8e409c99ef8398090327744
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-05 20:20:32 -07:00
Brad Fitzpatrick
5ef3713c9f cmd/vet: add subtestnames analyzer; fix all existing violations
Add a new vet analyzer that checks t.Run subtest names don't contain
characters requiring quoting when re-running via "go test -run". This
enforces the style guide rule: don't use spaces or punctuation in
subtest names.

The analyzer flags:
- Direct t.Run calls with string literal names containing spaces,
  regex metacharacters, quotes, or other problematic characters
- Table-driven t.Run(tt.name, ...) calls where tt ranges over a
  slice/map literal with bad name field values

Also fix all 978 existing violations across 81 test files, replacing
spaces with hyphens and shortening long sentence-like names to concise
hyphenated forms.

Updates #19242

Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-05 15:52:51 -07:00
M. J. Fromberger
eaa5d9df4b
client,cmd/tailscale,ipn/{ipnlocal,localapi}: add debug CLI command to clear netmap caches (#19213)
This is a follow-up to #19117, adding a debug CLI command allowing the operator
to explicitly discard cached netmap data, as a safety and recovery measure.

Updates #12639

Change-Id: I5c3c47c0204754b9c8e526a4ff8f69d6974db6d0
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-04-02 12:06:39 -07:00
Naman Sood
d6b626f5bb
tstest: add test for connectivity to off-tailnet CGNAT endpoints
This test is currently known-broken, but work is underway to fix it.
tailscale/corp#36270 tracks this work.

Updates tailscale/corp#36270
Fixes tailscale/corp#36272

Signed-off-by: Naman Sood <mail@nsood.in>
2026-04-02 14:44:40 -04:00
BeckyPauley
e82ffe03ad
cmd/k8s-operator: add further E2E tests for Ingress (#19219)
* cmd/k8s-operator/e2e: add L7 HA ingress test

Change-Id: Ic017e4a7e3affbc3e2a87b9b6b9c38afd65f32ed
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>

* cmd/k8s-operator: add further E2E tests for Ingress (#34833)

This change adds E2E tests for L3 HA Ingress and L7 Ingress (Standalone and
HA). Updates the existing L3 Ingress test to use the Service's Magic DNS
name to test connectivity.

Also refactors test setup to set TS_DEBUG_ACME_DIRECTORY_URL only for tests
running against devcontrol, and updates the Kind node image from v1.30.0 to
v1.35.0.

Fixes tailscale/corp#34833

Signed-off-by: Becky Pauley <becky@tailscale.com>

---------

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: Becky Pauley <becky@tailscale.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
2026-04-02 15:49:40 +01:00
Alex Chan
5b62f98894 ipn, cmd/tailscale/cli: allow setting FQDN sans dot as an exit node
In #10057, @seigel pointed out an inconsistency in the help text for
`exit-node list` and `set --exit-node`:

1.  Use `tailscale exit-node list`, which has a column titled "hostname"
    and tells you that you can use a hostname with `set --exit-node`:

    ```console
    $ tailscale exit-node list
     IP                  HOSTNAME                               COUNTRY            CITY                   STATUS
     100.98.193.6        linode-vps.tailfa84dd.ts.net           -                  -                      -
    […]
     100.93.242.75       ua-iev-wg-001.mullvad.ts.net           Ukraine            Kyiv                   -

    # To view the complete list of exit nodes for a country, use `tailscale exit-node list --filter=` followed by the country name.
    # To use an exit node, use `tailscale set --exit-node=` followed by the hostname or IP.
    # To have Tailscale suggest an exit node, use `tailscale exit-node suggest`.
    ```

    (This is the same format hostnames are presented in the admin
    console.)

2.  Try copy/pasting a hostname into `set --exit-node`:

    ```console
    $ tailscale set --exit-node=linode-vps.tailfa84dd.ts.net
    invalid value "linode-vps.tailfa84dd.ts.net" for --exit-node; must be IP or unique node name
    ```

3.  Note that the command allows some hostnames, if they're from nodes
    in a different tailnet:

    ```console
    $ tailscale set --exit-node= ua-iev-wg-001.mullvad.ts.net
    $ echo $?
    0
    ```

This patch addresses the inconsistency in two ways:

1.  Allow using `tailscale set --exit-node=` with an FQDN that's missing
    the trailing dot, matching the formatting used in `exit-node list`
    and the admin console.

2.  Make the description of valid exit nodes consistent across commands
    ("hostname or IP").

Updates #10057

Change-Id: If5d74f950cc1a9cc4b0ebc0c2f2d70689ffe4d73
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-01 20:42:35 +01:00
Alex Chan
4ffb92d7f6 tka: refer consistently to "DisablementValues"
This avoids putting "DisablementSecrets" in the JSON output from
`tailscale lock log`, which is potentially scary to somebody who doesn't
understand the distinction.

AUMs are stored and transmitted in CBOR-encoded format, which uses an
integer rather than a string key, so this doesn't break already-created
TKAs.

Fixes #19189

Change-Id: I15b4e81a7cef724a450bafcfa0b938da223c78c9
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-01 19:09:22 +01:00
Alex Chan
edb2be1a01 cmd/tailscale: improve tailscale lock error message if no keys
Previously, running `add/remove/revoke-keys` without passing any keys
would fail with an unhelpful error:

```console
$ tailscale lock revoke-keys
generation of recovery AUM failed: sending generate-recovery-aum: 500 Internal Server Error: no provided key is currently trusted
```

or

```console
$ tailscale lock revoke-keys
generation of recovery AUM failed: sending generate-recovery-aum: 500 Internal Server Error: network-lock is not active
```

Now they fail with a more useful error:

```console
$ tailscale lock revoke-keys
missing argument, expected one or more tailnet lock keys
```

Fixes #19130

Change-Id: I9d81fe2f5b92a335854e71cbc6928e7e77e537e3
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-03-29 09:28:52 +01:00
KevinLiang10
45f989f52a
ipn/ipnlocal: warn incompatibility between no-snat-routes and exitnode (#19023)
* ipn/ipnlocal: warn incompatibility between no-snat-routes and exitnode

This commit adds a warning to health check when the --snat-subnet-routes=false flag for subnet router is
set alone side --advertise-exit-node=true. These two would conflict with each other and result internet-bound
traffic from peers using this exit node no masqueraded to the node's source IP and fail to route return
packets back. The described combination is not valid until we figure out a way to separate exitnode masquerade rule and skip it for subnet routes.

Updates #18725

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* use date instead of for now to clarify effectivness

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

---------

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2026-03-26 12:36:31 -04:00
Jordan Whited
9c36a71a90 feature/*,net/tstun: add tundev_txq_drops clientmetric on Linux
By polling RTM_GETSTATS via netlink. RTM_GETSTATS is a relatively
efficient and targeted (single device) polling method available since
Linux v4.7.

The tundevstats "feature" can be extended to other platforms in the
future, and it's trivial to add new rtnl_link_stats64 counters on
Linux.

Updates tailscale/corp#38181

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-03-24 09:44:58 -07:00
Alex Chan
302e49dc4e cmd/tailscale/cli: add a debug command to print the statedir
Example:

```console
$ tailscale debug statedir
/tmp/ts/node1
```

Updates #18019

Change-Id: I7c93c94179bd7b56d0fa8fe57a9129df05c2c1df
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-03-24 15:16:43 +00:00
Mike O'Driscoll
1403920367
derp,types,util: use bufio Peek+Discard for allocation-free fast reads (#19067)
Replace byte-at-a-time ReadByte loops with Peek+Discard in the DERP
read path. Peek returns a slice into bufio's internal buffer without
allocating, and Discard advances the read pointer without copying.

Introduce util/bufiox with a BufferedReader interface and ReadFull
helper that uses Peek+copy+Discard as an allocation-free alternative
to io.ReadFull.

  - derp.ReadFrameHeader: replace 5× ReadByte with Peek(5)+Discard(5),
    reading the frame type and length directly from the peeked slice.
    Remove now-unused readUint32 helper.

    name                  old ns/op  new ns/op  speedup
    ReadFrameHeader-8     24.2       12.4       ~2x
    (0 allocs/op in both)

  - key.NodePublic.ReadRawWithoutAllocating: replace 32× ReadByte with
    bufiox.ReadFull. Addresses the "Dear future" comment about switching
    away from byte-at-a-time reads once a non-escaping alternative exists.

    name                              old ns/op  new ns/op  speedup
    NodeReadRawWithoutAllocating-8    140        43.6       ~3.2x
    (0 allocs/op in both)

  - derpserver.handleFramePing: replace io.ReadFull with bufiox.ReadFull.

Updates tailscale/corp#38509

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-03-24 10:52:20 -04:00
Alex Chan
1d0fde6fc2 all: use bart.Lite instead of bart.Table where appropriate
When we don't care about the payload value and are just checking whether
a set contains an IP/prefix, we can use `bart.Lite` for the same lookup
times but a lower memory footprint.

Fixes #19075

Change-Id: Ia709e8b718666cc61ea56eac1066467ae0b6e86c
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-03-24 14:45:23 +00:00
Alex Chan
67496e14c6 cmd/tailscale/cli: fix a typo in the whois help text
Updates #cleanup

Change-Id: I739052548b81a94c4e4997d15883ee755c57df3c
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-03-23 15:05:11 +00:00
Alex Chan
34267d5afa cmd/tailscale: print a helpful error for Taildrive CLI on macOS GUI
Rather than printing `unknown subcommand: drive` for any Taildrive
commands run in the macOS GUI, print an error message directing the user
to the GUI client and the docs page.

Updates #17210
Fixes #18823

Change-Id: I6435007b5911baee79274b56e3ee101e6bb6d809
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-03-23 09:27:27 +00:00
Claus Lensbøl
85bb5f84a5
wgengine/magicsock,control/controlclient: do not overwrite discokey with old key (#18606)
When a client starts up without being able to connect to control, it
sends its discoKey to other nodes it wants to communicate with over
TSMP. This disco key will be a newer key than the one control knows
about.

If the client that can connect to control gets a full netmap, ensure
that the disco key for the node not connected to control is not
overwritten with the stale key control knows about.

This is implemented through keeping track of mapSession and use that for
the discokey injection if it is available. This ensures that we are not
constantly resetting the wireguard connection when getting the wrong
keys from control.

This is implemented as:
 - If the key is received via TSMP:
   - Set lastSeen for the peer to now()
   - Set online for the peer to false
 - When processing new keys, only accept keys where either:
   - Peer is online
   - lastSeen is newer than existing last seen

If mapSession is not available, as in we are not yet connected to
control, punt down the disco key injection to magicsock.

Ideally, we will want to have mapSession be long lived at some point in
the near future so we only need to inject keys in one location and then
also use that for testing and loading the cache, but that is a yak for
another PR.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-20 08:56:27 -04:00
Mike O'Driscoll
4e88d231d5
control,health,ipn: move IP forwarding check to health tracker (#19007)
Currently IP forwarding health check is done on sending MapRequests.

Move ip forwarding to the health service to gain the benefits
of the health tracker and perodic monitoring out of band from
the MapRequest path. ipnlocal now provides a closure to
the health service to provide the check if forwarding is broken.

Removed `skipIPForwardingCheck` from controlclient/direct.go,
it wasn't being used as the comments describe it, that check
has moved to ipnlocal for the closure to the health tracker.

Updates #18976

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-03-18 16:24:12 -04:00
Michael Ben-Ami
ce7789071f feature/conn25: add NATing support with flow caching
Introduce a datapathHandler that implements hooks that will
receive packets from the tstun.Wrapper. This commit does not wire
those up just yet.

Perform DNAT from Magic IP to Transit IP on outbound flows on clients,
and reverse SNAT in the reverse direction.

Perform DNAT from Transit IP to final destination IP on outbound flows
on connectors, and reverse SNAT in the reverse direction.

Introduce FlowTable to cache validated flows by 5-tuple for fast lookups
after the first packet.

Flow expiration is not covered, and is intended as future work before
the feature is officially released.

Fixes tailscale/corp#34249
Fixes tailscale/corp#35995

Co-authored-by: Fran Bull <fran@tailscale.com>
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-03-18 11:49:47 -04:00
Tom Proctor
621f71981c
cmd/k8s-operator: fix Service reconcile triggers for default ProxyClass (#18983)
The e2e ingress test was very occasionally flaky. On looking at operator
logs from one failure, you can see the default ProxyClass was not ready
before the first reconcile loop for the exposed Service. The ProxyClass
became ready soon after, but no additional reconciles were triggered for
the exposed Service because we only triggered reconciles for Services
that explicitly named their ProxyClass.

This change adds additional list API calls for when it's the default
ProxyClass that's been updated in order to catch Services that use it by
default. It also adds indexes for the fields we need to search on to
ensure the list is efficient.

Fixes tailscale/corp#37533

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2026-03-13 14:31:16 +00:00
Tom Proctor
95a135ead1
cmd/{containerboot,k8s-operator}: reissue auth keys for broken proxies (#16450)
Adds logic for containerboot to signal that it can't auth, so the
operator can reissue a new auth key. This only applies when running with
a config file and with a kube state store.

If the operator sees reissue_authkey in a state Secret, it will create a
new auth key iff the config has no auth key or its auth key matches the
value of reissue_authkey from the state Secret. This is to ensure we
don't reissue auth keys in a tight loop if the proxy is slow to start or
failing for some other reason. The reissue logic also uses a burstable
rate limiter to ensure there's no way a terminally misconfigured
or buggy operator can automatically generate new auth keys in a tight loop.

Additional implementation details (ChaosInTheCRD):

- Added `ipn.NotifyInitialHealthState` to ipn watcher, to ensure that
  `n.Health` is populated when notify's are returned.
- on auth failure, containerboot:
  - Disconnects from control server
  - Sets reissue_authkey marker in state Secret with the failing key
  - Polls config file for new auth key (10 minute timeout)
  - Restarts after receiving new key to apply it

- modified operator's reissue logic slightly:
  - Deletes old device from tailnet before creating new key
  - Rate limiting: 1 key per 30s with initial burst equal to replica count
  - In-flight tracking (authKeyReissuing map) prevents duplicate API calls
    across reconcile loops

Updates #14080

Change-Id: I6982f8e741932a6891f2f48a2936f7f6a455317f


(cherry picked from commit 969927c47c3d4de05e90f5b26a6d8d931c5ceed4)

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Co-authored-by: chaosinthecrd <tom@tmlabs.co.uk>
2026-03-11 10:25:57 +00:00
Brad Fitzpatrick
f905871fb1 ipn/ipnlocal, feature/ssh: move SSH code out of LocalBackend to feature
This makes tsnet apps not depend on x/crypto/ssh and locks that in with a test.

It also paves the wave for tsnet apps to opt-in to SSH support via a
blank feature import in the future.

Updates #12614

Change-Id: Ica85628f89c8f015413b074f5001b82b27c953a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-10 17:27:17 -07:00
David Bond
9522619031
cmd/k8s-operator: use correct tailnet client for L7 & L3 ingresses (#18749)
* cmd/k8s-operator: use correct tailnet client for L7 & L3 ingresses

This commit fixes a bug when using multi-tailnet within the operator
to spin up L7 & L3 ingresses where the client used to create the
tailscale services was not switching depending on the tailnet used
by the proxygroup backing the service/ingress.

Updates: https://github.com/tailscale/corp/issues/34561

Signed-off-by: David Bond <davidsbond93@gmail.com>

* cmd/k8s-operator: adding server url to proxygroups when a custom tailnet has been specified

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
(cherry picked from commit 3b21ac5504e713e32dfcd43d9ee21e7e712ac200)

---------

Signed-off-by: David Bond <davidsbond93@gmail.com>
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Co-authored-by: chaosinthecrd <tom@tmlabs.co.uk>
2026-03-10 10:33:55 +00:00
Fran Bull
a4614d7d17 appc,feature/conn25: conn25: send address assignments to connector
After we intercept a DNS response and assign magic and transit addresses
we must communicate the assignment to our connector so that it can
direct traffic when it arrives.

Use the recently added peerapi endpoint to send the addresses.

Updates tailscale/corp#34258
Signed-off-by: Fran Bull <fran@tailscale.com>
2026-03-09 14:10:38 -07:00
Brad Fitzpatrick
e400d5aa7b cmd/testwrapper: make test tolerant of a GOEXPERIMENT being set
Otherwise it generates an syntactically invalid go.mod file
and subsequently fails.

Updates #18884

Change-Id: I1a0ea17a57b2a37bde3770187e1a6e2d8aa55bfe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-06 14:05:35 -08:00
Brad Fitzpatrick
bd2a2d53d3 all: use Go 1.26 things, run most gofix modernizers
I omitted a lot of the min/max modernizers because they didn't
result in more clear code.

Some of it's older "for x := range 123".

Also: errors.AsType, any, fmt.Appendf, etc.

Updates #18682

Change-Id: I83a451577f33877f962766a5b65ce86f7696471c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-06 13:32:03 -08:00
Brad Fitzpatrick
2a64c03c95 types/ptr: deprecate ptr.To, use Go 1.26 new
Updates #18682

Change-Id: I62f6aa0de2a15ef8c1435032c6aa74a181c25f8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-05 20:13:18 -08:00