tailscale

mirror of https://github.com/tailscale/tailscale.git synced 2026-05-05 04:06:35 +02:00

Author	SHA1	Message	Date
Brad Fitzpatrick	89a78dc9b7	client/local, ipn/localapi, ipn/ipnlocal: add PeerByID Add a narrow LocalAPI accessor and matching client/LocalBackend method to look up a single peer's current full [tailcfg.Node] by NodeID, in O(1) time on the daemon side, without fetching the entire netmap. Useful for callers that need the latest state of a single peer (e.g. in response to a peer-mutation event on the IPN bus) without paying for a full netmap fetch. Updates #12542 Change-Id: I1cb2d350e6ad846a5dabc1f5368dfc8121387f7c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-01 06:20:46 -07:00
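A minimal sketch of the single-peer lookup described in the PeerByID commit, assuming the daemon keeps its live peers in a map keyed by node ID. The `NodeID`, `Node`, and `backend` types are simplified stand-ins for illustration, not the real `tailcfg` or `ipnlocal` types.

```go
package main

import "fmt"

// NodeID and Node are simplified stand-ins for tailcfg.NodeID and tailcfg.Node.
type NodeID int64

type Node struct {
	ID   NodeID
	Name string
}

// backend is a hypothetical holder of the live peer set.
type backend struct {
	peers map[NodeID]*Node
}

// PeerByID returns one peer with a single map lookup, without building
// or copying the full peer list.
func (b *backend) PeerByID(id NodeID) (*Node, bool) {
	n, ok := b.peers[id]
	return n, ok
}

func main() {
	b := &backend{peers: map[NodeID]*Node{
		1: {ID: 1, Name: "alpha"},
		2: {ID: 2, Name: "beta"},
	}}
	if n, ok := b.PeerByID(2); ok {
		fmt.Println(n.Name)
	}
}
```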
Alex Chan	cac94f51cc	ipn/ipnlocal: don't compact TKA state on startup Compacting on startup means nodes may compact at a different cadence based on whether they're long-running or restarting frequently. We already compact after every sync, which only occurs when the TKA state has changed. Waiting for TKA changes to trigger compaction on nodes means compaction will occur more consistently across a tailnet. Updates tailscale/corp#33537 Change-Id: Ia0aa6d9e5e362e9ab08450fde69772841790d5b5 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-05-01 13:27:12 +01:00
Brad Fitzpatrick	a6c5d23742	ipn, ipn/ipnlocal: add Notify.SelfChange Add a new bus signal that lets reactive consumers (containerboot, kube agents, sniproxy, tsconsensus, etc.) react to self-node updates without having to subscribe to the full netmap. Today those consumers either watch Notify.NetMap (which on large tailnets is expensive to encode and ship per watcher) or poll. SelfChange is a cheap, narrow alternative: addresses, name, key expiry, capabilities, etc. Consumers that need additional state can react to SelfChange and then fetch the relevant bits on demand via existing LocalClient methods. Producer-side, every netmap-bearing setControlClientStatus call now also publishes SelfChange. Future changes will migrate individual in-tree consumers off Notify.NetMap to this signal, and eventually gate the legacy NetMap emission to platforms whose host GUIs still require it. Updates #12542 Change-Id: I4441650b0e085d663eb6bf26a03748b7d961ca49 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-30 14:47:03 -07:00
Brad Fitzpatrick	159cf8707a	ipn/ipnlocal, all: split LocalBackend.NetMap into NetMapNoPeers / NetMapWithPeers Add two narrower accessors alongside the existing [LocalBackend.NetMap], with docs that distinguish their semantics: - NetMapNoPeers: cheap (returns the cached *netmap.NetworkMap with a possibly-stale Peers slice). For callers that only read non-Peers fields like SelfNode, DNS, PacketFilter, capabilities. - NetMapWithPeers: documented as returning an up-to-date Peers slice. For callers that genuinely need to iterate Peers or call PeerByXxx. Mark the existing NetMap deprecated and point readers at the two new accessors. NetMap, NetMapNoPeers, and NetMapWithPeers all currently return the same value (b.currentNode().NetMap()): this commit is a no-op behaviorally, just a renaming and migration of in-tree callers. A subsequent change in the same series will switch NetMapWithPeers to actually rebuild the Peers slice from the live per-node-backend peers map (O(N) per call), at which point the distinction between the two new accessors becomes load-bearing. Migrate in-tree callers to the appropriate accessor based on what fields they read: - NetMapNoPeers (most common): localapi handlers, peerapi accept, GetCertPEMWithValidity, web client noise request, doctor DNS resolver check, tsnet CertDomains/TailscaleIPs, ssh/tailssh SSH-policy/cap reads, several LocalBackend internals (isLocalIP, allowExitNodeDNSProxyToServeName, pauseForNetwork nil-check, serve config). - NetMapWithPeers: writeNetmapToDiskLocked (persist full netmap to disk for fast restart), PeerByTailscaleIP lookup. Tests still call the legacy NetMap; they'll see the deprecation warning but otherwise behave identically. 
Also add two pieces of plumbing the next change in this series will need, but which are already useful on their own: - [client/local.GetDebugResultJSON]: a generic [Client.DebugResultJSON] that decodes directly into a target type T, avoiding the marshal/unmarshal roundtrip callers otherwise need. - localapi "current-netmap" debug action: returns the current netmap (with peers) as JSON. Documented as debug-only — the netmap.NetworkMap shape is internal and may change without notice. This commit is part of a series breaking up a larger change for review; on its own it is a no-op refactor. Updates #12542 Change-Id: Idbb30707414f8da3149c44ca0273262708375b02 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-30 11:14:06 -07:00
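The idea behind a generic accessor that decodes directly into a target type T can be sketched as follows. The `decodeInto` helper is hypothetical and only illustrates the pattern; the real `GetDebugResultJSON` signature is not shown in the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeInto is a hypothetical helper: it unmarshals raw JSON directly
// into a value of type T, so callers do not need to decode into `any`
// and then marshal/unmarshal again to reach a concrete type.
func decodeInto[T any](raw []byte) (T, error) {
	var v T
	if err := json.Unmarshal(raw, &v); err != nil {
		var zero T
		return zero, err
	}
	return v, nil
}

func main() {
	keys, err := decodeInto[map[string]string]([]byte(`{"nodeA":"discoA"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(keys["nodeA"])
}
```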
Brad Fitzpatrick	f343b496c3	wgengine, all: remove LazyWG, use wireguard-go callback API for on-demand peers Replace the UAPI text protocol-based wireguard configuration with wireguard-go's new direct callback API (SetPeerLookupFunc, SetPeerByIPPacketFunc, RemoveMatchingPeers, SetPrivateKey). Instead of computing a trimmed wireguard config ahead of time upon control plane updates and pushing it via UAPI, install callbacks so wireguard-go creates peers on demand when packets arrive. This removes all the LazyWG trimming machinery: idle peer tracking, activity maps, noteRecvActivity callbacks, the KeepFullWGConfig control knob, and the ts_omit_lazywg build tag. For incoming packets, PeerLookupFunc answers wireguard-go's questions about unknown public keys by looking up the peer in the full config. For outgoing packets, PeerByIPPacketFunc (installed from LocalBackend.lookupPeerByIP) maps destination IPs to node public keys using the existing nodeByAddr index. Updates tailscale/corp#12345 Change-Id: I4cba80979ac49a1231d00a01fdba5f0c2af95dd8 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-29 19:46:19 -07:00
Claus Lensbøl	978b6a81b2	ipn/ipnlocal: always ReSTUN when starting up without a cache (#19586 ) 78627c1 introduced starting up and preserving the DERP server from cache, but also changed it so the initial ReSTUN would not fire when setting the DERPMap. Change this so when not working from a cache, the ReSTUN will always fire during startup. Updates #19585 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2026-04-29 18:56:57 -04:00
Brad Fitzpatrick	15cba0a3f6	tstest/natlab/vmtest: add TestDiscoKeyChange Add a vmtest that brings up two gokrazy nodes A and B behind two One2OneNAT networks (so direct UDP works in both directions and any slowness can't be blamed on NAT traversal), establishes a WireGuard tunnel A → B with TSMP, then rotates B's disco key four times and asserts that the data plane recovers in both directions after each rotation. All pings are TSMP (the data-plane ping; disco pings would not exercise the WireGuard tunnel itself). The five pings: 1. A → B (initial; brings up the tunnel; 30s budget) 2. B → A after rotate (LocalAPI rotate-disco-key debug action) 3. A → B after rotate (LocalAPI) 4. B → A after restart (SIGKILL; gokrazy supervisor respawns) 5. A → B after restart (SIGKILL) Each post-rotation ping gets a 15-second budget. Two unavoidable multi-second waits dominate today: - The rotate-then-a→b phase takes ~10s on main because of LazyWG. After B's WantRunning bounce, B's wgengine resets its sentActivityAt/recvActivityAt maps and trims A out of the wireguard-go config as an "idle peer"; B only re-adds A on inbound activity, by which point A's first few TSMP packets have been silently dropped at B's tundev. The bradfitz/rm_lazy_wg branch removes that trimming entirely (verified locally: this phase drops to <100ms there). - The restart phases take ~5s for wireguard-go's RekeyTimeout handshake retry. After SIGKILL+respawn the first WG handshake init from the restarted node sometimes goes into the void (likely the brief peer-removed window in the receiver's two-step maybeReconfigWireguardLocked reconfig during which the peer is absent from wireguard-go), and wg-go's 5s+jitter retransmit timer is the next opportunity to retry. That retry succeeds and the staged TSMP packet flushes. Intrinsic to the protocol's retransmit policy. Once LazyWG is removed and the first-handshake-after-reconfig race is fixed, the budget should drop to 5s. 
Supporting changes: ipn/ipnlocal: DebugRotateDiscoKey now toggles WantRunning off and back on after rotating the disco key. magicsock.Conn.RotateDiscoKey only resets local disco state; without also dropping wireguard-go session keys, peers keep encrypting with their stale per-peer session against us until their rekey timer fires (WireGuard has no data-plane signaling to invalidate sessions). Bouncing WantRunning runs the engine through Reconfig(empty) → authReconfig, which drops every peer's WG session so the next packet either way triggers a fresh handshake. ipn/ipnlocal, ipn/localapi: add a debug-only "peer-disco-keys" LocalAPI action ([LocalBackend.DebugPeerDiscoKeys]) that returns a map[NodePublic]DiscoPublic from the current netmap. Tests reach it via [local.Client.DebugResultJSON]. We do not surface disco keys via [ipnstate.PeerStatus] because adding a non-comparable [key.DiscoPublic] field there breaks reflect-based test helpers (e.g. TestFilterFormatAndSortExitNodes' use of cmp.Diff), and general LocalAPI clients have no need for disco keys. Since the debug LocalAPI is gated behind the ts_omit_debug build tag, this endpoint is automatically stripped from small binaries. cmd/tta: add /restart-tailscaled handler (Linux-only, via /proc walk) to drive the SIGKILL phase. On gokrazy the supervisor respawns tailscaled within a second. tstest/integration/testcontrol: add Server.AllOnline. When set, every peer entry in MapResponses is marked Online=true. Several disco-key handling fast paths in controlclient and wgengine (removeUnwantedDiscoUpdates, removeUnwantedDiscoUpdatesFromFull NetmapUpdate, the wgengine tsmpLearnedDisco fast path) only fire for online peers; without this flag, tests exercising disco-key rotation only hit the offline-peer code paths, which mask issues and are several seconds slower in this scenario. Finer-grained per-node online tracking can be added later. 
tstest/natlab/vmtest: add Env.RotateDiscoKey, Env.RestartTailscaled, Env.PeerDiscoKey, Node.Name, an [AllOnline] EnvOption that plumbs through to testcontrol.Server.AllOnline, and an exported Env.Ping(from, to, type, timeout). Ping replaces the unexported helper so callers can specify both a ping type (PingDisco for warming peer state, PingTSMP for asserting end-to-end connectivity) and a deadline. PeerDiscoKey returns its LocalAPI error so callers inside tstest.WaitFor can retry transient failures rather than fataling the test. Updates #12639 Updates #13038 Change-Id: I3644f27fc30e52990ba25a3983498cc582ddb958 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-29 12:58:00 -07:00
Brad Fitzpatrick	22ff402da9	wgengine/magicsock: restore SetDERPMap signature, add SetDERPMapWithoutReSTUN Commit 78627c132f changed the signature of magicsock.Conn.SetDERPMap to take an additional bool doReStun parameter. Avoid both the boolean parameter and the API signature change by restoring SetDERPMap to its original single-argument form and adding a new SetDERPMapWithoutReSTUN method for the cache-loading caller that wants to skip the post-set ReSTUN. Updates #19490 Change-Id: I97d9e82156bfc546ccf59756d1ea52f039b5de06 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-29 12:46:15 -07:00
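A sketch of the API shape this commit describes: two named methods instead of one method with a boolean parameter. The `Conn` and `DERPMap` types are simplified stand-ins, and the shared private helper is an assumption about how such a split could be written, not the actual magicsock code.

```go
package main

import "fmt"

// DERPMap and Conn are simplified stand-ins for the real magicsock types.
type DERPMap struct{ Regions int }

type Conn struct {
	derpMap     *DERPMap
	restunCount int // counts ReSTUN triggers, for illustration only
}

// SetDERPMap keeps the original single-argument form and triggers a ReSTUN.
func (c *Conn) SetDERPMap(dm *DERPMap) { c.setDERPMap(dm, true) }

// SetDERPMapWithoutReSTUN is for the cache-loading caller that wants to
// skip the post-set ReSTUN.
func (c *Conn) SetDERPMapWithoutReSTUN(dm *DERPMap) { c.setDERPMap(dm, false) }

// setDERPMap is a hypothetical private helper; the bool stays internal.
func (c *Conn) setDERPMap(dm *DERPMap, reSTUN bool) {
	c.derpMap = dm
	if reSTUN {
		c.restunCount++
	}
}

func main() {
	c := &Conn{}
	c.SetDERPMapWithoutReSTUN(&DERPMap{Regions: 1})
	c.SetDERPMap(&DERPMap{Regions: 2})
	fmt.Println(c.restunCount, c.derpMap.Regions)
}
```

Call sites then read as `c.SetDERPMapWithoutReSTUN(dm)` rather than `c.SetDERPMap(dm, false)`, which makes the intent visible without checking the parameter list.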
Claus Lensbøl	78627c132f	wgengine/magicsock,ipn/ipnlocal: store and load homeDERP from cache (#19491) With netmap caching, the home DERP of the self node was neither saved to the cache nor loaded from it, so nodes did not stick to a DERP when starting without a connection to control. Instead, when a cache is available, load it before looking for DERP servers. This is implemented by allowing a skip of ReSTUN in setting the DERP map (we must have a DERP map before setting the home DERP), so the DERP from cache will set itself and be sticky until a connection to control is established. Making DERP only change when connected to control is handled by existing code from f072d017bd8241675aa946a27fc1827f570435cb. Updates #19490 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2026-04-29 10:24:09 -04:00
Alex Chan	bb91bb842c	all: remove everything related to non-seamless key renewal Seamless key renewal has been the default in all clients since 1.90. We retained the ability to disable it from the control plane as a precaution, but we haven't seen any issues that require us to disable it. We're now removing all the code for non-seamless key renewal, because we don't expect to turn it on again, and indeed it's been untested in the field for three releases so might contain latent bugs! Updates tailscale/corp#33042 Change-Id: I4b80bf07a3a50298d1c303743484169accc8844b Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-04-29 10:03:26 +01:00
Brad Fitzpatrick	ad5436af0d	tstest/largetailnet, tstest/integration/testcontrol: add in-process large-tailnet benchmark Add a Go benchmark that exercises a single tailnet client (a [tsnet.Server] running in the test process) against a synthetic large initial netmap and a stream of caller-driven peer add/remove deltas, all in-process. The harness is split in two parts: - tstest/largetailnet, a reusable package containing a [Streamer] that hijacks the map long-poll on a [testcontrol.Server] via the new AltMapStream hook, sends one initial MapResponse with N synthetic peers, and forwards caller-supplied delta MapResponses on the same stream. Helpers like MakePeer / AllocPeer build synthetic peers with unique IDs and addresses derived from the Tailscale ULA range. - tstest/largetailnet/largetailnet_test.go, BenchmarkGiantTailnet (headless tailscaled workload, no IPN bus subscriber) and BenchmarkGiantTailnetBusWatcher (GUI-client workload with one Notify subscriber attached). Both are gated on --actually-test-giant-tailnet (skipped by default), stand up an in-process testcontrol + tsnet.Server, let Up block until the initial N-peer netmap has been processed, then ResetTimer and run add+remove pairs via b.Loop. Per-delta sync is via a test-only [ipnlocal.LocalBackend.AwaitNodeKeyForTest] channel that closes once the just-added peer key appears in the netmap (no-watcher variant) or via bus-Notify drain (bus-watcher variant). To support the hijack, [testcontrol.Server] grows an AltMapStream hook and a small MapStreamWriter interface for benchmarks/stress tests that need to drive a controlled MapResponse sequence; the normal serveMap path is untouched when AltMapStream is nil. The streamer answers non-streaming "lite" map polls (which controlclient issues before the streaming long-poll to push HostInfo) with an empty MapResponse and returns immediately, so the streaming poll that follows is the one that gets the initial netmap. 
The benchmark is intended for before/after comparisons of netmap- and delta-handling changes targeted at large tailnets. CPU profiles on unmodified main show the expected O(N) hotspots: setControlClientStatusLocked / authReconfigLocked / userspaceEngine.Reconfig / setNetMapLocked, plus JSON encoding of the full Notify.NetMap to bus watchers (which dominates the BusWatcher variant).

Median ms/op over 10 runs on unmodified main, by tailnet size N:

         N  no-watcher  bus-watcher
     10000          32          166
     50000         222          865
    100000         504         1765
    250000        1551         4696

Recommended invocation:

    go test ./tstest/largetailnet/ -run=^$ \
        -bench='BenchmarkGiantTailnet(BusWatcher)?$' \
        -benchtime=2000x -timeout=10m \
        --actually-test-giant-tailnet \
        --giant-tailnet-n=250000 \
        -cpuprofile=/tmp/giant.cpu.pprof

Updates #12542 Change-Id: I4f5b2bb271a36ba853d5a0ffe82054ef2b15c585 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-27 11:47:12 -07:00
Evan Lowry	3a05c450ce	posture: add HealthTracker for serial number retrieval (#19181 ) Device posture checking can fail while enabled if tailscaled does not have access to smbios. Previously, this was only observable by looking in the tailscaled logs. Fixes tailscale/corp#39314 Signed-off-by: Evan Lowry <evan@tailscale.com>	2026-04-25 15:42:47 -03:00
kari-ts	aa740cb393	ipnlocal/drive: reduce noisy per-peer remote logs (#19493) This drops the per-peer "appending remote" log while constructing the remote list, which can get noisy on big tailnets, and keeps logs around remote availability checks, including whether a peer is missing, offline, lacks PeerAPI reachability, lacks sharing permission, or is available. Updates tailscale/corp#40580 Signed-off-by: kari-ts <kari@tailscale.com>	2026-04-24 08:26:33 -07:00
James 'zofrex' Sanderson	36f094ea3b	ipn/ipnlocal: deflake TestStateMachine{,Seamless} (#19475 ) Remove the remaining known sources of flakiness in TestStateMachine and TestStateMachineSeamless. Updates tailscale/corp#36230 Updates #19377 Signed-off-by: James Sanderson <jsanderson@tailscale.com>	2026-04-22 10:22:47 +01:00
James 'zofrex' Sanderson	ffae275d4d	ipn/ipnlocal,tailcfg: add /debug/tka c2n endpoint (#19198 ) Updates tailscale/corp#35015 Signed-off-by: James Sanderson <jsanderson@tailscale.com>	2026-04-20 16:00:03 +01:00
James 'zofrex' Sanderson	ec86f0ff93	ipn/ipnlocal: make TestStateMachine less flaky (#19434 ) TestStateMachine & TestStateMachineSeamless both flake a lot asserting the "Shutdown" call on cc after a Logout. This is because Shutdown is called on a goroutine to avoid a deadlock if it's called while holding the LocalBackend lock (#18052). This fixes that cause of flakes by waiting for LocalBackend's goroutine tracker to have no goroutines running (so the goroutine that calls Shutdown must have finished). This does not make TestStateMachine non-flaky because it can flake later in the test, too: the assertion on "unpause" after clearing the netmap between "Start4" and "Start4 -> netmap" sometimes fails. Updates tailscale/corp#36230 Updates #19377 Updates #18052 Signed-off-by: James Sanderson <jsanderson@tailscale.com>	2026-04-20 15:58:21 +01:00
Alex Chan	cf76202aa3	ipn/ipnlocal: log the local and remote TKA HEADs during sync Update this log message to show both the local and remote TKA HEAD; this is useful for debugging issues on nodes that have fallen behind the remote TKA HEAD. Updates tailscale/corp#39455 Change-Id: Ia62ce15756180d2fbac4a898fb94d6143df08b54 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-04-19 16:52:48 +01:00
Scott Graham	cb5a53c424	ipn/ipnlocal: preserve b.loginFlags in auto-login cc.Login calls LocalBackend stores loginFlags at construction so that per-instance properties (e.g. LoginEphemeral set by tsnet.Server.Ephemeral) persist for the session. StartLoginInteractiveAs already merges b.loginFlags into its cc.Login call, but the two auto-login call sites pass bare controlclient.LoginDefault, silently dropping any stored flags. Merge b.loginFlags at both auto-login call sites to match the existing StartLoginInteractiveAs pattern. LoginDefault is zero so this is a no-op when loginFlags is empty, and restores the documented behavior when it isn't. Fixes #15852 Signed-off-by: Scott Graham <scott.github@h4ck3r.net>	2026-04-17 23:31:18 -05:00
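The flag merge this commit describes can be sketched with a plain bit-flag type. The `LoginFlags` type and constant values below are illustrative stand-ins, not the real `controlclient` definitions; the only property taken from the commit message is that `LoginDefault` is zero.

```go
package main

import "fmt"

// LoginFlags mimics a bit-flag type; names and values are illustrative.
type LoginFlags int

const (
	LoginDefault     LoginFlags = 0
	LoginInteractive LoginFlags = 1 << 0
	LoginEphemeral   LoginFlags = 1 << 1
)

// mergeFlags ORs the per-instance stored flags into the flags for one
// login call. Because LoginDefault is zero, merging is a no-op when
// nothing is stored.
func mergeFlags(stored, call LoginFlags) LoginFlags {
	return call | stored
}

func main() {
	stored := LoginEphemeral // e.g. set once at construction
	fmt.Println(mergeFlags(stored, LoginDefault) == LoginEphemeral)
	fmt.Println(mergeFlags(0, LoginDefault) == LoginDefault)
}
```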
Michael Ben-Ami	1dc08f4d41	appc,feature/conn25: prevent clients from forwarding DNS requests and modifying DNS responses for domains they are also connectors for For Connectors 2025, determine if a client is configured as a connector and what domains it is a connector for. When acting as a client, don't install Split DNS routes to other connectors for those domains, and don't alter DNS responses for those domains. The responses are forwarded back to the original client, which in turn does the alteration, swapping the real IP for a Magic IP. A client is also a connector for a domain if it has tags that overlap with tags in the configured policy, and --advertise-connector=true in the prefs (not in the self-node Hostinfo from the netmap). We use the prefs as the source of truth because control only gets a copy from the prefs, and may drift. And the AppConnector field is currently zeroed out in the self-node Hostinfo from control. The extension adds a ProfileStateChange hook to process prefs changes, and the config type is split into prefs and nodeview sub-configs. Fixes tailscale/corp#39317 Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>	2026-04-16 09:41:54 -04:00
Alex Chan	4f47c3c93d	ipn/ipnlocal: log AUM hash on startup as base32, not hex Before: tka initialized at head 325557575a59525354484e4a534f494b4c4e56575435583737564b5036584c4d4c335534554255344c344c36484c5a444a323341 After: tka initialized at head 2UWWZYRSTHNJSOIKLNVWT5X77VKP6XLML3U4UBU4L4L6HLZDJ23A Printing the AUM hash as hex makes it difficult to compare to other AUM hashes; stringifying it will make it consistent with other printing. Updates #cleanup Change-Id: Ic1e23a9ce6a71a53cff7d2190f9fa06eb838ab89 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-04-16 13:45:29 +01:00
M. J. Fromberger	1e4934659b	ipn/ipnlocal: discard cached netmaps upon panic during SetNetworkMap (#19414) For debugging purposes, unstable builds will sometimes intentionally panic for unexpected behaviours. We observed such a panic after loading a cached netmap, but because we had a valid cached map, the client was unable to recover on its own and the operator had to manually reset the cache. As a defensive hedge, when netmap caching is enabled, check for a panic during installation of a new network map: if one occurs, discard any cached netmaps before letting the panic unwind, so that we do not lose the panic itself, but reduce the need for manual intervention. Updates #12639 Updates tailscale/corp#27300 Change-Id: I0436889c6bdc2fa728c9cb83630cd7b00a72ce68 Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>	2026-04-15 11:07:42 -07:00
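One way to run cleanup on a panic without swallowing it is a deferred function guarded by a success flag, sketched below. The function and parameter names are hypothetical, and this is an assumption about the general technique rather than the code in the commit.

```go
package main

import "fmt"

// installWithCacheGuard runs install. If install panics, the deferred
// function calls discardCache while the panic is unwinding. It never
// calls recover, so the original panic continues to propagate.
func installWithCacheGuard(install, discardCache func()) {
	ok := false
	defer func() {
		if !ok {
			discardCache()
		}
	}()
	install()
	ok = true
}

func main() {
	// Recover here only so the demo can print a result and exit cleanly.
	defer func() {
		fmt.Println("panic still observed by caller:", recover())
	}()
	installWithCacheGuard(
		func() { panic("bad netmap") },
		func() { fmt.Println("cache discarded") },
	)
}
```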
Naman Sood	6301a6ce4b	util/linuxfw,wgengine/router: allow incoming CGNAT range traffic with nodeattr Clients with the newly added node attribute `"disable-linux-cgnat-drop-rule"` will not automatically drop inbound traffic on non-Tailscale network interfaces with the source IP in the CGNAT IP range. This is an initial proof-of-concept for enabling connectivity with off-Tailnet CGNAT endpoints. Fixes tailscale/corp#36270. Signed-off-by: Naman Sood <mail@nsood.in>	2026-04-14 16:45:06 -04:00
Brad Fitzpatrick	9fbe4b3ed2	all: fix six tests that failed with -count=2 Avery found a bunch of tests that fail with -count=2. Updates tailscale/corp#40176 (tracks making our CI detect them) Change-Id: Ie3e4398070dd92e4fe0146badddf1254749cca20 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>	2026-04-13 18:52:57 -07:00
Brad Fitzpatrick	5a7ef4a533	ipn/ipnlocal: mark TestStateMachineSeamless as flaky Updates #19377 Change-Id: I7dbf5b954effbfa821339e79d02d8a6e46d2862a Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-13 15:19:42 -07:00
Alex Chan	1ff369a261	tka: keep the CompactionDefaults alongside the other limits Updates #cleanup Change-Id: Ib5e481d5a9c7ec7ac3e6b3913909ab1bf21d7a4d Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-04-10 16:06:23 +01:00
Jonathan Nobels	03c3551ee5	ipn/ipnlocal: add netmap mutations to the ipn bus (#19120) ipn/local: add netmap mutations to the ipn bus updates tailscale/tailscale#1909 This adds a new NotifyWatchOpt that allows watchers to receive PeerChange events (derived from node mutations) on the IPN bus in lieu of a complete netmap. We'll continue to send the full netmap for any map response that includes it, but for mutations, sending PeerChange events gives the client the option to manage its own models more selectively and cuts way down on JSON serialization overhead. On chatty tailnets, this will vastly reduce the amount of chatter on the bus. This change should be backwards compatible; it is purely additive. Clients that subscribe to NotifyNetmap will get the full netmap for every delta. New clients can omit that and instead opt into NotifyPeerChanges. Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>	2026-04-09 15:45:41 -04:00
Brad Fitzpatrick	a182b864ac	tsd, all: add Sys.ExtraRootCAs, plumb through TLS dial paths Add ExtraRootCAs *x509.CertPool to tsd.System and plumb it through the control client, noise transport, DERP, and wgengine layers so that platforms like Android can inject user-installed CA certificates into Go's TLS verification. tlsdial.Config now honors base.RootCAs as additional trusted roots, tried after system roots and before the baked-in LetsEncrypt fallback. SetConfigExpectedCert gets the same treatment for domain-fronted DERP. The Android client will set sys.ExtraRootCAs with a pool built from x509.SystemCertPool + user-installed certs obtained via the Android KeyStore API, replacing the current SSL_CERT_DIR environment variable approach. Updates #8085 Change-Id: Iecce0fd140cd5aa0331b124e55a7045e24d8e0c2 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-07 18:10:54 -07:00
James Tucker	21695cdbf8	ipn/ipnlocal,net/netmon: make frequent darkwake more efficient Investigating battery costs on a busy tailnet I noticed a large number of nodes regularly reconnecting to control and DERP. In one case I was able to analyze closely `pmset` reported the every-minute wake-ups being triggered by bluetooth. The node was by side effect reconnecting to control constantly, and this was at times visible to peers as well. Three changes here improve the situation: - Short time jumps (less than 10 minutes) no longer produce "major network change" events, and so do not trigger full rebind/reconnect. - Many "incidental" fields on interfaces are ignored, like MTU, flags and so on - if the route is still good, the rest should be manageable. - Additional log output will provide more detail about the cause of major network change events. Updates #3363 Signed-off-by: James Tucker <james@tailscale.com>	2026-04-06 15:46:51 -07:00
Brad Fitzpatrick	5a899e406d	ipn/ipnlocal: add health.Tracker to tests where it was warning in CI To denoise log output, to make it easier to find real failures. Updates #19252 Change-Id: Iae64a9278c70de24a236c39e3d181a509a512a0b Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-05 20:20:07 -07:00
Brad Fitzpatrick	5ef3713c9f	cmd/vet: add subtestnames analyzer; fix all existing violations Add a new vet analyzer that checks t.Run subtest names don't contain characters requiring quoting when re-running via "go test -run". This enforces the style guide rule: don't use spaces or punctuation in subtest names. The analyzer flags: - Direct t.Run calls with string literal names containing spaces, regex metacharacters, quotes, or other problematic characters - Table-driven t.Run(tt.name, ...) calls where tt ranges over a slice/map literal with bad name field values Also fix all 978 existing violations across 81 test files, replacing spaces with hyphens and shortening long sentence-like names to concise hyphenated forms. Updates #19242 Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-05 15:52:51 -07:00
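The core check of such an analyzer can be sketched as a string test. The character set below is illustrative (space, tab, quotes, backslash, and common regex metacharacters); the real analyzer's exact set and its AST handling are not reproduced here, and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// problemChars is an illustrative set of characters that need quoting or
// escaping when a subtest name is passed to "go test -run".
const problemChars = " \t\"'`\\*+?()[]{}|^$"

// badSubtestName reports whether name contains any problematic character.
func badSubtestName(name string) bool {
	return strings.ContainsAny(name, problemChars)
}

// hyphenate applies the simplest fix named in the commit message:
// replacing spaces with hyphens.
func hyphenate(name string) string {
	return strings.ReplaceAll(name, " ", "-")
}

func main() {
	name := "handles empty input"
	fmt.Println(badSubtestName(name))
	fmt.Println(hyphenate(name), badSubtestName(hyphenate(name)))
}
```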
Harry Harpham	7ddbd84171	ipn/ipnlocal: ensure TestServeUnixSocket actually serves a Unix socket The test sets up an HTTP-over-Unix server and a reverse proxy pointed at this server, but prior to this change did not round-trip anything to the backing server. This change ensures that we test code paths which proxy Unix sockets for serve. Fixes #19232 Signed-off-by: Harry Harpham <harry@tailscale.com>	2026-04-03 14:15:15 -06:00
M. J. Fromberger	eaa5d9df4b	client,cmd/tailscale,ipn/{ipnlocal,localapi}: add debug CLI command to clear netmap caches (#19213 ) This is a follow-up to #19117, adding a debug CLI command allowing the operator to explicitly discard cached netmap data, as a safety and recovery measure. Updates #12639 Change-Id: I5c3c47c0204754b9c8e526a4ff8f69d6974db6d0 Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>	2026-04-02 12:06:39 -07:00
M. J. Fromberger	211ef67222	tailcfg,ipn/ipnlocal: regulate netmap caching via a node attribute (#19117 ) Add a new tailcfg.NodeCapability (NodeAttrCacheNetworkMaps) to control whether a node with support for caching network maps will attempt to do so. Update the capability version to reflect this change (mainly as a safety measure, as the control plane does not currently need to know about it). Use the presence (or absence) of the node attribute to decide whether to create and update a netmap cache for each profile. If caching is disabled, discard the cached data; this allows us to use the presence of a cached netmap as an indicator it should be used (unless explicitly overridden). Add a test that verifies the attribute is respected. Reverse the sense of the environment knob to be true by default, with an override to disable caching at the client regardless what the node attribute says. Move the creation/update of the netmap cache (when enabled) until after successfully applying the network map, to reduce the possibility that we will cache (and thus reuse after a restart) a network map that fails to correctly configure the client. Updates #12639 Change-Id: I1df4dd791fdb485c6472a9f741037db6ed20c47e Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>	2026-04-01 15:02:53 -07:00
Alex Chan	4ffb92d7f6	tka: refer consistently to "DisablementValues" This avoids putting "DisablementSecrets" in the JSON output from `tailscale lock log`, which is potentially scary to somebody who doesn't understand the distinction. AUMs are stored and transmitted in CBOR-encoded format, which uses an integer rather than a string key, so this doesn't break already-created TKAs. Fixes #19189 Change-Id: I15b4e81a7cef724a450bafcfa0b938da223c78c9 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-04-01 19:09:22 +01:00
Alex Chan	88e7330ff1	ipn,tka: improve Tailnet Lock logs * Refer to "tailnet-lock" instead of "network-lock" in log messages * Log keys as `tlpub:<hex>` rather than as Go structs Updates tailscale/corp#39455 Updates tailscale/corp#37904 Change-Id: I644407d1eda029ee11027bcc949897aa4ba52787 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-04-01 17:08:12 +01:00
Harry Harpham	61ac021c5d	wgengine/magicsock: assume network up for tests Without this, any test relying on underlying use of magicsock will fail without network connectivity, even when the test logic has no need for a network connection. Tests currently in this bucket include many in tstest/integration and in tsnet. Further explanation: ipn only becomes Running when it sees at least one live peer or DERP connection: `0cc1b2ff76/ipn/ipnlocal/local.go (L5861-L5866)` When tests only use a single node, they will never see a peer, so the node has to wait to see a DERP server. magicsock sets the preferred DERP server in updateNetInfo(), but this function returns early if the network is down. `0cc1b2ff76/wgengine/magicsock/magicsock.go (L1053-L1106)` Because we're checking the real network, this prevents ipn from entering "Running" and causes the test to fail or hang. In tests, we can assume the network is up unless we're explicitly testing the behaviour of tailscaled when the network is down. We do something similar in magicsock/derp.go, where we assume we're connected to control unless explicitly testing otherwise: `7d2101f352/wgengine/magicsock/derp.go (L166-L177)` This is the template for the changes to `networkDown()`. Fixes #17122 Co-authored-by: Alex Chan <alexc@tailscale.com> Signed-off-by: Harry Harpham <harry@tailscale.com>	2026-03-31 09:57:14 -06:00
Claus Lensbøl	bf467727fc	control/controlclient,ipn/ipnlocal,wgengine: avoid restarting wireguard when key is learned via tsmp (#19142) When disco keys are learned on a node that is connected to control and has a mapSession, wgengine will see the key as having changed, and assume that any existing connections will need to be reset. For keys learned via TSMP, the connection should not be reset as that key is learned via an active wireguard connection. If wgengine resets that connection, a 15s timeout will occur. This change adds a map to track new keys coming in via TSMP, and removes them from the list of keys that need to trigger wireguard resets. This is done with an interface chain from controlclient down via localBackend to userspaceEngine via the watchdog. Once a key has been actively used for preventing a wireguard reset, the key is removed from the map. If mapSession becomes a long-lived process instead of being dependent on having a connection to control, this interface chain can be removed, and the event sequence from wrap->controlClient->userspaceEngine can be changed to wrap->userspaceEngine->controlClient, as we know the map will not be gunked up with stale TSMP entries. Updates #12639 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2026-03-30 14:26:08 -04:00
KevinLiang10	45f989f52a	ipn/ipnlocal: warn incompatibility between no-snat-routes and exitnode (#19023 ) * ipn/ipnlocal: warn incompatibility between no-snat-routes and exitnode This commit adds a warning to the health check when the --snat-subnet-routes=false flag for a subnet router is set alongside --advertise-exit-node=true. These two conflict with each other: internet-bound traffic from peers using this exit node is not masqueraded to the node's source IP, so return packets fail to route back. The described combination is not valid until we figure out a way to separate the exitnode masquerade rule and skip it for subnet routes. Updates #18725 Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com> * use date instead of for now to clarify effectiveness Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com> --------- Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>	2026-03-26 12:36:31 -04:00
Fran Bull	2d5962f524	feature/conn25,ipn/ipnext,ipn/ipnlocal: add ExtraRouterConfigRoutes hook conn25 needs to add routes to the operating system to direct handling of the addresses in the magic IP range to the tailscale0 TUN and tailscaled. The way we do this for exit nodes and VIP services is that we add routes to the Routes field of router.Config, and then the config is passed to the WireGuard engine Reconfig. conn25 is implemented as an ipnext.Extension and so this commit adds a hook to ipnext.Hooks to allow any extension to provide routes to the config. The hook if provided is called in routerConfigLocked, similarly to exit nodes and VIP services. Fixes tailscale/corp#38123 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-03-25 19:28:33 -07:00
Michael Ben-Ami	a57c6457c9	ipn/ipnlocal: debounce extra enqueues in ExtensionHost.AuthReconfigAsync Fixes tailscale/corp#39065 Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>	2026-03-25 09:11:15 -04:00
kari-ts	9992b7c817	ipn,ipn/local: broadcast ClientVersion if AutoUpdate.Check (#19107 ) If AutoUpdate.Check is false, the client has opted out of checking for updates, so we shouldn't broadcast ClientVersion. If the client has opted in, it should be included in the initial Notify. Updates tailscale/corp#32629 Signed-off-by: kari-ts <kari@tailscale.com>	2026-03-24 15:06:20 -07:00
Amal Bansode	04ef9d80b5	ipn/ipnlocal: add a map for node public key to node ID lookups (#19051 ) This path is currently only used by DERP servers that have also enabled `verify-clients` to ensure that only authorized clients within a Tailnet are allowed to use said DERP server. The previous naive linear scan in NodeByKey would almost certainly lead to bad outcomes with a large enough netmap, so address an existing todo by building a map of node key -> node ID. Updates #19042 Signed-off-by: Amal Bansode <amal@tailscale.com>	2026-03-23 10:23:28 -07:00
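A minimal sketch of the idea in the commit above: replace a linear scan over peers with a prebuilt map from node key to node ID. The `Node`, `NodeKey`, and `NodeID` types and the `buildKeyIndex` helper here are hypothetical stand-ins for illustration, not the actual tailcfg or ipnlocal definitions.

```go
package main

import "fmt"

// Hypothetical, simplified types; the real ones live in the tailscale tree.
type NodeID int64
type NodeKey string

type Node struct {
	ID  NodeID
	Key NodeKey
}

// buildKeyIndex builds the key -> ID map once, so that later lookups
// do not need to scan the whole peer list.
func buildKeyIndex(nodes []Node) map[NodeKey]NodeID {
	idx := make(map[NodeKey]NodeID, len(nodes))
	for _, n := range nodes {
		idx[n.Key] = n.ID
	}
	return idx
}

func main() {
	idx := buildKeyIndex([]Node{{ID: 1, Key: "nodekey:aa"}, {ID: 2, Key: "nodekey:bb"}})
	id, ok := idx["nodekey:bb"]
	fmt.Println(id, ok)
}
```

The index has to be rebuilt or updated whenever the peer set changes; how the real change handles that is not stated in the commit message.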
Michael Ben-Ami	ea7040eea2	ipn/{ipnext,ipnlocal}: expose authReconfig in ipnext.Host as AuthReconfigAsync Also implement a limit of one on the number of goroutines that can be waiting to do a reconfig via AuthReconfig, to prevent extensions from calling too fast and taxing resources. Even with the protection, the new method should only be used in experimental or proof-of-concept contexts. The current intended use is for an extension to be able to force a reconfiguration of WireGuard, and have the reconfiguration call back into the extension for extra Allowed IPs. If in the future WireGuard is able to reconfigure individual peers more dynamically, an extension might be able to hook into that process, and this method on ipnext.Host may be deprecated. Fixes tailscale/corp#38120 Updates tailscale/corp#38124 Updates tailscale/corp#38125 Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>	2026-03-20 17:29:11 -04:00
Brendan Creane	ffa7df2789	ipn: reject advertised routes with non-address bits set (#18649 ) * ipn: reject advertised routes with non-address bits set The config file path, EditPrefs local API, and App Connector API were accepting invalid subnet route prefixes with non-address bits set (e.g., 2a01:4f9:c010:c015::1/64 instead of 2a01:4f9:c010:c015::/64). All three paths now reject prefixes where prefix != prefix.Masked() with an error message indicating the expected masked form. Updates tailscale/corp#36738 Signed-off-by: Brendan Creane <bcreane@gmail.com> * address review comments Signed-off-by: Brendan Creane <bcreane@gmail.com> --------- Signed-off-by: Brendan Creane <bcreane@gmail.com>	2026-03-20 10:10:43 -07:00
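The `prefix != prefix.Masked()` check described in the commit above can be sketched with the standard library's `net/netip` package. The `checkAdvertisedRoute` helper and its error wording are assumptions for illustration, not the code from the commit.

```go
package main

import (
	"fmt"
	"net/netip"
)

// checkAdvertisedRoute is a hypothetical helper: it rejects a prefix
// whose address has bits set beyond the prefix length.
func checkAdvertisedRoute(s string) error {
	p, err := netip.ParsePrefix(s)
	if err != nil {
		return err
	}
	// Masked zeroes the host bits; a valid route equals its masked form.
	if m := p.Masked(); p != m {
		return fmt.Errorf("route %v has non-address bits set; expected %v", p, m)
	}
	return nil
}

func main() {
	for _, s := range []string{"2a01:4f9:c010:c015::1/64", "2a01:4f9:c010:c015::/64"} {
		fmt.Println(s, checkAdvertisedRoute(s))
	}
}
```

Under these assumptions the first prefix is rejected and the second is accepted, matching the example given in the commit message.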
Gesa Stupperich	ca9aa20255	ipn/ipnlocal: populate Groups field in profileFromView This populates UserProfile.Groups in the WhoIs response from the local backend with the groups of the corresponding user in the netmap. This allows tsnet apps to see (and e.g. forward) which groups a user making a request belongs to - as long as the tsnet app runs on a node that has been granted the tailscale.com/visible-groups capability via node attributes. If that's not the case or the user doesn't belong to any groups allow-listed via the node attribute, Groups won't be populated. Updates tailscale/corp#31529 Signed-off-by: Gesa Stupperich <gesa@tailscale.com>	2026-03-19 21:46:55 +00:00
Mike O'Driscoll	4e88d231d5	control,health,ipn: move IP forwarding check to health tracker (#19007 ) Currently the IP forwarding health check is done on sending MapRequests. Move the IP forwarding check to the health service to gain the benefits of the health tracker and periodic monitoring out of band from the MapRequest path. ipnlocal now provides a closure to the health service to provide the check if forwarding is broken. Removed `skipIPForwardingCheck` from controlclient/direct.go, it wasn't being used as the comments describe it, that check has moved to ipnlocal for the closure to the health tracker. Updates #18976 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>	2026-03-18 16:24:12 -04:00
kari-ts	4c7c1091ba	netns: add Android callback to bind socket to network (#18915 ) After switching from cellular to wifi without ipv6, ForeachInterface still sees rmnet prefixes, so HaveV6 stays true, and magicsock keeps attempting ipv6 connections that either route through cellular or time out for users on wifi without ipv6. This change adds SetAndroidBindToNetworkFunc, a callback to bind the socket to the selected Android Network object. Updates tailscale/tailscale#6152 Signed-off-by: kari-ts <kari@tailscale.com>	2026-03-11 12:28:28 -07:00
Brad Fitzpatrick	f905871fb1	ipn/ipnlocal, feature/ssh: move SSH code out of LocalBackend to feature This makes tsnet apps not depend on x/crypto/ssh and locks that in with a test. It also paves the way for tsnet apps to opt-in to SSH support via a blank feature import in the future. Updates #12614 Change-Id: Ica85628f89c8f015413b074f5001b82b27c953a9 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-03-10 17:27:17 -07:00
Gesa Stupperich	6a19995f13	tailcfg: reintroduce UserProfile.Groups This change reintroduces UserProfile.Groups, a slice that contains the ACL-defined and synced groups that a user is a member of. The slice will only be non-nil for clients with the node attribute see-groups, and will only contain groups that the client is allowed to see as per the app payload of the see-groups node attribute. For example: ``` "nodeAttrs": [ { "target": ["tag:dev"], "app": { "tailscale.com/see-groups": [{"groups": ["group:dev"]}] } }, [...] ] ``` UserProfile.Groups will also be gated by a feature flag for the time being. Updates tailscale/corp#31529 Signed-off-by: Gesa Stupperich <gesa@tailscale.com>	2026-03-09 11:08:45 +00:00
Brad Fitzpatrick	bd2a2d53d3	all: use Go 1.26 things, run most gofix modernizers I omitted a lot of the min/max modernizers because they didn't result in more clear code. Some of it's older "for x := range 123". Also: errors.AsType, any, fmt.Appendf, etc. Updates #18682 Change-Id: I83a451577f33877f962766a5b65ce86f7696471c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-03-06 13:32:03 -08:00

1440 Commits