tailscale

mirror of https://github.com/tailscale/tailscale.git synced 2026-04-26 16:01:03 +02:00

Author	SHA1	Message	Date
Brad Fitzpatrick	a7d8aeb8ae	misc/genreadme,tempfork/pkgdoc,tsnet: generate README.md files from godoc Adds a CI check to keep opted-in directories' README.md files in sync with their package godoc. For now tsnet (and its sub-packages under tsnet/example) is the only opted-in tree. The list of directories lives in misc/genreadme/genreadme.go as defaultRoots, so CI and humans both just run `./tool/go run ./misc/genreadme` with no arguments. The check piggybacks on the existing go_generate job in test.yml and fails if any README.md is out of date, pointing the user at the same command. Along the way: - tempfork/pkgdoc now emits Markdown instead of plain text: headings become level-2 with no {#hdr-...} anchors, and [Symbol] doc links resolve to pkg.go.dev URLs, including for symbols in the current package (which the default Printer would otherwise emit as bare #Name fragments with no backing anchor in a README). Parsing no longer uses parser.ImportsOnly, so doc.Package knows the package's symbols and can resolve [Symbol] links at all. - genreadme also emits a pkg.go.dev Go Reference badge at the top of a library package's README; suppressed for package main. - tsnet/tsnet.go's package godoc is expanded in idiomatic godoc syntax — [Type], [Type.Method], reference-style [link]: URL definitions — rather than Markdown-flavored [text](url) or backtick-quoted identifiers, so that both pkg.go.dev and the generated README.md render cleanly from a single source. Fixes #19431 Fixes #19483 Fixes #19470 Change-Id: I8ca37e9e7b3bd446b8bfa7a91ac548f142688cb1 Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Walter Poupore <walterp@tailscale.com> Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-22 15:13:09 -07:00
Brad Fitzpatrick	311dd3839d	wgengine/magicsock: replace peers slice with peersByID map; add Upsert/RemovePeer Replace Conn.peers (sorted views.Slice) with peersByID, a map[tailcfg.NodeID]tailcfg.NodeView. The only caller that needed the sorted slice (the disco message receive path's binary search) becomes a single map lookup. Drop nodesEqual. Add Conn.UpsertPeer / Conn.RemovePeer for O(1) single-peer endpoint work. RemovePeer also performs a targeted single-disco-key cleanup (previously that scan was O(discoInfo)). Extract the shared per-peer upsert body as upsertPeerLocked; still used by SetNetworkMap's bulk path. SetNetworkMap is documented as the bulk / initial / self-change path; UpsertPeer and RemovePeer are preferred for single-peer changes. Make the relay server set update O(1) per peer: add serverUpsertCh / serverRemoveCh to relayManager with matching run-loop handlers. UpsertPeer / RemovePeer evaluate the per-peer relay predicate locally and dispatch upsert or remove. The full-rebuild updateRelayServersSet stays for the initial netmap, filter changes, and fallback. Move the hasPeerRelayServers atomic from Conn onto relayManager, next to the serversByNodeKey map it summarizes. The run loop is now the single writer and needs no back-pointer to Conn; endpoint's two hot-path readers take one extra hop to de.c.relayManager.hasPeerRelayServers but the cost is the same atomic load. No callers use UpsertPeer/RemovePeer yet; a subsequent change will plumb per-peer add/remove through the incremental map update path. Updates #12542 Change-Id: If6a3442fe29ccbd77890ea61b754a4d1ad6ef225 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-22 15:07:11 -07:00
Brad Fitzpatrick	f289f7e77c	tstest/natlab/vmtest,cmd/tta: add TestSiteToSite Verifies that site-to-site Tailscale subnet routing with --snat-subnet-routes=false preserves the original source IP end-to-end. Topology: two sites, each with a Linux subnet router on a NATted WAN plus an internal LAN, and a non-Tailscale backend on each LAN. Backends are given static routes pointing to their local subnet router for the remote site's prefix; an HTTP GET from backend-a to backend-b over Tailscale returns a body containing backend-a's LAN IP. Adds the supporting vmtest.SNATSubnetRoutes NodeOption and plumbs snat-subnet-routes through TTA's /up handler. The webserver started by vmtest.WebServer now also echoes the remote IP, for the preservation assertion. Adds a /add-route TTA endpoint (Linux-only for now) and a vmtest Env.AddRoute helper so the test can install the backend static routes through TTA rather than needing a host SSH key and debug NIC. ensureGokrazy now always rebuilds the natlab qcow2 (once per test process, via sync.Once) so the test picks up the new TTA and webserver behavior. This is pulled out of a larger pending change that adds FreeBSD site-to-site subnet routing support; figured we should have at least the Linux test covering what works today. Updates #5573 Change-Id: I881c55b0f118ac9094546b5fbe68dddf179bb042 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-22 12:11:30 -07:00
Fernando Serboncini	81fbcc1ac8	cmd/tsnet-proxy: add tsnet-based port proxy tool (#19468) Exposes a local port on the tailnet under a chosen hostname. Raw TCP by default; --http or --https reverse-proxy with Tailscale-User-* identity headers from WhoIs, matching tailscaled's serve header conventions. Useful as a one-shot to put a dev server on the tailnet. Fixes #19467 Change-Id: I79f63cfbbedf7e40cf0f1f51cbae8df86ae90cdf Signed-off-by: Fernando Serboncini <fserb@tailscale.com>	2026-04-22 13:34:18 -04:00
James 'zofrex' Sanderson	36f094ea3b	ipn/ipnlocal: deflake TestStateMachine{,Seamless} (#19475) Remove the remaining known sources of flakiness in TestStateMachine and TestStateMachineSeamless. Updates tailscale/corp#36230 Updates #19377 Signed-off-by: James Sanderson <jsanderson@tailscale.com>	2026-04-22 10:22:47 +01:00
Brad Fitzpatrick	12813dee02	tool/listpkgs: add --has-go-generate filter flag too For use in parallelizing go:generate up-to-date checks. Updates tailscale/corp#28679 Change-Id: Ifc31c56de4225ba2e0fc048b0f18974dc2f2fc82 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-21 17:51:13 -07:00
Fran Bull	d7916d4369	feature/conn25: add expiresAt field to addrs And use it to allow overwrites of old address assignments in the conn25 client. The magic and transit address pools from which the addresses come are limited resources and we want to reuse them. This commit is a small part of that bigger need. We expect to follow soon: * Extending expiry if assignments are still in use. * Returning expired addresses back to the pools so they can be reallocated. Updates tailscale/corp#39975 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-04-21 14:22:39 -07:00
Fran Bull	19544b4b81	feature/conn25: move byConnKey from addrAssignments to client addrAssignments is a table of addrs with lookup indices, representing the assignments of magic+destination+transit IP addresses the client has made due to the domain being routed because of an app. byConnKey is a map of node public key to prefixes of transit IPs, so it is associated with, but not that data itself, and can be its own thing. Updates tailscale/corp#39975 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-04-21 14:22:39 -07:00
Walter Poupore	04415b8177	misc/genreadme: port from corp (#19477) also port pkgdoc, into the tempfork folder git rev from corp at the time this copy was made: - e909fc93595414c90ff1339cece7c84500ab3c36 Updates #19470 Change-Id: I3d98d82020a2b336647b795210dcb7065dfa44d7 Change-Id: Ie63141860b76dd2d5ae3ff52f8a4bcdf6106421e Signed-off-by: Walter Poupore <walterp@tailscale.com>	2026-04-21 12:18:37 -07:00
Fernando Serboncini	1669b0d3d4	misc/git_hook: fix building git_hook in a nested worktree (#19473) When the repo is checked out as a nested worktree, a go.work in the outer tree hijacks module resolution, which makes the rebuild fail with "main module does not contain package." Set GOWORK=off for the build since the hook is self-contained. Bumps HOOK_VERSION so existing installs pick up the fix. Updates #cleanup Change-Id: Ibd14849efc26e4e1893c5b8e300caa71573f54bd Signed-off-by: Fernando Serboncini <fserb@fserb.com.br>	2026-04-21 11:42:53 -04:00
Brad Fitzpatrick	1e68a11721	logtail: run HTTP tests in-memory with memnet + synctest TestEncodeAndUploadMessages waited on the default 2s FlushDelay, making the logtail package the slowest non-integration test in the tree (~2s real time). Switch the shared harness from an httptest.Server-on-loopback to a memnet.Listener-backed *http.Server and run the tests inside synctest.Test, so fake time advances the flush timer instantly. Drops the net/http/httptest dependency from these tests. Combined with the TestMain non-localhost dial guard added in the previous commit, no test in this package can accidentally reach the real log.tailscale.com server. Whole package now runs in ~7ms. Updates tailscale/corp#28679 Change-Id: Ie0e7a6a79641384ed0eecb99d767e17cda8bb944 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-20 13:33:10 -07:00
Brad Fitzpatrick	5b06e32f33	logtail: add Config.Disabled to suppress the startup banner NewLogger unconditionally writes a "logtail started" banner before it returns, which callers that later call Logger.SetEnabled(false) have no way to suppress: the banner is already buffered for upload by the time the caller gets the logger back. Add Config.Disabled so callers that know up front they want the logger to start disabled (e.g. Android's remote-logging opt-out) can seed the state before NewLogger's internal Write. The process- wide Disable kill switch still takes precedence; SetEnabled can still flip the state at runtime. Updates #13174 Updates tailscale/tailscale-android#695 Change-Id: Icc4fa88c198447cf0faa707264dac84e359fe52c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-20 13:33:10 -07:00
Adriano Sela Aviles	4a832d8d0f	types/netmap,client/local: modify services format in local api Reverting back to the previous format (including the "svc:" prefix in the map's keys). Note that the /services endpoint in localapi, along with any software that relies on this is unreleased so this does not break any clients. Updates tailscale/corp#40052 Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>	2026-04-20 09:22:23 -07:00
James 'zofrex' Sanderson	ffae275d4d	ipn/ipnlocal,tailcfg: add /debug/tka c2n endpoint (#19198) Updates tailscale/corp#35015 Signed-off-by: James Sanderson <jsanderson@tailscale.com>	2026-04-20 16:00:03 +01:00
James 'zofrex' Sanderson	ec86f0ff93	ipn/ipnlocal: make TestStateMachine less flaky (#19434) TestStateMachine & TestStateMachineSeamless both flake a lot asserting the "Shutdown" call on cc after a Logout. This is because Shutdown is called on a goroutine to avoid a deadlock if it's called while holding the LocalBackend lock (#18052). This fixes that cause of flakes by waiting for LocalBackend's goroutine tracker to have no goroutines running (so the goroutine that calls Shutdown must have finished). This does not make TestStateMachine non-flaky because it can flake later in the test, too: the assertion on "unpause" after clearing the netmap between "Start4" and "Start4 -> netmap" sometimes fails. Updates tailscale/corp#36230 Updates #19377 Updates #18052 Signed-off-by: James Sanderson <jsanderson@tailscale.com>	2026-04-20 15:58:21 +01:00
Brad Fitzpatrick	dfc2667f8f	tstest/integration/testcontrol: make Stream w/ capver >= 68 match docs, prod testcontrol wasn't following the documented specs (and prod behavior), breaking a WIP integration test elsewhere. Updates tailscale/corp#40088 Change-Id: I02cf70894346bad7c85940b617d99c21c5310664 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-20 07:34:04 -07:00
Alex Chan	cf76202aa3	ipn/ipnlocal: log the local and remote TKA HEADs during sync Update this log message to show both the local and remote TKA HEAD; this is useful for debugging issues on nodes that have fallen behind the remote TKA HEAD. Updates tailscale/corp#39455 Change-Id: Ia62ce15756180d2fbac4a898fb94d6143df08b54 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-04-19 16:52:48 +01:00
Scott Graham	cb5a53c424	ipn/ipnlocal: preserve b.loginFlags in auto-login cc.Login calls LocalBackend stores loginFlags at construction so that per-instance properties (e.g. LoginEphemeral set by tsnet.Server.Ephemeral) persist for the session. StartLoginInteractiveAs already merges b.loginFlags into its cc.Login call, but the two auto-login call sites pass bare controlclient.LoginDefault, silently dropping any stored flags. Merge b.loginFlags at both auto-login call sites to match the existing StartLoginInteractiveAs pattern. LoginDefault is zero so this is a no-op when loginFlags is empty, and restores the documented behavior when it isn't. Fixes #15852 Signed-off-by: Scott Graham <scott.github@h4ck3r.net>	2026-04-17 23:31:18 -05:00
Adriano Sela Aviles	618dfd4081	client/local,types/netmap: modify services format in local api Updates the format of the service map that is served over the local api to be keyed without the "svc:" prefix. This change is backwards incompatible, this is OK because there is only one tailnet with the services-in-nodecapmap feature flag enabled, and the client side changes that start showing services over local api have not been released. (These were added in 4fcce6000d3d3f79d1ac1fca571a50efb059cbf2). Updates tailscale/corp#40052 Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>	2026-04-17 14:14:03 -07:00
Fernando Serboncini	514d7d28e7	misc/git_hook: extract shared githook package; auto-rebuild on version bump (#19440) Pull the hook logic into a reusable githook library package so tailscale/corp can share it via a thin wrapper main instead of keeping a forked copy in sync. The install flow also changes: a wrapper script now builds the binary and reinstalls the git hooks. Pulling new shared code no longer requires re-running the installer. Updates tailscale/corp#39860 Change-Id: I4d606d11c8c883015c190c54e3387a7f9fe4dd32 Signed-off-by: Fernando Serboncini <fserb@tailscale.com>	2026-04-17 16:24:39 -04:00
Brad Fitzpatrick	1fbb834dc3	logtail: add Logger.SetEnabled to toggle uploads at runtime Callers that need to turn logtail uploads on and off in response to user preference or policy changes previously had no choice: the package-level Disable is a one-way kill switch intended for the controlplane DisableLogTail debug message, and requires a process restart to undo. Add a per-Logger disabled flag, toggled via SetEnabled, that drops incoming entries without buffering while disabled. The process-wide Disable still takes precedence, so a controlplane-issued kill switch cannot be overridden by a client setting it back on. To simplify https://github.com/tailscale/tailscale-android/pull/695 Updates #13174 Change-Id: I06e75bd719c851f5f837ca5b2d1e17f7c68355f0 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-17 12:19:39 -07:00
kari-ts	8dda62cc24	feature/clientupdate: windows update should use tailscale.exe update (#19438) Currently, clientupdate.NewUpdater().Update() is called directly inside tailscaled, which fatals. There is also a failure that doesn't return, causing a panic. This fix allows us to use the same approach as startAutoUpdate, which is to find tailscale.exe and run tailscale.exe --update, though since it's calling the updater library directly, we get progress messages. Fixes tailscale/corp#40430s Signed-off-by: kari-ts <kari@tailscale.com>	2026-04-17 10:28:35 -07:00
BeckyPauley	b239e92eb6	cmd/k8s-operator: add e2e test setup and l7 ingress test for multi-tailnet (#19426) This change adds setup for a second tailnet to enable multi-tailnet e2e tests. When running against devcontrol, a second tailnet is created via the API. Otherwise, credentials are read from SECOND_TS_API_CLIENT_SECRET. Also adds an l7 HA Ingress test for multi-tailnet. Fixes tailscale/corp#37498 Signed-off-by: Becky Pauley <becky@tailscale.com>	2026-04-17 17:03:25 +01:00
Andrew Dunham	d52ae45e9b	cmd/cloner: deep-clone pointer elements in map-of-slice values The cloner's codegen for map[K][]V fields was doing a shallow append (copying pointer values) instead of cloning each element. This meant that cloned structs aliased the original's pointed-to values through the map's slice entries. Mirror the existing standalone-slice logic that checks ContainsPointers(sliceType.Elem()) and generates per-element cloning for pointer, interface, and struct types. Regenerate net/dns and tailcfg which both had affected map[...][]dnstype.Resolver fields. Fixes #19284 Signed-off-by: Andrew Dunham <andrew@tailscale.com>	2026-04-17 11:36:05 -04:00
Bjorn Stange	47ecbe5845	cmd/k8s-operator: add priorityClassName support to helm chart (#19236) Expose priorityClassName in the operator Helm chart values so that users can configure the operator deployment with a Kubernetes PriorityClass. This prevents the operator pods from being preempted by lower-priority workloads. Fixes #19235 Signed-off-by: Bjorn Stange <bjorn.stange@expel.io> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 12:57:12 +01:00
Brad Fitzpatrick	00a08ea86d	control/tsp: add lite map update support Updates #12542 Updates tailscale/corp#40088 Change-Id: Idb4526f1bf1f3f424d6fb3d7e34ebe89a474b57b Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-17 04:19:50 -07:00
Tom Proctor	c2da563fef	tstest/integration/vms: skip cloud-init package updates (#19443) The package updates started getting really slow yesterday. We can do better, but attempt a band aid fix for now, as the test is failing about a third of the time on PR CI. Updates tailscale/corp#40465 Change-Id: Icf53292ba83dd1ed76b9bdf9fb94a8f6fb448c07 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>	2026-04-17 10:39:47 +01:00
Brad Fitzpatrick	50d7176333	control/tsp, cmd/tsp: add low-level Tailscale protocol client and tool Add a new control/tsp package providing a client for speaking the Tailscale protocol to a coordination server over Noise, along with a cmd/tsp binary exposing it as a low-level composable tool for generating keys, registering nodes, and issuing map requests. Previously developed out-of-tree at github.com/bradfitz/tsp; imported here without git history. Updates #12542 Change-Id: I6ad21143c4aefe8939d4a46ae65b2184173bf69f Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-16 20:00:25 -07:00
Jordan Whited	69572c7435	derp/derpserver: add rate limit config metrics Updates tailscale/corp#40421 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2026-04-16 12:48:41 -07:00
Michael Ben-Ami	1dc08f4d41	appc,feature/conn25: prevent clients from forwarding DNS requests and modifying DNS responses for domains they are also connectors for For Connectors 2025, determine if a client is configured as a connector and what domains it is a connector for. When acting as a client, don't install Split DNS routes to other connectors for those domains, and don't alter DNS responses for those domains. The responses are forwarded back to the original client, which in turn does the alteration, swapping the real IP for a Magic IP. A client is also a connector for a domain if it has tags that overlap with tags in the configured policy, and --advertise-connector=true in the prefs (not in the self-node Hostinfo from the netmap). We use the prefs as the source of truth because control only gets a copy from the prefs, and may drift. And the AppConnector field is currently zeroed out in the self-node Hostinfo from control. The extension adds a ProfileStateChange hook to process prefs changes, and the config type is split into prefs and nodeview sub-configs. Fixes tailscale/corp#39317 Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>	2026-04-16 09:41:54 -04:00
Alex Chan	4f47c3c93d	ipn/ipnlocal: log AUM hash on startup as base32, not hex Before: tka initialized at head 325557575a59525354484e4a534f494b4c4e56575435583737564b5036584c4d4c335534554255344c344c36484c5a444a323341 After: tka initialized at head 2UWWZYRSTHNJSOIKLNVWT5X77VKP6XLML3U4UBU4L4L6HLZDJ23A Printing the AUM hash as hex makes it difficult to compare to other AUM hashes; stringifying it will make it consistent with other printing. Updates #cleanup Change-Id: Ic1e23a9ce6a71a53cff7d2190f9fa06eb838ab89 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-04-16 13:45:29 +01:00
Alex Valiushko	d3ba1480f5	magicsock: invalidate endpoint on trust timeout (#19415) Endpoint's best address was cleared on trustBestAddrUntil expiry only if it was a udprelay connection. This generalizes invalidation to also cover direct UDP. Trust deadline is checked in two cases: On disco ping timeout from the endpoint's best address. Traffic goes DERP-only, heartbeats to the old address stop. The discovery pings are still in flight, handled by the following. On disco ping success from an alternative. BestAddr switches to the working path, trust refreshed, eager discovery stops. The still in flight pongs are handled by betterAddr(). Updates #19407 Change-Id: Ic41ed18edb4a6e4350a2d49271ba01566a6a6964 Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>	2026-04-15 19:22:07 -07:00
Brad Fitzpatrick	b39ee0445d	util/httpm: open .git/index to defeat Go test caching TestUsedConsistently shells out to git grep to find forbidden http.Method* uses across the repo. Since the test itself doesn't open any repo files, Go's test cache considers it unchanged between commits and serves stale passing results even when new violations are introduced. Fix by opening .git/index, which makes Go's test cache track it as an input. The index file changes on git reset, checkout, pull, etc., so the cache is properly invalidated when moving between commits. Updates tailscale/corp#40359 Change-Id: If1497b992a545351bdd68cff279d60f5591fe70b Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-15 15:44:19 -07:00
David Bond	eea39eaf52	cmd/k8s-operator: add affinity rules to DNSConfig (#19360) This commit modifies the `DNSConfig` custom resource to allow the user to specify affinity rules on the nameserver pods. Updates: https://github.com/tailscale/tailscale/issues/18556 Signed-off-by: David Bond <davidsbond93@gmail.com>	2026-04-15 22:39:04 +01:00
Jonathan Nobels	acc43356c6	control/controlclient: enable request signatures on macOS (#19317) Fixes tailscale/corp#39422 Updates tailscale/certstore for proper macOS support and builds the request signing support into macOS builds. iOS and builds that do not use cgo are omitted. Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>	2026-04-15 14:11:14 -04:00
M. J. Fromberger	1e4934659b	ipn/ipnlocal: discard cached netmaps upon panic during SetNetworkMap (#19414) For debugging purposes, unstable builds will sometimes intentionally panic for unexpected behaviours. We observed such a panic after loading a cached netmap, but because we had a valid cached map, the client was unable to recover on its own and the operator had to manually reset the cache. As a defensive hedge, when netmap caching is enabled, check for a panic during installation of a new network map: If one occurs, discard any cached netmaps before letting the panic unwind, so that we do not lose the panic itself, but reduce the need for manual intervention. Updates #12639 Updates tailscale/corp#27300 Change-Id: I0436889c6bdc2fa728c9cb83630cd7b00a72ce68 Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>	2026-04-15 11:07:42 -07:00
Anton Tolchanov	958bcda5bf	control/controlclient: handle 429 responses during node registration If we get a 429 response during node registration, use the `Retry-After` header for backoff instead of the regular exponential backoff. The rate limiter error is propagated to the user, just like other registration errors are, e.g. ``` $ tailscale up backend error: node registration rate limited; will retry after 57s exit status 1 ``` Updates tailscale/corp#39533 Signed-off-by: Anton Tolchanov <anton@tailscale.com>	2026-04-15 18:54:08 +01:00
Jordan Whited	d8190e0de5	derp/derpserver: implement hierarchical token bucket rate limiting By adding a server-global parent bucket. Per-client rate limiting is subject to the parent bucket if global rate limiting is enabled. This implementation is experimental, and all related APIs should be considered unstable. Updates tailscale/corp#40291 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2026-04-15 09:06:03 -07:00
Tom Meadows	5eb0b4be31	cmd/containerboot,cmd/k8s-proxy,kube: add authkey renewal to k8s-proxy (#19221) * kube/authkey,cmd/containerboot: extract shared auth key reissue package Move auth key reissue logic (set marker, wait for new key, clear marker, read config) into a shared kube/authkey package and update containerboot to use it. No behaviour change. Updates #14080 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk> * kube/authkey,kube/state,cmd/containerboot: preserve device_id across restarts Stop clearing device_id, device_fqdn, and device_ips from state on startup. These keys are now preserved across restarts so the operator can track device identity. Expand ClearReissueAuthKey to clear device state and tailscaled profile data when performing a full auth key reissue. Updates #14080 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk> * cmd/containerboot: use root context for auth key reissue wait Pass the root context instead of bootCtx to setAndWaitForAuthKeyReissue. The 60-second bootCtx timeout was cancelling the reissue wait before the operator had time to respond, causing the pod to crash-loop. Updates #14080 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk> * cmd/k8s-proxy: add auth key renewal support Add auth key reissue handling to k8s-proxy, mirroring containerboot. When the proxy detects an auth failure (login-state health warning or NeedsLogin state), it disconnects from control, signals the operator via the state Secret, waits for a new key, clears stale state, and exits so Kubernetes restarts the pod with the new key. A health watcher goroutine runs alongside ts.Up() to short-circuit the startup timeout on terminal auth failures. Updates #14080 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk> --------- Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>	2026-04-15 16:13:46 +01:00
Brad Fitzpatrick	dbf468740b	control/controlclient: add patchify miss stats Add an opt-in metrics.LabelMap tracking why patchifyPeer fails to convert a PeersChanged entry into a PeersChangedPatch. The stats are gated behind the TS_DEBUG_PATCHIFY_PEER_MISS envknob so there is zero overhead in normal operation. peerChangeDiff now takes an optional onFalse callback that is called with the field name on every non-patchable return path. When the envknob is off, nil is passed and replaced with a no-op at the top of peerChangeDiff. The resulting metric renders as: counter_patchify_miss{why="Hostinfo"} 2 counter_patchify_miss{why="peer_not_found"} 1170 Updates tailscale/corp#40088 Change-Id: I2d4b9074bf42ec03ab296c0629a54106bafa873e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-15 08:05:57 -07:00
Claus Lensbøl	61c95f409c	control/controlclient: accept key if last seen on existing node is absent (#19402) On some nodes (found via natlab), the existing node's last seen could be unset. For these cases, we would want to accept the key and write a last seen. This was breaking the cached netmap natlab tests. Updates #12639 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2026-04-15 03:53:40 -04:00
Avery Pennarun	effbe67fe3	wgengine/magicsock: remove pickPort, use port 0 to avoid TOCTOU race pickPort would bind a UDP socket on :0 to get a free port, close the socket, then hope to rebind to the same port in NewConn. This is a TOCTOU race that can cause flaky test failures when another process grabs the port in between. Instead, pass Port: 0 to NewConn and let the OS assign the port atomically, then read back the assigned port via conn.LocalPort(). Fixes #19409 Change-Id: Ie44b599fb93c361e29a05f2171ad747c46f82b7a Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>	2026-04-14 18:08:47 -07:00
Naman Sood	6301a6ce4b	util/linuxfw,wgengine/router: allow incoming CGNAT range traffic with nodeattr Clients with the newly added node attribute `"disable-linux-cgnat-drop-rule"` will not automatically drop inbound traffic on non-Tailscale network interfaces with the source IP in the CGNAT IP range. This is an initial proof-of-concept for enabling connectivity with off-Tailnet CGNAT endpoints. Fixes tailscale/corp#36270. Signed-off-by: Naman Sood <mail@nsood.in>	2026-04-14 16:45:06 -04:00
Fernando Serboncini	5834058269	wgengine: replace reflect.DeepEqual with typed Equal for maybeReconfigInputs (#19365) reflect.DeepEqual is expensive and allocates heavily. Replace it with a field-by-field comparison that does zero allocations. Adds tests and benchmarks for the new Equal method. Fixes #19363 Signed-off-by: Fernando Serboncini <fserb@tailscale.com>	2026-04-14 13:16:21 -04:00
Brad Fitzpatrick	943b426038	util/linuxfw: fix nil deref in nftables chain check Fix a panic in getOrCreateChain when the kernel lacks nftables support (CONFIG_NF_TABLES). When the nftables netlink connection fails, chain objects returned by getChainFromTable can have nil Hooknum and Priority fields. Dereferencing these caused tailscaled to SIGSEGV during router configuration, which manifested as tailscaled silently crashing ~13 seconds after "tailscale up" on arm64 gokrazy (whose kernel.arm64 build doesn't include nftables). Updates #13038 Change-Id: I14433616da5ed57895cad37038921fb4f79c3534 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-14 07:45:01 -07:00
Brad Fitzpatrick	a0a8fae856	tstest/integration: use linkat to hardlink test binaries on Linux Use linkat via /proc/self/fd with AT_SYMLINK_FOLLOW to create a hardlink of the test binary instead of copying it. This avoids copying ~50MB+ binaries into each test's temp directory, making test setup faster and reducing disk I/O. The simpler os.Link(b.Path, ret.Path) can't be used here because the source binary lives in the first test's TempDir, which may be cleaned up before later tests call CopyTo. The open FD keeps the inode alive after the path is deleted, but os.Link needs a valid path. (See also b9f468240f which tried os.Link but is racy for this reason.) The /proc/self/fd approach works without elevated privileges, unlike AT_EMPTY_PATH which requires CAP_DAC_READ_SEARCH. If the linkat fails for any reason (e.g. cross-filesystem temp dirs), it falls back to the existing full-copy path. Fixes #19397 Change-Id: I4b1f97f7e63a9ae9e09dce36dfbdd1f6cff92320 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-14 07:13:10 -07:00
Avery Pennarun	621dc9cf1b	tstest: fix kernel version parsing for Debian-style version strings The kernel version parser used strings.Cut with "-" to handle versions like "5.4.0-76-generic", but Debian uses "+" in versions like "6.12.41+deb13-amd64". Use strings.IndexAny to find the first "-" or "+" and truncate there. Fixes TestKernelVersion on Debian systems. Fixes #19395 Change-Id: I70e5f95682d54baf908e51f9f4b51c130b00aaaa Co-Authored-By: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>	2026-04-14 07:11:44 -07:00
Brad Fitzpatrick	6aa10576c9	wgengine/magicsock: deflake TestTwoDevicePing compare-metrics-stats The compare-metrics-stats subtest reset two independent counting systems (physical connection counters and expvar.Int user metrics) non-atomically. Background WireGuard keepalives arriving between the resets could increment one system but not the other, causing off-by-one packet/byte mismatches in either direction. Replace the reset-then-compare pattern with snapshot-and-delta: snapshot both systems before pings, snapshot again after, and compare the deltas. This eliminates the non-atomic reset window entirely. As a belt-and-suspenders safety net, tolerate a difference of exactly one packet (and corresponding bytes) from a stray keepalive that could still arrive in the narrow window between the two snapshots. flakestress passes with ~5900 runs (~2800 without -race, ~3100 with -race) but it also passed previously too. This is an annoying one to repro. Fixes #11762 Change-Id: I3447ad67e71c8146e85eed38b7a665033ef9e284 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-14 06:57:24 -07:00
Brad Fitzpatrick	49eb1b5d26	net/dns: fix TestDNSTrampleRecovery failure under flakestress The test had two problems: 1. runFileWatcher passed hardcoded "/etc/" to the inotify watcher, but the test filesystem uses a temp directory prefix. The watcher was watching the real /etc/, never seeing the test's file writes. 2. The test's watchFile used gonotify.NewDirWatcher which creates goroutines that block on real inotify syscalls. These don't work inside synctest's fake-time bubble. The test only passed standalone by accident: gonotify walks /etc/ on startup producing fake events that happened to trigger trample detection at the right time. Fix the path issue by adding ActualPath to the wholeFileFS interface, which translates logical paths (like "/etc/resolv.conf") to real filesystem paths (respecting any test prefix). Use it in runFileWatcher so the inotify watch targets the correct directory. Replace gonotify in the test with a one-shot timer that synctest can advance through fake time, reliably triggering the trample check. Fixes #19400 Change-Id: Idb252881ec24d0ab3b3c1d154dbdaf532db837d4 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-14 06:55:35 -07:00
Claus Lensbøl	27f1d4c15d	control/controlclient: improve filter on netmap updates (#19308) The previous filters would allow for a handful of subtle issues such as updating the last seen date when the key or online status had not changed, and making online keys unconditionally make an engine update. These have been fixed alongside making no-change updates from TSMP into a no-op for the engine so we don't have to reconfigure. A bunch of additional testing has been added as well. Updates #12639 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2026-04-14 08:43:07 -04:00
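
Commit 621dc9cf1b in the log above describes truncating a kernel release string at the first "-" or "+" with strings.IndexAny, so that both "5.4.0-76-generic" and Debian's "6.12.41+deb13-amd64" parse. A minimal Go sketch of that idea follows; the helper name trimKernelSuffix is hypothetical and is not taken from the tstest package.

```go
package main

import (
	"fmt"
	"strings"
)

// trimKernelSuffix is a hypothetical helper (not the actual tstest code).
// It returns the part of a kernel release string before the first '-' or
// '+', or the whole string if neither character is present.
func trimKernelSuffix(release string) string {
	if i := strings.IndexAny(release, "-+"); i >= 0 {
		return release[:i]
	}
	return release
}

func main() {
	// The first two inputs are the examples quoted in the commit message.
	for _, r := range []string{"5.4.0-76-generic", "6.12.41+deb13-amd64", "6.1.0"} {
		fmt.Printf("%s -> %s\n", r, trimKernelSuffix(r))
	}
}
```

Cutting only on "-" would leave "6.12.41+deb13" for the Debian string, which is the failure the commit message describes.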

10501 Commits