tailscale

mirror of https://github.com/tailscale/tailscale.git synced 2025-12-07 10:22:06 +01:00

Author	SHA1	Message	Date
Jordan Whited	755309c04e	net/udprelay: use blake2s-256 MAC for handshake challenge This commit replaces crypto/rand challenge generation with a blake2s-256 MAC. This enables the peer relay server to respond to multiple forward disco.BindUDPRelayEndpoint messages per handshake generation without sacrificing the proof of IP ownership properties of the handshake. Responding to multiple forward disco.BindUDPRelayEndpoint messages per handshake generation improves client address/path selection where lowest client->server path/addr one-way delay does not necessarily equate to lowest client<->server round trip delay. It also improves situations where outbound traffic is filtered independent of input, and the first reply disco.BindUDPRelayEndpointChallenge message is dropped on the reply path, but a later reply using a different source would make it through. Reduction in serverEndpoint state saves 112 bytes per instance, trading for slightly more expensive crypto ops: 277ns/op vs 321ns/op on an M1 Macbook Pro. Updates tailscale/corp#34414 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2025-11-24 14:52:34 -08:00
Tom Proctor	6637003cc8	cmd/cigocacher,go.mod: add cigocacher cmd Adds cmd/cigocacher as the client to cigocached for Go caching over HTTP. The HTTP cache is best-effort only, and builds will fall back to disk-only cache if it's not available, much like regular builds. Not yet used in CI; that will follow in another PR once we have runners available in this repo with the right network setup for reaching cigocached. Updates tailscale/corp#10808 Change-Id: I13ae1a12450eb2a05bd9843f358474243989e967 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>	2025-11-24 21:15:46 +00:00
Andrew Dunham	698eecda04	ipn/ipnlocal: fix panic in driveTransport on network error When the underlying transport returns a network error, the RoundTrip method returns (nil, error). The defer was attempting to access resp without checking if it was nil first, causing a panic. Fix this by checking for nil in the defer. Also changes driveTransport.tr from *http.Transport to http.RoundTripper and adds a test. Fixes #17306 Signed-off-by: Andrew Dunham <andrew@tailscale.com> Change-Id: Icf38a020b45aaa9cfbc1415d55fd8b70b978f54c	2025-11-24 10:35:23 -05:00
Andrew Dunham	a20cdb5c93	tstest/integration/testcontrol: de-flake TestUserMetricsRouteGauges SetSubnetRoutes was not sending update notifications to nodes when their approved routes changed, causing nodes to not fetch updated netmaps with PrimaryRoutes populated. This resulted in TestUserMetricsRouteGauges flaking because it waited for PrimaryRoutes to be set, which only happened if the node happened to poll for other reasons. Now send updateSelfChanged notification to affected nodes so they fetch an updated netmap immediately. Fixes #17962 Signed-off-by: Andrew Dunham <andrew@tailscale.com>	2025-11-23 21:13:23 -05:00
Andrew Dunham	16587746ed	portlist,tstest: skip tests on kernels with /proc/net/tcp regression Linux kernel versions 6.6.102-104 and 6.12.42-45 have a regression in /proc/net/tcp that causes seek operations to fail with "illegal seek". This breaks portlist tests on these kernels. Add kernel version detection for Linux systems and a SkipOnKernelVersions helper to tstest. Use it to skip affected portlist tests on the broken kernel versions. Thanks to philiptaron for the list of kernels with the issue and fix. Updates #16966 Signed-off-by: Andrew Dunham <andrew@tailscale.com>	2025-11-21 22:33:57 -05:00
Nick Khyl	1ccece0f78	util/eventbus: use unbounded event queues for DeliveredEvents in subscribers Bounded DeliveredEvent queues reduce memory usage, but they can deadlock under load. Two common scenarios trigger deadlocks when the number of events published in a short period exceeds twice the queue capacity (there's a PublishedEvent queue of the same size): - a subscriber tries to acquire the same mutex as held by a publisher, or - a subscriber for A events publishes B events Avoiding these scenarios is not practical and would limit eventbus usefulness and reduce its adoption, pushing us back to callbacks and other legacy mechanisms. These deadlocks already occurred in customer devices, dev machines, and tests. They also make it harder to identify and fix slow subscribers and similar issues we have been seeing recently. Choosing an arbitrary large fixed queue capacity would only mask the problem. A client running on a sufficiently large and complex customer environment can exceed any meaningful constant limit, since event volume depends on the number of peers and other factors. Behavior also changes based on scheduling of publishers and subscribers by the Go runtime, OS, and hardware, as the issue is essentially a race between publishers and subscribers. Additionally, on lower-end devices, an unreasonably high constant capacity is practically the same as using unbounded queues. Therefore, this PR changes the event queue implementation to be unbounded by default. The PublishedEvent queue keeps its existing capacity of 16 items, while subscribers' DeliveredEvent queues become unbounded. This change fixes known deadlocks and makes the system stable under load, at the cost of higher potential memory usage, including cases where a queue grows during an event burst and does not shrink when load decreases. Further improvements can be implemented in the future as needed. Fixes #17973 Fixes #18012 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2025-11-21 16:00:12 -06:00
Jordan Whited	9245c7131b	feature/relayserver: don't publish from within a subscribe fn goroutine Updates #17830 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2025-11-21 12:28:38 -08:00
Claus Lensbøl	e7f5ca1d5e	wgengine/userspace: run link change subscribers in eventqueue (#18024) Updates #17996 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2025-11-21 14:49:37 -05:00
Nick Khyl	3780f25d51	util/eventbus: add tests for a subscriber publishing events As of 2025-11-20, publishing more events than the eventbus's internal queues can hold may deadlock if a subscriber tries to publish events itself. This commit adds a test that demonstrates this deadlock, and skips it until the bug is fixed. Updates #18012 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2025-11-21 13:35:48 -06:00
Nick Khyl	016ccae2da	util/eventbus: add tests for a subscriber trying to acquire the same mutex as a publisher As of 2025-11-20, publishing more events than the eventbus's internal queues can hold may deadlock if a subscriber tries to acquire a mutex that can also be held by a publisher. This commit adds a test that demonstrates this deadlock, and skips it until the bug is fixed. Updates #17973 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2025-11-21 13:35:48 -06:00
Alex Chan	ce95bc77fb	tka: don't panic if no clock set in tka.Mem This is causing confusing panics in tailscale/corp#34485. We'll keep using the tka.ChonkMem constructor as much as we can, but don't panic if you create a tka.Mem directly -- we know what the sensible thing is. Updates #cleanup Signed-off-by: Alex Chan <alexc@tailscale.com> Change-Id: I49309f5f403fc26ce4f9a6cf0edc8eddf6a6f3a4	2025-11-21 17:20:28 +00:00
Andrew Lytvynov	c679aaba32	cmd/tailscaled,ipn: show a health warning when state store fails to open (#17883) With the introduction of node sealing, store.New fails in some cases due to the TPM device being reset or unavailable. Currently it results in tailscaled crashing at startup, which is not obvious to the user until they check the logs. Instead of crashing tailscaled at startup, start with an in-memory store with a health warning about state initialization and a link to (future) docs on what to do. When this health message is set, also block any login attempts to avoid masking the problem with an ephemeral node registration. Updates #15830 Updates #17654 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2025-11-20 15:52:58 -06:00
Andrew Lytvynov	de8ed203e0	go.mod: bump golang.org/x/crypto (#18011) Pick up fixes for https://pkg.go.dev/vuln/GO-2025-4134 Updates #cleanup Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2025-11-20 14:10:38 -06:00
Harry Harpham	ac74d28190	ipn/ipnlocal: add validations when setting serve config (#17950) These validations were previously performed in the CLI frontend. There are two motivations for moving these to the local backend: 1. The backend controls synchronization around the relevant state, so only the backend can guarantee many of these validations. 2. Doing these validations in the back-end avoids the need to repeat them across every frontend (e.g. the CLI and tsnet). Updates tailscale/corp#27200 Signed-off-by: Harry Harpham <harry@tailscale.com>	2025-11-20 13:40:05 -06:00
David Bond	42a5262016	cmd/k8s-operator: add multi replica support for recorders (#17864) This commit adds the `spec.replicas` field to the `Recorder` custom resource that allows for a highly available deployment of `tsrecorder` within a kubernetes cluster. Many changes were required here as the code hard-coded the assumption of a single replica. This has required a few loops, similar to what we do for the `Connector` resource to create auth and state secrets. It was also required to add a check to remove dangling state and auth secrets should the recorder be scaled down. Updates: https://github.com/tailscale/tailscale/issues/17965 Signed-off-by: David Bond <davidsbond93@gmail.com>	2025-11-20 11:46:34 +00:00
Jonathan Nobels	682172ca2d	net/netns: remove spammy logs for interface binding caps fixes tailscale/tailscale#17990 The logging for the netns caps is spammy. Log only on changes to the values and don't log Darwin specific stuff on non Darwin clients. Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>	2025-11-19 18:19:07 -08:00
Brad Fitzpatrick	7d19813618	net/batching: fix import formatting From #17842 Updates #cleanup Change-Id: Ie041b50659361b50558d5ec1f557688d09935f7c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-19 18:13:46 -08:00
David Bond	86a849860e	cmd/k8s-operator: use stable image for k8s-nameserver (#17985) This commit modifies the kubernetes operator to use the "stable" version of `k8s-nameserver` by default. Updates: https://github.com/tailscale/corp/issues/19028 Signed-off-by: David Bond <davidsbond93@gmail.com>	2025-11-20 00:00:27 +00:00
KevinLiang10	a0d059d74c	cmd/tailscale/cli: allow remote target as service destination (#17607) This commit enables user to set service backend to remote destinations, that can be a partial URL or a full URL. The commit also prevents user to set remote destinations on linux system when socket mark is not working. For user on any version of mac extension they can't serve a service either. The socket mark usability is determined by a new local api. Fixes tailscale/corp#24783 Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>	2025-11-19 12:29:08 -05:00
License Updater	12c598de28	licenses: update license notices Signed-off-by: License Updater <noreply+license-updater@tailscale.com>	2025-11-19 07:06:18 -08:00
Alex Chan	976bf24f5e	ipn/ipnlocal: remove the always-true CanSupportNetworkLock() Now that we support using an in-memory backend for TKA state (#17946), this function always returns `nil` – we can always support Network Lock. We don't need it any more. Plus, clean up a couple of errant TODOs from that PR. Updates tailscale/corp#33599 Change-Id: Ief93bb9adebb82b9ad1b3e406d1ae9d2fa234877 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-19 14:51:13 +00:00
Brad Fitzpatrick	6ac4356bce	util/eventbus: simplify some reflect in Bus.pump Updates #cleanup Change-Id: Ib7b497e22c6cdd80578c69cf728d45754e6f909e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-19 06:23:34 -08:00
Alex Chan	336df56f85	cmd/tailscale/cli: remove Latin abbreviations from CLI help text Our style guide recommends avoiding Latin abbreviations in technical documentation, which includes the CLI help text. This is causing linter issues for the docs site, because this help text is copied into the docs. See http://go/style-guide/kb/language-and-grammar/abbreviations#latin-abbreviations Updates #cleanup Change-Id: I980c28d996466f0503aaaa65127685f4af608039 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-19 13:22:13 +00:00
Alex Chan	aeda3e8183	ipn/ipnlocal: reduce profileManager boilerplate in network-lock tests Updates tailscale/corp#33537 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-19 13:21:52 +00:00
Raj Singh	62d64c05e1	cmd/k8s-operator: fix type comparison in apiserver proxy template (#17981) ArgoCD sends boolean values but the template expects strings, causing "incompatible types for comparison" errors. Wrap values with toString so both work. Fixes #17158 Signed-off-by: Raj Singh <raj@tailscale.com>	2025-11-19 13:08:40 +00:00
Alex Chan	e1dd9222d4	ipn/ipnlocal, tka: compact TKA state after every sync Previously a TKA compaction would only run when a node starts, which means a long-running node could use unbounded storage as it accumulates ever-increasing amounts of TKA state. This patch changes TKA so it runs a compaction after every sync. Updates https://github.com/tailscale/corp/issues/33537 Change-Id: I91df887ea0c5a5b00cb6caced85aeffa2a4b24ee Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-19 12:27:04 +00:00
David Bond	38ccdbe35c	cmd/k8s-operator: default to stable image (#17848) This commit modifies the helm/static manifest configuration for the k8s-operator to prefer the stable image tag. This avoids making those using static manifests seeing unstable behaviour by default if they do not manually make the change. This is managed for us when using helm but not when generating the static manifests. Updates https://github.com/tailscale/tailscale/issues/10655 Signed-off-by: David Bond <davidsbond93@gmail.com>	2025-11-19 11:57:27 +00:00
Brad Fitzpatrick	408336a089	feature/featuretags: add CacheNetMap feature tag for upcoming work (trying to get in smaller obvious chunks ahead of later PRs to make them smaller) Updates #17925 Change-Id: I184002001055790484e4792af8ffe2a9a2465b2e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-18 18:11:20 -08:00
Brad Fitzpatrick	5b0c57f497	tailcfg: add some omitzero, adjust some omitempty to omitzero Updates tailscale/corp#25406 Change-Id: I7832dbe3dce3774bcc831e3111feb75bcc9e021d Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-18 17:36:27 -08:00
Joe Tsai	3b865d7c33	cmd/netlogfmt: support resolving IP addresses to synonymous labels (#17955) We now embed node information into network flow logs. By default, netlogfmt still prints out using Tailscale IP addresses. Support a "--resolve-addrs=TYPE" flag that can be used to specify resolving IP addresses as node IDs, hostnames, users, or tags. Updates tailscale/corp#33352 Signed-off-by: Joe Tsai <joetsai@digital-static.net>	2025-11-18 14:16:27 -08:00
James Tucker	c09c95ef67	types/key,wgengine/magicsock,control/controlclient,ipn: add debug disco key rotation Adds the ability to rotate discovery keys on running clients, needed for testing upcoming disco key distribution changes. Introduces key.DiscoKey, an atomic container for a disco private key, public key, and the public key's ShortString, replacing the prior separate atomic fields. magicsock.Conn has a new RotateDiscoKey method, and access to this is provided via localapi and a CLI debug command. Note that this implementation is primarily for testing as it stands, and regular use should likely introduce an additional mechanism that allows the old key to be used for some time, to provide a seamless key rotation rather than one that invalidates all sessions. Updates tailscale/corp#34037 Signed-off-by: James Tucker <james@tailscale.com>	2025-11-18 12:16:15 -08:00
Fran Bull	da508c504d	appc: add ippool type As part of the conn25 work we will want to be able to keep track of a pool of IP Addresses and know which have been used and which have not. Fixes tailscale/corp#34247 Signed-off-by: Fran Bull <fran@tailscale.com>	2025-11-18 10:46:28 -08:00
Alex Chan	d0daa5a398	tka: marshal AUMHash totext even if Tailnet Lock is omitted We use `tka.AUMHash` in `netmap.NetworkMap`, and we serialise it as JSON in the `/debug/netmap` C2N endpoint. If the binary omits Tailnet Lock support, the debug endpoint returns an error because it's unable to marshal the AUMHash. This patch adds a sentinel value so this marshalling works, and we can use the debug endpoint. Updates https://github.com/tailscale/tailscale/issues/17115 Signed-off-by: Alex Chan <alexc@tailscale.com> Change-Id: I51ec1491a74e9b9f49d1766abd89681049e09ce4	2025-11-18 18:34:09 +00:00
Anton Tolchanov	04a9d25a54	tka: mark young AUMs as active even if the chain is long Existing compaction logic seems to have had an assumption that markActiveChain would cover a longer part of the chain than markYoungAUMs. This prevented long, but fresh, chains, from being compacted correctly. Updates tailscale/corp#33537 Signed-off-by: Anton Tolchanov <anton@tailscale.com>	2025-11-18 18:04:12 +00:00
Brad Fitzpatrick	bd29b189fe	types/netmap,*: remove some redundant fields from NetMap Updates #12639 Change-Id: Ia50b15529bd1c002cdd2c937cdfbe69c06fa2dc8 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-18 07:56:10 -08:00
Brad Fitzpatrick	2a6cbb70d9	.github/workflows: make go_generate check detect new files Updates #17957 Change-Id: I904fd5b544ac3090b58c678c4726e7ace41a52dd Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-18 06:42:08 -08:00
Brad Fitzpatrick	4e2f2d1088	feature/buildfeatures: re-run go generate 6a73c0bdf55 added a feature tag but didn't re-run go generate on ./feature/buildfeatures. Updates #9192 Change-Id: I7819450453e6b34c60cad29d2273e3e118291643 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-18 06:42:08 -08:00
Alex Chan	af7c26aa05	cmd/vet/jsontags: fix a typo in an error message Updates #17945 Change-Id: I8987271420feb190f5e4d85caff305c8d4e84aae Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-18 13:01:35 +00:00
Alex Chan	85373ef822	tka: move RemoveAll() to CompactableChonk I added a RemoveAll() method on tka.Chonk in #17946, but it's only used in the node to purge local AUMs. We don't need it in the SQLite storage, which currently implements tka.Chonk, so move it to CompactableChonk instead. Also add some automated tests, as a safety net. Updates tailscale/corp#33599 Change-Id: I54de9ccf1d6a3d29b36a94eccb0ebd235acd4ebc Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-18 12:53:52 +00:00
Alex Chan	c2e474e729	all: rename variables with lowercase-l/uppercase-I See http://go/no-ell Signed-off-by: Alex Chan <alexc@tailscale.com> Updates #cleanup Change-Id: I8c976b51ce7a60f06315048b1920516129cc1d5d	2025-11-18 09:12:34 +00:00
James 'zofrex' Sanderson	9048ea25db	ipn/localapi: log calls to localapi (#17880) Updates tailscale/corp#34238 Signed-off-by: James Sanderson <jsanderson@tailscale.com>	2025-11-18 08:04:03 +00:00
James 'zofrex' Sanderson	a2e9dfacde	cmd/tailscale/cli: warn if a simple up would change prefs (#17877) Updates tailscale/corp#21570 Signed-off-by: James Sanderson <jsanderson@tailscale.com>	2025-11-18 07:53:42 +00:00
Joe Tsai	4860c460f5	wgengine/netlog: strip dot suffix from node name (#17954) The REST API does not return a node name with a trailing dot, while the internal node name reported in the netmap does have one. In order to be consistent with the API, strip the dot when recording node information. Updates tailscale/corp#33352 Signed-off-by: Joe Tsai <joetsai@digital-static.net>	2025-11-17 19:17:02 -08:00
James Tucker	41662f5128	ssh/tailssh: fix incubator tests on macOS arm64 Perform a path check first before attempting exec of `true`. Try /usr/bin/true first, as that is now and increasingly so, the more common and more portable path. Fixes tests on macOS arm64 where exec was returning a different kind of path error than previously checked. Updates #16569 Signed-off-by: James Tucker <james@tailscale.com>	2025-11-17 16:20:46 -08:00
Andrew Lytvynov	26f9b50247	feature/tpm: disable dictionary attack protection on sealing key (#17952) DA protection is not super helpful because we don't set an authorization password on the key. But if authorization fails for other reasons (like TPM being reset), we will eventually cause DA lockout with tailscaled trying to load the key. DA lockout then leads to (1) issues for other processes using the TPM and (2) the underlying authorization error being masked in logs. Updates #17654 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2025-11-17 14:42:15 -08:00
Brad Fitzpatrick	f1cddc6ecf	ipn{,/local},cmd/tailscale: add "sync" flag and pref to disable control map poll For manual (human) testing, this lets the user disable control plane map polls with "tailscale set --sync=false" (which survives restarts) and "tailscale set --sync" to restore. A high severity health warning is shown while this is active. Updates #12639 Updates #17945 Change-Id: I83668fa5de3b5e5e25444df0815ec2a859153a6d Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-17 12:37:31 -08:00
Brad Fitzpatrick	165a24744e	tka: fix typo in comment Let's fix all the typos, which lets the code be more readable, lest we confuse our readers. Updates #cleanup Change-Id: I4954601b0592b1fda40269009647bb517a4457be Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-17 10:13:24 -08:00
Alex Chan	1723cb83ed	ipn/ipnlocal: use an in-memory TKA store if FS is unavailable This requires making the internals of LocalBackend a bit more generic, and implementing the `tka.CompactableChonk` interface for `tka.Mem`. Signed-off-by: Alex Chan <alexc@tailscale.com> Updates https://github.com/tailscale/corp/issues/33599	2025-11-17 18:12:33 +00:00
Andrew Lytvynov	d01081683c	go.mod: bump golang.org/x/crypto (#17907) Pick up a fix for https://pkg.go.dev/vuln/GO-2025-4116 (even though we're not affected). Updates #cleanup Change-Id: I9f2571b17c1f14db58ece8a5a34785805217d9dd Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2025-11-17 09:05:18 -08:00
Alex Chan	200383dce5	various: add more missing apostrophes in comments Updates #cleanup Change-Id: I79a0fda9783064a226ee9bcee2c1148212f6df7b Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-17 16:47:17 +00:00

9931 Commits
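
Commit 1ccece0f78 in the list above makes subscribers' DeliveredEvent queues unbounded so that a burst of events cannot block the publisher. The sketch below illustrates the general idea of an unbounded FIFO between two goroutines. It is an assumption-laden illustration and not the eventbus implementation: the event type and the unbounded function are hypothetical.

```go
package main

import "fmt"

// event is a hypothetical payload type used only for this sketch.
type event int

// unbounded forwards events from in to the returned channel, buffering
// them in a slice that grows as needed. A sender on in therefore never
// waits for the consumer, at the cost of memory use that grows with the
// backlog.
func unbounded(in <-chan event) <-chan event {
	out := make(chan event)
	go func() {
		defer close(out)
		var buf []event
		for in != nil || len(buf) > 0 {
			// A nil channel never becomes ready in a select, so the send
			// case is enabled only while something is buffered.
			var send chan<- event
			var head event
			if len(buf) > 0 {
				send = out
				head = buf[0]
			}
			select {
			case v, ok := <-in:
				if !ok {
					in = nil // input closed: drain the buffer, then exit
					continue
				}
				buf = append(buf, v)
			case send <- head:
				buf = buf[1:]
			}
		}
	}()
	return out
}

func main() {
	in := make(chan event)
	out := unbounded(in)

	// Publish a burst before anything is consumed. With a small bounded
	// queue and no reader, this loop would block.
	for i := 0; i < 1000; i++ {
		in <- event(i)
	}
	close(in)

	n := 0
	for range out {
		n++
	}
	fmt.Println(n) // prints 1000
}
```

As the commit message notes, the trade-off is higher potential memory usage: nothing in this sketch limits how large the buffer may grow during a burst.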