cibuild.On() returns true for any CI environment that sets CI=true,
including Alpine Linux's package build CI. TestTsgoRevInCacheKey was
guarded by cibuild.On() (or use of tsgo), so it ran under Alpine's CI
with stock Go, where go.toolchain.rev isn't blended into build cache
keys, and unsurprisingly failed.
Add cibuild.OnTailscaleCI, which keys off GITHUB_REPOSITORY_OWNER to
distinguish tailscale/tailscale's own GitHub Actions CI from arbitrary
downstream CI, and use it in TestTsgoRevInCacheKey.
Fixes #19754
Change-Id: Id31cfe71903a235f1460dca1e2fdf334e3ba1ee5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Since f343b496c3 ("wgengine, all: remove LazyWG, use wireguard-go
callback API for on-demand peers"), Reconfig is fully synchronous:
magicConn.UpdatePeers, wgdev.RemovePeer, router.Set, and dns.Set all
return when the work is done, and the peer list is updated under
wgLock before Reconfig returns. So after Reconfig with empty configs,
len(st.Peers) is already 0.
The old loop also waited for st.DERPs to drain to 0, but UpdatePeers
only edits maps; active DERP connections idle out on their own
timeout. The sole caller (LocalBackend.stopEngineAndWait) doesn't
inspect st.DERPs anyway; it just hands the Status to
setWgengineStatusLocked. So the drain-wait was for nothing observable
and could in theory (or could at least appear to readers to) loop
forever while holding b.mu. Remove that reader confusion by removing
the backoff loop entirely.
Updates #19759
Change-Id: Ibfac3f0baabcad7604b713c934a8fc37932e0a50
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a VM-based natlab test that exercises the peer-relay feature
(feature/relayserver) end-to-end across three Tailscale nodes whose
network topology makes a direct A<->B UDP path impossible: both peers
are behind HardNAT (FreeBSD/pfSense-style endpoint-dependent NAT) with
no port-mapping services, while the relay node is behind One2OneNAT so
its STUN-discovered WAN endpoint is reachable from both peers. The
test enables the relay server via EditPrefs, then waits for an a->b
PingDisco whose PingResult.PeerRelay is set (proving magicsock chose
the peer-relay path, not DERP), and finally asserts that the relay's
DebugPeerRelaySessions LocalAPI reports the session.
The existing TestPeerRelayPing in tstest/integration runs three
tailscaled processes on the loopback interface with no NATs; this new
vmtest covers peer relay through real per-VM kernels and NATs.
To wire control-server capabilities into vmtest, also add a
PeerRelayGrants() EnvOption (sibling of AllOnline,
SameTailnetUser) that flips testcontrol.Server.PeerRelayGrants so the
wildcard packet filter grants tailcfg.PeerCapabilityRelay and
PeerCapabilityRelayTarget; without those caps magicsock won't consider
any peer a candidate relay.
Updates #13038
Change-Id: Ib3440b83ec442da0d3b89ffa48ceea9398ea9062
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Zorin's version scheme differs from Ubuntu's, even though the OS is
based on Ubuntu. We need to check Zorin's version numbers to pick the
right APT_KEY_TYPE.
Updates #18925
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
In a lot of places, we construct an error to End a step, then immediately log
it to the governing test as a fatal error. Save ourselves a bit of boilerplate
by putting methods on Step for that.
There are a couple cases this doesn't cover, e.g., where we construct the Step
outside a subtest that wants to fail individually, but it helps enough to pay
for its lines.
Updates #13038
Change-Id: I71f9900942962de16609b6b198d3ba13d6958a5f
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The label "natlab" is a bit confusing and also used for other things.
Instead, change the trigger label to "run-natlab-tests".
Updates #13038
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
The codegen path for map-of-slice-of-pointer fields skipped
nil-valued entries, which dropped the key from the map.
This broke dns.Config.Routes, which uses nil values as sentinels.
Fixes #19730
Fixes #19732
Fixes #19746
Fixes #19744
Change-Id: Ic6400227f4ab21b3ca0e8c0eeecf9b83d145a9ab
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
Fix the following issues:
1. Endianness Bug: The nftables runner used hardcoded
big-endian byte arrays for firewall mark values (0xff0000, etc.), breaking
bitwise operations on little-endian systems (all x86/x64, ARM). This caused
connmark save/restore rules to silently fail. Fixed by using
binary.NativeEndian to generate correct byte order for the host system.
2. Connmark Restore Conditional Check: The connmark restore
mechanism unconditionally overwrote packet marks, even when Tailscale
hadn't set any mark bits in conntrack. This destroyed mark bits set by
other systems (VPNs, policy routing, vendor flags), breaking coexistence.
Fixed by adding a conditional check to only restore when (ct mark &
0xff0000) != 0, preventing the worst case of wiping all marks to zero.
Changes:
- util/linuxfw/linuxfw.go: Added nativeEndianUint32() helper and updated
all mask functions to use native byte order instead of hardcoded bytes
- util/linuxfw/nftables_runner.go: Added conditional check in
makeConnmarkRestoreExprs() to only restore when ct mark has Tailscale
bits set; added detailed comment about bit preservation limitations
- util/linuxfw/iptables_runner.go: Added conditional check using -m
connmark ! --mark to match nftables behavior
- Tests updated: Fixed byte-level regression tests to expect little-endian
byte sequences and verify the new conditional check
Note: Perfect bit preservation in nftables remains challenging
due to nftables expression VM limitations. The current implementation
prevents the critical case of wiping marks with zero.
Updates #3310
Fixes #11803
Related to #8555
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Brings Subscriber[T] in line with the same non-generic-core pattern already
applied to SubscriberFunc[T] and Publisher[T]:
- Renames subscriberFuncCore to subscriberCore and shares it between
Subscriber[T] and SubscriberFunc[T]. Both typed facades hold a
*subscriberCore plus their respective per-T delivery state
(Subscriber: chan T; SubscriberFunc: nothing, the user callback is
captured in the dispatch closure).
- The bus's outputs map and subscriber-interface itab key on
*subscriberCore for both subscriber kinds, so adding a new Subscribe[T]
call site no longer pays a per-T itab, dictionary, or equality function
for the subscriber-interface side.
- Subscribe[T] now hoists the non-generic constructor portion into
newSubscriberCore (timer setup, core allocation, cached type/typeName,
unregister method-value), matching SubscribeFunc.
The dispatch loop is intentionally NOT extracted to a non-generic helper for
Subscriber[T], unlike SubscriberFunc[T]. The reason is the typed channel send
'case s.read <- t:' must appear lexically inside the select; the only way to
lift it into a non-generic loop is to bridge typed and untyped via a per-event
goroutine, which costs ~2.7x throughput on BenchmarkBasicThroughput. We keep
dispatchTyped on the generic facade and accept the per-shape stencil cost as
the cheaper alternative.
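The facade/core split described above can be sketched as follows; this is a simplified model of the pattern, not the actual eventbus code:

```go
package main

import "reflect"

// subscriberCore holds everything T-independent. The bus's maps and
// the package-private subscriber interface see only this type, so a
// new Subscribe[T] call site adds no per-T itab or dictionary there.
type subscriberCore struct {
	dispatch func(event any) // captured closure does the one T-bound step
	typeName string          // reflect.Type string, cached at construction
}

// Subscriber is the thin typed facade: a *subscriberCore plus the
// per-T delivery channel.
type Subscriber[T any] struct {
	core *subscriberCore
	read chan T
}

func newSubscriber[T any]() *Subscriber[T] {
	s := &Subscriber[T]{read: make(chan T, 1)}
	s.core = &subscriberCore{
		typeName: reflect.TypeFor[T]().String(),
		dispatch: func(event any) { s.read <- event.(T) },
	}
	return s
}
```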
Symbol-level effect on tailscaled (linux/amd64, measured via
`go tool nm -size`):
Before:
(*Subscriber[T]).dispatch
2 shape stencils: 1,682 + 1,549 = 3,231 B
3 thin per-T wrappers: 124 B each = 372 B
2 deferwrap1 helpers: 62 B each = 124 B
total: 3,727 B
After:
(*Subscriber[T]).dispatchTyped
2 shape stencils: 1,678 + 1,582 = 3,260 B
0 per-T wrappers (replaced by closure stored on core)
2 deferwrap1 helpers: 62 B each = 124 B
total: 3,384 B
dispatch path .text delta: -343 B (-9.2%)
Per-shape stencils are ~1,600 B (.text body) + ~1,100 B (pclntab) =
~2,700 B each on production tailscaled. The shape count matches before/after
(two distinct GC shapes for the Subscriber[T] event types in this binary).
What changes is that the per-T thin wrappers are eliminated because
Subscriber[T] no longer implements the subscriber interface directly.
Whole-binary section deltas:
.text: -2,304 B (includes the dispatch savings plus other
small downstream effects)
.rodata: +512 B (additional closure-type metadata)
.gopclntab: -2,981 B (fewer per-T compiled functions => less metadata)
Stripped tailscaled (linux/amd64): no change at the file level (the savings
fall below the linker's section-alignment boundary). Unstripped builds shrink
by ~2,900 B.
Behavior is unchanged:
BenchmarkBasicThroughput: 2,161 ns/op, 0 B/op, 0 allocs/op
BenchmarkBasicFuncThroughput: 2,493 ns/op, 144 B/op, 2 allocs/op
BenchmarkSubsThroughput: 3,727 ns/op, 0 B/op, 0 allocs/op
Updates #12614
Change-Id: I97918ec68bd2cdb15958bbfd7687592b39663efe
Signed-off-by: James Tucker <james@tailscale.com>
Server.clientsAtomic was introduced in 6b729795c30f3 as a lock-free
mirror of Server.clients to skip Server.mu on the packet send hot
path. This drops the non-concurrent map and makes all the existing
callers of the old plain map just use the concurrent map, but still
holding Server.mu.
BenchmarkLookupDestHashTrie is unchanged at ~2ns/op.
Fixes #19726
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I0894e4d86914d152b9b5fef969a3184bcb96f678
Warnables with a non-zero TimeToVisible are only published on the eventbus when
they remain unhealthy long enough to become visible.
However, we still publish a health.Change when a warning that was never visible
(and was never published to the eventbus) becomes healthy.
This PR fixes that and reduces churn when there is no actual state change. In
particular, it avoids unnecessary IPN bus notifications sent to GUI/CLI clients,
captive portal detection, etc.
Updates tailscale/corp#39759 (noticed while working on it)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Adds a new NoiseRoundTripper field to tsd.Sys
to expose an http.RoundTripper to make requests
over the control plane Noise connection.
This will be used in PAM use cases soon.
Updates tailscale/corp#41800
Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
A missing hosts file is not a fatal error. We should log it, but still proceed
and create a new one instead of failing the DNS reconfiguration completely.
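The non-fatal handling has roughly this shape; names here are illustrative, not the actual DNS manager code:

```go
package main

import (
	"errors"
	"io/fs"
	"log"
	"os"
)

// readHostsFile treats a missing hosts file as empty rather than
// fatal: it logs the miss and lets reconfiguration proceed, so a
// fresh file can be written afterward.
func readHostsFile(path string) ([]byte, error) {
	b, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		log.Printf("hosts file %q does not exist; will create it", path)
		return nil, nil
	}
	return b, err
}
```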
Fixes #19733
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Instead of having two entry points for running natlab tests, start
converting the connectivity tests to use the vmtest framework.
Grid and pair tests have yet to be moved over.
Updates #13038
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This patch fixes a data race in wgengine/netstack that surfaced while
running both TestTCPForwardLimits and TestTCPForwardLimits_PerClient.
Because these two tests both set up the TS_DEBUG_NETSTACK envknob, a
race occurred when netstack.Impl.Close leaked its inject goroutine.
The inject goroutine also reads the TS_DEBUG_NETSTACK envknob, so if
it is still running when the next test starts, it breaks that test.
This patch also cleans up the tests a bit, ensuring that neither of
them run in T.Parallel. It also adds a T.Cleanup call to clear the
envknob.
Fixes #19720
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This fixes a log message where ipn/ipnlocal.shouldUseOneCGNATRoute
would claim that an Android machine was actually macOS.
Updates #cleanup
Updates #19652
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Replace the process-global Server.mu lookup in the packet send hot path
with a global hashtriemap mirror of local clientSet entries. The
authoritative clients map remains guarded by Server.mu; clientsAtomic is
only a lock-free fast path for active local clients.
Misses, stale inactive client sets, duplicate accounting, and mesh
forwarding still fall back to lookupDestUncached. This avoids taking
Server.mu for the common local active-client send path, at the cost of
adding one global concurrent map that mirrors Server.clients for local
peers.
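The fast-path/slow-path split can be sketched like this, with sync.Map standing in for the hashtriemap and plain strings for the client values; this models the pattern, not the derpserver code itself:

```go
package main

import "sync"

// server keeps the authoritative clients map under mu, plus a
// lock-free mirror consulted first on the send hot path.
type server struct {
	mu      sync.Mutex
	clients map[string]string // authoritative, guarded by mu
	mirror  sync.Map          // lock-free mirror of active local clients
}

// lookupDest tries the mirror first; a miss falls back to the
// mutex-guarded map, the moral equivalent of lookupDestUncached.
func (s *server) lookupDest(key string) (string, bool) {
	if v, ok := s.mirror.Load(key); ok {
		return v.(string), true // hot path: no Server.mu
	}
	s.mu.Lock() // slow path: misses, stale entries, mesh forwarding
	defer s.mu.Unlock()
	v, ok := s.clients[key]
	return v, ok
}
```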
The benchmark uses four destination peers. The before run sets
TS_DEBUG_DERP_DISABLE_PEER_HASHTRIE=true to force the old mutex lookup
path; the after run uses the hashtrie fast path.
goos: linux
goarch: amd64
pkg: tailscale.com/derp/derpserver
cpu: Intel(R) Xeon(R) 6975P-C
│ before │ after │
│ sec/op │ sec/op vs base │
LookupDestHashTrie-16 176.050n ± 1% 1.904n ± 6% -98.92% (p=0.000 n=10)
│ before │ after │
│ B/op │ B/op vs base │
LookupDestHashTrie-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
¹ all samples are equal
│ before │ after │
│ allocs/op │ allocs/op vs base │
LookupDestHashTrie-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
¹ all samples are equal
Updates #3560 (very indirectly, historically)
Updates #19713 (as an alternative to that PR)
Change-Id: Ifb72e5c9854ad00e938cd24c6ab9c27312f297e8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Adds two new cap resolution methods alongside the existing PeerCaps:
PeerCapsForService(src netip.Addr, svcName tailcfg.ServiceName) resolves
the service name to its VIP addresses via the node's service IP mappings
and returns caps scoped to that service. Exposed on /v0/whois via the
svc_name query parameter and on client/local.Client as WhoIsForService.
PeerCapsForIP(src, dst netip.Addr) resolves caps against an arbitrary
destination IP. Exposed on /v0/whois via the svc_addr query parameter
and on client/local.Client as WhoIsForIP.
svc_name takes priority over svc_addr when both are present. Invalid
values for either return 400. The existing PeerCaps/WhoIs path is
unchanged: without a service parameter, WhoIs returns only host-level
caps.
Updates tailscale/corp#41632
Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
Add new clientmetric counters for establishing contact with peers while using
cached network map data. To do this, instrument the magicsock.Conn with a bit
to indicate whether its peer data came from a cached netmap. If so, there are
two conditions we will count as establishing connectivity to a peer:
- Receipt of a CallMeMaybe from a peer via disco.
- Establishing a valid endpoint address for a peer.
In vmtest, add Env.ClientMetrics to scrape metrics from the specified node.
Use this to check that counters were updated in caching tests.
Updates https://github.com/tailscale/projects/issues/13
Updates #12639
Change-Id: Ie8cf3244ac8af4f5bcfe4d0d944078da2ba08990
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Two changes that share the same intent of reducing per-T duplication
in code that doesn't actually depend on T:
1. Hoist the non-generic portion of newSubscriberFunc[T] into a
newSubscriberFuncCore() helper. The hoisted work is the timer
setup, the subscriberFuncCore allocation, and the
unregister closure (which captures only the non-generic
reflect.Type and *subscribeState). The generic body now does
only the two T-bound things it has to: compute reflect.TypeFor[T]
and create the dispatch closure.
Effect on the per-shape-stencil body of newSubscriberFunc[T]:
before: 523 B per shape (in synthetic test)
after: 293 B per shape (-230 B per shape; -56% on this body)
2. Cache reflect.Type.String() once at construction (in core.typeName)
instead of recomputing it every time the dispatch closure runs.
The dispatch closure also now takes the *subscriberFuncCore directly
rather than building an intermediate dispatchFuncState struct on
every call.
Effect on the dispatch closure body (newSubscriberFunc[T].func1):
before: 581 B per shape
after: 480 B per shape (-101 B per shape; -17%)
Combined effect on tailscaled (linux/amd64):
named-symbol savings via symcost: ~7 KB
stripped binary delta: -8 KB (page-quantized)
arm64 binary delta: 0 (page-quantized)
cumulative reduction from baseline (5167ff412):
linux/amd64: -110,592 bytes (-0.391%)
linux/arm64: -131,072 bytes (-0.499%)
Throughput is also improved by the typeName cache: BenchmarkBasic
goes from 2018 ns/op to 1864 ns/op (-7.6%) because the dispatch hot
path no longer allocates a string on every event.
Updates #12614
Change-Id: Ib3a3d6796785e16506330ec034e1144580d467a3
Signed-off-by: James Tucker <james@tailscale.com>
macOS limits Unix socket paths to 104 bytes. The Go test TempDir
path (e.g. /var/folders/.../TestDirectConnection...679197086/001/)
easily exceeds that, causing "bind: invalid argument". Create a
short /tmp/vmtest* directory for all socket files (vnet, QMP,
dgram) so the paths stay well under the limit on every platform.
Updates #13038
Change-Id: I721d24561d1766aaa964692bc77f40a131aa9455
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
startCloudQEMU hardcoded -machine q35,accel=kvm and -cpu host,
which fails on any host without KVM (notably macOS). Replace
with a qemuAccelArgs helper that probes /dev/kvm and falls back
to QEMU's TCG software emulation, matching the pattern already
used by tstest/integration/nat. Also wire the helper into
startGokrazyQEMU so gokrazy VMs pick up KVM when available.
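The probe-and-fall-back helper looks roughly like this; the exact flags mirror the description above rather than a verified QEMU invocation:

```go
package main

import "os"

// qemuAccelArgs prefers KVM when /dev/kvm is openable read-write,
// and otherwise falls back to QEMU's TCG software emulation so the
// same test runs on macOS or unprivileged Linux hosts.
func qemuAccelArgs() []string {
	if f, err := os.OpenFile("/dev/kvm", os.O_RDWR, 0); err == nil {
		f.Close()
		return []string{"-machine", "q35,accel=kvm", "-cpu", "host"}
	}
	return []string{"-machine", "q35,accel=tcg", "-cpu", "max"}
}
```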
Updates #13038
Change-Id: I7745518db823279b1880957bb14ca2ffdaab4c50
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The natlab vmtest suite (tstest/natlab/vmtest) and the integration nat
tests are gated behind --run-vm-tests because they need KVM and are
slow. Until now nothing in CI exercised them apart from a single
canary TestEasyEasy run on every PR.
Add .github/workflows/natlab-test.yml that runs the full opt-in suite
on demand (workflow_dispatch), on PRs labeled "natlab", and on main
every 12 hours via cron. The workflow has two phases:
- "prepare" builds the gokrazy VM image, downloads the Ubuntu and
FreeBSD cloud images once via the new natlabprep tool, and emits
a dynamic JSON matrix of every TestX function it finds in the two
opt-in packages.
- "test" is a per-test matrix that depends on prepare. Each matrix
job restores the shared caches and runs a single test, so adding
a new TestFoo is automatically picked up on the next run without
any workflow edits.
Rename the existing natlab-integrationtest.yml to natlab-basic.yml
since it's the small smoke variant (just TestEasyEasy on every PR);
the new natlab-test.yml is the bigger suite. The job inside is
renamed to EasyEasy for the same reason.
Move the macOS arm64 host check from vmtest.Env.Start into
vmtest.Env.AddNode so a test that adds a vmtest.MacOS node skips
immediately on a non-macOS host, and add an explicit
skipIfNotMacOSArm64 helper at the top of the two macOS-only tests
so the platform requirement is obvious to readers.
Quiet the takeAgentConnOne miss log in tstest/natlab/vnet by default
(it was the overwhelming majority of bytes in CI logs, with no signal
in healthy runs) and replace it with a periodic "still waiting" line
that only fires after 10s, so a truly stuck agent connection still
surfaces.
Updates #13038
Change-Id: I4582098d8865200fd5a73a9b696942319ccf3bf0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Mirrors the same refactor previously applied to SubscriberFunc:
- Publisher[T]: a thin user-facing facade. Holds a pointer to a
non-generic publisherCore and exposes Publish/Close/ShouldPublish.
- publisherCore: a non-generic struct that owns the *Client back-
pointer, stop flag, and cached reflect.Type. It implements the
package-private publisher interface (publishType, Close).
The bus's per-Client publisher set is set.Set[publisher] keyed
on this single non-generic type.
The publisher interface only exists to support diagnostic
introspection (Debugger.PublishTypes returning the list of types a
client publishes). Previously, satisfying that diagnostic-only
interface forced *Publisher[T] to be the implementor and cost a
per-T itab, generic dictionary, and equality function on every
event type ever passed through Publish[T]. Moving the
implementation to a non-generic core lets the diagnostic surface
work unchanged while charging zero per-T cost for the
diagnostic-driven generic interface.
Publisher[T].Publish is also slimmed: the channel/select/stopFlag
loop is now a non-generic publish() helper that takes the value as
'any'. The per-T body is reduced to forwarding the boxed value to
the helper.
Measured impact (util/eventbus/sizetest):
total per-flow binary cost:
linux/amd64: 2252.8 B/flow -> 1900.5 B/flow (-352.3 B / -15.6%)
linux/arm64: 2228.2 B/flow -> 1835.0 B/flow (-393.2 B / -17.6%)
Publisher per-receiver attribution:
linux/amd64: 635.2 B/flow -> 369.6 B/flow (-265.6 B / -41.8%)
linux/arm64: 751.7 B/flow -> 373.2 B/flow (-378.5 B / -50.4%)
Cumulative reduction from the original baseline (5167ff412):
linux/amd64: 3096.6 B/flow -> 1900.5 B/flow (-1196.1 B / -38.6%)
linux/arm64: 3145.7 B/flow -> 1835.0 B/flow (-1310.7 B / -41.7%)
Dropped per-T symbols (200-flow eventbus binary):
- .dict.Publisher[T] was 14,400 B (72 B/T)
- type:.eq.Publisher[T] was 11,832 B (58 B/T)
- go:itab.*Publisher[T],publisher was 8,000 B (40 B/T)
- (*Publisher[T]).Close shape stencils collapsed to 1
Behavior is unchanged: BenchmarkBasicThroughput is within noise
(2018 -> 2038 ns/op at -benchtime=2s) and all eventbus tests pass.
Updates #12614
Change-Id: I61979c2bf95d2a711c2321e6e0b4b7d15980e9f5
Signed-off-by: James Tucker <james@tailscale.com>
Splits SubscriberFunc[T] into:
- SubscriberFunc[T]: a thin user-facing facade that holds only a
pointer to a non-generic core. It exposes Close() to user code,
which forwards to the core.
- subscriberFuncCore: a non-generic struct that owns all the
subscriber state (stop flag, unregister, logf, slow timer,
cached reflect.Type) and implements the bus's package-private
subscriber interface. Its dispatch() invokes a closure
captured at construction time that performs the
vals.Peek().Event.(T) type assertion and runs the user
callback on the unboxed value.
The bus's outputs map and subscriber-interface itab are
parameterized only by *subscriberFuncCore, not by T, eliminating
both the per-T itab and the per-T generic dictionary that
previously scaled with the number of subscribed event types.
Measured impact (util/eventbus/sizetest):
total per-flow binary cost:
linux/amd64: 3039.2 B/flow -> 2252.8 B/flow (-786.4 B / -25.9%)
linux/arm64: 3145.7 B/flow -> 2228.2 B/flow (-917.5 B / -29.2%)
SubscriberFunc per-receiver attribution:
linux/amd64: 840.8 B/flow -> 300.8 B/flow (-540.0 B / -64.2%)
linux/arm64: 849.9 B/flow -> 303.8 B/flow (-546.1 B / -64.3%)
Dropped per-T symbols (200-flow eventbus binary):
- (*SubscriberFunc[T]).dispatch was 26,639 B total (130 B/T)
- (*SubscriberFunc[T]).subscribeType was 3,600 B total ( 18 B/T)
- .dict.SubscriberFunc[T] was 14,400 B total ( 72 B/T)
- go:itab.*SubscriberFunc[T],... was 9,600 B total ( 48 B/T)
Of the original 913 B/flow attributed to SubscriberFunc, 540 B/flow
is now gone, dropping the receiver to 300 B/flow.
Behavior is unchanged: BenchmarkBasicThroughput is within noise
(1955 -> 1941 ns/op on the test box) and all eventbus tests pass.
Updates #12614
Change-Id: I646b3b05fd8d95f9afead59bfd0f69cd18b7a709
Signed-off-by: James Tucker <james@tailscale.com>
There is a 30-second timeout set on client TLS connections, but the
handshake was called on the wrong connection, so the timeout never
took effect in practice.
Signed-off-by: Francois Marier <francois@fmarier.org>
Make it possible to remove the least recently used expired address
assignment from addrAssignments.
Before checking out a new address from the IP pools, return a handful of
expired addresses.
Updates tailscale/corp#39975
Signed-off-by: Fran Bull <fran@tailscale.com>
When a peer is not able to connect to control after a restart and is
using a cached netmap, that node should still be able to connect to
another peer in its tailnet (given that the home DERP of that peer
has not changed in the meantime).
Add test that starts two peers and connects them to a tailnet with
caching enabled. Then blackhole traffic to control from one peer and
restart it. Verify that the connection between the two ends up direct.
Adds facilities for expecting a certain path type between nodes.
Updates #19597
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
If a DNS query for a domain that should be routed through a connector
results in CNAME records in the response, collapse the CNAME chain to an
A/AAAA record for the domain -> magic IP.
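The chain-collapsing idea, reduced to a map-based sketch; the real code operates on DNS messages, not a map, and the loop guard is an assumption of this illustration:

```go
package main

// collapseCNAME follows a CNAME chain to its terminal name so the
// answer for the original domain can carry the mapped ("magic") IP
// directly instead of the intermediate CNAME records.
func collapseCNAME(domain string, cnames map[string]string) string {
	seen := map[string]bool{}
	name := domain
	for cnames[name] != "" && !seen[name] {
		seen[name] = true // guard against malformed CNAME loops
		name = cnames[name]
	}
	return name
}
```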
Fixes tailscale/corp#39978
Signed-off-by: Fran Bull <fran@tailscale.com>
The `CreateStateForTest` helper reduces boilerplate in cases where the test
only cares about the trusted keys and not the disablement values (and makes
it more obvious where the disablement values are meaningful).
The `setupChonkStorage` helper reduces the boilerplate when creating on-disk
TKA storage in tests.
The `fakeLocalBackend` helper reduces the boilerplate when setting up a
`LocalBackend` instance in the IPN tests.
Updates #cleanup
Change-Id: Iacfba1be5f7fab208eec11e4369d63c7d7519da5
Signed-off-by: Alex Chan <alexc@tailscale.com>
Re-exec the test binary as a thin wrapper that holds a pipe inherited
from the test. When the test goes away (any reason, including SIGKILL,
panic, or OOM), the kernel closes the pipe write end; the wrapper sees
EOF and SIGKILLs itself, taking QEMU and its children with it.
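The pipe-lifetime trick at the heart of the wrapper can be sketched as below; the function names are illustrative, and the follow-up SIGKILL of the process group is described in a comment rather than executed:

```go
package main

import (
	"io"
	"os"
)

// waitForParentExit blocks until the inherited pipe reads EOF, which
// the kernel delivers when the parent process exits for any reason
// (normal exit, panic, SIGKILL, OOM). The real wrapper then SIGKILLs
// its own process group so QEMU and its children die with it.
func waitForParentExit(pipe io.Reader) {
	io.Copy(io.Discard, pipe) // returns only when the write end closes
}

// demo wires up a pipe the way the test and wrapper do: the "parent"
// holds the write end; closing it (standing in for parent death)
// releases the wrapper.
func demo() bool {
	r, w, err := os.Pipe()
	if err != nil {
		return false
	}
	done := make(chan struct{})
	go func() { waitForParentExit(r); close(done) }()
	w.Close() // parent goes away
	<-done    // wrapper observed EOF; it would now kill its group
	return true
}
```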
Updates #13038
Change-Id: Ib2151098193551396c1d7bb51b07da3bd6b2cfb4
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
Running all vmtests in tstest/natlab/vmtest locally was breaking later
tasks in the queue. The goroutine dump on timeout had goroutines hanging
around for 9 minutes, meaning that something was not getting cleaned up.
goroutine 262 [select, 9 minutes]:
gvisor.dev/gvisor/pkg/tcpip/adapters/gonet.commonRead({...})
Set a deadline of time.Now() on gonet TCP connections when the test
ends (inspired by ServeUnixConn()), and wait for them to shut down
before exiting the test.
Updates #13038
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
The (*SubscriberFunc[T]).dispatch method body — a ~40-line select
loop with slow-subscriber timer, snapshot handling, ctx-cancel
draining, and a CI stack-dump branch — was previously fully
duplicated by the Go compiler for every distinct GC shape of T.
None of that body actually depends on T except for the type
assertion and the user callback invocation.
This change moves the loop body into a non-generic dispatchFunc()
helper, leaving (*SubscriberFunc[T]).dispatch as a tiny wrapper
that:
- performs the vals.Peek().Event.(T) type assertion
- spawns the callback goroutine via `go runFuncCallback(s.read,
t, callDone)` — a regular generic function call, not a closure,
so that `go` binds the args to the goroutine's frame instead of
allocating a closure on the heap. This preserves the
zero-extra-allocation behavior of the original
(*SubscriberFunc[T]).runCallback method.
- resolves T's name via reflect.TypeFor[T]().String() (cached on
the stack rather than recomputed on each %T formatting)
- calls dispatchFunc with the callDone channel
The %T formatting in the original logf calls is replaced with %s
on the resolved name string, removing per-T fmt instantiations.
A new BenchmarkBasicFuncThroughput is added alongside the existing
BenchmarkBasicThroughput so per-event allocation behavior on the
SubscribeFunc dispatch path is covered by the benchmark suite.
Measured impact (util/eventbus/sizetest):
SubscriberFunc per-flow attribution:
linux/amd64: 912.5 B/flow -> 840.8 B/flow (-71.7 B/flow)
linux/arm64: 917.5 B/flow -> 849.9 B/flow (-67.6 B/flow)
The total per-flow size delta on amd64 dropped from 3,096.6 B to
3,039.2 B (-57 B/flow). The arm64 total stayed at 3,145.7 B
because the linker's page-aligned section sizing absorbed the
improvement on this binary; the symcost-attributed per-receiver
number is the real signal.
Behavior is unchanged: BenchmarkBasicThroughput stays at 0
allocs/op and BenchmarkBasicFuncThroughput holds at the same 2
allocs/op, 144 B/op as the prior eventbus implementation. All
eventbus tests pass.
Updates #12614
Change-Id: I85f933f50f58cd25bbfe5cc46bdda7aab22f0bf7
Signed-off-by: James Tucker <james@tailscale.com>
The tailscale.com/wif package brings in the AWS SDK
(github.com/aws/aws-sdk-go-v2/{config,sts,...} and github.com/aws/smithy-go)
to support fetching ID tokens from AWS IMDS for workload identity
federation. Until now, tsnet pulled this in unconditionally via
feature/condregister/identityfederation, costing ~70 unwanted deps for
every tsnet program whether or not it uses workload identity federation.
These AWS SDK deps were originally removed from tsnet on 2025-09-29 by
commit 69c79cb9f ("ipn/store, feature/condregister: move AWS + Kube
store registration to condregister"). They were then accidentally added
back on 2026-01-14 by commit 6a6aa805d ("cmd,feature: add identity
token auto generation for workload identity", PR #18373) when the new
wif package was wired into tsnet via feature/identityfederation.
Drop the blanket import. tsnet programs that want workload identity
federation now opt in with:
import _ "tailscale.com/feature/identityfederation"
The hook lookup in resolveAuthKey already uses GetOk and degrades
gracefully when the feature isn't linked, so existing programs that
don't use workload identity federation see no behavior change. The
tailscale CLI still imports the condregister wrapper directly, so its
behavior is also unchanged.
Lock this in with TestDeps additions: tailscale.com/wif as a BadDep,
plus substring checks in OnDep that fail on any github.com/aws/ or
k8s.io/ dependency creeping back in.
Also, switch cmd/gitops-pusher from the condregister wrapper to a
direct import of feature/identityfederation: gitops-pusher's auth flow
calls HookExchangeJWTForTokenViaWIF directly, so it shouldn't be
subject to the ts_omit_identityfederation build tag.
Updates #12614
Change-Id: I70599f2bdd4d3666b26a859d5b76caa5d6b94507
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When HTTPS is explicitly disabled (HTTPSPort == NoPort), the JS WebSocket
dialer should use ws:// instead of wss://. This matches the behavior of
the non-JS client and fixes connections to development control servers
e.g. http://localhost:31544.
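The decision reduces to a scheme pick; the NoPort sentinel value below is an illustrative stand-in for the tailcfg DERP node configuration, not the actual constant:

```go
package main

// noPort stands in for the sentinel meaning "HTTPS explicitly
// disabled" on a DERP node's config in this sketch.
const noPort = -1

// wsSchemeFor chooses the WebSocket scheme the JS dialer should use:
// with HTTPS disabled, dial ws:// like the non-JS client does, e.g.
// against a dev control server on plain HTTP.
func wsSchemeFor(httpsPort int) string {
	if httpsPort == noPort {
		return "ws"
	}
	return "wss"
}
```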
Updates tailscale/corp#40944
Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
Per recent chat with @raggi about all this, I went and looked at this
test again.
Updates #cleanup
Change-Id: Icb7d87b1ed2cebf481ee4e358a3aa603e63fb8a4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Commit 69c79cb9f (Sep 2025) moved awsstore and kubestore registration
behind condregister build tags so tsnet wouldn't pull in the AWS SDK
and Kubernetes client by default. The accompanying TestDeps BadDeps
entry was missed, so PR #19667 (which re-added those imports) wasn't
caught by the test.
Add the two packages to BadDeps so future regressions fail the test.
Updates #19667
Updates #12614
Change-Id: I903b7c976e5e122cc0c0b956dc73740f5d474fac
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Android rebuilds its VpnService interface when the VPN route
configuration changes, which tears down long-lived TCP connections
through the tunnel. Use the same automatic OneCGNATRoute behavior as
macOS on Android: prefer the single CGNAT route when no other
interface is using the CGNAT range, falling back to fine-grained peer
routes otherwise.
Updates tailscale/tailscale#19591
Signed-off-by: kari <kari@tailscale.com>
The test goroutine read lockCnt immediately after Lock returned, racing
with Close: close(lk.closing) wakes lockSlow's select, whose deferred
Add(-2) on lockCnt can run before Close's CAS clears the LSB. When that
happens, lockCnt is briefly 1 (3 - 2) instead of 0 (1 + 2 - 2 - 1),
producing "lockCnt: got 1; want 0".
Move the lockCnt assertion into the main test goroutine, after both
Close has returned and the Lock goroutine has finished, so both updates
have settled before we read.
Fixes #19647
Change-Id: Ia67036ff73a1beb528cbd621460db9048f3066ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When an exit node was set before launching systray, the recommended row
in exit nodes rendered as not selected even when the active exit node
was at the same location.
This looks to be two different things:
- suggestExitNode takes its own suggestion into account, not the
user's active exit node. When a Mullvad city is reached via the picker
rather than the recommended row, the suggester's pick and
prefs.ExitNodeID end up as distinct peers in the same city, resulting
in an ID-only equality check missing the match.
- Toggle state was constructed and mutated via .Check(), which for newly
created elements may be cached (such as when launching systray, with
an already active node).
Fixes #19626
Signed-off-by: Evan Lowry <evan@tailscale.com>
This was originally hidden during the beta period in both `up` and `set`,
then when device posture went GA we unhid the flag in `set` but not in
`up`.
This is confusing for users, because an error message can direct them to
run `tailscale up` with this flag if they've set it previously, but the
help text won't tell them what it does.
Updates #5902
Updates #17972
Change-Id: I9a31946f4b3bb411feed0f5a6449d7ff9a5ba9d3
Signed-off-by: Alex Chan <alexc@tailscale.com>