852 Commits

Author SHA1 Message Date
Brad Fitzpatrick
cf520a3371 feature/featuretags: add LazyWG modular feature
Due to iOS memory limitations in 2020 (see
https://tailscale.com/blog/go-linker, etc) and wireguard-go using
multiple goroutines per peer, commit 16a9cfe2f4ce7d introduced some
convoluted pathways through Tailscale to look at packets before
they're delivered to wireguard-go and lazily reconfigure wireguard on
the fly before delivering a packet, only telling wireguard about peers
that are active.

We eventually want to remove that code and integrate wireguard-go's
configuration with Tailscale's existing netmap tracking.

To make it easier to find that code later, this makes it modular. It
saves 12 KB (of disk) to turn it off (at the expense of lots of RAM),
but that's not really the point. The point is rather making it obvious
(via the new constants) where this code even is.

Updates #12614

Change-Id: I113b040f3e35f7d861c457eaa710d35f47cee1cb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-06 07:49:40 -07:00
Jordan Whited
e44e28efcd
wgengine/magicsock: fix relayManager deadlock (#17449)
Updates tailscale/corp#32978

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-10-04 20:27:57 -07:00
Jordan Whited
3aa8b6d683
wgengine/magicsock: remove misleading unexpected log message (#17445)
Switching to a Geneve-encapsulated (peer relay) path in
endpoint.handlePongConnLocked is expected around port rebinds, which end
up clearing endpoint.bestAddr.

Fixes tailscale/corp#33036

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-10-04 15:05:41 -07:00
Brad Fitzpatrick
3c7e351671 net/connstats: make it modular (omittable)
Saves only 12 KB, but notably removes some deps on packages that future
changes can then eliminate entirely.

Updates #12614

Change-Id: Ibf830d3ee08f621d0a2011b1d4cd175427ef50df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-04 13:17:25 -07:00
Brad Fitzpatrick
447cbdd1d0 health: make it omittable
Saves 86 KB.

And stop depending on expvar and usermetrics when disabled,
in prep for removing all the expvar/metrics/tsweb stuff.

Updates #12614

Change-Id: I35d2479ddd1d39b615bab32b1fa940ae8cbf9b11
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-03 17:23:54 -07:00
Brad Fitzpatrick
c45f8813b4 feature/featuretags, all: add build features, use existing ones in more places
Saves 270 KB.

Updates #12614

Change-Id: I4c3fe06d32c49edb3a4bb0758a8617d83f291cf5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-02 08:07:25 -07:00
Brad Fitzpatrick
ee034d48fc feature/featuretags: add a catch-all "Debug" feature flag
Saves 168 KB.

Updates #12614

Change-Id: Iaab3ae3efc6ddc7da39629ef13e5ec44976952ba
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-30 11:32:33 -07:00
Brad Fitzpatrick
01e645fae1 util/backoff: rename logtail/backoff package to util/backoff
It has nothing to do with logtail and is confusingly named like that.

Updates #cleanup
Updates #17323

Change-Id: Idd34587ba186a2416725f72ffc4c5778b0b9db4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-28 11:55:07 -07:00
Alex Chan
002ecb78d0 all: don't rebind variables in for loops
See https://tip.golang.org/wiki/LoopvarExperiment#does-this-mean-i-dont-have-to-write-x--x-in-my-loops-anymore

Updates https://github.com/tailscale/tailscale/issues/11058

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-09-26 16:19:42 +01:00
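
For context on commit 002ecb78d0 above: a minimal sketch of why the `v := v` rebinding idiom is no longer needed. It assumes the code is built with a Go language version of 1.22 or later, where each loop iteration gets its own copy of the loop variable. The `collect` function is hypothetical and is not taken from the Tailscale tree.

```go
package main

import "fmt"

// collect returns one closure per element, each capturing the loop variable v.
// Before Go 1.22, all closures shared a single v, so code commonly added
// `v := v` as the first statement of the loop body. With Go 1.22+ semantics
// that rebinding is redundant because v is per-iteration.
func collect() []func() int {
	var fns []func() int
	for _, v := range []int{10, 20, 30} {
		fns = append(fns, func() int { return v })
	}
	return fns
}

func main() {
	// With Go 1.22+ loop semantics each closure returns its own element.
	for _, f := range collect() {
		fmt.Println(f())
	}
}
```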
James Tucker
8b3e88cd09
wgengine/magicsock: fix rebind debouncing (#17282)
On platforms that are causing EPIPE at a high frequency this is
resulting in non-working connections, for example when Apple decides to
forcefully close UDP sockets due to an unsolicited packet rejection in the
firewall.

Overly frequent rebinds cause a failure to solicit the endpoints triggering
the rebinds, which would normally happen via CallMeMaybe.

Updates #14551
Updates tailscale/corp#25648

Signed-off-by: James Tucker <james@tailscale.com>
2025-09-26 11:06:39 -04:00
Simon Law
34242df51b
derp/derpserver: clean up extraction of derp.Server (#17264)
PR #17258 extracted `derp.Server` into `derp/derpserver.Server`.

This followup patch adds the following cleanups:
1. Rename `derp_server*.go` files to `derpserver*.go` to match
   the package name.
2. Rename the `derpserver.NewServer` constructor to `derpserver.New`
   to reduce stuttering.
3. Remove the unnecessary `derpserver.Conn` type alias.

Updates #17257
Updates #cleanup

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2025-09-24 10:38:30 -07:00
Brad Fitzpatrick
21dc5f4e21 derp/derpserver: split off derp.Server out of derp into its own package
This exports a number of things from the derp (generic + client) package
to be used by the new derpserver package, as now used by cmd/derper.

And then enough other misc changes to lock in that cmd/tailscaled can
be configured to not bring in tailscale.com/client/local. (The webclient
in particular, even when disabled, was bringing it in, so that's now fixed)

Fixes #17257

Change-Id: I88b6c7958643fb54f386dd900bddf73d2d4d96d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-24 09:19:01 -07:00
Brad Fitzpatrick
b54cdf9f38 all: use buildfeatures.HasCapture const in a handful of places
Help out the linker's dead code elimination.

Updates #12614

Change-Id: I6c13cb44d3250bf1e3a01ad393c637da4613affb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-24 08:31:25 -07:00
Jonathan Nobels
4af15a1148
magicsock: fix deadlock in SetStaticEndpoints (#17247)
updates tailscale/corp#32600

A localAPI/cli call to reload-config can end up leaving magicsock's mutex
locked. We were missing an unlock for the early exit where there's no change in
the static endpoints when the disk-based config is loaded.  This is not likely
the root cause of the linked issue - just noted during investigation.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2025-09-23 13:35:22 -04:00
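
The bug class described in commit 4af15a1148 above (an early return that skips an unlock) can be illustrated with a small sketch. The `conn` type and `setStatic` method are hypothetical stand-ins, not magicsock's code, and using `defer` is only one way to avoid the problem; the commit message does not say how the actual fix was written.

```go
package main

import (
	"fmt"
	"slices"
	"sync"
)

// conn is a hypothetical stand-in for a type guarding state with a mutex.
type conn struct {
	mu     sync.Mutex
	static []string
}

// setStatic stores eps and reports whether the stored endpoints changed.
// Deferring the unlock means the early "no change" return releases the
// mutex too, so a later call cannot block forever on a leaked lock.
func (c *conn) setStatic(eps []string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if slices.Equal(c.static, eps) {
		return false // early exit; the deferred Unlock still runs
	}
	c.static = slices.Clone(eps)
	return true
}

func main() {
	c := &conn{}
	fmt.Println(c.setStatic([]string{"192.0.2.1:41641"})) // changed
	fmt.Println(c.setStatic([]string{"192.0.2.1:41641"})) // unchanged, early exit
	fmt.Println(c.setStatic(nil))                         // would hang if the lock had leaked
}
```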
M. J. Fromberger
2b6bc11586
wgengine: use eventbus.Client.Monitor to simplify subscriber maintenance (#17203)
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.

Updates #15160

Change-Id: I40c23b183c2a6a6ea3feec7767c8e5417019fc07
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2025-09-19 13:20:50 -07:00
Brad Fitzpatrick
99b3f69126 feature/portmapper: make the portmapper & its debugging tools modular
Starting at a minimal binary and adding one feature back...
    tailscaled tailscale combined (linux/amd64)
     30073135  17451704  31543692 omitting everything
    +  480302 +   10258 +  493896 .. add debugportmapper
    +  475317 +  151943 +  467660 .. add portmapper
    +  500086 +  162873 +  510511 .. add portmapper+debugportmapper

Fixes #17148

Change-Id: I90bd0e9d1bd8cbe64fa2e885e9afef8fb5ee74b1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-16 11:35:49 -07:00
M. J. Fromberger
8608e42103
feature,ipn/ipnlocal,wgengine: improve how eventbus shutdown is handled (#17156)
Instead of waiting for a designated subscription to close as a canary for the
bus being stopped, use the bus Client's own signal for closure added in #17118.

Updates #cleanup

Change-Id: I384ea39f3f1f6a030a6282356f7b5bdcdf8d7102
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2025-09-16 10:52:39 -07:00
Claus Lensbøl
2015ce4081
health,ipn/ipnlocal: introduce eventbus in health.Tracker (#17085)
The Tracker was using direct callbacks to ipnlocal. This PR moves those
to be triggered via the eventbus.

Additionally, the eventbus is now closed on exit from tailscaled
explicitly, and health is now a SubSystem in tsd.

Updates #15160

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2025-09-16 11:25:29 -04:00
Brad Fitzpatrick
8b48f3847d net/netmon, wgengine/magicsock: simplify LinkChangeLogLimiter signature
Remove the need for the caller to hold on to and call an unregister
function. Both callers (one real, one test) already have a context
they can use. Use context.AfterFunc instead. There are no observable
side effects from scheduling too late if the goroutine doesn't run sync.

Updates #17148

Change-Id: Ie697dae0e797494fa8ef27fbafa193bfe5ceb307
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-15 16:12:24 -07:00
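
The pattern named in commit 8b48f3847d above, replacing a returned unregister function with `context.AfterFunc`, can be sketched as follows. The `registry` type and its methods are hypothetical and do not reflect the real `LinkChangeLogLimiter` signature; only `context.AfterFunc` (standard library, Go 1.21+) is a real API here.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// registry is a hypothetical callback registry, not the netmon API.
type registry struct {
	mu   sync.Mutex
	cbs  map[int]func()
	next int
}

// register adds cb and arranges for its removal once ctx is done, so the
// caller does not need to hold on to and call an unregister function.
func (r *registry) register(ctx context.Context, cb func()) {
	r.mu.Lock()
	id := r.next
	r.next++
	if r.cbs == nil {
		r.cbs = make(map[int]func())
	}
	r.cbs[id] = cb
	r.mu.Unlock()

	// AfterFunc runs the cleanup in its own goroutine after ctx is done.
	context.AfterFunc(ctx, func() {
		r.mu.Lock()
		defer r.mu.Unlock()
		delete(r.cbs, id)
	})
}

// count returns the number of currently registered callbacks.
func (r *registry) count() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.cbs)
}

// demo returns the callback count before cancellation and after cleanup.
func demo() (before, after int) {
	r := &registry{}
	ctx, cancel := context.WithCancel(context.Background())
	r.register(ctx, func() {})
	before = r.count()
	cancel()
	// Cleanup is asynchronous, so poll until it has run.
	for r.count() != 0 {
		time.Sleep(time.Millisecond)
	}
	return before, r.count()
}

func main() {
	before, after := demo()
	fmt.Println(before, after) // prints "1 0"
}
```

Because the cleanup runs on a separate goroutine, the callback may stay registered briefly after cancellation, which matches the commit's note that late scheduling has no observable side effects.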
Alex Chan
5c24f0ed80 wgengine/magicsock: send a valid payload in TestNetworkDownSendErrors
This test ostensibly checks whether we record an error metric if a packet
is dropped because the network is down, but the network connectivity is
irrelevant -- the send error is actually because the arguments to Send()
are invalid:

    RebindingUDPConn.WriteWireGuardBatchTo:
    [unexpected] offset (0) != Geneve header length (8)

This patch changes the test so we try to send a valid packet, and we
verify this by sending it once before taking the network down.  The new
error is:

    magicsock: network down

which is what we're trying to test.

We then test sending an invalid payload as a separate test case.

Updates tailscale/corp#22075

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-09-15 23:26:32 +01:00
Jordan Whited
998a667cd5
wgengine/magicsock: don't add DERP addrs to endpointState (#17147)
endpointState is used for tracking UDP direct connection candidate
addresses. If it contains a DERP addr, then direct connection path
discovery will always send a wasteful disco ping over it. Additionally,
CLI "tailscale ping" via peer relay will race over DERP, leading to a
misleading result if pong arrives via DERP first.

Disco pongs arriving via DERP never influence path selection. Disco
ping/pong via DERP only serves "tailscale ping" reporting.

Updates #17121

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-09-15 15:22:13 -07:00
Jordan Whited
fb9d9ba86e
wgengine/magicsock: add TS_DEBUG_NEVER_DIRECT_UDP debug knob (#17094)
Updates tailscale/corp#30903

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-09-10 16:48:40 -07:00
Jordan Whited
6feb6f3c75
wgengine/magicsock: add relayManager event logs (#17091)
These are gated behind magicsock component debug logging.

Updates tailscale/corp#30818

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-09-10 12:36:53 -07:00
Jordan Whited
2d9d869d3d
wgengine/magicsock: fix debug disco printing of alloc resp disco keys (#17087)
Updates tailscale/corp#30818

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-09-09 15:38:08 -07:00
James Tucker
a29545e9cc wgengine/magicsock: log the peer failing disco writes are intended for
Updates tailscale/corp#31762

Signed-off-by: James Tucker <james@tailscale.com>
2025-09-05 19:02:17 -07:00
James Tucker
3b68d607be wgengine/magicsock: drop DERP queue from head rather than tail
If the DERP queue is full, drop the oldest item first, rather than the
youngest, on the assumption that older data is more likely to be
unanswerable.

Updates tailscale/corp#31762

Signed-off-by: James Tucker <james@tailscale.com>
2025-08-29 15:13:02 -07:00
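
The drop-oldest policy described in commit 3b68d607be above can be sketched with a small bounded FIFO. The `boundedQueue` type is hypothetical and is not the magicsock DERP queue; it only illustrates evicting from the head when the queue is full.

```go
package main

import "fmt"

// boundedQueue is a hypothetical fixed-capacity FIFO of packet labels.
// limit is assumed to be at least 1.
type boundedQueue struct {
	items []string
	limit int
}

// push appends v. If the queue is already full, the oldest item (the head)
// is dropped first and returned with ok set to true.
func (q *boundedQueue) push(v string) (dropped string, ok bool) {
	if len(q.items) >= q.limit {
		dropped, ok = q.items[0], true
		copy(q.items, q.items[1:]) // shift remaining items toward the head
		q.items = q.items[:len(q.items)-1]
	}
	q.items = append(q.items, v)
	return dropped, ok
}

func main() {
	q := &boundedQueue{limit: 2}
	for _, p := range []string{"p1", "p2", "p3"} {
		if old, ok := q.push(p); ok {
			fmt.Println("dropped", old) // prints "dropped p1"
		}
	}
	fmt.Println(q.items) // prints "[p2 p3]"
}
```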
James Tucker
f5d3c59a92 wgengine/magicsock: shorten process internal DERP queue
DERP writes go via TCP and the host OS will have plenty of buffer space.
We've observed in the wild with a backed up TCP socket kernel side
buffers of >2.4MB. The DERP internal queue being larger causes an
increase in the probability that the contents of the backbuffer are
"dead letters" - packets that were assumed to be lost.

A first step to improvement is to size this queue only large enough to
avoid some of the initial connect stall problem, but not large enough
that it is contributing in a substantial way to buffer bloat /
dead-letter retention.

Updates tailscale/corp#31762

Signed-off-by: James Tucker <james@tailscale.com>
2025-08-28 20:44:47 -07:00
James Tucker
d42f0b6a21 util/ringbuffer: rename to ringlog
I need a ringbuffer in the more traditional sense, one that has a notion
of item removal as well as tail loss on overrun. This implementation is
really a clearable log window, and is used as such where it is used.

Updates #cleanup
Updates tailscale/corp#31762

Signed-off-by: James Tucker <james@tailscale.com>
2025-08-28 15:41:07 -07:00
Jordan Whited
575664b263
wgengine/magicsock: make endpoint.discoPing peer relay aware (#16946)
Updates tailscale/corp#30333

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-26 09:22:36 -07:00
Jordan Whited
9403ba8c69
wgengine/magicsock: trigger peer relay path discovery on CallMeMaybe RX (#16929)
Updates tailscale/corp#30333

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-25 09:40:15 -07:00
Jordan Whited
b17cfe4aed
wgengine/magicsock,net/sockopts: export Windows ICMP suppression logic (#16917)
For eventual use by net/udprelay.Server.

Updates tailscale/corp#31506

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-21 13:44:13 -07:00
Jordan Whited
641a90ea33
net/sockopts,wgengine/magicsock: export socket buffer sizing logic (#16909)
For eventual use by net/udprelay.Server

Updates tailscale/corp#31164

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-20 16:24:00 -07:00
Jordan Whited
16bc0a5558
net/{batching,packet},wgengine/magicsock: export batchingConn (#16848)
For eventual use by net/udprelay.Server.

Updates tailscale/corp#31164

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-13 13:13:11 -07:00
Jordan Whited
cde65dba16
wgengine/magicsock: add clientmetric for Peer Relay challenge reception (#16834)
Updates tailscale/corp#30527

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-11 14:53:25 -07:00
Jordan Whited
4fa27db8dd
wgengine/magicsock: add clientmetrics for locally delivered Peer Relay alloc disco (#16833)
Expected when Peer Relay'ing via self. These disco messages never get
sealed, and never leave the process.

Updates tailscale/corp#30527

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-11 14:48:19 -07:00
Jordan Whited
36397f1794
wgengine/magicsock: add clientmetrics for TX direction Peer Relay disco messages (#16831)
Updates tailscale/corp#30527

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-11 13:29:57 -07:00
Jordan Whited
d122f0350e
control/controlknobs,tailcfg,wgengine/magicsock: deprecate NodeAttrDisableMagicSockCryptoRouting (#16818)
Peer Relay is dependent on crypto routing, therefore crypto routing is
now mandatory.

Updates tailscale/corp#20732
Updates tailscale/corp#31083

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-11 09:04:03 -07:00
Jordan Whited
4666d4ca2a
wgengine/magicsock: fix missing Conn.hasPeerRelayServers.Store() call (#16792)
This commit also extends the updateRelayServersSet unit tests to cover
onNodeViewsUpdate.

Fixes tailscale/corp#31080

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-06 14:57:55 -07:00
Jordan Whited
0374e6d906
wgengine/magicsock: add lazyEndpoint.FromPeer tests (#16791)
Updates tailscale/corp#30903

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-06 14:55:34 -07:00
Jordan Whited
02967ffcf2
wgengine/magicsock: add lazyEndpoint.InitiationMessagePublicKey tests (#16790)
Updates tailscale/corp#30903

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-06 14:41:05 -07:00
Jordan Whited
908f20e0a5
wgengine/magicsock: add receiveIP() unit tests (#16781)
One of these tests highlighted a Geneve encap bug, which is also fixed
in this commit.

looksLikeInitMsg was passed a packet post Geneve header stripping with
slice offsets that had not been updated to account for the stripping.

Updates tailscale/corp#30903

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-06 09:35:25 -07:00
Jordan Whited
b0018f1e7d
wgengine/magicsock: fix looksLikeInitiationMsg endianness (#16771)
WireGuard message type is little-endian encoded.

Updates tailscale/corp#30903

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-04 14:21:32 -07:00
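
To illustrate the endianness point in commit b0018f1e7d above: a minimal sketch of checking for a WireGuard handshake initiation by decoding the leading 4 bytes as a little-endian integer. The function `looksLikeInitiation` is a hypothetical stand-in, not the magicsock function; the constants (type 1, length 148 bytes) follow the WireGuard protocol's handshake initiation message.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	msgInitiationType = 1   // WireGuard handshake initiation message type
	msgInitiationSize = 148 // handshake initiation message length in bytes
)

// looksLikeInitiation reports whether b has the size of a handshake
// initiation and a leading little-endian type field equal to 1.
// Hypothetical helper for illustration only.
func looksLikeInitiation(b []byte) bool {
	return len(b) == msgInitiationSize &&
		binary.LittleEndian.Uint32(b[:4]) == msgInitiationType
}

func main() {
	pkt := make([]byte, msgInitiationSize)
	pkt[0] = 1 // little-endian: least significant byte comes first

	fmt.Println(looksLikeInitiation(pkt)) // prints "true"
	// Decoding the same bytes as big-endian yields 0x01000000, not 1.
	fmt.Println(binary.BigEndian.Uint32(pkt[:4]) == msgInitiationType) // prints "false"
}
```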
M. J. Fromberger
b34cdc9710
ipn,net,tsnet,wgengine: make an eventbus mandatory where it is used (#16594)
In the components where an event bus is already plumbed through, remove the
exceptions that allow it to be omitted, and update all the tests that relied on
those workarounds to execute properly.

This change applies only to the places where we're already using the bus; it
does not enforce the existence of a bus in other components (yet).

Updates #15160

Change-Id: Iebb92243caba82b5eb420c49fc3e089a77454f65
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2025-07-29 09:04:08 -07:00
Jordan Whited
a9f3fd1c67
wgengine/magicsock: fix magicsock deadlock around Conn.NoteRecvActivity (#16687)
Updates #16651
Updates tailscale/corp#30836

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-07-28 09:26:24 -07:00
Jordan Whited
179745b83e
wgengine/magicsock: update discoInfo docs (#16638)
discoInfo is also used for holding peer relay server disco keys.

Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-07-23 12:30:04 -07:00
Jordan Whited
1677fb1905
wgengine/magicsock,all: allocate peer relay over disco instead of PeerAPI (#16603)
Updates tailscale/corp#30583
Updates tailscale/corp#30534
Updates tailscale/corp#30557

Signed-off-by: Dylan Bargatze <dylan@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Co-authored-by: Dylan Bargatze <dylan@tailscale.com>
2025-07-21 10:02:37 -07:00
Jordan Whited
36aeacb297
wgengine/magicsock: add peer relay metrics (#16582)
Updates tailscale/corp#30040

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-07-16 14:34:05 -07:00
Jordan Whited
3c6d17e6f1
cmd/tailscale/cli,ipn/ipnlocal,wgengine/magicsock: implement tailscale debug peer-relay-servers (#16577)
Updates tailscale/corp#30036

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-07-16 10:03:05 -07:00
Jordan Whited
d65c0fd2d0
tailcfg,wgengine/magicsock: set peer relay CapVer (#16531)
Updates tailscale/corp#27502
Updates tailscale/corp#30051

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-07-15 12:29:07 -07:00
Jordan Whited
b63f8a457d
wgengine/magicsock: prioritize trusted peer relay paths over untrusted (#16559)
A trusted peer relay path is always better than an untrusted direct or
peer relay path.

Updates tailscale/corp#30412

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-07-14 15:09:31 -07:00