9706 Commits

Author SHA1 Message Date
Brad Fitzpatrick
525f9921fe cmd/testwrapper/flakytest: use t.Attr annotation on flaky tests
Updates #17460

Change-Id: I7381e9a6dd73514c73deb6b863749eef1a87efdc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-06 10:58:48 -07:00
Brad Fitzpatrick
541a4ed5b4 all: use buildfeatures consts in a few more places
Saves ~25 KB.

Updates #12614

Change-Id: I7b976e57819a0d2692824d779c8cc98033df0d30
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-06 10:48:55 -07:00
Jordan Whited
44e1d735c3
tailcfg: bump CapVer for magicsock deadlock fix (#17450)
The fix that was applied in e44e28efcd95596c0a86270c177ef912119bf851.

Updates tailscale/corp#32978

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-10-06 09:41:52 -07:00
Alex Chan
6db8957744 tstest/integration: mark TestPeerRelayPing as flaky
Updates #17251

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-06 17:01:02 +01:00
Brad Fitzpatrick
f208bf8cb1 types/lazy: document difference from sync.OnceValue
Updates #8419
Updates github.com/golang#62202

Change-Id: I0c082c4258fb7a95a17054f270dc32019bcc7581
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-06 08:35:04 -07:00
Brad Fitzpatrick
cf520a3371 feature/featuretags: add LazyWG modular feature
Due to iOS memory limitations in 2020 (see
https://tailscale.com/blog/go-linker, etc) and wireguard-go using
multiple goroutines per peer, commit 16a9cfe2f4ce7d introduced some
convoluted pathsways through Tailscale to look at packets before
they're delivered to wireguard-go and lazily reconfigure wireguard on
the fly before delivering a packet, only telling wireguard about peers
that are active.

We eventually want to remove that code and integrate wireguard-go's
configuration with Tailscale's existing netmap tracking.

To make it easier to find that code later, this makes it modular. It
saves 12 KB (of disk) to turn it off (at the expense of lots of RAM),
but that's not really the point. The point is rather making it obvious
(via the new constants) where this code even is.

Updates #12614

Change-Id: I113b040f3e35f7d861c457eaa710d35f47cee1cb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-06 07:49:40 -07:00
kscooo
f80c7e7c23 net/wsconn: clarify package comment
Explain that this file stays forked from coder/websocket until we can
depend on an upstream release for the helper.

Updates #cleanup

Signed-off-by: kscooo <kscowork@gmail.com>
2025-10-04 21:28:30 -07:00
Brad Fitzpatrick
6820ec5bbb wgengine: stop importing flowtrack when unused
Updates #12614

Change-Id: I42b5c4d623d356af4bee5bbdabaaf0f6822f2bf4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-04 20:52:13 -07:00
Jordan Whited
e44e28efcd
wgengine/magicsock: fix relayManager deadlock (#17449)
Updates tailscale/corp#32978

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-10-04 20:27:57 -07:00
Jordan Whited
3aa8b6d683
wgengine/magicsock: remove misleading unexpected log message (#17445)
Switching to a Geneve-encapsulated (peer relay) path in
endpoint.handlePongConnLocked is expected around port rebinds, which end
up clearing endpoint.bestAddr.

Fixes tailscale/corp#33036

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-10-04 15:05:41 -07:00
Brad Fitzpatrick
3c7e351671 net/connstats: make it modular (omittable)
Saves only 12 KB, but notably removes some deps on packages that future
changes can then eliminate entirely.

Updates #12614

Change-Id: Ibf830d3ee08f621d0a2011b1d4cd175427ef50df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-04 13:17:25 -07:00
Brad Fitzpatrick
2e381557b8 feature/c2n: move answerC2N code + deps out of control/controlclient
c2n was already a conditional feature, but it didn't have a
feature/c2n directory before (rather, it was using consts + DCE). This
adds it, and moves some code, which removes the httprec dependency.

Also, remove some unnecessary code from our httprec fork.

Updates #12614

Change-Id: I2fbe538e09794c517038e35a694a363312c426a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-04 13:16:49 -07:00
Brad Fitzpatrick
db65f3fcf8 ipn/ipnlocal: use buildfeature consts in a few more places
Updates #12614

Change-Id: I561d434d9829172a3d7f6933399237924ff80490
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-04 13:16:49 -07:00
Brad Fitzpatrick
223ced84b5 feature/ace: make ACE modular
Updates #12614

Change-Id: Iaee75d8831c4ba5c9705d7877bb78044424c6da1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-03 19:37:42 -07:00
Brad Fitzpatrick
141eb64d3f wgengine/router/osrouter: fix data race in magicsock port update callback
As found by @cmol in #17423.

Updates #17423

Change-Id: I1492501f74ca7b57a8c5278ea6cb87a56a4086b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-03 17:38:18 -07:00
Brad Fitzpatrick
447cbdd1d0 health: make it omittable
Saves 86 KB.

And stop depending on expvar and usermetrics when disabled,
in prep to removing all the expvar/metrics/tsweb stuff.

Updates #12614

Change-Id: I35d2479ddd1d39b615bab32b1fa940ae8cbf9b11
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-03 17:23:54 -07:00
Simon Law
9c3aec58ba
ipn/ipnlocal: remove junk from suggestExitNodeUsingTrafficSteering (#17436)
This patch removes some code that didn’t get removed before merging
the changes in #16580.

Updates #cleanup
Updates #16551

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2025-10-03 16:29:50 -07:00
Brad Fitzpatrick
f42be719de all: use buildfeature constants in a few more places
Saves 21 KB.

Updates #12614

Change-Id: I0cd3e735937b0f5c0fcc9f09a24476b1c4ac9a15
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-03 10:08:06 -07:00
Alex Chan
59a39841c3 tstest/integration: mark TestClientSideJailing as flaky
Updates #17419

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-03 17:58:08 +01:00
Tom Meadows
8d4ea55cc1
cmd/k8s-proxy: switching to using ipn/store/kubestore (#17402)
kubestore init function has now been moved to a more explicit path of
ipn/store/kubestore meaning we can now avoid the generic import of
feature/condregister.

Updates #12614

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-10-03 17:19:38 +01:00
Alex Chan
304dabce17 ipn/ipnauth: fix a null pointer panic in GetConnIdentity
When running integration tests on macOS, we get a panic from a nil
pointer dereference when calling `ci.creds.PID()`.

This panic occurs because the `ci.creds != nil` check is insufficient
after a recent refactoring (c45f881) that changed `ci.creds` from a
pointer to the `PeerCreds` interface. Now `ci.creds` always compares as
non-nil, so we enter this block even when the underlying value is nil.

The integration tests fail on macOS when `peercred.Get()` returns the
error `unix.GetsockoptInt: socket is not connected`. This error isn't
new, and the previous code was ignoring it correctly.

Since we trust that `peercred` returns either a usable value or an error,
checking for a nil error is a sufficient and correct gate to prevent the
method call and avoid the panic.

Fixes #17421

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-03 16:21:34 +01:00
Brad Fitzpatrick
206d98e84b control/controlclient: restore aggressive Direct.Close teardown
In the earlier http2 package migration (1d93bdce20ddd2, #17394) I had
removed Direct.Close's tracking of the connPool, thinking it wasn't
necessary.

Some tests (in another repo) are strict and like it to tear down the
world and wait, to check for leaked goroutines. And they caught this
letting some goroutines idle past Close, even if they'd eventually
close down on their own.

This restores the connPool accounting and the aggressife close.

Updates #17305
Updates #17394

Change-Id: I5fed283a179ff7c3e2be104836bbe58b05130cc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-02 20:50:28 -07:00
Simon Law
cd523eae52
ipn/ipnlocal: introduce the concept of client-side-reachability (#17367)
The control plane will sometimes determine that a node is not online,
while the node is still able to connect to its peers. This patch
doesn’t solve this problem, but it does mitigate it.

This PR introduces the `client-side-reachability` node attribute that
switches the node to completely ignore the online signal from control.

In the future, the client itself should collect reachability data from
active Wireguard flows and Tailscale pings.

Updates #17366
Updates tailscale/corp#30379
Updates tailscale/corp#32686

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2025-10-02 16:01:55 -07:00
Brad Fitzpatrick
24e38eb729 control/controlclient,health,ipn/ipnlocal,health: fix deadlock by deleting health reporting
A recent change (009d702adfa0fc) introduced a deadlock where the
/machine/update-health network request to report the client's health
status update to the control plane was moved to being synchronous
within the eventbus's pump machinery.

I started to instead make the health reporting be async, but then we
realized in the three years since we added that, it's barely been used
and doesn't pay for itself, for how many HTTP requests it makes.

Instead, delete it all and replace it with a c2n handler, which
provides much more helpful information.

Fixes tailscale/corp#32952

Change-Id: I9e8a5458269ebfdda1c752d7bbb8af2780d71b04
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-02 12:48:22 -07:00
Brad Fitzpatrick
a208cb9fd5 feature/featuretags: add features for c2n, peerapi, advertise/use routes/exit nodes
Saves 262 KB so far. I'm sure I missed some places, but shotizam says
these were the low hanging fruit.

Updates #12614

Change-Id: Ia31c01b454f627e6d0470229aae4e19d615e45e3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-02 12:48:12 -07:00
Brad Fitzpatrick
2cd518a8b6 control/controlclient: optimize zstd decode of KeepAlive messages
Maybe it matters? At least globally across all nodes?

Fixes #17343

Change-Id: I3f61758ea37de527e16602ec1a6e453d913b3195
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-02 10:51:30 -07:00
Brad Fitzpatrick
3ae7a351b4 feature/featuretags: make clientmetrics optional
Saves 57 KB

Updates #12614

Change-Id: If7eebec12b3cb30ae6264171d36a258c04b05a70
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-02 10:05:12 -07:00
M. J. Fromberger
127a967207
appc,*: publish events for route updates and storage (#17392)
Add and wire up event publishers for these two event types in the AppConnector.
Nothing currently subscribes to them, so this is harmless. Subscribers for
these events will be added in a near-future commit.

As part of this, move the appc.RouteInfo type to the types/appctype package.
It does not contain any package-specific details from appc. Beside it, add
appctype.RouteUpdate to carry route update event state, likewise not specific
to appc.  Update all usage of the appc.* types throughout to use appctype.*
instead, and update depaware files to reflect these changes.

Add a Close method to the AppConnector to make sure the client gets cleaned up
when the connector is dropped (we re-create connectors).

Update the unit tests in the appc package to also check the events published
alongside calls to the RouteAdvertiser.

For now the tests still rely on the RouteAdvertiser for correctness; this is OK
for now as the two methods are always performed together.  In the near future,
we need to rework the tests so not require that, but that will require building
some more test fixtures that we can handle separately.

Updates #15160
Updates #17192

Change-Id: I184670ba2fb920e0d2cb2be7c6816259bca77afe
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2025-10-02 09:31:42 -07:00
M. J. Fromberger
3c32f87624
feature/relayserver: use eventbus.Monitor to simplify lifecycle management (#17234)
Instead of using separate channels to manage the lifecycle of the eventbus
client, use the recently-added eventbus.Monitor, which handles signaling the
processing loop to stop and waiting for it to complete.  This allows us to
simplify some of the setup and cleanup code in the relay server.

Updates #15160

Change-Id: Ia1a47ce2e5a31bc8f546dca4c56c3141a40d67af
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2025-10-02 09:18:55 -07:00
Brad Fitzpatrick
1d93bdce20 control/controlclient: remove x/net/http2, use net/http
Saves 352 KB, removing one of our two HTTP/2 implementations linked
into the binary.

Fixes #17305
Updates #15015

Change-Id: I53a04b1f2687dca73c8541949465038b69aa6ade
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-02 08:25:14 -07:00
Brad Fitzpatrick
c45f8813b4 feature/featuretags, all: add build features, use existing ones in more places
Saves 270 KB.

Updates #12614

Change-Id: I4c3fe06d32c49edb3a4bb0758a8617d83f291cf5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-02 08:07:25 -07:00
Tom Proctor
aa5b2ce83b
cmd/k8s-operator: add .gitignore for generated chart CRDs (#17406)
Add a .gitignore for the chart version of the CRDs that we never commit,
because the static manifest CRD files are the canonical version. This
makes it easier to deploy the CRDs via the helm chart in a way that
reflects the production workflow without making the git checkout
"dirty".

Given that the chart CRDs are ignored, we can also now safely generate
them for the kube-generate-all Makefile target without being a nuisance
to the state of the git checkout. Added a slightly more robust repo root
detection to the generation logic to make sure the command works from
the context of both the Makefile and the image builder command we run
for releases in corp.

Updates tailscale/corp#32085

Change-Id: Id44a4707c183bfaf95a160911ec7a42ffb1a1287

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-10-02 13:30:00 +01:00
Tom Proctor
16e0abe031
build_docker.sh: support including extra files (#17405)
mkctr already has support for including extra files in the built
container image. Wire up a new optional environment variable to thread
that through to mkctr. The operator e2e tests will use this to bake
additional trusted CAs into the test image without significantly
departing from the normal build or deployment process for our
containers.

Updates tailscale/corp#32085

Change-Id: Ica94ed270da13782c4f5524fdc949f9218f79477

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-10-02 13:29:03 +01:00
Alex Chan
7dfa26778e derp/derphttp: de-flake DERP HTTP clients tests with memnet and synctest
Using memnet and synctest removes flakiness caused by real networking
and subtle timing differences.

Additionally, remove the `t.Logf` call inside the server's shutdown
goroutine that was causing a false positive data race detection.

The race detector is flagging a double write during this `t.Logf` call.
This is a common pattern, noted in golang/go#40343 and elsehwere in
this file, where using `t.Logf` after a test has finished can interact
poorly with the test runner.

This is a long-standing issue which became more common after rewriting
this test to use memnet and synctest.

Fixed #17355

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-02 08:38:11 +01:00
Andrew Lytvynov
cca70ddbfc
cmd/tailscaled: default --encrypt-state to true if TPM is available (#17376)
Whenever running on a platform that has a TPM (and tailscaled can access
it), default to encrypting the state. The user can still explicitly set
this flag to disable encryption.

Updates https://github.com/tailscale/corp/issues/32909

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2025-10-01 20:18:58 -07:00
Brad Fitzpatrick
78af49dd1a control/ts2021: rename from internal/noiseconn in prep for controlclient split
A following change will split out the controlclient.NoiseClient type
out, away from the rest of the controlclient package which is
relatively dependency heavy.

A question was where to move it, and whether to make a new (a fifth!)
package in the ts2021 dependency chain.

@creachadair and I brainstormed and decided to merge
internal/noiseconn and controlclient.NoiseClient into one package,
with names ts2021.Conn and ts2021.Client.

For ease of reviewing the subsequent PR, this is the first step that
just renames the internal/noiseconn package to control/ts2021.

Updates #17305

Change-Id: Ib5ea162dc1d336c1d805bdd9548d1702dd6e1468
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-01 15:34:57 -07:00
Brad Fitzpatrick
801aac59db Makefile, cmd/*/depaware.txt: split out vendor packages explicitly
depaware was merging golang.org/x/foo and std's
vendor/golang.org/x/foo packages (which could both be in the binary!),
leading to confusing output, especially when I was working on
eliminating duplicate packages imported under different names.

This makes the depaware output longer and grosser, but doesn't hide
reality from us.

Updates #17305

Change-Id: I21cc3418014e127f6c1a81caf4e84213ce84ab57
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-01 13:02:06 -07:00
M. J. Fromberger
67f1081269
appc,ipn/ipnlocal: add a required event bus to the AppConnector type (#17390)
Require the presence of the bus, but do not use it yet.  Check for required
fields and update tests and production use to plumb the necessary arguments.

Updates #15160
Updates #17192

Change-Id: I8cefd2fdb314ca9945317d3320bd5ea6a92e8dcb
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2025-10-01 12:00:32 -07:00
Claus Lensbøl
ce752b8a88
net/netmon: remove usage of direct callbacks from netmon (#17292)
The callback itself is not removed as it is used in other repos, making
it simpler for those to slowly transition to the eventbus.

Updates #15160

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2025-10-01 14:59:38 -04:00
M. J. Fromberger
6f7ce5eb5d
appc: factor app connector arguments into a Config type (#17389)
Replace the positional arguments to NewAppConnector with a Config struct.
Update the existing uses. Other than the API change, there are no functional
changes in this commit.

Updates #15160
Updates #17192

Change-Id: Ibf37f021372155a4db8aaf738f4b4f2c746bf623
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2025-10-01 11:39:01 -07:00
Brad Fitzpatrick
05a4c8e839 tsnet: remove AuthenticatedAPITransport (API-over-noise) support
It never launched and I've lost hope of it launching and it's in my
way now, so I guess it's time to say goodbye.

Updates tailscale/corp#4383
Updates #17305

Change-Id: I2eb551d49f2fb062979cc307f284df4b3dfa5956
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-01 08:13:24 -07:00
Brad Fitzpatrick
c2f37c891c all: use Go 1.20's errors.Join instead of our multierr package
Updates #7123

Change-Id: Ie9be6814831f661ad5636afcd51d063a0d7a907d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-01 08:10:59 -07:00
Brad Fitzpatrick
91fa51ca15 ipn/store, feature/condregister: permit callers to empty import optonal ipn stores
This permits other programs (in other repos) to conditionally
import ipn/store/awsstore and/or ipn/store/kubestore and have them
register themselves, rather than feature/condregister doing it.

Updates tailscale/corp#32922

Change-Id: I2936229ce37fd2acf9be5bf5254d4a262d090ec1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-01 07:52:57 -07:00
James Sanderson
ebc370e517 ipn/ipnlocal: fail test if more notifies are put than expected
The `put` callback runs on a different goroutine to the test, so calling
t.Fatalf in put had no effect. `drain` is always called when checking what
was put and is called from the test goroutine, so that's a good place to
fail the test if the channel was too full.

Updates #17363

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
2025-10-01 15:09:40 +01:00
Tom Meadows
af1114e896
cmd/k8s-proxy: importing feature/condregister on cmd/k8s-proxy (#17383)
https://github.com/tailscale/tailscale/pull/17346 moved the kube and aws
arn store initializations to feature/condregister, under the assumption
that anything using it would use kubestore.New. Unfortunately,
cmd/k8s-proxy makes use of store.New, which compares the `<prefix>:`
supplied in the provided `path string` argument against known stores. If
it doesn't find it, it fallsback to using a FileStore.

Since cmd/k8s-proxy uses store.New to try and initialize a kube store in
some cases (without importing feature/condregister), it silently creates
a FileStore and that leads to misleading errors further along in
execution.

This fixes this issue by importing condregister, and successfully
initializes a kube store.

Updates #12614

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-10-01 12:24:21 +01:00
Nick Khyl
9781b7c25c ipn/ipnlocal: plumb logf into nodeBackend
Updates #cleanup

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-09-30 20:59:13 -05:00
Brad Fitzpatrick
5b09913d64 ipn/ipnlocal, engine: avoid runtime/pprof with two usages of ts_omit_debug
Saves 258 KB.

Updates #12614

Change-Id: I37c2f7f916480e3534883f338de4c64d08f7ef2b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-30 14:58:25 -07:00
Brad Fitzpatrick
f7afb9b6ca feature/featuretags, ipn/conffile: make HuJSON support in config files optional
Saves 33 KB.

Updates #12614

Change-Id: Ie701c230e0765281f409f29ed263910b9be9cc77
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-30 14:32:55 -07:00
Brad Fitzpatrick
6c6a1d8341 feature/appconnectors: start making it modular
Saves 45 KB.

Updates #12614

Change-Id: Iaeb73e69633878ce0a0f58c986024784bbe218f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-30 13:35:44 -07:00
Brad Fitzpatrick
9386a101d8 cmd/tailscaled, ipn/localapi, util/eventbus: don't link in regexp when debug is omitted
Saves 442 KB. Lock it with a new min test.

Updates #12614

Change-Id: Ia7bf6f797b6cbf08ea65419ade2f359d390f8e91
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-30 12:13:17 -07:00