On the corp tailnet (using Mullvad exit nodes + bunch of expired
devices + subnet routers), these were generating big ~35 KB blobs of
logging regularly.
This logging shouldn't even exist at this level, and should be rate
limited at a higher level, but for now as a bandaid, make it less
spammy.
Updates #cleanup
Change-Id: I0b5e9e6e859f13df5f982cd71cd5af85b73f0c0a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When TS_LOG_TARGET is set to an invalid URL, url.Parse returns an error
and nil pointer, which caused a panic when accessing u.Host.
Now we check the error from url.Parse and log a helpful message while
falling back to the default log host.
Fixes#17792
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
As a baby step towards eventbus-ifying controlclient, make the
Observer optional.
This also means callers that don't care (like this network lock test,
and some tests in other repos) can omit it, rather than passing in a
no-op one.
Updates #12639
Change-Id: Ibd776b45b4425c08db19405bc3172b238e87da4e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit replaces usage of local.Client in net/udprelay with DERPMap
plumbing over the eventbus. This has been a longstanding TODO. This work
was also accelerated by a memory leak in net/http when using
local.Client over long periods of time. So, this commit also addresses
said leak.
Updates #17801
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Instead of trying to call View() on something that's already a View
type (or trying to Clone the view unnecessarily), we can re-use the
existing View values in a map[T]ViewType.
Fixes#17866
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
They distracted me in some refactoring. They're set but never used.
Updates #17858
Change-Id: I6ec7d6841ab684a55bccca7b7cbf7da9c782694f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
updates tailscale/corp#31571
It appears that on the latest macOS, iOS and tVOS versions, the work
that netns is doing to bind outgoing connections to the default interface (and all
of the trimmings and workarounds in netmon et al that make that work) are
not needed. The kernel is extension-aware and doing nothing, is the right
thing. This is, however, not the case for tailscaled (which is not a
special process).
To allow us to test this assertion (and where it might break things), we add a
new node cap that turns this behaviour off only for network-extension equipped clients,
making it possible to turn this off tailnet-wide, without breaking any tailscaled
macos nodes.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
I noticed a deadlock in a test in a in-development PR where during a
shutdown storm of things (from a tsnet.Server.Close), LocalBackend was
trying to call magicsock.Conn.Synchronize but the magicsock and/or
eventbus was already shut down and no longer processing events.
Updates #16369
Change-Id: I58b1f86c8959303c3fb46e2e3b7f38f6385036f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Unfortunately I closed the tab and lost it in my sea of CI failures
I'm currently fighting.
Updates #cleanup
Change-Id: I4e3a652d57d52b75238f25d104fc1987add64191
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So they're not all run N times on the sharded oss builders
and are only run one time each.
Updates tailscale/corp#28679
Change-Id: Ie21e84b06731fdc8ec3212eceb136c8fc26b0115
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When systemd notification support was omitted from the build, or on
non-Linux systems, we were unnecessarily emitting code and generating
garbage stringifying addresses upon transition to the Running state.
Updates #12614
Change-Id: If713f47351c7922bb70e9da85bf92725b25954b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This removes one of the O(n=peers) allocs in getStatus, as
Engine.getStatus happens more often than Reconfig.
Updates #17814
Change-Id: I8a87fbebbecca3aedadba38e46cc418fd163c2b0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously if `chains` was empty, it would be passed to `computeActiveAncestor()`,
which would fail with the misleading error "multiple distinct chains".
Updates tailscale/corp#33846
Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: Ib93a755dbdf4127f81cbf69f3eece5a388db31c8
* lock released early just to call `b.send` when it can call
`b.sendToLocked` instead
* `UnlockEarly` called to release the lock before trivially fast
operations, we can wait for a defer there
Updates #11649
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
It was disabled in May 2024 in #12205 (9eb72bb51).
This removes the unused symbols.
Updates #188
Updates tailscale/corp#19106
Updates tailscale/corp#19116
Change-Id: I5208b7b750b18226ed703532ed58c4ea17195a8e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Use GetGlobalAddrs() to discover all STUN endpoints, handling bad NATs
that create multiple mappings. When MappingVariesByDestIP is true, also
add the first STUN IPv4 address with the relay's local port for static
port mapping scenarios.
Updates #17796
Signed-off-by: Raj Singh <raj@tailscale.com>
The feature is currently in private alpha, so requires a tailnet feature
flag. Initially focuses on supporting the operator's own auth, because the
operator is the only device we maintain that uses static long-lived
credentials. All other operator-created devices use single-use auth keys.
Testing steps:
* Create a cluster with an API server accessible over public internet
* kubectl get --raw /.well-known/openid-configuration | jq '.issuer'
* Create a federated OAuth client in the Tailscale admin console with:
* The issuer from the previous step
* Subject claim `system:serviceaccount:tailscale:operator`
* Write scopes services, devices:core, auth_keys
* Tag tag:k8s-operator
* Allow the Tailscale control plane to get the public portion of
the ServiceAccount token signing key without authentication:
* kubectl create clusterrolebinding oidc-discovery \
--clusterrole=system:service-account-issuer-discovery \
--group=system:unauthenticated
* helm install --set oauth.clientId=... --set oauth.audience=...
Updates #17457
Change-Id: Ib29c85ba97b093c70b002f4f41793ffc02e6c6e9
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Now that the feature is in beta, no one should encounter this error.
Updates #cleanup
Change-Id: I69ed3f460b7f28c44da43ce2f552042f980a0420
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This starts running the jsontags vet checker on the module.
All existing findings are adding to an allowlist.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
The cmd/jsontags is non-idiomatic since it is not a main binary.
Move it to a vet directory, which will eventually contain a vettool binary.
Update tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Include the node's OS with network flow log information.
Refactor the JSON-length computation to be a bit more precise.
Updates tailscale/corp#33352Fixestailscale/corp#34030
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Prior to this change a SubscriberFunc treated the call to the subscriber's
function as the completion of delivery. But that means when we are closing the
subscriber, that callback could continue to execute for some time after the
close returns.
For channel-based subscribers that works OK because the close takes effect
before the subscriber ever sees the event. To make the two subscriber types
symmetric, we should also wait for the callback to finish before returning.
This ensures that a Close of the client means the same thing with both kinds of
subscriber.
Updates #17638
Change-Id: I82fd31bcaa4e92fab07981ac0e57e6e3a7d9d60b
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Add options to the eventbus.Bus to plumb in a logger.
Route that logger in to the subscriber machinery, and trigger a log message to
it when a subscriber fails to respond to its delivered events for 5s or more.
The log message includes the package, filename, and line number of the call
site that created the subscription.
Add tests that verify this works.
Updates #17680
Change-Id: I0546516476b1e13e6a9cf79f19db2fe55e56c698
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
In particular on Windows, the `transport.TPMCloser` we get is not safe
for concurrent use. This is especially noticeable because
`tpm.attestationKey.Clone` uses the same open handle as the original
key. So wrap the operations on ak.tpm with a mutex and make a deep copy
with a new connection in Clone.
Updates #15830
Updates #17662
Updates #17644
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Specify the app apability that failed the test, instead of the
entire comma-separated list.
Fixes #cleanup
Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
In #17639 we moved the subscription into NewLogger to ensure we would not race
subscribing with shutdown of the eventbus client. Doing so fixed that problem,
but exposed another: As we were only servicing events occasionally when waiting
for the network to come up, we could leave the eventbus to stall in cases where
a number of network deltas arrived later and weren't processed.
To address that, let's separate the concerns: As before, we'll Subscribe early
to avoid conflicts with shutdown; but instead of using the subscriber directly
to determine readiness, we'll keep track of the last-known network state in a
selectable condition that the subscriber updates for us. When we want to wait,
we'll wait on that condition (or until our context ends), ensuring all the
events get processed in a timely manner.
Updates #17638
Updates #15160
Change-Id: I28339a372be4ab24be46e2834a218874c33a0d2d
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>