1011 Commits

Author SHA1 Message Date
KevinLiang10
45f989f52a
ipn/ipnlocal: warn incompatibility between no-snat-routes and exitnode (#19023)
* ipn/ipnlocal: warn incompatibility between no-snat-routes and exitnode

This commit adds a warning to health check when the --snat-subnet-routes=false flag for subnet router is
set alone side --advertise-exit-node=true. These two would conflict with each other and result internet-bound
traffic from peers using this exit node no masqueraded to the node's source IP and fail to route return
packets back. The described combination is not valid until we figure out a way to separate exitnode masquerade rule and skip it for subnet routes.

Updates #18725

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* use date instead of for now to clarify effectivness

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

---------

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2026-03-26 12:36:31 -04:00
Fran Bull
2d5962f524 feature/conn25,ipn/ipnext,ipn/ipnlocal: add ExtraRouterConfigRoutes hook
conn25 needs to add routes to the operating system to direct handling
of the addresses in the magic IP range to the tailscale0 TUN and
tailscaled.

The way we do this for exit nodes and VIP services is that we add routes
to the Routes field of router.Config, and then the config is passed to
the WireGuard engine Reconfig.

conn25 is implemented as an ipnext.Extension and so this commit adds a
hook to ipnext.Hooks to allow any extension to provide routes to the
config. The hook if provided is called in routerConfigLocked, similarly
to exit nodes and VIP services.

Fixes tailscale/corp#38123

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-03-25 19:28:33 -07:00
kari-ts
9992b7c817
ipn,ipn/local: broadcast ClientVersion if AutoUpdate.Check (#19107)
If AutoUpdate.Check is false, the client has opted out of checking for updates, so we shouldn't broadcast ClientVersion. If the client has opted in, it should be included in the initial Notify.

Updates tailscale/corp#32629

Signed-off-by: kari-ts <kari@tailscale.com>
2026-03-24 15:06:20 -07:00
Michael Ben-Ami
ea7040eea2 ipn/{ipnext,ipnlocal}: expose authReconfig in ipnext.Host as AuthReconfigAsync
Also implement a limit of one on the number of goroutines that can be
waiting to do a reconfig via AuthReconfig, to prevent extensions from
calling too fast and taxing resources.

Even with the protection, the new method should only be used in
experimental or proof-of-concept contexts. The current intended use is
for an extension to be able force a reconfiguration of WireGuard, and
have the reconfiguration call back into the extension for extra Allowed
IPs.

If in the future if WireGuard is able to reconfigure individual peers more
dynamically, an extension might be able to hook into that process, and
this method on ipnext.Host may be deprecated.

Fixes tailscale/corp#38120
Updates tailscale/corp#38124
Updates tailscale/corp#38125

Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-03-20 17:29:11 -04:00
Brendan Creane
ffa7df2789
ipn: reject advertised routes with non-address bits set (#18649)
* ipn: reject advertised routes with non-address bits set

The config file path, EditPrefs local API, and App Connector API were
accepting invalid subnet route prefixes with non-address bits set (e.g.,
2a01:4f9:c010:c015::1/64 instead of 2a01:4f9:c010:c015::/64). All three
paths now reject prefixes where prefix != prefix.Masked() with an error
message indicating the expected masked form.

Updates tailscale/corp#36738

Signed-off-by: Brendan Creane <bcreane@gmail.com>

* address review comments

Signed-off-by: Brendan Creane <bcreane@gmail.com>

---------

Signed-off-by: Brendan Creane <bcreane@gmail.com>
2026-03-20 10:10:43 -07:00
Gesa Stupperich
ca9aa20255
ipn/ipnlocal: populate Groups field in profileFromView
This populates UserProfile.Groups in the WhoIs response from the
local backend with the groups of the corresponding user in the
netmap.

This allows tsnet apps to see (and e.g. forward) which groups a
user making a request belongs to - as long as the tsnet app runs
on a node that been granted the tailscale.com/visible-groups
capability via node attributes. If that's not the case or the
user doesn't belong to any groups allow-listed via the node
attribute, Groups won't be populated.

Updates tailscale/corp#31529

Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
2026-03-19 21:46:55 +00:00
Mike O'Driscoll
4e88d231d5
control,health,ipn: move IP forwarding check to health tracker (#19007)
Currently IP forwarding health check is done on sending MapRequests.

Move ip forwarding to the health service to gain the benefits
of the health tracker and perodic monitoring out of band from
the MapRequest path. ipnlocal now provides a closure to
the health service to provide the check if forwarding is broken.

Removed `skipIPForwardingCheck` from controlclient/direct.go,
it wasn't being used as the comments describe it, that check
has moved to ipnlocal for the closure to the health tracker.

Updates #18976

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-03-18 16:24:12 -04:00
kari-ts
4c7c1091ba
netns: add Android callback to bind socket to network (#18915)
After switching from cellular to wifi without ipv6, ForeachInterface still sees rmnet prefixes, so HaveV6 stays true, and magicsock keeps attempting ipv6 connections that either route through cellular or time out for users on wifi without ipv6

This:
-Adds SetAndroidBindToNetworkFunc, a callback to bind the socket to the selected Android Network object

Updates tailscale/tailscale#6152

Signed-off-by: kari-ts <kari@tailscale.com>
2026-03-11 12:28:28 -07:00
Brad Fitzpatrick
f905871fb1 ipn/ipnlocal, feature/ssh: move SSH code out of LocalBackend to feature
This makes tsnet apps not depend on x/crypto/ssh and locks that in with a test.

It also paves the wave for tsnet apps to opt-in to SSH support via a
blank feature import in the future.

Updates #12614

Change-Id: Ica85628f89c8f015413b074f5001b82b27c953a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-10 17:27:17 -07:00
Gesa Stupperich
6a19995f13 tailcfg: reintroduce UserProfile.Groups
This change reintroduces UserProfile.Groups, a slice that contains
the ACL-defined and synced groups that a user is a member of.

The slice will only be non-nil for clients with the node attribute
see-groups, and will only contain groups that the client is allowed
to see as per the app payload of the see-groups node attribute.

For example:
```
"nodeAttrs": [
  {
    "target": ["tag:dev"],
    "app": {
      "tailscale.com/see-groups": [{"groups": ["group:dev"]}]
    }
  },

  [...]

]
```

UserProfile.Groups will also be gated by a feature flag for the time
being.

Updates tailscale/corp#31529

Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
2026-03-09 11:08:45 +00:00
Brad Fitzpatrick
bd2a2d53d3 all: use Go 1.26 things, run most gofix modernizers
I omitted a lot of the min/max modernizers because they didn't
result in more clear code.

Some of it's older "for x := range 123".

Also: errors.AsType, any, fmt.Appendf, etc.

Updates #18682

Change-Id: I83a451577f33877f962766a5b65ce86f7696471c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-06 13:32:03 -08:00
Michael Ben-Ami
40858a61fe ipnext,ipnlocal: add ExtraWireGuardAllowedIPs hook
This hook addition is motivated by the Connectors 2025 work, in which
NATed "Transit IPs" are used to route interesting traffic to the
appropriate peer, without advertising the actual real IPs.

It overlaps with #17858, and specifically with the WIP PR #17861.
If that work completes, this hook may be replaced by other ones
that fit the new WireGuard configuration paradigm.

Fixes tailscale/corp#37146

Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-03-06 09:42:44 -05:00
Brad Fitzpatrick
2a64c03c95 types/ptr: deprecate ptr.To, use Go 1.26 new
Updates #18682

Change-Id: I62f6aa0de2a15ef8c1435032c6aa74a181c25f8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-05 20:13:18 -08:00
M. J. Fromberger
26951a1cbb
ipn/ipnlocal: skip writing netmaps to disk when disabled (#18883)
We use the TS_USE_CACHED_NETMAP knob to condition loading a cached netmap, but
were hitherto writing the map out to disk even when it was disabled. Let's not
do that; the two should travel together.

Updates #12639

Change-Id: Iee5aa828e2c59937d5b95093ea1ac26c9536721e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-03-04 15:13:30 -08:00
Mike O'Driscoll
2c9ffdd188
cmd/tailscale,ipn,net/netutil: remove rp_filter strict mode warnings (#18863)
PR #18860 adds firewall rules in the mangle table to save outbound packet
marks to conntrack and restore them on reply packets before the routing
decision. When reply packets have their marks restored, the kernel uses
the correct routing table (based on the mark) and the packets pass the
rp_filter check.

This makes the risk check and reverse path filtering warnings unnecessary.

Updates #3310
Fixes tailscale/corp#37846

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-03-04 14:09:19 -05:00
joshua stein
518d241700 netns,wgengine: add OpenBSD support to netns via an rtable
When an exit node has been set and a new default route is added,
create a new rtable in the default rdomain and add the current
default route via its physical interface.  When control() is
requesting a connection not go through the exit-node default route,
we can use the SO_RTABLE socket option to force it through the new
rtable we created.

Updates #17321

Signed-off-by: joshua stein <jcs@jcs.org>
2026-02-25 12:44:32 -08:00
Michael Ben-Ami
811fe7d18e ipnext,ipnlocal,wgengine/filter: add extension hooks for custom filter matchers
Add PacketMatch hooks to the packet filter, allowing extensions to
customize filtering decisions:

- IngressAllowHooks: checked in RunIn after pre() but before the
  standard runIn4/runIn6 match rules. Hooks can accept packets to
  destinations outside the local IP set. First match wins; the
  returned why string is used for logging.

- LinkLocalAllowHooks: checked inside pre() for both ingress and
  egress, providing exceptions to the default policy of dropping
  link-local unicast packets. First match wins. The GCP DNS address
  (169.254.169.254) is always allowed regardless of hooks.

PacketMatch returns (match bool, why string) to provide a log reason
consistent with the existing filter functions.

Hooks are registered via the new FilterHooks struct in ipnext.Hooks
and wired through to filter.Filter in LocalBackend.updateFilterLocked.

Fixes tailscale/corp#35989
Fixes tailscale/corp#37207

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-02-24 10:54:56 -05:00
M. J. Fromberger
f4aea70f7a
ipn/ipnlocal: add basic support for netmap caching (#18530)
This commit is based on ff0978ab, and extends #18497 to connect network map
caching to the LocalBackend. As implemented, only "whole" netmap values are
stored, and we do not yet handle incremental updates. As-written, the feature must
be explicitly enabled via the TS_USE_CACHED_NETMAP envknob, and must be
considered experimental.

Updates #12639

Co-Authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I48a1e92facfbf7fb3a8e67cff7f2c9ab4ed62c83
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-02-17 14:51:54 -08:00
Will Norris
a8204568d8 all: replace UserVisibleError with vizerror package
Updates tailscale/corp#9025

Signed-off-by: Will Norris <will@tailscale.com>
2026-02-16 13:20:51 -08:00
Simon Law
6854d2982b
ipn/ipnlocal: log errors when suggesting exit nodes (#18728)
In PR #18681, we started logging which exit nodes were being
suggested. However, we did not log if there were errors encountered.
This patch corrects this oversight.

Updates: tailscale/corp#29964
Updates: tailscale/corp#36446

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-02-13 18:19:27 -08:00
Simon Law
12188c0ade
ipn/ipnlocal: log traffic steering scores and suggested exit nodes (#18681)
When traffic steering is enabled, some users are suggested an exit
node that is inappropriately far from their location. This seems to
happen right when the client connects to the control plane and the
client eventually fixes itself. But whenever an affected client
reconnects, its suggested exit node flaps, and this happens often
enough to be noticeable because connections drop whenever the exit
node is switched. This should not happen, since the map response that
contains the list of suggested exit nodes that the client picks from,
also contains the scores for those nodes.

Since our current logging and diagnostic tools don’t give us enough
insight into what is happening, this PR adds additional logging when:
- traffic steering scores are used to suggest an exit node
- an exit node is suggested, no matter how it was determined

Updates: tailscale/corp#29964
Updates: tailscale/corp#36446

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-02-10 18:14:32 -08:00
Brad Fitzpatrick
dc1d811d48 magicsock, ipnlocal: revert eventbus-based node/filter updates, remove Synchronize hack
Restore synchronous method calls from LocalBackend to magicsock.Conn
for node views, filter, and delta mutations. The eventbus delivery
introduced in 8e6f63cf1 was invalid for these updates because
subsequent operations in the same call chain depend on magicsock
already having the current state. The Synchronize/settleEventBus
workaround was fragile and kept requiring more workarounds and
introducing new mystery bugs.

Since eventbus was added, we've since learned more about when to use
eventbus, and this wasn't one of the cases.

We can take another swing at using eventbus for netmap changes in a
future change.

Fixes #16369
Updates #18575 (likely fixes)

Change-Id: I79057cc9259993368bb1e350ff0e073adf6b9a8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-02-10 07:32:05 -08:00
KevinLiang10
5eaaf9786b tailcfg: add peerRelay bool to hostinfo
This commit adds a bool named PeerRelay to Hostinfo, to identify the host's status of acting as a peer relay.
Considering the RelayServerPort number can be 0, I just made this a bool in stead of a port number. If the port
info is needed in future this would also help indicating if the port was set to 0 (meaning any port in peer relay
context).

Updates tailscale/corp#35862

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2026-02-06 18:25:40 -07:00
Will Hannah
058cc3f82b
ipn/ipnlocal: skip AuthKey use if profiles exist (#18619)
If any profiles exist and an Authkey is provided via syspolicy, the
AuthKey is ignored on backend start, preventing re-auth attempts. This
is useful for one-time device provisioning scenarios, skipping authKey
use after initial setup when the authKey may no longer be valid.

updates #18618

Signed-off-by: Will Hannah <willh@tailscale.com>
2026-02-06 09:40:55 -05:00
KevinLiang10
03461ea7fb
wgengine/netstack: add local tailscale service IPs to route and terminate locally (#18461)
* wgengine/netstack: add local tailscale service IPs to route and terminate locally

This commit adds the tailscales service IPs served locally to OS routes, and
make interception to packets so that the traffic terminates locally without
making affects to the HA traffics.

Fixes tailscale/corp#34048

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* fix test

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* add ready field to avoid accessing lb before netstack starts

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* wgengine/netstack: store values from lb to avoid acquiring a lock

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* add active services to netstack on starts with stored prefs.

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* fix comments

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* update comments

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

---------

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2026-01-30 16:46:03 -05:00
Will Norris
3ec5be3f51 all: remove AUTHORS file and references to it
This file was never truly necessary and has never actually been used in
the history of Tailscale's open source releases.

A Brief History of AUTHORS files
---

The AUTHORS file was a pattern developed at Google, originally for
Chromium, then adopted by Go and a bunch of other projects. The problem
was that Chromium originally had a copyright line only recognizing
Google as the copyright holder. Because Google (and most open source
projects) do not require copyright assignemnt for contributions, each
contributor maintains their copyright. Some large corporate contributors
then tried to add their own name to the copyright line in the LICENSE
file or in file headers. This quickly becomes unwieldy, and puts a
tremendous burden on anyone building on top of Chromium, since the
license requires that they keep all copyright lines intact.

The compromise was to create an AUTHORS file that would list all of the
copyright holders. The LICENSE file and source file headers would then
include that list by reference, listing the copyright holder as "The
Chromium Authors".

This also become cumbersome to simply keep the file up to date with a
high rate of new contributors. Plus it's not always obvious who the
copyright holder is. Sometimes it is the individual making the
contribution, but many times it may be their employer. There is no way
for the proejct maintainer to know.

Eventually, Google changed their policy to no longer recommend trying to
keep the AUTHORS file up to date proactively, and instead to only add to
it when requested: https://opensource.google/docs/releasing/authors.
They are also clear that:

> Adding contributors to the AUTHORS file is entirely within the
> project's discretion and has no implications for copyright ownership.

It was primarily added to appease a small number of large contributors
that insisted that they be recognized as copyright holders (which was
entirely their right to do). But it's not truly necessary, and not even
the most accurate way of identifying contributors and/or copyright
holders.

In practice, we've never added anyone to our AUTHORS file. It only lists
Tailscale, so it's not really serving any purpose. It also causes
confusion because Tailscalars put the "Tailscale Inc & AUTHORS" header
in other open source repos which don't actually have an AUTHORS file, so
it's ambiguous what that means.

Instead, we just acknowledge that the contributors to Tailscale (whoever
they are) are copyright holders for their individual contributions. We
also have the benefit of using the DCO (developercertificate.org) which
provides some additional certification of their right to make the
contribution.

The source file changes were purely mechanical with:

    git ls-files | xargs sed -i -e 's/\(Tailscale Inc &\) AUTHORS/\1 contributors/g'

Updates #cleanup

Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
2026-01-23 15:49:45 -08:00
M. J. Fromberger
ce12863ee5
ipn/ipnlocal: manage per-profile subdirectories in TailscaleVarRoot (#18485)
In order to better manage per-profile data resources on the client, add methods
to the LocalBackend to support creation of per-profile directory structures in
local storage. These methods build on the existing TailscaleVarRoot config, and
have the same limitation (i.e., if no local storage is available, it will
report an error when used).

The immediate motivation is to support netmap caching, but we can also use this
mechanism for other per-profile resources including pending taildrop files and
Tailnet Lock authority caches.

This commit only adds the directory-management plumbing; later commits will
handle migrating taildrop, TKA, etc. to this mechanism, as well as caching
network maps.

Updates #12639

Change-Id: Ia75741955c7bf885e49c1ad99f856f669a754169
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-01-23 10:09:46 -08:00
Harry Harpham
3840183be9 tsnet: add support for Services
This change allows tsnet nodes to act as Service hosts by adding a new
function, tsnet.Server.ListenService. Invoking this function will
advertise the node as a host for the Service and create a listener to
receive traffic for the Service.

Fixes #17697
Fixes tailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-01-16 15:28:31 -07:00
Jonathan Nobels
643e91f2eb
net/netmon: move TailscaleInterfaceIndex out of netmon.State (#18428)
fixes tailscale/tailscale#18418

Both Serve and PeerAPI broke when we moved the TailscaleInterfaceName
into State, which is updated asynchronously and may not be
available when we configure the listeners.

This extracts the explicit interface name property from netmon.State
and adds as a static struct with getters that have proper error
handling.

The bug is only found in sandboxed Darwin clients, where we
need to know the Tailscale interface details in order to set up the
listeners correctly (they must bind to our interface explicitly to escape
the network sandboxing that is applied by NECP).

Currently set only sandboxed macOS and Plan9 set this but it will
also be useful on Windows to simplify interface filtering in netns.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-01-16 14:53:23 -05:00
Tom Meadows
c3b7f24051
ipn,ipn/local: always accept routes for Tailscale Services (cgnat range) (#18173)
Updates #18198

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Co-authored-by: James Tucker <raggi@tailscale.com>
2026-01-14 18:20:00 +00:00
Andrew Lytvynov
68617bb82e
cmd/tailscaled: disable state encryption / attestation by default (#18336)
TPM-based features have been incredibly painful due to the heterogeneous
devices in the wild, and many situations in which the TPM "changes" (is
reset or replaced). All of this leads to a lot of customer issues.

We hoped to iron out all the kinks and get all users to benefit from
state encryption and hardware attestation without manually opting in,
but the long tail of kinks is just too long.

This change disables TPM-based features on Windows and Linux by default.
Node state should get auto-decrypted on update, and old attestation keys
will be removed.

There's also tailscaled-on-macOS, but it won't have a TPM or Keychain
bindings anyway.

Updates #18302
Updates #15830

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-01-05 17:05:00 -08:00
Jonathan Nobels
3e89068792
net/netmon, wgengine/userspace: purge ChangeDelta.Major and address TODOs (#17823)
updates tailscale/corp#33891

Addresses several older the TODO's in netmon.  This removes the 
Major flag precomputes the ChangeDelta state, rather than making
consumers of ChangeDeltas sort that out themselves.   We're also seeing
a lot of ChangeDelta's being flagged as "Major" when they are
not interesting, triggering rebinds in wgengine that are not needed.  This
cleans that up and adds a host of additional tests.

The dependencies are cleaned, notably removing dependency on netmon
itself for calculating what is interesting, and what is not.  This includes letting
individual platforms set a bespoke global "IsInterestingInterface"
function.  This is only used on Darwin.

RebindRequired now roughly follows how "Major" was historically
calculated but includes some additional checks for various
uninteresting events such as changes in interface addresses that
shouldn't trigger a rebind.  This significantly reduces thrashing (by
roughly half on Darwin clients which switching between nics).   The individual
values that we roll  into RebindRequired are also exposed so that
components consuming netmap.ChangeDelta can ask more
targeted questions.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2025-12-17 12:32:40 -05:00
Brad Fitzpatrick
0df4631308 ipn/ipnlocal: avoid ResetAndStop panic
Updates #18187

Change-Id: If7375efb7df0452a5e85b742fc4c4eecbbd62717
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-12-11 09:07:45 -08:00
Nick Khyl
da0ea8ef3e Revert "ipn/ipnlocal: shut down old control client synchronously on reset"
It appears (*controlclient.Auto).Shutdown() can still deadlock when called with b.mu held, and therefore the changes in #18127 are unsafe.

This reverts #18127 until we figure out what causes it.

This reverts commit d199ecac80083e64d32baf3b473c67b11a6e6936.

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-12-08 15:37:08 -06:00
James 'zofrex' Sanderson
cf40cf5ccb
ipn/ipnlocal: add peer API endpoints to Hostinfo on initial client creation (#17851)
Previously we only set this when it updated, which was fine for the first
call to Start(), but after that point future updates would be skipped if
nothing had changed. If Start() was called again, it would wipe the peer API
endpoints and they wouldn't get added back again, breaking exit nodes (and
anything else requiring peer API to be advertised).

Updates tailscale/corp#27173

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
2025-12-05 13:33:47 +00:00
Nick Khyl
557457f3c2
ipn/ipnlocal: fix LocalBackend deadlock when packet arrives during profile switch (#18126)
If a packet arrives while WireGuard is being reconfigured with b.mu held, such as during a profile switch,
calling back into (*LocalBackend).GetPeerAPIPort from (*Wrapper).filterPacketInboundFromWireGuard
may deadlock when it tries to acquire b.mu.

This occurs because a peer cannot be removed while an inbound packet is being processed.
The reconfig and profile switch wait for (*Peer).RoutineSequentialReceiver to return, but it never finishes
because GetPeerAPIPort needs b.mu, which the waiting goroutine already holds.

In this PR, we make peerAPIPorts a new syncs.AtomicValue field that is written with b.mu held
but can be read by GetPeerAPIPort without holding the mutex, which fixes the deadlock.

There might be other long-term ways to address the issue, such as moving peer API listeners
from LocalBackend to nodeBackend so they can be accessed without holding b.mu,
but these changes are too large and risky at this stage in the v1.92 release cycle.

Updates #18124

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-12-04 10:13:13 -05:00
Nick Khyl
d199ecac80 ipn/ipnlocal: shut down old control client synchronously on reset
Previously, callers of (*LocalBackend).resetControlClientLocked were supposed
to call Shutdown on the returned controlclient.Client after releasing b.mu.
In #17804, we started calling Shutdown while holding b.mu, which caused
deadlocks during profile switches due to the (*ExecQueue).RunSync implementation.

We first patched this in #18053 by calling Shutdown in a new goroutine,
which avoided the deadlocks but made TestStateMachine flaky because
the shutdown order was no longer guaranteed.

In #18070, we updated (*ExecQueue).RunSync to allow shutting down
the queue without waiting for RunSync to return. With that change,
shutting down the control client while holding b.mu became safe.

Therefore, this PR updates (*LocalBackend).resetControlClientLocked
to shut down the old client synchronously during the reset, instead of
returning it and shifting that responsibility to the callers.

This fixes the flaky tests and simplifies the code.

Fixes #18052

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-12-03 20:35:25 -06:00
Brad Fitzpatrick
74ed589042 syncs: add means of declare locking assumptions for debug mode validation
Updates #17852

Change-Id: I42a64a990dcc8f708fa23a516a40731a19967aba
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-11-26 13:04:28 -08:00
Alex Chan
b38dd1ae06 ipn/ipnlocal: don't panic if there are no suitable exit nodes
In suggestExitNodeLocked, if no exit node candidates have a home DERP or
valid location info, `bestCandidates` is an empty slice. This slice is
passed to `selectNode` (`randomNode` in prod):

```go func randomNode(nodes views.Slice[tailcfg.NodeView], …) tailcfg.NodeView {
	…
	return nodes.At(rand.IntN(nodes.Len()))
}
```

An empty slice becomes a call to `rand.IntN(0)`, which panics.

This patch changes the behaviour, so if we've filtered out all the
candidates before calling `selectNode`, reset the list and then pick
from any of the available candidates.

This patch also updates our tests to give us more coverage of `randomNode`,
so we can spot other potential issues.

Updates #17661

Change-Id: I63eb5e4494d45a1df5b1f4b1b5c6d5576322aa72
Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-11-25 19:05:13 +00:00
Simon Law
848978e664
ipn/ipnlocal: test traffic-steering when feature is not enabled (#17997)
In PR tailscale/corp#34401, the `traffic-steering` feature flag does
not automatically enable traffic steering for all nodes. Instead, an
admin must add the `traffic-steering` node attribute to each client
node that they want opted-in.

For backwards compatibility with older clients, tailscale/corp#34401
strips out the `traffic-steering` node attribute if the feature flag
is not enabled, even if it is set in the policy file. This lets us
safely disable the feature flag.

This PR adds a missing test case for suggested exit nodes that have no
priority.

Updates tailscale/corp#34399

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2025-11-25 09:21:55 -08:00
Nick Khyl
7073f246d3 ipn/ipnlocal: do not call controlclient.Client.Shutdown with b.mu held
This fixes a regression in #17804 that caused a deadlock.

Updates #18052

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-11-25 09:22:50 -06:00
Simon Law
9c3a2aa797
ipn/ipnlocal: replace log.Printf with logf (#18045)
Updates #cleanup

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2025-11-24 17:42:58 -08:00
Andrew Lytvynov
c679aaba32
cmd/tailscaled,ipn: show a health warning when state store fails to open (#17883)
With the introduction of node sealing, store.New fails in some cases due
to the TPM device being reset or unavailable. Currently it results in
tailscaled crashing at startup, which is not obvious to the user until
they check the logs.

Instead of crashing tailscaled at startup, start with an in-memory store
with a health warning about state initialization and a link to (future)
docs on what to do. When this health message is set, also block any
login attempts to avoid masking the problem with an ephemeral node
registration.

Updates #15830
Updates #17654

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2025-11-20 15:52:58 -06:00
James Tucker
c09c95ef67 types/key,wgengine/magicsock,control/controlclient,ipn: add debug disco key rotation
Adds the ability to rotate discovery keys on running clients, needed for
testing upcoming disco key distribution changes.

Introduces key.DiscoKey, an atomic container for a disco private key,
public key, and the public key's ShortString, replacing the prior
separate atomic fields.

magicsock.Conn has a new RotateDiscoKey method, and access to this is
provided via localapi and a CLI debug command.

Note that this implementation is primarily for testing as it stands, and
regular use should likely introduce an additional mechanism that allows
the old key to be used for some time, to provide a seamless key rotation
rather than one that invalidates all sessions.

Updates tailscale/corp#34037

Signed-off-by: James Tucker <james@tailscale.com>
2025-11-18 12:16:15 -08:00
Brad Fitzpatrick
bd29b189fe types/netmap,*: remove some redundant fields from NetMap
Updates #12639

Change-Id: Ia50b15529bd1c002cdd2c937cdfbe69c06fa2dc8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-11-18 07:56:10 -08:00
Brad Fitzpatrick
f1cddc6ecf ipn{,/local},cmd/tailscale: add "sync" flag and pref to disable control map poll
For manual (human) testing, this lets the user disable control plane
map polls with "tailscale set --sync=false" (which survives restarts)
and "tailscale set --sync" to restore.

A high severity health warning is shown while this is active.

Updates #12639
Updates #17945

Change-Id: I83668fa5de3b5e5e25444df0815ec2a859153a6d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-11-17 12:37:31 -08:00
Brad Fitzpatrick
99b06eac49 syncs: add Mutex/RWMutex alias/wrappers for future mutex debugging
Updates #17852

Change-Id: I477340fb8e40686870e981ade11cd61597c34a20
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-11-16 19:13:59 -08:00
Brad Fitzpatrick
653d0738f9 types/netmap: remove PrivateKey from NetworkMap
It's an unnecessary nuisance having it. We go out of our way to redact
it in so many places when we don't even need it there anyway.

Updates #12639

Change-Id: I5fc72e19e9cf36caeb42cf80ba430873f67167c3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-11-16 15:32:51 -08:00
James Tucker
a96ef432cf control/controlclient,ipn/ipnlocal: replace State enum with boolean flags
Remove the State enum (StateNew, StateNotAuthenticated, etc.) from
controlclient and replace it with two explicit boolean fields:
- LoginFinished: indicates successful authentication
- Synced: indicates we've received at least one netmap

This makes the state more composable and easier to reason about, as
multiple conditions can be true independently rather than being
encoded in a single enum value.

The State enum was originally intended as the state machine for the
whole client, but that abstraction moved to ipn.Backend long ago.
This change continues moving away from the legacy state machine by
representing state as a combination of independent facts.

Also adds test helpers in ipnlocal that check independent, observable
facts (hasValidNetMap, needsLogin, etc.) rather than relying on
derived state enums, making tests more robust.

Updates #12639

Signed-off-by: James Tucker <james@tailscale.com>
2025-11-14 16:18:34 -08:00
James 'zofrex' Sanderson
124301fbb6
ipn/ipnlocal: log prefs changes and reason in Start (#17876)
Updates tailscale/corp#34238

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
2025-11-14 13:21:56 +00:00