1031 Commits

Author SHA1 Message Date
Jonathan Nobels
8730cce217 ipn/ipnlocal: send initial NetMap to watchers subscribing to peer change deltas
updates tailscale/corp#40994

If a watcher doesn't also set NotifyInitialNetMap, they can end up receiving
deltas temporarily without any base netmap to which to apply them to.
This change ensures  that there is no requirement to set both flags.  There
is no scenario in which a caller doesn't need the base netmap in the first response.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-05-01 09:20:10 -04:00
Brad Fitzpatrick
a6c5d23742 ipn, ipn/ipnlocal: add Notify.SelfChange
Add a new bus signal that lets reactive consumers (containerboot, kube
agents, sniproxy, tsconsensus, etc.) react to self-node updates without
having to subscribe to the full netmap. Today those consumers either
watch Notify.NetMap (which on large tailnets is expensive to encode and
ship per watcher) or poll. SelfChange is a cheap, narrow alternative:
addresses, name, key expiry, capabilities, etc.

Consumers that need additional state can react to SelfChange and then
fetch the relevant bits on demand via existing LocalClient methods.

Producer-side, every netmap-bearing setControlClientStatus call now
also publishes SelfChange. Future changes will migrate individual
in-tree consumers off Notify.NetMap to this signal, and eventually
gate the legacy NetMap emission to platforms whose host GUIs still
require it.

Updates #12542

Change-Id: I4441650b0e085d663eb6bf26a03748b7d961ca49
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-30 14:47:03 -07:00
Brad Fitzpatrick
159cf8707a ipn/ipnlocal, all: split LocalBackend.NetMap into NetMapNoPeers / NetMapWithPeers
Add two narrower accessors alongside the existing
[LocalBackend.NetMap], with docs that distinguish their semantics:

  - NetMapNoPeers: cheap (returns the cached *netmap.NetworkMap with
    a possibly-stale Peers slice). For callers that only read non-Peers
    fields like SelfNode, DNS, PacketFilter, capabilities.
  - NetMapWithPeers: documented as returning an up-to-date Peers slice.
    For callers that genuinely need to iterate Peers or call
    PeerByXxx.

Mark the existing NetMap deprecated and point readers at the two new
accessors. NetMap, NetMapNoPeers, and NetMapWithPeers all currently
return the same value (b.currentNode().NetMap()): this commit is a
no-op behaviorally, just a renaming and migration of in-tree callers.
A subsequent change in the same series will switch
NetMapWithPeers to actually rebuild the Peers slice from the live
per-node-backend peers map (O(N) per call), at which point the
distinction between the two new accessors becomes load-bearing.

Migrate in-tree callers to the appropriate accessor based on what
fields they read:

  - NetMapNoPeers (most common): localapi handlers, peerapi accept,
    GetCertPEMWithValidity, web client noise request, doctor DNS
    resolver check, tsnet CertDomains/TailscaleIPs, ssh/tailssh
    SSH-policy/cap reads, several LocalBackend internals
    (isLocalIP, allowExitNodeDNSProxyToServeName, pauseForNetwork
    nil-check, serve config).
  - NetMapWithPeers: writeNetmapToDiskLocked (persist full netmap to
    disk for fast restart), PeerByTailscaleIP lookup.

Tests still call the legacy NetMap; they'll see the deprecation
warning but otherwise behave identically.

Also add two pieces of plumbing the next change in this series will
need, but which are already useful on their own:

  - [client/local.GetDebugResultJSON]: a generic [Client.DebugResultJSON]
    that decodes directly into a target type T, avoiding the
    marshal/unmarshal roundtrip callers otherwise need.
  - localapi "current-netmap" debug action: returns the current
    netmap (with peers) as JSON. Documented as debug-only — the
    netmap.NetworkMap shape is internal and may change without notice.

This commit is part of a series breaking up a larger change for
review; on its own it is a no-op refactor.

Updates #12542

Change-Id: Idbb30707414f8da3149c44ca0273262708375b02
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-30 11:14:06 -07:00
Brad Fitzpatrick
f343b496c3 wgengine, all: remove LazyWG, use wireguard-go callback API for on-demand peers
Replace the UAPI text protocol-based wireguard configuration with
wireguard-go's new direct callback API (SetPeerLookupFunc,
SetPeerByIPPacketFunc, RemoveMatchingPeers, SetPrivateKey).

Instead of computing a trimmed wireguard config ahead of time upon
control plane updates and pushing it via UAPI, install callbacks so
wireguard-go creates peers on demand when packets arrive. This removes
all the LazyWG trimming machinery: idle peer tracking, activity maps,
noteRecvActivity callbacks, the KeepFullWGConfig control knob, and the
ts_omit_lazywg build tag.

For incoming packets, PeerLookupFunc answers wireguard-go's questions
about unknown public keys by looking up the peer in the full config.
For outgoing packets, PeerByIPPacketFunc (installed from
LocalBackend.lookupPeerByIP) maps destination IPs to node public keys
using the existing nodeByAddr index.

Updates tailscale/corp#12345

Change-Id: I4cba80979ac49a1231d00a01fdba5f0c2af95dd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 19:46:19 -07:00
Claus Lensbøl
978b6a81b2
ipn/ipnlocal: always ReSTUN when starting up without a cache (#19586)
78627c1 introduced starting up and preserving the DERP server from
cache, but also changed it so the initial ReSTUN would not fire when
setting the DERPMap.

Change this so when not working from a cache, the ReSTUN will always
fire during startup.

Updates #19585

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-29 18:56:57 -04:00
Brad Fitzpatrick
15cba0a3f6 tstest/natlab/vmtest: add TestDiscoKeyChange
Add a vmtest that brings up two gokrazy nodes A and B behind two
One2OneNAT networks (so direct UDP works in both directions and any
slowness can't be blamed on NAT traversal), establishes a WireGuard
tunnel A → B with TSMP, then rotates B's disco key four times and
asserts that the data plane recovers in both directions after each
rotation. All pings are TSMP (the data-plane ping; disco pings would
not exercise the WireGuard tunnel itself).

The five pings:

  1. A → B  (initial; brings up the tunnel; 30s budget)
  2. B → A  after rotate (LocalAPI rotate-disco-key debug action)
  3. A → B  after rotate (LocalAPI)
  4. B → A  after restart (SIGKILL; gokrazy supervisor respawns)
  5. A → B  after restart (SIGKILL)

Each post-rotation ping gets a 15-second budget. Two unavoidable
multi-second waits dominate today:

  - The rotate-then-a→b phase takes ~10s on main because of LazyWG.
    After B's WantRunning bounce, B's wgengine resets its
    sentActivityAt/recvActivityAt maps and trims A out of the
    wireguard-go config as an "idle peer"; B only re-adds A on
    inbound activity, by which point A's first few TSMP packets
    have been silently dropped at B's tundev. The
    bradfitz/rm_lazy_wg branch removes that trimming entirely
    (verified locally: this phase drops to <100ms there).

  - The restart phases take ~5s for wireguard-go's RekeyTimeout
    handshake retry. After SIGKILL+respawn the first WG handshake
    init from the restarted node sometimes goes into the void
    (likely the brief peer-removed window in the receiver's
    two-step maybeReconfigWireguardLocked reconfig during which
    the peer is absent from wireguard-go), and wg-go's 5s+jitter
    retransmit timer is the next opportunity to retry. That retry
    succeeds and the staged TSMP packet flushes. Intrinsic to the
    protocol's retransmit policy.

Once LazyWG is removed and the first-handshake-after-reconfig race
is fixed, the budget should drop to 5s.

Supporting changes:

  ipn/ipnlocal: DebugRotateDiscoKey now toggles WantRunning off and
  back on after rotating the disco key. magicsock.Conn.RotateDiscoKey
  only resets local disco state; without also dropping wireguard-go
  session keys, peers keep encrypting with their stale per-peer
  session against us until their rekey timer fires (WireGuard has no
  data-plane signaling to invalidate sessions). Bouncing WantRunning
  runs the engine through Reconfig(empty) → authReconfig, which
  drops every peer's WG session so the next packet either way
  triggers a fresh handshake.

  ipn/ipnlocal, ipn/localapi: add a debug-only "peer-disco-keys"
  LocalAPI action ([LocalBackend.DebugPeerDiscoKeys]) that returns
  a map[NodePublic]DiscoPublic from the current netmap. Tests reach
  it via [local.Client.DebugResultJSON]. We do not surface disco
  keys via [ipnstate.PeerStatus] because adding a non-comparable
  [key.DiscoPublic] field there breaks reflect-based test helpers
  (e.g. TestFilterFormatAndSortExitNodes' use of cmp.Diff), and
  general LocalAPI clients have no need for disco keys. Since the
  debug LocalAPI is gated behind the ts_omit_debug build tag, this
  endpoint is automatically stripped from small binaries.

  cmd/tta: add /restart-tailscaled handler (Linux-only, via /proc walk)
  to drive the SIGKILL phase. On gokrazy the supervisor respawns
  tailscaled within a second.

  tstest/integration/testcontrol: add Server.AllOnline. When set,
  every peer entry in MapResponses is marked Online=true. Several
  disco-key handling fast paths in controlclient and wgengine
  (removeUnwantedDiscoUpdates, removeUnwantedDiscoUpdatesFromFull
  NetmapUpdate, the wgengine tsmpLearnedDisco fast path) only fire
  for online peers; without this flag, tests exercising disco-key
  rotation only hit the offline-peer code paths, which mask issues
  and are several seconds slower in this scenario. Finer-grained
  per-node online tracking can be added later.

  tstest/natlab/vmtest: add Env.RotateDiscoKey,
  Env.RestartTailscaled, Env.PeerDiscoKey, Node.Name, an
  [AllOnline] EnvOption that plumbs through to
  testcontrol.Server.AllOnline, and an exported
  Env.Ping(from, to, type, timeout). Ping replaces the unexported
  helper so callers can specify both a ping type (PingDisco for
  warming peer state, PingTSMP for asserting end-to-end
  connectivity) and a deadline. PeerDiscoKey returns its LocalAPI
  error so callers inside tstest.WaitFor can retry transient
  failures rather than fataling the test.

Updates #12639
Updates #13038

Change-Id: I3644f27fc30e52990ba25a3983498cc582ddb958
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 12:58:00 -07:00
Brad Fitzpatrick
22ff402da9 wgengine/magicsock: restore SetDERPMap signature, add SetDERPMapWithoutReSTUN
Commit 78627c132f changed the signature of magicsock.Conn.SetDERPMap to
take an additional bool doReStun parameter. Avoid both the boolean
parameter and the API signature change by restoring SetDERPMap to its
original single-argument form and adding a new SetDERPMapWithoutReSTUN
method for the cache-loading caller that wants to skip the post-set
ReSTUN.

Updates #19490

Change-Id: I97d9e82156bfc546ccf59756d1ea52f039b5de06
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 12:46:15 -07:00
Claus Lensbøl
78627c132f
wgengine/magicsock,ipn/ipnlocal: store and load homeDERP from cache (#19491)
With netmap caching, the home DERP of the self node was neither saved to
the cache or loaded from it, making nodes not stick to a DERP when
starting without a connection to control.

Instead, make sure that when a cache is available, load that cache,
before looking for DERP servers. This is implemented by allowing a skip
of ReSTUN in setting the DERP map (we must have a DERP map before
setting the home DERP), so the DERP from cache will set itself and be
sticky until a connection to control is established.

Making DERP only change when connected to control is handled by existing
code from f072d017bd8241675aa946a27fc1827f570435cb.

Updates #19490

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-29 10:24:09 -04:00
Alex Chan
bb91bb842c all: remove everything related to non-seamless key renewal
Seamless key renewal has been the default in all clients since 1.90.
We retained the ability to disable it from the control plane as a
precaution, but we haven't seen any issues that require us to disable it.

We're now removing all the code for non-seamless key renewal, because we
don't expect to turn it on again, and indeed it's been untested in the
field for three releases so might contain latent bugs!

Updates tailscale/corp#33042

Change-Id: I4b80bf07a3a50298d1c303743484169accc8844b
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-29 10:03:26 +01:00
Brad Fitzpatrick
ad5436af0d tstest/largetailnet, tstest/integration/testcontrol: add in-process large-tailnet benchmark
Add a Go benchmark that exercises a single tailnet client (a [tsnet.Server]
running in the test process) against a synthetic large initial netmap and
a stream of caller-driven peer add/remove deltas, all in-process.

The harness is split in two parts:

  - tstest/largetailnet, a reusable package containing a [Streamer]
    that hijacks the map long-poll on a [testcontrol.Server] via the new
    AltMapStream hook, sends one initial MapResponse with N synthetic
    peers, and forwards caller-supplied delta MapResponses on the same
    stream. Helpers like MakePeer / AllocPeer build synthetic peers with
    unique IDs and addresses derived from the Tailscale ULA range.

  - tstest/largetailnet/largetailnet_test.go, BenchmarkGiantTailnet
    (headless tailscaled workload, no IPN bus subscriber) and
    BenchmarkGiantTailnetBusWatcher (GUI-client workload with one
    Notify subscriber attached). Both are gated on
    --actually-test-giant-tailnet (skipped by default), stand up an
    in-process testcontrol + tsnet.Server, let Up block until the
    initial N-peer netmap has been processed, then ResetTimer and run
    add+remove pairs via b.Loop. Per-delta sync is via a test-only
    [ipnlocal.LocalBackend.AwaitNodeKeyForTest] channel that closes
    once the just-added peer key appears in the netmap (no-watcher
    variant) or via bus-Notify drain (bus-watcher variant).

To support the hijack, [testcontrol.Server] grows an AltMapStream hook
and a small MapStreamWriter interface for benchmarks/stress tests that
need to drive a controlled MapResponse sequence; the normal serveMap
path is untouched when AltMapStream is nil. The streamer answers
non-streaming "lite" map polls (which controlclient issues before the
streaming long-poll to push HostInfo) with an empty MapResponse and
returns immediately, so the streaming poll that follows is the one
that gets the initial netmap.

The benchmark is intended for before/after comparisons of netmap- and
delta-handling changes targeted at large tailnets. CPU profiles on
unmodified main show the expected O(N) hotspots:
setControlClientStatusLocked / authReconfigLocked /
userspaceEngine.Reconfig / setNetMapLocked, plus JSON encoding of the
full Notify.NetMap to bus watchers (which dominates the BusWatcher
variant).

Median ms/op over 10 runs on unmodified main, by tailnet size N:

       N      no-watcher   bus-watcher
   10000          32          166
   50000         222          865
  100000         504         1765
  250000        1551         4696

Recommended invocation:

	go test ./tstest/largetailnet/ -run=^$ \
	    -bench='BenchmarkGiantTailnet(BusWatcher)?$' \
	    -benchtime=2000x -timeout=10m \
	    --actually-test-giant-tailnet \
	    --giant-tailnet-n=250000 \
	    -cpuprofile=/tmp/giant.cpu.pprof

Updates #12542

Change-Id: I4f5b2bb271a36ba853d5a0ffe82054ef2b15c585
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 11:47:12 -07:00
Evan Lowry
3a05c450ce
posture: add HealthTracker for serial number retrieval (#19181)
Device posture checking can fail while enabled if tailscaled does not
have access to smbios. Previously, this was only observable by looking
in the tailscaled logs.

Fixes tailscale/corp#39314

Signed-off-by: Evan Lowry <evan@tailscale.com>
2026-04-25 15:42:47 -03:00
Scott Graham
cb5a53c424 ipn/ipnlocal: preserve b.loginFlags in auto-login cc.Login calls
LocalBackend stores loginFlags at construction so that per-instance
properties (e.g. LoginEphemeral set by tsnet.Server.Ephemeral) persist
for the session. StartLoginInteractiveAs already merges b.loginFlags
into its cc.Login call, but the two auto-login call sites pass bare
controlclient.LoginDefault, silently dropping any stored flags.

Merge b.loginFlags at both auto-login call sites to match the existing
StartLoginInteractiveAs pattern. LoginDefault is zero so this is a
no-op when loginFlags is empty, and restores the documented behavior
when it isn't.

Fixes #15852

Signed-off-by: Scott Graham <scott.github@h4ck3r.net>
2026-04-17 23:31:18 -05:00
M. J. Fromberger
1e4934659b
ipn/ipnlocal: discard cached netmaps upon panic during SetNetworkMap (#19414)
For debugging purposes, unstable builds will sometimes intentionally panic for
unexpected behaviours. We observed such a panic after loading a cached netmap,
but because we had a valid cached map, the client was unable to recover on its
own and the operator had to manually reset the cache.

As a defensive hedge, when netmap caching is enabled, check for a panic during
installation of a net network map: If one occurs, discard any cached netmaps
before letting the panic unwind, so that we do not lose the panic itself, but
reduce the need for manual intervention.

Updates #12639
Updates tailscale/corp#27300

Change-Id: I0436889c6bdc2fa728c9cb83630cd7b00a72ce68
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-04-15 11:07:42 -07:00
Naman Sood
6301a6ce4b
util/linuxfw,wgengine/router: allow incoming CGNAT range traffic with nodeattr
Clients with the newly added node attribute
`"disable-linux-cgnat-drop-rule"` will not automatically drop inbound
traffic on non-Tailscale network interfaces with the source IP in the
CGNAT IP range. This is an initial proof-of-concept for enabling
connectivity with off-Tailnet CGNAT endpoints.

Fixes tailscale/corp#36270.

Signed-off-by: Naman Sood <mail@nsood.in>
2026-04-14 16:45:06 -04:00
Jonathan Nobels
03c3551ee5
ipn/ipnlocal: add netmap mutations to the ipn bus (#19120)
ipn/local: add netmap mutations to the ipn bus

updates tailscale/tailscale#1909

This adds a new new NotifyWatchOpt that allows watchers to
receive PeerChange events (derived from node mutations)
on the IPN bus in lieu of a complete netmap.  We'll continue
to send the full netmap for any map response that includes it,
but for  mutations, sending PeerChange events gives the client
the option to manage it's own models more selectively and cuts
way down on json serialization overhead.

On chatty tailnets, this will vastly reduce the amount of
chatter on the bus.

This change should be backwards compatible, it is
purely additive.  Clients that subscribe to NotifyNetmap will
get the full netmap for every delta.  New clients can
omit that and instead opt into NotifyPeerChanges.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-04-09 15:45:41 -04:00
Brad Fitzpatrick
a182b864ac tsd, all: add Sys.ExtraRootCAs, plumb through TLS dial paths
Add ExtraRootCAs *x509.CertPool to tsd.System and plumb it through
the control client, noise transport, DERP, and wgengine layers so
that platforms like Android can inject user-installed CA certificates
into Go's TLS verification.

tlsdial.Config now honors base.RootCAs as additional trusted roots,
tried after system roots and before the baked-in LetsEncrypt fallback.
SetConfigExpectedCert gets the same treatment for domain-fronted DERP.

The Android client will set sys.ExtraRootCAs with a pool built from
x509.SystemCertPool + user-installed certs obtained via the Android
KeyStore API, replacing the current SSL_CERT_DIR environment variable
approach.

Updates #8085

Change-Id: Iecce0fd140cd5aa0331b124e55a7045e24d8e0c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-07 18:10:54 -07:00
James Tucker
21695cdbf8 ipn/ipnlocal,net/netmon: make frequent darkwake more efficient
Investigating battery costs on a busy tailnet I noticed a large number
of nodes regularly reconnecting to control and DERP. In one case I was
able to analyze closely `pmset` reported the every-minute wake-ups being
triggered by bluetooth. The node was by side effect reconnecting to
control constantly, and this was at times visible to peers as well.

Three changes here improve the situation:
- Short time jumps (less than 10 minutes) no longer produce "major
  network change" events, and so do not trigger full rebind/reconnect.
- Many "incidental" fields on interfaces are ignored, like MTU, flags
  and so on - if the route is still good, the rest should be manageable.
- Additional log output will provide more detail about the cause of
  major network change events.

Updates #3363

Signed-off-by: James Tucker <james@tailscale.com>
2026-04-06 15:46:51 -07:00
M. J. Fromberger
211ef67222
tailcfg,ipn/ipnlocal: regulate netmap caching via a node attribute (#19117)
Add a new tailcfg.NodeCapability (NodeAttrCacheNetworkMaps) to control whether
a node with support for caching network maps will attempt to do so. Update the
capability version to reflect this change (mainly as a safety measure, as the
control plane does not currently need to know about it).

Use the presence (or absence) of the node attribute to decide whether to create
and update a netmap cache for each profile. If caching is disabled, discard the
cached data; this allows us to use the presence of a cached netmap as an
indicator it should be used (unless explicitly overridden). Add a test that
verifies the attribute is respected. Reverse the sense of the environment knob
to be true by default, with an override to disable caching at the client
regardless what the node attribute says.

Move the creation/update of the netmap cache (when enabled) until after
successfully applying the network map, to reduce the possibility that we will
cache (and thus reuse after a restart) a network map that fails to correctly
configure the client.

Updates #12639

Change-Id: I1df4dd791fdb485c6472a9f741037db6ed20c47e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-04-01 15:02:53 -07:00
Harry Harpham
61ac021c5d wgengine/magicsock: assume network up for tests
Without this, any test relying on underlying use of magicsock will fail
without network connectivity, even when the test logic has no need for a
network connection. Tests currently in this bucket include many in
tstest/integration and in tsnet.

Further explanation:

ipn only becomes Running when it sees at least one live peer or DERP
connection:
0cc1b2ff76/ipn/ipnlocal/local.go (L5861-L5866)

When tests only use a single node, they will never see a peer, so the
node has to wait to see a DERP server.

magicsock sets the preferred DERP server in updateNetInfo(), but this
function returns early if the network is down.
0cc1b2ff76/wgengine/magicsock/magicsock.go (L1053-L1106)

Because we're checking the real network, this prevents ipn from entering
"Running" and causes the test to fail or hang.

In tests, we can assume the network is up unless we're explicitly testing
the behaviour of tailscaled when the network is down. We do something similar
in magicsock/derp.go, where we assume we're connected to control unless
explicitly testing otherwise:
7d2101f352/wgengine/magicsock/derp.go (L166-L177)

This is the template for the changes to `networkDown()`.

Fixes #17122

Co-authored-by: Alex Chan <alexc@tailscale.com>
Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-03-31 09:57:14 -06:00
Claus Lensbøl
bf467727fc
control/controlclient,ipn/ipnlocal,wgengine: avoid restarting wireguard when key is learned via tsmp (#19142)
When disco keys are learned on a node that is connected to control and
has a mapSession, wgengine will see the key as having changed, and
assume that any existing connections will need to be reset.

For keys learned via TSMP, the connection should not be reset as that
key is learned via an active wireguard connection. If wgengine resets
that connetion, a 15s timeout will occur.

This change adds a map to track new keys coming in via TSMP, and removes
them from the list of keys that needs to trigger wireguard resets. This
is done with an interface chain from controlclient down via localBackend
to userspaceEngine via the watchdog.

Once a key has been actively used for preventing a wireguard reset, the
key is removed from the map.

If mapSession becomes a long lived process instead of being dependent on
having a connection to control. This interface chain can be removed, and
the event sequence from wrap->controlClient->userspaceEngine, can be
changed to wrap->userspaceEngine->controlClient as we know the map will
not be gunked up with stale TSMP entries.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-30 14:26:08 -04:00
KevinLiang10
45f989f52a
ipn/ipnlocal: warn incompatibility between no-snat-routes and exitnode (#19023)
* ipn/ipnlocal: warn incompatibility between no-snat-routes and exitnode

This commit adds a warning to health check when the --snat-subnet-routes=false flag for subnet router is
set alone side --advertise-exit-node=true. These two would conflict with each other and result internet-bound
traffic from peers using this exit node no masqueraded to the node's source IP and fail to route return
packets back. The described combination is not valid until we figure out a way to separate exitnode masquerade rule and skip it for subnet routes.

Updates #18725

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* use date instead of for now to clarify effectivness

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

---------

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2026-03-26 12:36:31 -04:00
Fran Bull
2d5962f524 feature/conn25,ipn/ipnext,ipn/ipnlocal: add ExtraRouterConfigRoutes hook
conn25 needs to add routes to the operating system to direct handling
of the addresses in the magic IP range to the tailscale0 TUN and
tailscaled.

The way we do this for exit nodes and VIP services is that we add routes
to the Routes field of router.Config, and then the config is passed to
the WireGuard engine Reconfig.

conn25 is implemented as an ipnext.Extension and so this commit adds a
hook to ipnext.Hooks to allow any extension to provide routes to the
config. The hook if provided is called in routerConfigLocked, similarly
to exit nodes and VIP services.

Fixes tailscale/corp#38123

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-03-25 19:28:33 -07:00
kari-ts
9992b7c817
ipn,ipn/local: broadcast ClientVersion if AutoUpdate.Check (#19107)
If AutoUpdate.Check is false, the client has opted out of checking for updates, so we shouldn't broadcast ClientVersion. If the client has opted in, it should be included in the initial Notify.

Updates tailscale/corp#32629

Signed-off-by: kari-ts <kari@tailscale.com>
2026-03-24 15:06:20 -07:00
Michael Ben-Ami
ea7040eea2 ipn/{ipnext,ipnlocal}: expose authReconfig in ipnext.Host as AuthReconfigAsync
Also implement a limit of one on the number of goroutines that can be
waiting to do a reconfig via AuthReconfig, to prevent extensions from
calling too fast and taxing resources.

Even with the protection, the new method should only be used in
experimental or proof-of-concept contexts. The current intended use is
for an extension to be able force a reconfiguration of WireGuard, and
have the reconfiguration call back into the extension for extra Allowed
IPs.

If in the future if WireGuard is able to reconfigure individual peers more
dynamically, an extension might be able to hook into that process, and
this method on ipnext.Host may be deprecated.

Fixes tailscale/corp#38120
Updates tailscale/corp#38124
Updates tailscale/corp#38125

Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-03-20 17:29:11 -04:00
Brendan Creane
ffa7df2789
ipn: reject advertised routes with non-address bits set (#18649)
* ipn: reject advertised routes with non-address bits set

The config file path, EditPrefs local API, and App Connector API were
accepting invalid subnet route prefixes with non-address bits set (e.g.,
2a01:4f9:c010:c015::1/64 instead of 2a01:4f9:c010:c015::/64). All three
paths now reject prefixes where prefix != prefix.Masked() with an error
message indicating the expected masked form.

Updates tailscale/corp#36738

Signed-off-by: Brendan Creane <bcreane@gmail.com>

* address review comments

Signed-off-by: Brendan Creane <bcreane@gmail.com>

---------

Signed-off-by: Brendan Creane <bcreane@gmail.com>
2026-03-20 10:10:43 -07:00
Gesa Stupperich
ca9aa20255
ipn/ipnlocal: populate Groups field in profileFromView
This populates UserProfile.Groups in the WhoIs response from the
local backend with the groups of the corresponding user in the
netmap.

This allows tsnet apps to see (and e.g. forward) which groups a
user making a request belongs to - as long as the tsnet app runs
on a node that been granted the tailscale.com/visible-groups
capability via node attributes. If that's not the case or the
user doesn't belong to any groups allow-listed via the node
attribute, Groups won't be populated.

Updates tailscale/corp#31529

Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
2026-03-19 21:46:55 +00:00
Mike O'Driscoll
4e88d231d5
control,health,ipn: move IP forwarding check to health tracker (#19007)
Currently IP forwarding health check is done on sending MapRequests.

Move ip forwarding to the health service to gain the benefits
of the health tracker and perodic monitoring out of band from
the MapRequest path. ipnlocal now provides a closure to
the health service to provide the check if forwarding is broken.

Removed `skipIPForwardingCheck` from controlclient/direct.go,
it wasn't being used as the comments describe it, that check
has moved to ipnlocal for the closure to the health tracker.

Updates #18976

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-03-18 16:24:12 -04:00
kari-ts
4c7c1091ba
netns: add Android callback to bind socket to network (#18915)
After switching from cellular to wifi without ipv6, ForeachInterface still sees rmnet prefixes, so HaveV6 stays true, and magicsock keeps attempting ipv6 connections that either route through cellular or time out for users on wifi without ipv6

This:
-Adds SetAndroidBindToNetworkFunc, a callback to bind the socket to the selected Android Network object

Updates tailscale/tailscale#6152

Signed-off-by: kari-ts <kari@tailscale.com>
2026-03-11 12:28:28 -07:00
Brad Fitzpatrick
f905871fb1 ipn/ipnlocal, feature/ssh: move SSH code out of LocalBackend to feature
This makes tsnet apps not depend on x/crypto/ssh and locks that in with a test.

It also paves the wave for tsnet apps to opt-in to SSH support via a
blank feature import in the future.

Updates #12614

Change-Id: Ica85628f89c8f015413b074f5001b82b27c953a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-10 17:27:17 -07:00
Gesa Stupperich
6a19995f13 tailcfg: reintroduce UserProfile.Groups
This change reintroduces UserProfile.Groups, a slice that contains
the ACL-defined and synced groups that a user is a member of.

The slice will only be non-nil for clients with the node attribute
see-groups, and will only contain groups that the client is allowed
to see as per the app payload of the see-groups node attribute.

For example:
```
"nodeAttrs": [
  {
    "target": ["tag:dev"],
    "app": {
      "tailscale.com/see-groups": [{"groups": ["group:dev"]}]
    }
  },

  [...]

]
```

UserProfile.Groups will also be gated by a feature flag for the time
being.

Updates tailscale/corp#31529

Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
2026-03-09 11:08:45 +00:00
Brad Fitzpatrick
bd2a2d53d3 all: use Go 1.26 things, run most gofix modernizers
I omitted a lot of the min/max modernizers because they didn't
result in more clear code.

Some of it's older "for x := range 123".

Also: errors.AsType, any, fmt.Appendf, etc.

Updates #18682

Change-Id: I83a451577f33877f962766a5b65ce86f7696471c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-06 13:32:03 -08:00
Michael Ben-Ami
40858a61fe ipnext,ipnlocal: add ExtraWireGuardAllowedIPs hook
This hook addition is motivated by the Connectors 2025 work, in which
NATed "Transit IPs" are used to route interesting traffic to the
appropriate peer, without advertising the actual real IPs.

It overlaps with #17858, and specifically with the WIP PR #17861.
If that work completes, this hook may be replaced by other ones
that fit the new WireGuard configuration paradigm.

Fixes tailscale/corp#37146

Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-03-06 09:42:44 -05:00
Brad Fitzpatrick
2a64c03c95 types/ptr: deprecate ptr.To, use Go 1.26 new
Updates #18682

Change-Id: I62f6aa0de2a15ef8c1435032c6aa74a181c25f8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-05 20:13:18 -08:00
M. J. Fromberger
26951a1cbb
ipn/ipnlocal: skip writing netmaps to disk when disabled (#18883)
We use the TS_USE_CACHED_NETMAP knob to condition loading a cached netmap, but
were hitherto writing the map out to disk even when it was disabled. Let's not
do that; the two should travel together.

Updates #12639

Change-Id: Iee5aa828e2c59937d5b95093ea1ac26c9536721e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-03-04 15:13:30 -08:00
Mike O'Driscoll
2c9ffdd188
cmd/tailscale,ipn,net/netutil: remove rp_filter strict mode warnings (#18863)
PR #18860 adds firewall rules in the mangle table to save outbound packet
marks to conntrack and restore them on reply packets before the routing
decision. When reply packets have their marks restored, the kernel uses
the correct routing table (based on the mark) and the packets pass the
rp_filter check.

This makes the risk check and reverse path filtering warnings unnecessary.

Updates #3310
Fixes tailscale/corp#37846

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-03-04 14:09:19 -05:00
joshua stein
518d241700 netns,wgengine: add OpenBSD support to netns via an rtable
When an exit node has been set and a new default route is added,
create a new rtable in the default rdomain and add the current
default route via its physical interface.  When control() is
requesting a connection not go through the exit-node default route,
we can use the SO_RTABLE socket option to force it through the new
rtable we created.

Updates #17321

Signed-off-by: joshua stein <jcs@jcs.org>
2026-02-25 12:44:32 -08:00
Michael Ben-Ami
811fe7d18e ipnext,ipnlocal,wgengine/filter: add extension hooks for custom filter matchers
Add PacketMatch hooks to the packet filter, allowing extensions to
customize filtering decisions:

- IngressAllowHooks: checked in RunIn after pre() but before the
  standard runIn4/runIn6 match rules. Hooks can accept packets to
  destinations outside the local IP set. First match wins; the
  returned why string is used for logging.

- LinkLocalAllowHooks: checked inside pre() for both ingress and
  egress, providing exceptions to the default policy of dropping
  link-local unicast packets. First match wins. The GCP DNS address
  (169.254.169.254) is always allowed regardless of hooks.

PacketMatch returns (match bool, why string) to provide a log reason
consistent with the existing filter functions.

Hooks are registered via the new FilterHooks struct in ipnext.Hooks
and wired through to filter.Filter in LocalBackend.updateFilterLocked.

Fixes tailscale/corp#35989
Fixes tailscale/corp#37207

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-02-24 10:54:56 -05:00
M. J. Fromberger
f4aea70f7a
ipn/ipnlocal: add basic support for netmap caching (#18530)
This commit is based on ff0978ab, and extends #18497 to connect network map
caching to the LocalBackend. As implemented, only "whole" netmap values are
stored, and we do not yet handle incremental updates. As-written, the feature must
be explicitly enabled via the TS_USE_CACHED_NETMAP envknob, and must be
considered experimental.

Updates #12639

Co-Authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I48a1e92facfbf7fb3a8e67cff7f2c9ab4ed62c83
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-02-17 14:51:54 -08:00
Will Norris
a8204568d8 all: replace UserVisibleError with vizerror package
Updates tailscale/corp#9025

Signed-off-by: Will Norris <will@tailscale.com>
2026-02-16 13:20:51 -08:00
Simon Law
6854d2982b
ipn/ipnlocal: log errors when suggesting exit nodes (#18728)
In PR #18681, we started logging which exit nodes were being
suggested. However, we did not log if there were errors encountered.
This patch corrects this oversight.

Updates: tailscale/corp#29964
Updates: tailscale/corp#36446

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-02-13 18:19:27 -08:00
Simon Law
12188c0ade
ipn/ipnlocal: log traffic steering scores and suggested exit nodes (#18681)
When traffic steering is enabled, some users are suggested an exit
node that is inappropriately far from their location. This seems to
happen right when the client connects to the control plane and the
client eventually fixes itself. But whenever an affected client
reconnects, its suggested exit node flaps, and this happens often
enough to be noticeable because connections drop whenever the exit
node is switched. This should not happen, since the map response that
contains the list of suggested exit nodes that the client picks from,
also contains the scores for those nodes.

Since our current logging and diagnostic tools don’t give us enough
insight into what is happening, this PR adds additional logging when:
- traffic steering scores are used to suggest an exit node
- an exit node is suggested, no matter how it was determined

Updates: tailscale/corp#29964
Updates: tailscale/corp#36446

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-02-10 18:14:32 -08:00
Brad Fitzpatrick
dc1d811d48 magicsock, ipnlocal: revert eventbus-based node/filter updates, remove Synchronize hack
Restore synchronous method calls from LocalBackend to magicsock.Conn
for node views, filter, and delta mutations. The eventbus delivery
introduced in 8e6f63cf1 was invalid for these updates because
subsequent operations in the same call chain depend on magicsock
already having the current state. The Synchronize/settleEventBus
workaround was fragile and kept requiring more workarounds and
introducing new mystery bugs.

Since eventbus was added, we've since learned more about when to use
eventbus, and this wasn't one of the cases.

We can take another swing at using eventbus for netmap changes in a
future change.

Fixes #16369
Updates #18575 (likely fixes)

Change-Id: I79057cc9259993368bb1e350ff0e073adf6b9a8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-02-10 07:32:05 -08:00
KevinLiang10
5eaaf9786b tailcfg: add peerRelay bool to hostinfo
This commit adds a bool named PeerRelay to Hostinfo, to identify the host's status of acting as a peer relay.
Considering the RelayServerPort number can be 0, I just made this a bool in stead of a port number. If the port
info is needed in future this would also help indicating if the port was set to 0 (meaning any port in peer relay
context).

Updates tailscale/corp#35862

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2026-02-06 18:25:40 -07:00
Will Hannah
058cc3f82b
ipn/ipnlocal: skip AuthKey use if profiles exist (#18619)
If any profiles exist and an Authkey is provided via syspolicy, the
AuthKey is ignored on backend start, preventing re-auth attempts. This
is useful for one-time device provisioning scenarios, skipping authKey
use after initial setup when the authKey may no longer be valid.

updates #18618

Signed-off-by: Will Hannah <willh@tailscale.com>
2026-02-06 09:40:55 -05:00
KevinLiang10
03461ea7fb
wgengine/netstack: add local tailscale service IPs to route and terminate locally (#18461)
* wgengine/netstack: add local tailscale service IPs to route and terminate locally

This commit adds the tailscales service IPs served locally to OS routes, and
make interception to packets so that the traffic terminates locally without
making affects to the HA traffics.

Fixes tailscale/corp#34048

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* fix test

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* add ready field to avoid accessing lb before netstack starts

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* wgengine/netstack: store values from lb to avoid acquiring a lock

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* add active services to netstack on starts with stored prefs.

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* fix comments

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* update comments

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

---------

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2026-01-30 16:46:03 -05:00
Will Norris
3ec5be3f51 all: remove AUTHORS file and references to it
This file was never truly necessary and has never actually been used in
the history of Tailscale's open source releases.

A Brief History of AUTHORS files
---

The AUTHORS file was a pattern developed at Google, originally for
Chromium, then adopted by Go and a bunch of other projects. The problem
was that Chromium originally had a copyright line only recognizing
Google as the copyright holder. Because Google (and most open source
projects) do not require copyright assignemnt for contributions, each
contributor maintains their copyright. Some large corporate contributors
then tried to add their own name to the copyright line in the LICENSE
file or in file headers. This quickly becomes unwieldy, and puts a
tremendous burden on anyone building on top of Chromium, since the
license requires that they keep all copyright lines intact.

The compromise was to create an AUTHORS file that would list all of the
copyright holders. The LICENSE file and source file headers would then
include that list by reference, listing the copyright holder as "The
Chromium Authors".

This also become cumbersome to simply keep the file up to date with a
high rate of new contributors. Plus it's not always obvious who the
copyright holder is. Sometimes it is the individual making the
contribution, but many times it may be their employer. There is no way
for the proejct maintainer to know.

Eventually, Google changed their policy to no longer recommend trying to
keep the AUTHORS file up to date proactively, and instead to only add to
it when requested: https://opensource.google/docs/releasing/authors.
They are also clear that:

> Adding contributors to the AUTHORS file is entirely within the
> project's discretion and has no implications for copyright ownership.

It was primarily added to appease a small number of large contributors
that insisted that they be recognized as copyright holders (which was
entirely their right to do). But it's not truly necessary, and not even
the most accurate way of identifying contributors and/or copyright
holders.

In practice, we've never added anyone to our AUTHORS file. It only lists
Tailscale, so it's not really serving any purpose. It also causes
confusion because Tailscalars put the "Tailscale Inc & AUTHORS" header
in other open source repos which don't actually have an AUTHORS file, so
it's ambiguous what that means.

Instead, we just acknowledge that the contributors to Tailscale (whoever
they are) are copyright holders for their individual contributions. We
also have the benefit of using the DCO (developercertificate.org) which
provides some additional certification of their right to make the
contribution.

The source file changes were purely mechanical with:

    git ls-files | xargs sed -i -e 's/\(Tailscale Inc &\) AUTHORS/\1 contributors/g'

Updates #cleanup

Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
2026-01-23 15:49:45 -08:00
M. J. Fromberger
ce12863ee5
ipn/ipnlocal: manage per-profile subdirectories in TailscaleVarRoot (#18485)
In order to better manage per-profile data resources on the client, add methods
to the LocalBackend to support creation of per-profile directory structures in
local storage. These methods build on the existing TailscaleVarRoot config, and
have the same limitation (i.e., if no local storage is available, it will
report an error when used).

The immediate motivation is to support netmap caching, but we can also use this
mechanism for other per-profile resources including pending taildrop files and
Tailnet Lock authority caches.

This commit only adds the directory-management plumbing; later commits will
handle migrating taildrop, TKA, etc. to this mechanism, as well as caching
network maps.

Updates #12639

Change-Id: Ia75741955c7bf885e49c1ad99f856f669a754169
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-01-23 10:09:46 -08:00
Harry Harpham
3840183be9 tsnet: add support for Services
This change allows tsnet nodes to act as Service hosts by adding a new
function, tsnet.Server.ListenService. Invoking this function will
advertise the node as a host for the Service and create a listener to
receive traffic for the Service.

Fixes #17697
Fixes tailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-01-16 15:28:31 -07:00
Jonathan Nobels
643e91f2eb
net/netmon: move TailscaleInterfaceIndex out of netmon.State (#18428)
fixes tailscale/tailscale#18418

Both Serve and PeerAPI broke when we moved the TailscaleInterfaceName
into State, which is updated asynchronously and may not be
available when we configure the listeners.

This extracts the explicit interface name property from netmon.State
and adds as a static struct with getters that have proper error
handling.

The bug is only found in sandboxed Darwin clients, where we
need to know the Tailscale interface details in order to set up the
listeners correctly (they must bind to our interface explicitly to escape
the network sandboxing that is applied by NECP).

Currently set only sandboxed macOS and Plan9 set this but it will
also be useful on Windows to simplify interface filtering in netns.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-01-16 14:53:23 -05:00
Tom Meadows
c3b7f24051
ipn,ipn/local: always accept routes for Tailscale Services (cgnat range) (#18173)
Updates #18198

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Co-authored-by: James Tucker <raggi@tailscale.com>
2026-01-14 18:20:00 +00:00