The test was flaky under stress with "AddRawMapResponse N: node not
connected" failures. The root cause was in testcontrol's addDebugMessage:
it conflated "no streaming poll registered" with "wake-up channel buffer
momentarily full". The single-slot updatesCh is just a lossy wake-up
signal, but the streaming serveMap loop has fast paths
(takeRawMapMessage and the hasPendingRawMapMessage continue) that don't
drain it. A stale notification could remain buffered, causing the next
sendUpdate to fail even though msgToSend had been queued and the
streaming poll would still pick it up.
Detect the real failure case (no streaming poll) by checking
s.updates[nodeID] directly, and treat sendUpdate's buffer-full result as
benign — the message is in msgToSend, which is the source of truth.
Also plumb an optional *health.Tracker through tsp.ClientOpts to the
underlying ts2021.Client and supply one in the tests, eliminating the
"## WARNING: (non-fatal) nil health.Tracker (being strict in CI)" stack
dumps emitted by controlhttp.(*Dialer).forceNoise443 under CI.
Fixes#19583
Change-Id: Ib2334376585e8d6562f000a0b71dea0117acb0ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a vmtest that brings up two gokrazy nodes A and B behind two
One2OneNAT networks (so direct UDP works in both directions and any
slowness can't be blamed on NAT traversal), establishes a WireGuard
tunnel A → B with TSMP, then rotates B's disco key four times and
asserts that the data plane recovers in both directions after each
rotation. All pings are TSMP (the data-plane ping; disco pings would
not exercise the WireGuard tunnel itself).
The five pings:
1. A → B (initial; brings up the tunnel; 30s budget)
2. B → A after rotate (LocalAPI rotate-disco-key debug action)
3. A → B after rotate (LocalAPI)
4. B → A after restart (SIGKILL; gokrazy supervisor respawns)
5. A → B after restart (SIGKILL)
Each post-rotation ping gets a 15-second budget. Two unavoidable
multi-second waits dominate today:
- The rotate-then-a→b phase takes ~10s on main because of LazyWG.
After B's WantRunning bounce, B's wgengine resets its
sentActivityAt/recvActivityAt maps and trims A out of the
wireguard-go config as an "idle peer"; B only re-adds A on
inbound activity, by which point A's first few TSMP packets
have been silently dropped at B's tundev. The
bradfitz/rm_lazy_wg branch removes that trimming entirely
(verified locally: this phase drops to <100ms there).
- The restart phases take ~5s for wireguard-go's RekeyTimeout
handshake retry. After SIGKILL+respawn the first WG handshake
init from the restarted node sometimes goes into the void
(likely the brief peer-removed window in the receiver's
two-step maybeReconfigWireguardLocked reconfig during which
the peer is absent from wireguard-go), and wg-go's 5s+jitter
retransmit timer is the next opportunity to retry. That retry
succeeds and the staged TSMP packet flushes. Intrinsic to the
protocol's retransmit policy.
Once LazyWG is removed and the first-handshake-after-reconfig race
is fixed, the budget should drop to 5s.
Supporting changes:
ipn/ipnlocal: DebugRotateDiscoKey now toggles WantRunning off and
back on after rotating the disco key. magicsock.Conn.RotateDiscoKey
only resets local disco state; without also dropping wireguard-go
session keys, peers keep encrypting with their stale per-peer
session against us until their rekey timer fires (WireGuard has no
data-plane signaling to invalidate sessions). Bouncing WantRunning
runs the engine through Reconfig(empty) → authReconfig, which
drops every peer's WG session so the next packet either way
triggers a fresh handshake.
ipn/ipnlocal, ipn/localapi: add a debug-only "peer-disco-keys"
LocalAPI action ([LocalBackend.DebugPeerDiscoKeys]) that returns
a map[NodePublic]DiscoPublic from the current netmap. Tests reach
it via [local.Client.DebugResultJSON]. We do not surface disco
keys via [ipnstate.PeerStatus] because adding a non-comparable
[key.DiscoPublic] field there breaks reflect-based test helpers
(e.g. TestFilterFormatAndSortExitNodes' use of cmp.Diff), and
general LocalAPI clients have no need for disco keys. Since the
debug LocalAPI is gated behind the ts_omit_debug build tag, this
endpoint is automatically stripped from small binaries.
cmd/tta: add /restart-tailscaled handler (Linux-only, via /proc walk)
to drive the SIGKILL phase. On gokrazy the supervisor respawns
tailscaled within a second.
tstest/integration/testcontrol: add Server.AllOnline. When set,
every peer entry in MapResponses is marked Online=true. Several
disco-key handling fast paths in controlclient and wgengine
(removeUnwantedDiscoUpdates, removeUnwantedDiscoUpdatesFromFull
NetmapUpdate, the wgengine tsmpLearnedDisco fast path) only fire
for online peers; without this flag, tests exercising disco-key
rotation only hit the offline-peer code paths, which mask issues
and are several seconds slower in this scenario. Finer-grained
per-node online tracking can be added later.
tstest/natlab/vmtest: add Env.RotateDiscoKey,
Env.RestartTailscaled, Env.PeerDiscoKey, Node.Name, an
[AllOnline] EnvOption that plumbs through to
testcontrol.Server.AllOnline, and an exported
Env.Ping(from, to, type, timeout). Ping replaces the unexported
helper so callers can specify both a ping type (PingDisco for
warming peer state, PingTSMP for asserting end-to-end
connectivity) and a deadline. PeerDiscoKey returns its LocalAPI
error so callers inside tstest.WaitFor can retry transient
failures rather than fataling the test.
Updates #12639
Updates #13038
Change-Id: I3644f27fc30e52990ba25a3983498cc582ddb958
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a Go benchmark that exercises a single tailnet client (a [tsnet.Server]
running in the test process) against a synthetic large initial netmap and
a stream of caller-driven peer add/remove deltas, all in-process.
The harness is split in two parts:
- tstest/largetailnet, a reusable package containing a [Streamer]
that hijacks the map long-poll on a [testcontrol.Server] via the new
AltMapStream hook, sends one initial MapResponse with N synthetic
peers, and forwards caller-supplied delta MapResponses on the same
stream. Helpers like MakePeer / AllocPeer build synthetic peers with
unique IDs and addresses derived from the Tailscale ULA range.
- tstest/largetailnet/largetailnet_test.go, BenchmarkGiantTailnet
(headless tailscaled workload, no IPN bus subscriber) and
BenchmarkGiantTailnetBusWatcher (GUI-client workload with one
Notify subscriber attached). Both are gated on
--actually-test-giant-tailnet (skipped by default), stand up an
in-process testcontrol + tsnet.Server, let Up block until the
initial N-peer netmap has been processed, then ResetTimer and run
add+remove pairs via b.Loop. Per-delta sync is via a test-only
[ipnlocal.LocalBackend.AwaitNodeKeyForTest] channel that closes
once the just-added peer key appears in the netmap (no-watcher
variant) or via bus-Notify drain (bus-watcher variant).
To support the hijack, [testcontrol.Server] grows an AltMapStream hook
and a small MapStreamWriter interface for benchmarks/stress tests that
need to drive a controlled MapResponse sequence; the normal serveMap
path is untouched when AltMapStream is nil. The streamer answers
non-streaming "lite" map polls (which controlclient issues before the
streaming long-poll to push HostInfo) with an empty MapResponse and
returns immediately, so the streaming poll that follows is the one
that gets the initial netmap.
The benchmark is intended for before/after comparisons of netmap- and
delta-handling changes targeted at large tailnets. CPU profiles on
unmodified main show the expected O(N) hotspots:
setControlClientStatusLocked / authReconfigLocked /
userspaceEngine.Reconfig / setNetMapLocked, plus JSON encoding of the
full Notify.NetMap to bus watchers (which dominates the BusWatcher
variant).
Median ms/op over 10 runs on unmodified main, by tailnet size N:
N no-watcher bus-watcher
10000 32 166
50000 222 865
100000 504 1765
250000 1551 4696
Recommended invocation:
go test ./tstest/largetailnet/ -run=^$ \
-bench='BenchmarkGiantTailnet(BusWatcher)?$' \
-benchtime=2000x -timeout=10m \
--actually-test-giant-tailnet \
--giant-tailnet-n=250000 \
-cpuprofile=/tmp/giant.cpu.pprof
Updates #12542
Change-Id: I4f5b2bb271a36ba853d5a0ffe82054ef2b15c585
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If we get a 429 response during node registration, use the `Retry-After`
header for backoff instead of the regular exponential backoff.
The rate limiter error is propagated to the user, just like other
registration errors are, e.g.
```
$ tailscale up
backend error: node registration rate limited; will retry after 57s
exit status 1
```
Updates tailscale/corp#39533
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
When SetSubnetRoutes is called, also send updatePeerChanged to all
other connected nodes so they re-fetch their MapResponse and learn
about the updated AllowedIPs. Without this, peers never see new
subnet routes until they happen to reconnect to the control server.
Discovered while working on a three-VM natlab subnet router
integration test, coming later.
Updates #13038
Change-Id: I20e7a2fda994a8ab0e7a24240e6eae536f4f5f15
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updating a node on a testcontrol server should trigger netmap updates to
all connected streaming clients. This was not the case previous to this
change and consequently caused race conditions in tests. It was possible
for a test to call UpdateNode and for connected nodes to never see the
update propagate.
Updates #16340Fixes#18703
Signed-off-by: Harry Harpham <harry@tailscale.com>
Two methods were recently added to the testcontrol.Server type:
AddDNSRecords and SetGlobalAppCaps. These two methods should trigger
netmap updates for all nodes connected to the Server instance, the way
that other state-change methods do (see SetNodeCapMap, for example).
This will also allow us to get rid of Server.ForceNetmapUpdate, which
was a band-aid fix to force the netmap updates which should have been
triggered by the aforementioned methods.
Fixestailscale/corp#37102
Signed-off-by: Harry Harpham <harry@tailscale.com>
This file was never truly necessary and has never actually been used in
the history of Tailscale's open source releases.
A Brief History of AUTHORS files
---
The AUTHORS file was a pattern developed at Google, originally for
Chromium, then adopted by Go and a bunch of other projects. The problem
was that Chromium originally had a copyright line only recognizing
Google as the copyright holder. Because Google (and most open source
projects) do not require copyright assignemnt for contributions, each
contributor maintains their copyright. Some large corporate contributors
then tried to add their own name to the copyright line in the LICENSE
file or in file headers. This quickly becomes unwieldy, and puts a
tremendous burden on anyone building on top of Chromium, since the
license requires that they keep all copyright lines intact.
The compromise was to create an AUTHORS file that would list all of the
copyright holders. The LICENSE file and source file headers would then
include that list by reference, listing the copyright holder as "The
Chromium Authors".
This also become cumbersome to simply keep the file up to date with a
high rate of new contributors. Plus it's not always obvious who the
copyright holder is. Sometimes it is the individual making the
contribution, but many times it may be their employer. There is no way
for the proejct maintainer to know.
Eventually, Google changed their policy to no longer recommend trying to
keep the AUTHORS file up to date proactively, and instead to only add to
it when requested: https://opensource.google/docs/releasing/authors.
They are also clear that:
> Adding contributors to the AUTHORS file is entirely within the
> project's discretion and has no implications for copyright ownership.
It was primarily added to appease a small number of large contributors
that insisted that they be recognized as copyright holders (which was
entirely their right to do). But it's not truly necessary, and not even
the most accurate way of identifying contributors and/or copyright
holders.
In practice, we've never added anyone to our AUTHORS file. It only lists
Tailscale, so it's not really serving any purpose. It also causes
confusion because Tailscalars put the "Tailscale Inc & AUTHORS" header
in other open source repos which don't actually have an AUTHORS file, so
it's ambiguous what that means.
Instead, we just acknowledge that the contributors to Tailscale (whoever
they are) are copyright holders for their individual contributions. We
also have the benefit of using the DCO (developercertificate.org) which
provides some additional certification of their right to make the
contribution.
The source file changes were purely mechanical with:
git ls-files | xargs sed -i -e 's/\(Tailscale Inc &\) AUTHORS/\1 contributors/g'
Updates #cleanup
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This change allows tsnet nodes to act as Service hosts by adding a new
function, tsnet.Server.ListenService. Invoking this function will
advertise the node as a host for the Service and create a listener to
receive traffic for the Service.
Fixes#17697Fixestailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
This patch adds an integration test for Tailnet Lock, checking that a node can't
talk to peers in the tailnet until it becomes signed.
This patch also introduces a new package `tstest/tkatest`, which has some helpers
for constructing a mock control server that responds to TKA requests. This allows
us to reduce boilerplate in the IPN tests.
Updates tailscale/corp#33599
Signed-off-by: Alex Chan <alexc@tailscale.com>
And fix up the TestAutoUpdateDefaults integration tests as they
weren't testing reality: the DefaultAutoUpdate is supposed to only be
relevant on the first MapResponse in the stream, but the tests weren't
testing that. They were instead injecting a 2nd+ MapResponse.
This changes the test control server to add a hook to modify the first
map response, and then makes the test control when the node goes up
and down to make new map responses.
Also, the test now runs on macOS where the auto-update feature being
disabled would've previously t.Skipped the whole test.
Updates #11502
Change-Id: If2319bd1f71e108b57d79fe500b2acedbc76e1a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
SetSubnetRoutes was not sending update notifications to nodes when their
approved routes changed, causing nodes to not fetch updated netmaps with
PrimaryRoutes populated. This resulted in TestUserMetricsRouteGauges
flaking because it waited for PrimaryRoutes to be set, which only happened
if the node happened to poll for other reasons.
Now send updateSelfChanged notification to affected nodes so they fetch
an updated netmap immediately.
Fixes#17962
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
This patch fixes several issues related to printing login and device
approval URLs, especially when `tailscale up` is interrupted:
1. Only print a login URL that will cause `tailscale up` to complete.
Don't print expired URLs or URLs from previous login attempts.
2. Print the device approval URL if you run `tailscale up` after
previously completing a login, but before approving the device.
3. Use the correct control URL for device approval if you run a bare
`tailscale up` after previously completing a login, but before
approving the device.
4. Don't print the device approval URL more than once (or at least,
not consecutively).
Updates tailscale/corp#31476
Updates #17361
## How these fixes work
This patch went through a lot of trial and error, and there may still
be bugs! These notes capture the different scenarios and considerations
as we wrote it, which are also captured by integration tests.
1. We were getting stale login URLs from the initial IPN state
notification.
When the IPN watcher was moved to before Start() in c011369, we
mistakenly continued to request the initial state. This is only
necessary if you start watching after you call Start(), because
you may have missed some notifications.
By getting the initial state before calling Start(), we'd get
a stale login URL. If you clicked that URL, you could complete
the login in the control server (if it wasn't expired), but your
instance of `tailscale up` would hang, because it's listening for
login updates from a different login URL.
In this patch, we no longer request the initial state, and so we
don't print a stale URL.
2. Once you skip the initial state from IPN, the following sequence:
* Run `tailscale up`
* Log into a tailnet with device approval
* ^C after the device approval URL is printed, but without approving
* Run `tailscale up` again
means that nothing would ever be printed.
`tailscale up` would send tailscaled the pref `WantRunning: true`,
but that was already the case so nothing changes. You never get any
IPN notifications, and in particular you never get a state change to
`NeedsMachineAuth`. This means we'd never print the device approval URL.
In this patch, we add a hard-coded rule that if you're doing a simple up
(which won't trigger any other IPN notifications) and you start in the
`NeedsMachineAuth` state, we print the device approval message without
waiting for an IPN notification.
3. Consider the following sequence:
* Run `tailscale up --login-server=<custom server>`
* Log into a tailnet with device approval
* ^C after the device approval URL is printed, but without approving
* Run `tailscale up` again
We'd print the device approval URL for the default control server,
rather than the real control server, because we were using the `prefs`
from the CLI arguments (which are all the defaults) rather than the
`curPrefs` (which contain the custom login server).
In this patch, we use the `prefs` if the user has specified any settings
(and other code will ensure this is a complete set of settings) or
`curPrefs` if it's a simple `tailscale up`.
4. Consider the following sequence: you've logged in, but not completed
device approval, and you run `down` and `up` in quick succession.
* `up`: sees state=NeedsMachineAuth
* `up`: sends `{wantRunning: true}`, prints out the device approval URL
* `down`: changes state to Stopped
* `up`: changes state to Starting
* tailscaled: changes state to NeedsMachineAuth
* `up`: gets an IPN notification with the state change, and prints
a second device approval URL
Either URL works, but this is annoying for the user.
In this patch, we track whether the last printed URL was the device
approval URL, and if so, we skip printing it a second time.
Signed-off-by: Alex Chan <alexc@tailscale.com>
This patch extends the integration tests for `tailscale up` to include tailnets
where new devices need to be approved. It doesn't change the CLI, because it's
mostly working correctly already -- these tests are just to prevent future
regressions.
I've added support for `MachineAuthorized` to mock control, and I've refactored
`TestOneNodeUpAuth` to be more flexible. It now takes a sequence of steps to
run and asserts whether we got a login URL and/or machine approval URL after
each step.
Updates tailscale/corp#31476
Updates #17361
Signed-off-by: Alex Chan <alexc@tailscale.com>
Expand the integration tests to cover a wider range of scenarios, including:
* Before and after a successful initial login
* Auth URLs and auth keys
* With and without the `--force-reauth` flag
* With and without seamless key renewal
These tests expose a race condition when using `--force-reauth` on an
already-logged in device. The command completes too quickly, preventing
the auth URL from being displayed. This issue is identified and will be
fixed in a separate commit.
Updates #17108
Signed-off-by: Alex Chan <alexc@tailscale.com>
For debugging purposes, add a new C2N endpoint returning the current
netmap. Optionally, coordination server can send a new "candidate" map
response, which the client will generate a separate netmap for.
Coordination server can later compare two netmaps, detecting unexpected
changes to the client state.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Instead of a single hard-coded C2N handler, add support for calling
arbitrary C2N endpoints via a node roundtripper.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
To support integration testing of client features that rely on it, e.g.
peer relay.
Updates tailscale/corp#30903
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Taildrop has never had an end-to-end test since it was introduced.
This adds a basic one.
It caught two recent refactoring bugs & one from 2022 (0f7da5c7dc0).
This is prep for moving the rest of Taildrop out of LocalBackend, so
we can do more refactorings with some confidence.
Updates #15812
Change-Id: I6182e49c5641238af0bfdd9fea1ef0420c112738
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Only send a stored raw map message in reply to a streaming map response.
Otherwise a non-streaming map response might pick it up first, and
potentially drop it. This guarantees that a map response sent via
AddRawMapResponse will be picked up by the main map response loop in the
client.
Fixes#15362
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
This deprecates the old "DERP string" packing a DERP region ID into an
IP:port of 127.3.3.40:$REGION_ID and just uses an integer, like
PeerChange.DERPRegion does.
We still support servers sending the old form; they're converted to
the new form internally right when they're read off the network.
Updates #14636
Change-Id: I9427ec071f02a2c6d75ccb0fcbf0ecff9f19f26f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
These erroneously blocked a recent PR, which I fixed by simply
re-running CI. But we might as well fix them anyway.
These are mostly `printf` to `print` and a couple of `!=` to `!Equal()`
Updates #cleanup
Signed-off-by: Will Norris <will@tailscale.com>
Back in the day this testcontrol package only spoke the
nacl-boxed-based control protocol, which used this.
Then we added ts2021, which didn't, but still sometimes used it.
Then we removed the old mode and didn't remove this parameter
in 2409661a0da956.
Updates #11585
Change-Id: Ifd290bd7dbbb52b681b3599786437a15bc98b6a5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise all the clients only using control/controlhttp for the
ts2021 HTTP client were also pulling in WebSocket libraries, as the
server side always needs to speak websockets, but only GOOS=js clients
speak it.
This doesn't yet totally remove the websocket dependency on Linux because
Linux has a envknob opt-in to act like GOOS=js for manual testing and force
the use of WebSockets for DERP only (not control). We can put that behind
a build tag in a future change to eliminate the dep on all GOOSes.
Updates #1278
Change-Id: I4f60508f4cad52bf8c8943c8851ecee506b7ebc9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds a new bool that can be sent down from control
to do jailing on the client side. Previously this would
only be done from control by modifying the packet filter
we sent down to clients. This would result in a lot of
additional work/CPU on control, we could instead just
do this on the client. This has always been a TODO which
we keep putting off, might as well do it now.
Updates tailscale/corp#19623
Signed-off-by: Maisem Ali <maisem@tailscale.com>
We were storing server-side lots of:
"Auth":{"Provider":"","LoginName":"","Oauth2Token":null,"AuthKey":""},
That was about 7% of our total storage of pending RegisterRequest
bodies.
Updates tailscale/corp#19327
Change-Id: Ib73842759a2b303ff5fe4c052a76baea0d68ae7d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Use the zstdframe package where sensible instead of plumbing
around our own zstd.Encoder just for stateless operations.
This causes logtail to have a dependency on zstd,
but that's arguably okay since zstd support is implicit
to the protocol between a client and the logging service.
Also, virtually every caller to logger.NewLogger was
manually setting up a zstd.Encoder anyways,
meaning that zstd was functionally always a dependency.
Updates #cleanup
Updates tailscale/corp#18514
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
* Implement missing tests for sniproxy
* Wire sniproxy to new appc package
* Add support to tsnet for routing subnet router traffic into netstack, so it can be handled
Updates: https://github.com/tailscale/corp/issues/15038
Signed-off-by: Tom DNetto <tom@tailscale.com>
Terminating traffic to IPs which are not the native IPs of the node requires
the netstack subsystem to intercept trafic to an IP it does not consider local.
This PR switches on such interception. In addition to supporting such termination,
this change will also enable exit nodes and subnet routers when running in
userspace mode.
DO NOT MERGE until 1.52 is cut.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Updates: https://github.com/tailscale/corp/issues/15038
This PR plumbs through awareness of an IPv6 SNAT/masquerade address from the wire protocol
through to the low-level (tstun / wgengine). This PR is the first in two PRs for implementing
IPv6 NAT support to/from peers.
A subsequent PR will implement the data-plane changes to implement IPv6 NAT - this is just plumbing.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Updates ENG-991
This adds a new integration test with two nodes where the first gets a
incremental MapResponse (with only PeersRemoved set) saying that the
second node disappeared.
This extends the testcontrol package to support sending raw
MapResponses to nodes.
Updates #1909
Change-Id: Iea0c25c19cf0d72b52dba5a46d01b5cc87b9b39d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We use it a number of places in different repos. Might as well make
one. Another use is coming.
Updates #cleanup
Change-Id: Ib7ce38de0db35af998171edee81ca875102349a4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now a nodeAttr: ForceBackgroundSTUN, DERPRoute, TrimWGConfig,
DisableSubnetsIfPAC, DisableUPnP.
Kept support for, but also now a NodeAttr: RandomizeClientPort.
Removed: SetForceBackgroundSTUN, SetRandomizeClientPort (both never
used, sadly... never got around to them. But nodeAttrs are better
anyway), EnableSilentDisco (will be a nodeAttr later when that effort
resumes).
Updates #8923
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>