553 Commits

Author SHA1 Message Date
Brad Fitzpatrick
b313bffbe7 control/tsp, tstest/integration/testcontrol: deflake TestMapAgainstTestControl
The test was flaky under stress with "AddRawMapResponse N: node not
connected" failures. The root cause was in testcontrol's addDebugMessage:
it conflated "no streaming poll registered" with "wake-up channel buffer
momentarily full". The single-slot updatesCh is just a lossy wake-up
signal, but the streaming serveMap loop has fast paths
(takeRawMapMessage and the hasPendingRawMapMessage continue) that don't
drain it. A stale notification could remain buffered, causing the next
sendUpdate to fail even though msgToSend had been queued and the
streaming poll would still pick it up.

Detect the real failure case (no streaming poll) by checking
s.updates[nodeID] directly, and treat sendUpdate's buffer-full result as
benign — the message is in msgToSend, which is the source of truth.

Also plumb an optional *health.Tracker through tsp.ClientOpts to the
underlying ts2021.Client and supply one in the tests, eliminating the
"## WARNING: (non-fatal) nil health.Tracker (being strict in CI)" stack
dumps emitted by controlhttp.(*Dialer).forceNoise443 under CI.

Fixes #19583

Change-Id: Ib2334376585e8d6562f000a0b71dea0117acb0ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 16:11:00 -07:00
Brad Fitzpatrick
15cba0a3f6 tstest/natlab/vmtest: add TestDiscoKeyChange
Add a vmtest that brings up two gokrazy nodes A and B behind two
One2OneNAT networks (so direct UDP works in both directions and any
slowness can't be blamed on NAT traversal), establishes a WireGuard
tunnel A → B with TSMP, then rotates B's disco key four times and
asserts that the data plane recovers in both directions after each
rotation. All pings are TSMP (the data-plane ping; disco pings would
not exercise the WireGuard tunnel itself).

The five pings:

  1. A → B  (initial; brings up the tunnel; 30s budget)
  2. B → A  after rotate (LocalAPI rotate-disco-key debug action)
  3. A → B  after rotate (LocalAPI)
  4. B → A  after restart (SIGKILL; gokrazy supervisor respawns)
  5. A → B  after restart (SIGKILL)

Each post-rotation ping gets a 15-second budget. Two unavoidable
multi-second waits dominate today:

  - The rotate-then-a→b phase takes ~10s on main because of LazyWG.
    After B's WantRunning bounce, B's wgengine resets its
    sentActivityAt/recvActivityAt maps and trims A out of the
    wireguard-go config as an "idle peer"; B only re-adds A on
    inbound activity, by which point A's first few TSMP packets
    have been silently dropped at B's tundev. The
    bradfitz/rm_lazy_wg branch removes that trimming entirely
    (verified locally: this phase drops to <100ms there).

  - The restart phases take ~5s for wireguard-go's RekeyTimeout
    handshake retry. After SIGKILL+respawn the first WG handshake
    init from the restarted node sometimes goes into the void
    (likely the brief peer-removed window in the receiver's
    two-step maybeReconfigWireguardLocked reconfig during which
    the peer is absent from wireguard-go), and wg-go's 5s+jitter
    retransmit timer is the next opportunity to retry. That retry
    succeeds and the staged TSMP packet flushes. Intrinsic to the
    protocol's retransmit policy.

Once LazyWG is removed and the first-handshake-after-reconfig race
is fixed, the budget should drop to 5s.

Supporting changes:

  ipn/ipnlocal: DebugRotateDiscoKey now toggles WantRunning off and
  back on after rotating the disco key. magicsock.Conn.RotateDiscoKey
  only resets local disco state; without also dropping wireguard-go
  session keys, peers keep encrypting with their stale per-peer
  session against us until their rekey timer fires (WireGuard has no
  data-plane signaling to invalidate sessions). Bouncing WantRunning
  runs the engine through Reconfig(empty) → authReconfig, which
  drops every peer's WG session so the next packet either way
  triggers a fresh handshake.

  ipn/ipnlocal, ipn/localapi: add a debug-only "peer-disco-keys"
  LocalAPI action ([LocalBackend.DebugPeerDiscoKeys]) that returns
  a map[NodePublic]DiscoPublic from the current netmap. Tests reach
  it via [local.Client.DebugResultJSON]. We do not surface disco
  keys via [ipnstate.PeerStatus] because adding a non-comparable
  [key.DiscoPublic] field there breaks reflect-based test helpers
  (e.g. TestFilterFormatAndSortExitNodes' use of cmp.Diff), and
  general LocalAPI clients have no need for disco keys. Since the
  debug LocalAPI is gated behind the ts_omit_debug build tag, this
  endpoint is automatically stripped from small binaries.

  cmd/tta: add /restart-tailscaled handler (Linux-only, via /proc walk)
  to drive the SIGKILL phase. On gokrazy the supervisor respawns
  tailscaled within a second.

  tstest/integration/testcontrol: add Server.AllOnline. When set,
  every peer entry in MapResponses is marked Online=true. Several
  disco-key handling fast paths in controlclient and wgengine
  (removeUnwantedDiscoUpdates, removeUnwantedDiscoUpdatesFromFull
  NetmapUpdate, the wgengine tsmpLearnedDisco fast path) only fire
  for online peers; without this flag, tests exercising disco-key
  rotation only hit the offline-peer code paths, which mask issues
  and are several seconds slower in this scenario. Finer-grained
  per-node online tracking can be added later.

  tstest/natlab/vmtest: add Env.RotateDiscoKey,
  Env.RestartTailscaled, Env.PeerDiscoKey, Node.Name, an
  [AllOnline] EnvOption that plumbs through to
  testcontrol.Server.AllOnline, and an exported
  Env.Ping(from, to, type, timeout). Ping replaces the unexported
  helper so callers can specify both a ping type (PingDisco for
  warming peer state, PingTSMP for asserting end-to-end
  connectivity) and a deadline. PeerDiscoKey returns its LocalAPI
  error so callers inside tstest.WaitFor can retry transient
  failures rather than fataling the test.

Updates #12639
Updates #13038

Change-Id: I3644f27fc30e52990ba25a3983498cc582ddb958
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 12:58:00 -07:00
Brad Fitzpatrick
fd6ae2fad4 tstest/natlab/vmtest: serialize per-platform setup with sync.Once
Two cloud-platform nodes (e.g. sr-a and sr-b in TestSiteToSite) boot in
parallel via errgroup and both call ensureCompiled and the inline image
preparation block, racing to Begin() the same shared *Step (which is
deduped by name in Env.Step). The second goroutine panics:

    panic: Step "Compile linux_amd64 binaries": Begin called in state running
    panic: Step "Prepare ubuntu-24.04 image": Begin called in state done

ensureCompiled had a TOCTOU dedup attempt (released compileMu before
doing the work, only added to the compiled set at the end), and image
preparation had no dedup at all.

Replace the compiled set with a per-key map[string]*sync.Once for each
of compile and image preparation, so concurrent callers serialize on
the Once and only the first executes Begin/work/End.

Fixes commit 02ffe5baa8ccb2b81c4cfba3b59653e2cff10e01.

Updates #13038

Change-Id: If710bcc9e0aafebf0ad5b61553bae11458d976d7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 09:54:58 -07:00
Brad Fitzpatrick
02ffe5baa8 tstest/natlab/vmtest: add macOS VM snapshot caching for fast test starts
Cache a pre-booted macOS VM snapshot on disk so subsequent test runs
restore from the snapshot instead of cold-booting. The snapshot is keyed
by the Tart base image digest and a code version constant
(macOSSnapshotCodeVersion); bumping either invalidates the cache.

Snapshot preparation (one-time):
- Boot the Tart base image with a NAT NIC (--nat-nic flag)
- Wait for SSH, compile and install cmd/tta as a LaunchDaemon
- TTA polls the host via AF_VSOCK for an IP assignment; during prep
  the host replies "wait"
- Disconnect NIC, save VM state via SIGINT

Test fast path (cached, ~7s to agent connected):
- APFS clone the snapshot, write test-specific config.json
- Launch Host.app with --disconnected-nic --attach-network --assign-ip
- VZ restores from SaveFile.vzvmsave (~5s with 4GB RAM)
- TTA's vsock poll gets the IP config, sets static IP via ifconfig
  (bypasses DHCP entirely), switches driver addr to the IP directly
  (bypasses DNS), and resets the dial context so the reverse-dial
  reconnects immediately
- TTA agent connects to test driver within ~2s of IP assignment

Key optimizations:
- 4GB RAM instead of 8GB: halves SaveFile.vzvmsave (1.4GB vs 2.4GB),
  halves restore time (5.5s vs 11s)
- AF_VSOCK IP assignment: bypasses macOS DHCP (~5-7s saved)
- Direct IP dial: bypasses DNS resolution for test-driver.tailscale
- Dial context reset: cancels stale in-flight dials from snapshot
- Kill instead of SIGINT for test VM cleanup (no state save needed)
- Parallel VM launches

Also:
- Add TestDriverIPv4/TestDriverPort constants to vnet
- Add --nat-nic and --assign-ip flags to Host.app
- Fix SIGINT handler: retain DispatchSource globally, use dispatchMain()
- Add vsock listener (port 51011) to Host.app for IP config protocol
- Add disconnectNetwork() to VMController for clean snapshot state
- Fix Makefile: set -o pipefail so xcodebuild failures aren't swallowed

Updates #13038

Change-Id: Icbab73b57af7df3ae96136fb49cda2536310f31b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 08:17:13 -07:00
Brad Fitzpatrick
4cec06b8f2 tstest/natlab/vmtest: add macOS VM screenshot streaming to web UI
When --vmtest-web is set, Host.app is launched with --screenshot-port 0
to start a localhost HTTP server that captures the VZVirtualMachineView
display. The Go test harness parses the SCREENSHOT_PORT=<port> line from
stdout, then polls every 2 seconds for JPEG thumbnails and pushes them
over WebSocket to the web dashboard.

Clicking a screenshot thumbnail opens a full-resolution image proxied
through the web UI's /screenshot/{node} endpoint.

Screenshot events are excluded from the EventBus history (they're large
and only the latest matters, stored in NodeStatus.Screenshot).

Updates #13038

Change-Id: I9bc67ddd1cc72948b33c555d4be3d8db06a41f6d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 07:48:26 -07:00
Alex Chan
bb91bb842c all: remove everything related to non-seamless key renewal
Seamless key renewal has been the default in all clients since 1.90.
We retained the ability to disable it from the control plane as a
precaution, but we haven't seen any issues that require us to disable it.

We're now removing all the code for non-seamless key renewal, because we
don't expect to turn it on again, and indeed it's been untested in the
field for three releases so might contain latent bugs!

Updates tailscale/corp#33042

Change-Id: I4b80bf07a3a50298d1c303743484169accc8844b
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-29 10:03:26 +01:00
Brad Fitzpatrick
b2d4ba04b6 tstest/natlab/vmtest: add macOS VM support using Tart base images
Add macOS VM support to the vmtest framework using Tart's pre-built
macOS images (ghcr.io/cirruslabs/macos-tahoe-base) instead of building
from IPSW. The Tart image has SIP disabled and SSH enabled.

At test time, the Tart base image's disk, NVRAM, and hardware identity
are APFS-cloned into a tailmac-compatible directory layout, and the VM
is booted headlessly via tailmac's Host.app (Virtualization.framework)
with its NIC connected to vnet's dgram socket.

New features:
- tailmac.go: ensureTartImage (auto-pull), cloneTartToTailmac (format
  conversion), startTailMacVM (launch + cleanup)
- NoAgent() node option for VMs without TTA installed
- LANPing() for ICMP reachability testing via TTA's /ping endpoint
- IsMacOS field on OSImage, with GOOS/GOARCH support
- Dgram socket listener in Start() for macOS VMs
- Fix ReadFromUnix error spam on dgram socket close in vnet

TestMacOSAndLinuxCanPing verifies a macOS Tart VM and a gokrazy Linux
VM can ping each other on the same vnet LAN.

Updates #13038

Change-Id: I5e73a27878abf009f780fdf11a346fc857711cff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 12:51:40 -07:00
Brad Fitzpatrick
ec7b11d986 tstest/natlab/vmtest, cmd/tta: add TestTaildrop
Add a vmtest that brings up two Ubuntu nodes, each behind its own
EasyNAT, joined to the tailnet. The sender pushes a small file via
"tailscale file cp" and the receiver fetches it via "tailscale file
get --wait", asserting that the filename and contents round-trip
unchanged.

To make Taildrop work in vmtest, three small pieces were needed:

The Linux/FreeBSD cloud-init now starts tailscaled with --statedir as
well as --state=mem:, so the daemon has a VarRoot to host Taildrop's
incoming-files directory. State itself remains in-memory (so nothing
persists across reboots); only the var-root scratch space is on disk.

vmtest.New grows a variadic EnvOption parameter and a SameTailnetUser
helper. When the option is passed, Start sets AllNodesSameUser=true
on the embedded testcontrol.Server. Cross-node Taildrop requires the
sender and receiver to share a Tailnet user (or have an explicit
PeerCapabilityFileSharingTarget granted between them, which we don't
plumb here), so TestTaildrop opts in. Existing tests don't.

cmd/tta gains /taildrop-send and /taildrop-recv handlers that wrap
"tailscale file cp" and "tailscale file get --wait", plus
Env.SendTaildropFile and Env.RecvTaildropFile helpers in vmtest that
drive them.

Updates #13038

Change-Id: I8f5f70f88106e6e2ee07780dd46fe00f8efcfdf1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 12:27:55 -07:00
Brad Fitzpatrick
4b8e0ede6d tstest/natlab/{vmtest,vnet}, cmd/tta: add TestMullvadExitNode
Add a vmtest that brings up a Tailscale client, an Ubuntu VM acting
as a Mullvad-style plain-WireGuard exit node, and a non-Tailscale
webserver, each on its own NAT'd vnet network with a distinct WAN
IP. The test exercises Tailscale's IsWireGuardOnly peer code path:
the way the control plane wires Mullvad exit nodes into a client's
netmap, including the per-client SelfNodeV4MasqAddrForThisPeer
source-IP rewrite that lets a Tailscale CGNAT IP egress through a
plain-WireGuard tunnel that has no idea what Tailscale is.

The mullvad VM doesn't run wireguard-tools or kernel WireGuard;
instead, a new TTA endpoint /wg-server-up creates a real Linux TUN
named wg0, drives it with wireguard-go (already vendored), and
configures the kernel side (ip addr/up, ip_forward, iptables NAT
MASQUERADE) so decrypted traffic from the peer egresses with the
mullvad VM's WAN IP. Userspace vs kernel WireGuard makes no
difference on the wire — what's being tested is Tailscale's
plain-WireGuard exit-node code path, not the kernel module — and
this lets the test avoid downloading and installing .deb packages
inside the VM.

Adds Env.BringUpMullvadWGServer (calls /wg-server-up, returns the
generated WG public key as a key.NodePublic), Env.SetExitNodeIP
(EditPrefs ExitNodeIP directly, for exit nodes whose IPs aren't
discoverable via TTA), Env.ControlServer (exposes the underlying
testcontrol.Server so tests can UpdateNode / SetMasqueradeAddresses
to inject custom peers), and Env.Status (fetches a node's tailscale
status, used to read the client's pubkey so we can pin it as the
WG server's only allowed peer).

The test verifies that the webserver's echoed source IP is the
client's WAN with no exit node selected, the mullvad VM's WAN with
the WG-only peer selected as exit, and the client's WAN again after
clearing.

Updates #13038

Change-Id: I5bac4e0d832f05929f12cb77fa9946d7f5fb5ef1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 11:31:48 -07:00
Brad Fitzpatrick
b9eac14ef9 tstest/natlab/vmtest: add web UI for watching VM tests live
Add an optional --vmtest-web flag that starts an HTTP server showing a
live dashboard for vmtest runs. The dashboard includes:

- Step progress tracker showing all test phases (compile, image prep,
  QEMU launch, agent connect, tailscale up, test-specific steps)
  with status icons and elapsed times
- Per-VM "virtual monitor" cards showing serial console output
  streamed in realtime via WebSocket
- Per-NIC DHCP status (supporting multi-homed VMs like subnet routers)
- Per-node Tailscale status (hidden for non-tailnet VMs)
- Test status badge (Running/Passed/Failed) with live elapsed timer
- Event log showing all lifecycle events chronologically

Architecture follows the existing util/eventbus HTMX+WebSocket pattern:
the server pushes HTML fragments with hx-swap-oob attributes over a
WebSocket, and HTMX routes them to the correct DOM elements by ID.

Key components:
- vmstatus.go: Step tracker (Begin/End lifecycle), EventBus (pub/sub
  with history for late joiners), VMEvent types, NodeStatus tracking
- web.go: HTTP server, WebSocket handler, template loading, ANSI-to-HTML
  conversion via robert-nix/ansihtml, deterministic port selection
- assets/: HTML templates, CSS, HTMX library (copied from eventbus)
- vnet/vnet.go: DHCP event callback on Server for observing DHCP lifecycle
- qemu.go: Console log file tailing with manual offset-based reading

Usage:
  go test ./tstest/natlab/vmtest/ --run-vm-tests --vmtest-web=:0 -v

When using :0, a deterministic port based on the test name is tried
first so re-runs get the same URL, falling back to OS-assigned on
conflict.

Updates #13038

Change-Id: I45281347b3d7af78ed9f4ff896033984f84dcb4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 07:46:04 -07:00
Brad Fitzpatrick
cb239808a6 tstest/natlab/vmtest: add --test-version flag
Add a --test-version flag to run the natlab VM tests against
released tailscale/tailscaled binaries downloaded from
pkgs.tailscale.com instead of building from the source tree.

The value can be a concrete release like "1.97.255", or "stable" /
"unstable" which resolve to the latest TarballsVersion on that track
via pkgs.tailscale.com/<track>/?mode=json. The track for a concrete
version is derived from its minor (even=stable, odd=unstable). The
host architecture (amd64 or arm64) selects the tarball.

Tarballs are cached + extracted under
~/.cache/tailscale-vmtest/builds/<version>_<arch>/ so they are not
re-fetched per test. tta is still always built from the local tree.
Cloud VMs (Ubuntu, Debian) pick up the downloaded binaries via the
existing files.tailscale file server. Non-Linux GOOS (FreeBSD) falls
back to building from source since pkgs.tailscale.com only ships
Linux tarballs. Gokrazy nodes continue to use binaries baked into
the gokrazy image; --test-version is a no-op for them.

Updates #13038

Change-Id: I213ef7db362dd17bf69d2685cbf2ab0ec5a3fee1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 06:59:26 -07:00
Brad Fitzpatrick
d0ae993334 tstest/natlab/vmtest: add more subnet router tests
Add two tests building on TestExitNode's framework:

TestSubnetRouterPublicIP brings up a client, a subnet router, and a
webserver, each on its own NAT'd network with distinct WAN IPs. The
subnet router advertises the webserver's network as a route. The test
toggles the client's --accept-routes preference and asserts that the
webserver's echoed source IP switches between the client's own WAN
(direct dial) and the subnet router's WAN (forwarded through the
router and SNAT'd).

TestSubnetRouterAndExitNode adds a fourth node, an exit node that
advertises 0.0.0.0/0 + ::/0, and uses a table-driven layout with
subtests to cover the four combinations of (exit on/off, subnet
on/off). The case where both are on confirms longest-prefix match
wins: the subnet router's /24 takes precedence over the exit node's
/0. The exit node itself is configured with --accept-routes=off so
that, in the exit-only case, it forwards directly to the simulated
internet rather than re-routing the forwarded traffic via the subnet
router (which would otherwise mask the exit node's WAN as the
observed source).

Adds an Env.SetAcceptRoutes helper for toggling the RouteAll pref via
EditPrefs, used by both tests.

Updates #13038

Change-Id: Ifc2726db1df2f039c477c222484f535bebc40445
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 17:06:17 -07:00
Brad Fitzpatrick
c0e6ffed0d tstest/tailmac: add NIC hot-swap, disconnected NIC, and screenshot server
Add NIC attachment hot-swap support to Host.app: VZNetworkDevice.attachment
is writable at runtime, so --disconnected-nic creates a NIC with no
attachment, and --attach-network hot-swaps it to a vnet dgram socket
after boot/restore. macOS detects link-up and does DHCP.

Refactor TailMacConfigHelper: extract createDgramAttachment() and
createDisconnectedNetworkDeviceConfiguration() from the monolithic
createSocketNetworkDeviceConfiguration().

Add --screenshot-port flag for headless mode. Host.app serves GET
/screenshot as JPEG via a localhost HTTP server, capturing the
VZVirtualMachineView via CGWindowListCreateImage. The Go test harness
polls these to push live thumbnails to the web dashboard.

Also: SIGINT handler in headless mode for clean VM state save.

Updates #13038

Change-Id: I42fba0ecd760371b4ec5b26a0557e3dd0ba9ecae
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 17:03:09 -07:00
Brad Fitzpatrick
5c1738fd56 tstest/natlab/{vmtest,vnet}, cmd/tta: add TestExitNode
Add a vmtest TestExitNode that brings up a client, two exit nodes, and a
non-Tailscale webserver, each on its own NAT'd vnet network with a
distinct WAN IP. The test cycles the client's exit node setting between
off, exit1, and exit2 and asserts that the webserver echoes the expected
post-NAT source IP for each.

Three pieces were needed to make this work:

vnet now forwards TCP between simulated networks at the packet level,
mirroring the existing UDP path. When a guest VM sends TCP to another
simulated network's WAN IP, the source network's gateway rewrites src
via doNATOut and routeTCPPacket hands the packet off to the destination
network, which rewrites dst via doNATIn and writes the rewritten frame
onto the destination LAN. The TCP stacks of the two guest VM kernels
talk end-to-end; vnet just NATs the IP/port headers in flight, so all
TCP semantics (handshakes, options, sequence numbers, payload) are
preserved without a gvisor TCP termination in the middle. Adds a
focused TestInterNetworkTCP that exercises this path without any
Tailscale machinery.

cmd/tta binds its outbound dial to the default route's interface using
SO_BINDTODEVICE. Without that, the moment tailscaled installs
0.0.0.0/0 → tailscale0 in response to setting an exit node, TTA's
existing TCP connection to test-driver gets rerouted through the exit
node. From the test driver's perspective the connection's packets then
arrive with the exit node's WAN IP as the source rather than the
client's, so they don't match the existing flow and the connection is
dead — manifesting in the test as a hang on EditPrefs (which had
actually completed in milliseconds on the daemon side, but whose
response never made it back). Pinning the socket to the underlying NIC
keeps TTA's agent connection on a real interface regardless of any
policy routing tailscaled installs later. We bind rather than carry the
Tailscale bypass fwmark because the fwmark approach is conditional on
tailscaled having configured SO_MARK-based policy routing, while
binding is unconditional.

vmtest grows an Env.SetExitNode helper that sets ExitNodeIP via
EditPrefs through the agent, used by the new test.

Updates #13038

Change-Id: I9fc8f91848b7aa2297ef3eaf71fed9d96056a024
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 16:54:20 -07:00
Alex Chan
10b63f27ce tstest/clock: explain what happens if you don't set a Start time
While working on #19444, I assumed that omitting `Start` would return a
clock that started at January 1, year 1, because that's the zero value
for a `time.Time`, but actually it uses the current UTC time instead.

This behaviour is non-obvious, so document it.

Updates #cleanup

Change-Id: Id91400778578655953ff3e1671ce470db97cfe91
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-28 00:15:46 +02:00
Brad Fitzpatrick
ad5436af0d tstest/largetailnet, tstest/integration/testcontrol: add in-process large-tailnet benchmark
Add a Go benchmark that exercises a single tailnet client (a [tsnet.Server]
running in the test process) against a synthetic large initial netmap and
a stream of caller-driven peer add/remove deltas, all in-process.

The harness is split in two parts:

  - tstest/largetailnet, a reusable package containing a [Streamer]
    that hijacks the map long-poll on a [testcontrol.Server] via the new
    AltMapStream hook, sends one initial MapResponse with N synthetic
    peers, and forwards caller-supplied delta MapResponses on the same
    stream. Helpers like MakePeer / AllocPeer build synthetic peers with
    unique IDs and addresses derived from the Tailscale ULA range.

  - tstest/largetailnet/largetailnet_test.go, BenchmarkGiantTailnet
    (headless tailscaled workload, no IPN bus subscriber) and
    BenchmarkGiantTailnetBusWatcher (GUI-client workload with one
    Notify subscriber attached). Both are gated on
    --actually-test-giant-tailnet (skipped by default), stand up an
    in-process testcontrol + tsnet.Server, let Up block until the
    initial N-peer netmap has been processed, then ResetTimer and run
    add+remove pairs via b.Loop. Per-delta sync is via a test-only
    [ipnlocal.LocalBackend.AwaitNodeKeyForTest] channel that closes
    once the just-added peer key appears in the netmap (no-watcher
    variant) or via bus-Notify drain (bus-watcher variant).

To support the hijack, [testcontrol.Server] grows an AltMapStream hook
and a small MapStreamWriter interface for benchmarks/stress tests that
need to drive a controlled MapResponse sequence; the normal serveMap
path is untouched when AltMapStream is nil. The streamer answers
non-streaming "lite" map polls (which controlclient issues before the
streaming long-poll to push HostInfo) with an empty MapResponse and
returns immediately, so the streaming poll that follows is the one
that gets the initial netmap.

The benchmark is intended for before/after comparisons of netmap- and
delta-handling changes targeted at large tailnets. CPU profiles on
unmodified main show the expected O(N) hotspots:
setControlClientStatusLocked / authReconfigLocked /
userspaceEngine.Reconfig / setNetMapLocked, plus JSON encoding of the
full Notify.NetMap to bus watchers (which dominates the BusWatcher
variant).

Median ms/op over 10 runs on unmodified main, by tailnet size N:

       N      no-watcher   bus-watcher
   10000          32          166
   50000         222          865
  100000         504         1765
  250000        1551         4696

Recommended invocation:

	go test ./tstest/largetailnet/ -run=^$ \
	    -bench='BenchmarkGiantTailnet(BusWatcher)?$' \
	    -benchtime=2000x -timeout=10m \
	    --actually-test-giant-tailnet \
	    --giant-tailnet-n=250000 \
	    -cpuprofile=/tmp/giant.cpu.pprof

Updates #12542

Change-Id: I4f5b2bb271a36ba853d5a0ffe82054ef2b15c585
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 11:47:12 -07:00
Brad Fitzpatrick
f289f7e77c tstest/natlab/vmtest,cmd/tta: add TestSiteToSite
Verifies that site-to-site Tailscale subnet routing with
--snat-subnet-routes=false preserves the original source IP
end-to-end.

Topology: two sites, each with a Linux subnet router on a NATted WAN
plus an internal LAN, and a non-Tailscale backend on each LAN. Backends
are given static routes pointing to their local subnet router for the
remote site's prefix; an HTTP GET from backend-a to backend-b over
Tailscale returns a body containing backend-a's LAN IP.

Adds the supporting vmtest.SNATSubnetRoutes NodeOption and plumbs
snat-subnet-routes through TTA's /up handler. The webserver started by
vmtest.WebServer now also echoes the remote IP, for the preservation
assertion.

Adds a /add-route TTA endpoint (Linux-only for now) and a vmtest
Env.AddRoute helper so the test can install the backend static routes
through TTA rather than needing a host SSH key and debug NIC.

ensureGokrazy now always rebuilds the natlab qcow2 (once per test
process, via sync.Once) so the test picks up the new TTA and webserver
behavior.

This is pulled out of a larger pending change that adds FreeBSD
site-to-site subnet routing support; figured we should have at least
the Linux test covering what works today.

Updates #5573

Change-Id: I881c55b0f118ac9094546b5fbe68dddf179bb042
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-22 12:11:30 -07:00
Brad Fitzpatrick
dfc2667f8f tstest/integration/testcontrol: make Stream w/ capver >= 68 match docs, prod
testcontrol wasn't following the document specs (and prod behavior) breaking
a WIP integration test elsewhere.

Updates tailscale/corp#40088

Change-Id: I02cf70894346bad7c85940b617d99c21c5310664
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-20 07:34:04 -07:00
Tom Proctor
c2da563fef
tstest/integration/vms: skip cloud-init package updates (#19443)
The package updates started getting really slow yesterday. We can do
better, but attempt a band aid fix for now, as the test is failing about
a third of the time on PR CI.

Updates tailscale/corp#40465

Change-Id: Icf53292ba83dd1ed76b9bdf9fb94a8f6fb448c07

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2026-04-17 10:39:47 +01:00
Anton Tolchanov
958bcda5bf control/controlclient: handle 429 responses during node registration
If we get a 429 response during node registration, use the `Retry-After`
header for backoff instead of the regular exponential backoff.

The rate limiter error is propagated to the user, just like other
registration errors are, e.g.

```
$ tailscale up
backend error: node registration rate limited; will retry after 57s
exit status 1
```

Updates tailscale/corp#39533

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2026-04-15 18:54:08 +01:00
Claus Lensbøl
61c95f409c
control/controlclient: accept key if last seen on exist node is absent (#19402)
On some nodes (found via natlab), the existing nodes last seen could be
unset. For these cases, we would want to accept the key and write a last
seen. This was breaking the cached netmap natlab tests.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-15 03:53:40 -04:00
Naman Sood
6301a6ce4b
util/linuxfw,wgengine/router: allow incoming CGNAT range traffic with nodeattr
Clients with the newly added node attribute
`"disable-linux-cgnat-drop-rule"` will not automatically drop inbound
traffic on non-Tailscale network interfaces with the source IP in the
CGNAT IP range. This is an initial proof-of-concept for enabling
connectivity with off-Tailnet CGNAT endpoints.

Fixes tailscale/corp#36270.

Signed-off-by: Naman Sood <mail@nsood.in>
2026-04-14 16:45:06 -04:00
Brad Fitzpatrick
a0a8fae856 tstest/integration: use linkat to hardlink test binaries on Linux
Use linkat via /proc/self/fd with AT_SYMLINK_FOLLOW to create a
hardlink of the test binary instead of copying it. This avoids
copying ~50MB+ binaries into each test's temp directory, making
test setup faster and reducing disk I/O.

The simpler os.Link(b.Path, ret.Path) can't be used here because
the source binary lives in the first test's TempDir, which may be
cleaned up before later tests call CopyTo. The open FD keeps the
inode alive after the path is deleted, but os.Link needs a valid
path. (See also b9f468240f which tried os.Link but is racy for
this reason.)

The /proc/self/fd approach works without elevated privileges,
unlike AT_EMPTY_PATH which requires CAP_DAC_READ_SEARCH. If the
linkat fails for any reason (e.g. cross-filesystem temp dirs), it
falls back to the existing full-copy path.

Fixes #19397

Change-Id: I4b1f97f7e63a9ae9e09dce36dfbdd1f6cff92320
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 07:13:10 -07:00
Avery Pennarun
621dc9cf1b tstest: fix kernel version parsing for Debian-style version strings
The kernel version parser used strings.Cut with "-" to handle versions
like "5.4.0-76-generic", but Debian uses "+" in versions like
"6.12.41+deb13-amd64".

Use strings.IndexAny to find the first "-" or "+" and truncate there.

Fixes TestKernelVersion on Debian systems.

Fixes #19395

Change-Id: I70e5f95682d54baf908e51f9f4b51c130b00aaaa
Co-Authored-By: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-14 07:11:44 -07:00
Avery Pennarun
ab74ea0a67 tstest/integration: clear SSH_CLIENT env to prevent false positive detection
When running integration tests over SSH (e.g., in remote development
environments), the SSH_CLIENT environment variable is set. This causes
isSSHOverTailscale() to incorrectly detect an SSH session and change
behavior.

Clear SSH_CLIENT in the test node environment to prevent these false
positives.

Fixes #19393

Change-Id: I1411abf0be9704cce37051476efb04d59beed386
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 18:53:07 -07:00
Brad Fitzpatrick
7dcb378875 tstest/integration/nat, tstest/natlab/vnet: fix natlab test flake
The natlab-integrationtest CI job frequently flakes by exhausting its
3m go test timeout. The root cause is that the QEMU VMs run under
pure software emulation (TCG) with no KVM. Under TCG, the guest
kernel's timer calibration busy-loops are at the mercy of host CPU
scheduling. When two VMs boot simultaneously on a 2-core CI runner,
one VM's calibration gets starved and produces wrong results, leaving
the kernel with broken timers that prevent it from ever completing
boot — even after the other VM finishes and frees up CPU.

Additionally, the microvm machine type doesn't provide HPET hardware,
but the kernel command line specified clocksource=hpet. And the VM
image build (make natlab) ran inside the test itself, consuming most
of the 3m timeout budget before the actual test started.

Fix by:

 - Enabling KVM when /dev/kvm is available, so timer calibration
   uses real hardware timers unaffected by host CPU scheduling.

 - Adding a CI step to set /dev/kvm permissions on the GitHub
   Actions runner (ubuntu-latest provides KVM but needs a udev rule).

 - Pre-building the VM image in a separate CI step so it doesn't
   cut into the go test -timeout budget.

 - Replacing the hardcoded 60s context timeout with one derived from
   t.Deadline(), so the test uses the full -timeout budget.

 - Adding VM boot progress detection (AwaitFirstPacket) and QMP
   diagnostics, so boot failures produce clear errors instead of
   opaque "context deadline exceeded" messages.

With KVM enabled, the test passes reliably even on a single CPU core
with 3 parallel workers — a scenario that was 100% broken under TCG.

Fixes #18906

Change-Id: I4c87631a9c9678d185b9f30cb05c0f7bfa9f5c62
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 16:34:15 -07:00
Brad Fitzpatrick
dbd19e4b65 tstest: add AssertNotParallel helper
For tests to loudly declare (and panic on violation) when they're doing
something that's not safe in a parallel test.

Fixes #19385

Change-Id: If79693b0c235c146871a05ed74fa9ea75bb500f9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 16:14:33 -07:00
Brad Fitzpatrick
674f866ecc tstest/tailmac: add headless mode for automated VM testing
Add a --headless flag to the Host.app Run subcommand for running
macOS VMs without a GUI, enabling use from test frameworks.

Key changes:

  - HostCli.swift: When --headless is set, run the VM via VMController
    + RunLoop.main.run() instead of NSApplicationMain. Using the
    RunLoop (not dispatchMain) is required because VZ framework
    callbacks depend on RunLoop sources.

  - VMController.swift: Add headless parameter to createVirtualMachine
    that configures a single socket-based NIC (no NAT NIC). This
    matches the NIC configuration used when creating/saving VMs, so
    saved state restoration works correctly. A NIC count mismatch
    causes VZ to silently fail to execute guest code.

  - TailMacConfigHelper.swift: Clean up socket network device logging.

  - Config.swift: Move VM storage from ~/VM.bundle to
    ~/.cache/tailscale/vmtest/macos/.

  - TailMac.swift: Fix dispatchMain→RunLoop.main.run() in the create
    command (same VZ RunLoop requirement).

Updates #13038

Change-Id: Iea51c043aa92e8fc6257139b9f0e2e7677072fa2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-11 12:50:53 -07:00
Brad Fitzpatrick
5e81840b57 tstest: add RequireRoot helper
Start using a common helper for tests to declare that they require root.

This is step 1. A later step will then make this helper track which tests were
skipped so a subsequent pass will run these test as root.

Updates tailscale/corp#40007

Change-Id: I4979e1def0fa3691d38c83f48c89aaa443e7f62e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 10:48:50 -07:00
Brad Fitzpatrick
dca1d8eea1 tstest/natlab: add TestSubnetRouterFreeBSD with FreeBSD cloud image support
As a warm-up to making natlab support multiple operating systems,
start with an easy one (in that it's also Unixy and open source like
Linux) and add FreeBSD 15.0 as a VM OS option for the vmtest
integration test framework, and add TestSubnetRouterFreeBSD which
tests subnet routing through a FreeBSD VM (Gokrazy → FreeBSD →
Gokrazy).

Key changes:
- Add FreeBSD150 OSImage using the official FreeBSD 15.0
  BASIC-CLOUDINIT cloud image (xz-compressed qcow2)
- Add GOOS()/IsFreeBSD() methods to OSImage for cross-compilation
  and OS-specific behavior
- Handle xz-compressed image downloads in ensureImage
- Refactor compileBinaries into compileBinariesForOS to support
  multiple GOOS targets (linux, freebsd), with binaries registered
  at <goos>/<name> paths on the file server VIP
- Add FreeBSD-specific cloud-init (nuageinit) user-data generation:
  string-form runcmd (nuageinit doesn't support YAML arrays),
  fetch(1) instead of curl, FreeBSD sysctl names for IP forwarding,
  mkdir /usr/local/bin, PATH setup for tta
- Skip network-config in cidata ISO for FreeBSD (DHCP via rc.conf)

Updates tailscale/tailscale#13038

Change-Id: Ibeb4f7d02659d5cd8e3a7c3a66ee7b1a92a0110d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-09 07:49:07 -07:00
Brad Fitzpatrick
ec0b23a21f vmtest: add VM-based integration test framework
Add tstest/natlab/vmtest, a high-level framework for running multi-VM
integration tests with mixed OS types (gokrazy + Ubuntu/Debian cloud
images) connected via natlab's vnet virtual network.

The vmtest package provides:
  - Env type that orchestrates vnet, QEMU processes, and agent connections
  - OS image support (Gokrazy, Ubuntu2404, Debian12) with download/cache
  - QEMU launch per OS type (microvm for gokrazy, q35+KVM for cloud)
  - Cloud-init seed ISO generation with network-config for multi-NIC
  - Cross-compilation of test binaries for cloud VMs
  - Debug SSH NIC on cloud VMs for interactive debugging
  - Test helpers: ApproveRoutes, HTTPGet, TailscalePing, DumpStatus,
    WaitForPeerRoute, SSHExec

TTA enhancements (cmd/tta):
  - Parameterize /up (accept-routes, advertise-routes, snat-subnet-routes)
  - Add /set, /start-webserver, /http-get endpoints
  - /http-get uses local.Client.UserDial for Tailscale-routed requests
  - Fix /ping for non-gokrazy systems

TestSubnetRouter exercises a 3-VM subnet router scenario:
  client (gokrazy) → subnet-router (Ubuntu, dual-NIC) → backend (gokrazy)
  Verifies HTTP access to the backend webserver through the Tailscale
  subnet route. Passes in ~30 seconds.

Updates tailscale/tailscale#13038

Change-Id: I165b64af241d37f5f5870e796a52502fc56146fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 17:24:18 -07:00
Brad Fitzpatrick
814161303f tstest/natlab/vnet: add multi-NIC node support, DHCP fixes, and VIPs
Multi-NIC support:
  - Add nodeNIC type and node.extraNICs for secondary network interfaces
  - Add netForMAC/macForNet to route packets to the correct network by MAC
  - Update initFromConfig to allocate a MAC + LAN IP per network
  - Fix handleEthernetFrameFromVM, ServeUnixConn to use netForMAC
  - Fix MACOfIP, writeEth, WriteUDPPacketNoNAT, gVisor write path, and
    createARPResponse to use macForNet (return the MAC actually on that
    network, not the node's primary MAC)
  - Fix createDHCPResponse for multi-NIC (correct client IP and subnet)
  - Add nodeNICMac for secondary NIC MAC generation
  - Add Node accessors: NumNICs, NICMac, Networks, LanIP

DHCP fixes:
  - Include LeaseTime, SubnetMask, Router, DNS in DHCP Offer (not just
    Ack). systemd-networkd requires these to accept an Offer.
  - Fix DHCP response source IP: use gateway IP instead of echoing
    the request's destination (which was 255.255.255.255 for discovers)

New VIPs:
  - cloud-init.tailscale: serves per-node cloud-init meta-data, user-data,
    and network-config for VMs booting with nocloud datasource
  - files.tailscale: serves binary files (tta, tailscale, tailscaled)
    registered via RegisterFile for cloud VM provisioning
  - Add ControlServer() accessor for test control server

This is necessary for a three-VM natlab subnet router
integration test, coming later.

Updates #13038

Change-Id: I59f9f356bae9b5509c117265237983972dfdd5af
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 11:04:26 -07:00
Brad Fitzpatrick
ccef06b968 tstest/integration/testcontrol: notify peers when subnet routes change
When SetSubnetRoutes is called, also send updatePeerChanged to all
other connected nodes so they re-fetch their MapResponse and learn
about the updated AllowedIPs. Without this, peers never see new
subnet routes until they happen to reconnect to the control server.

Discovered while working on a three-VM natlab subnet router
integration test, coming later.

Updates #13038

Change-Id: I20e7a2fda994a8ab0e7a24240e6eae536f4f5f15
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 10:39:09 -07:00
Brad Fitzpatrick
8a7e160a6e ipn/desktop: move behind feature/condregister
Move the ipn/desktop blank import from cmd/tailscaled/tailscaled_windows.go
into feature/condregister/maybe_desktop_sessions.go, consistent with how
all other modular features are registered. tailscaled already imports
feature/condregister, so it still gets ipn/desktop on Windows.

Updates #12614

Change-Id: I92418c4bf0e67f0ab40542e47584762ac0ffa2b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-07 11:37:47 -07:00
Brad Fitzpatrick
5ef3713c9f cmd/vet: add subtestnames analyzer; fix all existing violations
Add a new vet analyzer that checks t.Run subtest names don't contain
characters requiring quoting when re-running via "go test -run". This
enforces the style guide rule: don't use spaces or punctuation in
subtest names.

The analyzer flags:
- Direct t.Run calls with string literal names containing spaces,
  regex metacharacters, quotes, or other problematic characters
- Table-driven t.Run(tt.name, ...) calls where tt ranges over a
  slice/map literal with bad name field values

Also fix all 978 existing violations across 81 test files, replacing
spaces with hyphens and shortening long sentence-like names to concise
hyphenated forms.

Updates #19242

Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-05 15:52:51 -07:00
Naman Sood
d6b626f5bb
tstest: add test for connectivity to off-tailnet CGNAT endpoints
This test is currently known-broken, but work is underway to fix it.
tailscale/corp#36270 tracks this work.

Updates tailscale/corp#36270
Fixes tailscale/corp#36272

Signed-off-by: Naman Sood <mail@nsood.in>
2026-04-02 14:44:40 -04:00
Alex Chan
edb2be1a01 cmd/tailscale: improve tailscale lock error message if no keys
Previously, running `add/remove/revoke-keys` without passing any keys
would fail with an unhelpful error:

```console
$ tailscale lock revoke-keys
generation of recovery AUM failed: sending generate-recovery-aum: 500 Internal Server Error: no provided key is currently trusted
```

or

```console
$ tailscale lock revoke-keys
generation of recovery AUM failed: sending generate-recovery-aum: 500 Internal Server Error: network-lock is not active
```

Now they fail with a more useful error:

```console
$ tailscale lock revoke-keys
missing argument, expected one or more tailnet lock keys
```

Fixes #19130

Change-Id: I9d81fe2f5b92a335854e71cbc6928e7e77e537e3
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-03-29 09:28:52 +01:00
Brad Fitzpatrick
4c91f90776 tstest/integration: add userspace-networking + proxymap WhoIs integration test
Before sending a fix for #18991, this adds an integration test that
locks in that the proxymap WhoIs code works with two nodes running as
different users, with the second node running a localhost service and
able to use its local tailscaled to identify a Tailscale connection
from the other tailscaled.

Updates #18991

Change-Id: I6fbb0810204d77d2ac558f0cc786b73e3248d031
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-13 15:01:31 -07:00
Brad Fitzpatrick
f905871fb1 ipn/ipnlocal, feature/ssh: move SSH code out of LocalBackend to feature
This makes tsnet apps not depend on x/crypto/ssh and locks that in with a test.

It also paves the wave for tsnet apps to opt-in to SSH support via a
blank feature import in the future.

Updates #12614

Change-Id: Ica85628f89c8f015413b074f5001b82b27c953a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-10 17:27:17 -07:00
Brad Fitzpatrick
99bde5a406 tstest/integration: deflake TestCollectPanic
Two issues caused TestCollectPanic to flake:

1. ETXTBSY: The test exec'd the tailscaled binary directly without
   going through StartDaemon/awaitTailscaledRunnable, so it lacked
   the retry loop that other tests use to work around a mysterious
   ETXTBSY on GitHub Actions.

2. Shared filch files: The test didn't pass --statedir or TS_LOGS_DIR,
   so all parallel test instances wrote panic logs to the shared system
   state directory (~/.local/share/tailscale). Concurrent runs would
   clobber each other's filch log files, causing the second run to not
   find the panic data from the first.

Fix both by adding awaitTailscaledRunnable before the first exec, and
passing --statedir and TS_LOGS_DIR to isolate each test's log files,
matching what StartDaemon does.

It now passes x/tools/cmd/stress.

Fixes #15865

Change-Id: If18b9acf8dbe9a986446a42c5d98de7ad8aae098
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-10 14:16:47 -07:00
Brad Fitzpatrick
bd2a2d53d3 all: use Go 1.26 things, run most gofix modernizers
I omitted a lot of the min/max modernizers because they didn't
result in more clear code.

Some of it's older "for x := range 123".

Also: errors.AsType, any, fmt.Appendf, etc.

Updates #18682

Change-Id: I83a451577f33877f962766a5b65ce86f7696471c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-06 13:32:03 -08:00
Brad Fitzpatrick
2a64c03c95 types/ptr: deprecate ptr.To, use Go 1.26 new
Updates #18682

Change-Id: I62f6aa0de2a15ef8c1435032c6aa74a181c25f8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-05 20:13:18 -08:00
Brad Fitzpatrick
2810f0c6f1 all: fix typos in comments
Fix its/it's, who's/whose, wether/whether, missing apostrophes
in contractions, and other misspellings across the codebase.

Updates #cleanup

Change-Id: I20453b81a7aceaa14ea2a551abba08a2e7f0a1d8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-05 13:52:01 -08:00
Claus Lensbøl
9657a93217
tstest/natlab: add test for no control and rotated disco key (#18261)
Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-05 16:00:36 -05:00
Brad Fitzpatrick
d784dcc61b go.toolchain.branch: switch to Go 1.26
Updates #18682

Change-Id: I1eadfab950e55d004484af880a5d8df6893e85e8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-04 21:57:05 -08:00
Fernando Serboncini
54de5daae0
tstest/integration/nat: use per-call timeout in natlab ping (#18811)
The test ping() passed the full 60s context to each PingWithOpts call,
so if the first attempt hung (DERP not yet registered), the retry loop
never reached attempt 2. Use a 2s per-call timeout instead.

Updates: #18810

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-02-25 17:41:51 -05:00
Harry Harpham
299f1bf581 testcontrol: ensure Server.UpdateNode triggers netmap updates
Updating a node on a testcontrol server should trigger netmap updates to
all connected streaming clients. This was not the case previous to this
change and consequently caused race conditions in tests. It was possible
for a test to call UpdateNode and for connected nodes to never see the
update propagate.

Updates #16340
Fixes #18703

Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-02-18 09:08:12 -07:00
Brad Fitzpatrick
371d6369cd gokrazy: use monorepo for gokrazy appliance builds (monogok)
This switches our gokrazy builds to use a new variant of cmd/gok called
opinionated about using monorepos: https://github.com/bradfitz/monogok

And with that, we can get rid of all the go.mod files and builddir forests
under gokrazy/**.

Updates #13038
Updates gokrazy/gokrazy#361

Change-Id: I9f18fbe59b8792286abc1e563d686ea9472c622d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-02-13 16:19:14 -08:00
Harry Harpham
84ee5b640b testcontrol: send updates for new DNS records or app capabilities
Two methods were recently added to the testcontrol.Server type:
AddDNSRecords and SetGlobalAppCaps. These two methods should trigger
netmap updates for all nodes connected to the Server instance, the way
that other state-change methods do (see SetNodeCapMap, for example).

This will also allow us to get rid of Server.ForceNetmapUpdate, which
was a band-aid fix to force the netmap updates which should have been
triggered by the aforementioned methods.

Fixes tailscale/corp#37102

Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-02-11 11:49:15 -07:00
Fernando Serboncini
73d09316e2
tstest: update clock to always use UTC (#18663)
Instead of relying on the local timezone, which may cause
non-deterministic behavior in some CIs, we force timezone
to be UTC on default created clocks.

Fixes: tailscale/corp#37005

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-02-11 13:47:48 -05:00