370 Commits

Author SHA1 Message Date
Brad Fitzpatrick
dfc2667f8f tstest/integration/testcontrol: make Stream w/ capver >= 68 match docs, prod
testcontrol wasn't following the documented spec (and prod behavior), breaking
a WIP integration test elsewhere.

Updates tailscale/corp#40088

Change-Id: I02cf70894346bad7c85940b617d99c21c5310664
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-20 07:34:04 -07:00
Tom Proctor
c2da563fef
tstest/integration/vms: skip cloud-init package updates (#19443)
The package updates started getting really slow yesterday. We can do
better, but attempt a band-aid fix for now, as the test is failing about
a third of the time on PR CI.

Updates tailscale/corp#40465

Change-Id: Icf53292ba83dd1ed76b9bdf9fb94a8f6fb448c07

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2026-04-17 10:39:47 +01:00
Anton Tolchanov
958bcda5bf control/controlclient: handle 429 responses during node registration
If we get a 429 response during node registration, use the `Retry-After`
header for backoff instead of the regular exponential backoff.

The rate limiter error is propagated to the user, just like other
registration errors are, e.g.

```
$ tailscale up
backend error: node registration rate limited; will retry after 57s
exit status 1
```

Updates tailscale/corp#39533

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2026-04-15 18:54:08 +01:00
Claus Lensbøl
61c95f409c
control/controlclient: accept key if last seen on exist node is absent (#19402)
On some nodes (found via natlab), the existing node's last seen could be
unset. For these cases, we want to accept the key and write a last
seen. This was breaking the cached netmap natlab tests.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-15 03:53:40 -04:00
Naman Sood
6301a6ce4b
util/linuxfw,wgengine/router: allow incoming CGNAT range traffic with nodeattr
Clients with the newly added node attribute
`"disable-linux-cgnat-drop-rule"` will not automatically drop inbound
traffic on non-Tailscale network interfaces with the source IP in the
CGNAT IP range. This is an initial proof-of-concept for enabling
connectivity with off-Tailnet CGNAT endpoints.

Fixes tailscale/corp#36270.

Signed-off-by: Naman Sood <mail@nsood.in>
2026-04-14 16:45:06 -04:00
Brad Fitzpatrick
a0a8fae856 tstest/integration: use linkat to hardlink test binaries on Linux
Use linkat via /proc/self/fd with AT_SYMLINK_FOLLOW to create a
hardlink of the test binary instead of copying it. This avoids
copying ~50MB+ binaries into each test's temp directory, making
test setup faster and reducing disk I/O.

The simpler os.Link(b.Path, ret.Path) can't be used here because
the source binary lives in the first test's TempDir, which may be
cleaned up before later tests call CopyTo. The open FD keeps the
inode alive after the path is deleted, but os.Link needs a valid
path. (See also b9f468240f which tried os.Link but is racy for
this reason.)

The /proc/self/fd approach works without elevated privileges,
unlike AT_EMPTY_PATH which requires CAP_DAC_READ_SEARCH. If the
linkat fails for any reason (e.g. cross-filesystem temp dirs), it
falls back to the existing full-copy path.

Fixes #19397

Change-Id: I4b1f97f7e63a9ae9e09dce36dfbdd1f6cff92320
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 07:13:10 -07:00
Avery Pennarun
ab74ea0a67 tstest/integration: clear SSH_CLIENT env to prevent false positive detection
When running integration tests over SSH (e.g., in remote development
environments), the SSH_CLIENT environment variable is set. This causes
isSSHOverTailscale() to incorrectly detect an SSH session and change
behavior.

Clear SSH_CLIENT in the test node environment to prevent these false
positives.

Fixes #19393

Change-Id: I1411abf0be9704cce37051476efb04d59beed386
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 18:53:07 -07:00
Brad Fitzpatrick
7dcb378875 tstest/integration/nat, tstest/natlab/vnet: fix natlab test flake
The natlab-integrationtest CI job frequently flakes by exhausting its
3m go test timeout. The root cause is that the QEMU VMs run under
pure software emulation (TCG) with no KVM. Under TCG, the guest
kernel's timer calibration busy-loops are at the mercy of host CPU
scheduling. When two VMs boot simultaneously on a 2-core CI runner,
one VM's calibration gets starved and produces wrong results, leaving
the kernel with broken timers that prevent it from ever completing
boot — even after the other VM finishes and frees up CPU.

Additionally, the microvm machine type doesn't provide HPET hardware,
but the kernel command line specified clocksource=hpet. And the VM
image build (make natlab) ran inside the test itself, consuming most
of the 3m timeout budget before the actual test started.

Fix by:

 - Enabling KVM when /dev/kvm is available, so timer calibration
   uses real hardware timers unaffected by host CPU scheduling.

 - Adding a CI step to set /dev/kvm permissions on the GitHub
   Actions runner (ubuntu-latest provides KVM but needs a udev rule).

 - Pre-building the VM image in a separate CI step so it doesn't
   cut into the go test -timeout budget.

 - Replacing the hardcoded 60s context timeout with one derived from
   t.Deadline(), so the test uses the full -timeout budget.

 - Adding VM boot progress detection (AwaitFirstPacket) and QMP
   diagnostics, so boot failures produce clear errors instead of
   opaque "context deadline exceeded" messages.

With KVM enabled, the test passes reliably even on a single CPU core
with 3 parallel workers — a scenario that was 100% broken under TCG.

Fixes #18906

Change-Id: I4c87631a9c9678d185b9f30cb05c0f7bfa9f5c62
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 16:34:15 -07:00
Brad Fitzpatrick
5e81840b57 tstest: add RequireRoot helper
Start using a common helper for tests to declare that they require root.

This is step 1. A later step will then make this helper track which tests were
skipped so a subsequent pass will run these tests as root.

Updates tailscale/corp#40007

Change-Id: I4979e1def0fa3691d38c83f48c89aaa443e7f62e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 10:48:50 -07:00
Brad Fitzpatrick
ccef06b968 tstest/integration/testcontrol: notify peers when subnet routes change
When SetSubnetRoutes is called, also send updatePeerChanged to all
other connected nodes so they re-fetch their MapResponse and learn
about the updated AllowedIPs. Without this, peers never see new
subnet routes until they happen to reconnect to the control server.

Discovered while working on a three-VM natlab subnet router
integration test, coming later.

Updates #13038

Change-Id: I20e7a2fda994a8ab0e7a24240e6eae536f4f5f15
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 10:39:09 -07:00
Brad Fitzpatrick
8a7e160a6e ipn/desktop: move behind feature/condregister
Move the ipn/desktop blank import from cmd/tailscaled/tailscaled_windows.go
into feature/condregister/maybe_desktop_sessions.go, consistent with how
all other modular features are registered. tailscaled already imports
feature/condregister, so it still gets ipn/desktop on Windows.

Updates #12614

Change-Id: I92418c4bf0e67f0ab40542e47584762ac0ffa2b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-07 11:37:47 -07:00
Brad Fitzpatrick
5ef3713c9f cmd/vet: add subtestnames analyzer; fix all existing violations
Add a new vet analyzer that checks t.Run subtest names don't contain
characters requiring quoting when re-running via "go test -run". This
enforces the style guide rule: don't use spaces or punctuation in
subtest names.

The analyzer flags:
- Direct t.Run calls with string literal names containing spaces,
  regex metacharacters, quotes, or other problematic characters
- Table-driven t.Run(tt.name, ...) calls where tt ranges over a
  slice/map literal with bad name field values

Also fix all 978 existing violations across 81 test files, replacing
spaces with hyphens and shortening long sentence-like names to concise
hyphenated forms.
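
A simplified sketch of the kind of check involved, for illustration only. The real analyzer works on the AST, and the exact character set it rejects is not stated here, so the set below is an assumption; `needsQuoting` and `hyphenate` are hypothetical names:

```go
package main

import (
	"fmt"
	"strings"
)

// needsQuoting reports whether a subtest name contains whitespace,
// quotes, or regexp metacharacters, which make it awkward to pass to
// "go test -run". The character set is an assumption.
func needsQuoting(name string) bool {
	return strings.ContainsAny(name, " \t\n\"'`\\.+*?()[]{}|^$")
}

// hyphenate joins the whitespace-separated words of name with hyphens.
func hyphenate(name string) string {
	return strings.Join(strings.Fields(name), "-")
}

func main() {
	for _, name := range []string{"basic", "no peers (empty)", "retry-after-429"} {
		fmt.Printf("%q needsQuoting=%v\n", name, needsQuoting(name))
	}
	fmt.Println(hyphenate("two nodes same user"))
}
```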

Updates #19242

Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-05 15:52:51 -07:00
Naman Sood
d6b626f5bb
tstest: add test for connectivity to off-tailnet CGNAT endpoints
This test is currently known-broken, but work is underway to fix it.
tailscale/corp#36270 tracks this work.

Updates tailscale/corp#36270
Fixes tailscale/corp#36272

Signed-off-by: Naman Sood <mail@nsood.in>
2026-04-02 14:44:40 -04:00
Alex Chan
edb2be1a01 cmd/tailscale: improve tailscale lock error message if no keys
Previously, running `add/remove/revoke-keys` without passing any keys
would fail with an unhelpful error:

```console
$ tailscale lock revoke-keys
generation of recovery AUM failed: sending generate-recovery-aum: 500 Internal Server Error: no provided key is currently trusted
```

or

```console
$ tailscale lock revoke-keys
generation of recovery AUM failed: sending generate-recovery-aum: 500 Internal Server Error: network-lock is not active
```

Now they fail with a more useful error:

```console
$ tailscale lock revoke-keys
missing argument, expected one or more tailnet lock keys
```

Fixes #19130

Change-Id: I9d81fe2f5b92a335854e71cbc6928e7e77e537e3
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-03-29 09:28:52 +01:00
Brad Fitzpatrick
4c91f90776 tstest/integration: add userspace-networking + proxymap WhoIs integration test
Before sending a fix for #18991, this adds an integration test that
locks in that the proxymap WhoIs code works with two nodes running as
different users, with the second node running a localhost service and
able to use its local tailscaled to identify a Tailscale connection
from the other tailscaled.

Updates #18991

Change-Id: I6fbb0810204d77d2ac558f0cc786b73e3248d031
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-13 15:01:31 -07:00
Brad Fitzpatrick
f905871fb1 ipn/ipnlocal, feature/ssh: move SSH code out of LocalBackend to feature
This makes tsnet apps not depend on x/crypto/ssh and locks that in with a test.

It also paves the way for tsnet apps to opt-in to SSH support via a
blank feature import in the future.

Updates #12614

Change-Id: Ica85628f89c8f015413b074f5001b82b27c953a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-10 17:27:17 -07:00
Brad Fitzpatrick
99bde5a406 tstest/integration: deflake TestCollectPanic
Two issues caused TestCollectPanic to flake:

1. ETXTBSY: The test exec'd the tailscaled binary directly without
   going through StartDaemon/awaitTailscaledRunnable, so it lacked
   the retry loop that other tests use to work around a mysterious
   ETXTBSY on GitHub Actions.

2. Shared filch files: The test didn't pass --statedir or TS_LOGS_DIR,
   so all parallel test instances wrote panic logs to the shared system
   state directory (~/.local/share/tailscale). Concurrent runs would
   clobber each other's filch log files, causing the second run to not
   find the panic data from the first.

Fix both by adding awaitTailscaledRunnable before the first exec, and
passing --statedir and TS_LOGS_DIR to isolate each test's log files,
matching what StartDaemon does.

It now passes x/tools/cmd/stress.

Fixes #15865

Change-Id: If18b9acf8dbe9a986446a42c5d98de7ad8aae098
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-10 14:16:47 -07:00
Brad Fitzpatrick
bd2a2d53d3 all: use Go 1.26 things, run most gofix modernizers
I omitted a lot of the min/max modernizers because they didn't
result in more clear code.

Some of it's older "for x := range 123".

Also: errors.AsType, any, fmt.Appendf, etc.

Updates #18682

Change-Id: I83a451577f33877f962766a5b65ce86f7696471c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-06 13:32:03 -08:00
Brad Fitzpatrick
2a64c03c95 types/ptr: deprecate ptr.To, use Go 1.26 new
Updates #18682

Change-Id: I62f6aa0de2a15ef8c1435032c6aa74a181c25f8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-05 20:13:18 -08:00
Claus Lensbøl
9657a93217
tstest/natlab: add test for no control and rotated disco key (#18261)
Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-05 16:00:36 -05:00
Fernando Serboncini
54de5daae0
tstest/integration/nat: use per-call timeout in natlab ping (#18811)
The test ping() passed the full 60s context to each PingWithOpts call,
so if the first attempt hung (DERP not yet registered), the retry loop
never reached attempt 2. Use a 2s per-call timeout instead.

Updates: #18810

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-02-25 17:41:51 -05:00
Harry Harpham
299f1bf581 testcontrol: ensure Server.UpdateNode triggers netmap updates
Updating a node on a testcontrol server should trigger netmap updates to
all connected streaming clients. This was not the case previous to this
change and consequently caused race conditions in tests. It was possible
for a test to call UpdateNode and for connected nodes to never see the
update propagate.

Updates #16340
Fixes #18703

Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-02-18 09:08:12 -07:00
Brad Fitzpatrick
371d6369cd gokrazy: use monorepo for gokrazy appliance builds (monogok)
This switches our gokrazy builds to use monogok, a new variant of cmd/gok
that is opinionated about using monorepos: https://github.com/bradfitz/monogok

And with that, we can get rid of all the go.mod files and builddir forests
under gokrazy/**.

Updates #13038
Updates gokrazy/gokrazy#361

Change-Id: I9f18fbe59b8792286abc1e563d686ea9472c622d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-02-13 16:19:14 -08:00
Harry Harpham
84ee5b640b testcontrol: send updates for new DNS records or app capabilities
Two methods were recently added to the testcontrol.Server type:
AddDNSRecords and SetGlobalAppCaps. These two methods should trigger
netmap updates for all nodes connected to the Server instance, the way
that other state-change methods do (see SetNodeCapMap, for example).

This will also allow us to get rid of Server.ForceNetmapUpdate, which
was a band-aid fix to force the netmap updates which should have been
triggered by the aforementioned methods.

Fixes tailscale/corp#37102

Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-02-11 11:49:15 -07:00
James Tucker
1183f7a191 tstest/integration/testcontrol: fix unguarded read of DNS config
Fixes #18498

Signed-off-by: James Tucker <james@tailscale.com>
2026-01-24 14:38:48 -08:00
Will Norris
3ec5be3f51 all: remove AUTHORS file and references to it
This file was never truly necessary and has never actually been used in
the history of Tailscale's open source releases.

A Brief History of AUTHORS files
---

The AUTHORS file was a pattern developed at Google, originally for
Chromium, then adopted by Go and a bunch of other projects. The problem
was that Chromium originally had a copyright line only recognizing
Google as the copyright holder. Because Google (and most open source
projects) do not require copyright assignment for contributions, each
contributor maintains their copyright. Some large corporate contributors
then tried to add their own name to the copyright line in the LICENSE
file or in file headers. This quickly becomes unwieldy, and puts a
tremendous burden on anyone building on top of Chromium, since the
license requires that they keep all copyright lines intact.

The compromise was to create an AUTHORS file that would list all of the
copyright holders. The LICENSE file and source file headers would then
include that list by reference, listing the copyright holder as "The
Chromium Authors".

This also became cumbersome to simply keep the file up to date with a
high rate of new contributors. Plus it's not always obvious who the
copyright holder is. Sometimes it is the individual making the
contribution, but many times it may be their employer. There is no way
for the project maintainer to know.

Eventually, Google changed their policy to no longer recommend trying to
keep the AUTHORS file up to date proactively, and instead to only add to
it when requested: https://opensource.google/docs/releasing/authors.
They are also clear that:

> Adding contributors to the AUTHORS file is entirely within the
> project's discretion and has no implications for copyright ownership.

It was primarily added to appease a small number of large contributors
that insisted that they be recognized as copyright holders (which was
entirely their right to do). But it's not truly necessary, and not even
the most accurate way of identifying contributors and/or copyright
holders.

In practice, we've never added anyone to our AUTHORS file. It only lists
Tailscale, so it's not really serving any purpose. It also causes
confusion because Tailscalars put the "Tailscale Inc & AUTHORS" header
in other open source repos which don't actually have an AUTHORS file, so
it's ambiguous what that means.

Instead, we just acknowledge that the contributors to Tailscale (whoever
they are) are copyright holders for their individual contributions. We
also have the benefit of using the DCO (developercertificate.org) which
provides some additional certification of their right to make the
contribution.

The source file changes were purely mechanical with:

    git ls-files | xargs sed -i -e 's/\(Tailscale Inc &\) AUTHORS/\1 contributors/g'

Updates #cleanup

Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
2026-01-23 15:49:45 -08:00
Harry Harpham
3840183be9 tsnet: add support for Services
This change allows tsnet nodes to act as Service hosts by adding a new
function, tsnet.Server.ListenService. Invoking this function will
advertise the node as a host for the Service and create a listener to
receive traffic for the Service.

Fixes #17697
Fixes tailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-01-16 15:28:31 -07:00
Alex Chan
b7658a4ad2 tstest/integration: add integration test for Tailnet Lock
This patch adds an integration test for Tailnet Lock, checking that a node can't
talk to peers in the tailnet until it becomes signed.

This patch also introduces a new package `tstest/tkatest`, which has some helpers
for constructing a mock control server that responds to TKA requests. This allows
us to reduce boilerplate in the IPN tests.

Updates tailscale/corp#33599

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-11-26 11:54:48 +00:00
Brad Fitzpatrick
ac0b15356d tailcfg, control/controlclient: start moving MapResponse.DefaultAutoUpdate to a nodeattr
And fix up the TestAutoUpdateDefaults integration tests as they
weren't testing reality: the DefaultAutoUpdate is supposed to only be
relevant on the first MapResponse in the stream, but the tests weren't
testing that. They were instead injecting a 2nd+ MapResponse.

This changes the test control server to add a hook to modify the first
map response, and then makes the test control when the node goes up
and down to make new map responses.

Also, the test now runs on macOS where the auto-update feature being
disabled would've previously t.Skipped the whole test.

Updates #11502

Change-Id: If2319bd1f71e108b57d79fe500b2acedbc76e1a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-11-25 10:45:34 -08:00
Andrew Dunham
a20cdb5c93 tstest/integration/testcontrol: de-flake TestUserMetricsRouteGauges
SetSubnetRoutes was not sending update notifications to nodes when their
approved routes changed, causing nodes to not fetch updated netmaps with
PrimaryRoutes populated. This resulted in TestUserMetricsRouteGauges
flaking because it waited for PrimaryRoutes to be set, which only happened
if the node happened to poll for other reasons.

Now send updateSelfChanged notification to affected nodes so they fetch
an updated netmap immediately.

Fixes #17962

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2025-11-23 21:13:23 -05:00
Andrew Lytvynov
c679aaba32
cmd/tailscaled,ipn: show a health warning when state store fails to open (#17883)
With the introduction of node sealing, store.New fails in some cases due
to the TPM device being reset or unavailable. Currently it results in
tailscaled crashing at startup, which is not obvious to the user until
they check the logs.

Instead of crashing tailscaled at startup, start with an in-memory store
with a health warning about state initialization and a link to (future)
docs on what to do. When this health message is set, also block any
login attempts to avoid masking the problem with an ephemeral node
registration.

Updates #15830
Updates #17654

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2025-11-20 15:52:58 -06:00
Alex Chan
c2e474e729 all: rename variables with lowercase-l/uppercase-I
See http://go/no-ell

Signed-off-by: Alex Chan <alexc@tailscale.com>

Updates #cleanup

Change-Id: I8c976b51ce7a60f06315048b1920516129cc1d5d
2025-11-18 09:12:34 +00:00
Brad Fitzpatrick
653d0738f9 types/netmap: remove PrivateKey from NetworkMap
It's an unnecessary nuisance having it. We go out of our way to redact
it in so many places when we don't even need it there anyway.

Updates #12639

Change-Id: I5fc72e19e9cf36caeb42cf80ba430873f67167c3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-11-16 15:32:51 -08:00
Brad Fitzpatrick
edb11e0e60 wgengine/magicsock: fix js/wasm crash regression loading non-existent portmapper
Thanks for the report, @Need-an-AwP!

Fixes #17681
Updates #9394

Change-Id: I2e0b722ef9b460bd7e79499192d1a315504ca84c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-28 08:59:00 -07:00
Alex Chan
c961d58091 cmd/tailscale: improve the error message for lock log with no lock
Previously, running `tailscale lock log` in a tailnet without Tailnet
Lock enabled would return a potentially confusing error:

    $ tailscale lock log
    2025/10/20 11:07:09 failed to connect to local Tailscale service; is Tailscale running?

It would return this error even if Tailscale was running.

This patch fixes the error to be:

    $ tailscale lock log
    Tailnet Lock is not enabled

Fixes #17586

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-20 12:15:57 +01:00
Alex Chan
b7fe1cea9f cmd/tailscale/cli: only print authURLs and device approval URLs once
This patch fixes several issues related to printing login and device
approval URLs, especially when `tailscale up` is interrupted:

1.  Only print a login URL that will cause `tailscale up` to complete.
    Don't print expired URLs or URLs from previous login attempts.

2.  Print the device approval URL if you run `tailscale up` after
    previously completing a login, but before approving the device.

3.  Use the correct control URL for device approval if you run a bare
    `tailscale up` after previously completing a login, but before
    approving the device.

4.  Don't print the device approval URL more than once (or at least,
    not consecutively).

Updates tailscale/corp#31476
Updates #17361

## How these fixes work

This patch went through a lot of trial and error, and there may still
be bugs! These notes capture the different scenarios and considerations
as we wrote it, which are also captured by integration tests.

1.  We were getting stale login URLs from the initial IPN state
    notification.

    When the IPN watcher was moved to before Start() in c011369, we
    mistakenly continued to request the initial state. This is only
    necessary if you start watching after you call Start(), because
    you may have missed some notifications.

    By getting the initial state before calling Start(), we'd get
    a stale login URL. If you clicked that URL, you could complete
    the login in the control server (if it wasn't expired), but your
    instance of `tailscale up` would hang, because it's listening for
    login updates from a different login URL.

    In this patch, we no longer request the initial state, and so we
    don't print a stale URL.

2.  Once you skip the initial state from IPN, the following sequence:

    *   Run `tailscale up`
    *   Log into a tailnet with device approval
    *   ^C after the device approval URL is printed, but without approving
    *   Run `tailscale up` again

    means that nothing would ever be printed.

    `tailscale up` would send tailscaled the pref `WantRunning: true`,
    but that was already the case so nothing changes. You never get any
    IPN notifications, and in particular you never get a state change to
    `NeedsMachineAuth`. This means we'd never print the device approval URL.

    In this patch, we add a hard-coded rule that if you're doing a simple up
    (which won't trigger any other IPN notifications) and you start in the
    `NeedsMachineAuth` state, we print the device approval message without
    waiting for an IPN notification.

3.  Consider the following sequence:

    *   Run `tailscale up --login-server=<custom server>`
    *   Log into a tailnet with device approval
    *   ^C after the device approval URL is printed, but without approving
    *   Run `tailscale up` again

    We'd print the device approval URL for the default control server,
    rather than the real control server, because we were using the `prefs`
    from the CLI arguments (which are all the defaults) rather than the
    `curPrefs` (which contain the custom login server).

    In this patch, we use the `prefs` if the user has specified any settings
    (and other code will ensure this is a complete set of settings) or
    `curPrefs` if it's a simple `tailscale up`.

4.  Consider the following sequence: you've logged in, but not completed
    device approval, and you run `down` and `up` in quick succession.

    *   `up`: sees state=NeedsMachineAuth
    *   `up`: sends `{wantRunning: true}`, prints out the device approval URL
    *   `down`: changes state to Stopped
    *   `up`: changes state to Starting
    *   tailscaled: changes state to NeedsMachineAuth
    *   `up`: gets an IPN notification with the state change, and prints
        a second device approval URL

    Either URL works, but this is annoying for the user.

    In this patch, we track whether the last printed URL was the device
    approval URL, and if so, we skip printing it a second time.
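
The de-duplication in point 4 could look roughly like the sketch below. It is an illustration only: the `urlPrinter` type, its method names, and the message wording are all hypothetical and are not taken from the CLI source.

```go
package main

import "fmt"

// urlPrinter remembers whether the last thing it printed was a device
// approval URL, so the same prompt is not printed twice in a row.
type urlPrinter struct {
	lastWasApproval bool
}

// approval prints the device approval URL unless the previous message
// was also a device approval URL. It reports whether it printed.
func (p *urlPrinter) approval(url string) bool {
	if p.lastWasApproval {
		return false
	}
	fmt.Printf("device approval required: %s\n", url)
	p.lastWasApproval = true
	return true
}

// login prints a login URL and resets the de-duplication state.
func (p *urlPrinter) login(url string) {
	fmt.Printf("log in at: %s\n", url)
	p.lastWasApproval = false
}

func main() {
	var p urlPrinter
	p.approval("https://example.com/approve") // printed
	p.approval("https://example.com/approve") // skipped: consecutive duplicate
	p.login("https://example.com/login")      // printed
	p.approval("https://example.com/approve") // printed again
}
```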

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-08 18:00:29 +01:00
Alex Chan
bb6bd46570 tstest/integration: log all the output printed by tailscale up
Updates tailscale/corp#31476
Updates #17361

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-08 18:00:29 +01:00
Alex Chan
06f12186d9 tstest/integration: test tailscale up when device approval is required
This patch extends the integration tests for `tailscale up` to include tailnets
where new devices need to be approved. It doesn't change the CLI, because it's
mostly working correctly already -- these tests are just to prevent future
regressions.

I've added support for `MachineAuthorized` to mock control, and I've refactored
`TestOneNodeUpAuth` to be more flexible. It now takes a sequence of steps to
run and asserts whether we got a login URL and/or machine approval URL after
each step.

Updates tailscale/corp#31476
Updates #17361

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-08 18:00:29 +01:00
Alex Chan
6db8957744 tstest/integration: mark TestPeerRelayPing as flaky
Updates #17251

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-06 17:01:02 +01:00
Alex Chan
59a39841c3 tstest/integration: mark TestClientSideJailing as flaky
Updates #17419

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-10-03 17:58:08 +01:00
Brad Fitzpatrick
c2f37c891c all: use Go 1.20's errors.Join instead of our multierr package
Updates #7123

Change-Id: Ie9be6814831f661ad5636afcd51d063a0d7a907d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-10-01 08:10:59 -07:00
Brad Fitzpatrick
442a3a779d feature, net/tshttpproxy: pull out support for using proxies as a feature
Saves 139 KB.

Also Synology support, which I saw had its own large-ish proxy parsing
support on Linux, but support for proxies without Synology proxy
support is reasonable, so I pulled that out as its own thing.

Updates #12614

Change-Id: I22de285a3def7be77fdcf23e2bec7c83c9655593
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-30 10:25:56 -07:00
Brad Fitzpatrick
038cdb4640 feature/clientupdate: move clientupdate to a modular feature, disabled for tsnet
Updates #12614

Change-Id: I5f685dec84a5396b7c2b66f2788ae3d286e1ddc6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-29 16:08:54 -07:00
Brad Fitzpatrick
01e645fae1 util/backoff: rename logtail/backoff package to util/backoff
It has nothing to do with logtail and is confusingly named like that.

Updates #cleanup
Updates #17323

Change-Id: Idd34587ba186a2416725f72ffc4c5778b0b9db4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-28 11:55:07 -07:00
Irbe Krumina
7df7e01d0f
tstest/integration/vms,.github/workflows: bump Ubuntu and NixOS for VM tests + cleanup (#16098)
This PR cleans up a bunch of things in ./tstest/integration/vms:

- Bumps version of Ubuntu that's actually run from CI 20.04 -> 24.04
- Removes Ubuntu 18.04 test
- Bumps NixOS 21.05 -> 25.05

Updates #cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-09-27 08:23:58 +01:00
Brad Fitzpatrick
afe909664b types/opt: de-weird the API a bit with new True and False consts
Updates #cleanup

Change-Id: I15d8d840877d43e2b884d42354b4eb156094df7d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-26 12:58:25 -07:00
Brad Fitzpatrick
e7a79ef5f1 tstest/integration: deflake TestC2NDebugNetmap, disable service collection
Fixes #17298

Change-Id: I83459fa1dad583c32395a80548510bc7ec035c41
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-26 12:40:10 -07:00
Brad Fitzpatrick
f715ee2be9 cmd/tailscaled: start implementing ts_omit_netstack
Baby steps. This permits building without much of gvisor, but not all of it.

Updates #17283

Change-Id: I8433146e259918cc901fe86b4ea29be22075b32c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-26 09:46:55 -07:00
Alex Chan
002ecb78d0 all: don't rebind variables in for loops
See https://tip.golang.org/wiki/LoopvarExperiment#does-this-mean-i-dont-have-to-write-x--x-in-my-loops-anymore

Updates https://github.com/tailscale/tailscale/issues/11058

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-09-26 16:19:42 +01:00
Alex Chan
41a2aaf1da cmd/tailscale/cli: fix race condition in up --force-reauth
This commit fixes a race condition where `tailscale up --force-reauth` would
exit prematurely on an already-logged in device.

Previously, the CLI would wait for IPN to report the "Running" state and then
exit. However, this could happen before the new auth URL was printed, leading
to two distinct issues:

*   **Without seamless key renewal:** The CLI could exit immediately after
    the `StartLoginInteractive` call, before IPN has time to switch into
    the "Starting" state or send a new auth URL back to the CLI.
*   **With seamless key renewal:** IPN stays in the "Running" state
    throughout the process, so the CLI exits immediately without performing
    any reauthentication.

The fix is to change the CLI's exit condition.

Instead of waiting for the "Running" state, if we're doing a `--force-reauth`
we now wait to see the node key change, which is a more reliable indicator
that a successful authentication has occurred.
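
A pared-down sketch of that exit condition, assuming a hypothetical `notify` type in place of the real IPN notification and made-up key strings. The point is that a "Running" state alone does not end the wait; only a changed node key does:

```go
package main

import "fmt"

// notify is a hypothetical, simplified stand-in for an IPN notification.
type notify struct {
	State   string
	NodeKey string
}

// waitForReauth consumes notifications until one carries a node key
// that differs from origKey. It returns false if the channel closes
// before the key changes.
func waitForReauth(origKey string, ch <-chan notify) (newKey string, ok bool) {
	for n := range ch {
		if n.NodeKey != "" && n.NodeKey != origKey {
			return n.NodeKey, true
		}
	}
	return "", false
}

func main() {
	ch := make(chan notify, 2)
	ch <- notify{State: "Running", NodeKey: "nodekey:old"} // not enough to exit
	ch <- notify{State: "Running", NodeKey: "nodekey:new"} // key changed: done
	close(ch)
	fmt.Println(waitForReauth("nodekey:old", ch))
}
```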

Updates tailscale/corp#31476
Updates tailscale/tailscale#17108

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-09-26 14:27:20 +01:00