This patch fixes several issues related to printing login and device
approval URLs, especially when `tailscale up` is interrupted:
1. Only print a login URL that will cause `tailscale up` to complete.
Don't print expired URLs or URLs from previous login attempts.
2. Print the device approval URL if you run `tailscale up` after
previously completing a login, but before approving the device.
3. Use the correct control URL for device approval if you run a bare
`tailscale up` after previously completing a login, but before
approving the device.
4. Don't print the device approval URL more than once (or at least,
not consecutively).
Updates tailscale/corp#31476
Updates #17361
## How these fixes work
This patch went through a lot of trial and error, and there may still
be bugs! These notes capture the different scenarios and considerations
as we wrote it, which are also captured by integration tests.
1. We were getting stale login URLs from the initial IPN state
notification.
When the IPN watcher was moved to before Start() in c011369, we
mistakenly continued to request the initial state. This is only
necessary if you start watching after you call Start(), because
you may have missed some notifications.
By getting the initial state before calling Start(), we'd get
a stale login URL. If you clicked that URL, you could complete
the login in the control server (if it wasn't expired), but your
instance of `tailscale up` would hang, because it's listening for
login updates from a different login URL.
In this patch, we no longer request the initial state, and so we
don't print a stale URL.
2. Once you skip the initial state from IPN, the following sequence:
* Run `tailscale up`
* Log into a tailnet with device approval
* ^C after the device approval URL is printed, but without approving
* Run `tailscale up` again
means that nothing would ever be printed.
`tailscale up` would send tailscaled the pref `WantRunning: true`,
but that was already the case so nothing changes. You never get any
IPN notifications, and in particular you never get a state change to
`NeedsMachineAuth`. This means we'd never print the device approval URL.
In this patch, we add a hard-coded rule that if you're doing a simple up
(which won't trigger any other IPN notifications) and you start in the
`NeedsMachineAuth` state, we print the device approval message without
waiting for an IPN notification.
3. Consider the following sequence:
* Run `tailscale up --login-server=<custom server>`
* Log into a tailnet with device approval
* ^C after the device approval URL is printed, but without approving
* Run `tailscale up` again
We'd print the device approval URL for the default control server,
rather than the real control server, because we were using the `prefs`
from the CLI arguments (which are all the defaults) rather than the
`curPrefs` (which contain the custom login server).
In this patch, we use the `prefs` if the user has specified any settings
(and other code will ensure this is a complete set of settings) or
`curPrefs` if it's a simple `tailscale up`.
4. Consider the following sequence: you've logged in, but not completed
device approval, and you run `down` and `up` in quick succession.
* `up`: sees state=NeedsMachineAuth
* `up`: sends `{wantRunning: true}`, prints out the device approval URL
* `down`: changes state to Stopped
* `up`: changes state to Starting
* tailscaled: changes state to NeedsMachineAuth
* `up`: gets an IPN notification with the state change, and prints
a second device approval URL
Either URL works, but this is annoying for the user.
In this patch, we track whether the last printed URL was the device
approval URL, and if so, we skip printing it a second time.
Signed-off-by: Alex Chan <alexc@tailscale.com>
This patch extends the integration tests for `tailscale up` to include tailnets
where new devices need to be approved. It doesn't change the CLI, because it's
mostly working correctly already -- these tests are just to prevent future
regressions.
I've added support for `MachineAuthorized` to mock control, and I've refactored
`TestOneNodeUpAuth` to be more flexible. It now takes a sequence of steps to
run and asserts whether we got a login URL and/or machine approval URL after
each step.
Updates tailscale/corp#31476
Updates #17361
Signed-off-by: Alex Chan <alexc@tailscale.com>
This commit fixes a race condition where `tailscale up --force-reauth` would
exit prematurely on an already-logged in device.
Previously, the CLI would wait for IPN to report the "Running" state and then
exit. However, this could happen before the new auth URL was printed, leading
to two distinct issues:
* **Without seamless key renewal:** The CLI could exit immediately after
the `StartLoginInteractive` call, before IPN has time to switch into
the "Starting" state or send a new auth URL back to the CLI.
* **With seamless key renewal:** IPN stays in the "Running" state
throughout the process, so the CLI exits immediately without performing
any reauthentication.
The fix is to change the CLI's exit condition.
Instead of waiting for the "Running" state, if we're doing a `--force-reauth`
we now wait to see the node key change, which is a more reliable indicator
that a successful authentication has occurred.
Updates tailscale/corp#31476
Updates tailscale/tailscale#17108
Signed-off-by: Alex Chan <alexc@tailscale.com>
Expand the integration tests to cover a wider range of scenarios, including:
* Before and after a successful initial login
* Auth URLs and auth keys
* With and without the `--force-reauth` flag
* With and without seamless key renewal
These tests expose a race condition when using `--force-reauth` on an
already-logged in device. The command completes too quickly, preventing
the auth URL from being displayed. This issue is identified and will be
fixed in a separate commit.
Updates #17108
Signed-off-by: Alex Chan <alexc@tailscale.com>
For debugging purposes, add a new C2N endpoint returning the current
netmap. Optionally, coordination server can send a new "candidate" map
response, which the client will generate a separate netmap for.
Coordination server can later compare two netmaps, detecting unexpected
changes to the client state.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Instead of a single hard-coded C2N handler, add support for calling
arbitrary C2N endpoints via a node roundtripper.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
* utils/expvarx: mark TestSafeFuncHappyPath as known flaky
Updates #15348
Signed-off-by: Alex Chan <alexc@tailscale.com>
* tstest/integration: mark TestCollectPanic as known flaky
Updates #15865
Signed-off-by: Alex Chan <alexc@tailscale.com>
---------
Signed-off-by: Alex Chan <alexc@tailscale.com>
Add a new `--encrypt-state` flag to `cmd/tailscaled`. Based on that
flag, migrate the existing state file to/from encrypted format if
needed.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Previously all tests shared their tailscale+tailscaled binaries in
system /tmp directories, which often leaked, and required TestMain to
clean up (which feature/taildrop didn't use).
This makes it use testing.T.TempDir for the binaries, but still only
builds them once and efficiently as possible depending on the OS
copies them around between each test's temp dir.
Updates #15812
Change-Id: I0e2585613f272c3d798a423b8ad1737f8916f527
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Taildrop has never had an end-to-end test since it was introduced.
This adds a basic one.
It caught two recent refactoring bugs & one from 2022 (0f7da5c7dc0).
This is prep for moving the rest of Taildrop out of LocalBackend, so
we can do more refactorings with some confidence.
Updates #15812
Change-Id: I6182e49c5641238af0bfdd9fea1ef0420c112738
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for Taildrop integration tests using them from another package.
Updates #15812
Change-Id: I6a995de4e7400658229d99c90349ad5bd1f503ae
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So it can be exported & used by other packages in future changes.
Updates #15812
Change-Id: I319000989ebc294e29c92be7f44a0e11ae6f7761
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Query for the const quad-100 reverse DNS name, for which a forward
record will also be served. This test was previously dependent on
search domain behavior, and now it is not.
Updates #15607
Signed-off-by: Jordan Whited <jordan@tailscale.com>
It was moved in f57fa3cbc30e.
Updates tailscale/corp#22748
Change-Id: I19f965e6bded1d4c919310aa5b864f2de0cd6220
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1eaad7d3deb regressed some tests in another repo that were starting up
a control server on `http://127.0.0.1:nnn`. Because there was no https
running, and because of a bug in 1eaad7d3deb (which ended up checking
the recently-dialed-control check twice in a single dial call), we
ended up forcing only the use of TLS dials in a test that only had
plaintext HTTP running.
Instead, plumb down support for explicitly disabling TLS fallbacks and
use it only when running in a test and using `http` scheme control
plane URLs to 127.0.0.1 or localhost.
This fixes the tests elsewhere.
Updates #13597
Change-Id: I97212ded21daf0bd510891a278078daec3eebaa6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When the TS_DEBUG_NETSTACK_LOOPBACK_PORT environment variable is set,
netstack will loop back (dnat to addressFamilyLoopback:loopbackPort)
TCP & UDP flows originally destined to localServicesIP:loopbackPort.
localServicesIP is quad-100 or the IPv6 equivalent.
Updates tailscale/corp#22713
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This adds a variant for Connect that takes in a context.Context
which allows passing through cancellation etc by the caller.
Updates tailscale/corp#18266
Signed-off-by: Maisem Ali <maisem@tailscale.com>
The CLI's "up" is kinda chaotic and LocalBackend.Start is kinda
chaotic and they both need to be redone/deleted (respectively), but
this fixes some buggy behavior meanwhile. We were previously calling
StartLoginInteractive (to start the controlclient's RegisterRequest)
redundantly in some cases, causing test flakes depending on timing and
up's weird state machine.
We only need to call StartLoginInteractive in the client if Start itself
doesn't. But Start doesn't tell us that. So cheat a bit and a put the
information about whether there's a current NodeKey in the ipn.Status.
It used to be accessible over LocalAPI via GetPrefs as a private key but
we removed that for security. But a bool is fine.
So then only call StartLoginInteractive if that bool is false and don't
do it in the WatchIPNBus loop.
Fixes#12028
Updates #12042
Change-Id: I0923c3f704a9d6afd825a858eb9a63ca7c1df294
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds a new bool that can be sent down from control
to do jailing on the client side. Previously this would
only be done from control by modifying the packet filter
we sent down to clients. This would result in a lot of
additional work/CPU on control, we could instead just
do this on the client. This has always been a TODO which
we keep putting off, might as well do it now.
Updates tailscale/corp#19623
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Seems to deflake tstest/integration tests. I can't reproduce it
anymore on one of my VMs that was consistently flaking after a dozen
runs before. Now I can run hundreds of times.
Updates #11649Fixes#7036
Change-Id: I2f7d4ae97500d507bdd78af9e92cd1242e8e44b8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We have tstest/integration nowadays.
And this test was one of the lone holdouts using the to-be-nuked
SetControlClientGetterForTesting.
Updates #11649
Change-Id: Icf8a6a2e9b8ae1ac534754afa898c00dc0b7623b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This type seems to be a migration shim for TCP tailscaled sockets
(instead of unix/windows pipes). The `port` field was never set, so it
was effectively used as a string (`path` field).
Remove the whole type and simplify call sites to pass the socket path
directly to `safesocket.Connect`.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Misc cleanups and things noticed while working on #7894 and pulled out
of a separate change. Submitting them on their own to not distract
from later changes.
Updates #7894
Change-Id: Ie9abc8b88f121c559aeeb7e74db2aa532eb84d3d
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It had exactly one user: netstack. Just have LocalBackend notify
netstack when here's a new netmap instead, simplifying the bloated
Engine interface that has grown a bunch of non-Engine-y things.
(plenty of rando stuff remains after this, but it's a start)
Updates #cleanup
Change-Id: I45e10ab48119e962fc4967a95167656e35b141d8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>