Adds the ability to rotate discovery keys on running clients, needed for
testing upcoming disco key distribution changes.
Introduces key.DiscoKey, an atomic container for a disco private key,
public key, and the public key's ShortString, replacing the prior
separate atomic fields.
magicsock.Conn has a new RotateDiscoKey method, and access to this is
provided via localapi and a CLI debug command.
Note that this implementation is primarily for testing as it stands, and
regular use should likely introduce an additional mechanism that allows
the old key to be used for some time, to provide a seamless key rotation
rather than one that invalidates all sessions.
Updates tailscale/corp#34037
Signed-off-by: James Tucker <james@tailscale.com>
Includes adding StartPaused, which will be used in a future change to
enable netmap caching testing.
Updates #12639
Change-Id: Iec39915d33b8d75e9b8315b281b1af2f5d13a44a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Remove the State enum (StateNew, StateNotAuthenticated, etc.) from
controlclient and replace it with two explicit boolean fields:
- LoginFinished: indicates successful authentication
- Synced: indicates we've received at least one netmap
This makes the state more composable and easier to reason about, as
multiple conditions can be true independently rather than being
encoded in a single enum value.
The State enum was originally intended as the state machine for the
whole client, but that abstraction moved to ipn.Backend long ago.
This change continues moving away from the legacy state machine by
representing state as a combination of independent facts.
Also adds test helpers in ipnlocal that check independent, observable
facts (hasValidNetMap, needsLogin, etc.) rather than relying on
derived state enums, making tests more robust.
Updates #12639
Signed-off-by: James Tucker <james@tailscale.com>
As a baby step towards eventbus-ifying controlclient, make the
Observer optional.
This also means callers that don't care (like this network lock test,
and some tests in other repos) can omit it, rather than passing in a
no-op one.
Updates #12639
Change-Id: Ibd776b45b4425c08db19405bc3172b238e87da4e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
A recent change (009d702adfa0fc) introduced a deadlock where the
/machine/update-health network request to report the client's health
status update to the control plane was moved to being synchronous
within the eventbus's pump machinery.
I started to instead make the health reporting be async, but then we
realized in the three years since we added that, it's barely been used
and doesn't pay for itself, for how many HTTP requests it makes.
Instead, delete it all and replace it with a c2n handler, which
provides much more helpful information.
Fixestailscale/corp#32952
Change-Id: I9e8a5458269ebfdda1c752d7bbb8af2780d71b04
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It never launched and I've lost hope of it launching and it's in my
way now, so I guess it's time to say goodbye.
Updates tailscale/corp#4383
Updates #17305
Change-Id: I2eb551d49f2fb062979cc307f284df4b3dfa5956
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It has nothing to do with logtail and is confusing named like that.
Updates #cleanup
Updates #17323
Change-Id: Idd34587ba186a2416725f72ffc4c5778b0b9db4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Only changes how the go routine consuming the events starts and stops,
not what it does.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Pulls out the last callback logic and ensures timers are still running.
The eventbustest package is updated support the absence of events.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This is a small introduction of the eventbus into controlclient that
communicates with mainly ipnlocal. While ipnlocal is a complicated part
of the codebase, the subscribers here are from the perspective of
ipnlocal already called async.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
* control/controlclient,health,tailcfg: refactor control health messages
Updates tailscale/corp#27759
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Signed-off-by: Paul Scott <408401+icio@users.noreply.github.com>
Co-authored-by: Paul Scott <408401+icio@users.noreply.github.com>
updates tailscale/corp#26435
Adds client support for sending audit logs to control via /machine/audit-log.
Specifically implements audit logging for user initiated disconnections.
This will require further work to optimize the peristant storage and exclusion
via build tags for mobile:
tailscale/corp#27011tailscale/corp#27012
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Updates #12172 (then need to update other repos)
Change-Id: I439f65e0119b09e00da2ef5c7a4f002f93558578
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The CLI's "up" is kinda chaotic and LocalBackend.Start is kinda
chaotic and they both need to be redone/deleted (respectively), but
this fixes some buggy behavior meanwhile. We were previously calling
StartLoginInteractive (to start the controlclient's RegisterRequest)
redundantly in some cases, causing test flakes depending on timing and
up's weird state machine.
We only need to call StartLoginInteractive in the client if Start itself
doesn't. But Start doesn't tell us that. So cheat a bit and a put the
information about whether there's a current NodeKey in the ipn.Status.
It used to be accessible over LocalAPI via GetPrefs as a private key but
we removed that for security. But a bool is fine.
So then only call StartLoginInteractive if that bool is false and don't
do it in the WatchIPNBus loop.
Fixes#12028
Updates #12042
Change-Id: I0923c3f704a9d6afd825a858eb9a63ca7c1df294
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I found this too hard to read before.
This is pulled out of #12033 as it's unrelated cleanup in retrospect.
Updates #12028
Change-Id: I727c47e573217e3d1973c5b66a76748139cf79ee
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This moves most of the health package global variables to a new
`health.Tracker` type.
But then rather than plumbing the Tracker in tsd.System everywhere,
this only goes halfway and makes one new global Tracker
(`health.Global`) that all the existing callers now use.
A future change will eliminate that global.
Updates #11874
Updates #4136
Change-Id: I6ee27e0b2e35f68cb38fecdb3b2dc4c3f2e09d68
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is a useful primitive for asynchronous execution of ordered work I
want to use in another change.
Updates tailscale/corp#16833
Signed-off-by: James Tucker <james@tailscale.com>
It was a really a mutable field owned by mapSession that we didn't move
in earlier commits.
Once moved, it's then possible to de-func-ify the code and turn it into
a regular method rather than an installed optional hook.
Noticed while working to move map session lifetimes out of
Direct.sendMapRequest's single-HTTP-connection scope.
Updates #7175
Updates #cleanup
Change-Id: I6446b15793953d88d1cabf94b5943bb3ccac3ad9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Currently only the top four most popular changes: endpoints, DERP
home, online, and LastSeen.
Updates #1909
Change-Id: I03152da176b2b95232b56acabfb55dcdfaa16b79
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We would only check if the client was paused, but not
if the client was closed. This meant that a call to
Shutdown may block forever/leak goroutines
Updates #cleanup
Signed-off-by: Maisem Ali <maisem@tailscale.com>
resetControlClientLocked is called while b.mu was held and
would call cc.Shutdown which would wait for the observer queue
to drain.
However, there may be active callbacks from cc already waiting for
b.mu resulting in a deadlock.
This makes it so that resetControlClientLocked does not call
Shutdown, and instead just returns the value.
It also makes it so that any status received from previous cc
are ignored.
Updates tailscale/corp#12827
Signed-off-by: Maisem Ali <maisem@tailscale.com>
We want the overall state (used only for tests) to be computed from
the individual states of each component, rather than moving the state
around by hand in dozens of places.
In working towards that, we found a lot of things to clean up.
Updates #cleanup
Change-Id: Ieaaae5355dfae789a8ec7a56ce212f1d7e3a92db
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Don't just start goroutines and hope for them to be ordered.
Fixes potential regression from earlier 7074a40c0.
Updates #cleanup
Change-Id: I501a6f3e4e8e6306b958bccdc1e47869991c31f7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We have cases where the SetControlClientStatus would result in
a Shutdown call back into the auto client that would block
forever. The right thing to do here is to fix the LocalBackend
state machine but thats a different dumpster fire that we
are slowly making progress towards.
This makes it so that the SetControlClientStatus happens in a
different goroutine so that calls back into the auto client
do not block.
Also add a few missing mu.Unlocks in LocalBackend.Start.
Updates #9181
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Then use the Locked variants in Shutdown while we already hold the lock.
Updates #cleanup
Change-Id: I367d53e6be6f37f783c8f43fc9c4d498d0adf501
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Don't depend on the server to do it.
Updates #cleanup
Change-Id: I8ff40b02aa877155a71fd4db58cbecb872241ac8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
They were entirely redundant and 1:1 with the status field
so this turns them into methods instead.
Updates #cleanup
Updates #1909
Change-Id: I7d939750749edf7dae4c97566bbeb99f2f75adbc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I'm trying to remove some stuff from the netmap update path.
Updates #1909
Change-Id: Iad2c728dda160cd52f33ef9cf0b75b4940e0ce64
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
De-pointer a *time.Time type, move it after the mutex which guard is,
rename two test-only methods with our conventional "ForTest" suffix.
Updates #cleanup
Change-Id: I4f4d1acd9c2de33d9c3cb6465d7349ed051aa9f9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For now the method has only one interface (the same as the func it's
replacing) but it will grow, eventually with the goal to remove the
controlclient.Status type for most purposes.
Updates #1909
Change-Id: I715c8bf95e3f5943055a94e76af98d988558a2f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
No need to have it on Auto or be behind a mutex; it's only read/written
from a single goroutine. Move it there.
Updates tailscale/corp#5761
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
See issue. This is a baby step towards passing through deltas
end-to-end from node to control back to node and down to the various
engine subsystems, not computing diffs from two full netmaps at
various levels. This will then let us support larger netmaps without
burning CPU.
But this change itself changes no behavior. It just changes a func
type to an interface with one method. That paves the way for future
changes to then add new NetmapUpdater methods that do more
fine-grained work than updating the whole world.
Updates #1909
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The read of the synced field for logging takes place outside the lock, and
races with other (locked) writes of this field, including for example the one
at current line 556 in mapRoutine.
Updates tailscale/corp#13856
Change-Id: I056b36d7a93025aafdf73528dd7645f10b791af6
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Instead of having updates replace the map polls, create
a third goroutine which is solely responsible for making
sure that control is aware of the latest client state.
This also makes it so that the streaming map polls are only
broken when there are auth changes, or the client is paused.
Updates tailscale/corp#5761
Signed-off-by: Maisem Ali <maisem@tailscale.com>
We were never resetting the backoff in streaming mapResponses.
The call to `PollNetMap` always returns with an error. Changing that contract
is harder, so manually reset backoff when a netmap is received.
Updates tailscale/corp#12894
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Using log.Printf may end up being printed out to the console, which
is not desirable. I noticed this when I was investigating some client
logs with `sockstats: trace "NetcheckClient" was overwritten by another`.
That turns to be harmless/expected (the netcheck client will fall back
to the DERP client in some cases, which does its own sockstats trace).
However, the log output could be visible to users if running the
`tailscale netcheck` CLI command, which would be needlessly confusing.
Updates tailscale/corp#9230
Signed-off-by: Mihai Parparita <mihai@tailscale.com>