52 Commits

Author SHA1 Message Date
Brad Fitzpatrick
02ffe5baa8 tstest/natlab/vmtest: add macOS VM snapshot caching for fast test starts
Cache a pre-booted macOS VM snapshot on disk so subsequent test runs
restore from the snapshot instead of cold-booting. The snapshot is keyed
by the Tart base image digest and a code version constant
(macOSSnapshotCodeVersion); bumping either invalidates the cache.

Snapshot preparation (one-time):
- Boot the Tart base image with a NAT NIC (--nat-nic flag)
- Wait for SSH, compile and install cmd/tta as a LaunchDaemon
- TTA polls the host via AF_VSOCK for an IP assignment; during prep
  the host replies "wait"
- Disconnect NIC, save VM state via SIGINT

Test fast path (cached, ~7s to agent connected):
- APFS clone the snapshot, write test-specific config.json
- Launch Host.app with --disconnected-nic --attach-network --assign-ip
- VZ restores from SaveFile.vzvmsave (~5s with 4GB RAM)
- TTA's vsock poll gets the IP config, sets static IP via ifconfig
  (bypasses DHCP entirely), switches driver addr to the IP directly
  (bypasses DNS), and resets the dial context so the reverse-dial
  reconnects immediately
- TTA agent connects to test driver within ~2s of IP assignment

Key optimizations:
- 4GB RAM instead of 8GB: halves SaveFile.vzvmsave (1.4GB vs 2.4GB),
  halves restore time (5.5s vs 11s)
- AF_VSOCK IP assignment: bypasses macOS DHCP (~5-7s saved)
- Direct IP dial: bypasses DNS resolution for test-driver.tailscale
- Dial context reset: cancels stale in-flight dials from snapshot
- Kill instead of SIGINT for test VM cleanup (no state save needed)
- Parallel VM launches

Also:
- Add TestDriverIPv4/TestDriverPort constants to vnet
- Add --nat-nic and --assign-ip flags to Host.app
- Fix SIGINT handler: retain DispatchSource globally, use dispatchMain()
- Add vsock listener (port 51011) to Host.app for IP config protocol
- Add disconnectNetwork() to VMController for clean snapshot state
- Fix Makefile: set -o pipefail so xcodebuild failures aren't swallowed

Updates #13038

Change-Id: Icbab73b57af7df3ae96136fb49cda2536310f31b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 08:17:13 -07:00
Brad Fitzpatrick
b2d4ba04b6 tstest/natlab/vmtest: add macOS VM support using Tart base images
Add macOS VM support to the vmtest framework using Tart's pre-built
macOS images (ghcr.io/cirruslabs/macos-tahoe-base) instead of building
from IPSW. The Tart image has SIP disabled and SSH enabled.

At test time, the Tart base image's disk, NVRAM, and hardware identity
are APFS-cloned into a tailmac-compatible directory layout, and the VM
is booted headlessly via tailmac's Host.app (Virtualization.framework)
with its NIC connected to vnet's dgram socket.

New features:
- tailmac.go: ensureTartImage (auto-pull), cloneTartToTailmac (format
  conversion), startTailMacVM (launch + cleanup)
- NoAgent() node option for VMs without TTA installed
- LANPing() for ICMP reachability testing via TTA's /ping endpoint
- IsMacOS field on OSImage, with GOOS/GOARCH support
- Dgram socket listener in Start() for macOS VMs
- Fix ReadFromUnix error spam on dgram socket close in vnet

TestMacOSAndLinuxCanPing verifies a macOS Tart VM and a gokrazy Linux
VM can ping each other on the same vnet LAN.

Updates #13038

Change-Id: I5e73a27878abf009f780fdf11a346fc857711cff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 12:51:40 -07:00
Brad Fitzpatrick
b9eac14ef9 tstest/natlab/vmtest: add web UI for watching VM tests live
Add an optional --vmtest-web flag that starts an HTTP server showing a
live dashboard for vmtest runs. The dashboard includes:

- Step progress tracker showing all test phases (compile, image prep,
  QEMU launch, agent connect, tailscale up, test-specific steps)
  with status icons and elapsed times
- Per-VM "virtual monitor" cards showing serial console output
  streamed in realtime via WebSocket
- Per-NIC DHCP status (supporting multi-homed VMs like subnet routers)
- Per-node Tailscale status (hidden for non-tailnet VMs)
- Test status badge (Running/Passed/Failed) with live elapsed timer
- Event log showing all lifecycle events chronologically

Architecture follows the existing util/eventbus HTMX+WebSocket pattern:
the server pushes HTML fragments with hx-swap-oob attributes over a
WebSocket, and HTMX routes them to the correct DOM elements by ID.

Key components:
- vmstatus.go: Step tracker (Begin/End lifecycle), EventBus (pub/sub
  with history for late joiners), VMEvent types, NodeStatus tracking
- web.go: HTTP server, WebSocket handler, template loading, ANSI-to-HTML
  conversion via robert-nix/ansihtml, deterministic port selection
- assets/: HTML templates, CSS, HTMX library (copied from eventbus)
- vnet/vnet.go: DHCP event callback on Server for observing DHCP lifecycle
- qemu.go: Console log file tailing with manual offset-based reading

Usage:
  go test ./tstest/natlab/vmtest/ --run-vm-tests --vmtest-web=:0 -v

When using :0, a deterministic port based on the test name is tried
first so re-runs get the same URL, falling back to OS-assigned on
conflict.

Updates #13038

Change-Id: I45281347b3d7af78ed9f4ff896033984f84dcb4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 07:46:04 -07:00
Brad Fitzpatrick
5c1738fd56 tstest/natlab/{vmtest,vnet}, cmd/tta: add TestExitNode
Add a vmtest TestExitNode that brings up a client, two exit nodes, and a
non-Tailscale webserver, each on its own NAT'd vnet network with a
distinct WAN IP. The test cycles the client's exit node setting between
off, exit1, and exit2 and asserts that the webserver echoes the expected
post-NAT source IP for each.

Three pieces were needed to make this work:

vnet now forwards TCP between simulated networks at the packet level,
mirroring the existing UDP path. When a guest VM sends TCP to another
simulated network's WAN IP, the source network's gateway rewrites src
via doNATOut and routeTCPPacket hands the packet off to the destination
network, which rewrites dst via doNATIn and writes the rewritten frame
onto the destination LAN. The TCP stacks of the two guest VM kernels
talk end-to-end; vnet just NATs the IP/port headers in flight, so all
TCP semantics (handshakes, options, sequence numbers, payload) are
preserved without a gvisor TCP termination in the middle. Adds a
focused TestInterNetworkTCP that exercises this path without any
Tailscale machinery.

cmd/tta binds its outbound dial to the default route's interface using
SO_BINDTODEVICE. Without that, the moment tailscaled installs
0.0.0.0/0 → tailscale0 in response to setting an exit node, TTA's
existing TCP connection to test-driver gets rerouted through the exit
node. From the test driver's perspective the connection's packets then
arrive with the exit node's WAN IP as the source rather than the
client's, so they don't match the existing flow and the connection is
dead — manifesting in the test as a hang on EditPrefs (which had
actually completed in milliseconds on the daemon side, but whose
response never made it back). Pinning the socket to the underlying NIC
keeps TTA's agent connection on a real interface regardless of any
policy routing tailscaled installs later. We bind rather than carry the
Tailscale bypass fwmark because the fwmark approach is conditional on
tailscaled having configured SO_MARK-based policy routing, while
binding is unconditional.

vmtest grows an Env.SetExitNode helper that sets ExitNodeIP via
EditPrefs through the agent, used by the new test.

Updates #13038

Change-Id: I9fc8f91848b7aa2297ef3eaf71fed9d96056a024
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 16:54:20 -07:00
Brad Fitzpatrick
7dcb378875 tstest/integration/nat, tstest/natlab/vnet: fix natlab test flake
The natlab-integrationtest CI job frequently flakes by exhausting its
3m go test timeout. The root cause is that the QEMU VMs run under
pure software emulation (TCG) with no KVM. Under TCG, the guest
kernel's timer calibration busy-loops are at the mercy of host CPU
scheduling. When two VMs boot simultaneously on a 2-core CI runner,
one VM's calibration gets starved and produces wrong results, leaving
the kernel with broken timers that prevent it from ever completing
boot — even after the other VM finishes and frees up CPU.

Additionally, the microvm machine type doesn't provide HPET hardware,
but the kernel command line specified clocksource=hpet. And the VM
image build (make natlab) ran inside the test itself, consuming most
of the 3m timeout budget before the actual test started.

Fix by:

 - Enabling KVM when /dev/kvm is available, so timer calibration
   uses real hardware timers unaffected by host CPU scheduling.

 - Adding a CI step to set /dev/kvm permissions on the GitHub
   Actions runner (ubuntu-latest provides KVM but needs a udev rule).

 - Pre-building the VM image in a separate CI step so it doesn't
   cut into the go test -timeout budget.

 - Replacing the hardcoded 60s context timeout with one derived from
   t.Deadline(), so the test uses the full -timeout budget.

 - Adding VM boot progress detection (AwaitFirstPacket) and QMP
   diagnostics, so boot failures produce clear errors instead of
   opaque "context deadline exceeded" messages.

With KVM enabled, the test passes reliably even on a single CPU core
with 3 parallel workers — a scenario that was 100% broken under TCG.

Fixes #18906

Change-Id: I4c87631a9c9678d185b9f30cb05c0f7bfa9f5c62
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 16:34:15 -07:00
Brad Fitzpatrick
ec0b23a21f vmtest: add VM-based integration test framework
Add tstest/natlab/vmtest, a high-level framework for running multi-VM
integration tests with mixed OS types (gokrazy + Ubuntu/Debian cloud
images) connected via natlab's vnet virtual network.

The vmtest package provides:
  - Env type that orchestrates vnet, QEMU processes, and agent connections
  - OS image support (Gokrazy, Ubuntu2404, Debian12) with download/cache
  - QEMU launch per OS type (microvm for gokrazy, q35+KVM for cloud)
  - Cloud-init seed ISO generation with network-config for multi-NIC
  - Cross-compilation of test binaries for cloud VMs
  - Debug SSH NIC on cloud VMs for interactive debugging
  - Test helpers: ApproveRoutes, HTTPGet, TailscalePing, DumpStatus,
    WaitForPeerRoute, SSHExec

TTA enhancements (cmd/tta):
  - Parameterize /up (accept-routes, advertise-routes, snat-subnet-routes)
  - Add /set, /start-webserver, /http-get endpoints
  - /http-get uses local.Client.UserDial for Tailscale-routed requests
  - Fix /ping for non-gokrazy systems

TestSubnetRouter exercises a 3-VM subnet router scenario:
  client (gokrazy) → subnet-router (Ubuntu, dual-NIC) → backend (gokrazy)
  Verifies HTTP access to the backend webserver through the Tailscale
  subnet route. Passes in ~30 seconds.

Updates tailscale/tailscale#13038

Change-Id: I165b64af241d37f5f5870e796a52502fc56146fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 17:24:18 -07:00
Brad Fitzpatrick
814161303f tstest/natlab/vnet: add multi-NIC node support, DHCP fixes, and VIPs
Multi-NIC support:
  - Add nodeNIC type and node.extraNICs for secondary network interfaces
  - Add netForMAC/macForNet to route packets to the correct network by MAC
  - Update initFromConfig to allocate a MAC + LAN IP per network
  - Fix handleEthernetFrameFromVM, ServeUnixConn to use netForMAC
  - Fix MACOfIP, writeEth, WriteUDPPacketNoNAT, gVisor write path, and
    createARPResponse to use macForNet (return the MAC actually on that
    network, not the node's primary MAC)
  - Fix createDHCPResponse for multi-NIC (correct client IP and subnet)
  - Add nodeNICMac for secondary NIC MAC generation
  - Add Node accessors: NumNICs, NICMac, Networks, LanIP

DHCP fixes:
  - Include LeaseTime, SubnetMask, Router, DNS in DHCP Offer (not just
    Ack). systemd-networkd requires these to accept an Offer.
  - Fix DHCP response source IP: use gateway IP instead of echoing
    the request's destination (which was 255.255.255.255 for discovers)

New VIPs:
  - cloud-init.tailscale: serves per-node cloud-init meta-data, user-data,
    and network-config for VMs booting with nocloud datasource
  - files.tailscale: serves binary files (tta, tailscale, tailscaled)
    registered via RegisterFile for cloud VM provisioning
  - Add ControlServer() accessor for test control server

This is necessary for a three-VM natlab subnet router
integration test, coming later.

Updates #13038

Change-Id: I59f9f356bae9b5509c117265237983972dfdd5af
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 11:04:26 -07:00
Brad Fitzpatrick
bd2a2d53d3 all: use Go 1.26 things, run most gofix modernizers
I omitted a lot of the min/max modernizers because they didn't
result in more clear code.

Some of it's older "for x := range 123".

Also: errors.AsType, any, fmt.Appendf, etc.

Updates #18682

Change-Id: I83a451577f33877f962766a5b65ce86f7696471c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-06 13:32:03 -08:00
Brad Fitzpatrick
2810f0c6f1 all: fix typos in comments
Fix its/it's, who's/whose, wether/whether, missing apostrophes
in contractions, and other misspellings across the codebase.

Updates #cleanup

Change-Id: I20453b81a7aceaa14ea2a551abba08a2e7f0a1d8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-05 13:52:01 -08:00
Claus Lensbøl
9657a93217
tstest/natlab: add test for no control and rotated disco key (#18261)
Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-05 16:00:36 -05:00
Will Norris
3ec5be3f51 all: remove AUTHORS file and references to it
This file was never truly necessary and has never actually been used in
the history of Tailscale's open source releases.

A Brief History of AUTHORS files
---

The AUTHORS file was a pattern developed at Google, originally for
Chromium, then adopted by Go and a bunch of other projects. The problem
was that Chromium originally had a copyright line only recognizing
Google as the copyright holder. Because Google (and most open source
projects) do not require copyright assignemnt for contributions, each
contributor maintains their copyright. Some large corporate contributors
then tried to add their own name to the copyright line in the LICENSE
file or in file headers. This quickly becomes unwieldy, and puts a
tremendous burden on anyone building on top of Chromium, since the
license requires that they keep all copyright lines intact.

The compromise was to create an AUTHORS file that would list all of the
copyright holders. The LICENSE file and source file headers would then
include that list by reference, listing the copyright holder as "The
Chromium Authors".

This also become cumbersome to simply keep the file up to date with a
high rate of new contributors. Plus it's not always obvious who the
copyright holder is. Sometimes it is the individual making the
contribution, but many times it may be their employer. There is no way
for the proejct maintainer to know.

Eventually, Google changed their policy to no longer recommend trying to
keep the AUTHORS file up to date proactively, and instead to only add to
it when requested: https://opensource.google/docs/releasing/authors.
They are also clear that:

> Adding contributors to the AUTHORS file is entirely within the
> project's discretion and has no implications for copyright ownership.

It was primarily added to appease a small number of large contributors
that insisted that they be recognized as copyright holders (which was
entirely their right to do). But it's not truly necessary, and not even
the most accurate way of identifying contributors and/or copyright
holders.

In practice, we've never added anyone to our AUTHORS file. It only lists
Tailscale, so it's not really serving any purpose. It also causes
confusion because Tailscalars put the "Tailscale Inc & AUTHORS" header
in other open source repos which don't actually have an AUTHORS file, so
it's ambiguous what that means.

Instead, we just acknowledge that the contributors to Tailscale (whoever
they are) are copyright holders for their individual contributions. We
also have the benefit of using the DCO (developercertificate.org) which
provides some additional certification of their right to make the
contribution.

The source file changes were purely mechanical with:

    git ls-files | xargs sed -i -e 's/\(Tailscale Inc &\) AUTHORS/\1 contributors/g'

Updates #cleanup

Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
2026-01-23 15:49:45 -08:00
Simon Law
34242df51b
derp/derpserver: clean up extraction of derp.Server (#17264)
PR #17258 extracted `derp.Server` into `derp/derpserver.Server`.

This followup patch adds the following cleanups:
1. Rename `derp_server*.go` files to `derpserver*.go` to match
   the package name.
2. Rename the `derpserver.NewServer` constructor to `derpserver.New`
   to reduce stuttering.
3. Remove the unnecessary `derpserver.Conn` type alias.

Updates #17257
Updates #cleanup

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2025-09-24 10:38:30 -07:00
Brad Fitzpatrick
21dc5f4e21 derp/derpserver: split off derp.Server out of derp into its own package
This exports a number of things from the derp (generic + client) package
to be used by the new derpserver package, as now used by cmd/derper.

And then enough other misc changes to lock in that cmd/tailscaled can
be configured to not bring in tailscale.com/client/local. (The webclient
in particular, even when disabled, was bringing it in, so that's now fixed)

Fixes #17257

Change-Id: I88b6c7958643fb54f386dd900bddf73d2d4d96d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-24 09:19:01 -07:00
Brad Fitzpatrick
fb96137d79 net/{netx,memnet},all: add netx.DialFunc, move memnet Network impl
This adds netx.DialFunc, unifying a type we have a bazillion other
places, giving it now a nice short name that's clickable in
editors, etc.

That highlighted that my earlier move (03b47a55c7956) of stuff from
nettest into netx moved too much: it also dragged along the memnet
impl, meaning all users of netx.DialFunc who just wanted netx for the
type definition were instead also pulling in all of memnet.

So move the memnet implementation netx.Network into memnet, a package
we already had.

Then use netx.DialFunc in a bunch of places. I'm sure I missed some.
And plenty remain in other repos, to be updated later.

Updates tailscale/corp#27636

Change-Id: I7296cd4591218e8624e214f8c70dab05fb884e95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-04-08 10:07:47 -07:00
Brad Fitzpatrick
2a12e634bf cmd/vnet: add wsproxy mode
For hooking up websocket VM clients to natlab.

Updates #13038

Change-Id: Iaf728b9146042f3d0c2d3a5e25f178646dd10951
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-03-29 11:02:42 -07:00
Brad Fitzpatrick
05ac21ebe4 all: use new LocalAPI client package location
It was moved in f57fa3cbc30e.

Updates tailscale/corp#22748

Change-Id: I19f965e6bded1d4c919310aa5b864f2de0cd6220
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-02-05 14:41:42 -08:00
Joe Tsai
b62a013ecb
Switch logging service from log.tailscale.io to log.tailscale.com (#14398)
Updates tailscale/corp#23617

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-12-16 14:53:34 -08:00
James Tucker
c0a1ed86cb tstest/natlab: add latency & loss simulation
A simple implementation of latency and loss simulation, applied to
writes to the ethernet interface of the NIC. The latency implementation
could be optimized substantially later if necessary.

Updates #13355
Signed-off-by: James Tucker <james@tailscale.com>
2024-10-28 12:49:56 -07:00
Brad Fitzpatrick
c763b7a7db syncs: delete Map.Range, update callers to iterators
Updates #11038

Change-Id: I2819fed896cc4035aba5e4e141b52c12637373b1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 13:56:13 -07:00
Brad Fitzpatrick
c4d0237e5c tstest/natlab: add dual stack with blackholed IPv4
This reproduces the bug report from
https://github.com/tailscale/tailscale/issues/13346

It does not yet fix it.

Updates #13346

Change-Id: Ia5af7b0481a64a37efe259c798facdda6d9da618
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-03 17:16:26 -07:00
Brad Fitzpatrick
3d9e3a17fa tstest/natlab/vnet: move some boilerplate to mkPacket helper
No need to make callers specify the redundant IP version or
TTL/HopLimit or EthernetType in the common case. The mkPacket helper
can set those when unset.

And use the mkIPLayer in another place, simplifying some code.

And rename mkPacketErr to just mkPacket, then move mkPacket to
test-only code, as mustPacket.

Updates #13038

Change-Id: Ic216e44dda760c69ab9bfc509370040874a47d30
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-30 20:23:30 -07:00
Brad Fitzpatrick
7e88d6712e tstest/natlab/vnet: add syslog tests
Updates #13038

Change-Id: I4ac96cb0a9e46a2fb1e09ddedd3614eb006c2c8c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-30 14:22:01 -07:00
Brad Fitzpatrick
b1a5b40318 tstest/natlab/vnet: add DHCP tests, ignore DHCPv4 on v6-only networks
And clean up some of the test helpers in the process.

Updates #13038

Change-Id: I3e2b5f7028a32d97af7f91941e59399a8e222b25
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-30 08:49:01 -07:00
Brad Fitzpatrick
ffa1c93f59 tstest/natlab/vnet: use mkPacketErr in more places
I'd added this helper for tests, but then moved it to non-test code
and forgot some places to use it. This uses it in more places to
remove some boilerplate.

Updates #13038

Change-Id: Ic4dc339be1c47a55b71d806bab421097ee3d75ed
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-30 08:49:01 -07:00
Brad Fitzpatrick
82c2c5c597 tstest/natlab/vnet: add more tests
This adds tests for DNS requests, and ignoring IPv6 packets on v4-only
networks.

No behavior changes. But some things are pulled out into functions.

And the mkPacket helpers previously just for tests are moved into
non-test code to be used elsewhere to reduce duplication, doing the
checksum stuff automatically.

Updates #13038

Change-Id: I4dd0b73c75b2b9567b4be3f05a2792999d83f6a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-28 21:39:29 -07:00
Brad Fitzpatrick
73b3c8fc8c tstest/natlab/vnet: add IPv6 all-nodes support
This adds support for sending packets to 33:33:00:00:01 at IPv6
multicast address ff02::1 to send to all nodes.

Nothing in Tailscale depends on this (yet?), but it makes debugging in
VMs behind natlab easier (e.g. you can ping all nodes), and other
things might depend on this in the future.

Mostly I'm trying to flesh out the IPv6 support in natlab now that we
can write vnet tests.

Updates #13038

Change-Id: If590031fcf075690ca35c7b230a38c3e72e621eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-28 12:04:19 -07:00
Brad Fitzpatrick
8b23ba7d05 tstest/natlab/vnet: add qemu + Virtualization.framework protocol tests
To test how virtual machines connect to the natlab vnet code.

Updates #13038

Change-Id: Ia4fd4b0c1803580ee7d94cc9878d777ad4f24f82
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-27 22:30:20 -07:00
Brad Fitzpatrick
ff1d0aa027 tstest/natlab/vnet: start adding tests
And refactor some of vnet.go for testability.

The only behavioral change (with a new test) is that ethernet
broadcasts no longer get sent back to the sender.

Updates #13038

Change-Id: Ic2e7e7d6d8805b7b7f2b5c52c2c5ba97101cef14
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-27 18:32:48 -07:00
Brad Fitzpatrick
f99f970dc1 tstest/natlab/vnet: rename some things for clarity
The bad naming (which had only been half updated with the IPv6
changes) tripped me up in the earlier change.

Updates #13038

Change-Id: I65ce07c167e8219d35b87e1f4bf61aab4cac31ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
0157000cab tstest/natlab: fix IPv6 tests, remove TODOs
The reason they weren't working was because the cmd/tta agent in the
guest was dialing out to the test and the vnet couldn't map its global
unicast IPv6 address to a node as it was just using a
map[netip.Addr]*node and blindly trusting the *node was
populated. Instead, it was nil, so the agent connection fetching
didn't work for its RoundTripper and the test could never drive the
node. That map worked for IPv4 but for IPv6 we need to use the method
that takes into account the node's IPv6 SLAAC address. Most call sites
had been converted but I'd missed that one.

Also clean up some debug, and prohibit nodes' link-local unicast
addresses from dialing 2000::/3 directly for now. We can allow that to
be configured opt-in later (some sort of IPv6 NAT mode. Whatever it's
called.) That mode was working on accident, but was confusing: Linux
would do source address selection from link local for the first few
seconds and then after SLAAC and DAD, switch to using the global
unicast source address. Be consistent for now and force it to use the
global unicast.

Updates #13038

Change-Id: I85e973aaa38b43c14611943ff45c7c825ee9200a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
6dd1af0d1e tstest/natlab: refactor HandleEthernetPacketForRouter a bit
Move all the UDP handling to its own func to remove a bunch of "if
isUDP" checks in a bunch of blocks.

Updates #13038

Change-Id: If71d71b49e57651d15bd307a2233c43751cc8639
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
3a8cfbc381 tstest/natlab: be more paranoid about IP versions from gvisor
I didn't actually see this, but added this while debugging something
and figured it'd be good to keep.

Updates #13038

Change-Id: I67934c8a329e0233f79c3b08516fd6bad6bfe22a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
e0bdd5d058 tstest/natlab: simplify a defer
Updates #13038

Change-Id: I4d38701491523c64c81767b0838010609e683a9f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
b78df4d48a tstest/natlab/vnet: add start of IPv6 support
Updates #13038

Change-Id: Ic3d095f167daf6c7129463e881b18f2e0d5693f5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-24 18:02:38 -07:00
Brad Fitzpatrick
e5fd36ad78 tstest/natlab: respect NATTable interface's invalid-means-drop everywhere
And sprinkle some more docs around.

Updates #13038

Change-Id: Ia2dcf567b68170481cc2094d64b085c6b94a778a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 14:21:59 -07:00
Brad Fitzpatrick
aa42ae9058 tstest/natlab: make a new virtualIP type in prep for IPv6 support
All the magic service names with virtual IPs will need IPv6 variants.

Pull this out in prep.

Updates #13038

Change-Id: I53b5eebd0679f9fa43dc0674805049258c83a0de
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 13:16:33 -07:00
Brad Fitzpatrick
5a99940dfa tstest/natlab/vnet: explicitly ignore PCP and SSDP UDP queries
So we don't log about them when verbose logging is enabled.

Updates #13038

Change-Id: I925bc3a23e6c93d60dd4fb4bf6a4fdc5a326de95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 12:10:19 -07:00
Brad Fitzpatrick
3904e4d175 cmd/tta, tstest/natlab/vnet: remove unneeded port 124 log hack, add log buffer
The natlab Test Agent (tta) still had its old log streaming hack in
place where it dialed out to anything on TCP port 124 and those logs
were streamed to the host running the tests. But we'd since added gokrazy
syslog streaming support, which made that redundant.

So remove all the port 124 stuff. And then make sure we log to stderr
so gokrazy logs it to syslog.

Also, keep the first 1MB of logs in memory in tta too, exported via
localhost:8034/logs for interactive debugging. That was very useful
during debugging when I added IPv6 support. (which is coming in future
PRs)

Updates #13038

Change-Id: Ieed904a704410b9031d5fd5f014a73412348fa7f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 12:10:19 -07:00
Jonathan Nobels
1191eb0e3d tstest/natlab: add unix address to writer for dgram mode
updates tailcale/corp#22371

For dgram mode, we need to store the write addresses of
the client socket(s) alongside the writer functions and
the write operation needs to use WriteToUnix.

Unix also has multiple clients writing to the same socket,
so the serve method is modified to handle packets from
multiple mac addresses.

Cleans up a bit of cruft from the initial tailmac tooling
commit.

Now all the macOS packets are belong to us.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-08-22 15:37:37 -07:00
Brad Fitzpatrick
84adfa1ba3 tstest/natlab/vnet: standardize on 1-based naming of nodes, networks, MACs
We had a mix of 0-based and 1-based nodes and MACs in logs.

Updates #13038

Change-Id: I36d1b00f7f94b37b4ae2cd439bcdc5dbee6eda4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 08:50:03 -07:00
Brad Fitzpatrick
10d0ce8dde tstest/natlab: get tailscaled logs from gokrazy via syslog
Using https://github.com/gokrazy/gokrazy/pull/275

This is much lower latency than logcatcher, which is higher latency
and chunkier. And this is better than getting it via 'tailscale debug
daemon-logs', which misses early interesting logs.

Updates #13038

Change-Id: I499ec254c003a9494c0e9910f9c650c8ac44ef33
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 07:56:29 -07:00
Brad Fitzpatrick
a61825c7b8 cmd/tta, vnet: add host firewall, env var support, more tests
In particular, tests showing that #3824 works. But that test doesn't
actually work yet; it only gets a DERP connection. (why?)

Updates #13038

Change-Id: Ie1fd1b6a38d4e90fae7e72a0b9a142a95f0b2e8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-12 15:32:12 -07:00
Maisem Ali
10c2bee9e1 tstest/natlab/vnet: capture network wan/lan interfaces
Updates #13038

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-12 14:54:38 -07:00
Maisem Ali
d4cc074187 tstest/natlab/vnet: add pcap support
Updates #13038

Change-Id: I89ce2129fee856f97986d6313d2b661c76476c0c
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
f47a5fe52b vnet: reduce some log spam
Updates #13038

Change-Id: I76038a90dfde10a82063988a5b54190074d4b5c5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
bb3e95c40d vnet: fix port mapping (w/ maisem + andrew)
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I703b39f05af2e3e1a979be8e77091586cb9ec3eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
17a10f702f vnet: add network.logf
Updates #13038

Change-Id: Ia5a9359b8bfa18264d64600dfa1ef01eb8728dc2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
6798f8ea88 tstest/natlab/vnet: add port mapping
Updates #13038

Change-Id: Iaf274d250398973790873534b236d5cbb34fbe0e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Maisem Ali
12764e9db4 natlab: add NodeAgentClient
This adds a new NodeAgentClient type that can be used to
invoke the LocalAPI using the LocalClient instead of
handcrafted URLs. However, there are certain cases where
it does make sense for the node agent to provide more
functionality than whats possible with just the LocalClient,
as such it also exposes a http.Client to make requests directly.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
1016aa045f hostinfo: add hostinfo.IsNATLabGuestVM
And don't make guests under vnet/natlab upload to logcatcher,
as there won't be a valid cert anyway.

Updates #13038

Change-Id: Ie1ce0139788036b8ecc1804549a9b5d326c5fef5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00