8 Commits

Author SHA1 Message Date
Brad Fitzpatrick
7dcb378875 tstest/integration/nat, tstest/natlab/vnet: fix natlab test flake
The natlab-integrationtest CI job frequently flakes by exhausting its
3m go test timeout. The root cause is that the QEMU VMs run under
pure software emulation (TCG) with no KVM. Under TCG, the guest
kernel's timer calibration busy-loops are at the mercy of host CPU
scheduling. When two VMs boot simultaneously on a 2-core CI runner,
one VM's calibration gets starved and produces wrong results, leaving
the kernel with broken timers that prevent it from ever completing
boot — even after the other VM finishes and frees up CPU.

Additionally, the microvm machine type doesn't provide HPET hardware,
but the kernel command line specified clocksource=hpet. And the VM
image build (make natlab) ran inside the test itself, consuming most
of the 3m timeout budget before the actual test started.

Fix by:

 - Enabling KVM when /dev/kvm is available, so timer calibration
   uses real hardware timers unaffected by host CPU scheduling.

 - Adding a CI step to set /dev/kvm permissions on the GitHub
   Actions runner (ubuntu-latest provides KVM but needs a udev rule).

 - Pre-building the VM image in a separate CI step so it doesn't
   cut into the go test -timeout budget.

 - Replacing the hardcoded 60s context timeout with one derived from
   t.Deadline(), so the test uses the full -timeout budget.

 - Adding VM boot progress detection (AwaitFirstPacket) and QMP
   diagnostics, so boot failures produce clear errors instead of
   opaque "context deadline exceeded" messages.

With KVM enabled, the test passes reliably even on a single CPU core
with 3 parallel workers — a scenario that was 100% broken under TCG.

Fixes #18906

Change-Id: I4c87631a9c9678d185b9f30cb05c0f7bfa9f5c62
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 16:34:15 -07:00
Claus Lensbøl
ea1f1616b9 .github/workflows: enable natlab in CI
After fixing the flakey tests in #18811 and #18814 we can enable running
the natlab testsuite running on CI generally.

Fixes #18810

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-04 15:02:07 -08:00
dependabot[bot]
8be5affa6d .github: bump actions/checkout from 6.0.1 to 6.0.2
Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.1 to 6.0.2.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](8e8c483db8...de0fac2e45)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 6.0.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-02-23 08:44:40 -07:00
Brad Fitzpatrick
371d6369cd gokrazy: use monorepo for gokrazy appliance builds (monogok)
This switches our gokrazy builds to use a new variant of cmd/gok called
opinionated about using monorepos: https://github.com/bradfitz/monogok

And with that, we can get rid of all the go.mod files and builddir forests
under gokrazy/**.

Updates #13038
Updates gokrazy/gokrazy#361

Change-Id: I9f18fbe59b8792286abc1e563d686ea9472c622d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-02-13 16:19:14 -08:00
dependabot[bot]
9a6282b515 .github: Bump actions/checkout from 4.2.2 to 5.0.0
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.2.2 to 5.0.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](11bd71901b...08c6903cd8)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 5.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-06 11:48:32 -07:00
Irbe Krumina
077d52b22f .github/workflows: removes extra '$'
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-06-16 15:04:10 -07:00
Brad Fitzpatrick
96202a7c0c .github/workflows: descope natlab CI for now until GitHub flakes are fixed
The natlab VM tests are flaking on GitHub Actions.

To not distract people, disable them for now (unless they're touched
directly) until they're made more reliable, which will be some painful
debugging probably.

Updates #13038

Change-Id: I6570f1cd43f8f4d628a54af8481b67455ebe83dc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-03-05 16:46:33 -08:00
Brad Fitzpatrick
5eafce7e25 gokrazy/natlab: update gokrazy, wire up natlab tests to GitHub CI
Updates #13038

Change-Id: I610f9076816f44d59c0ca405a1b4f5eb4c6c0594
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-03-04 18:57:29 -08:00