Ex.:
```
$ talosctl gen config foo 192.168.0.1
no scheme and port specified for the cluster endpoint URL
try: "https://192.168.0.1:6443"
```
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Initial version which only allows setting CNI using preset, no custom
CNI urls are supported at the moment. Still need to figure out what kind
of UI can be used for that.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
Talos 0.8 is going to ship with K8s 1.20.x.
Changes to support new `control-plane` label,
upgrade-k8s supports automated fixups for 1.20.
See also: https://github.com/talos-systems/bootkube-plugin/pull/22
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This allows config to be written to disk without being applied
immediately.
Small refactoring to extract common code paths.
At first, I tried to implement this via the sequencer, but looks like
it's too hard to get it right, as sequencer lacks context and config to
be written is not applied to the runtime.
Fixes#2828
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixes: https://github.com/talos-systems/talos/issues/2819
Only if requested config type is not `TypeInit`.
This functionality will help implementing TUI installer cluster
extension workflow.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
This is initial commit of the installer.
What's done:
- verifying node availability before starting any operations.
- gathering information about disks on the machine.
- allows setting: install disk, hostname, machine type, installer image,
kubernetes version, dns domain, cluster-name.
- dumps/merges talosconfig to a file after applying configuration.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
Bump to 0.7.0 as we have a new release.
Clean up the tests we do: 0.6.3 is a previous release, 0.7.0 is a stable
release, current version (0.8.x) is the "next" release.
We test the following:
* 0.6.3 -> 0.7.0
* 0.7.0 -> 0.8-current
* 0.7.0 -> 0.8-current (single node)
This tests upgrades always between two releases.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixes: https://github.com/talos-systems/talos/issues/2766
This API is implemented in Maintenance and Machine services.
Can be used to generate configuration on the node, instead of using
talosctl to generate it locally.
To be used in interactive installer and talosctl gen config.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
This fixes the reverse Go dependency from `pkg/machinery` to `talos`
package.
Add a check to `Dockerfile` to prevent `pkg/machinery/go.mod` getting
out of sync, this should prevent problems in the future.
Fix potential security issue in `token` authorizer to deny requests
without grpc metadata.
In provisioner, add support for launching nodes without the config
(config is not delivered to the provisioned nodes).
Breaking change in `pkg/provision`: now `NodeRequest.Type` should be set
to the node type (as config can be missing now).
In `talosctl cluster create` add a flag to skip providing config to the
nodes so that they enter maintenance mode, while the generated configs
are written down to disk (so they can be tweaked and applied easily).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
User-disks are supported by QEMU and Firecracker providers.
Can be defined by using the following parameters:
```
--user-disk /mount/path:1GB
```
Can get more than 1 user disk.
Same set of user disks will be created for all master and worker nodes.
Additionally enable user-disks in qemu e2e test.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
For 0.6 -> 0.7 upgrade, in any case config.yaml is preserved and moved
from `/boot` to `/system/state`.
For single node upgrade, `EPHEMERAL` partition is not touched and other
partitions are re-created as needed.
Bump provision tests to 0.6/0.7 upgrades as we get closer to the new
release.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This fixes A/B upgrades and rollback API.
Installer manifest supports now an option to preserve partition contents
while disk is being re-partitioned and partitions are re-formatted.
Mount `/boot` partition as needed (to find current label before starting
the installation and in the rollback API).
Fix upgrade API for non-master nodes.
Contents of `/boot`, `/system/state` and META partitions are preserved
in memory while the disk is re-partitioned.
Remove `--save` flag from the installer as it's not being used.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This enables golangci-lint via build tags for integration tests (this
should have been done long ago!), and fixes the linting errors.
Two tests were updated to reduce flakiness:
* apply config: wait for nodes to issue "boot done" sequence event
before proceeding
* recover: kill pods even if they appear after the initial set gets
killed (potential race condition with previous test).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
The problem was that some of the health checks sort the list of the
nodes in place (via `sort.Strings()`). If cluster info provider returns
original slice, it might be mutated in such a way that it gets
corrupted.
We never noticed it before CAPI clusters, as in our tests IPs are
assigned sequentially, and sort operation is a no-op.
Specifically, the problem was with the `Nodes()` function, it returns
`append(controlPlaneNodes, workerNodes...)` slice, which by definition
might share memory with `controlPlaneNodes` slice. For example,
if control plane nodes were `4, 5, 6` and worker nodes were `3`, the
returned slice will be `4, 5, 6, 3`, and it shares memory with
`controlPlaneNodes` slice (firs three items). If we apply `sort` to the
returned slice, it re-orders it as `3, 4, 5, 6`, but as it is done
in-place, the `controlPlaneNodes` slice is now `3, 4, 5`, which is
obviously wrong.
Fix that by always returning a copy of the slice from the functions
implementing `ClusterInfo` interface.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
It only gets enabled if output is a terminal. Failures which resolve
themselves are removed from the final output. Small spinner to indicate
progress.
While I was at it, I fixed client-side `talosctl health` when init node
is missing.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Kubeconfig merge was completely rewritten to be "smarter":
* automatically apply renames done at previous stages to avoid asking
over and over again (in general should ask just once)
* skip checks if parts of the config match exactly
* allow overwrite as an option
* flexible way to control the output
* activating context in the end
* custom merged context name
Fixes#2578Fixes#2587Fixes#2577
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This also refactors much of the CLI code for the `talosctl kubeconfig`:
1. Do all the checks before fetching kubeconfig from the server: as
kubeconfig generation takes a few seconds, it doesn't make sense to
generate it if it's not going to be used.
2. Unify most of merge & write directly features.
3. Don't use ExtractTarGz method to be more flexible.
4. Allow custom paths for kubeconfig, whether it is a directory or full
path to the file to be created.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Adds the ability to apply (replace) an existing node configuration with
a new one via the Machine API.
Fixes#2345
Signed-off-by: Seán C McCord <ulexus@gmail.com>
By default, build outside of Drone works the same and builds only amd64
version, loads images back into dockerd, etc.
If multiple platforms are used, multi-arch images are built which can't
be exported to docker or to `.tar` image, they're always pushed to the
registry (even for PR builds to our internal CI registry).
Artifacts as files (initramfs, kernel) now have `-arch` suffix:
`vmlinuz-amd64`, `initramfs-amd64.xz`. "Magic" script normalizes output
paths depending on whether single platform or multiple platforms were
given.
VM provisioners accept magic `${ARCH}` in initramfs/kernel paths which
gets replaced by cluster architecture.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This adds a command that lists all of the images used by Talos. This is
useful in the case of airgap installs, so that users will know which images
to pull.
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
Add Kubernetes upgrade as part of the provisioning (upgrade tests):
first K8s control plane is upgraded, then Talos is upgraded (with
kubelet), and e2e test is run last.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Add sonobuoy runner code with log fetching on failure. Use hand-picked
set of e2e tests to run: verify basic pod functionality, verify service
connectivity.
Add option `--run-e2e` to the `talosctl health` to run quick e2e test to
verify cluster health.
Add option to run provision tests with custom CNI, run one track of
provision tests with Cilium.
Bump Cilium to 1.8.2.
Talos 0.6 won't uncordon node automatically after upgrade from 0.5, as
0.5 doesn't put annotation. Workaround that in upgrade tests.
Bump upgrade test version to 0.6.0 release.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This moves to using grub instead of syslinux.
BREAKING CHANGE: Single node upgrades will fail in this change. This
will also break the A/B fallback setup since this version introduces
an entirely new partition scheme, that any fallback will not know about.
We plan on addressing these issues in a follow up change.
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
Bootkube recover process (and `talosctl recover`) was actually
regenerating assets each time `recover` runs forcing control plane to be
at the state when cluster got created. This PR fixes that by running
recover process correctly.
Recovery via etcd was fixed to handle encrypted etcd data:
it follows the way `apiserver` handles encryption at rest, and as at
the moment AES CBC is the only supported encryption method, code simply
follows the same path.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.
And `pkg/machinery` is published as Go module inside Talos repository.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This change is only moving packages and updating import paths.
Goal: expose `internal/pkg/provision` as `pkg/provision` to enable other
projects to import Talos provisioning library.
As cluster checks are almost always required as part of provisioning
process, package `internal/pkg/cluster` was also made public as
`pkg/cluster`.
Other changes were direct dependencies discovered by `importvet` which
were updated.
Public packages (useful, general purpose packages with stable API):
* `internal/pkg/conditions` -> `pkg/conditions`
* `internal/pkg/tail` -> `pkg/tail`
Private packages (used only on provisioning library internally):
* `internal/pkg/inmemhttp` -> `pkg/provision/internal/inmemhttp`
* `internal/pkg/kernel/vmlinuz` -> `pkg/provision/internal/vmlinuz`
* `internal/pkg/cniutils` -> `pkg/provision/internal/cniutils`
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixes#2363#2364#2370#2371
Several changes packed together:
* use compressed `vmlinuz` everywhere, firecracker provisioner
uncompresses it before first use, drop `vmlinux`
* handle reboots in qemu launcher to support reset API case, update
empty disk check to handle reset behavior (erasing partition table)
* make bootloader support default in provisioners, and flag to disable
that
* early support for target architecture for qemu provisioner
This should allow us to use `qemu` in CI/CD (not included into this PR):
integration test passes with qemu.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This makes `pkg/config` directly importable from other projects.
There should be no functional changes.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This bumps next version to the latest 0.6 alpha and latest 0.5.
This also enables single node preserve test.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixes#2330
CLI tests require node discovery as `--nodes` flag is enforced for most
of the `talosctl commands`.
For clusters created via `talosctl cluster create`, cluster provisioner
state provides all the necessary information, but clusters created via
CAPI don't have the state attached.
API tests rely on Talos and Kubernetes APIs to fetch kubeconfig and
access Nodes K8s API.
CLI tests should rely only on CLI tools, so we use `kubectl get nodes` +
`talosctl kubeconfig` to fetch list of master and worker nodes.
This discovery method relies on "bootstrap" node being set in
`talosconfig` (to fetch `kubeconfig`).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>