mirror of
https://github.com/siderolabs/talos.git
synced 2025-10-09 14:41:31 +02:00
docs: consolidate, simplify and correct various docs
This PR updates various docs to clarify them. Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
parent
06f76bfebb
commit
c71c8ca18f
@ -5,10 +5,9 @@ description: "A guide to setting up a Talos Linux cluster on multiple machines."
|
||||
---
|
||||
|
||||
This document will walk you through installing a full Talos Cluster.
|
||||
You may wish to try the [Quickstart]({{< relref "quickstart" >}}) first, to quickly create a local virtual cluster on your workstation.
|
||||
If this is your first use of Talos Linux, we recommend trying the [Quickstart]({{< relref "quickstart" >}}) first, to quickly create a local virtual cluster on your workstation.
|
||||
|
||||
Regardless of where you run Talos, there is a pattern to deploying it.
|
||||
In general you need to:
|
||||
Regardless of where you run Talos, in general you need to:
|
||||
|
||||
- acquire the installation image
|
||||
- decide on the endpoint for Kubernetes
|
||||
@ -23,9 +22,8 @@ In general you need to:
|
||||
|
||||
`talosctl` is a CLI tool which interfaces with the Talos API in
|
||||
an easy manner.
|
||||
It also includes a number of useful options for creating and managing clusters.
|
||||
|
||||
You should install `talosctl` before continuing:
|
||||
Install `talosctl` before continuing:
|
||||
|
||||
#### `amd64`
|
||||
|
||||
@ -36,7 +34,7 @@ chmod +x /usr/local/bin/talosctl
|
||||
|
||||
#### `arm64`
|
||||
|
||||
For `linux` and `darwin` operating systems `talosctl` is also available for the `arm64` processor architecture.
|
||||
For `linux` and `darwin` operating systems `talosctl` is also available for the `arm64` architecture.
|
||||
|
||||
```bash
|
||||
curl -Lo /usr/local/bin/talosctl https://github.com/siderolabs/talos/releases/download/{{< release >}}/talosctl-$(uname -s | tr "[:upper:]" "[:lower:]")-arm64
|
||||
@ -45,14 +43,14 @@ chmod +x /usr/local/bin/talosctl
|
||||
|
||||
## Acquire the installation image
|
||||
|
||||
The easiest way to install Talos is to use the ISO image.
|
||||
The most general way to install Talos is to use the ISO image (but note there are easier methods for some platforms, such as pre-built AMIs for AWS; check the specific [Installation Guides]({{< relref "../talos-guides/install/" >}})).
|
||||
|
||||
The latest ISO image can be found on the GitHub [Releases](https://github.com/siderolabs/talos/releases) page:
|
||||
|
||||
- X86: [https://github.com/siderolabs/talos/releases/download/{{< release >}}/talos-amd64.iso](https://github.com/siderolabs/talos/releases/download/{{< release >}}/talos-amd64.iso)
|
||||
- ARM64: [https://github.com/siderolabs/talos/releases/download/{{< release >}}/talos-arm64.iso](https://github.com/siderolabs/talos/releases/download/{{< release >}}/talos-arm64.iso)
|
||||
|
||||
When booted from the ISO, Talos will run in RAM, and it will not install itself
|
||||
When booted from the ISO, Talos will run in RAM, and will not install itself
|
||||
until it is provided a configuration.
|
||||
Thus, it is safe to boot the ISO onto any machine.
|
||||
|
||||
@ -80,9 +78,7 @@ Thus, the format of the endpoint may be something like:
|
||||
- `https://kube.mycluster.mydomain.com:6443`
|
||||
- `https://[2001:db8:1234::80]:6443`
|
||||
|
||||
Because the Kubernetes controlplane is meant to be highly
|
||||
available, we must also choose how to bind the API server endpoint to the servers
|
||||
themselves.
|
||||
To be highly available, the Kubernetes API server endpoint should be configured so that it is served by all available control plane nodes.
|
||||
There are three common ways to do this:
|
||||
|
||||
### Dedicated Load-balancer
|
||||
@ -91,12 +87,14 @@ If you are using a cloud provider or have your own load-balancer available (such
|
||||
as HAProxy, nginx reverse proxy, or an F5 load-balancer), using
|
||||
a dedicated load balancer is a natural choice.
|
||||
Create an appropriate frontend matching the endpoint, and point the backends at each of the addresses of the Talos controlplane nodes.
|
||||
Note that given we have not yet created the control plane nodes, the IP addresses of the backends may not be known yet.
|
||||
We can bind the backends to the frontend at a later point.
|
||||
At this stage, we just need the IP and Port of the frontend to be created, for use later on.
|
||||
|
||||
### Layer 2 Shared IP
|
||||
|
||||
Talos has integrated support for serving Kubernetes from a shared (sometimes
|
||||
called "virtual") IP address.
|
||||
This method relies on OSI Layer 2 connectivity between controlplane Talos nodes.
|
||||
Talos has integrated support for serving Kubernetes from a shared/virtual IP address.
|
||||
This method relies on Layer 2 connectivity between controlplane Talos nodes.
|
||||
|
||||
In this case, we choose an IP address on the same subnet as the Talos
|
||||
controlplane nodes which is not otherwise assigned to any machine.
|
||||
@ -107,8 +105,8 @@ For instance, if your controlplane node IPs are:
|
||||
- 192.168.0.12
|
||||
|
||||
you could choose the IP `192.168.0.15` as your shared IP address.
|
||||
Just make sure that `192.168.0.15` is not used by any other machine and that your DHCP
|
||||
will not serve it to any other machine.
|
||||
(Make sure that `192.168.0.15` is not used by any other machine and that your DHCP server
|
||||
will not serve it to any other machine.)
|
||||
|
||||
Once chosen, form the full HTTPS URL from this IP:
|
||||
|
||||
@ -116,7 +114,7 @@ Once chosen, form the full HTTPS URL from this IP:
|
||||
https://192.168.0.15:6443
|
||||
```
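As a minimal sketch of forming the endpoint URL, the helper below appends the port and wraps IPv6 literals in brackets. The function name `k8s_endpoint` is hypothetical and not part of `talosctl`; the addresses are the examples used in this guide.

```bash
# Hypothetical helper (not part of talosctl): build the Kubernetes API
# endpoint URL from an IP address or hostname and an optional port.
k8s_endpoint() {
  host="$1"
  port="${2:-6443}"          # 6443 is the default Kubernetes API port
  case "$host" in
    *:*) host="[$host]" ;;   # IPv6 literals must be bracketed in URLs
  esac
  printf 'https://%s:%s\n' "$host" "$port"
}

k8s_endpoint 192.168.0.15        # prints https://192.168.0.15:6443
k8s_endpoint 2001:db8:1234::80   # prints https://[2001:db8:1234::80]:6443
```

Pass a second argument if your load balancer forwards a port other than `6443`.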
|
||||
|
||||
You are free to set a DNS record to this IP address to identify the Kubernetes API endpoint, but you will need to use the IP address itself, not the DNS name, to configure the shared IP (`machine.network.interfaces[].vip.ip`) in the Talos configuration.
|
||||
If you create a DNS record for this IP, note that you will need to use the IP address itself, not the DNS name, to configure the shared IP (`machine.network.interfaces[].vip.ip`) in the Talos configuration.
|
||||
|
||||
For more information about using a shared IP, see the related
|
||||
[Guide]({{< relref "../talos-guides/network/vip" >}})
|
||||
@ -143,49 +141,37 @@ https://kube.cluster1.mydomain.com:6443
|
||||
|
||||
## Decide how to access the Talos API
|
||||
|
||||
Since Talos is entirely API-driven, Talos comes with a number of mechanisms to make accessing the API easier.
|
||||
Much of the power of Talos Linux comes from the API calls that can be made against control plane nodes.
|
||||
|
||||
Controlplane nodes can proxy requests for worker nodes.
|
||||
This means that you only need access to the controlplane nodes in order to access
|
||||
the rest of the network.
|
||||
This is useful for security (your worker nodes do not need to have
|
||||
public IPs or be otherwise connected to the Internet), and it also makes working
|
||||
with highly-variable clusters easier, since you only need to know the
|
||||
controlplane nodes in advance.
|
||||
We recommend accessing the control plane nodes directly from the `talosctl` client, if possible (i.e. set your `endpoints` to the IP addresses of the control plane nodes).
|
||||
This requires your controlplane nodes to be reachable from the client IP.
|
||||
|
||||
Even better, the `talosctl` tool will automatically load balance requests and fail over
|
||||
between all of your controlplane nodes, so long as it is informed of the
|
||||
controlplane node IPs.
|
||||
If your control plane nodes are not directly reachable from your workstation where you run `talosctl`, then configure a load balancer for TCP port 50000 to be forwarded to the controlplane nodes.
|
||||
Do not use Talos Linux's built-in VIP support for accessing the Talos API, as it will not function in the event of an etcd failure, and you will then be unable to reach the Talos API to fix the problem.
|
||||
|
||||
This means you need to tell your client (`talosctl`) how to communicate with the controlplane nodes, which is done by defining the `endpoints`.
|
||||
In general, it is recommended that these point to the set of control plane
|
||||
nodes, either directly or through a reverse proxy or load balancer, similarly to accessing the Kubernetes API.
|
||||
The difference is that the Talos API listens on port `50000/tcp`.
|
||||
|
||||
Whichever way you wish to access the Talos API, be sure to note the IP(s) or
|
||||
hostname(s) so that you can configure your `talosctl` tool's `endpoints` below.
|
||||
|
||||
**NOTE**: The [Virtual IP]({{< relref "../talos-guides/network/vip.md" >}}) method is not recommended when accessing the Talos API as it requires etcd to be bootstrapped and functional.
|
||||
This can make debugging any issues via the Talos API more difficult as issues with Talos configuration may result in etcd not achieving quorum, and therefore the Virtual IP not being available.
|
||||
In this case setting the endpoints to the IP or hostnames of the control plane nodes themselves is preferred.
|
||||
If you create a load balancer to forward the Talos API calls, make a note of the IP or
|
||||
hostname so that you can configure your `talosctl` tool's `endpoints` below.
|
||||
|
||||
## Configure Talos
|
||||
|
||||
Talos Linux needs to be given configuration information to know how to form a Kubernetes cluster.
|
||||
|
||||
When Talos boots without a configuration, such as when using the Talos ISO, it
|
||||
enters a limited maintenance mode and waits for a configuration to be provided.
|
||||
|
||||
Alternatively, the Talos installer can be booted with the `talos.config` kernel
|
||||
In other installation methods, a configuration can be passed in on boot.
|
||||
For example, Talos can be booted with the `talos.config` kernel
|
||||
commandline argument set to an HTTP(s) URL from which it should receive its
|
||||
configuration.
|
||||
In cases where a PXE server can be available, this is much more efficient than
|
||||
Where a PXE server is available, this is much more efficient than
|
||||
manually configuring each node.
|
||||
If you do use this method, just note that Talos does require a number of other
|
||||
If you do use this method, note that Talos requires a number of other
|
||||
kernel commandline parameters.
|
||||
See the [required kernel parameters]({{< relref "../reference/kernel" >}}) for more information.
|
||||
See [required kernel parameters]({{< relref "../reference/kernel" >}}).
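As a rough sketch, the `talos.config` argument described above can be assembled like this. The helper name `talos_config_arg` and the host `config.example.com` are assumptions for illustration only, and the result is just one argument: the other required kernel parameters from the reference still have to be added.

```bash
# Hypothetical helper: emit the talos.config kernel argument for a URL.
talos_config_arg() {
  url="$1"
  case "$url" in
    http://*|https://*) printf 'talos.config=%s\n' "$url" ;;
    *) echo "talos.config expects an HTTP(S) URL: $url" >&2; return 1 ;;
  esac
}

# config.example.com is a placeholder host, not a real service.
talos_config_arg "https://config.example.com/controlplane.yaml"
```

The printed string would be appended to the kernel command line in your PXE or bootloader configuration.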
|
||||
If creating [EC2 Kubernetes clusters]({{< relref "../talos-guides/install/cloud-platforms/aws/" >}}), the configuration file can be passed in as `--user-data` to the `aws ec2 run-instances` command.
|
||||
|
||||
In either case, we need to generate the configuration which is to be provided.
|
||||
Luckily, the `talosctl` tool comes with a configuration generator for exactly
|
||||
this purpose.
|
||||
In any case, we need to generate the configuration which is to be provided.
|
||||
`talosctl` can do exactly this:
|
||||
|
||||
```sh
|
||||
talosctl gen config "cluster-name" "cluster-endpoint"
|
||||
@ -195,11 +181,11 @@ Here, `cluster-name` is an arbitrary name for the cluster which will be used
|
||||
in your local client configuration as a label.
|
||||
It does not affect anything in the cluster itself, but it should be unique in the configuration on your local workstation.
|
||||
|
||||
The `cluster-endpoint` is where you insert the Kubernetes Endpoint you
|
||||
The `cluster-endpoint` is the Kubernetes Endpoint you
|
||||
selected from above.
|
||||
This is the Kubernetes API URL, and it should be a complete URL, with `https://`
|
||||
and port.
|
||||
(The default port is `6443`.)
|
||||
(The default port is `6443`, but you may have configured your load balancer to forward a different port.)
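A concrete invocation might look like the sketch below. The cluster name `my-cluster` is a placeholder, the endpoint is the example shared IP used in this guide, and `is_complete_endpoint` is a hypothetical pre-flight check (not a `talosctl` feature) that only confirms the `https://` prefix and a trailing numeric port.

```bash
# Hypothetical pre-flight check (not part of talosctl): succeed only if the
# endpoint starts with https:// and ends in a numeric port.
is_complete_endpoint() {
  case "$1" in
    https://*) ;;
    *) return 1 ;;
  esac
  port="${1##*:}"              # text after the last colon
  case "$port" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: no port given
  esac
  return 0
}

CLUSTER_NAME="my-cluster"                      # placeholder label
CLUSTER_ENDPOINT="https://192.168.0.15:6443"   # example endpoint from this guide

if ! is_complete_endpoint "$CLUSTER_ENDPOINT"; then
  echo "endpoint must look like https://<host>:<port>" >&2
elif command -v talosctl >/dev/null 2>&1; then
  talosctl gen config "$CLUSTER_NAME" "$CLUSTER_ENDPOINT"
else
  echo "talosctl not found in PATH" >&2
fi
```

The check is deliberately strict: a URL with a trailing slash or path is rejected, so adjust it if your endpoint legitimately needs one.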
|
||||
|
||||
When you run this command, you will receive a number of files in your current
|
||||
directory:
|
||||
@ -209,31 +195,28 @@ directory:
|
||||
- `talosconfig`
|
||||
|
||||
The `.yaml` files are what we call Machine Configs.
|
||||
They are installed onto the Talos servers, and they provide their complete configuration,
|
||||
They provide the Talos servers their complete configuration,
|
||||
describing everything from what disk Talos should be installed to, to what
|
||||
sysctls to set, to what network settings it should have.
|
||||
In the case of the `controlplane.yaml`, it even describes how Talos should form its Kubernetes cluster.
|
||||
sysctls to set, to network settings.
|
||||
The `controlplane.yaml` file describes how Talos should form a Kubernetes cluster.
|
||||
|
||||
The `talosconfig` file (which is also YAML) is your local client configuration
|
||||
file.
|
||||
|
||||
### Controlplane and Worker
|
||||
|
||||
The two types of Machine Configs correspond to the two roles of Talos nodes.
|
||||
|
||||
The Controlplane Machine Config describes the configuration of a Talos server on
|
||||
which the Kubernetes Controlplane should run.
|
||||
The Worker Machine Config describes everything else: workload servers.
|
||||
The two types of Machine Configs correspond to the two roles of Talos nodes, controlplane (which run both the Talos and Kubernetes control planes) and worker nodes (which run the workloads).
|
||||
|
||||
The main difference between Controlplane Machine Config files and Worker Machine
|
||||
Config files is that the former contains information about how to form the
|
||||
Kubernetes cluster.
|
||||
|
||||
### Templates
|
||||
### Machine Configs as Templates
|
||||
|
||||
The generated files can be thought of as templates.
|
||||
Individual machines may need specific settings (for instance, each may have a
|
||||
different static IP address).
|
||||
Individual machines may need specific settings: for instance, each may have a
|
||||
different [static IP address]({{< relref "../advanced/advanced-networking/#static-addressing" >}}).
|
||||
By default, interfaces are set to DHCP.
|
||||
When different files are needed for machines of the same type, simply
|
||||
copy the source template (`controlplane.yaml` or `worker.yaml`) and make whatever
|
||||
modifications need to be done.
|
||||
@ -250,16 +233,18 @@ may do something like this:
|
||||
end
|
||||
```
|
||||
|
||||
Then modify each file as needed.
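The copy step can also be wrapped in a small reusable function. This is only a sketch: `copy_template` and the `cp<N>.yaml` naming are assumptions for illustration, not something `talosctl` provides or requires.

```bash
# Hypothetical helper: copy a generated template once per machine so that
# each copy can be edited independently.
# Usage: copy_template <template> <prefix> <count>
copy_template() {
  template="$1"; prefix="$2"; count="$3"
  i=0
  while [ "$i" -lt "$count" ]; do
    cp "$template" "${prefix}${i}.yaml"
    i=$((i + 1))
  done
}

# Example: three control plane nodes -> cp0.yaml, cp1.yaml, cp2.yaml
if [ -f controlplane.yaml ]; then
  copy_template controlplane.yaml cp 3
fi
```

Each resulting file can then be given its machine-specific settings, such as a static address, before it is applied.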
|
||||
|
||||
In cases where there is no special configuration needed, you may use the same
|
||||
file for each machine of the same type.
|
||||
file for all machines of the same type.
|
||||
|
||||
### Apply Configuration
|
||||
|
||||
After you have generated each machine's Machine Config, you need to load them
|
||||
If you have booted into maintenance mode, then after you have generated the Machine Configs, you need to load them
|
||||
into the machines themselves.
|
||||
For that, you need to know their IP addresses.
|
||||
|
||||
If you have access to the console or console logs of the machines, you can read
|
||||
If you have access to the console logs of the machines, you can read
|
||||
them to find the IP address(es).
|
||||
Talos will print them out during the boot process:
|
||||
|
||||
@ -290,9 +275,9 @@ Once you have the IP address, you can then apply the correct configuration.
|
||||
|
||||
The insecure flag is necessary at this point because the PKI infrastructure has
|
||||
not yet been made available to the node.
|
||||
Note that the connection _will_ be encrypted, it is just unauthenticated.
|
||||
Note: the connection _will_ be encrypted; it is just unauthenticated.
|
||||
|
||||
If you have console access, though, you can extract the server
|
||||
If you have console access you can extract the server
|
||||
certificate fingerprint and use it for an additional layer of validation:
|
||||
|
||||
```sh
|
||||
@ -303,37 +288,45 @@ certificate fingerprint and use it for an additional layer of validation:
|
||||
```
|
||||
|
||||
Using the fingerprint allows you to be sure you are sending the configuration to
|
||||
the right machine, but it is completely optional.
|
||||
the correct machine, but it is completely optional.
|
||||
|
||||
After the configuration is applied to a node, it will reboot.
|
||||
|
||||
You may repeat this process for each of the nodes in your cluster.
|
||||
Repeat this process for each of the nodes in your cluster.
|
||||
|
||||
## Configure your talosctl client
|
||||
|
||||
Now that the nodes are running Talos with its full PKI security suite, you need
|
||||
to use that PKI to talk to the machines.
|
||||
That means configuring your client, and that is what that `talosconfig` file is for.
|
||||
That is what the `talosconfig` file is for.
|
||||
|
||||
It is important to understand the concepts of `endpoints` and `nodes`.
|
||||
|
||||
### Endpoints
|
||||
|
||||
Endpoints are the communication endpoints to which the client directly talks.
|
||||
These can be load balancers, DNS hostnames, a list of IPs, etc.
|
||||
In general, it is recommended that these point to the set of control plane
|
||||
nodes, either directly or through a reverse proxy or load balancer.
|
||||
Endpoints are the IP addresses to which the `talosctl` client directly talks.
|
||||
It is recommended that these point to the set of control plane
|
||||
nodes, either directly or through a load balancer.
|
||||
You tell your client (`talosctl`) to communicate with the controlplane nodes by defining the `endpoints`.
|
||||
|
||||
Each endpoint will automatically proxy requests destined to another node through
|
||||
it, so it is not necessary to change the endpoint configuration just because you
|
||||
wish to talk to a different node within the cluster.
|
||||
Each endpoint will automatically proxy requests destined to another node, so it is not necessary to change the endpoint configuration just to talk to a different node within the cluster.
|
||||
This means that you only need access to the controlplane nodes in order to access
|
||||
the rest of the network.
|
||||
This is useful for security (worker nodes do not need to have
|
||||
public IPs, and can still be queried by `talosctl`), and it also makes working
|
||||
with highly-variable clusters easier, since you only need to know the
|
||||
controlplane nodes in advance.
|
||||
|
||||
Endpoints _do_, however, need to be members of the same Talos cluster as the
|
||||
Endpoints _do_ need to be members of the same Talos cluster as the
|
||||
target node, because these proxied connections rely on certificate-based
|
||||
authentication.
|
||||
|
||||
We need to set the `endpoints` in your `talosconfig`.
|
||||
`talosctl` will automatically load balance and fail over among the endpoints,
|
||||
so no external load balancer or DNS abstraction is required
|
||||
(though you are free to use them).
|
||||
`talosctl` will automatically load balance requests and fail over
|
||||
between all of your endpoints.
|
||||
|
||||
You can pass in `--endpoints <IP Address1>,<IP Address2>` as a comma-separated list of IP/DNS addresses to set the endpoints used by the current `talosctl` command.
|
||||
You can also set the `endpoints` in your `talosconfig`, by calling `talosctl config endpoint <IP Address1> <IP Address2>`.
|
||||
Note: these are space separated, not comma separated.
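The two separator conventions are easy to mix up, so here is a small sketch that prints both command forms from one list. The addresses are example values and `join_commas` is a hypothetical helper; the commands are only echoed, not executed.

```bash
# Hypothetical helper: join all arguments with commas (for --endpoints).
join_commas() {
  ( IFS=,; printf '%s\n' "$*" )
}

CONTROL_PLANES="192.168.0.2 192.168.0.3 192.168.0.4"   # example addresses

# Per-command override: comma separated.
# Word splitting of $CONTROL_PLANES is intended here.
echo "talosctl --endpoints $(join_commas $CONTROL_PLANES) version"

# Persisted in talosconfig: space separated.
echo "talosctl config endpoint $CONTROL_PLANES"
```

Keeping the list in a single variable means the per-command and persisted forms cannot drift apart.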
|
||||
|
||||
As an example, if the IP addresses of our controlplane nodes are:
|
||||
|
||||
@ -350,34 +343,41 @@ We would set those in the `talosconfig` with:
|
||||
|
||||
### Nodes
|
||||
|
||||
The node is the target node on which you wish to perform the API call.
|
||||
The node is the target you wish to perform the API call on.
|
||||
|
||||
Keep in mind, when specifying nodes, their IPs and/or hostnames are *as seen by the endpoint servers*, not as from the client.
|
||||
This is because all connections are proxied through the endpoints.
|
||||
> When specifying nodes, their IPs and/or hostnames are *as seen by the endpoint servers*, not as from the client.
|
||||
> This is because all connections are proxied through the endpoints.
|
||||
|
||||
Some people also like to set a default set of nodes in the `talosconfig`.
|
||||
This can be done in the same manner, replacing `endpoint` with `node`.
|
||||
If you do this, however, know that you could easily reboot the wrong machine
|
||||
by forgetting to declare the right one explicitly.
|
||||
Worse, if you set several nodes as defaults, you could, with one `talosctl upgrade`
|
||||
command upgrade your whole cluster all at the same time.
|
||||
It's a powerful tool, and with that comes great responsibility.
|
||||
|
||||
The author of this document generally sets a single controlplane node to be the
|
||||
default node, which provides the most flexible default operation while limiting
|
||||
the scope of the disaster should a command be entered erroneously:
|
||||
|
||||
```sh
|
||||
talosctl --talosconfig=./talosconfig \
|
||||
config node 192.168.0.2
|
||||
```
|
||||
|
||||
You may simply provide `-n` or `--nodes` to any `talosctl` command to
|
||||
If you do this, however, you could easily reboot the wrong machine, or multiple machines,
|
||||
by forgetting to declare the right one explicitly.
|
||||
|
||||
Our recommendation is to leave `nodes` empty in the talosconfig file, and explicitly pass in the node or nodes to be operated on with each `talosctl` command.
|
||||
You may provide `-n` or `--nodes` to any `talosctl` command to
|
||||
supply the node or (comma-delimited) nodes on which you wish to perform the
|
||||
operation.
|
||||
Supplying the commandline parameter will override any default nodes
|
||||
in the configuration file.
|
||||
|
||||
For example, to see the containers running on node 192.168.0.200:
|
||||
|
||||
```bash
|
||||
talosctl -n 192.168.0.200 containers
|
||||
```
|
||||
|
||||
To see the etcd logs on *both* nodes 192.168.0.10 and 192.168.0.11:
|
||||
|
||||
```bash
|
||||
talosctl -n 192.168.0.10,192.168.0.11 logs etcd
|
||||
```
|
||||
|
||||
To verify default node(s) you're currently configured to use, you can run:
|
||||
|
||||
```bash
|
||||
@ -419,7 +419,6 @@ The `<cluster-name>` you chose above will be used as the context name.
|
||||
|
||||
All of your machines are configured, and your `talosctl` client is set up.
|
||||
Now, you are ready to bootstrap your Kubernetes cluster.
|
||||
If that sounds daunting, you haven't used Talos before.
|
||||
|
||||
Bootstrapping your Kubernetes cluster with Talos is as simple as:
|
||||
|
||||
@ -427,12 +426,10 @@ Bootstrapping your Kubernetes cluster with Talos is as simple as:
|
||||
talosctl bootstrap --nodes 192.168.0.2
|
||||
```
|
||||
|
||||
**IMPORTANT**: the bootstrap operation should only be called **ONCE** and only on a **SINGLE**
|
||||
The bootstrap operation should only be called **ONCE** and only on a **SINGLE**
|
||||
controlplane node!
|
||||
|
||||
The IP can be any of your controlplanes (or the loadbalancer, if you have
|
||||
one).
|
||||
It should only be issued once.
|
||||
The IP can be any of your control plane nodes (or the load balancer, if used for the Talos API endpoint).
|
||||
|
||||
At this point, Talos will form an `etcd` cluster, generate all of the core
|
||||
Kubernetes assets, and start the Kubernetes controlplane components.
|
||||
@ -449,15 +446,21 @@ configuration in the same way as `talosctl config merge` merged the Talos client
|
||||
configuration into your local Talos client configuration file.
|
||||
|
||||
If you would prefer for the configuration to _not_ be merged into your default
|
||||
Kubernetes configuration file, simple tell it a filename:
|
||||
Kubernetes configuration file, tell it a filename:
|
||||
|
||||
```sh
|
||||
talosctl kubeconfig alternative-kubeconfig
|
||||
```
|
||||
|
||||
If all goes well, you should now be able to connect to Kubernetes and see your
|
||||
You should now be able to connect to Kubernetes and see your
|
||||
nodes:
|
||||
|
||||
```sh
|
||||
kubectl get nodes
|
||||
```
|
||||
|
||||
And use talosctl to explore your cluster:
|
||||
|
||||
```sh
|
||||
talosctl -n <NODEIP> dashboard
|
||||
```
|
||||
|
@ -1,58 +0,0 @@
|
||||
---
|
||||
title: "Concepts"
|
||||
weight: 30
|
||||
description: "Summary of Talos Linux."
|
||||
---
|
||||
|
||||
When people come across Talos, they frequently want a nice, bite-sized summary
|
||||
of it.
|
||||
This is surprisingly difficult when Talos represents such a
|
||||
fundamentally-rethought operating system.
|
||||
|
||||
## Not based on X distro
|
||||
|
||||
A useful way to summarize an operating system is to say that it is based on X, but focused on Y.
|
||||
For instance, Mint was originally based on Ubuntu, but focused on Gnome 2 (instead of, at the time, Unity).
|
||||
Or maybe something like Raspbian is based on Debian, but it is focused on the Raspberry Pi.
|
||||
CentOS is RHEL, but made license-free.
|
||||
|
||||
Talos Linux _isn't_ based on any other distribution.
|
||||
We often think of ourselves as being the second-generation of
|
||||
container-optimised operating systems, where things like CoreOS, Flatcar, and Rancher represent the first generation, but that implies heredity where there is none.
|
||||
|
||||
Talos Linux is actually a ground-up rewrite of the userspace, from PID 1.
|
||||
We run the Linux kernel, but everything downstream of that is our own custom
|
||||
code, written in Go, rigorously-tested, and published as an immutable,
|
||||
integrated, cohesive image.
|
||||
The Linux kernel launches what we call `machined`, for instance, not `systemd`.
|
||||
There is no `systemd` on our system.
|
||||
There are no GNU utilities, no shell, no SSH, no packages, nothing you could associate with
|
||||
any other distribution.
|
||||
We don't even have a build toolchain in the normal sense of the word.
|
||||
|
||||
## Not for individual use
|
||||
|
||||
Technically, Talos Linux installs to a computer much as other operating systems.
|
||||
_Unlike_ other operating systems, Talos is not meant to run alone, on a
|
||||
single machine.
|
||||
Talos Linux comes with tooling from the very foundation to form clusters, even
|
||||
before Kubernetes comes into play.
|
||||
A design goal of Talos Linux is eliminating the management
|
||||
of individual nodes as much as possible.
|
||||
In order to do that, Talos Linux operates as a cluster of machines, with lots of
|
||||
checking and coordination between them, at all levels.
|
||||
|
||||
Break from your mind the idea of running an application on a computer.
|
||||
There are no individual computers.
|
||||
There is only a cluster.
|
||||
Talos is meant to do one thing: maintain a Kubernetes cluster, and it does this
|
||||
very, very well.
|
||||
|
||||
The entirety of the configuration of any machine is specified by a single,
|
||||
simple configuration file, which can often be the _same_ configuration file used
|
||||
across _many_ machines.
|
||||
Much like a biological system, if some component misbehaves, just cut it out and
|
||||
let a replacement grow.
|
||||
Rebuilds of Talos are remarkably fast, whether they be new machines, upgrades,
|
||||
or reinstalls.
|
||||
Never get hung up on an individual machine.
|
@ -6,10 +6,9 @@ description: "Learn about the philosophy behind the need for Talos Linux."
|
||||
|
||||
## Distributed
|
||||
|
||||
Talos is intended to be operated in a distributed manner.
|
||||
That is, it is built for a high-availability dataplane _first_.
|
||||
Talos is intended to be operated in a distributed manner: it is built for a high-availability dataplane _first_.
|
||||
Its `etcd` cluster is built in an ad-hoc manner, with each appointed node joining on its own directive (with proper security validations enforced, of course).
|
||||
Like as kubernetes itself, workloads are intended to be distributed across any number of compute nodes.
|
||||
Like Kubernetes, workloads are intended to be distributed across any number of compute nodes.
|
||||
|
||||
There should be no single points of failure, and the level of required coordination is as low as each platform allows.
|
||||
|
||||
@ -21,18 +20,18 @@ All images are signed and delivered as single, versioned files.
|
||||
We can always run integrity checks on our image to verify that it has not been modified.
|
||||
|
||||
While Talos does allow a few, highly-controlled write points to the filesystem, we strive to make them as non-unique and non-critical as possible.
|
||||
In fact, we call the writable partition the "ephemeral" partition precisely because we want to make sure none of us ever uses it for unique, non-replicated, non-recreatable data.
|
||||
We call the writable partition the "ephemeral" partition precisely because we want to make sure none of us ever uses it for unique, non-replicated, non-recreatable data.
|
||||
Thus, if all else fails, we can always wipe the disk and get back up and running.
|
||||
|
||||
## Minimal
|
||||
|
||||
We are always trying to reduce and keep small Talos' footprint.
|
||||
Because nearly the entire OS is built from scratch in Go, we are already
|
||||
starting out in a good position.
|
||||
We are always trying to reduce Talos' footprint.
|
||||
Because nearly the entire OS is built from scratch in Go, we are
|
||||
in a good position.
|
||||
We have no shell.
|
||||
We have no SSH.
|
||||
We have none of the GNU utilities, not even a rollup tool such as busybox.
|
||||
Everything which is included in Talos is there because it is necessary, and
|
||||
Everything in Talos is there because it is necessary, and
|
||||
nothing is included which isn't.
|
||||
|
||||
As a result, the OS right now produces a SquashFS image size of less than **80 MB**.
|
||||
@ -40,7 +39,7 @@ As a result, the OS right now produces a SquashFS image size of less than **80 M
|
||||
## Ephemeral
|
||||
|
||||
Everything Talos writes to its disk is either replicated or reconstructable.
|
||||
Since the controlplane is high availability, the loss of any node will cause
|
||||
Since the controlplane is highly available, the loss of any node will cause
|
||||
neither service disruption nor loss of data.
|
||||
No writes are even allowed to the vast majority of the filesystem.
|
||||
We even call the writable partition "ephemeral" to keep this idea always in
|
||||
@ -52,7 +51,7 @@ Talos has always been designed with security in mind.
|
||||
With its immutability, its minimalism, its signing, and its componenture, we are
|
||||
able to simply bypass huge classes of vulnerabilities.
|
||||
Moreover, because of the way we have designed Talos, we are able to take
|
||||
advantage of a number of additional settings, such as the recommendations of the Kernel Self Protection Project (kspp) and the complete disablement of dynamic modules.
|
||||
advantage of a number of additional settings, such as the recommendations of the Kernel Self Protection Project (kspp) and completely disabling dynamic modules.
|
||||
|
||||
There are no passwords in Talos.
|
||||
All networked communication is encrypted and key-authenticated.
|
||||
@ -62,7 +61,7 @@ enforced.
|
||||
|
||||
## Declarative
|
||||
|
||||
Everything which can be configured in Talos is done so through a single YAML
|
||||
Everything which can be configured in Talos is done through a single YAML
|
||||
manifest.
|
||||
There is no scripting and no procedural steps.
|
||||
Everything is defined by the one declarative YAML file.
|
||||
@ -70,4 +69,43 @@ This configuration includes that of both Talos itself and the Kubernetes which
|
||||
it forms.
|
||||
|
||||
This is achievable because Talos is tightly focused to do one thing: run
|
||||
kubernetes, in the easiest, most secure, most reliable way it can.
|
||||
Kubernetes, in the easiest, most secure, most reliable way it can.
|
||||
|
||||
|
||||
## Not based on X distro
|
||||
|
||||
Talos Linux _isn't_ based on any other distribution.
|
||||
We think of ourselves as being the second-generation of
|
||||
container-optimised operating systems, where things like CoreOS, Flatcar, and Rancher represent the first generation (but the technology is not derived from any of those.)
|
||||
|
||||
Talos Linux is actually a ground-up rewrite of the userspace, from PID 1.
|
||||
We run the Linux kernel, but everything downstream of that is our own custom
|
||||
code, written in Go, rigorously-tested, and published as an immutable,
|
||||
integrated image.
|
||||
The Linux kernel launches what we call `machined`, for instance, not `systemd`.
|
||||
There is no `systemd` on our system.
|
||||
There are no GNU utilities, no shell, no SSH, no packages, nothing you could associate with
|
||||
any other distribution.
|
||||
|
||||
## An Operating System designed for Kubernetes
|
||||
|
||||
Technically, Talos Linux installs to a computer like any other operating system.
|
||||
_Unlike_ other operating systems, Talos is not meant to run alone, on a
|
||||
single machine.
|
||||
A design goal of Talos Linux is eliminating the management
|
||||
of individual nodes as much as possible.
|
||||
In order to do that, Talos Linux operates as a cluster of machines, with lots of
|
||||
checking and coordination between them, at all levels.
|
||||
|
||||
There is only a cluster.
|
||||
Talos is meant to do one thing: maintain a Kubernetes cluster, and it does this
|
||||
very, very well.
|
||||
|
||||
The entirety of the configuration of any machine is specified by a single
|
||||
configuration file, which can often be the _same_ configuration file used
|
||||
across _many_ machines.
|
||||
Much like a biological system, if some component misbehaves, just cut it out and
|
||||
let a replacement grow.
|
||||
Rebuilds of Talos are remarkably fast, whether they be new machines, upgrades,
|
||||
or reinstalls.
|
||||
Never get hung up on an individual machine.
|
||||
|
@ -4,8 +4,7 @@ weight: 110
|
||||
description: "The design and use of the Talos Linux control application."
|
||||
---
|
||||
|
||||
The `talosctl` tool packs a lot of power into a small package.
|
||||
It acts as a reference implementation for the Talos API, but it also handles a lot of
|
||||
The `talosctl` tool acts as a reference implementation for the Talos API, but it also handles a lot of
|
||||
conveniences for the use of Talos and its clusters.
|
||||
|
||||
### Video Walkthrough
|
||||
@ -21,7 +20,7 @@ Otherwise it is in `$HOME/.talos/config`.
|
||||
The location can always be overridden by the `TALOSCONFIG` environment variable or the `--talosconfig` parameter.
|
||||
|
||||
Like `kubectl`, `talosctl` uses the concept of configuration contexts, so any number of Talos clusters can be managed with a single configuration file.
|
||||
Unlike `kubectl`, it also comes with some intelligent tooling to manage the merging of new contexts into the config.
|
||||
It also comes with some intelligent tooling to manage the merging of new contexts into the config.
|
||||
The default operation is a non-destructive merge, where if a context of the same name already exists in the file, the context to be added is renamed by appending an index number.
|
||||
You can easily overwrite instead, as well.
|
||||
See the `talosctl config help` for more information.
|
||||
@ -30,21 +29,21 @@ See the `talosctl config help` for more information.
|
||||
|
||||

|
||||
|
||||
The `endpoints` are the communication endpoints to which the client directly talks.
|
||||
`endpoints` are the communication endpoints to which the client directly talks.
|
||||
These can be load balancers, DNS hostnames, a list of IPs, etc.
|
||||
Further, if multiple endpoints are specified, the client will automatically load
|
||||
If multiple endpoints are specified, the client will automatically load
|
||||
balance and fail over between them.
|
||||
In general, it is recommended that these point to the set of control plane nodes, either directly or through a reverse proxy or load balancer.
|
||||
It is recommended that these point to the set of control plane nodes, either directly or through a load balancer.
|
||||
|
||||
Each endpoint will automatically proxy requests destined to another node through it, so it is not necessary to change the endpoint configuration just because you wish to talk to a different node within the cluster.
|
||||
|
||||
Endpoints _do_, however, need to be members of the same Talos cluster as the target node, because these proxied connections rely on certificate-based authentication.
|
||||
|
||||
The `node` is the target node on which you wish to perform the API call.
|
||||
While you can configure the target node (or even set of target nodes) inside the 'talosctl' configuration file, it is often useful to simply and explicitly declare the target node(s) using the `-n` or `--nodes` command-line parameter.
|
||||
While you can configure the target node (or even a set of target nodes) inside the `talosctl` configuration file, it is recommended that you instead declare the target node(s) explicitly using the `-n` or `--nodes` command-line parameter.
|
||||
|
||||
Keep in mind, when specifying nodes that their IPs and/or hostnames are as seen by the endpoint servers, not as from the client.
|
||||
This is because all connections are proxied first through the endpoints.
|
||||
> When specifying nodes, use their IPs and/or hostnames as seen by the endpoint servers, not as seen from the client.
|
||||
> This is because all connections are proxied first through the endpoints.
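
To illustrate how contexts, endpoints, and nodes fit together, here is a minimal sketch of a `talosctl` configuration file. The context name, the IP addresses, and the credential placeholders are assumptions chosen for illustration, not values from a real cluster.

```yaml
# Hypothetical talosconfig sketch; names and addresses are placeholders.
context: my-cluster              # the currently selected context
contexts:
  my-cluster:
    endpoints:                   # control plane nodes the client talks to directly
      - 192.168.1.10
      - 192.168.1.11
      - 192.168.1.12
    nodes:                       # optional default target(s); -n/--nodes is usually preferable
      - 192.168.1.20
    ca: "<base64-encoded CA certificate>"
    crt: "<base64-encoded client certificate>"
    key: "<base64-encoded client key>"
```

With a file like this, a command such as `talosctl -n 192.168.1.21 version` would still connect through the listed endpoints, but target a different node, addressed as the endpoints see it.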
|
||||
|
||||
## Kubeconfig
|
||||
|
||||
|
@ -1,10 +0,0 @@
|
||||
---
|
||||
title: "Platform"
|
||||
description: "Visualization of the bootstrap process on bare metal machines."
|
||||
---
|
||||
|
||||
### Metal
|
||||
|
||||
Below is an image to visualize the process of bootstrapping nodes.
|
||||
|
||||
<img src="/images/metal-overview.png" width="950">
|
@ -146,8 +146,6 @@ Once selected, you need to assign to following:
|
||||
This will provision the Stage and Bootenv with the talos values.
|
||||
Once this is done, you can boot the machine.
|
||||
|
||||
To understand the boot process, we have a higher level overview located at [metal overview]({{< relref "../../../reference/platform" >}}).
|
||||
|
||||
### Bootstrap Etcd
|
||||
|
||||
To configure `talosctl` we will need the first control plane node's IP:
|
||||
|
@ -5,30 +5,26 @@ aliases:
|
||||
- ../../guides/vip
|
||||
---
|
||||
|
||||
One of the biggest pain points when building a high-availability controlplane
|
||||
One of the pain points when building a high-availability controlplane
|
||||
is giving clients a single IP or URL at which they can reach any of the controlplane nodes.
|
||||
The most common approaches all require external resources: reverse proxy, load
|
||||
balancer, BGP, and DNS.
|
||||
The most common approaches - reverse proxy, load
|
||||
balancer, BGP, and DNS - all require external resources, and add complexity in setting up Kubernetes.
|
||||
|
||||
Using a "Virtual" IP address, on the other hand, provides high availability
|
||||
without external coordination or resources, so long as the controlplane members
|
||||
share a layer 2 network.
|
||||
In practical terms, this means that they are all connected via a switch, with no
|
||||
router in between them.
|
||||
To simplify cluster creation, Talos Linux supports a "Virtual" IP (VIP) address to access the Kubernetes API server, providing high availability with no other resources required.
|
||||
|
||||
The term "virtual" is misleading here.
|
||||
The IP address is real, and it is assigned to an interface.
|
||||
Instead, what actually happens is that the controlplane machines vie for
|
||||
control of the shared IP address using etcd elections.
|
||||
There can be only one owner of the IP address at any given time, but if that
|
||||
owner disappears or becomes non-responsive, another owner will be chosen,
|
||||
and it will take up the mantle: the IP address.
|
||||
What happens is that the controlplane machines vie for control of the shared IP address using etcd elections.
|
||||
There can be only one owner of the IP address at any given time.
|
||||
If that owner disappears or becomes non-responsive, another owner will be chosen,
|
||||
and it will take up the IP address.
|
||||
|
||||
Talos has (as of version 0.9) built-in support for this form of shared IP address,
|
||||
and it can utilize this for both the Kubernetes API server and the Talos endpoint set.
|
||||
Talos uses `etcd` for elections and leadership (control) of the IP address.
|
||||
It is not recommended to use a virtual IP to access the API of Talos itself, since the
|
||||
node using the shared IP is not deterministic and could change.
|
||||
### Requirements
|
||||
|
||||
The controlplane nodes must share a layer 2 network, and the virtual IP must be assigned from that shared network subnet.
|
||||
In practical terms, this means that they are all connected via a switch, with no router in between them.
|
||||
Note that the virtual IP election depends on `etcd` being up, as Talos uses `etcd` for elections and leadership (control) of the IP address.
|
||||
|
||||
The virtual IP is not restricted by ports - you can access any port that the control plane nodes are listening on, on that IP address.
|
||||
Thus it *is* possible to access the Talos API over the VIP, but it is *not recommended*, as you cannot access the VIP when etcd is down - and then you could not access the Talos API to recover etcd.
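
As a minimal sketch of this split, the machine configuration can point the Kubernetes API endpoint at the VIP, while `talosctl` endpoints continue to list the individual control plane nodes. The address `192.168.1.15` is assumed here to be the shared IP, and `6443` is the default Kubernetes API server port.

```yaml
# Machine configuration fragment (sketch): Kubernetes clients reach the API server via the VIP.
# 192.168.1.15 is an assumed example address for the shared IP.
cluster:
  controlPlane:
    endpoint: https://192.168.1.15:6443
```

Keeping the Talos API endpoints on the nodes' own addresses means the Talos API stays reachable even when etcd, and therefore the VIP, is down.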
|
||||
|
||||
## Video Walkthrough
|
||||
|
||||
@ -38,8 +34,7 @@ To see a live demo of this writeup, see the video below:
|
||||
|
||||
## Choose your Shared IP
|
||||
|
||||
To begin with, you should choose your shared IP address.
|
||||
It should generally be a reserved, unused IP address in the same subnet as
|
||||
The Virtual IP should be a reserved, unused IP address in the same subnet as
|
||||
your controlplane nodes.
|
||||
It should not be assigned or assignable by your DHCP server.
|
||||
|
||||
@ -88,16 +83,14 @@ machine:
|
||||
ip: 192.168.1.15
|
||||
```
|
||||
|
||||
Obviously, for your own environment, the interface and the DHCP setting may
|
||||
differ.
|
||||
You are free to use static addressing (`cidr`) instead of DHCP.
|
||||
For your own environment, the interface and the DHCP setting may differ, or you may
|
||||
use static addressing (`cidr`) instead of DHCP.
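
For the static-addressing case, a configuration along these lines may serve as a starting point. The interface name `eth0` and the node address `192.168.1.10/24` are assumptions for illustration. This sketch uses the `addresses` field; the text above refers to the older `cidr` field, so check which one your Talos version expects.

```yaml
# Sketch of a control plane node with a static address and a shared (virtual) IP.
# Interface name and addresses are placeholders.
machine:
  network:
    interfaces:
      - interface: eth0
        addresses:
          - 192.168.1.10/24      # this node's own static address
        vip:
          ip: 192.168.1.15       # shared IP, in the same subnet, outside the DHCP range
```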
|
||||
|
||||
## Caveats
|
||||
|
||||
In general, the shared IP should just work.
|
||||
However, since it relies on `etcd` for elections, the shared IP will not come
|
||||
Since VIP functionality relies on `etcd` for elections, the shared IP will not come
|
||||
alive until after you have bootstrapped Kubernetes.
|
||||
In general, this is not a problem, but it does mean that you cannot use the
|
||||
shared IP when issuing the `talosctl bootstrap` command.
|
||||
Instead, that command will need to target one of the controlplane nodes
|
||||
discretely.
|
||||
This means that you cannot use the
|
||||
shared IP when issuing the `talosctl bootstrap` command (although, as noted above, it is not recommended to access the Talos API via the VIP).
|
||||
Instead, the `bootstrap` command will need to target one of the controlplane nodes
|
||||
directly.
|
||||
|