This commit fixes a bug in our HA ingress reconciler where ingress resources would
get stuck in a deleting state if their associated VIP service had already been
deleted from control.
The reconciliation loop checked for the existence of the VIP service and, if it was
not found, performed no further cleanup steps. The code has been modified to continue
with the remaining cleanup even if the VIP service is not found.
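A minimal sketch of the fixed control flow, assuming hypothetical helper names
(lookupVIPService, cleanupIngress) rather than the reconciler's real functions:

```go
package reconciler

import (
	"context"
	"errors"
	"fmt"
)

// errNotFound stands in for the not-found error returned by the Tailscale API
// client when the VIP service has already been deleted from control.
var errNotFound = errors.New("vip service not found")

func lookupVIPService(ctx context.Context, name string) error { return errNotFound }
func cleanupIngress(ctx context.Context, name string) error   { return nil }

// maybeCleanup used to return early when the VIP service was missing, leaving
// the Ingress stuck with its finalizer. A not-found result is now treated as
// "already deleted" and the remaining cleanup still runs.
func maybeCleanup(ctx context.Context, name string) error {
	if err := lookupVIPService(ctx, name); err != nil {
		if !errors.Is(err, errNotFound) {
			return fmt.Errorf("looking up VIP service: %w", err)
		}
		// VIP service is already gone from control; fall through to cleanup.
	}
	return cleanupIngress(ctx, name)
}
```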
Fixes: https://github.com/tailscale/tailscale/issues/18049
Signed-off-by: David Bond <davidsbond93@gmail.com>
Adds a new reconciler for ProxyGroups of type kube-apiserver that will
provision a Tailscale Service for each replica to advertise. Adds two
new condition types to the ProxyGroup, TailscaleServiceValid and
TailscaleServiceConfigured, to post updates on the state of that
reconciler in a way that's consistent with the service-pg reconciler.
The created Tailscale Service name is configurable via a new ProxyGroup
field spec.kubeAPIServer.ServiceName, which expects a string of the
form "svc:<dns-label>".
Lots of supporting changes were needed to implement this in a way that's
consistent with other operator workflows, including:
* Pulled containerboot's ensureServicesUnadvertised and certManager into
kube/ libraries to be shared with k8s-proxy. Use those in k8s-proxy to
aid Service cert sharing between replicas and graceful Service shutdown.
* For certManager, add an initial wait to the cert loop until the domain
appears in the device's netmap, to avoid a guaranteed error on the first
issuance attempt when the proxy starts quickly.
* Made several methods in ingress-for-pg.go and svc-for-pg.go into
functions to share with the new reconciler.
* Added a Resource struct to the owner refs stored in Tailscale Service
annotations to be able to distinguish between Ingress- and ProxyGroup-
based Services that need cleaning up in the Tailscale API (see the
sketch after this list).
* Added a ListVIPServices method to the internal tailscale client to aid
cleaning up orphaned Services.
* Support for reading config from a kube Secret, and partial support for
config reloading, to avoid having to force Pod restarts when config
changes.
* Fixed up the zap logger so it's possible to set the debug log level.
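As referenced in the owner-refs bullet above, a rough sketch of the annotation
shape; the struct and field names are illustrative only and may differ from the
operator's real types:

```go
package owners

import "encoding/json"

// OwnerRef records one operator that co-owns a Tailscale Service. Resource
// identifies the kind of Kubernetes resource behind it ("Ingress" or
// "ProxyGroup"), so cleanup in the Tailscale API can tell them apart.
type OwnerRef struct {
	OperatorID string `json:"operatorID"`
	Resource   string `json:"resource,omitempty"`
}

// ownerAnnotation is the JSON value stored in the Tailscale Service annotation.
type ownerAnnotation struct {
	OwnerRefs []OwnerRef `json:"ownerRefs"`
}

func marshalOwners(refs []OwnerRef) (string, error) {
	b, err := json.Marshal(ownerAnnotation{OwnerRefs: refs})
	return string(b), err
}
```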
Updates #13358
Change-Id: Ia9607441157dd91fb9b6ecbc318eecbef446e116
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Adds a new k8s-proxy command to convert operator's in-process proxy to
a separately deployable type of ProxyGroup: kube-apiserver. k8s-proxy
reads in a new config file written by the operator, modelled on tailscaled's
conffile but with some modifications to ensure multiple versions of the
config can co-exist within a file. This should make it much easier to
support reading that config file from a Kube Secret with a stable file name.
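A rough sketch of how versioned configs might co-exist in one file with a
stable name; the key and type names below are assumptions for illustration,
not the actual k8s-proxy config schema:

```go
package conf

import "encoding/json"

// VersionedConfig carries one payload per config schema version, keyed by a
// version string such as "v1alpha1". A proxy picks the newest version it
// understands and ignores the rest, so old and new operators can write to the
// same Secret-mounted file without breaking each other.
type VersionedConfig struct {
	Versions map[string]json.RawMessage `json:"versions"`
}

// latestSupported returns the first payload found, with supported ordered
// newest-first.
func latestSupported(c VersionedConfig, supported []string) (json.RawMessage, bool) {
	for _, v := range supported {
		if raw, ok := c.Versions[v]; ok {
			return raw, true
		}
	}
	return nil, false
}
```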
To avoid needing to give the operator ClusterRole{,Binding} permissions,
the helm chart now optionally deploys a new static ServiceAccount for
the API Server proxy to use when running in auth mode.
Proxies deployed by kube-apiserver ProxyGroups currently work the same as
the operator's in-process proxy. They do not yet leverage Tailscale Services
for presenting a single HA DNS name.
Updates #13358
Change-Id: Ib6ead69b2173c5e1929f3c13fb48a9a5362195d8
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit modifies the k8s operator to allow for customisation of the ingress class name
via a new `OPERATOR_INGRESS_CLASS_NAME` environment variable. For backwards compatibility,
this defaults to `tailscale`.
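A minimal sketch of how the operator might resolve the class name, assuming a
plain environment lookup with the backwards-compatible default:

```go
package main

import (
	"fmt"
	"os"
)

func ingressClassName() string {
	if v := os.Getenv("OPERATOR_INGRESS_CLASS_NAME"); v != "" {
		return v
	}
	return "tailscale" // default keeps existing behaviour
}

func main() {
	fmt.Println("using ingress class:", ingressClassName())
}
```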
When using helm, a new `ingress.name` value is provided that will set this environment variable
and modify the name of the deployed `IngressClass` resource.
Fixes https://github.com/tailscale/tailscale/issues/16248
Signed-off-by: David Bond <davidsbond93@gmail.com>
Previously, the operator checked the ProxyGroup status fields for
information on how many of the proxies had successfully authed. Use
their state Secrets instead as a more reliable source of truth.
containerboot has written device_fqdn and device_ips keys to the
state Secret since inception, and pod_uid since 1.78.0, so there's
no need to use the API for that data. Read it from the state Secret
for consistency. However, to ensure we don't read data from a
previous run of containerboot, make sure we reset containerboot's
state keys on startup.
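A sketch of reading those keys from the state Secret with client-go types; the
key names come from this message, while the helper and struct are illustrative:

```go
package proxygroup

import corev1 "k8s.io/api/core/v1"

// proxyState is the subset of containerboot state the operator cares about.
type proxyState struct {
	FQDN    string // device_fqdn
	IPsJSON string // device_ips, stored as a JSON list; left raw here
	PodUID  string // pod_uid, written since 1.78.0
}

func stateFromSecret(s *corev1.Secret) proxyState {
	return proxyState{
		FQDN:    string(s.Data["device_fqdn"]),
		IPsJSON: string(s.Data["device_ips"]),
		PodUID:  string(s.Data["pod_uid"]),
	}
}
```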
One other knock-on effect of that is ProxyGroups can briefly be
marked not Ready while a Pod is restarting. Introduce a new
ProxyGroupAvailable condition to more accurately reflect
when downstream controllers can implement flows that rely on a
ProxyGroup having at least 1 proxy Pod running.
Fixes #16327
Change-Id: I026c18e9d23e87109a471a87b8e4fb6271716a66
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Ensure that if the ProxyGroup for HA Ingress changes, the TLS Secret
and Role and RoleBinding that allow proxies to read/write to it are
updated.
Fixes #16259
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Validate that any tags that users have specified via tailscale.com/tags
annotation are valid Tailscale ACL tags.
Validate that no more than one HA Tailscale Kubernetes Service in a single cluster
refers to the same Tailscale Service.
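An illustrative check of the tag format; the actual ACL tag rules enforced by
the operator are stricter than this sketch:

```go
package validation

import (
	"fmt"
	"strings"
)

// validateTag checks that a value from the tailscale.com/tags annotation at
// least looks like an ACL tag, i.e. "tag:<name>".
func validateTag(tag string) error {
	name, ok := strings.CutPrefix(tag, "tag:")
	if !ok {
		return fmt.Errorf("tag %q must start with %q", tag, "tag:")
	}
	if name == "" || strings.ContainsAny(name, " \t") {
		return fmt.Errorf("tag %q has an empty or invalid name", tag)
	}
	return nil
}
```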
Updates tailscale/tailscale#16054
Updates tailscale/tailscale#16035
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This reconciler allows users to make applications highly available at L3 by
leveraging Tailscale Virtual Services. Many Kubernetes Services
(irrespective of the cluster they reside in) can be mapped to a
Tailscale Virtual Service, allowing access to these Services at L3.
Updates #15895
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
cmd/{k8s-operator,containerboot}: check TLS cert before advertising VIPService
- Ensures that Ingress status does not advertise port 443 before the
TLS cert has been issued.
- Ensures that Ingress backends do not advertise a VIPService
before the TLS cert has been issued, unless the service also
exposes port 80 (see the sketch below).
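A compact sketch of that decision; the function and parameters are illustrative:

```go
package ingress

// shouldAdvertise reports whether the VIPService for an Ingress backend may be
// advertised yet: either its TLS cert has been issued, or the service also
// exposes plain HTTP on port 80 so clients aren't blocked on issuance.
func shouldAdvertise(certIssued, exposesPort80 bool) bool {
	return certIssued || exposesPort80
}
```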
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Switch from using the Comment field to a ts-scoped annotation for
tracking which operators are cooperating over ownership of a
VIPService.
Updates tailscale/corp#24795
Change-Id: I72d4a48685f85c0329aa068dc01a1a3c749017bf
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
cmd/k8s-operator: configure HA Ingress replicas to share certs
Creates TLS certs Secret and RBAC that allows HA Ingress replicas
to read/write to the Secret.
Configures HA Ingress replicas to run in read-only mode.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Update the HA Ingress controller to wait until it sees AdvertisedServices
config propagated into at least 1 Pod's prefs before it updates the status
on the Ingress, to ensure the ProxyGroup Pods are ready to serve traffic
before indicating that the Ingress is ready.
Updates tailscale/corp#24795
Change-Id: I1b8ce23c9e312d08f9d02e48d70bdebd9e1a4757
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
When the Ingress is updated to a new hostname, the controller does not
currently clean up the old VIPService from control. Fix this up to parse
the ownership comment correctly, and add a test to enforce the improved
behaviour.
Updates tailscale/corp#24795
Change-Id: I792ae7684807d254bf2d3cc7aa54aa04a582d1f5
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
cmd/k8s-operator: ensure HA Ingress can operate in multicluster mode.
Update the owner reference mechanism so that:
- if, during HA Ingress resource creation, a VIPService
with some other operator's owner reference is already found,
just update the owner references to add one for this operator
- if, during HA Ingress deletion, the VIPService is found to have owner
reference(s) from another operator, don't delete the VIPService, just
remove this operator's owner reference (sketched below)
- requeue after HA Ingress reconciles that result in VIPService updates,
to guard against overwrites due to concurrent operations from different
clusters.
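As mentioned in the deletion bullet above, a sketch of the add/remove
behaviour, treating owner refs as opaque per-operator identifiers for
illustration:

```go
package owners

// addOwner records this operator as an owner without disturbing refs written
// by operators in other clusters.
func addOwner(refs []string, self string) []string {
	for _, r := range refs {
		if r == self {
			return refs
		}
	}
	return append(refs, self)
}

// removeOwner drops only this operator's ref; the caller deletes the
// VIPService only if the returned slice is empty, i.e. no other cluster
// still owns it.
func removeOwner(refs []string, self string) []string {
	out := make([]string, 0, len(refs))
	for _, r := range refs {
		if r != self {
			out = append(out, r)
		}
	}
	return out
}
```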
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Now that packets flow for VIPServices, the last piece needed to start
serving them from a ProxyGroup is config to tell the proxy Pods which
services they should advertise.
Updates tailscale/corp#24795
Change-Id: Ic7bbeac8e93c9503558107bc5f6123be02a84c77
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This change:
- reinstates the HA Ingress controller that was disabled for 1.80 release
- fixes the API calls to manage VIPServices as the API was changed
- triggers the HA Ingress reconciler on ProxyGroup changes
Updates tailscale/tailscale#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Rather than using a string everywhere and needing to clarify that the
string should have the svc: prefix, create a separate type for Service
names.
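A minimal sketch of what such a type can look like; the real type in the
Tailscale codebase may differ in name and method set:

```go
package tailcfgsketch

import (
	"fmt"
	"strings"
)

// ServiceName is a Tailscale Service name that always carries the "svc:"
// prefix, so call sites no longer need to document that convention.
type ServiceName string

func (n ServiceName) Validate() error {
	if !strings.HasPrefix(string(n), "svc:") {
		return fmt.Errorf("service name %q lacks the %q prefix", n, "svc:")
	}
	return nil
}

// WithoutPrefix returns the bare DNS label, e.g. "web" for "svc:web".
func (n ServiceName) WithoutPrefix() string {
	return strings.TrimPrefix(string(n), "svc:")
}
```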
Updates tailscale/corp#24607
Change-Id: I720e022f61a7221644bb60955b72cacf42f59960
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
cmd/k8s-operator: add logic to parse L7 Ingresses in HA mode
- Wrap the Tailscale API client used by the Kubernetes Operator
into a client that knows how to manage VIPServices.
- Create/Delete VIPServices and update serve config for L7 Ingresses
backed by a ProxyGroup.
- Ensure that ingress ProxyGroup proxies mount serve config from a shared ConfigMap.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>