Removes ACL edits from e2e tests in favour of trying to simplify the
tests and separate the actual test logic from the environment setup
logic as much as possible. Also aims to fit in with the requirements
that will generally be filled anyway for most devs working on the
operator; in particular using tags that fit in with our documentation.
Updates tailscale/corp#32085
Change-Id: I7659246e39ec0b7bcc4ec0a00c6310f25fe6fac2
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This is step 4 of making syspolicy a build-time feature.
This adds a policyclient.Get() accessor to return the correct
implementation to use: either the real one, or the no-op one. (A third
type, a static one for testing, also exists, so in general a
policyclient.Client should be plumbed around and not always fetched
via policyclient.Get whenever possible, especially if tests need to use
alternate syspolicy)
Updates #16998
Updates #12614
Change-Id: Iaf19670744a596d5918acfa744f5db4564272978
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Step 3 in the series. See earlier cc532efc2000 and d05e6dc09e.
This step moves some types into a new leaf "ptype" package out of the
big "settings" package. The policyclient.Client will later get new
methods to return those things (as well as Duration and Uint64, which
weren't done at the time of the earlier prototype).
Updates #16998
Updates #12614
Change-Id: I4d72d8079de3b5351ed602eaa72863372bd474a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit adds a `replicas` field to the `Connector` custom resource that
allows users to specify the number of desired replicas deployed for their
connectors.
This allows users to deploy exit nodes, subnet routers and app connectors
in a highly available fashion.
Fixes#14020
Signed-off-by: David Bond <davidsbond93@gmail.com>
This is step 2 of ~4, breaking up #14720 into reviewable chunks, with
the aim to make syspolicy be a build-time configurable feature.
Step 1 was #16984.
In this second step, the util/syspolicy/policyclient package is added
with the policyclient.Client interface. This is the interface that's
always present (regardless of build tags), and is what code around the
tree uses to ask syspolicy/MDM questions.
There are two implementations of policyclient.Client for now:
1) NoPolicyClient, which only returns default values.
2) the unexported, temporary 'globalSyspolicy', which is implemented
in terms of the global functions we wish to later eliminate.
This then starts to plumb around the policyclient.Client to most callers.
Future changes will plumb it more. When the last of the global func
callers are gone, then we can unexport the global functions and make a
proper policyclient.Client type and constructor in the syspolicy
package, removing the globalSyspolicy impl out of tsd.
The final change will sprinkle build tags in a few more places and
lock it in with dependency tests to make sure the dependencies don't
later creep back in.
Updates #16998
Updates #12614
Change-Id: Ib2c93d15c15c1f2b981464099177cd492d50391c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is step 1 of ~3, breaking up #14720 into reviewable chunks, with
the aim to make syspolicy be a build-time configurable feature.
In this first (very noisy) step, all the syspolicy string key
constants move to a new constant-only (code-free) package. This will
make future steps more reviewable, without this movement noise.
There are no code or behavior changes here.
The future steps of this series can be seen in #14720: removing global
funcs from syspolicy resolution and using an interface that's plumbed
around instead. Then adding build tags.
Updates #12614
Change-Id: If73bf2c28b9c9b1a408fe868b0b6a25b03eeabd1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
DERP writes go via TCP and the host OS will have plenty of buffer space.
We've observed in the wild with a backed up TCP socket kernel side
buffers of >2.4MB. The DERP internal queue being larger causes an
increase in the probability that the contents of the backbuffer are
"dead letters" - packets that were assumed to be lost.
A first step to improvement is to size this queue only large enough to
avoid some of the initial connect stall problem, but not large enough
that it is contributing in a substantial way to buffer bloat /
dead-letter retention.
Updates tailscale/corp#31762
Signed-off-by: James Tucker <james@tailscale.com>
I need a ringbuffer in the more traditional sense, one that has a notion
of item removal as well as tail loss on overrun. This implementation is
really a clearable log window, and is used as such where it is used.
Updates #cleanup
Updates tailscale/corp#31762
Signed-off-by: James Tucker <james@tailscale.com>
* cmd/k8s-operator,k8s-operator: allow setting a `priorityClassName`
Fixes#16682
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
* Update k8s-operator/apis/v1alpha1/types_proxyclass.go
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com>
* run make kube-generate-all
Change-Id: I5f8f16694fdc181b048217b9f05ec2ee2aa04def
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
---------
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
This update introduces support for DNS records associated with ProxyGroup egress services, ensuring that the ClusterIP Service IP is used instead of Pod IPs.
Fixes#15945
Signed-off-by: Raj Singh <raj@tailscale.com>
* Modifies the k8s-proxy to expose health check and metrics
endpoints on the Pod's IP.
* Moves cmd/containerboot/healthz.go and cmd/containerboot/metrics.go to
/kube to be shared with /k8s-proxy.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
Updates k8s-proxy's config so its auth mode config matches that we set
in kube-apiserver ProxyGroups for consistency.
Updates #13358
Change-Id: I95e29cec6ded2dc7c6d2d03f968a25c822bc0e01
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit modifies the kubernetes operator's `DNSConfig` resource
with the addition of a new field at `nameserver.service.clusterIP`.
This field allows users to specify a static in-cluster IP address of
the nameserver when deployed.
Fixes#14305
Signed-off-by: David Bond <davidsbond93@gmail.com>
Adds a new reconciler for ProxyGroups of type kube-apiserver that will
provision a Tailscale Service for each replica to advertise. Adds two
new condition types to the ProxyGroup, TailscaleServiceValid and
TailscaleServiceConfigured, to post updates on the state of that
reconciler in a way that's consistent with the service-pg reconciler.
The created Tailscale Service name is configurable via a new ProxyGroup
field spec.kubeAPISserver.ServiceName, which expects a string of the
form "svc:<dns-label>".
Lots of supporting changes were needed to implement this in a way that's
consistent with other operator workflows, including:
* Pulled containerboot's ensureServicesUnadvertised and certManager into
kube/ libraries to be shared with k8s-proxy. Use those in k8s-proxy to
aid Service cert sharing between replicas and graceful Service shutdown.
* For certManager, add an initial wait to the cert loop to wait until
the domain appears in the devices's netmap to avoid a guaranteed error
on the first issue attempt when it's quick to start.
* Made several methods in ingress-for-pg.go and svc-for-pg.go into
functions to share with the new reconciler
* Added a Resource struct to the owner refs stored in Tailscale Service
annotations to be able to distinguish between Ingress- and ProxyGroup-
based Services that need cleaning up in the Tailscale API.
* Added a ListVIPServices method to the internal tailscale client to aid
cleaning up orphaned Services
* Support for reading config from a kube Secret, and partial support for
config reloading, to prevent us having to force Pod restarts when
config changes.
* Fixed up the zap logger so it's possible to set debug log level.
Updates #13358
Change-Id: Ia9607441157dd91fb9b6ecbc318eecbef446e116
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit modifies the k8s-operator and k8s-proxy to support passing down
the accept-routes configuration from the proxy class as a configuration value
read and used by the k8s-proxy when ran as a distinct container managed by
the operator.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit modifies the k8s proxy application configuration to include a
new field named `ServerURL` which, when set, modifies
the tailscale coordination server used by the proxy. This works in the same
way as the operator and the proxies it deploys.
If unset, the default coordination server is used.
Updates https://github.com/tailscale/tailscale/issues/13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit modifies the operator to detect the usage of k8s-apiserver
type proxy groups that wish to use the letsencrypt staging directory and
apply the appropriate environment variable to the statefulset it
produces.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
The observed generation was set to always 0 in #16429, but this had the
knock-on effect of other controllers considering ProxyGroups never ready
because the observed generation is never up to date in
proxyGroupCondition. Make sure the ProxyGroupAvailable function does not
requires the observed generation to be up to date, and add testing
coverage to catch regressions.
Updates #16327
Change-Id: I42f50ad47dd81cc2d3c3ce2cd7b252160bb58e40
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Adds a new k8s-proxy command to convert operator's in-process proxy to
a separately deployable type of ProxyGroup: kube-apiserver. k8s-proxy
reads in a new config file written by the operator, modelled on tailscaled's
conffile but with some modifications to ensure multiple versions of the
config can co-exist within a file. This should make it much easier to
support reading that config file from a Kube Secret with a stable file name.
To avoid needing to give the operator ClusterRole{,Binding} permissions,
the helm chart now optionally deploys a new static ServiceAccount for
the API Server proxy to use if in auth mode.
Proxies deployed by kube-apiserver ProxyGroups currently work the same as
the operator's in-process proxy. They do not yet leverage Tailscale Services
for presenting a single HA DNS name.
Updates #13358
Change-Id: Ib6ead69b2173c5e1929f3c13fb48a9a5362195d8
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Based on feedback that it wasn't clear what the user is meant to do with
the output of the last command, clarify that it's an optional command to
explore what got created.
Updates #13427
Change-Id: Iff64ec6d02dc04bf4bbebf415d7ed1a44e7dd658
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit modifies the k8s operator to allow for customisation of the ingress class name
via a new `OPERATOR_INGRESS_CLASS_NAME` environment variable. For backwards compatibility,
this defaults to `tailscale`.
When using helm, a new `ingress.name` value is provided that will set this environment variable
and modify the name of the deployed `IngressClass` resource.
Fixes https://github.com/tailscale/tailscale/issues/16248
Signed-off-by: David Bond <davidsbond93@gmail.com>
Refactors setting status into its own top-level function to make it
easier to ensure we _always_ set the status if it's changed on every
reconcile. Previously, it was possible to have stale status if some
earlier part of the provision logic failed.
Updates #16327
Change-Id: Idab0cfc15ae426cf6914a82f0d37a5cc7845236b
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit modifies the operator helm chart values to bring the newly
added `loginServer` field to the top level. We felt as though it was a bit
confusing to be at the `operatorConfig` level as this value modifies the
behaviour or the operator, api server & all resources that the operator
manages.
Updates https://github.com/tailscale/corp/issues/29847
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit modifies the kubernetes operator to allow for customisation of the tailscale
login url. This provides some data locality for people that want to configure it.
This value is set in the `loginServer` helm value and is propagated down to all resources
managed by the operator. The only exception to this is recorder nodes, where additional
changes are required to support modifying the url.
Updates https://github.com/tailscale/corp/issues/29847
Signed-off-by: David Bond <davidsbond93@gmail.com>
* cmd/k8s-operator: ProxyClass annotation for Services and Ingresses
Previously, the ProxyClass could only be configured for Services and
Ingresses via a Label. This adds the ability to set it via an
Annotation, but prioritizes the Label if both a Label and Annotation are
set.
Updates #14323
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
* Update cmd/k8s-operator/operator.go
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
* Update cmd/k8s-operator/operator.go
Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
* cmd/k8s-operator: ProxyClass annotation for Services and Ingresses
Previously, the ProxyClass could only be configured for Services and
Ingresses via a Label. This adds the ability to set it via an
Annotation, but prioritizes the Label if both a Label and Annotation are
set.
Updates #14323
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
---------
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Previously, the operator checked the ProxyGroup status fields for
information on how many of the proxies had successfully authed. Use
their state Secrets instead as a more reliable source of truth.
containerboot has written device_fqdn and device_ips keys to the
state Secret since inception, and pod_uid since 1.78.0, so there's
no need to use the API for that data. Read it from the state Secret
for consistency. However, to ensure we don't read data from a
previous run of containerboot, make sure we reset containerboot's
state keys on startup.
One other knock-on effect of that is ProxyGroups can briefly be
marked not Ready while a Pod is restarting. Introduce a new
ProxyGroupAvailable condition to more accurately reflect
when downstream controllers can implement flows that rely on a
ProxyGroup having at least 1 proxy Pod running.
Fixes#16327
Change-Id: I026c18e9d23e87109a471a87b8e4fb6271716a66
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit adds a NOTES.txt to the operator helm chart that will be written to the
terminal upon successful installation of the operator.
It includes a small list of knowledgebase articles with possible next steps for
the actor that installed the operator to the cluster. It also provides possible
commands to use for explaining the custom resources.
Fixes#13427
Signed-off-by: David Bond <davidsbond93@gmail.com>
Proxies know how to reload configfile on changes since 1.80, which
is going to be the earliest supported proxy version with 1.84 operator,
so remove the mechanism that was updating configfile hash to force
proxy Pod restarts on config changes.
Updates #13032
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Ensure that if the ProxyGroup for HA Ingress changes, the TLS Secret
and Role and RoleBinding that allow proxies to read/write to it are
updated.
Fixes#16259
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We already present a health warning about this, but it is easy to miss
on a server when blackholing traffic makes it unreachable.
In addition to a health warning, present a risk message when exit node
is enabled.
Example:
```
$ tailscale up --exit-node=lizard
The following issues on your machine will likely make usage of exit nodes impossible:
- interface "ens4" has strict reverse-path filtering enabled
- interface "tailscale0" has strict reverse-path filtering enabled
Please set rp_filter=2 instead of rp_filter=1; see https://github.com/tailscale/tailscale/issues/3310
To skip this warning, use --accept-risk=linux-strict-rp-filter
$
```
Updates #3310
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Validate that any tags that users have specified via tailscale.com/tags
annotation are valid Tailscale ACL tags.
Validate that no more than one HA Tailscale Kubernetes Services in a single cluster refer
to the same Tailscale Service.
Updates tailscale/tailscale#16054
Updates tailscale/tailscale#16035
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
RELNOTE=Fix CSRF errors in the client Web UI
Replace gorilla/csrf with a Sec-Fetch-Site based CSRF protection
middleware that falls back to comparing the Host & Origin headers if no
SFS value is passed by the client.
Add an -origin override to the web CLI that allows callers to specify
the origin at which the web UI will be available if it is hosted behind
a reverse proxy or within another application via CGI.
Updates #14872
Updates #15065
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
This fixes the implementation and test from #15208 which apparently
never worked.
Ignore the metacert when counting the number of expected certs
presented.
And fix the test, pulling out the TLSConfig setup code into something
shared between the real cmd/derper and the test.
Fixes#15579
Change-Id: I90526e38e59f89b480629b415f00587b107de10a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This reconciler allows users to make applications highly available at L3 by
leveraging Tailscale Virtual Services. Many Kubernetes Service's
(irrespective of the cluster they reside in) can be mapped to a
Tailscale Virtual Service, allowing access to these Services at L3.
Updates #15895
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Adds Recorder fields to configure the name and annotations of the ServiceAccount
created for and used by its associated StatefulSet. This allows the created Pod
to authenticate with AWS without requiring a Secret with static credentials,
using AWS' IAM Roles for Service Accounts feature, documented here:
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.htmlFixes#15875
Change-Id: Ib0e15c0dbc357efa4be260e9ae5077bacdcb264f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>