141 Commits

Author SHA1 Message Date
Andrew Rynhard
6efd6fbe08 chore: move gRPC API to public
In order for other projects to make use of our APIs, they must not
reside underneath the internal directory. This moves the protobuf
definitions to a top-level "api" directory and scopes them according to
their domain. This change also removes generated code from the gitignore
file so that users don't have to generate the code themselves.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-19 08:55:13 -07:00
Andrew Rynhard
20302eb8f6 chore: fix AWS image dependency
We no longer need to wait for the installer image to be pushed before
creating the AWS image.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-17 21:12:03 -07:00
Andrew Rynhard
472f1aa6e8 chore: upgrade Sonobuoy to v0.15.4
This version has a fix for a bug that is affecting us.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-17 14:52:10 -07:00
Andrew Rynhard
3e62973b2c chore: upgrade conformance image
This upgrades the kube-conformance image used by Sonobuoy to
v1.16.0-rc.2.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-16 16:05:24 -07:00
Andrew Rynhard
ab4e058489 feat: upgrade Kubernetes to v1.16.0-rc.2
This brings in the release candidate for Kubernetes v1.16.0.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-16 14:56:55 -07:00
Andrew Rynhard
75746266ce feat: upgrade Kubernetes to v1.16.0-rc.1
This brings in the latest RC of 1.16.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-12 20:20:48 -07:00
Andrey Smirnov
980829708e chore: upgrade golangci-lint to 1.18.0
The new linter 'funlen' was disabled because too many functions exceed its
default limit, but it might be considered in the future.

To limit peak memory usage, `GOGC=50` was added to the golangci-lint run
to make Go's garbage collector more aggressive. With this setting, peak
memory usage seems to be around 8 GB.
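
For context, `GOGC` sets the garbage collector's target percentage: roughly, a collection starts once newly allocated data reaches that percentage of the live data left after the previous collection (the default is 100). The sketch below is not taken from this commit. It only shows the in-process equivalent of the environment variable, using `runtime/debug.SetGCPercent`; the helper name `setAggressiveGC` is hypothetical.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setAggressiveGC sets the GC target percentage to 50, which is what the
// GOGC=50 environment variable does at process start. It returns a function
// that restores the previous setting and reports the value it replaced.
// (Hypothetical helper, for illustration only.)
func setAggressiveGC() (restore func() int) {
	prev := debug.SetGCPercent(50)
	return func() int {
		return debug.SetGCPercent(prev)
	}
}

func main() {
	restore := setAggressiveGC()
	// ... memory-hungry work would run here ...
	fmt.Println("GC percent while running:", restore())
}
```

A lower percentage trades CPU time for a smaller peak heap, which matches the motivation given above.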

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-11 15:18:57 -07:00
Andrew Rynhard
298ddc8f49 fix: enable slub_debug=P
This is the last KSPP kernel parameter we need to be compliant with KSPP
guidelines.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-10 10:53:19 -07:00
Andrew Rynhard
38690d72df chore: remove unneeded packages
This removes packages we don't need anymore.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-10 08:12:07 -07:00
Andrew Rynhard
e48cee6343 chore: remove existing AMI
We need to remove an existing AMI, if it exists, in order to create a new
one with the same name.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-10 04:52:43 -07:00
Andrew Rynhard
44dd2fc7c9 chore: remove packer from installer
This moves toward aligning AWS releases with Azure and GCP. We no
longer need packer since we will now release an artifact that users can
import.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-09 18:54:37 -07:00
Brad Beam
f21d1244bd test(ci): Add aws for e2e and conformance targets
Add additional scripts and steps to enable doing tests against aws.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-09 13:56:19 -05:00
Brad Beam
be4f7e1e6a chore: Rename maintainers channel
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-09 10:59:48 -05:00
Spencer Smith
8b019d8f33 chore: update provider-components for capi v0.1.9
This PR updates our e2e tests with the provider-components file that's
generated by our capi v0.1.9 update.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-09-06 22:45:44 -04:00
Spencer Smith
71cddfd30b fix: remove basic integration teardown
This was breaking e2e testing, as we depend on it for applying CAPI and
launching VMs from there.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-09-06 15:15:24 -05:00
Brad Beam
f03975bdc3 chore: Retry check for HA control plane
This was likely causing some of the flakiness in this test.
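
The actual check lives in the e2e scripts and is not shown in this log. As a hypothetical sketch of the idea, a readiness check can be wrapped in a bounded retry loop; `retry`, `errNotReady`, and `controlPlaneReady` are illustrative names, not part of the project.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a placeholder error for a control plane that is still
// coming up.
var errNotReady = errors.New("control plane not ready")

// retry calls check up to attempts times, sleeping for delay between
// failures. It returns nil on the first success, or the last error.
func retry(attempts int, delay time.Duration, check func() error) error {
	if attempts < 1 {
		return errors.New("retry: attempts must be at least 1")
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = check(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}

func main() {
	calls := 0
	// controlPlaneReady is a stand-in that fails twice before succeeding.
	controlPlaneReady := func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	}
	err := retry(5, 10*time.Millisecond, controlPlaneReady)
	fmt.Println("calls:", calls, "err:", err)
}
```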

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-05 22:04:38 -05:00
Andrey Smirnov
7ab0f8a7f2 chore: enable unit-tests-race
This is an experiment to see how stable they are.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-02 19:02:38 -07:00
Brad Beam
1373806165 fix(init): Enable containerd subreaper
This should take care of our issue with zombie processes.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-30 14:32:13 -07:00
Andrey Smirnov
029374f07d chore: disable go test result cache
By default, Go caches unit test results in the build cache, so if the
source code has not changed, cached results are reused at the package
level. As our unit tests are not pure and depend on the environment, it
is more helpful to make sure all of them run during each build.

Setting the number of test runs to one disables the test result cache
(the build cache is still used).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-08-30 22:03:00 +03:00
Brad Beam
b1dc400fea chore: Fix azure image upload
Single quotes prevent the variable from being expanded.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-28 20:38:30 -05:00
Brad Beam
9b91cd4511 chore: Clean up e2e scripts
- Use az/gcloud cli bundled with container
- Use consistent spacing in scripts ( 2 spaces vs tab )
- Updated count functions to handle the count inline
- Made platform kubeconfig the default

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-28 08:31:47 -05:00
Andrew Rynhard
bf8fc1dcbd chore: lint protobuf definitions
This adds linting to our protobuf definitions via prototool.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-27 18:12:36 -07:00
Andrew Rynhard
fd25c019bf chore: fix qemu-boot.sh
Fixes a typo that caused the switch statement to not match Linux
environments.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-24 13:24:24 -07:00
Andrew Rynhard
f5f6c29e99 chore: add QEMU script
This script will help in low-level development.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-24 00:56:12 -07:00
Brad Beam
313c118ad0 refactor(networkd): Replace networkd with a standalone app
This is a major rewrite of our network subsystem.

- This changes networkd to run as a standalone app versus internal goroutine
- This changes out the netlink package with the more idiomatic netlink/rtnetlink
  packages
- This changes the initial network bootstrap/discovery from using a single
  interface to attempting to bring up all interfaces
- This moves us back on to the upstream dhcp library

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-21 13:24:51 -05:00
Andrew Rynhard
0af1eba159 refactor: add more runtime modes
In order to DRY up all installation methods and mount methods, this PR
introduces a few more runtime modes. The modes are then used to
determine the strategy for creating and/or mounting the partitions.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-19 20:23:45 -07:00
Andrew Rynhard
060498ec87 chore: disable CIS benchmarks
These are failing with false positives. Disable for now so that we can
run our conformance tests.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-19 11:04:15 -07:00
Brad Beam
af47edf1ad chore: Make losetup atomic during installation
This should fix a race condition where two independent image creation steps
run `losetup -f` and discover the same 'next available' loopback device and
attempt to use it.
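
The race is a classic check-then-act problem: looking up the next free device and claiming it are two separate steps. The sketch below is an in-process model only, with a hypothetical `loopPool` type standing in for the kernel's loop devices. It is not the fix applied in this commit, and independent processes would need a cross-process mechanism such as a lock file rather than a mutex. It shows the find and the claim happening under a single lock so that no two callers can be handed the same device.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// loopPool is a hypothetical in-memory stand-in for the loop devices.
type loopPool struct {
	mu    sync.Mutex
	inUse []bool // inUse[i] reports whether /dev/loop<i> is taken
}

// acquire finds the first free device and marks it as used while holding
// the lock, so no other caller can observe the same device as free.
func (p *loopPool) acquire() (string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for i, used := range p.inUse {
		if !used {
			p.inUse[i] = true
			return fmt.Sprintf("/dev/loop%d", i), nil
		}
	}
	return "", errors.New("no free loop device")
}

// acquireConcurrently runs n goroutines that each acquire one device and
// returns how many distinct devices were handed out.
func acquireConcurrently(p *loopPool, n int) int {
	var wg sync.WaitGroup
	var seenMu sync.Mutex
	seen := map[string]bool{}
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			dev, err := p.acquire()
			if err != nil {
				return
			}
			seenMu.Lock()
			seen[dev] = true
			seenMu.Unlock()
		}()
	}
	wg.Wait()
	return len(seen)
}

func main() {
	pool := &loopPool{inUse: make([]bool, 8)}
	fmt.Println("distinct devices:", acquireConcurrently(pool, 8))
}
```

If the lookup and the claim were split into two calls without the shared lock, two goroutines could both see the same index as free, which is the behaviour described above.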

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-17 15:23:42 -05:00
Andrew Rynhard
7970f977b7 chore: add markdownlint
This will give us a standard tool for linting Markdown files.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-17 03:53:52 -07:00
Spencer Smith
9d759df9bd chore: move to smaller azure instance type
This PR will save us a little dinero over the course of running e2e
builds in azure. It's only a couple cents per hour difference, but will
shave off a fair amount over the course of a month.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-08-16 09:46:17 -07:00
Andrew Rynhard
92452ab981 chore: remove sonobuoy spinner
This is only slowing down the build since we use a remote DB for drone.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-15 05:15:20 -07:00
Andrew Rynhard
48109e9757 chore: apply manifests when init node is ready
If we wait for all masters to check in before applying the PSP, we run
the risk of kube-proxy failing to start for a long period of time.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-14 20:28:34 -07:00
Andrew Rynhard
f18ecca50c chore: use go runner in sonobuoy
This is the recommended fix for waiting on conformance results. Sonobuoy
is returning early even though the --wait flag is specified.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-13 22:26:03 -07:00
Spencer Smith
57d22ef1bb chore: enable floating IP creation in e2e tests
This PR will edit the manifests for e2e so that we can take advantage of https://github.com/talos-systems/cluster-api-provider-talos/pull/47

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-08-13 15:23:28 -07:00
Andrew Rynhard
caa0354fe9 chore: fix drone clone
In order to use promotion against pull requests to trigger things like
E2E, we need to update the default clone logic. The issue is that a
promotion is assumed to be run against a build that has been merged. In
our case, we need to promote builds that are not necessarily merged.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-12 20:33:29 -07:00
Andrew Rynhard
1956504bd4 chore: fix default pipeline
This prevents the default pipeline from running on releases. It also
ensures that the push step is executed on a release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-12 17:45:26 -07:00
Andrew Rynhard
e8355f07a0 chore: fix release pipeline
We should only use the "tag" event and remove the promotion event. It
seems like we can't have both.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-12 17:24:12 -07:00
Andrew Rynhard
a420b85b07 chore: run unique E2E tests
In order to run more than one instance of E2E testing at a time, we need
to ensure that all resources are unique to the run.
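
How the pipeline derives its unique identifiers is not shown in this log. One common approach, sketched here under that caveat, is to append a random suffix to every resource name created by a run; `uniqueName` and the `e2e` prefix are hypothetical.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// uniqueName appends a random 16-character hex suffix to prefix so that
// resources created by concurrent runs are very unlikely to collide.
func uniqueName(prefix string) (string, error) {
	buf := make([]byte, 8)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return prefix + "-" + hex.EncodeToString(buf), nil
}

func main() {
	name, err := uniqueName("e2e")
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```

A CI build number or commit SHA would serve the same purpose and has the advantage of tying each resource back to the run that created it.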

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-12 14:14:08 -07:00
Andrew Rynhard
57db8a77b7 chore: exclude promotion event
We need to exclude the promotion event in a number of places.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-12 11:43:38 -07:00
Andrew Rynhard
ac54a3cb86 chore: add ability to promote to a release
Although the GitHub release plugin requires a tag and will fail on a
promotion, this is still useful as it will allow us to mimic a release
before we tag.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-11 11:51:53 -07:00
Andrew Rynhard
2ee769d19e chore: add image test step
Instead of building platform specific images in the default pipeline, we
should build just one image as part of our basic testing to make sure
installations work as expected.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-11 10:51:33 -07:00
Andrew Rynhard
c34ce3a4ed chore: reenable AMI publishing
This was removed during the refactor of our Drone file.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-11 10:07:57 -07:00
Andrew Rynhard
817380bad6 chore: refactor the Jsonnet file
This change improves the drone jsonnet file by making it more DRY and
structuring it in a way that makes it much easier to follow.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-11 09:23:30 -07:00
Andrew Rynhard
620efe52ef chore: fix push step dependencies
We should wait until basic integration is done.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-10 03:52:29 -07:00
Andrey Smirnov
ae54f7e40d fix: stalls in local Docker cluster boot
The problem was triggered by the udevd trigger. The root cause is not
clear, but the workaround is to disable it in container mode.

Implement CPU/mem limits for `osctl cluster create`, apply defaults,
bump defaults for cicd.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-08-10 13:31:47 +03:00
Andrew Rynhard
b965239672 chore: fix clone logic
This is another attempt at fixing the clone logic to make it work when
building the master branch.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-09 23:04:43 -07:00
Andrew Rynhard
217b7e2f9d chore: fix broken clone
This fixes an issue with cloning the master branch caused by git
refusing to fetch into the current branch.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-09 22:44:54 -07:00
Andrew Rynhard
8786916fd0 chore: build drone YAML via jsonnet
This PR aims to DRY the drone config file by using Jsonnet to generate
it.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-09 22:30:37 -07:00
Brad Beam
e60a57e186 chore: Fix up adhoc e2e tests
- Wait a little after cluster comes up
- Change interaction with CONFORMANCE variable to work around
  set -eou pipefail restrictions
- Set Sonobuoy runner version to latest to work with alpha
  version

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-09 13:55:14 -05:00
Brad Beam
bfc1646cd9 chore(ci): Add e2e promotion pipeline
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-08 11:27:57 -05:00