So that we can pick up kmods contained in sysexts (like zfs) and generate
complete module dependency information. I thought we could skip running depmod
for nvidia drivers because we manually insmod them, but nvidia's GPU operator
driver validation expects to be able to run modprobe, so we have to generate
the dependency information for them too.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The nspawn container runs in its own scope, and its journal output is then
associated with that scope. By passing `--keep-unit` we can guarantee that all
log output stays associated with nvidia.service and can be viewed by running
`journalctl -u nvidia.service`.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Installers for 570 sometimes default to Open drivers, which we can't support
properly at this time. Force proprietary drivers. There are also additional
options that suppress certain worrisome error strings - enable those if
supported too.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Users have reported that in some cases the nvidia.service fails because
/opt/nvidia/current is a directory and the symbolic link gets created inside
it. I have no idea how we get there, but to make the service robust in the face
of this kind of issue:
- remove the directory if it exists
- use `-T` with ln to ensure that symbolic link creation fails if `current` is a directory
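A minimal sketch of the two steps above, run against a scratch directory
instead of /opt/nvidia. The `570.00` version directory and the `base` variable
are made up for illustration, and `ln -T` is assumed to come from GNU
coreutils; the actual service script may differ.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout standing in for /opt/nvidia; "570.00" is a
# made-up version directory used only for illustration.
base=$(mktemp -d)
mkdir -p "$base/570.00" "$base/current"   # reproduce the bad state

# Remove `current` only when it is a real directory, not a symlink.
if [ -d "$base/current" ] && [ ! -L "$base/current" ]; then
    rm -rf "$base/current"
fi

# -T treats the link name as a plain file, so ln never creates the link
# inside an existing directory; it fails instead.
ln -sfT "$base/570.00" "$base/current"

# Show where the link points now.
readlink "$base/current"
```

Without the `rm -rf` step, the `ln -sfT` call would fail loudly on the leftover
directory rather than silently nesting the link inside it.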
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Because we are using the `git` eclass, we can't simply use a git ref if that
ref is on another branch; we need to pass the git branch as well.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
This change moves CONFIG_MICROSOFT_MANA=m from amd64_defconfig-6.6 to
commonconfig-6.6 to support the MANA network driver on ARM64 instances,
too.
Signed-off-by: Thilo Fromm <thilofromm@microsoft.com>
Apply a patch to fix an issue when overriding the SSH public key from the
Ignition configuration. Since the fix is not yet available in any release of
wa-linux-agent, we have to carry it as a separate patch.
See also https://github.com/Azure/WALinuxAgent/pull/3309.
We sometimes have leftover instances stuck in an "error" state (the error
comes from the OpenStack scheduled deletion). This leads to instance creation
errors during the tests because quota limits are hit.
Let's clean up everything before running the new tests.
This won't impact tests from other channels as OpenStack is limited to
one CI job at a time.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Naively concatenating certificates that lack a trailing newline with cat
results in a broken bundle. Fix the issue by using a sed expression that
appends a trailing newline after the last line if it is missing.
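A minimal sketch of the failure and of one possible fix, assuming GNU sed. The
file names and the placeholder certificate contents are hypothetical, and the
`$a\` expression shown here is a common idiom, not necessarily the exact
expression used in this change.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)

# Placeholder "certificates"; a.pem deliberately lacks a trailing newline.
printf '%s\n%s' '-----BEGIN CERTIFICATE-----' '-----END CERTIFICATE-----' > "$tmp/a.pem"
printf '%s\n%s\n' '-----BEGIN CERTIFICATE-----' '-----END CERTIFICATE-----' > "$tmp/b.pem"

# Naive concatenation: the END line of a.pem and the BEGIN line of b.pem
# end up fused on a single line.
cat "$tmp/a.pem" "$tmp/b.pem" > "$tmp/naive.pem"

# GNU sed: `$a\` appends empty text after the last line, which terminates
# that line with a newline only when one is missing.
for f in "$tmp/a.pem" "$tmp/b.pem"; do
    sed -e '$a\' "$f"
done > "$tmp/fixed.pem"

# Count lines, including a possibly unterminated last line.
grep -c '' "$tmp/naive.pem"   # 3: two markers share one line
grep -c '' "$tmp/fixed.pem"   # 4: every marker is on its own line
```

Files that already end in a newline pass through unchanged, so the expression
is safe to apply to every input.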
Issue: flatcar/flatcar#1601
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Hetzner is having some capacity issues[^1]:
- amd64: CPX plans (CPX11 to CPX51) - Falkenstein (FSN) and Nuremberg (NBG)
- arm64: CAX plans (CAX11 to CAX41) - Helsinki (HEL) and Nuremberg (NBG)
Let's switch the location:
* Helsinki (hel1) for amd64
* Keep Falkenstein (fsn1) for arm64
[^1]: https://status.hetzner.com/incident/aa5ce33b-faa5-4fd0-9782-fde43cd270cf
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>